summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2016-07-25Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Ingo Molnar: "The main change in this tree is the reworking, fixing and extension of the TSC frequency enumeration code (by Len Brown)" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Remove the unused check_tsc_disabled() x86/tsc: Enumerate BXT tsc_khz via CPUID x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUID x86/tsc_msr: Remove irqoff around MSR-based TSC enumeration x86/tsc_msr: Add Airmont reference clock values x86/tsc_msr: Correct Silvermont reference clock values x86/tsc_msr: Update comments, expand definitions x86/tsc_msr: Remove debugging messages x86/tsc_msr: Identify Intel-specific code Revert "x86/tsc: Add missing Cherrytrail frequency to the table"
2016-07-25Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "The main changes in this cycle were: - Intel-SoC enhancements (Andy Shevchenko) - Intel CPU symbolic model definition rework (Dave Hansen) - ... other misc changes" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) x86/sfi: Enable enumeration of SD devices x86/pci: Use MRFLD abbreviation for Merrifield x86/platform/intel-mid: Make vertical indentation consistent x86/platform/intel-mid: Mark regulators explicitly defined x86/platform/intel-mid: Rename mrfl.c to mrfld.c x86/platform/intel-mid: Enable spidev on Intel Edison boards x86/platform/intel-mid: Extend PWRMU to support Penwell x86/pci, x86/platform/intel_mid_pci: Remove duplicate power off code x86/platform/intel-mid: Add pinctrl for Intel Merrifield x86/platform/intel-mid: Enable GPIO expanders on Edison x86/platform/intel-mid: Add Power Management Unit driver x86/platform/atom/punit: Enable support for Merrifield x86/platform/intel_mid_pci: Rework IRQ0 workaround x86, thermal: Clean up and fix CPU model detection for intel_soc_dts_thermal x86, mmc: Use Intel family name macros for mmc driver x86/intel_telemetry: Use Intel family name macros for telemetry driver x86/acpi/lss: Use Intel family name macros for the acpi_lpss driver x86/cpufreq: Use Intel family name macros for the intel_pstate cpufreq driver x86/platform: Use new Intel model number macros x86/intel_idle: Use Intel family macros for intel_idle ...
2016-07-25Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Ingo Molnar: "The main x86 FPU changes in this cycle were: - a large series of cleanups, fixes and enhancements to re-enable the XSAVES instruction on Intel CPUs - which is the most advanced instruction to do FPU context switches (Yu-cheng Yu, Fenghua Yu) - Add FPU tracepoints for the FPU state machine (Dave Hansen)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Do not BUG_ON() in early FPU code x86/fpu/xstate: Re-enable XSAVES x86/fpu/xstate: Fix fpstate_init() for XRSTORS x86/fpu/xstate: Return NULL for disabled xstate component address x86/fpu/xstate: Fix __fpu_restore_sig() for XSAVES x86/fpu/xstate: Fix xstate_offsets, xstate_sizes for non-extended xstates x86/fpu/xstate: Fix XSTATE component offset print out x86/fpu/xstate: Fix PTRACE frames for XSAVES x86/fpu/xstate: Fix supervisor xstate component offset x86/fpu/xstate: Align xstate components according to CPUID x86/fpu/xstate: Copy xstate registers directly to the signal frame when compacted format is in use x86/fpu/xstate: Keep init_fpstate.xsave.header.xfeatures as zero for init optimization x86/fpu/xstate: Rename 'xstate_size' to 'fpu_kernel_xstate_size', to distinguish it from 'fpu_user_xstate_size' x86/fpu/xstate: Define and use 'fpu_user_xstate_size' x86/fpu: Add tracepoints to dump FPU state at key points
2016-07-25Merge branch 'x86-debug-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 stackdump update from Ingo Molnar: "A number of stackdump enhancements" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/dumpstack: Add show_stack_regs() and use it printk: Make the printk*once() variants return a value x86/dumpstack: Honor supplied @regs arg
2016-07-25Merge branch 'x86-boot-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: "The main changes: - add initial commits to randomize kernel memory section virtual addresses, enabled via a new kernel option: RANDOMIZE_MEMORY (Thomas Garnier, Kees Cook, Baoquan He, Yinghai Lu) - enhance KASLR (RANDOMIZE_BASE) physical memory randomization (Kees Cook) - EBDA/BIOS region boot quirk cleanups (Andy Lutomirski, Ingo Molnar) - misc cleanups/fixes" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Simplify EBDA-vs-BIOS reservation logic x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does x86/boot: Reorganize and clean up the BIOS area reservation code x86/mm: Do not reference phys addr beyond kernel x86/mm: Add memory hotplug support for KASLR memory randomization x86/mm: Enable KASLR for vmalloc memory regions x86/mm: Enable KASLR for physical mapping memory regions x86/mm: Implement ASLR for kernel memory regions x86/mm: Separate variable for trampoline PGD x86/mm: Add PUD VA support for physical mapping x86/mm: Update physical mapping variable names x86/mm: Refactor KASLR entropy functions x86/KASLR: Fix boot crash with certain memory configurations x86/boot/64: Add forgotten end of function marker x86/KASLR: Allow randomization below the load address x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G x86/KASLR: Randomize virtual address separately x86/KASLR: Clarify identity map interface x86/boot: Refuse to build with data relocations x86/KASLR, x86/power: Remove x86 hibernation restrictions
2016-07-25Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "Various x86 low level modifications: - preparatory work to support virtually mapped kernel stacks (Andy Lutomirski) - support for 64-bit __get_user() on 32-bit kernels (Benjamin LaHaise) - (involved) workaround for Knights Landing CPU erratum (Dave Hansen) - MPX enhancements (Dave Hansen) - mremap() extension to allow remapping of the special VDSO vma, for purposes of user level context save/restore (Dmitry Safonov) - hweight and entry code cleanups (Borislav Petkov) - bitops code generation optimizations and cleanups with modern GCC (H. Peter Anvin) - syscall entry code optimizations (Paolo Bonzini)" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) x86/mm/cpa: Add missing comment in populate_pdg() x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2 x86/smp: Remove unnecessary initialization of thread_info::cpu x86/smp: Remove stack_smp_processor_id() x86/uaccess: Move thread_info::addr_limit to thread_struct x86/dumpstack: Rename thread_struct::sig_on_uaccess_error to sig_on_uaccess_err x86/uaccess: Move thread_info::uaccess_err and thread_info::sig_on_uaccess_err to thread_struct x86/dumpstack: When OOPSing, rewind the stack before do_exit() x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mm x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPS x86/dumpstack: Try harder to get a call trace on stack overflow x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables() x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated x86/mm/hotplug: Don't remove PGD entries in remove_pagetable() x86/mm: Use pte_none() to test for empty PTE x86/mm: Disallow running with 32-bit PTEs to work around erratum x86/mm: Ignore A/D bits in pte/pmd/pud_none() x86/mm: Move swap offset/type up in PTE to work around erratum x86/entry: Inline enter_from_user_mode() ...
2016-07-25Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/apic updates from Ingo Molnar: "Misc cleanups and a small fix" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Remove the unused struct apic::apic_id_mask field x86/apic: Fix misspelled APIC x86/ioapic: Simplify ioapic_setup_resources()
2016-07-25Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - introduce and use task_rcu_dereference()/try_get_task_struct() to fix and generalize task_struct handling (Oleg Nesterov) - do various per entity load tracking (PELT) fixes and optimizations (Peter Zijlstra) - cputime virt-steal time accounting enhancements/fixes (Wanpeng Li) - introduce consolidated cputime output file cpuacct.usage_all and related refactorings (Zhao Lei) - ... plus misc fixes and enhancements * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Panic on scheduling while atomic bugs if kernel.panic_on_warn is set sched/cpuacct: Introduce cpuacct.usage_all to show all CPU stats together sched/cpuacct: Use loop to consolidate code in cpuacct_stats_show() sched/cpuacct: Merge cpuacct_usage_index and cpuacct_stat_index enums sched/fair: Rework throttle_count sync sched/core: Fix sched_getaffinity() return value kerneldoc comment sched/fair: Reorder cgroup creation code sched/fair: Apply more PELT fixes sched/fair: Fix PELT integrity for new tasks sched/cgroup: Fix cpu_cgroup_fork() handling sched/fair: Fix PELT integrity for new groups sched/fair: Fix and optimize the fork() path sched/cputime: Add steal time support to full dynticks CPU time accounting sched/cputime: Fix prev steal time accouting during CPU hotplug KVM: Fix steal clock warp during guest CPU hotplug sched/debug: Always show 'nr_migrations' sched/fair: Use task_rcu_dereference() sched/api: Introduce task_rcu_dereference() and try_get_task_struct() sched/idle: Optimize the generic idle loop sched/fair: Fix the wrong throttled clock time for cfs_rq_clock_task()
2016-07-25Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "With over 300 commits it's been a busy cycle - with most of the work concentrated on the tooling side (as it should). The main kernel side enhancements were: - Add per event callchain limit: Recently we introduced a sysctl to tune the max-stack for all events for which callchains were requested: $ sysctl kernel.perf_event_max_stack kernel.perf_event_max_stack = 127 Now this patch introduces a way to configure this per event, i.e. this becomes possible: $ perf record -e sched:*/max-stack=2/ -e block:*/max-stack=10/ -a allowing finer tuning of how much buffer space callchains use. This uses an u16 from the reserved space at the end, leaving another u16 for future use. There has been interest in even finer tuning, namely to control the max stack for kernel and userspace callchains separately. Further discussion is needed, we may for instance use the remaining u16 for that and when it is present, assume that the sample_max_stack introduced in this patch applies for the kernel, and the u16 left is used for limiting the userspace callchain (Arnaldo Carvalho de Melo) - Optimize AUX event (hardware assisted side-band event) delivery (Kan Liang) - Rework Intel family name macro usage (this is partially x86 arch work) (Dave Hansen) - Refine and fix Intel LBR support (David Carrillo-Cisneros) - Add support for Intel 'TopDown' events (Andi Kleen) - Intel uncore PMU driver fixes and enhancements (Kan Liang) - ... other misc changes. Here's an incomplete list of the tooling enhancements (but there's much more, see the shortlog and the git log for details): - Support cross unwinding, i.e. collecting '--call-graph dwarf' perf.data files in one machine and then doing analysis in another machine of a different hardware architecture. This enables, for instance, to do: $ perf record -a --call-graph dwarf on a x86-32 or aarch64 system and then do 'perf report' on it on a x86_64 workstation (He Kuang) - Allow reading from a backward ring buffer (one setup via sys_perf_event_open() with perf_event_attr.write_backward = 1) (Wang Nan) - Finish merging initial SDT (Statically Defined Traces) support, see cset comments for details about how it all works (Masami Hiramatsu) - Support attaching eBPF programs to tracepoints (Wang Nan) - Add demangling of symbols in programs written in the Rust language (David Tolnay) - Add support for tracepoints in the python binding, including an example, that sets up and parses sched:sched_switch events, tools/perf/python/tracepoint.py (Jiri Olsa) - Introduce --stdio-color to set up the color output mode selection in 'annotate' and 'report', allowing emit color escape sequences when redirecting the output of these tools (Arnaldo Carvalho de Melo) - Add 'callindent' option to 'perf script -F', to indent the Intel PT call stack, making this output more ftrace-like (Adrian Hunter, Andi Kleen) - Allow dumping the object files generated by llvm when processing eBPF scriptlet events (Wang Nan) - Add stackcollapse.py script to help generating flame graphs (Paolo Bonzini) - Add --ldlat option to 'perf mem' to specify load latency for loads event (e.g. cpu/mem-loads/ ) (Jiri Olsa) - Tooling support for Intel TopDown counters, recently added to the kernel (Andi Kleen)" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (303 commits) perf tests: Add is_printable_array test perf tools: Make is_printable_array global perf script python: Fix string vs byte array resolving perf probe: Warn unmatched function filter correctly perf cpu_map: Add more helpers perf stat: Balance opening and reading events tools: Copy linux/{hash,poison}.h and check for drift perf tools: Remove include/linux/list.h from perf's MANIFEST tools: Copy the bitops files accessed from the kernel and check for drift Remove: kernel unistd*h files from perf's MANIFEST, not used perf tools: Remove tools/perf/util/include/linux/const.h perf tools: Remove tools/perf/util/include/asm/byteorder.h perf tools: Add missing linux/compiler.h include to perf-sys.h perf jit: Remove some no-op error handling perf jit: Add missing curly braces objtool: Initialize variable to silence old compiler objtool: Add -I$(srctree)/tools/arch/$(ARCH)/include/uapi perf record: Add --tail-synthesize option perf session: Don't warn about out of order event if write_backward is used perf tools: Enable overwrite settings ...
2016-07-25Merge branch 'ras-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "The biggest change in this cycle was an enhancement by Yazen Ghannam to reduce the number of MCE error injection related IPIs. The rest are smaller fixes" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Fix mce_rdmsrl() warning message x86/RAS/AMD: Reduce the number of IPIs when prepping error injection x86/mce/AMD: Increase size of the bank_map type x86/mce: Do not use bank 1 for APEI generated error logs
2016-07-25x86/acpi: store ACPI ids from MADT for future usageVitaly Kuznetsov
Currently we don't save ACPI ids (unlike LAPIC ids which go to x86_cpu_to_apicid) from MADT and we may need this information later. Particularly, ACPI ids is the only existent way for a PVHVM Xen guest to figure out Xen's idea of its vCPUs ids before these CPUs boot and in some cases these ids diverge from Linux's cpu ids. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25Merge branch 'x86/cpu' from tipRafael J. Wysocki
2016-07-25Merge branches 'pm-sleep' and 'pm-tools'Rafael J. Wysocki
* pm-sleep: PM / hibernate: Introduce test_resume mode for hibernation x86 / hibernate: Use hlt_play_dead() when resuming from hibernation PM / hibernate: Image data protection during restoration PM / hibernate: Add missing braces in __register_nosave_region() PM / hibernate: Clean up comments in snapshot.c PM / hibernate: Clean up function headers in snapshot.c PM / hibernate: Add missing braces in hibernate_setup() PM / hibernate: Recycle safe pages after image restoration PM / hibernate: Simplify mark_unsafe_pages() PM / hibernate: Do not free preallocated safe pages during image restore PM / suspend: show workqueue state in suspend flow PM / sleep: make PM notifiers called symmetrically PM / sleep: Make pm_prepare_console() return void PM / Hibernate: Don't let kasan instrument snapshot.c * pm-tools: PM / tools: scripts: AnalyzeSuspend v4.2 tools/turbostat: allow user to alter DESTDIR and PREFIX
2016-07-25Merge branch 'acpi-tables'Rafael J. Wysocki
* acpi-tables: ACPI: Rename configfs.c to acpi_configfs.c to prevent link error ACPI: add support for loading SSDTs via configfs ACPI: add support for configfs efi / ACPI: load SSTDs from EFI variables spi / ACPI: add support for ACPI reconfigure notifications i2c / ACPI: add support for ACPI reconfigure notifications ACPI: add support for ACPI reconfiguration notifiers ACPI / scan: fix enumeration (visited) flags for bus rescans ACPI / documentation: add SSDT overlays documentation ACPI: ARM64: support for ACPI_TABLE_UPGRADE ACPI / tables: introduce ARCH_HAS_ACPI_TABLE_UPGRADE ACPI / tables: move arch-specific symbol to asm/acpi.h ACPI / tables: table upgrade: refactor function definitions ACPI / tables: table upgrade: use cacheable map for tables Conflicts: arch/arm64/include/asm/acpi.h
2016-07-22x86/boot: Simplify EBDA-vs-BIOS reservation logicAndy Lutomirski
Both the intent and the effect of reserve_bios_regions() is simple: reserve the range from the apparent BIOS start (suitably filtered) through 1MB and, if the EBDA start address is sensible, extend that reservation downward to cover the EBDA as well. The code is overcomplicated, though, and contains head-scratchers like: if (ebda_start < BIOS_START_MIN) ebda_start = BIOS_START_MAX; That snipped is trying to say "if ebda_start < BIOS_START_MIN, ignore it". Simplify it: reorder the code so that it makes sense. This should have no functional effect under any circumstances. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mario Limonciello <mario_limonciello@dell.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21x86/fpu: Do not BUG_ON() in early FPU codeDave Hansen
I don't think it is really possible to have a system where CPUID enumerates support for XSAVE but that it does not have FP/SSE (they are "legacy" features and always present). But, I did manage to hit this case in qemu when I enabled its somewhat shaky XSAVE support. The bummer is that the FPU is set up before we parse the command-line or have *any* console support including earlyprintk. That turned what should have been an easy thing to debug in to a bit more of an odyssey. So a BUG() here is worthless. All it does it guarantee that if/when we hit this case we have an empty console. So, remove the BUG() and try to limp along by disabling XSAVE and trying to continue. Add a comment on why we are doing this, and also add a common "out_disable" path for leaving fpu__init_system_xstate(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21x86/boot: Reorganize and clean up the BIOS area reservation codeIngo Molnar
So the reserve_ebda_region() code has accumulated a number of problems over the years that make it really difficult to read and understand: - The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks... - 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it super confusing to read. - It does not help at all that we have various memory range markers, half of which are 'start of range', half of which are 'end of range' - but this crucial property is not obvious in the naming at all ... gave me a headache trying to understand all this. - Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is obvious, all values here are addresses!), while it does not highlight that it's the _start_ of the EBDA region ... - 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value that is a pointer to a value, not a memory range address! - The function name itself is a misnomer: it says 'reserve_ebda_region()' while its main purpose is to reserve all the firmware ROM typically between 640K and 1MB, while the 'EBDA' part is only a small part of that ... - Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this too should be about whether to reserve firmware areas in the paravirt case. - In fact thinking about this as 'end of RAM' is confusing: what this function *really* wants to reserve is firmware data and code areas! Once the thinking is inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure 'reserved area' notion everything becomes a lot clearer. To improve all this rewrite the whole code (without changing the logic): - Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start' and propagate this concept through all the variable names and constants. BIOS_RAM_SIZE_KB_PTR // was: BIOS_LOWMEM_KILOBYTES BIOS_START_MIN // was: INSANE_CUTOFF ebda_start // was: ebda_addr bios_start // was: lowmem BIOS_START_MAX // was: LOWMEM_CAP - Then clean up the name of the function itself by renaming it to reserve_bios_regions() and renaming the ::ebda_search paravirt flag to ::reserve_bios_regions. - Fix up all the comments (fix typos), harmonize and simplify their formulation and remove comments that become unnecessary due to the much better naming all around. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-20x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUsPeter Zijlstra
Monitored cached line may not wake up from mwait on certain Goldmont based CPUs. This patch will avoid calling current_set_polling_and_test() and thereby not set the TIF_ flag. The result is that we'll always send IPIs for wakeups. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468867270-18493-1-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-20Merge branch 'linus' into x86/cpu, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-19x86/apic: Remove duplicated include from probe_64.cWei Yongjun
Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/1468929740-8999-1-git-send-email-weiyj_lk@163.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-19x86/headers: Include spinlock_types.h in x8664_ksyms_64.c for missing spinlock_tStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-next@vger.kernel.org Fixes: 186f43608a5c ("x86/kernel: Audit and remove any unnecessary uses of module.h") Link: http://lkml.kernel.org/r/20160718182922.7b41f923@canb.auug.org.au Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86 / hibernate: Use hlt_play_dead() when resuming from hibernationRafael J. Wysocki
On Intel hardware, native_play_dead() uses mwait_play_dead() by default and only falls back to the other methods if that fails. That also happens during resume from hibernation, when the restore (boot) kernel runs disable_nonboot_cpus() to take all of the CPUs except for the boot one offline. However, that is problematic, because the address passed to __monitor() in mwait_play_dead() is likely to be written to in the last phase of hibernate image restoration and that causes the "dead" CPU to start executing instructions again. Unfortunately, the page containing the address in that CPU's instruction pointer may not be valid any more at that point. First, that page may have been overwritten with image kernel memory contents already, so the instructions the CPU attempts to execute may simply be invalid. Second, the page tables previously used by that CPU may have been overwritten by image kernel memory contents, so the address in its instruction pointer is impossible to resolve then. A report from Varun Koyyalagunta and investigation carried out by Chen Yu show that the latter sometimes happens in practice. To prevent it from happening, temporarily change the smp_ops.play_dead pointer during resume from hibernation so that it points to a special "play dead" routine which uses hlt_play_dead() and avoids the inadvertent "revivals" of "dead" CPUs this way. A slightly unpleasant consequence of this change is that if the system is hibernated with one or more CPUs offline, it will generally draw more power after resume than it did before hibernation, because the physical state entered by CPUs via hlt_play_dead() is higher-power than the mwait_play_dead() one in the majority of cases. It is possible to work around this, but it is unclear how much of a problem that's going to be in practice, so the workaround will be implemented later if it turns out to be necessary. Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371 Reported-by: Varun Koyyalagunta <cpudebug@centtech.com> Original-by: Chen Yu <yu.c.chen@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/x2apic: Convert to CPU hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153337.736898691@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/tboot: Convert to hotplug state machineRichard Cochran
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Gang Wei <gang.wei@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ning Sun <ning.sun@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard L Maliszewski <richard.l.maliszewski@intel.com> Cc: Shane Wang <shane.wang@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: rt@linutronix.de Cc: tboot-devel@lists.sourceforge.net Link: http://lkml.kernel.org/r/20160713153337.400227322@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/apb_timer: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine. There is no setup just one teardown callback. Remove the silly comment about the workqueue up dependency. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153335.625342983@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/apic: Remove the unused struct apic::apic_id_mask fieldWei Jiangang
The only user verify_local_APIC() had been removed by commit: 4399c03c6780 ("x86/apic: Remove verify_local_APIC()") ... so there is no need to keep it. Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: bsd@redhat.com Cc: david.vrabel@citrix.com Cc: jgross@suse.com Cc: konrad.wilk@oracle.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1468463046-20849-1-git-send-email-weijg.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15Merge branch 'linus' into x86/apic, to refresh the branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/tsc: Remove the unused check_tsc_disabled()Wei Jiangang
check_tsc_disabled() was introduced by commit: c73deb6aecda ("perf/x86: Add ability to calculate TSC from perf sample timestamps") The only caller was arch_perf_update_userpage(), which had been refactored by commit: d8b11a0cbd1c ("perf/x86: Clean up cap_user_time* setting") ... so no need keep and export it any more. Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: a.p.zijlstra@chello.nl Cc: adrian.hunter@intel.com Cc: bp@suse.de Link: http://lkml.kernel.org/r/1468570330-25810-1-git-send-email-weijg.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/smp: Remove unnecessary initialization of thread_info::cpuAndy Lutomirski
It's statically initialized to zero -- no need to dynamically initialize it to zero as well. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/6cf6314dce3051371a913ee19d1b88e29c68c560.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/smp: Remove stack_smp_processor_id()Andy Lutomirski
It serves no purpose -- raw_smp_processor_id() works fine. This change will be needed to move thread_info off the stack. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a2bf4f07fbc30fb32f9f7f3f8f94ad3580823847.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/uaccess: Move thread_info::addr_limit to thread_structAndy Lutomirski
struct thread_info is a legacy mess. To prepare for its partial removal, move thread_info::addr_limit out. As an added benefit, this way is simpler. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/15bee834d09402b47ac86f2feccdf6529f9bc5b0.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/dumpstack: When OOPSing, rewind the stack before do_exit()Andy Lutomirski
If we call do_exit() with a clean stack, we greatly reduce the risk of recursive oopses due to stack overflow in do_exit, and we allow do_exit to work even if we OOPS from an IST stack. The latter gives us a much better chance of surviving long enough after we detect a stack overflow to write out our logs. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/32f73ceb372ec61889598da5e5b145889b9f2e19.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPSAndy Lutomirski
If we overflow the stack into a guard page, we'll recursively fault when trying to dump the contents of the guard page. Use probe_kernel_address() so we can recover if this happens. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e626d47a55d7b04dcb1b4d33faa95e8505b217c8.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15x86/dumpstack: Try harder to get a call trace on stack overflowAndy Lutomirski
If we overflow the stack, print_context_stack() will abort. Detect this case and rewind back into the valid part of the stack so that we can trace it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/ee1690eb2715ccc5dc187fde94effa4ca0ccbbcd.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86/reboot: Add Dell Optiplex 7450 AIO reboot quirkAlex Hung
Dell Optiplex 7450 AIO works with BOOT_ACPI; however, the quirk for "OptiPlex 745" changes its boot method to BOOT_BIOS and causes 7450 AIO hangs when rebooting; as a result, 7450 AIO is appended to overwrite BOOT_BIOS by BOOT_ACPI in order not to break the original 745 series Signed-off-by: Alex Hung <alex.hung@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86: Audit and remove any remaining unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. In the case of some of these which are modular, we can extend that to also include files that are building basic support functionality but not related to loading or registering the final module; such files also have no need whatsoever for module.h The advantage in removing such instances is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each instance for the presence of either and replace as needed. In the case of crypto/glue_helper.c we delete a redundant instance of MODULE_LICENSE in order to delete module.h -- the license info is already present at the top of the file. The uncore change warrants a mention too; it is uncore.c that uses module.h and not uncore.h; hence the relocation done there. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160714001901.31603-9-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86/kernel: Audit and remove any unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance for the presence of either and replace as needed. Build testing revealed some implicit header usage that was fixed up accordingly. Note that some bool/obj-y instances remain since module.h is the header for some exception table entry stuff, and for things like __init_or_module (code that is tossed when MODULES=n). Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160714001901.31603-4-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86: Don't use module.h just for AUTHOR / LICENSE tagsPaul Gortmaker
The Kconfig controlling compilation of these files are: arch/x86/Kconfig.debug:config DEBUG_RODATA_TEST arch/x86/Kconfig.debug: bool "Testcase for the marking rodata read-only" arch/x86/Kconfig.debug:config X86_PTDUMP_CORE arch/x86/Kconfig.debug: def_bool n ...meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modular infrastructure use, so that when reading the driver there is no doubt it is builtin-only. We delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160714001901.31603-2-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14Merge branch 'x86/platform' into x86/headers, to apply dependent patchesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-14x86/hpet: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153335.279718463@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-13Merge branch 'timers/core' into smp/hotplug to pick up dependenciesThomas Gleixner
2016-07-11x86/tsc: Enumerate BXT tsc_khz via CPUIDLen Brown
Hard code the BXT crystal clock (aka ART - Always Running Timer) to 19.200 MHz, and use CPUID leaf 0x15 to determine the BXT TSC frequency. Use tsc_khz to sanity check BXT cpu_khz, which can be erroneous in some configurations. (I simplified the original patch from Bin Gao.) Original-From: Bin Gao <bin.gao@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/bf4e7c175acd6d09719c47c319b10ff1f0627ff8.1466138954.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-11x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUIDLen Brown
Skylake CPU base-frequency and TSC frequency may differ by up to 2%. Enumerate CPU and TSC frequencies separately, allowing cpu_khz and tsc_khz to differ. The existing CPU frequency calibration mechanism is unchanged. However, CPUID extensions are preferred, when available. CPUID.0x16 is preferred over MSR and timer calibration for CPU frequency discovery. CPUID.0x15 takes precedence over CPU-frequency for TSC frequency discovery. Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/b27ec289fd005833b27d694d9c2dbb716c5cdff7.1466138954.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-11x86/tsc_msr: Remove irqoff around MSR-based TSC enumerationLen Brown
Remove the irqoff/irqon around MSR-based TSC enumeration, as it is not necessary. Also rename: try_msr_calibrate_tsc() to cpu_khz_from_msr(), as that better describes what the routine does. Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/a6b5c3ecd3b068175d2309599ab28163fc34215e.1466138954.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-11x86/fpu/xstate: Re-enable XSAVESYu-cheng Yu
We did not handle XSAVES instructions correctly. There were issues in converting between standard and compacted format when interfacing with user-space. These issues have been corrected. Add a WARN_ONCE() to make it clear that XSAVES supervisor states are not yet implemented. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: H. Peter Anvin <h.peter.anvin@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468253937-40008-5-git-send-email-fenghua.yu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-11x86/fpu/xstate: Fix fpstate_init() for XRSTORSYu-cheng Yu
In XSAVES mode if fpstate_init() is used to initialize a task's extended state area, xsave.header.xcomp_bv[63] must be set. Otherwise, when the task is scheduled, a warning is triggered from copy_kernel_to_xregs(). One such test case is: setting an invalid extended state through PTRACE. When xstateregs_set() rejects the syscall and re-initializes the task's extended state area. This triggers the warning mentioned above. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: H. Peter Anvin <h.peter.anvin@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468253937-40008-4-git-send-email-fenghua.yu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-11x86/fpu/xstate: Return NULL for disabled xstate component addressYu-cheng Yu
It is an error to request a disabled XSAVE/XSAVES component address. For that case, make __raw_xsave_addr() return a NULL and issue a warning. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: H. Peter Anvin <h.peter.anvin@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468253937-40008-3-git-send-email-fenghua.yu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-11x86/fpu/xstate: Fix __fpu_restore_sig() for XSAVESYu-cheng Yu
When the kernel is using XSAVES compacted format, we cannot do __copy_from_user() from a signal frame, which has standard-format data. Fix it by using copyin_to_xsaves(), which converts between formats and filters out all supervisor states that we do not allow userspace to write. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: H. Peter Anvin <h.peter.anvin@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468253937-40008-2-git-send-email-fenghua.yu@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-10Revert "perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86"Ingo Molnar
This reverts commit 2c95afc1e83d93fac3be6923465e1753c2c53b0a. Stephane reported the following regression: > Since Andi added: > > commit 2c95afc1e83d93fac3be6923465e1753c2c53b0a > Author: Andi Kleen <ak@linux.intel.com> > Date: Thu Jun 9 06:14:38 2016 -0700 > > perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86 > > $ perf stat -e ref-cycles ls > <not counted> .... > > fails systematically because the ref-cycles is now used by the > watchdog and given this is a system-wide pinned event, it monopolizes > the fixed counter 2 which is the only counter able to measure this event. Since the next merge window is near, fix the regression for now by reverting the commit. Reported-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-10x86/quirks: Add early quirk to reset Apple AirPort cardLukas Wunner
The EFI firmware on Macs contains a full-fledged network stack for downloading OS X images from osrecovery.apple.com. Unfortunately on Macs introduced 2011 and 2012, EFI brings up the Broadcom 4331 wireless card on every boot and leaves it enabled even after ExitBootServices has been called. The card continues to assert its IRQ line, causing spurious interrupts if the IRQ is shared. It also corrupts memory by DMAing received packets, allowing for remote code execution over the air. This only stops when a driver is loaded for the wireless card, which may be never if the driver is not installed or blacklisted. The issue seems to be constrained to the Broadcom 4331. Chris Milsted has verified that the newer Broadcom 4360 built into the MacBookPro11,3 (2013/2014) does not exhibit this behaviour. The chances that Apple will ever supply a firmware fix for the older machines appear to be zero. The solution is to reset the card on boot by writing to a reset bit in its mmio space. This must be done as an early quirk and not as a plain vanilla PCI quirk to successfully combat memory corruption by DMAed packets: Matthew Garrett found out in 2012 that the packets are written to EfiBootServicesData memory (http://mjg59.dreamwidth.org/11235.html). This type of memory is made available to the page allocator by efi_free_boot_services(). Plain vanilla PCI quirks run much later, in subsys initcall level. In-between a time window would be open for memory corruption. Random crashes occurring in this time window and attributed to DMAed packets have indeed been observed in the wild by Chris Bainbridge. When Matthew Garrett analyzed the memory corruption issue in 2012, he sought to fix it with a grub quirk which transitions the card to D3hot: http://git.savannah.gnu.org/cgit/grub.git/commit/?id=9d34bb85da56 This approach does not help users with other bootloaders and while it may prevent DMAed packets, it does not cure the spurious interrupts emanating from the card. Unfortunately the card's mmio space is inaccessible in D3hot, so to reset it, we have to undo the effect of Matthew's grub patch and transition the card back to D0. Note that the quirk takes a few shortcuts to reduce the amount of code: The size of BAR 0 and the location of the PM capability is identical on all affected machines and therefore hardcoded. Only the address of BAR 0 differs between models. Also, it is assumed that the BCMA core currently mapped is the 802.11 core. The EFI driver seems to always take care of this. Michael Büsch, Bjorn Helgaas and Matt Fleming contributed feedback towards finding the best solution to this problem. The following should be a comprehensive list of affected models: iMac13,1 2012 21.5" [Root Port 00:1c.3 = 8086:1e16] iMac13,2 2012 27" [Root Port 00:1c.3 = 8086:1e16] Macmini5,1 2011 i5 2.3 GHz [Root Port 00:1c.1 = 8086:1c12] Macmini5,2 2011 i5 2.5 GHz [Root Port 00:1c.1 = 8086:1c12] Macmini5,3 2011 i7 2.0 GHz [Root Port 00:1c.1 = 8086:1c12] Macmini6,1 2012 i5 2.5 GHz [Root Port 00:1c.1 = 8086:1e12] Macmini6,2 2012 i7 2.3 GHz [Root Port 00:1c.1 = 8086:1e12] MacBookPro8,1 2011 13" [Root Port 00:1c.1 = 8086:1c12] MacBookPro8,2 2011 15" [Root Port 00:1c.1 = 8086:1c12] MacBookPro8,3 2011 17" [Root Port 00:1c.1 = 8086:1c12] MacBookPro9,1 2012 15" [Root Port 00:1c.1 = 8086:1e12] MacBookPro9,2 2012 13" [Root Port 00:1c.1 = 8086:1e12] MacBookPro10,1 2012 15" [Root Port 00:1c.1 = 8086:1e12] MacBookPro10,2 2012 13" [Root Port 00:1c.1 = 8086:1e12] For posterity, spurious interrupts caused by the Broadcom 4331 wireless card resulted in splats like this (stacktrace omitted): irq 17: nobody cared (try booting with the "irqpoll" option) handlers: [<ffffffff81374370>] pcie_isr [<ffffffffc0704550>] sdhci_irq [sdhci] threaded [<ffffffffc07013c0>] sdhci_thread_irq [sdhci] [<ffffffffc0a0b960>] azx_interrupt [snd_hda_codec] Disabling IRQ #17 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79301 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111781 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=728916 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=895951#c16 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009819 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1098621 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1149632#c5 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1279130 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1332732 Tested-by: Konstantin Simanov <k.simanov@stlk.ru> # [MacBookPro8,1] Tested-by: Lukas Wunner <lukas@wunner.de> # [MacBookPro9,1] Tested-by: Bryan Paradis <bryan.paradis@gmail.com> # [MacBookPro9,2] Tested-by: Andrew Worsley <amworsley@gmail.com> # [MacBookPro10,1] Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> # [MacBookPro10,2] Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Rafał Miłecki <zajec5@gmail.com> Acked-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chris Milsted <cmilsted@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Michael Buesch <m@bues.ch> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yinghai Lu <yinghai@kernel.org> Cc: b43-dev@lists.infradead.org Cc: linux-pci@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: stable@vger.kernel.org Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Apply nvidia_bugs quirk only on root bus Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Reintroduce scanning of secondary buses Link: http://lkml.kernel.org/r/48d0972ac82a53d460e5fce77a07b2560db95203.1465690253.git.lukas@wunner.de [ Did minor readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>