summaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)Author
2017-01-24x86/ras, EDAC, acpi: Assign MCE notifier handlers a priorityBorislav Petkov
Assign all notifiers on the MCE decode chain a priority so that they get called in the correct order. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24x86/ras: Get rid of mce_process_work()Borislav Petkov
Make mce_gen_pool_process() the workqueue function directly and save us an indirection. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-9-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24x86/ras: Flip the TSC-adding logicBorislav Petkov
Add the TSC value to the MCE record only when the MCE being logged is precise, i.e., it is logged as an exception or an MCE-related interrupt. So it doesn't look particularly easy to do without touching/changing a bunch of places. That's why I'm trying tricks first. For example, the mce-apei.c case I'm addressing by setting ->tsc only for errors of panic severity. The idea there is, that, panic errors will have raised an #MC and not polled. And then instead of propagating a flag to mce_setup(), it seems easier/less code to set ->tsc depending on the call sites, i.e., are we polling or are we preparing an MCE record in an exception handler/thresholding interrupt. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-5-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24x86/ras/amd: Make sysfs names of banks more user-friendlyYazen Ghannam
Currently, we append the MCA_IPID[InstanceId] to the bank name to create the sysfs filename. The InstanceId field uniquely identifies a bank instance but it doesn't look very nice for most banks. Replace the InstanceId with a simpler, ascending (0, 1, ..) value. Only use this in the sysfs name when there is more than 1 instance. Otherwise, just use the bank's name as the sysfs name. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1484322741-41884-3-git-send-email-Yazen.Ghannam@amd.com Link: http://lkml.kernel.org/r/20170123183514.13356-4-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24x86/ras/therm_throt: Do not log a fake MCE for thermal eventsBorislav Petkov
We log a fake bank 128 MCE to note that we're handling a CPU thermal event. However, this confuses people into thinking that their hardware generates MCEs. Hijacking MCA for logging thermal events is a gross misuse anyway and it shouldn't have been done in the first place. And besides we have other means for dealing with thermal events which are much more suitable. So let's kill the MCE logging part. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Ashok Raj <ashok.raj@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170105213846.GA12024@gmail.com Link: http://lkml.kernel.org/r/20170123183514.13356-3-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24x86/ras/inject: Make it depend on X86_LOCAL_APIC=yBorislav Petkov
... and get rid of the annoying: arch/x86/kernel/cpu/mcheck/mce-inject.c:97:13: warning: ‘mce_irq_ipi’ defined but not used [-Wunused-function] when doing randconfig builds. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170123183514.13356-2-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-24x86/fpu/xstate: Fix xcomp_bv in XSAVES headerYu-cheng Yu
The compacted-format XSAVES area is determined at boot time and never changed after. The field xsave.header.xcomp_bv indicates which components are in the fixed XSAVES format. In fpstate_init() we did not set xcomp_bv to reflect the XSAVES format since at the time there is no valid data. However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(), as in commit: b22cbe404a9c x86/fpu: Fix invalid FPU ptrace state after execve() and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode app, a #GP occurs. This can be easily triggered by doing valgrind on a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and others: https://bugzilla.kernel.org/show_bug.cgi?id=190061 Fix it by setting xcomp_bv correctly. This patch also moves the xcomp_bv initialization to the proper place, which was in copyin_to_xsaves() as of: 4c833368f0bf x86/fpu: Set the xcomp_bv when we fake up a XSAVES area which fixed the bug too, but it's more efficient and cleaner to initialize things once per boot, not for every signal handling operation. Reported-by: Kevin Hao <haokexin@gmail.com> Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: haokexin@gmail.com Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com [ Combined it with 4c833368f0bf. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-23x86/fpu: Set the xcomp_bv when we fake up a XSAVES areaKevin Hao
I got the following calltrace on a Apollo Lake SoC with 32-bit kernel: WARNING: CPU: 2 PID: 261 at arch/x86/include/asm/fpu/internal.h:363 fpu__restore+0x1f5/0x260 [...] Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS APLIRVPA.X64.0138.B35.1608091058 08/09/2016 Call Trace: dump_stack() __warn() ? fpu__restore() warn_slowpath_null() fpu__restore() __fpu__restore_sig() fpu__restore_sig() restore_sigcontext.isra.9() sys_sigreturn() do_int80_syscall_32() entry_INT80_32() The reason is that a #GP occurs when executing XRSTORS. The root cause is that we forget to set the xcomp_bv when we fake up the XSAVES area in the copyin_to_xsaves() function. Signed-off-by: Kevin Hao <haokexin@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Link: http://lkml.kernel.org/r/1485075023-30161-1-git-send-email-haokexin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Remove struct cont_desc.eq_idBorislav Petkov
The equivalence ID was needed outside of the container scanning logic but now, after this has been cleaned up, not anymore. Now, cont_desc.mc is used to denote whether the container we're looking at has the proper microcode patch for this CPU or not. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170120202955.4091-17-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Remove AP scanning optimizationBorislav Petkov
The idea was to not scan the microcode blob on each AP (Application Processor) during boot and thus save us some milliseconds. However, on architectures where the microcode engine is shared between threads, this doesn't work. Here's why: The microcode on CPU0, i.e., the first thread, gets updated. The second thread, i.e., CPU1, i.e., the first AP walks into load_ucode_amd_ap(), sees that there's no container cached and goes and scans for the proper blob. It finds it and as a last step of apply_microcode_early_amd(), it tries to apply the patch but that core has already the updated microcode revision which it has received through CPU0's update. So it returns false and we do desc->size = -1 to prevent other APs from scanning. However, the next AP, CPU2, has a different microcode engine which hasn't been updated yet. The desc->size == -1 test prevents it from scanning the blob anew and we fail to update it. The fix is much more straight-forward than it looks: the BSP (BootStrapping Processor), i.e., CPU0, caches the microcode patch in amd_ucode_patch. We use that on the AP and try to apply it. In the 99.9999% of cases where we have homogeneous cores - *not* mixed-steppings - the application will be successful and we're good to go. In the remaining small set of systems, we will simply rescan the blob and find (or not, if none present) the proper patch and apply it then. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-16-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Simplify saving from initrdBorislav Petkov
No need to use the previously stashed info in the container - simply go ahead and parse the initrd once more. It simplifies and streamlines the code a whole lot. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-15-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Unify load_ucode_amd_ap()Borislav Petkov
Use a version for both bitness by adding a helper which does the actual container finding and parsing which can be used on any CPU - BSP or AP. Streamlines the paths more. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170120202955.4091-14-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Check patch level only on the BSPBorislav Petkov
Check final patch levels for AMD only on the BSP. This way, we decide early and only once whether to continue loading or to leave the loader disabled on such systems. Simplify a lot. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170120202955.4091-13-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode: Remove local vendor variableBorislav Petkov
Use x86_cpuid_vendor() directly. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170120202955.4091-12-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Use find_microcode_in_initrd()Borislav Petkov
Use the generic helper instead of semi-open-coding the procedure. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-11-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Get rid of global this_equiv_idBorislav Petkov
We have a container which we update/prepare each time before applying a microcode patch instead of using a global. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-10-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode: Decrease CPUID useBorislav Petkov
Get CPUID(1).EAX value once per CPU and propagate value into the callers instead of conveniently calling it every time. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-9-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Rework container parsingBorislav Petkov
It was pretty clumsy before and the whole work of parsing the microcode containers was spread around the functions wrongly. Clean it up so that there's a main scan_containers() function which iterates over the microcode blob and picks apart the containers glued together. For each container, it calls a parse_container() helper which concentrates on one container only: sanity-checking, parsing, counting microcode patches in there, etc. It makes much more sense now and it is actually very readable. Oh, and we luvz a diffstat removing more crap than adding. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170120202955.4091-8-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Extend the container structBorislav Petkov
Make it into a container descriptor which is being passed around and stores important info like the matching container and the patch for the current CPU. Make it static too. Later patches will use this and thus get rid of a double container parsing. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-7-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Shorten function parameter's nameBorislav Petkov
The whole driver calls this "mc", do that here too. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-6-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/AMD: Clean up find_equiv_id()Borislav Petkov
No need to have it marked "inline" - let gcc decide. Also, shorten the argument name and simplify while-test. While at it, make it into a proper for-loop and simplify it even more, as tglx suggests. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170120202955.4091-5-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/microcode/intel: Drop stashed AP patch pointer optimizationBorislav Petkov
This was meant to save us the scanning of the microcode containter in the initrd since the first AP had already done that but it can also hurt us: Imagine a single hyperthreaded CPU (Intel(R) Atom(TM) CPU N270, for example) which updates the microcode on the BSP but since the microcode engine is shared between the two threads, the update on CPU1 doesn't happen because it has already happened on CPU0 and we don't find a newer microcode revision on CPU1. Which doesn't set the intel_ucode_patch pointer and at initrd jettisoning time we don't save the microcode patch for later application. Now, when we suspend to RAM, the loaded microcode gets cleared so we need to reload but there's no patch saved in the cache. Removing the optimization fixes this issue and all is fine and dandy. Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading") Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170120202955.4091-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-23x86/crash: Update the stale comment in reserve_crashkernel()Xunlei Pang
CRASH_KERNEL_ADDR_MAX has been missing for a long time, update it with a more detailed explanation. Signed-off-by: Xunlei Pang <xlpang@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Young <dyoung@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert LeBlanc <robert@leblancnet.us> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/1485154103-18426-1-git-send-email-xlpang@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-20Drivers: hv: vmbus: Move the extracting of Hypervisor version informationK. Y. Srinivasan
As part of the effort to separate out architecture specific code, extract hypervisor version information in an architecture specific file. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-20Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource codeK. Y. Srinivasan
As part of the effort to separate out architecture specific code, consolidate all Hyper-V specific clocksource code to an architecture specific code. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-20x86/ioapic: Return suitable error code in mp_map_gsi_to_irq()Andy Shevchenko
mp_map_gsi_to_irq() in some cases might return legacy -1, which would be wrongly interpreted as -EPERM. Correct those cases to return proper error code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: http://lkml.kernel.org/r/20170119192425.189899-2-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-20sched/clock: Fix hotplug crashPeter Zijlstra
Mike reported that he could trigger the WARN_ON_ONCE() in set_sched_clock_stable() using hotplug. This exposed a fundamental problem with the interface, we should never mark the TSC stable if we ever find it to be unstable. Therefore set_sched_clock_stable() is a broken interface. The reason it existed is that not having it is a pain, it means all relevant architecture code needs to call clear_sched_clock_stable() where appropriate. Of the three architectures that select HAVE_UNSTABLE_SCHED_CLOCK ia64 and parisc are trivial in that they never called set_sched_clock_stable(), so add an unconditional call to clear_sched_clock_stable() to them. For x86 the story is a lot more involved, and what this patch tries to do is ensure we preserve the status quo. So even is Cyrix or Transmeta have usable TSC they never called set_sched_clock_stable() so they now get an explicit mark unstable. Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 9881b024b7d7 ("sched/clock: Delay switching sched_clock to stable") Link: http://lkml.kernel.org/r/20170119133633.GB6536@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-19Drivers: hv vmbus: Move Hypercall page setup out of common codeK. Y. Srinivasan
As part of the effort to separate out architecture specific code, move the hypercall page setup to an architecture specific file. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-19sched/x86: Remove unnecessary TBM3 check to update topologyTim Chen
Scheduling to the max performance core is enabled by default for Turbo Boost Maxt Technology 3.0 capable platforms. Remove the useless sysctl_sched_itmt_enabled check to update sched topology for adding the prioritized core scheduling flag. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@suse.de Cc: jolsa@redhat.com Cc: linux-acpi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/1484778629-4404-1-git-send-email-tim.c.chen@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-18x86/ioapic: Restore IO-APIC irq_chip retrigger callbackRuslan Ruslichenko
commit d32932d02e18 removed the irq_retrigger callback from the IO-APIC chip and did not add it to the new IO-APIC-IR irq chip. Unfortunately the software resend fallback is not enabled on X86, so edge interrupts which are received during the lazy disabled state of the interrupt line are not retriggered and therefor lost. Restore the callbacks. [ tglx: Massaged changelog ] Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces") Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com> Cc: xe-linux-external@cisco.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rruslich@cisco.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-18x86/ioapic: Restore IO-APIC irq_chip retrigger callbackRuslan Ruslichenko
commit d32932d02e18 removed the irq_retrigger callback from the IO-APIC chip and did not add it to the new IO-APIC-IR irq chip. There is no harm because the interrupts are resent in software when the retrigger callback is NULL, but it's less efficient. So restore them. [ tglx: Massaged changelog ] Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces") Signed-off-by: Ruslan Ruslichenko <rruslich@cisco.com> Cc: xe-linux-external@cisco.com Link: http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rruslich@cisco.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-16x86/cpufeature: Add AVX512_VPOPCNTDQ featurePiotr Luc
Vector population count instructions for dwords and qwords are going to be available in future Intel Xeon & Xeon Phi processors. Bit 14 of CPUID[level:0x07, ECX] indicates that the instructions are supported by a processor. The specification can be found in the Intel Software Developer Manual (SDM) and in the Instruction Set Extensions Programming Reference (ISE). Populate the feature bit and clear it when xsave is disabled. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Link: http://lkml.kernel.org/r/20170110173403.6010-2-piotr.luc@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-15Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - unwinder fixes - AMD CPU topology enumeration fixes - microcode loader fixes - x86 embedded platform fixes - fix for a bootup crash that may trigger when clearcpuid= is used with invalid values" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mpx: Use compatible types in comparison to fix sparse error x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc() x86/entry: Fix the end of the stack for newly forked tasks x86/unwind: Include __schedule() in stack traces x86/unwind: Disable KASAN checks for non-current tasks x86/unwind: Silence warnings for non-current tasks x86/microcode/intel: Use correct buffer size for saving microcode data x86/microcode/intel: Fix allocation size of struct ucode_patch x86/microcode/intel: Add a helper which gives the microcode revision x86/microcode: Use native CPUID to tickle out microcode revision x86/CPU: Add native CPUID variants returning a single datum x86/boot: Add missing declaration of string functions x86/CPU/AMD: Fix Bulldozer topology x86/platform/intel-mid: Rename 'spidev' to 'mrfld_spidev' x86/cpu: Fix typo in the comment for Anniedale x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option
2017-01-14sched/clock, clocksource: Add optional cs::mark_unstable() methodThomas Gleixner
PeterZ reported that we'd fail to mark the TSC unstable when the clocksource watchdog finds it unsuitable. Allow a clocksource to run a custom action when its being marked unstable and hook up the TSC unstable code. Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14locking/spinlocks/x86, paravirt: Remove paravirt_ticketlocks_enabledWaiman Long
This is a follow-up of commit: cfd8983f03c7b2 ("x86, locking/spinlocks: Remove ticket (spin)lock implementation") The static_key structure 'paravirt_ticketlocks_enabled' is now removed as it is no longer used. As a result, the init functions kvm_spinlock_init_jump() and xen_init_spinlocks_jump() are also removed. A simple build and boot test was done to verify it. Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1484252878-1962-1-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc()Len Brown
The Intel Denverton microserver uses a 25 MHz TSC crystal, so we can derive its exact [*] TSC frequency using CPUID and some arithmetic, eg.: TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000) [*] 'exact' is only as good as the crystal, which should be +/- 20ppm Signed-off-by: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/306899f94804aece6d8fa8b4223ede3b48dbb59c.1484287748.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14x86/platform/UV: Fix 2 socket config problemMike Travis
A UV4 chassis with only 2 sockets configured can unexpectedly target the wrong UV hub. Fix the problem by limiting the minimum size of a partition to 4 sockets even if only 2 are configured. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Russ Anderson <rja@hpe.com> Acked-by: Dimitri Sivanich <sivanich@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170113152111.313888353@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14x86/platform/UV: Fix panic with missing UVsystab supportMike Travis
Fix the panic where KEXEC'd kernel does not have access to EFI runtime mappings. This may cause the extended UVsystab to not be available. The solution is to revert to non-UV mode and continue with limited capabilities. Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Russ Anderson <rja@hpe.com> Reviewed-by: Alex Thorlton <athorlton@sgi.com> Acked-by: Dimitri Sivanich <sivanich@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170113152111.118886202@asylum.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12locking/jump_labels: Update bug_at() boot messageAndy Shevchenko
First of all, %*ph specifier allows to dump data in hex format using the pointer to a buffer. This is suitable to use here. Besides that Thomas suggested to move it to critical level and replace __FILE__ by explicit mention of "jumplabel". Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170110164354.47372-1-andriy.shevchenko@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12x86/e820/32: Fix e820_search_gap() error handling on x86-32Arnd Bergmann
GCC correctly points out that on 32-bit kernels, e820_search_gap() not finding a start now leads to pci_mem_start ('gapstart') being set to an uninitialized value: arch/x86/kernel/e820.c: In function 'e820_setup_gap': arch/x86/kernel/e820.c:641:16: error: 'gapstart' may be used uninitialized in this function [-Werror=maybe-uninitialized] This restores the behavior from before this cleanup: b4ed1d15b453 ("x86/e820: Make e820_search_gap() static and remove unused variables") ... defaulting to address 0x10000000 if nothing was found. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Wei Yang <richard.weiyang@gmail.com> Fixes: b4ed1d15b453 ("x86/e820: Make e820_search_gap() static and remove unused variables") Link: http://lkml.kernel.org/r/20170111144926.695369-1-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12x86/unwind: Disable KASAN checks for non-current tasksJosh Poimboeuf
There are a handful of callers to save_stack_trace_tsk() and show_stack() which try to unwind the stack of a task other than current. In such cases, it's remotely possible that the task is running on one CPU while the unwinder is reading its stack from another CPU, causing the unwinder to see stack corruption. These cases seem to be mostly harmless. The unwinder has checks which prevent it from following bad pointers beyond the bounds of the stack. So it's not really a bug as long as the caller understands that unwinding another task will not always succeed. In such cases, it's possible that the unwinder may read a KASAN-poisoned region of the stack. Account for that by using READ_ONCE_NOCHECK() when reading the stack of another task. Use READ_ONCE() when reading the stack of the current task, since KASAN warnings can still be useful for finding bugs in that case. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12x86/unwind: Silence warnings for non-current tasksJosh Poimboeuf
There are a handful of callers to save_stack_trace_tsk() and show_stack() which try to unwind the stack of a task other than current. In such cases, it's remotely possible that the task is running on one CPU while the unwinder is reading its stack from another CPU, causing the unwinder to see stack corruption. These cases seem to be mostly harmless. The unwinder has checks which prevent it from following bad pointers beyond the bounds of the stack. So it's not really a bug as long as the caller understands that unwinding another task will not always succeed. Since stack "corruption" on another task's stack isn't necessarily a bug, silence the warnings when unwinding tasks other than current. Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-09x86/microcode/intel: Use correct buffer size for saving microcode dataJunichi Nomura
In generic_load_microcode(), curr_mc_size is the size of the last allocated buffer and since we have this performance "optimization" there to vmalloc a new buffer only when the current one is bigger, curr_mc_size ends up becoming the size of the biggest buffer we've seen so far. However, we end up saving the microcode patch which matches our CPU and its size is not curr_mc_size but the respective mc_size during the iteration while we're staring at it. So save that mc_size into a separate variable and use it to store the previously found microcode buffer. Without this fix, we could get oops like this: BUG: unable to handle kernel paging request at ffffc9000e30f000 IP: __memcpy+0x12/0x20 ... Call Trace: ? kmemdup+0x43/0x60 __alloc_microcode_buf+0x44/0x70 save_microcode_patch+0xd4/0x150 generic_load_microcode+0x1b8/0x260 request_microcode_user+0x15/0x20 microcode_write+0x91/0x100 __vfs_write+0x34/0x120 vfs_write+0xc1/0x130 SyS_write+0x56/0xc0 do_syscall_64+0x6c/0x160 entry_SYSCALL64_slow_path+0x25/0x25 Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading") Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/4f33cbfd-44f2-9bed-3b66-7446cd14256f@ce.jp.nec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-09x86/microcode/intel: Fix allocation size of struct ucode_patchJunichi Nomura
We allocate struct ucode_patch here. @size is the size of microcode data and used for kmemdup() later in this function. Fixes: 06b8534cb728 ("x86/microcode: Rework microcode loading") Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/7a730dc9-ac17-35c4-fe76-dfc94e5ecd95@ce.jp.nec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-09x86/microcode/intel: Add a helper which gives the microcode revisionBorislav Petkov
Since on Intel we're required to do CPUID(1) first, before reading the microcode revision MSR, let's add a special helper which does the required steps so that we don't forget to do them next time, when we want to read the microcode revision. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170109114147.5082-4-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-09x86/microcode: Use native CPUID to tickle out microcode revisionBorislav Petkov
Intel supplies the microcode revision value in MSR 0x8b (IA32_BIOS_SIGN_ID) after CPUID(1) has been executed. Execute it each time before reading that MSR. It used to do sync_core() which did do CPUID but c198b121b1a1 ("x86/asm: Rewrite sync_core() to use IRET-to-self") changed the sync_core() implementation so we better make the microcode loading case explicit, as the SDM documents it. Reported-and-tested-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170109114147.5082-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-09x86/apic: Implement set_state_oneshot_stopped() callbackFrederic Weisbecker
When clock_event_device::set_state_oneshot_stopped() is not implemented, hrtimer_cancel() can't stop the clock when there is no more timer in the queue. So the ghost of the freshly cancelled hrtimer haunts us back later with an extra interrupt: <idle>-0 [002] d..2 2248.557659: hrtimer_cancel: hrtimer=ffff88021fa92d80 <idle>-0 [002] d.h1 2249.303659: local_timer_entry: vector=239 So let's implement this missing callback for the lapic clock. This consist in calling its set_state_shutdown() callback. There don't seem to be a lighter way to stop the clock. Simply writing 0 to APIC_TMICT won't be enough to stop the clock and avoid the extra interrupt, as opposed to what is specified in the specs. We must also mask the timer interrupt in the device. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/1483029949-6925-1-git-send-email-fweisbec@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-06Merge branch 'stable/for-linus-4.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fixes from Konrad Rzeszutek Wilk: "This has one fix to make i915 work when using Xen SWIOTLB, and a feature from Geert to aid in debugging of devices that can't do DMA outside the 32-bit address space. The feature from Geert is on top of v4.10 merge window commit (specifically you pulling my previous branch), as his changes were dependent on the Documentation/ movement patches. I figured it would just easier than me trying than to cherry-pick the Documentation patches to satisfy git. The patches have been soaking since 12/20, albeit I updated the last patch due to linux-next catching an compiler error and adding an Tested-and-Reported-by tag" * 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: Export swiotlb_max_segment to users swiotlb: Add swiotlb=noforce debug option swiotlb: Convert swiotlb_force from int to enum x86, swiotlb: Simplify pci_swiotlb_detect_override()
2017-01-06x86/apic: Fix typos in commentsDou Liyang
s/ID/IDs/ s/inr_logical_cpuidi/nr_logical_cpuids/ s/generic_processor_info()/__generic_processor_info()/ Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1483610083-24314-1-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-06x86/boot/32: Convert the 32-bit pgtable setup code from assembly to CBoris Ostrovsky
The new Xen PVH entry point requires page tables to be setup by the kernel since it is entered with paging disabled. Pull the common code out of head_32.S so that mk_early_pgtbl_32() can be invoked from both the new Xen entry point and the existing startup_32() code. Convert resulting common code to C. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: matt@codeblueprint.co.uk Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1481215471-9639-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>