summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/mce/threshold.c
AgeCommit message (Collapse)Author
2024-12-30x86/mce/threshold: Remove the redundant this_cpu_dec_return()Qiuxu Zhuo
The 'storm' variable points to this_cpu_ptr(&storm_desc). Access the 'stormy_bank_count' field through the 'storm' to avoid calling this_cpu_*() on the same per-CPU variable twice. This minor optimization reduces the text size by 16 bytes. $ size threshold.o.* text data bss dec hex filename 1395 1664 0 3059 bf3 threshold.o.old 1379 1664 0 3043 be3 threshold.o.new No functional changes intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20241212140103.66964-3-qiuxu.zhuo@intel.com
2023-12-15x86/mce: Handle Intel threshold interrupt stormsTony Luck
Add an Intel specific hook into machine_check_poll() to keep track of per-CPU, per-bank corrected error logs (with a stub for the CONFIG_MCE_INTEL=n case). When a storm is observed the rate of interrupts is reduced by setting a large threshold value for this bank in IA32_MCi_CTL2. This bank is added to the bitmap of banks for this CPU to poll. The polling rate is increased to once per second. When a storm ends reset the threshold in IA32_MCi_CTL2 back to 1, remove the bank from the bitmap for polling, and change the polling rate back to the default. If a CPU with banks in storm mode is taken offline, the new CPU that inherits ownership of those banks takes over management of storm(s) in the inherited bank(s). The cmci_discover() function was already very large. These changes pushed it well over the top. Refactor with three helper functions to bring it back under control. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231115195450.12963-4-tony.luck@intel.com
2023-12-15x86/mce: Add per-bank CMCI storm mitigationTony Luck
This is the core functionality to track CMCI storms at the machine check bank granularity. Subsequent patches will add the vendor specific hooks to supply input to the storm detection and take actions on the start/end of a storm. machine_check_poll() is called both by the CMCI interrupt code, and for periodic polls from a timer. Add a hook in this routine to maintain a bitmap history for each bank showing whether the bank logged an corrected error or not each time it is polled. In normal operation the interval between polls of these banks determines how far to shift the history. The 64 bit width corresponds to about one second. When a storm is observed a CPU vendor specific action is taken to reduce or stop CMCI from the bank that is the source of the storm. The bank is added to the bitmap of banks for this CPU to poll. The polling rate is increased to once per second. During a storm each bit in the history indicates the status of the bank each time it is polled. Thus the history covers just over a minute. Declare a storm for that bank if the number of corrected interrupts seen in that history is above some threshold (defined as 5 in this series, could be tuned later if there is data to suggest a better value). A storm on a bank ends if enough consecutive polls of the bank show no corrected errors (defined as 30, may also change). That calls the CPU vendor specific function to revert to normal operational mode, and changes the polling rate back to the default. [ bp: Massage. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231115195450.12963-3-tony.luck@intel.com
2023-08-09x86/apic: Nuke ack_APIC_irq()Dave Hansen
Yet another wrapper of a wrapper gone along with the outdated comment that this compiles to a single instruction. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
2020-06-11x86/entry: Convert various system vectorsThomas Gleixner
Convert various system vectors to IDTENTRY_SYSVEC: - Implement the C entry point with DEFINE_IDTENTRY_SYSVEC - Emit the ASM stub with DECLARE_IDTENTRY_SYSVEC - Remove the ASM idtentries in 64-bit - Remove the BUILD_INTERRUPT entries in 32-bit - Remove the old prototypes No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20200521202119.464812973@linutronix.de
2018-12-26Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Remove trampoline_handler() prototype x86/kernel: Fix more -Wmissing-prototypes warnings x86: Fix various typos in comments x86/headers: Fix -Wmissing-prototypes warning x86/process: Avoid unnecessary NULL check in get_wchan() x86/traps: Complete prototype declarations x86/mce: Fix -Wmissing-prototypes warnings x86/gart: Rewrite early_gart_iommu_check() comment
2018-12-06x86/mce: Unify pr_* prefixBorislav Petkov
Move the pr_fmt prefix to internal.h and include it everywhere. This way, all pr_* printed strings will be prepended with "mce: ". No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20181205200913.GR29510@zn.tnic
2018-12-05x86/mce: Streamline MCE subsystem's namingBorislav Petkov
Rename the containing folder to "mce" which is the most widespread name. Drop the "mce[-_]" filename prefix of some compilation units (while others don't have it). This unifies the file naming in the MCE subsystem: mce/ |-- amd.c |-- apei.c |-- core.c |-- dev-mcelog.c |-- genpool.c |-- inject.c |-- intel.c |-- internal.h |-- Makefile |-- p5.c |-- severity.c |-- therm_throt.c |-- threshold.c `-- winchip.c No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20181205141323.14995-1-bp@alien8.de