summaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)Author
2020-10-12Merge tag 'x86-mm-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "Do not sync vmalloc/ioremap mappings on x86-64 kernels. Hopefully now without the bugs!" * tag 'x86-mm-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/64: Update comment in preallocate_vmalloc_pages() x86/mm/64: Do not sync vmalloc/ioremap mappings
2020-10-12Merge tag 'perf-core-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "x86 Intel updates: - Add Jasper Lake support - Add support for TopDown metrics on Ice Lake - Fix Ice Lake & Tiger Lake uncore support, add Snow Ridge support - Add a PCI sub driver to support uncore PMUs where the PCI resources have been claimed already - extending the range of supported systems. x86 AMD updates: - Restore 'perf stat -a' behaviour to program the uncore PMU to count all CPU threads. - Fix setting the proper count when sampling Large Increment per Cycle events / 'paired' events. - Fix IBS Fetch sampling on F17h and some other IBS fine tuning, greatly reducing the number of interrupts when large sample periods are specified. - Extends Family 17h RAPL support to also work on compatible F19h machines. Core code updates: - Fix race in perf_mmap_close() - Add PERF_EV_CAP_SIBLING, to denote that sibling events should be closed if the leader is removed. - Smaller fixes and updates" * tag 'perf-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) perf/core: Fix race in the perf_mmap_close() function perf/x86: Fix n_metric for cancelled txn perf/x86: Fix n_pair for cancelled txn x86/events/amd/iommu: Fix sizeof mismatch perf/x86/intel: Check perf metrics feature for each CPU perf/x86/intel: Fix Ice Lake event constraint table perf/x86/intel/uncore: Fix the scale of the IMC free-running events perf/x86/intel/uncore: Fix for iio mapping on Skylake Server perf/x86/msr: Add Jasper Lake support perf/x86/intel: Add Jasper Lake support perf/x86/intel/uncore: Reduce the number of CBOX counters perf/x86/intel/uncore: Update Ice Lake uncore units perf/x86/intel/uncore: Split the Ice Lake and Tiger Lake MSR uncore support perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge perf/x86/intel/uncore: Generic support for the PCI sub driver perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister() perf/x86/intel/uncore: Factor out uncore_pci_pmu_register() perf/x86/intel/uncore: Factor out uncore_pci_find_dev_pmu() perf/x86/intel/uncore: Factor out uncore_pci_get_dev_die_info() perf/amd/uncore: Inform the user how many counters each uncore PMU has ...
2020-10-12Merge tag 'core-static_call-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull static call support from Ingo Molnar: "This introduces static_call(), which is the idea of static_branch() applied to indirect function calls. Remove a data load (indirection) by modifying the text. They give the flexibility of function pointers, but with better performance. (This is especially important for cases where retpolines would otherwise be used, as retpolines can be pretty slow.) API overview: DECLARE_STATIC_CALL(name, func); DEFINE_STATIC_CALL(name, func); DEFINE_STATIC_CALL_NULL(name, typename); static_call(name)(args...); static_call_cond(name)(args...); static_call_update(name, func); x86 is supported via text patching, otherwise basic indirect calls are used, with function pointers. There's a second variant using inline code patching, inspired by jump-labels, implemented on x86 as well. The new APIs are utilized in the x86 perf code, a heavy user of function pointers, where static calls speed up the PMU handler by 4.2% (!). The generic implementation is not really excercised on other architectures, outside of the trivial test_static_call_init() self-test" * tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) static_call: Fix return type of static_call_init tracepoint: Fix out of sync data passing by static caller tracepoint: Fix overly long tracepoint names x86/perf, static_call: Optimize x86_pmu methods tracepoint: Optimize using static_call() static_call: Allow early init static_call: Add some validation static_call: Handle tail-calls static_call: Add static_call_cond() x86/alternatives: Teach text_poke_bp() to emulate RET static_call: Add simple self-test for static calls x86/static_call: Add inline static call implementation for x86-64 x86/static_call: Add out-of-line static call implementation static_call: Avoid kprobes on inline static_call()s static_call: Add inline static call infrastructure static_call: Add basic static call infrastructure compiler.h: Make __ADDRESSABLE() symbol truly unique jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved() module: Properly propagate MODULE_STATE_COMING failure module: Fix up module_notifier return values ...
2020-10-12Merge tag 'core-build-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull orphan section checking from Ingo Molnar: "Orphan link sections were a long-standing source of obscure bugs, because the heuristics that various linkers & compilers use to handle them (include these bits into the output image vs discarding them silently) are both highly idiosyncratic and also version dependent. Instead of this historically problematic mess, this tree by Kees Cook (et al) adds build time asserts and build time warnings if there's any orphan section in the kernel or if a section is not sized as expected. And because we relied on so many silent assumptions in this area, fix a metric ton of dependencies and some outright bugs related to this, before we can finally enable the checks on the x86, ARM and ARM64 platforms" * tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/boot/compressed: Warn on orphan section placement x86/build: Warn on orphan section placement arm/boot: Warn on orphan section placement arm/build: Warn on orphan section placement arm64/build: Warn on orphan section placement x86/boot/compressed: Add missing debugging sections to output x86/boot/compressed: Remove, discard, or assert for unwanted sections x86/boot/compressed: Reorganize zero-size section asserts x86/build: Add asserts for unwanted sections x86/build: Enforce an empty .got.plt section x86/asm: Avoid generating unused kprobe sections arm/boot: Handle all sections explicitly arm/build: Assert for unwanted sections arm/build: Add missing sections arm/build: Explicitly keep .ARM.attributes sections arm/build: Refactor linker script headers arm64/build: Assert for unwanted sections arm64/build: Add missing DWARF sections arm64/build: Use common DISCARDS in linker script arm64/build: Remove .eh_frame* sections due to unwind tables ...
2020-10-12Merge tag 'x86-entry-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry code updates from Thomas Gleixner: "More consolidation and correctness fixes for the debug exception: - Ensure BTF synchronization under all circumstances - Distangle kernel and user mode #DB further - Get ordering vs. the debug notifier correct to make KGDB work more reliably. - Cleanup historical gunk and make the code simpler to understand" * tag 'x86-entry-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/debug: Change thread.debugreg6 to thread.virtual_dr6 x86/debug: Support negative polarity DR6 bits x86/debug: Simplify hw_breakpoint_handler() x86/debug: Remove aout_dump_debugregs() x86/debug: Remove the historical junk x86/debug: Move cond_local_irq_enable() block into exc_debug_user() x86/debug: Move historical SYSENTER junk into exc_debug_kernel() x86/debug: Simplify #DB signal code x86/debug: Remove handle_debug(.user) argument x86/debug: Move kprobe_debug_handler() into exc_debug_kernel() x86/debug: Sync BTF earlier
2020-10-12Merge tag 'x86-irq-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Thomas Gleixner: "Surgery of the MSI interrupt handling to prepare the support of upcoming devices which require non-PCI based MSI handling: - Cleanup historical leftovers all over the place - Rework the code to utilize more core functionality - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain assignment to PCI devices possible. - Assign irqdomains to PCI devices at initialization time which allows to utilize the full functionality of hierarchical irqdomains. - Remove arch_.*_msi_irq() functions from X86 and utilize the irqdomain which is assigned to the device for interrupt management. - Make the arch_.*_msi_irq() support conditional on a config switch and let the last few users select it" * tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS x86/apic/msi: Unbreak DMAR and HPET MSI iommu/amd: Remove domain search for PCI/MSI iommu/vt-d: Remove domain search for PCI/MSI[X] x86/irq: Make most MSI ops XEN private x86/irq: Cleanup the arch_*_msi_irqs() leftovers PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable x86/pci: Set default irq domain in pcibios_add_device() iommm/amd: Store irq domain in struct device iommm/vt-d: Store irq domain in struct device x86/xen: Wrap XEN MSI management into irqdomain irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() x86/xen: Consolidate XEN-MSI init x86/xen: Rework MSI teardown x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() PCI/MSI: Provide pci_dev_has_special_msi_domain() helper PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI x86/irq: Initialize PCI/MSI domain at PCI init time x86/pci: Reducde #ifdeffery in PCI init code ...
2020-10-12Merge tag 'x86_cache_for_v5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache resource control updates from Borislav Petkov: - Misc cleanups to the resctrl code in preparation for the ARM side (James Morse) - Add support for controlling per-thread memory bandwidth throttling delay values on hw which supports it (Fenghua Yu) * tag 'x86_cache_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Enable user to view thread or core throttling mode x86/resctrl: Enumerate per-thread MBA controls cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps x86/resctrl: Merge AMD/Intel parse_bw() calls x86/resctrl: Add struct rdt_membw::arch_needs_linear to explain AMD/Intel MBA difference x86/resctrl: Use is_closid_match() in more places x86/resctrl: Include pid.h x86/resctrl: Use container_of() in delayed_work handlers x86/resctrl: Fix stale comment x86/resctrl: Remove struct rdt_membw::max_delay x86/resctrl: Remove unused struct mbm_state::chunks_bw
2020-10-12Merge tag 'x86_fsgsbase_for_v5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fsgsbase updates from Borislav Petkov: "Misc minor cleanups and corrections to the fsgsbase code and respective selftests" * tag 'x86_fsgsbase_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86/fsgsbase: Test PTRACE_PEEKUSER for GSBASE with invalid LDT GS selftests/x86/fsgsbase: Reap a forgotten child x86/fsgsbase: Replace static_cpu_has() with boot_cpu_has() x86/entry/64: Correct the comment over SAVE_AND_SET_GSBASE
2020-10-12Merge tag 'x86_pasid_for_5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PASID updates from Borislav Petkov: "Initial support for sharing virtual addresses between the CPU and devices which doesn't need pinning of pages for DMA anymore. Add support for the command submission to devices using new x86 instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support for process address space identifiers (PASIDs) which are referenced by those command submission instructions along with the handling of the PASID state on context switch as another extended state. Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang" * tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction x86/asm: Carve out a generic movdir64b() helper for general usage x86/mmu: Allocate/free a PASID x86/cpufeatures: Mark ENQCMD as disabled when configured out mm: Add a pasid member to struct mm_struct x86/msr-index: Define an IA32_PASID MSR x86/fpu/xstate: Add supervisor PASID state for ENQCMD x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions Documentation/x86: Add documentation for SVA (Shared Virtual Addressing) iommu/vt-d: Change flags type to unsigned int in binding mm drm, iommu: Change type of pasid to u32
2020-10-12Merge tag 'x86_platform_for_v5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Borislav Petkov: - Cleanup different aspects of the UV code and start adding support for the new UV5 class of systems (Mike Travis) - Use a flexible array for a dynamically sized struct uv_rtc_timer_head (Gustavo A. R. Silva) * tag 'x86_platform_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Update Copyrights to conform to HPE standards x86/platform/uv: Update for UV5 NMI MMR changes x86/platform/uv: Update UV5 TSC checking x86/platform/uv: Update node present counting x86/platform/uv: Update UV5 MMR references in UV GRU x86/platform/uv: Adjust GAM MMR references affected by UV5 updates x86/platform/uv: Update MMIOH references based on new UV5 MMRs x86/platform/uv: Add and decode Arch Type in UVsystab x86/platform/uv: Add UV5 direct references x86/platform/uv: Update UV MMRs for UV5 drivers/misc/sgi-xp: Adjust references in UV kernel modules x86/platform/uv: Remove SCIR MMR references for UV systems x86/platform/uv: Remove UV BAU TLB Shootdown Handler x86/uv/time: Use a flexible array in struct uv_rtc_timer_head
2020-10-12Merge tag 'x86_cpu_for_v5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Borislav Petkov: - Add support for hardware-enforced cache coherency on AMD which obviates the need to flush cachelines before changing the PTE encryption bit (Krish Sadhukhan) - Add Centaur initialization support for families >= 7 (Tony W Wang-oc) - Add a feature flag for, and expose TSX suspend load tracking feature to KVM (Cathy Zhang) - Emulate SLDT and STR so that windows programs don't crash on UMIP machines (Brendan Shanks and Ricardo Neri) - Use the new SERIALIZE insn on Intel hardware which supports it (Ricardo Neri) - Misc cleanups and fixes * tag 'x86_cpu_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM: SVM: Don't flush cache if hardware enforces cache coherency across encryption domains x86/mm/pat: Don't flush cache if hardware enforces cache coherency across encryption domnains x86/cpu: Add hardware-enforced cache coherency as a CPUID feature x86/cpu/centaur: Add Centaur family >=7 CPUs initialization support x86/cpu/centaur: Replace two-condition switch-case with an if statement x86/kvm: Expose TSX Suspend Load Tracking feature x86/cpufeatures: Enumerate TSX suspend load address tracking instructions x86/umip: Add emulation/spoofing for SLDT and STR instructions x86/cpu: Fix typos and improve the comments in sync_core() x86/cpu: Use XGETBV and XSETBV mnemonics in fpu/internal.h x86/cpu: Use SERIALIZE in sync_core() when available
2020-10-12Merge tag 'ras_updates_for_v5.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Extend the recovery from MCE in kernel space also to processes which encounter an MCE in kernel space but while copying from user memory by sending them a SIGBUS on return to user space and umapping the faulty memory, by Tony Luck and Youquan Song. - memcpy_mcsafe() rework by splitting the functionality into copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables support for new hardware which can recover from a machine check encountered during a fast string copy and makes that the default and lets the older hardware which does not support that advance recovery, opt in to use the old, fragile, slow variant, by Dan Williams. - New AMD hw enablement, by Yazen Ghannam and Akshay Gupta. - Do not use MSR-tracing accessors in #MC context and flag any fault while accessing MCA architectural MSRs as an architectural violation with the hope that such hw/fw misdesigns are caught early during the hw eval phase and they don't make it into production. - Misc fixes, improvements and cleanups, as always. * tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Allow for copy_mc_fragile symbol checksum to be generated x86/mce: Decode a kernel instruction to determine if it is copying from user x86/mce: Recover from poison found while copying from user space x86/mce: Avoid tail copy when machine check terminated a copy from user x86/mce: Add _ASM_EXTABLE_CPY for copy user access x86/mce: Provide method to find out the type of an exception handler x86/mce: Pass pointer to saved pt_regs to severity calculation routines x86/copy_mc: Introduce copy_mc_enhanced_fast_string() x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list x86/mce: Add Skylake quirk for patrol scrub reported errors RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE() x86/mce: Annotate mce_rd/wrmsrl() with noinstr x86/mce/dev-mcelog: Do not update kflags on AMD systems x86/mce: Stop mce_reign() from re-computing severity for every CPU x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR x86/mce: Increase maximum number of banks to 64 x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check() x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap RAS/CEC: Fix cec_init() prototype
2020-10-08x86/mce: Allow for copy_mc_fragile symbol checksum to be generatedBorislav Petkov
Add asm/mce.h to asm/asm-prototypes.h so that that asm symbol's checksum can be generated in order to support CONFIG_MODVERSIONS with it and fix: WARNING: modpost: EXPORT symbol "copy_mc_fragile" [vmlinux] version \ generation failed, symbol will not be versioned. For reference see: 4efca4ed05cb ("kbuild: modversions for EXPORT_SYMBOL() for asm") 334bb7738764 ("x86/kbuild: enable modversions for symbols exported from asm") Fixes: ec6347bb4339 ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()") Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201007111447.GA23257@zn.tnic
2020-10-07x86/asm: Add an enqcmds() wrapper for the ENQCMDS instructionDave Jiang
Currently, the MOVDIR64B instruction is used to atomically submit 64-byte work descriptors to devices. Although it can encounter errors like device queue full, command not accepted, device not ready, etc when writing to a device MMIO, MOVDIR64B can not report back on errors from the device itself. This means that MOVDIR64B users need to separately interact with a device to see if a descriptor was successfully queued, which slows down device interactions. ENQCMD and ENQCMDS also atomically submit 64-byte work descriptors to devices. But, they *can* report back errors directly from the device, such as if the device was busy, or device not enabled or does not support the command. This immediate feedback from the submission instruction itself reduces the number of interactions with the device and can greatly increase efficiency. ENQCMD can be used at any privilege level, but can effectively only submit work on behalf of the current process. ENQCMDS is a ring0-only instruction and can explicitly specify a process context instead of being tied to the current process or needing to reprogram the IA32_PASID MSR. Use ENQCMDS for work submission within the kernel because a Process Address ID (PASID) is setup to translate the kernel virtual address space. This PASID is provided to ENQCMDS from the descriptor structure submitted to the device and not retrieved from IA32_PASID MSR, which is setup for the current user address space. See Intel Software Developer’s Manual for more information on the instructions. [ bp: - Make operand constraints like movdir64b() because both insns are basically doing the same thing, more or less. - Fixup comments and cleanup. ] Link: https://lkml.kernel.org/r/20200924180041.34056-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20201005151126.657029-3-dave.jiang@intel.com
2020-10-07x86/asm: Carve out a generic movdir64b() helper for general usageDave Jiang
Carve out the MOVDIR64B inline asm primitive into a generic helper so that it can be used by other functions. Move it to special_insns.h and have iosubmit_cmds512() call it. [ bp: Massage commit message. ] Suggested-by: Michael Matz <matz@suse.de> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201005151126.657029-2-dave.jiang@intel.com
2020-10-07x86/mce: Decode a kernel instruction to determine if it is copying from userTony Luck
All instructions copying data between kernel and user memory are tagged with either _ASM_EXTABLE_UA or _ASM_EXTABLE_CPY entries in the exception table. ex_fault_handler_type() returns EX_HANDLER_UACCESS for both of these. Recovery is only possible when the machine check was triggered on a read from user memory. In this case the same strategy for recovery applies as if the user had made the access in ring3. If the fault was in kernel memory while copying to user there is no current recovery plan. For MOV and MOVZ instructions a full decode of the instruction is done to find the source address. For MOVS instructions the source address is in the %rsi register. The function fault_in_kernel_space() determines whether the source address is kernel or user, upgrade it from "static" so it can be used here. Co-developed-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-7-tony.luck@intel.com
2020-10-07x86/mce: Add _ASM_EXTABLE_CPY for copy user accessYouquan Song
_ASM_EXTABLE_UA is a general exception entry to record the exception fixup for all exception spots between kernel and user space access. To enable recovery from machine checks while coping data from user addresses it is necessary to be able to distinguish the places that are looping copying data from those that copy a single byte/word/etc. Add a new macro _ASM_EXTABLE_CPY and use it in place of _ASM_EXTABLE_UA in the copy functions. Record the exception reason number to regs->ax at ex_handler_uaccess which is used to check MCE triggered. The new fixup routine ex_handler_copy() is almost an exact copy of ex_handler_uaccess() The difference is that it sets regs->ax to the trap number. Following patches use this to avoid trying to copy remaining bytes from the tail of the copy and possibly hitting the poison again. New mce.kflags bit MCE_IN_KERNEL_COPYIN will be used by mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space. Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-4-tony.luck@intel.com
2020-10-07x86/mce: Provide method to find out the type of an exception handlerTony Luck
Avoid a proliferation of ex_has_*_handler() functions by having just one function that returns the type of the handler (if any). Drop the __visible attribute for this function. It is not called from assembler so the attribute is not necessary. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201006210910.21062-3-tony.luck@intel.com
2020-10-07x86/platform/uv: Update Copyrights to conform to HPE standardsMike Travis
Add Copyrights to those files that have been updated for UV5 changes. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201005203929.148656-14-mike.travis@hpe.com
2020-10-07x86/platform/uv: Update for UV5 NMI MMR changesMike Travis
The UV NMI MMR addresses and fields moved between UV4 and UV5 necessitating a rewrite of the UV NMI handler. Adjust references to accommodate those changes. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-13-mike.travis@hpe.com
2020-10-07x86/platform/uv: Update UV5 TSC checkingMike Travis
Update check of BIOS TSC sync status to include both possible "invalid" states provided by newer UV5 BIOS. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-12-mike.travis@hpe.com
2020-10-07x86/platform/uv: Add and decode Arch Type in UVsystabMike Travis
When the UV BIOS starts the kernel it passes the UVsystab info struct to the kernel which contains information elements more specific than ACPI, and generally pertinent only to the MMRs. These are read only fields so information is passed one way only. A new field starting with UV5 is the UV architecture type so the ACPI OEM_ID field can be used for other purposes going forward. The UV Arch Type selects the entirety of the MMRs available, with their addresses and fields defined in uv_mmrs.h. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-7-mike.travis@hpe.com
2020-10-07x86/platform/uv: Add UV5 direct referencesMike Travis
Add new references to UV5 (and UVY class) system MMR addresses and fields primarily caused by the expansion from 46 to 52 bits of physical memory address. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-6-mike.travis@hpe.com
2020-10-07x86/platform/uv: Update UV MMRs for UV5Mike Travis
Update UV MMRs in uv_mmrs.h for UV5 based on Verilog output from the UV Hub hardware design files. This is the next UV architecture with a new class (UVY) being defined for 52 bit physical address masks. Uses a bitmask for UV arch identification so a single test can cover multiple versions. Includes other adjustments to match the uv_mmrs.h file to keep from encountering compile errors. New UV5 functionality is added in the patches that follow. [ Fix W=1 build warnings. ] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-5-mike.travis@hpe.com
2020-10-07x86/platform/uv: Remove SCIR MMR references for UV systemsMike Travis
UV class systems no longer use System Controller for monitoring of CPU activity provided by this driver. Other methods have been developed for BIOS and the management controller (BMC). Remove that supporting code. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-3-mike.travis@hpe.com
2020-10-07x86/platform/uv: Remove UV BAU TLB Shootdown HandlerMike Travis
The Broadcast Assist Unit (BAU) TLB shootdown handler is being rewritten to become the UV BAU APIC driver. It is designed to speed up sending IPIs to selective CPUs within the system. Remove the current TLB shutdown handler (tlb_uv.c) file and a couple of kernel hooks in the interim. Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Link: https://lkml.kernel.org/r/20201005203929.148656-2-mike.travis@hpe.com
2020-10-06x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()Dan Williams
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-09-22x86/irq: Make run_on_irqstack_cond() typesafeThomas Gleixner
Sami reported that run_on_irqstack_cond() requires the caller to cast functions to mismatching types, which trips indirect call Control-Flow Integrity (CFI) in Clang. Instead of disabling CFI on that function, provide proper helpers for the three call variants. The actual ASM code stays the same as that is out of reach. [ bp: Fix __run_on_irqstack() prototype to match. ] Fixes: 931b94145981 ("x86/entry: Provide helpers for executing on the irqstack") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Sami Tolvanen <samitolvanen@google.com> Cc: <stable@vger.kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1052 Link: https://lkml.kernel.org/r/87pn6eb5tv.fsf@nanos.tec.linutronix.de
2020-09-20Merge tag 'x86_urgent_for_v5.9_rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - A defconfig fix (Daniel Díaz) - Disable relocation relaxation for the compressed kernel when not built as -pie as in that case kernels built with clang and linked with LLD fail to boot due to the linker optimizing some instructions in non-PIE form; the gory details in the commit message (Arvind Sankar) - A fix for the "bad bp value" warning issued by the frame-pointer unwinder (Josh Poimboeuf) * tag 'x86_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/unwind/fp: Fix FP unwinding in ret_from_fork x86/boot/compressed: Disable relocation relaxation x86/defconfigs: Explicitly unset CONFIG_64BIT in i386_defconfig
2020-09-18x86/cpu: Add hardware-enforced cache coherency as a CPUID featureKrish Sadhukhan
In some hardware implementations, coherency between the encrypted and unencrypted mappings of the same physical page is enforced. In such a system, it is not required for software to flush the page from all CPU caches in the system prior to changing the value of the C-bit for a page. This hardware- enforced cache coherency is indicated by EAX[10] in CPUID leaf 0x8000001f. [ bp: Use one of the free slots in word 3. ] Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200917212038.5090-2-krish.sadhukhan@oracle.com
2020-09-18x86/unwind/fp: Fix FP unwinding in ret_from_forkJosh Poimboeuf
There have been some reports of "bad bp value" warnings printed by the frame pointer unwinder: WARNING: kernel stack regs at 000000005bac7112 in sh:1014 has bad 'bp' value 0000000000000000 This warning happens when unwinding from an interrupt in ret_from_fork(). If entry code gets interrupted, the state of the frame pointer (rbp) may be undefined, which can confuse the unwinder, resulting in warnings like the above. There's an in_entry_code() check which normally silences such warnings for entry code. But in this case, ret_from_fork() is getting interrupted. It recently got moved out of .entry.text, so the in_entry_code() check no longer works. It could be moved back into .entry.text, but that would break the noinstr validation because of the call to schedule_tail(). Instead, initialize each new task's RBP to point to the task's entry regs via an encoded frame pointer. That will allow the unwinder to reach the end of the stack gracefully. Fixes: b9f6976bfb94 ("x86/entry/64: Move non entry code into .text section") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/f366bbf5a8d02e2318ee312f738112d0af74d16f.1600103007.git.jpoimboe@redhat.com
2020-09-17x86/mmu: Allocate/free a PASIDFenghua Yu
A PASID is allocated for an "mm" the first time any thread binds to an SVA-capable device and is freed from the "mm" when the SVA is unbound by the last thread. It's possible for the "mm" to have different PASID values in different binding/unbinding SVA cycles. The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is propagated to a per-thread PASID MSR for all threads within the mm through IPI, context switch, or inherited. This is done to ensure that a running thread has the right PASID in the MSR matching the mm's PASID. [ bp: s/SVM/SVA/g; massage. ] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com
2020-09-17x86/cpufeatures: Mark ENQCMD as disabled when configured outFenghua Yu
Currently, the ENQCMD feature depends on CONFIG_IOMMU_SUPPORT. Add X86_FEATURE_ENQCMD to the disabled features mask so that it gets disabled when the IOMMU config option above is not enabled. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-9-git-send-email-fenghua.yu@intel.com
2020-09-17x86/msr-index: Define an IA32_PASID MSRFenghua Yu
The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier (PASID), a 20-bit value. Bit 31 must be set to indicate the value programmed in the MSR is valid. Hardware uses the PASID to identify a process address space and direct responses to the right address space. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-7-git-send-email-fenghua.yu@intel.com
2020-09-17x86/fpu/xstate: Add supervisor PASID state for ENQCMDYu-cheng Yu
The ENQCMD instruction reads a PASID from the IA32_PASID MSR. The MSR is stored in the task's supervisor XSAVE* PASID state and is context-switched by XSAVES/XRSTORS. [ bp: Add (in-)definite articles and massage. ] Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-6-git-send-email-fenghua.yu@intel.com
2020-09-17x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructionsFenghua Yu
Work submission instruction comes in two flavors. ENQCMD can be called both in ring 3 and ring 0 and always uses the contents of a PASID MSR when shipping the command to the device. ENQCMDS allows a kernel driver to submit commands on behalf of a user process. The driver supplies the PASID value in ENQCMDS. There isn't any usage of ENQCMD in the kernel as of now. The CPU feature flag is shown as "enqcmd" in /proc/cpuinfo. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-5-git-send-email-fenghua.yu@intel.com
2020-09-16ACPI: processor: Use CPUIDLE_FLAG_TLB_FLUSHEDPeter Zijlstra
Make acpi_processor_idle() use the generic TLB flushing code. This again removes RCU usage after rcu_idle_enter(). (XXX make every C3 invalidate TLBs, not just C3-BM) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-16x86/irq: Make most MSI ops XEN privateThomas Gleixner
Nothing except XEN uses the setup/teardown ops. Hide them there. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112334.198633344@linutronix.de
2020-09-16x86/irq: Cleanup the arch_*_msi_irqs() leftoversThomas Gleixner
Get rid of all the gunk and remove the 'select PCI_MSI_ARCH_FALLBACK' from the x86 Kconfig so the weak functions in the PCI core are replaced by stubs which emit a warning, which ensures that any fail to set the irq domain pointer results in a warning when the device is used. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112334.086003720@linutronix.de
2020-09-16x86/pci: Set default irq domain in pcibios_add_device()Thomas Gleixner
Now that interrupt remapping sets the irqdomain pointer when a PCI device is added it's possible to store the default irq domain in the device struct in pcibios_add_device(). If the bus to which a device is connected has an irq domain associated then this domain is used otherwise the default domain (PCI/MSI native or XEN PCI/MSI) is used. Using the bus domain ensures that special MSI bus domains like VMD work. This makes XEN and the non-remapped native case work solely based on the irq domain pointer in struct device for PCI/MSI and allows to remove the arch fallback and make most of the x86_msi ops private to XEN in the next steps. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112333.900423047@linutronix.de
2020-09-16x86/irq: Initialize PCI/MSI domain at PCI init timeThomas Gleixner
No point in initializing the default PCI/MSI interrupt domain early and no point to create it when XEN PV/HVM/DOM0 are active. Move the initialization to pci_arch_init() and convert it to init ops so that XEN can override it as XEN has it's own PCI/MSI management. The XEN override comes in a later step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112332.859209894@linutronix.de
2020-09-16x86/pci: Reducde #ifdeffery in PCI init codeThomas Gleixner
Adding a function call before the first #ifdef in arch_pci_init() triggers a 'mixed declarations and code' warning if PCI_DIRECT is enabled. Use stub functions and move the #ifdeffery to the header file where it is not in the way. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112332.767707340@linutronix.de
2020-09-16x86/msi: Use generic MSI domain opsThomas Gleixner
pci_msi_get_hwirq() and pci_msi_set_desc are not longer special. Enable the generic MSI domain ops in the core and PCI MSI code unconditionally and get rid of the x86 specific implementations in the X86 MSI code and in the hyperv PCI driver. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200826112332.564274859@linutronix.de
2020-09-16x86/msi: Consolidate MSI allocationThomas Gleixner
Convert the interrupt remap drivers to retrieve the pci device from the msi descriptor and use info::hwirq. This is the first step to prepare x86 for using the generic MSI domain ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de
2020-09-16x86/irq: Consolidate UV domain allocationThomas Gleixner
Move the UV specific fields into their own struct for readability sake. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112332.255792469@linutronix.de
2020-09-16x86/irq: Consolidate DMAR irq allocationThomas Gleixner
None of the DMAR specific fields are required. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112332.163462706@linutronix.de
2020-09-16x86_ioapic_Consolidate_IOAPIC_allocationThomas Gleixner
Move the IOAPIC specific fields into their own struct and reuse the common devid. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de
2020-09-16x86/msi: Consolidate HPET allocationThomas Gleixner
None of the magic HPET fields are required in any way. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de
2020-09-16x86/irq: Prepare consolidation of irq_alloc_infoThomas Gleixner
struct irq_alloc_info is a horrible zoo of unnamed structs in a union. Many of the struct fields can be generic and don't have to be type specific like hpet_id, ioapic_id... Provide a generic set of members to prepare for the consolidation. The goal is to make irq_alloc_info have the same basic member as the generic msi_alloc_info so generic MSI domain ops can be reused and yet more mess can be avoided when (non-PCI) device MSI support comes along. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200826112331.849577844@linutronix.de
2020-09-16iommu/irq_remapping: Consolidate irq domain lookupThomas Gleixner
Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN types, consolidate the two getter functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.741909337@linutronix.de