summaryrefslogtreecommitdiff
path: root/arch/x86/xen/enlighten_pv.c
AgeCommit message (Collapse)Author
2023-12-07x86/entry: Convert INT 0x80 emulation to IDTENTRYThomas Gleixner
There is no real reason to have a separate ASM entry point implementation for the legacy INT 0x80 syscall emulation on 64-bit. IDTENTRY provides all the functionality needed with the only difference that it does not: - save the syscall number (AX) into pt_regs::orig_ax - set pt_regs::ax to -ENOSYS Both can be done safely in the C code of an IDTENTRY before invoking any of the syscall related functions which depend on this convention. Aside of ASM code reduction this prepares for detecting and handling a local APIC injected vector 0x80. [ kirill.shutemov: More verbose comments ] Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@vger.kernel.org> # v6.0+
2023-09-19x86/xen: allow nesting of same lazy modeJuergen Gross
When running as a paravirtualized guest under Xen, Linux is using "lazy mode" for issuing hypercalls which don't need to take immediate effect in order to improve performance (examples are e.g. multiple PTE changes). There are two different lazy modes defined: MMU and CPU lazy mode. Today it is not possible to nest multiple lazy mode sections, even if they are of the same kind. A recent change in memory management added nesting of MMU lazy mode sections, resulting in a regression when running as Xen PV guest. Technically there is no reason why nesting of multiple sections of the same kind of lazy mode shouldn't be allowed. So add support for that for fixing the regression. Fixes: bcc6cc832573 ("mm: add default definition of set_ptes()") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20230913113828.18421-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-09-19x86/xen: move paravirt lazy codeJuergen Gross
Only Xen is using the paravirt lazy mode code, so it can be moved to Xen specific sources. This allows to make some of the functions static or to merge them into their only call sites. While at it do a rename from "paravirt" to "xen" for all moved specifiers. No functional change. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20230913113828.18421-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-08-31Merge tag 'x86_shstk_for_6.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 shadow stack support from Dave Hansen: "This is the long awaited x86 shadow stack support, part of Intel's Control-flow Enforcement Technology (CET). CET consists of two related security features: shadow stacks and indirect branch tracking. This series implements just the shadow stack part of this feature, and just for userspace. The main use case for shadow stack is providing protection against return oriented programming attacks. It works by maintaining a secondary (shadow) stack using a special memory type that has protections against modification. When executing a CALL instruction, the processor pushes the return address to both the normal stack and to the special permission shadow stack. Upon RET, the processor pops the shadow stack copy and compares it to the normal stack copy. For more information, refer to the links below for the earlier versions of this patch set" Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/ Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/ * tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) x86/shstk: Change order of __user in type x86/ibt: Convert IBT selftest to asm x86/shstk: Don't retry vm_munmap() on -EINTR x86/kbuild: Fix Documentation/ reference x86/shstk: Move arch detail comment out of core mm x86/shstk: Add ARCH_SHSTK_STATUS x86/shstk: Add ARCH_SHSTK_UNLOCK x86: Add PTRACE interface for shadow stack selftests/x86: Add shadow stack test x86/cpufeatures: Enable CET CR4 bit for shadow stack x86/shstk: Wire in shadow stack interface x86: Expose thread features in /proc/$PID/status x86/shstk: Support WRSS for userspace x86/shstk: Introduce map_shadow_stack syscall x86/shstk: Check that signal frame is shadow stack mem x86/shstk: Check that SSP is aligned on sigreturn x86/shstk: Handle signals for shadow stack x86/shstk: Introduce routines modifying shstk x86/shstk: Handle thread shadow stack x86/shstk: Add user-mode shadow stack support ...
2023-08-30Merge tag 'x86_apic_for_6.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Dave Hansen: "This includes a very thorough rework of the 'struct apic' handlers. Quite a variety of them popped up over the years, especially in the 32-bit days when odd apics were much more in vogue. The end result speaks for itself, which is a removal of a ton of code and static calls to replace indirect calls. If there's any breakage here, it's likely to be around the 32-bit museum pieces that get light to no testing these days. Summary: - Rework apic callbacks, getting rid of unnecessary ones and coalescing lots of silly duplicates. - Use static_calls() instead of indirect calls for apic->foo() - Tons of cleanups an crap removal along the way" * tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) x86/apic: Turn on static calls x86/apic: Provide static call infrastructure for APIC callbacks x86/apic: Wrap IPI calls into helper functions x86/apic: Mark all hotpath APIC callback wrappers __always_inline x86/xen/apic: Mark apic __ro_after_init x86/apic: Convert other overrides to apic_update_callback() x86/apic: Replace acpi_wake_cpu_handler_update() and apic_set_eoi_cb() x86/apic: Provide apic_update_callback() x86/xen/apic: Use standard apic driver mechanism for Xen PV x86/apic: Provide common init infrastructure x86/apic: Wrap apic->native_eoi() into a helper x86/apic: Nuke ack_APIC_irq() x86/apic: Remove pointless arguments from [native_]eoi_write() x86/apic/noop: Tidy up the code x86/apic: Remove pointless NULL initializations x86/apic: Sanitize APIC ID range validation x86/apic: Prepare x2APIC for using apic::max_apic_id x86/apic: Simplify X2APIC ID validation x86/apic: Add max_apic_id member x86/apic: Wrap APIC ID validation into an inline ...
2023-08-28Merge tag 'acpi-6.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These include new ACPICA material, a rework of the ACPI thermal driver, a switch-over of the ACPI processor driver to using _OSC instead of (long deprecated) _PDC for CPU initialization, a rework of firmware notifications handling in several drivers, fixes and cleanups for suspend-to-idle handling on AMD systems, ACPI backlight driver updates and more. Specifics: - Update the ACPICA code in the kernel to upstream revision 20230628 including the following changes: - Suppress a GCC 12 dangling-pointer warning (Philip Prindeville) - Reformat the ACPI_STATE_COMMON macro and its users (George Guo) - Replace the ternary operator with ACPI_MIN() (Jiangshan Yi) - Add support for _DSC as per ACPI 6.5 (Saket Dumbre) - Remove a duplicate macro from zephyr header (Najumon B.A) - Add data structures for GED and _EVT tracking (Jose Marinho) - Fix misspelled CDAT DSMAS define (Dave Jiang) - Simplify an error message in acpi_ds_result_push() (Christophe Jaillet) - Add a struct size macro related to SRAT (Dave Jiang) - Add AML_NO_OPERAND_RESOLVE flag to Timer (Abhishek Mainkar) - Add support for RISC-V external interrupt controllers in MADT (Sunil V L) - Add RHCT flags, CMO and MMU nodes (Sunil V L) - Change ACPICA version to 20230628 (Bob Moore) - Introduce new wrappers for ACPICA notify handler install/remove and convert multiple drivers to using their own Notify() handlers instead of the ACPI bus type .notify() slated for removal (Michal Wilczynski) - Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2 (Hans de Goede) - Put ACPI video and its child devices explicitly into D0 on boot to avoid platform firmware confusion (Kai-Heng Feng) - Add backlight=native DMI quirk for Lenovo Ideapad Z470 (Jiri Slaby) - Support obtaining physical CPU ID from MADT on LoongArch (Bibo Mao) - Convert ACPI CPU initialization to using _OSC instead of _PDC that has been depreceted since 2018 and dropped from the specification in ACPI 6.5 (Michal Wilczynski, Rafael Wysocki) - Drop non-functional nocrt parameter from ACPI thermal (Mario Limonciello) - Clean up the ACPI thermal driver, rework the handling of firmware notifications in it and make it provide a table of generic trip point structures to the core during initialization (Rafael Wysocki) - Defer enumeration of devices with _DEP pointing to IVSC (Wentong Wu) - Install SystemCMOS address space handler for ACPI000E (TAD) to meet platform firmware expectations on some platforms (Zhang Rui) - Fix finding the generic error data in the ACPi extlog driver for compatibility with old and new firmware interface versions (Xiaochun Lee) - Remove assorted unused declarations of functions (Yue Haibing) - Move AMBA bus scan handling into arm64 specific directory (Sudeep Holla) - Fix and clean up suspend-to-idle interface for AMD systems (Mario Limonciello, Andy Shevchenko) - Fix string truncation warning in pnpacpi_add_device() (Sunil V L)" * tag 'acpi-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (66 commits) ACPI: x86: s2idle: Add a function to get LPS0 constraint for a device ACPI: x86: s2idle: Add for_each_lpi_constraint() helper ACPI: x86: s2idle: Add more debugging for AMD constraints parsing ACPI: x86: s2idle: Fix a logic error parsing AMD constraints table ACPI: x86: s2idle: Catch multiple ACPI_TYPE_PACKAGE objects ACPI: x86: s2idle: Post-increment variables when getting constraints ACPI: Adjust #ifdef for *_lps0_dev use ACPI: TAD: Install SystemCMOS address space handler for ACPI000E ACPI: Remove assorted unused declarations of functions ACPI: extlog: Fix finding the generic error data for v3 structure PNP: ACPI: Fix string truncation warning ACPI: Remove unused extern declaration acpi_paddr_to_node() ACPI: video: Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2 ACPI: video: Put ACPI video and its child devices into D0 on boot ACPI: processor: LoongArch: Get physical ID from MADT ACPI: scan: Defer enumeration of devices with a _DEP pointing to IVSC device ACPI: thermal: Eliminate code duplication from acpi_thermal_notify() ACPI: thermal: Drop unnecessary thermal zone callbacks ACPI: thermal: Rework thermal_get_trend() ACPI: thermal: Use trip point table to register thermal zones ...
2023-08-21x86/xen: Make virt_to_pfn() a static inlineLinus Walleij
Making virt_to_pfn() a static inline taking a strongly typed (const void *) makes the contract of a passing a pointer of that type to the function explicit and exposes any misuse of the macro virt_to_pfn() acting polymorphic and accepting many types such as (void *), (unitptr_t) or (unsigned long) as arguments without warnings. Also fix all offending call sites to pass a (void *) rather than an unsigned long. Since virt_to_mfn() is wrapping virt_to_pfn() this function has become polymorphic as well so the usage need to be fixed up. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230810-virt-to-phys-x86-xen-v1-1-9e966d333e7a@linaro.org Signed-off-by: Juergen Gross <jgross@suse.com>
2023-08-09x86/xen/apic: Use standard apic driver mechanism for Xen PVJuergen Gross
Instead of setting the Xen PV apic driver very early during boot, just use the standard apic driver probing by setting an appropriate x86_init.irqs.intr_mode_init callback. At the same time eliminate xen_apic_check() which has never been used. The #ifdef CONFIG_X86_LOCAL_APIC around the call of xen_init_apic() can be removed, too, as CONFIG_XEN depends on CONFIG_X86_LOCAL_APIC. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest) Link: https://lore.kernel.org/lkml/aa086365-fd02-210f-67c6-5c9175c0dfee@suse.com
2023-08-02x86/shstk: Add user control-protection fault handlerRick Edgecombe
A control-protection fault is triggered when a control-flow transfer attempt violates Shadow Stack or Indirect Branch Tracking constraints. For example, the return address for a RET instruction differs from the copy on the shadow stack. There already exists a control-protection fault handler for handling kernel IBT faults. Refactor this fault handler into separate user and kernel handlers, like the page fault handler. Add a control-protection handler for usermode. To avoid ifdeffery, put them both in a new file cet.c, which is compiled in the case of either of the two CET features supported in the kernel: kernel IBT or user mode shadow stack. Move some static inline functions from traps.c into a header so they can be used in cet.c. Opportunistically fix a comment in the kernel IBT part of the fault handler that is on the end of the line instead of preceding it. Keep the same behavior for the kernel side of the fault handler, except for converting a BUG to a WARN in the case of a #CP happening when the feature is missing. This unifies the behavior with the new shadow stack code, and also prevents the kernel from crashing under this situation which is potentially recoverable. The control-protection fault handler works in a similar way as the general protection fault handler. It provides the si_code SEGV_CPERR to the signal handler. Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: John Allen <john.allen@amd.com> Tested-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/20230613001108.3040476-28-rick.p.edgecombe%40intel.com
2023-07-14ACPI: processor: Rename ACPI_PDC symbolsMichal Wilczynski
The prefix in the names of the ACPI_PDC symbols suggests that they are only relevant for _PDC, but in fact they can also be used in the _OSC. Change that prefix to a more generic ACPI_PROC_CAP that will better reflect the purpose of those symbols as they represent bits in a general processor capabilities buffer. Rename pdc_intel.h to proc_cap_intel.h to follow the change of the symbol name prefix. No intentional functional impact. Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-16x86/xen: Set default memory type for PV guests to WBJuergen Gross
When running as an unprivileged PV guest under Xen (not dom0), the default MTRR memory type should be write-back. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20230615123959.12298-1-jgross@suse.com
2023-06-01x86/xen: Set MTRR state when running as Xen PV initial domainJuergen Gross
When running as Xen PV initial domain (aka dom0), MTRRs are disabled by the hypervisor, but the system should nevertheless use correct cache memory types. This has always kind of worked, as disabled MTRRs resulted in disabled PAT, too, so that the kernel avoided code paths resulting in inconsistencies. This bypassed all of the sanity checks the kernel is doing with enabled MTRRs in order to avoid memory mappings with conflicting memory types. This has been changed recently, leading to PAT being accepted to be enabled, while MTRRs stayed disabled. The result is that mtrr_type_lookup() no longer is accepting all memory type requests, but started to return WB even if UC- was requested. This led to driver failures during initialization of some devices. In reality MTRRs are still in effect, but they are under complete control of the Xen hypervisor. It is possible, however, to retrieve the MTRR settings from the hypervisor. In order to fix those problems, overwrite the MTRR state via mtrr_overwrite_state() with the MTRR data from the hypervisor, if the system is running as a Xen dom0. Fixes: 72cbc8f04fe2 ("x86/PAT: Have pat_enabled() properly reflect state when running on Xen") Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20230502120931.20719-6-jgross@suse.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-03-17Merge tag 'for-linus-6.3-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - cleanup for xen time handling - enable the VGA console in a Xen PVH dom0 - cleanup in the xenfs driver * tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: remove unnecessary (void*) conversions x86/PVH: obtain VGA console info in Dom0 x86/xen/time: cleanup xen_tsc_safe_clocksource xen: update arch/x86/include/asm/xen/cpuid.h
2023-03-14x86/PVH: obtain VGA console info in Dom0Jan Beulich
A new platform-op was added to Xen to allow obtaining the same VGA console information PV Dom0 is handed. Invoke the new function and have the output data processed by xen_init_vga(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2023-02-21Merge tag 'x86_cpu_for_v6.3_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Cache the AMD debug registers in per-CPU variables to avoid MSR writes where possible, when supporting a debug registers swap feature for SEV-ES guests - Add support for AMD's version of eIBRS called Automatic IBRS which is a set-and-forget control of indirect branch restriction speculation resources on privilege change - Add support for a new x86 instruction - LKGS - Load kernel GS which is part of the FRED infrastructure - Reset SPEC_CTRL upon init to accomodate use cases like kexec which rediscover - Other smaller fixes and cleanups * tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd: Cache debug register values in percpu variables KVM: x86: Propagate the AMD Automatic IBRS feature to the guest x86/cpu: Support AMD Automatic IBRS x86/cpu, kvm: Add the SMM_CTL MSR not present feature x86/cpu, kvm: Add the Null Selector Clears Base feature x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code x86/cpu, kvm: Add support for CPUID_80000021_EAX x86/gsseg: Add the new <asm/gsseg.h> header to <asm/asm-prototypes.h> x86/gsseg: Use the LKGS instruction if available for load_gs_index() x86/gsseg: Move load_gs_index() to its own new header file x86/gsseg: Make asm_load_gs_index() take an u16 x86/opcode: Add the LKGS instruction to x86-opcode-map x86/cpufeature: Add the CPU feature bit for LKGS x86/bugs: Reset speculation control settings on init x86/cpu: Remove redundant extern x86_read_arch_cap_msr()
2023-01-13cpuidle, xenpv: Make more PARAVIRT_XXL noinstr cleanPeter Zijlstra
objtool found a few cases where this code called out into instrumented code: vmlinux.o: warning: objtool: acpi_idle_enter_s2idle+0xde: call to wbinvd() leaves .noinstr.text section vmlinux.o: warning: objtool: default_idle+0x4: call to arch_safe_halt() leaves .noinstr.text section vmlinux.o: warning: objtool: xen_safe_halt+0xa: call to HYPERVISOR_sched_op.constprop.0() leaves .noinstr.text section Solve this by: - marking arch_safe_halt(), wbinvd(), native_wbinvd() and HYPERVISOR_sched_op() as __always_inline(). - Explicitly uninlining xen_safe_halt() and pv_native_wbinvd() [they were already uninlined by the compiler on use as function pointers] and annotating them as 'noinstr'. - Annotating pv_native_safe_halt() as 'noinstr'. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195541.171918174@infradead.org
2023-01-13x86/gsseg: Use the LKGS instruction if available for load_gs_index()H. Peter Anvin (Intel)
The LKGS instruction atomically loads a segment descriptor into the %gs descriptor registers, *except* that %gs.base is unchanged, and the base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly what we want this function to do. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230112072032.35626-6-xin3.li@intel.com Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-14Merge tag 'x86_core_for_v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Borislav Petkov: - Add the call depth tracking mitigation for Retbleed which has been long in the making. It is a lighterweight software-only fix for Skylake-based cores where enabling IBRS is a big hammer and causes a significant performance impact. What it basically does is, it aligns all kernel functions to 16 bytes boundary and adds a 16-byte padding before the function, objtool collects all functions' locations and when the mitigation gets applied, it patches a call accounting thunk which is used to track the call depth of the stack at any time. When that call depth reaches a magical, microarchitecture-specific value for the Return Stack Buffer, the code stuffs that RSB and avoids its underflow which could otherwise lead to the Intel variant of Retbleed. This software-only solution brings a lot of the lost performance back, as benchmarks suggest: https://lore.kernel.org/all/20220915111039.092790446@infradead.org/ That page above also contains a lot more detailed explanation of the whole mechanism - Implement a new control flow integrity scheme called FineIBT which is based on the software kCFI implementation and uses hardware IBT support where present to annotate and track indirect branches using a hash to validate them - Other misc fixes and cleanups * tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits) x86/paravirt: Use common macro for creating simple asm paravirt functions x86/paravirt: Remove clobber bitmask from .parainstructions x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit x86/Kconfig: Enable kernel IBT by default x86,pm: Force out-of-line memcpy() objtool: Fix weak hole vs prefix symbol objtool: Optimize elf_dirty_reloc_sym() x86/cfi: Add boot time hash randomization x86/cfi: Boot time selection of CFI scheme x86/ibt: Implement FineIBT objtool: Add --cfi to generate the .cfi_sites section x86: Add prefix symbols for function padding objtool: Add option to generate prefix symbols objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf objtool: Slice up elf_create_section_symbol() kallsyms: Revert "Take callthunks into account" x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces x86/retpoline: Fix crash printing warning x86/paravirt: Fix a !PARAVIRT build warning ...
2022-12-13Merge tag 'x86_boot_for_v6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Borislav Petkov: "A of early boot cleanups and fixes. - Do some spring cleaning to the compressed boot code by moving the EFI mixed-mode code to a separate compilation unit, the AMD memory encryption early code where it belongs and fixing up build dependencies. Make the deprecated EFI handover protocol optional with the goal of removing it at some point (Ard Biesheuvel) - Skip realmode init code on Xen PV guests as it is not needed there - Remove an old 32-bit PIC code compiler workaround" * tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Remove x86_32 PIC using %ebx workaround x86/boot: Skip realmode init code when running as Xen PV guest x86/efi: Make the deprecated EFI handover protocol optional x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=y x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit() x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.S x86/boot/compressed: Move startup32_check_sev_cbit() into .text x86/boot/compressed: Move startup32_load_idt() out of head_64.S x86/boot/compressed: Move startup32_load_idt() into .text section x86/boot/compressed: Pull global variable reference into startup32_load_idt() x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry() x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunk x86/boot/compressed, efi: Merge multiple definitions of image_offset into one x86/boot/compressed: Move efi32_pe_entry() out of head_64.S x86/boot/compressed: Move efi32_entry out of head_64.S x86/boot/compressed: Move efi32_pe_entry into .text section x86/boot/compressed: Move bootargs parsing out of 32-bit startup code x86/boot/compressed: Move 32-bit entrypoint code into .text section x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S
2022-12-12Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...
2022-11-25x86/boot: Skip realmode init code when running as Xen PV guestJuergen Gross
When running as a Xen PV guest there is no need for setting up the realmode trampoline, as realmode isn't supported in this environment. Trying to setup the trampoline has been proven to be problematic in some cases, especially when trying to debug early boot problems with Xen requiring to keep the EFI boot-services memory mapped (some firmware variants seem to claim basically all memory below 1Mb for boot services). Introduce new x86_platform_ops operations for that purpose, which can be set to a NOP by the Xen PV specific kernel boot code. [ bp: s/call_init_real_mode/do_init_real_mode/ ] Fixes: 084ee1c641a0 ("x86, realmode: Relocator for realmode code") Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221123114523.3467-1-jgross@suse.com
2022-11-21Merge tag 'v6.1-rc6' into x86/core, to resolve conflictsIngo Molnar
Resolve conflicts between these commits in arch/x86/kernel/asm-offsets.c: # upstream: debc5a1ec0d1 ("KVM: x86: use a separate asm-offsets.c file") # retbleed work in x86/core: 5d8213864ade ("x86/retbleed: Add SKL return thunk") ... and these commits in include/linux/bpf.h: # upstram: 18acb7fac22f ("bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")") # x86/core commits: 931ab63664f0 ("x86/ibt: Implement FineIBT") bea75b33895f ("x86/Kconfig: Introduce function padding") The latter two modify BPF_DISPATCHER_ATTRIBUTES(), which was removed upstream. Conflicts: arch/x86/kernel/asm-offsets.c include/linux/bpf.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-11-18stackprotector: move get_random_canary() into stackprotector.hJason A. Donenfeld
This has nothing to do with random.c and everything to do with stack protectors. Yes, it uses randomness. But many things use randomness. random.h and random.c are concerned with the generation of randomness, not with each and every use. So move this function into the more specific stackprotector.h file where it belongs. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-16Merge tag 'for-linus-6.1-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two trivial cleanups, and three simple fixes" * tag 'for-linus-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/platform-pci: use define instead of literal number xen/platform-pci: add missing free_irq() in error path xen-pciback: Allow setting PCI_MSIX_FLAGS_MASKALL too xen/pcpu: fix possible memory leak in register_pcpu() x86/xen: Use kstrtobool() instead of strtobool()
2022-11-14x86/xen: Use kstrtobool() instead of strtobool()Christophe JAILLET
strtobool() is the same as kstrtobool(). However, the latter is more used within the kernel. In order to remove strtobool() and slightly simplify kstrtox.h, switch to the other function name. While at it, include the corresponding header file (<linux/kstrtox.h>) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/e91af3c8708af38b1c57e0a2d7eb9765dda0e963.1667336095.git.christophe.jaillet@wanadoo.fr Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-17x86/cpu: Get rid of redundant switch_to_new_gdt() invocationsThomas Gleixner
The only place where switch_to_new_gdt() is required is early boot to switch from the early GDT to the direct GDT. Any other invocation is completely redundant because it does not change anything. Secondary CPUs come out of the ASM code with GDT and GSBASE correctly set up. The same is true for XEN_PV. Remove all the voodoo invocations which are left overs from the ancient past, rename the function to switch_gdt_and_percpu_base() and mark it init. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111143.198076128@infradead.org
2022-10-12Merge tag 'for-linus-6.1-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - Some minor typo fixes - A fix of the Xen pcifront driver for supporting the device model to run in a Linux stub domain - A cleanup of the pcifront driver - A series to enable grant-based virtio with Xen on x86 - A cleanup of Xen PV guests to distinguish between safe and faulting MSR accesses - Two fixes of the Xen gntdev driver - Two fixes of the new xen grant DMA driver * tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum" xen/pv: support selecting safe/unsafe msr accesses xen/pv: refactor msr access functions to support safe and unsafe accesses xen/pv: fix vendor checks for pmu emulation xen/pv: add fault recovery control to pmu msr accesses xen/virtio: enable grant based virtio on x86 xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT xen/virtio: restructure xen grant dma setup xen/pcifront: move xenstore config scanning into sub-function xen/gntdev: Accommodate VMA splitting xen/gntdev: Prevent leaking grants xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page() xen/xenbus: Fix spelling mistake "hardward" -> "hardware" xen-pcifront: Handle missed Connected state
2022-10-11xen/pv: support selecting safe/unsafe msr accessesJuergen Gross
Instead of always doing the safe variants for reading and writing MSRs in Xen PV guests, make the behavior controllable via Kconfig option and a boot parameter. The default will be the current behavior, which is to always use the safe variant. Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-11xen/pv: refactor msr access functions to support safe and unsafe accessesJuergen Gross
Refactor and rename xen_read_msr_safe() and xen_write_msr_safe() to support both cases of MSR accesses, safe ones and potentially GP-fault generating ones. This will prepare to no longer swallow GPs silently in xen_read_msr() and xen_write_msr(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-10xen/virtio: enable grant based virtio on x86Juergen Gross
Use an x86-specific virtio_check_mem_acc_cb() for Xen in order to setup the correct DMA ops. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # common code Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-09-26x86/entry: Work around Clang __bdos() bugKees Cook
Clang produces a false positive when building with CONFIG_FORTIFY_SOURCE=y and CONFIG_UBSAN_BOUNDS=y when operating on an array with a dynamic offset. Work around this by using a direct assignment of an empty instance. Avoids this warning: ../include/linux/fortify-string.h:309:4: warning: call to __write_overflow_field declared with 'warn ing' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wat tribute-warning] __write_overflow_field(p_size_field, size); ^ which was isolated to the memset() call in xen_load_idt(). Note that this looks very much like another bug that was worked around: https://github.com/ClangBuiltLinux/linux/issues/1592 Cc: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: xen-devel@lists.xenproject.org Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/lkml/41527d69-e8ab-3f86-ff37-6b298c01d5bc@oracle.com Signed-off-by: Kees Cook <keescook@chromium.org>
2022-08-01xen: don't require virtio with grants for non-PV guestsJuergen Gross
Commit fa1f57421e0b ("xen/virtio: Enable restricted memory access using Xen grant mappings") introduced a new requirement for using virtio devices: the backend now needs to support the VIRTIO_F_ACCESS_PLATFORM feature. This is an undue requirement for non-PV guests, as those can be operated with existing backends without any problem, as long as those backends are running in dom0. Per default allow virtio devices without grant support for non-PV guests. On Arm require VIRTIO_F_ACCESS_PLATFORM for devices having been listed in the device tree to use grants. Add a new config item to always force use of grants for virtio. Fixes: fa1f57421e0b ("xen/virtio: Enable restricted memory access using Xen grant mappings") Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 guest using Xen Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20220622063838.8854-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-07-01x86/xen: Use clear_bss() for Xen PV guestsJuergen Gross
Instead of clearing the bss area in assembly code, use the clear_bss() function. This requires to pass the start_info address as parameter to xen_start_kernel() in order to avoid the xen_start_info being zeroed again. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20220630071441.28576-2-jgross@suse.com
2022-06-06xen/virtio: Enable restricted memory access using Xen grant mappingsJuergen Gross
In order to support virtio in Xen guests add a config option XEN_VIRTIO enabling the user to specify whether in all Xen guests virtio should be able to access memory via Xen grant mappings only on the host side. Also set PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS feature from the guest initialization code on Arm and x86 if CONFIG_XEN_VIRTIO is enabled. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/1654197833-25362-5-git-send-email-olekstysh@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-25Merge back reboot/poweroff notifiers rework for 5.19-rc1.Rafael J. Wysocki
2022-05-19xen/x86: Use do_kernel_power_off()Dmitry Osipenko
Kernel now supports chained power-off handlers. Use do_kernel_power_off() that invokes chained power-off handlers. It also invokes legacy pm_power_off() for now, which will be removed once all drivers will be converted to the new sys-off API. Acked-by: Juergen Gross <jgross@suse.com> Reviewed-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19x86: xen: remove STACK_FRAME_NON_STANDARD from xen_cpuidMaximilian Heyne
Since commit 4d65adfcd119 ("x86: xen: insn: Decode Xen and KVM emulate-prefix signature"), objtool is able to correctly parse the prefixed instruction in xen_cpuid and emit correct orc unwind information. Hence, marking the function as STACKFRAME_NON_STANDARD is no longer needed. This commit is basically a revert of commit 983bb6d254c7 ("x86/xen: Mark xen_cpuid() stack frame as non-standard"). Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> CC: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20220517162425.100567-1-mheyne@amazon.de Signed-off-by: Juergen Gross <jgross@suse.com>
2022-03-15x86/ibt,xen: Sprinkle the ENDBRPeter Zijlstra
Even though Xen currently doesn't advertise IBT, prepare for when it will eventually do so and sprinkle the ENDBR dust accordingly. Even though most of the entry points are IRET like, the CPL0 Hypervisor can set WAIT-FOR-ENDBR and demand ENDBR at these sites. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154317.873919996@infradead.org
2022-03-15x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()Peter Zijlstra
By doing an early rewrite of 'jmp native_iret` in restore_regs_and_return_to_kernel() we can get rid of the last INTERRUPT_RETURN user and paravirt_iret. Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20220308154317.815039833@infradead.org
2022-02-03x86/Xen: streamline (and fix) PV CPU enumerationJan Beulich
This started out with me noticing that "dom0_max_vcpus=<N>" with <N> larger than the number of physical CPUs reported through ACPI tables would not bring up the "excess" vCPU-s. Addressing this is the primary purpose of the change; CPU maps handling is being tidied only as far as is necessary for the change here (with the effect of also avoiding the setting up of too much per-CPU infrastructure, i.e. for CPUs which can never come online). Noticing that xen_fill_possible_map() is called way too early, whereas xen_filter_cpu_maps() is called too late (after per-CPU areas were already set up), and further observing that each of the functions serves only one of Dom0 or DomU, it looked like it was better to simplify this. Use the .get_smp_config hook instead, uniformly for Dom0 and DomU. xen_fill_possible_map() can be dropped altogether, while xen_filter_cpu_maps() is re-purposed but not otherwise changed. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/2dbd5f0a-9859-ca2d-085e-a02f7166c610@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-11-02xen: remove highmem remnantsJuergen Gross
There are some references to highmem left in Xen pv specific code which can be removed. Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20211028081221.2475-4-jgross@suse.com Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-02x86/xen: remove xen_have_vcpu_info_placement flagJuergen Gross
The flag xen_have_vcpu_info_placement was needed to support Xen hypervisors older than version 3.4, which didn't support the VCPUOP_register_vcpu_info hypercall. Today the Linux kernel requires at least Xen 4.0 to be able to run, so xen_have_vcpu_info_placement can be dropped (in theory the flag was used to ensure a working kernel even in case of the VCPUOP_register_vcpu_info hypercall failing for other reasons than the hypercall not being supported, but the only cases covered by the flag would be parameter errors, which ought not to be made anyway). This allows to let some functions return void now, as they can never fail. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20211028072748.29862-2-jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-01Merge tag 'objtool-core-2021-10-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Thomas Gleixner: - Improve retpoline code patching by separating it from alternatives which reduces memory footprint and allows to do better optimizations in the actual runtime patching. - Add proper retpoline support for x86/BPF - Address noinstr warnings in x86/kvm, lockdep and paravirtualization code - Add support to handle pv_opsindirect calls in the noinstr analysis - Classify symbols upfront and cache the result to avoid redundant str*cmp() invocations. - Add a CFI hash to reduce memory consumption which also reduces runtime on a allyesconfig by ~50% - Adjust XEN code to make objtool handling more robust and as a side effect to prevent text fragmentation due to placement of the hypercall page. * tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) bpf,x86: Respect X86_FEATURE_RETPOLINE* bpf,x86: Simplify computing label offsets x86,bugs: Unconditionally allow spectre_v2=retpoline,amd x86/alternative: Add debug prints to apply_retpolines() x86/alternative: Try inline spectre_v2=retpoline,amd x86/alternative: Handle Jcc __x86_indirect_thunk_\reg x86/alternative: Implement .retpoline_sites support x86/retpoline: Create a retpoline thunk array x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h x86/asm: Fixup odd GEN-for-each-reg.h usage x86/asm: Fix register order x86/retpoline: Remove unused replacement symbols objtool,x86: Replace alternatives with .retpoline_sites objtool: Shrink struct instruction objtool: Explicitly avoid self modifying code in .altinstr_replacement objtool: Classify symbols objtool: Support pv_opsindirect calls for noinstr x86/xen: Rework the xen_{cpu,irq,mmu}_opsarrays x86/xen: Mark xen_force_evtchn_callback() noinstr x86/xen: Make irq_disable() noinstr ...
2021-10-07Merge branch 'objtool/urgent'Peter Zijlstra
Fixup conflicts. # Conflicts: # tools/objtool/check.c
2021-10-05xen/x86: hook up xen_banner() also for PVHJan Beulich
This was effectively lost while dropping PVHv1 code. Move the function and arrange for it to be called the same way as done in PV mode. Clearly this then needs re-introducing the XENFEAT_mmu_pt_update_preserve_ad check that was recently removed, as that's a PV-only feature. Since the string pointed at by pv_info.name describes the mode, drop "paravirtualized" from the log message while moving the code. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/de03054d-a20d-2114-bb86-eec28e17b3b8@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: generalize preferred console model from PV to PVH Dom0Jan Beulich
Without announcing hvc0 as preferred it won't get used as long as tty0 gets registered earlier. This is particularly problematic with there not being any screen output for PVH Dom0 when the screen is in graphics mode, as the necessary information doesn't get conveyed yet from the hypervisor. Follow PV's model, but be conservative and do this for Dom0 only for now. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/582328b6-c86c-37f3-d802-5539b7a86736@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: allow "earlyprintk=xen" to work for PV Dom0Jan Beulich
With preferred consoles "tty" and "hvc" announced as preferred, registering "xenboot" early won't result in use of the console: It also needs to be registered as preferred. Generalize this from being DomU- only so far. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/d4a34540-a476-df2c-bca6-732d0d58c5f0@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: allow PVH Dom0 without XEN_PV=yJan Beulich
Decouple XEN_DOM0 from XEN_PV, converting some existing uses of XEN_DOM0 to a new XEN_PV_DOM0. (I'm not convinced all are really / should really be PV-specific, but for starters I've tried to be conservative.) For PVH Dom0 the hypervisor populates MADT with only x2APIC entries, so without x2APIC support enabled in the kernel things aren't going to work very well. (As opposed, DomU-s would only ever see LAPIC entries in MADT as of now.) Note that this then requires PVH Dom0 to be 64-bit, as X86_X2APIC depends on X86_64. In the course of this xen_running_on_version_or_later() needs to be available more broadly. Move it from a PV-specific to a generic file, considering that what it does isn't really PV-specific at all anyway. Note that xen/interface/version.h cannot be included on its own; in enlighten.c, which uses SCHEDOP_* anyway, include xen/interface/sched.h first to resolve the apparently sole missing type (xen_ulong_t). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/983bb72f-53df-b6af-14bd-5e088bd06a08@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-21xen/x86: fix PV trap handling on secondary processorsJan Beulich
The initial observation was that in PV mode under Xen 32-bit user space didn't work anymore. Attempts of system calls ended in #GP(0x402). All of the sudden the vector 0x80 handler was not in place anymore. As it turns out up to 5.13 redundant initialization did occur: Once from cpu_initialize_context() (through its VCPUOP_initialise hypercall) and a 2nd time while each CPU was brought fully up. This 2nd initialization is now gone, uncovering that the 1st one was flawed: Unlike for the set_trap_table hypercall, a full virtual IDT needs to be specified here; the "vector" fields of the individual entries are of no interest. With many (kernel) IDT entries still(?) (i.e. at that point at least) empty, the syscall vector 0x80 ended up in slot 0x20 of the virtual IDT, thus becoming the domain's handler for vector 0x20. Make xen_convert_trap_info() fit for either purpose, leveraging the fact that on the xen_copy_trap_info() path the table starts out zero-filled. This includes moving out the writing of the sentinel, which would also have lead to a buffer overrun in the xen_copy_trap_info() case if all (kernel) IDT entries were populated. Convert the writing of the sentinel to clearing of the entire table entry rather than just the address field. (I didn't bother trying to identify the commit which uncovered the issue in 5.14; the commit named below is the one which actually introduced the bad code.) Fixes: f87e4cac4f4e ("xen: SMP guest support") Cc: stable@vger.kernel.org Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/7a266932-092e-b68f-f2bb-1473b61adc6e@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-17x86/xen: Rework the xen_{cpu,irq,mmu}_opsarraysPeter Zijlstra
In order to allow objtool to make sense of all the various paravirt functions, it needs to either parse whole pv_ops[] tables, or observe individual assignments in the form: bf87: 48 c7 05 00 00 00 00 00 00 00 00 movq $0x0,0x0(%rip) bf92 <xen_init_spinlocks+0x5f> bf8a: R_X86_64_PC32 pv_ops+0x268 As is, xen_cpu_ops[] is at offset +0 in pv_ops[] and could thus be parsed as a 'normal' pv_ops[] table, however xen_irq_ops[] and xen_mmu_ops[] are not. Worse, both the latter two are compiled into the individual assignment for by current GCC, but that's not something one can rely on. Therefore, convert all three into full pv_ops[] tables. This has the benefit of not needing to teach objtool about the offsets and resulting in more conservative code-gen. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210624095149.057262522@infradead.org