summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2025-03-25Merge tag 'timers-vdso-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull VDSO infrastructure updates from Thomas Gleixner: - Consolidate the VDSO storage The VDSO data storage and data layout has been largely architecture specific for historical reasons. That increases the maintenance effort and causes inconsistencies over and over. There is no real technical reason for architecture specific layouts and implementations. The architecture specific details can easily be integrated into a generic layout, which also reduces the amount of duplicated code for managing the mappings. Convert all architectures over to a unified layout and common mapping infrastructure. This splits the VDSO data layout into subsystem specific blocks, timekeeping, random and architecture parts, which provides a better structure and allows to improve and update the functionalities without conflict and interaction. - Rework the timekeeping data storage The current implementation is designed for exposing system timekeeping accessors, which was good enough at the time when it was designed. PTP and Time Sensitive Networking (TSN) change that as there are requirements to expose independent PTP clocks, which are not related to system timekeeping. Replace the monolithic data storage by a structured layout, which allows to add support for independent PTP clocks on top while reusing both the data structures and the time accessor implementations. * tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) sparc/vdso: Always reject undefined references during linking x86/vdso: Always reject undefined references during linking vdso: Rework struct vdso_time_data and introduce struct vdso_clock vdso: Move architecture related data before basetime data powerpc/vdso: Prepare introduction of struct vdso_clock arm64/vdso: Prepare introduction of struct vdso_clock x86/vdso: Prepare introduction of struct vdso_clock time/namespace: Prepare introduction of struct vdso_clock vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct vdso/vsyscall: Prepare introduction of struct vdso_clock vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock vdso/gettimeofday: Prepare introduction of struct vdso_clock vdso/helpers: Prepare introduction of struct vdso_clock vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks vdso: Make vdso_time_data cacheline aligned arm64: Make asm/cache.h compatible with vDSO ...
2025-03-25Merge tag 'timers-cleanups-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "A treewide hrtimer timer cleanup hrtimers are initialized with hrtimer_init() and a subsequent store to the callback pointer. This turned out to be suboptimal for the upcoming Rust integration and is obviously a silly implementation to begin with. This cleanup replaces the hrtimer_init(T); T->function = cb; sequence with hrtimer_setup(T, cb); The conversion was done with Coccinelle and a few manual fixups. Once the conversion has completely landed in mainline, hrtimer_init() will be removed and the hrtimer::function becomes a private member" * tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits) wifi: rt2x00: Switch to use hrtimer_update_function() io_uring: Use helper function hrtimer_update_function() serial: xilinx_uartps: Use helper function hrtimer_update_function() ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup() RDMA: Switch to use hrtimer_setup() virtio: mem: Switch to use hrtimer_setup() drm/vmwgfx: Switch to use hrtimer_setup() drm/xe/oa: Switch to use hrtimer_setup() drm/vkms: Switch to use hrtimer_setup() drm/msm: Switch to use hrtimer_setup() drm/i915/request: Switch to use hrtimer_setup() drm/i915/uncore: Switch to use hrtimer_setup() drm/i915/pmu: Switch to use hrtimer_setup() drm/i915/perf: Switch to use hrtimer_setup() drm/i915/gvt: Switch to use hrtimer_setup() drm/i915/huc: Switch to use hrtimer_setup() drm/amdgpu: Switch to use hrtimer_setup() stm class: heartbeat: Switch to use hrtimer_setup() i2c: Switch to use hrtimer_setup() iio: Switch to use hrtimer_setup() ...
2025-03-25Merge tag 'irq-drivers-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq driver updates from Thomas Gleixner: - Support for hard indices on RISC-V. The hart index identifies a hart (core) within a specific interrupt domain in RISC-V's Priviledged Architecture. - Rework of the RISC-V MSI driver This moves the driver over to the generic MSI library and solves the affinity problem of unmaskable PCI/MSI controllers. Unmaskable PCI/MSI controllers are prone to lose interrupts when the MSI message is updated to change the affinity because the message write consists of three 32-bit subsequent writes, which update address and data. As these writes are non-atomic versus the device raising an interrupt, the device can observe a half written update and issue an interrupt on the wrong vector. This is mitiated by a carefully orchestrated step by step update and the observation of an eventually pending interrupt on the CPU which issues the update. The algorithm follows the well established method of the X86 MSI driver. - A new driver for the RISC-V Sophgo SG2042 MSI controller - Overhaul of the Renesas RZQ2L driver Simplification of the probe function by using devm_*() mechanisms, which avoid the endless list of error prone gotos in the failure paths. - Expand the Renesas RZV2H driver to support RZ/G3E SoCs - A workaround for Rockchip 3568002 erratum in the GIC-V3 driver to ensure that the addressing is limited to the lower 32-bit of the physical address space. - Add support for the Allwinner AS23 NMI controller - Expand the IMX irqsteer driver to handle up to 960 input interrupts - The usual small updates, cleanups and device tree changes * tag 'irq-drivers-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) irqchip/imx-irqsteer: Support up to 960 input interrupts irqchip/sunxi-nmi: Support Allwinner A523 NMI controller dt-bindings: irq: sun7i-nmi: Document the Allwinner A523 NMI controller irqchip/davinci-cp-intc: Remove public header irqchip/renesas-rzv2h: Add RZ/G3E support irqchip/renesas-rzv2h: Update macros ICU_TSSR_TSSEL_{MASK,PREP} irqchip/renesas-rzv2h: Update TSSR_TIEN macro irqchip/renesas-rzv2h: Add field_width to struct rzv2h_hw_info irqchip/renesas-rzv2h: Add max_tssel to struct rzv2h_hw_info irqchip/renesas-rzv2h: Add struct rzv2h_hw_info with t_offs variable irqchip/renesas-rzv2h: Use devm_pm_runtime_enable() irqchip/renesas-rzv2h: Use devm_reset_control_get_exclusive_deasserted() irqchip/renesas-rzv2h: Simplify rzv2h_icu_init() irqchip/renesas-rzv2h: Drop irqchip from struct rzv2h_icu_priv irqchip/renesas-rzv2h: Fix wrong variable usage in rzv2h_tint_set_type() dt-bindings: interrupt-controller: renesas,rzv2h-icu: Document RZ/G3E SoC riscv: sophgo: dts: Add msi controller for SG2042 irqchip: Add the Sophgo SG2042 MSI interrupt controller dt-bindings: interrupt-controller: Add Sophgo SG2042 MSI arm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI ...
2025-03-25RISC-V: add f & d extension validation checksConor Dooley
Using Clement's new validation callbacks, support checking that dependencies have been satisfied for the floating point extensions. The check for "d" might be slightly confusingly shorter than that of "f", despite "d" depending on "f". This is because the requirement that a hart supporting double precision must also support single precision, should be validated by dt-bindings etc, not the kernel but lack of support for single precision only is a limitation of the kernel. Tested-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Clément Léger <cleger@rivosinc.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250312-reptile-platinum-62ee0f444a32@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25RISC-V: add vector crypto extension validation checksConor Dooley
Using Clement's new validation callbacks, support checking that dependencies have been satisfied for the vector crpyto extensions. Currently riscv_isa_extension_available(<vector crypto>) will return true on systems that support the extensions but vector itself has been disabled by the kernel, adding validation callbacks will prevent such a scenario from occuring and make the behaviour of the extension detection functions more consistent with user expectations - it's not expected to have to check for vector AND the specific crypto extension. The Unpriv spec states: | The Zvknhb and Zvbc Vector Crypto Extensions --and accordingly the | composite extensions Zvkn, Zvknc, Zvkng, and Zvksc-- require a Zve64x | base, or application ("V") base Vector Extension. All of the other | Vector Crypto Extensions can be built on any embedded (Zve*) or | application ("V") base Vector Extension. While this could be used as the basis for checking that the correct base for individual crypto extensions, but that's not really the kernel's job in my opinion and it is sufficient to leave that sort of precision to the dt-bindings. The kernel only needs to make sure that vector, in some form, is available. Link: https://github.com/riscv/riscv-isa-manual/blob/main/src/vector-crypto.adoc#extensions-overview Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20250312-entertain-shaking-b664142c2f99@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25RISC-V: add vector extension validation checksConor Dooley
Using Clement's new validation callbacks, support checking that dependencies have been satisfied for the vector extensions. From the kernel's perfective, it's not required to differentiate between the conditions for all the various vector subsets - it's the firmware's job to not report impossible combinations. Instead, the kernel only has to check that the correct config options are enabled and to enforce its requirement of the d extension being present for FPU support. Since vector will now be disabled proactively, there's no need to clear the bit in elf_hwcap in riscv_fill_hwcap() any longer. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250312-eclair-affluent-55b098c3602b@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25x86/split_lock: Simplify reenablingMaksim Davydov
When split_lock_mitigate is disabled, each CPU needs its own delayed_work structure. They are used to reenable split lock detection after its disabling. But delayed_work structure must be correctly initialized after its allocation. Current implementation uses deferred initialization that makes the split lock handler code unclear. The code can be simplified a bit if the initialization is moved to the appropriate initcall. sld_setup() is called before setup_per_cpu_areas(), thus it can't be used for this purpose, so introduce an independent initcall for the initialization. [ mingo: Simplified the 'work' assignment line a bit more. ] Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250325085807.171885-1-davydov-max@yandex-team.ru
2025-03-25x86/fpu: Update the outdated comment above fpstate_init_user()Chao Gao
fpu_init_fpstate_user() was removed in: commit 582b01b6ab27 ("x86/fpu: Remove old KVM FPU interface"). Update that comment to accurately reflect the current state regarding its callers. Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250324131931.2097905-1-chao.gao@intel.com
2025-03-25objtool: Fix X86_FEATURE_SMAP alternative handlingJosh Poimboeuf
For X86_FEATURE_SMAP alternatives which replace NOP with STAC or CLAC, uaccess validation skips the NOP branch to avoid following impossible code paths, e.g. where a STAC would be patched but a CLAC wouldn't. However, it's not safe to assume an X86_FEATURE_SMAP alternative is patching STAC/CLAC. There can be other alternatives, like static_cpu_has(), where both branches need to be validated. Fix that by repurposing ANNOTATE_IGNORE_ALTERNATIVE for skipping either original instructions or new ones. This is a more generic approach which enables the removal of the feature checking hacks and the insn->ignore bit. Fixes the following warnings: arch/x86/mm/fault.o: warning: objtool: do_user_addr_fault+0x8ec: __stack_chk_fail() missing __noreturn in .c/.h or NORETURN() in noreturns.h arch/x86/mm/fault.o: warning: objtool: do_user_addr_fault+0x8f1: unreachable instruction [ mingo: Fix up conflicts with recent x86 changes. ] Fixes: ea24213d8088 ("objtool: Add UACCESS validation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/de0621ca242130156a55d5d74fed86994dfa4c9c.1742852846.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/oe-kbuild-all/202503181736.zkZUBv4N-lkp@intel.com/
2025-03-25x86/early_printk: Add support for MMIO-based UARTsDenis Mukhin
During the bring-up of an x86 board, the kernel was crashing before reaching the platform's console driver because of a bug in the firmware, leaving no trace of the boot progress. The only available method to debug the kernel boot process was via the platform's MMIO-based UART, as the board lacked an I/O port-based UART, PCI UART, or functional video output. Then it turned out that earlyprintk= does not have a knob to configure the MMIO-mapped UART. Extend the early printk facility to support platform MMIO-based UARTs on x86 systems, enabling debugging during the system bring-up phase. The command line syntax to enable platform MMIO-based UART is: earlyprintk=mmio,membase[,{nocfg|baudrate}][,keep] Note, the change does not integrate MMIO-based UART support to: arch/x86/boot/early_serial_console.c Also, update kernel parameters documentation with the new syntax and add the missing 'nocfg' setting to the PCI serial cards description. Signed-off-by: Denis Mukhin <dmukhin@ford.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250324-earlyprintk-v3-1-aee7421dc469@ford.com
2025-03-25x86/dumpstack: Fix inaccurate unwinding from exception stacks due to ↵Jann Horn
misplaced assignment Commit: 2e4be0d011f2 ("x86/show_trace_log_lvl: Ensure stack pointer is aligned, again") was intended to ensure alignment of the stack pointer; but it also moved the initialization of the "stack" variable down into the loop header. This was likely intended as a no-op cleanup, since the commit message does not mention it; however, this caused a behavioral change because the value of "regs" is different between the two places. Originally, get_stack_pointer() used the regs provided by the caller; after that commit, get_stack_pointer() instead uses the regs at the top of the stack frame the unwinder is looking at. Often, there are no such regs at all, and "regs" is NULL, causing get_stack_pointer() to fall back to the task's current stack pointer, which is not what we want here, but probably happens to mostly work. Other times, the original regs will point to another regs frame - in that case, the linear guess unwind logic in show_trace_log_lvl() will start unwinding too far up the stack, causing the first frame found by the proper unwinder to never be visited, resulting in a stack trace consisting purely of guess lines. Fix it by moving the "stack = " assignment back where it belongs. Fixes: 2e4be0d011f2 ("x86/show_trace_log_lvl: Ensure stack pointer is aligned, again") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250325-2025-03-unwind-fixes-v1-2-acd774364768@google.com
2025-03-25x86/entry: Fix ORC unwinder for PUSH_REGS with save_ret=1Jann Horn
PUSH_REGS with save_ret=1 is used by interrupt entry helper functions that initially start with a UNWIND_HINT_FUNC ORC state. However, save_ret=1 means that we clobber the helper function's return address (and then later restore the return address further down on the stack); after that point, the only thing on the stack we can unwind through is the IRET frame, so use UNWIND_HINT_IRET_REGS until we have a full pt_regs frame. ( An alternate approach would be to move the pt_regs->di overwrite down such that it is the final step of pt_regs setup; but I don't want to rearrange entry code just to make unwinding a tiny bit more elegant. ) Fixes: 9e809d15d6b6 ("x86/entry: Reduce the code footprint of the 'idtentry' macro") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20250325-2025-03-unwind-fixes-v1-1-acd774364768@google.com
2025-03-25x86/Kconfig: Fix lists in X86_EXTENDED_PLATFORM help textMateusz Jończyk
Support for STA2X11-based systems was removed in February in: dcbb01fbb7ae ("x86/pci: Remove old STA2x11 support") Intel MID for 32-bit platforms was removed from this list also in February in: ca5955dd5f08 ("x86/cpu: Document CONFIG_X86_INTEL_MID as 64-bit-only") Intel MID for 64-bit platforms is a duplicate for "Merrifield/Moorefield MID devices". Fixes: 4047e8773fb6 ("x86/Kconfig: Update lists in X86_EXTENDED_PLATFORM") Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20250322175052.43611-1-mat.jonczyk@o2.pl
2025-03-25x86/Kconfig: Correct X86_X2APIC help textMateusz Jończyk
Currently, it is not true that the kernel will panic with CONFIG_X86_X2APIC=n on systems that require it; it will try to disable the APIC and run without it to at least give the user a clear warning message. See the second variant of check_x2apic() in arch/x86/kernel/apic/apic.c . Also massage some other parts of the help text. Fixes: 9232c49ff31c ("x86/Kconfig: Enable X86_X2APIC by default and improve help text") Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250322154541.40325-1-mat.jonczyk@o2.pl
2025-03-25Merge branch 'linus' into x86/urgent, to pick up fixes and refresh the branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-24x86 boot build: make git ignore stale 'tools' directoryLinus Torvalds
We've had this before: when we remove infrastructure to generate files, the old stale build artifacts still remain in-tree. And when the infrastructure to generate them is gone, so is the gitignore file for those build artifacts. End result: git will see the old generated files, and people will mistakenly commit them. That's what happened with the 'genheaders' file not that long ago (see commit 04a3389b3535 "Remove stale generated 'genheaders' file"). This time it's commit 9c54baab4401 ("x86/boot: Drop CRC-32 checksum and the build tool that generates it") that removed the 'build' file from the arch/x86/boot/tools/ subdirectory, and removed the .gitignore file too (because the whole subdirectory is gone). And as a result, if you don't do a 'git clean -dqfx' or similar to clean up your tree, 'git status' will say Untracked files: (use "git add <file>..." to include in what will be committed) arch/x86/boot/tools/ and some hapless sleep-deprived developer will inevitably decide that that means that they need to 'git add' that directory. Which would bring back some stale generated file that we most definitely do not want in the tree. So when removing directories that had special .gitignore patterns, make sure to add a new gitignore entry in the parent directory for the no longer existing subdirectory. It will avoid mistakes. Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Fixes: 9c54baab4401 ("x86/boot: Drop CRC-32 checksum and the build tool that generates it") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-24Merge tag 'x86-platform-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "Two small cleanups in the x86 platform support code" * tag 'x86-platform-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/olpc: Remove unused variable 'len' in olpc_dt_compatible_match() x86/platform/olpc-xo1-sci: Don't include <linux/pm_wakeup.h> directly
2025-03-24Merge tag 'x86-sev-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Ingo Molnar: - Improve sme_enable() PIC build robustness (Kevin Loughlin) - Simplify vc_handle_msr() a bit (Peng Hao) [ Just reminding myself and everybody else about the endless stream of x86 TLAs: "SEV" is AMD's Secure Encrypted Virtualization - Linus ] * tag 'x86-sev-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Simplify the code by removing unnecessary 'else' statement x86/sev: Add missing RIP_REL_REF() invocations during sme_enable()
2025-03-24Merge tag 'x86-cleanups-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Miscellaneous x86 cleanups by Arnd Bergmann, Charles Han, Mirsad Todorovac, Randy Dunlap, Thorsten Blum and Zhang Kunbo" * tag 'x86-cleanups-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/coco: Replace 'static const cc_mask' with the newly introduced cc_get_mask() function x86/delay: Fix inconsistent whitespace selftests/x86/syscall: Fix coccinelle WARNING recommending the use of ARRAY_SIZE() x86/platform: Fix missing declaration of 'x86_apple_machine' x86/irq: Fix missing declaration of 'io_apic_irqs' x86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s description x86/apic: Use str_disabled_enabled() helper in print_ipi_mode()
2025-03-24Merge tag 'x86-fpu-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/fpu updates from Ingo Molnar: - Improve crypto performance by making kernel-mode FPU reliably usable in softirqs ((Eric Biggers) - Fully optimize out WARN_ON_FPU() (Eric Biggers) - Initial steps to support Support Intel APX (Advanced Performance Extensions) (Chang S. Bae) - Fix KASAN for arch_dup_task_struct() (Benjamin Berg) - Refine and simplify the FPU magic number check during signal return (Chang S. Bae) - Fix inconsistencies in guest FPU xfeatures (Chao Gao, Stanislav Spassov) - selftests/x86/xstate: Introduce common code for testing extended states (Chang S. Bae) - Misc fixes and cleanups (Borislav Petkov, Colin Ian King, Uros Bizjak) * tag 'x86-fpu-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu/xstate: Fix inconsistencies in guest FPU xfeatures x86/fpu: Clarify the "xa" symbolic name used in the XSTATE* macros x86/fpu: Use XSAVE{,OPT,C,S} and XRSTOR{,S} mnemonics in xstate.h x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs x86/fpu/xstate: Simplify print_xstate_features() x86/fpu: Refine and simplify the magic number check during signal return selftests/x86/xstate: Fix spelling mistake "hader" -> "header" x86/fpu: Avoid copying dynamic FP state from init_task in arch_dup_task_struct() vmlinux.lds.h: Remove entry to place init_task onto init_stack selftests/x86/avx: Add AVX tests selftests/x86/xstate: Clarify supported xstates selftests/x86/xstate: Consolidate test invocations into a single entry selftests/x86/xstate: Introduce signal ABI test selftests/x86/xstate: Refactor ptrace ABI test selftests/x86/xstate: Refactor context switching test selftests/x86/xstate: Enumerate and name xstate components selftests/x86/xstate: Refactor XSAVE helpers for general use selftests/x86: Consolidate redundant signal helper functions x86/fpu: Fix guest FPU state buffer allocation size x86/fpu: Fully optimize out WARN_ON_FPU()
2025-03-24Merge tag 'x86-boot-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot code updates from Ingo Molnar: - Memblock setup and other early boot code cleanups (Mike Rapoport) - Export e820_table_kexec[] to sysfs (Dave Young) - Baby steps of adding relocate_kernel() debugging support (David Woodhouse) - Replace open-coded parity calculation with parity8() (Kuan-Wei Chiu) - Move the LA57 trampoline to separate source file (Ard Biesheuvel) - Misc micro-optimizations (Uros Bizjak) - Drop obsolete E820_TYPE_RESERVED_KERN and related code (Mike Rapoport) * tag 'x86-boot-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kexec: Add relocate_kernel() debugging support: Load a GDT x86/boot: Move the LA57 trampoline to separate source file x86/boot: Do not test if AC and ID eflags are changeable on x86_64 x86/bootflag: Replace open-coded parity calculation with parity8() x86/bootflag: Micro-optimize sbf_write() x86/boot: Add missing has_cpuflag() prototype x86/kexec: Export e820_table_kexec[] to sysfs x86/boot: Change some static bootflag functions to bool x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code x86/boot: Split parsing of boot_params into the parse_boot_params() helper function x86/boot: Split kernel resources setup into the setup_kernel_resources() helper function x86/boot: Move setting of memblock parameters to e820__memblock_setup()
2025-03-24Merge tag 'x86-build-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: - Drop CRC-32 checksum and the build tool that generates it (Ard Biesheuvel) - Fix broken copy command in genimage.sh when making isoimage (Nir Lichtman) * tag 'x86-build-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Add back some padding for the CRC-32 checksum x86/boot: Drop CRC-32 checksum and the build tool that generates it x86/build: Fix broken copy command in genimage.sh when making isoimage
2025-03-24Merge tag 'x86-core-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Ingo Molnar: "x86 CPU features support: - Generate the <asm/cpufeaturemasks.h> header based on build config (H. Peter Anvin, Xin Li) - x86 CPUID parsing updates and fixes (Ahmed S. Darwish) - Introduce the 'setcpuid=' boot parameter (Brendan Jackman) - Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan Jackman) - Utilize CPU-type for CPU matching (Pawan Gupta) - Warn about unmet CPU feature dependencies (Sohil Mehta) - Prepare for new Intel Family numbers (Sohil Mehta) Percpu code: - Standardize & reorganize the x86 percpu layout and related cleanups (Brian Gerst) - Convert the stackprotector canary to a regular percpu variable (Brian Gerst) - Add a percpu subsection for cache hot data (Brian Gerst) - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak) - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak) MM: - Add support for broadcast TLB invalidation using AMD's INVLPGB instruction (Rik van Riel) - Rework ROX cache to avoid writable copy (Mike Rapoport) - PAT: restore large ROX pages after fragmentation (Kirill A. Shutemov, Mike Rapoport) - Make memremap(MEMREMAP_WB) map memory as encrypted by default (Kirill A. Shutemov) - Robustify page table initialization (Kirill A. Shutemov) - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn) - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW (Matthew Wilcox) KASLR: - x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir Singh) CPU bugs: - Implement FineIBT-BHI mitigation (Peter Zijlstra) - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta) - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta) - RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta) System calls: - Break up entry/common.c (Brian Gerst) - Move sysctls into arch/x86 (Joel Granados) Intel LAM support updates: (Maciej Wieczor-Retman) - selftests/lam: Move cpu_has_la57() to use cpuinfo flag - selftests/lam: Skip test if LAM is disabled - selftests/lam: Test get_user() LAM pointer handling AMD SMN access updates: - Add SMN offsets to exclusive region access (Mario Limonciello) - Add support for debugfs access to SMN registers (Mario Limonciello) - Have HSMP use SMN through AMD_NODE (Yazen Ghannam) Power management updates: (Patryk Wlazlyn) - Allow calling mwait_play_dead with an arbitrary hint - ACPI/processor_idle: Add FFH state handling - intel_idle: Provide the default enter_dead() handler - Eliminate mwait_play_dead_cpuid_hint() Build system: - Raise the minimum GCC version to 8.1 (Brian Gerst) - Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor) Kconfig: (Arnd Bergmann) - Add cmpxchg8b support back to Geode CPUs - Drop 32-bit "bigsmp" machine support - Rework CONFIG_GENERIC_CPU compiler flags - Drop configuration options for early 64-bit CPUs - Remove CONFIG_HIGHMEM64G support - Drop CONFIG_SWIOTLB for PAE - Drop support for CONFIG_HIGHPTE - Document CONFIG_X86_INTEL_MID as 64-bit-only - Remove old STA2x11 support - Only allow CONFIG_EISA for 32-bit Headers: - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers (Thomas Huth) Assembly code & machine code patching: - x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf) - x86/alternatives: Simplify callthunk patching (Peter Zijlstra) - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf) - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf) - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra) - x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> (Uros Bizjak) - Use named operands in inline asm (Uros Bizjak) - Improve performance by using asm_inline() for atomic locking instructions (Uros Bizjak) Earlyprintk: - Harden early_serial (Peter Zijlstra) NMI handler: - Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus() (Waiman Long) Miscellaneous fixes and cleanups: - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner, Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly Kuznetsov, Xin Li, liuye" * tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (211 commits) zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work around compiler segfault x86/asm: Make asm export of __ref_stack_chk_guard unconditional x86/mm: Only do broadcast flush from reclaim if pages were unmapped perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones perf/x86/intel, x86/cpu: Simplify Intel PMU initialization x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions x86/asm: Use asm_inline() instead of asm() in clwb() x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h> x86/hweight: Use asm_inline() instead of asm() x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm() x86/hweight: Use named operands in inline asm() x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP x86/head/64: Avoid Clang < 17 stack protector in startup code x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro x86/cpu/intel: Limit the non-architectural constant_tsc model checks x86/mm/pat: Replace Intel x86_model checks with VFM ones x86/cpu/intel: Fix fast string initialization for extended Families ...
2025-03-24Merge tag 'perf-core-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Core: - Move perf_event sysctls into kernel/events/ (Joel Granados) - Use POLLHUP for pinned events in error (Namhyung Kim) - Avoid the read if the count is already updated (Peter Zijlstra) - Allow the EPOLLRDNORM flag for poll (Tao Chen) - locking/percpu-rwsem: Add guard support [ NOTE: this got (mis-)merged into the perf tree due to related work ] (Peter Zijlstra) perf_pmu_unregister() related improvements: (Peter Zijlstra) - Simplify the perf_event_alloc() error path - Simplify the perf_pmu_register() error path - Simplify perf_pmu_register() - Simplify perf_init_event() - Simplify perf_event_alloc() - Merge struct pmu::pmu_disable_count into struct perf_cpu_pmu_context::pmu_disable_count - Add this_cpc() helper - Introduce perf_free_addr_filters() - Robustify perf_event_free_bpf_prog() - Simplify the perf_mmap() control flow - Further simplify perf_mmap() - Remove retry loop from perf_mmap() - Lift event->mmap_mutex in perf_mmap() - Detach 'struct perf_cpu_pmu_context' and 'struct pmu' lifetimes - Fix perf_mmap() failure path Uprobes: - Harden x86 uretprobe syscall trampoline check (Jiri Olsa) - Remove redundant spinlock in uprobe_deny_signal() (Liao Chang) - Remove the spinlock within handle_singlestep() (Liao Chang) x86 Intel PMU enhancements: - Support PEBS counters snapshotting (Kan Liang) - Fix intel_pmu_read_event() (Kan Liang) - Extend per event callchain limit to branch stack (Kan Liang) - Fix system-wide LBR profiling (Kan Liang) - Allocate bts_ctx only if necessary (Li RongQing) - Apply static call for drain_pebs (Peter Zijlstra) x86 AMD PMU enhancements: (Ravi Bangoria) - Remove pointless sample period check - Fix ->config to sample period calculation for OP PMU - Fix perf_ibs_op.cnt_mask for CurCnt - Don't allow freq mode event creation through ->config interface - Add PMU specific minimum period - Add ->check_period() callback - Ceil sample_period to min_period - Add support for OP Load Latency Filtering - Update DTLB/PageSize decode logic Hardware breakpoints: - Return EOPNOTSUPP for unsupported breakpoint type (Saket Kumar Bhaskar) Hardlockup detector improvements: (Li Huafei) - perf_event memory leak - Warn if watchdog_ev is leaked Fixes and cleanups: - Misc fixes and cleanups (Andy Shevchenko, Kan Liang, Peter Zijlstra, Ravi Bangoria, Thorsten Blum, XieLudan)" * tag 'perf-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) perf: Fix __percpu annotation perf: Clean up pmu specific data perf/x86: Remove swap_task_ctx() perf/x86/lbr: Fix shorter LBRs call stacks for the system-wide mode perf: Supply task information to sched_task() perf: attach/detach PMU specific data locking/percpu-rwsem: Add guard support perf: Save PMU specific data in task_struct perf: Extend per event callchain limit to branch stack perf/ring_buffer: Allow the EPOLLRDNORM flag for poll perf/core: Use POLLHUP for pinned events in error perf/core: Use sysfs_emit() instead of scnprintf() perf/core: Remove optional 'size' arguments from strscpy() calls perf/x86/intel/bts: Check if bts_ctx is allocated when calling BTS functions uprobes/x86: Harden uretprobe syscall trampoline check watchdog/hardlockup/perf: Warn if watchdog_ev is leaked watchdog/hardlockup/perf: Fix perf_event memory leak perf/x86: Annotate struct bts_buffer::buf with __counted_by() perf/core: Clean up perf_try_init_event() perf/core: Fix perf_mmap() failure path ...
2025-03-24Merge tag 'sched-core-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core & fair scheduler changes: - Cancel the slice protection of the idle entity (Zihan Zhou) - Reduce the default slice to avoid tasks getting an extra tick (Zihan Zhou) - Force propagating min_slice of cfs_rq when {en,de}queue tasks (Tianchen Ding) - Refactor can_migrate_task() to elimate looping (I Hsin Cheng) - Add unlikey branch hints to several system calls (Colin Ian King) - Optimize current_clr_polling() on certain architectures (Yujun Dong) Deadline scheduler: (Juri Lelli) - Remove redundant dl_clear_root_domain call - Move dl_rebuild_rd_accounting to cpuset.h Uclamp: - Use the uclamp_is_used() helper instead of open-coding it (Xuewen Yan) - Optimize sched_uclamp_used static key enabling (Xuewen Yan) Scheduler topology support: (Juri Lelli) - Ignore special tasks when rebuilding domains - Add wrappers for sched_domains_mutex - Generalize unique visiting of root domains - Rebuild root domain accounting after every update - Remove partition_and_rebuild_sched_domains - Stop exposing partition_sched_domains_locked RSEQ: (Michael Jeanson) - Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y - Fix segfault on registration when rseq_cs is non-zero - selftests: Add rseq syscall errors test - selftests: Ensure the rseq ABI TLS is actually 1024 bytes Membarriers: - Fix redundant load of membarrier_state (Nysal Jan K.A.) Scheduler debugging: - Introduce and use preempt_model_str() (Sebastian Andrzej Siewior) - Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar) Fixes and cleanups: - Always save/restore x86 TSC sched_clock() on suspend/resume (Guilherme G. Piccoli) - Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian Andrzej Siewior)" * tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling() sched/debug: Remove CONFIG_SCHED_DEBUG sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from documentation sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional sched/debug: Make 'const_debug' tunables unconditional __read_mostly sched/debug: Change SCHED_WARN_ON() to WARN_ON_ONCE() rseq/selftests: Fix namespace collision with rseq UAPI header include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h sched/topology: Stop exposing partition_sched_domains_locked cgroup/cpuset: Remove partition_and_rebuild_sched_domains sched/topology: Remove redundant dl_clear_root_domain call sched/deadline: Rebuild root domain accounting after every update sched/deadline: Generalize unique visiting of root domains sched/topology: Wrappers for sched_domains_mutex sched/deadline: Ignore special tasks when rebuilding domains tracing: Use preempt_model_str() xtensa: Rely on generic printing of preemption model x86: Rely on generic printing of preemption model s390: Rely on generic printing of preemption model ...
2025-03-24Merge tag 'objtool-core-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - The biggest change is the new option to automatically fail the build on objtool warnings: CONFIG_OBJTOOL_WERROR. While there are no currently known unfixed false positives left, such an expansion in the severity of objtool warnings inevitably creates a risk of build failures, so it's disabled by default and depends on !COMPILE_TEST, so it shouldn't be enabled on allyesconfig/allmodconfig builds and won't be forced on people who just accept build-time defaults in 'make oldconfig'. While the option is strongly recommended, only people who enable it explicitly should see it. (Josh Poimboeuf) - Disable branch profiling in noinstr code with a broad brush that includes all of arch/x86/ and kernel/sched/. (Josh Poimboeuf) - Create backup object files on objtool errors and print exact objtool arguments to make failure analysis easier (Josh Poimboeuf) - Improve noreturn handling (Josh Poimboeuf) - Improve rodata handling (Tiezhu Yang) - Support jump tables, switch tables and goto tables on LoongArch (Tiezhu Yang) - Misc cleanups and fixes (Josh Poimboeuf, David Engraf, Ingo Molnar) * tag 'objtool-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) tracing: Disable branch profiling in noinstr code objtool: Use O_CREAT with explicit mode mask objtool: Add CONFIG_OBJTOOL_WERROR objtool: Create backup on error and print args objtool: Change "warning:" to "error:" for --Werror objtool: Add --Werror option objtool: Add --output option objtool: Upgrade "Linked object detected" warning to error objtool: Consolidate option validation objtool: Remove --unret dependency on --rethunk objtool: Increase per-function WARN_FUNC() rate limit objtool: Update documentation objtool: Improve __noreturn annotation warning objtool: Fix error handling inconsistencies in check() x86/traps: Make exc_double_fault() consistently noreturn LoongArch: Enable jump table for objtool objtool/LoongArch: Add support for goto table objtool/LoongArch: Add support for switch table objtool: Handle PC relative relocation type objtool: Handle different entry size of rodata ...
2025-03-24Merge tag 'locking-core-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Locking primitives: - Micro-optimize percpu_{,try_}cmpxchg{64,128}_op() and {,try_}cmpxchg{64,128} on x86 (Uros Bizjak) - mutexes: extend debug checks in mutex_lock() (Yunhui Cui) - Misc cleanups (Uros Bizjak) Lockdep: - Fix might_fault() lockdep check of current->mm->mmap_lock (Peter Zijlstra) - Don't disable interrupts on RT in disable_irq_nosync_lockdep.*() (Sebastian Andrzej Siewior) - Disable KASAN instrumentation of lockdep.c (Waiman Long) - Add kasan_check_byte() check in lock_acquire() (Waiman Long) - Misc cleanups (Sebastian Andrzej Siewior) Rust runtime integration: - Use Pin for all LockClassKey usages (Mitchell Levy) - sync: Add accessor for the lock behind a given guard (Alice Ryhl) - sync: condvar: Add wait_interruptible_freezable() (Alice Ryhl) - sync: lock: Add an example for Guard:: Lock_ref() (Boqun Feng) Split-lock detection feature (x86): - Fix warning mode with disabled mitigation mode (Maksim Davydov) Locking events: - Add locking events for rtmutex slow paths (Waiman Long) - Add locking events for lockdep (Waiman Long)" * tag 'locking-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Remove disable_irq_lockdep() lockdep: Don't disable interrupts on RT in disable_irq_nosync_lockdep.*() rust: lockdep: Use Pin for all LockClassKey usages rust: sync: condvar: Add wait_interruptible_freezable() rust: sync: lock: Add an example for Guard:: Lock_ref() rust: sync: Add accessor for the lock behind a given guard locking/lockdep: Add kasan_check_byte() check in lock_acquire() locking/lockdep: Disable KASAN instrumentation of lockdep.c locking/lock_events: Add locking events for lockdep locking/lock_events: Add locking events for rtmutex slow paths x86/split_lock: Fix the delayed detection logic lockdep/mm: Fix might_fault() lockdep check of current->mm->mmap_lock x86/locking: Remove semicolon from "lock" prefix locking/mutex: Add MUTEX_WARN_ON() into fast path x86/locking: Use asm_inline for {,try_}cmpxchg{64,128} emulations x86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op()
2025-03-24Merge tag 'bitmap-for-6.15' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - cpumask_next_wrap() rework (me) - GENMASK() simplification (I Hsin) - rust bindings for cpumasks (Viresh and me) - scattered cleanups (Andy, Tamir, Vincent, Ignacio and Joel) * tag 'bitmap-for-6.15' of https://github.com/norov/linux: (22 commits) cpumask: align text in comment riscv: fix test_and_{set,clear}_bit ordering documentation treewide: fix typo 'unsigned __init128' -> 'unsigned __int128' MAINTAINERS: add rust bindings entry for bitmap API rust: Add cpumask helpers uapi: Revert "bitops: avoid integer overflow in GENMASK(_ULL)" cpumask: drop cpumask_next_wrap_old() PCI: hv: Switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap() scsi: lpfc: rework lpfc_next_{online,present}_cpu() scsi: lpfc: switch lpfc_irq_rebalance() to using cpumask_next_wrap() s390: switch stop_machine_yield() to using cpumask_next_wrap() padata: switch padata_find_next() to using cpumask_next_wrap() cpumask: use cpumask_next_wrap() where appropriate cpumask: re-introduce cpumask_next{,_and}_wrap() cpumask: deprecate cpumask_next_wrap() powerpc/xmon: simplify xmon_batch_next_cpu() ibmvnic: simplify ibmvnic_set_queue_affinity() virtio_net: simplify virtnet_set_affinity() objpool: rework objpool_pop() cpumask: add for_each_{possible,online}_cpu_wrap ...
2025-03-24Merge tag 'seccomp-v6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: - avoid the lock trip seccomp_filter_release in common case (Mateusz Guzik) - remove unused 'sd' argument through-out (Oleg Nesterov) - selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64 * tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: avoid the lock trip seccomp_filter_release in common case seccomp: remove the 'sd' argument from __seccomp_filter() seccomp: remove the 'sd' argument from __secure_computing() seccomp: fix the __secure_computing() stub for !HAVE_ARCH_SECCOMP_FILTER seccomp/mips: change syscall_trace_enter() to use secure_computing() selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64
2025-03-24Merge tag 'hardening-v6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "As usual, it's scattered changes all over. Patches touching things outside of our traditional areas in the tree have been Acked by maintainers or were trivial changes: - loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan Vadivel) - samples/check-exec: Fix script name (Mickaël Salaün) - yama: remove needless locking in yama_task_prctl() (Oleg Nesterov) - lib/string_choices: Sort by function name (R Sundar) - hardening: Allow default HARDENED_USERCOPY to be set at compile time (Mel Gorman) - uaccess: Split out compile-time checks into ucopysize.h - kbuild: clang: Support building UM with SUBARCH=i386 - x86: Enable i386 FORTIFY_SOURCE on Clang 16+ - ubsan/overflow: Rework integer overflow sanitizer option - Add missing __nonstring annotations for callers of memtostr*()/strtomem*() - Add __must_be_noncstr() and have memtostr*()/strtomem*() check for it - Introduce __nonstring_array for silencing future GCC 15 warnings" * tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits) compiler_types: Introduce __nonstring_array hardening: Enable i386 FORTIFY_SOURCE on Clang 16+ x86/build: Remove -ffreestanding on i386 with GCC ubsan/overflow: Enable ignorelist parsing and add type filter ubsan/overflow: Enable pattern exclusions ubsan/overflow: Rework integer overflow sanitizer option to turn on everything samples/check-exec: Fix script name yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl() kbuild: clang: Support building UM with SUBARCH=i386 loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported lib/string_choices: Rearrange functions in sorted order string.h: Validate memtostr*()/strtomem*() arguments more carefully compiler.h: Introduce __must_be_noncstr() nilfs2: Mark on-disk strings as nonstring uapi: stddef.h: Introduce __kernel_nonstring x86/tdx: Mark message.bytes as nonstring string: kunit: Mark nonstring test strings as __nonstring scsi: qla2xxx: Mark device strings as nonstring scsi: mpt3sas: Mark device strings as nonstring scsi: mpi3mr: Mark device strings as nonstring ...
2025-03-24Merge tag 'execve-v6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - elf: Define and use note name macros (Akihiko Odaki) - elf: add remaining SHF_ flag macros (Timur Tabi) - binfmt: Remove loader from linux_binprm struct (Yonatan Goldschmidt) - binfmt_elf_fdpic: fix variable set but not used warning (sunliming) * tag 'execve-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: binfmt_elf_fdpic: fix variable set but not used warning elf: add remaining SHF_ flag macros binfmt: Remove loader from linux_binprm struct crash: Remove KEXEC_CORE_NOTE_NAME s390/crash: Use note name macros crash: Use note name macros powerpc/crash: Use note name macros binfmt_elf: Use note name macros elf: Define note name macros
2025-03-24ARM: davinci: always enable CONFIG_ARCH_DAVINCI_DA850Arnd Bergmann
A change to the clk driver broke configurations that enable DA830 but not DA850: arm-linux-gnueabi-ld: drivers/clk/davinci/pll.o: in function `__da850_pll0_of_clk_init_declare': pll.c:(.init.text+0x30): undefined reference to `of_da850_pll0_init' arm-linux-gnueabi-ld: drivers/clk/davinci/pll.o:(.rodata.davinci_pll_id_table+0x14): undefined reference to `da850_pll0_init' arm-linux-gnueabi-ld: drivers/clk/davinci/pll.o:(.rodata.davinci_pll_id_table+0x2c): undefined reference to `da850_pll1_init' arm-linux-gnueabi-ld: drivers/clk/davinci/pll.o:(.rodata.davinci_pll_of_match+0xc0): undefined reference to `of_da850_pll1_init' arm-linux-gnueabi-ld: drivers/clk/davinci/psc.o:(.rodata.davinci_psc_id_table+0x14): undefined reference to `da850_psc0_init_data' arm-linux-gnueabi-ld: drivers/clk/davinci/psc.o:(.rodata.davinci_psc_id_table+0x2c): undefined reference to `da850_psc1_init_data' Select ARCH_DAVINCI_DA850 unconditionally to ensure the driver can still build. Fixes: a31b4dcf188c ("clk: davinci: remove support for da830") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-24Merge tag 'vfs-6.15-rc1.sysv' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs sysv removal from Christian Brauner: "This removes the sysv filesystem. We've discussed this various times. It's time to try" * tag 'vfs-6.15-rc1.sysv' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: sysv: Remove the filesystem
2025-03-24Merge tag 'vfs-6.15-rc1.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - Mount notifications The day has come where we finally provide a new api to listen for mount topology changes outside of /proc/<pid>/mountinfo. A mount namespace file descriptor can be supplied and registered with fanotify to listen for mount topology changes. Currently notifications for mount, umount and moving mounts are generated. The generated notification record contains the unique mount id of the mount. The listmount() and statmount() api can be used to query detailed information about the mount using the received unique mount id. This allows userspace to figure out exactly how the mount topology changed without having to generating diffs of /proc/<pid>/mountinfo in userspace. - Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount api - Support detached mounts in overlayfs Since last cycle we support specifying overlayfs layers via file descriptors. However, we don't allow detached mounts which means userspace cannot user file descriptors received via open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to attach them to a mount namespace via move_mount() first. This is cumbersome and means they have to undo mounts via umount(). Allow them to directly use detached mounts. - Allow to retrieve idmappings with statmount Currently it isn't possible to figure out what idmapping has been attached to an idmapped mount. Add an extension to statmount() which allows to read the idmapping from the mount. - Allow creating idmapped mounts from mounts that are already idmapped So far it isn't possible to allow the creation of idmapped mounts from already idmapped mounts as this has significant lifetime implications. Make the creation of idmapped mounts atomic by allow to pass struct mount_attr together with the open_tree_attr() system call allowing to solve these issues without complicating VFS lookup in any way. The system call has in general the benefit that creating a detached mount and applying mount attributes to it becomes an atomic operation for userspace. - Add a way to query statmount() for supported options Allow userspace to query which mount information can be retrieved through statmount(). - Allow superblock owners to force unmount * tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits) umount: Allow superblock owners to force umount selftests: add tests for mount notification selinux: add FILE__WATCH_MOUNTNS samples/vfs: fix printf format string for size_t fs: allow changing idmappings fs: add kflags member to struct mount_kattr fs: add open_tree_attr() fs: add copy_mount_setattr() helper fs: add vfs_open_tree() helper statmount: add a new supported_mask field samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP selftests: add tests for using detached mount with overlayfs samples/vfs: check whether flag was raised statmount: allow to retrieve idmappings uidgid: add map_id_range_up() fs: allow detached mounts in clone_private_mount() selftests/overlayfs: test specifying layers as O_PATH file descriptors fs: support O_PATH fds with FSCONFIG_SET_FD vfs: add notifications for mount attach and detach fanotify: notify on mount attach and detach ...
2025-03-24Merge tag 'vfs-6.15-rc1.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Add CONFIG_DEBUG_VFS infrastucture: - Catch invalid modes in open - Use the new debug macros in inode_set_cached_link() - Use debug-only asserts around fd allocation and install - Place f_ref to 3rd cache line in struct file to resolve false sharing Cleanups: - Start using anon_inode_getfile_fmode() helper in various places - Don't take f_lock during SEEK_CUR if exclusion is guaranteed by f_pos_lock - Add unlikely() to kcmp() - Remove legacy ->remount_fs method from ecryptfs after port to the new mount api - Remove invalidate_inodes() in favour of evict_inodes() - Simplify ep_busy_loopER by removing unused argument - Avoid mmap sem relocks when coredumping with many missing pages - Inline getname() - Inline new_inode_pseudo() and de-staticize alloc_inode() - Dodge an atomic in putname if ref == 1 - Consistently deref the files table with rcu_dereference_raw() - Dedup handling of struct filename init and refcounts bumps - Use wq_has_sleeper() in end_dir_add() - Drop the lock trip around I_NEW wake up in evict() - Load the ->i_sb pointer once in inode_sb_list_{add,del} - Predict not reaching the limit in alloc_empty_file() - Tidy up do_sys_openat2() with likely/unlikely - Call inode_sb_list_add() outside of inode hash lock - Sort out fd allocation vs dup2 race commentary - Turn page_offset() into a wrapper around folio_pos() - Remove locking in exportfs around ->get_parent() call - try_lookup_one_len() does not need any locks in autofs - Fix return type of several functions from long to int in open - Fix return type of several functions from long to int in ioctls Fixes: - Fix watch queue accounting mismatch" * tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits) fs: sort out fd allocation vs dup2 race commentary, take 2 fs: call inode_sb_list_add() outside of inode hash lock fs: tidy up do_sys_openat2() with likely/unlikely fs: predict not reaching the limit in alloc_empty_file() fs: load the ->i_sb pointer once in inode_sb_list_{add,del} fs: drop the lock trip around I_NEW wake up in evict() fs: use wq_has_sleeper() in end_dir_add() VFS/autofs: try_lookup_one_len() does not need any locks fs: dedup handling of struct filename init and refcounts bumps fs: consistently deref the files table with rcu_dereference_raw() exportfs: remove locking around ->get_parent() call. fs: use debug-only asserts around fd allocation and install fs: dodge an atomic in putname if ref == 1 vfs: Remove invalidate_inodes() ecryptfs: remove NULL remount_fs from super_operations watch_queue: fix pipe accounting mismatch fs: place f_ref to 3rd cache line in struct file to resolve false sharing epoll: simplify ep_busy_loop by removing always 0 argument fs: Turn page_offset() into a wrapper around folio_pos() kcmp: improve performance adding an unlikely hint to task comparisons ...
2025-03-24m68k: coldfire: select PCI_IOMAP for PCIArnd Bergmann
After I dropped CONFIG_GENERIC_IOMAP, some PCI drivers started failing to link when CONFIG_MMU is disabled: ERROR: modpost: "pci_iounmap" [drivers/video/fbdev/i740fb.ko] undefined! ERROR: modpost: "pci_iounmap" [drivers/video/fbdev/vt8623fb.ko] undefined! ERROR: modpost: "pci_iomap_wc" [drivers/video/fbdev/vt8623fb.ko] undefined! ERROR: modpost: "pci_iomap" [drivers/video/fbdev/vt8623fb.ko] undefined! ERROR: modpost: "pci_iounmap" [drivers/video/fbdev/s3fb.ko] undefined! ... It turns out that there were two mistakes in my patch: on !MMU I forgot to enable CONFIG_GENERIC_PCI_IOMAP, and for Coldfire with MMU enabled, teh GENERIC_IOMAP was left in place but incorrectly configured. Fixes: 9d48cc07d0d7 ("m68k/nommu: stop using GENERIC_IOMAP") Reported-by: Greg Ungerer <gerg@linux-m68k.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-24Merge branch 'pm-cpufreq'Rafael J. Wysocki
Merge cpufreq updates for 6.15-rc1: - Manage sysfs attributes and boost frequencies efficiently from cpufreq core to reduce boilerplate code from drivers (Viresh Kumar). - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider, Dhananjay Ugwekar, Imran Shaik, and zuoqian). - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky Bai). - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski). - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng). - Optimize the amd-pstate driver to avoid cases where call paths end up calling the same writes multiple times and needlessly caching variables through code reorganization, locking overhaul and tracing adjustments (Mario Limonciello, Dhananjay Ugwekar). - Make it possible to avoid enabling capacity-aware scheduling (CAS) in the intel_pstate driver and relocate a check for out-of-band (OOB) platform handling in it to make it detect OOB before checking HWP availability (Rafael Wysocki). - Fix dbs_update() to avoid inadvertent conversions of negative integer values to unsigned int which causes CPU frequency selection to be inaccurate in some cases when the "conservative" cpufreq governor is in use (Jie Zhan). * pm-cpufreq: (91 commits) dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650 dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1 dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible cpufreq: Init cpufreq only for present CPUs cpufreq: tegra186: Share policy per cluster cpufreq/amd-pstate: Drop actions in amd_pstate_epp_cpu_offline() cpufreq/amd-pstate: Stop caching EPP cpufreq/amd-pstate: Rework CPPC enabling cpufreq/amd-pstate: Drop debug statements for policy setting cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions cpufreq/amd-pstate: Cache CPPC request in shared mem case too cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks cpufreq/amd-pstate-ut: Adjust variable scope cpufreq/amd-pstate-ut: Run on all of the correct CPUs cpufreq/amd-pstate-ut: Drop SUCCESS and FAIL enums cpufreq/amd-pstate-ut: Allow lowest nonlinear and lowest to be the same cpufreq/amd-pstate-ut: Use _free macro to free put policy cpufreq/amd-pstate: Drop `cppc_cap1_cached` ...
2025-03-24Merge branches 'acpi-x86', 'acpi-platform-profile', 'acpi-apei' and 'acpi-misc'Rafael J. Wysocki
Merge an ACPI CPPC update, ACPI platform-profile driver updates, an ACPI APEI update and a MAINTAINERS update related to ACPI for 6.15-rc1: - Add a missing header file include to the x86 arch CPPC code (Mario Limonciello). - Rework the sysfs attributes implementation in the ACPI platform-profile driver and improve the unregistration code in it (Nathan Chancellor, Kurt Borja). - Prevent the ACPI HED driver from being built as a module and change its initcall level to subsys_initcall to avoid initialization ordering issues related to it (Xiaofei Tan). - Update a maintainer email address in the ACPI PMIC entry in MAINTAINERS (Mika Westerberg). * acpi-x86: x86/ACPI: CPPC: Add missing include * acpi-platform-profile: ACPI: platform_profile: Improve platform_profile_unregister() ACPI: platform-profile: Fix CFI violation when accessing sysfs files * acpi-apei: ACPI: HED: Always initialize before evged * acpi-misc: MAINTAINERS: Use my kernel.org address for ACPI PMIC work
2025-03-22x86: drop unnecessary prefix map configurationThomas Weißschuh
The toplevel Makefile already provides -fmacro-prefix-map as part of KBUILD_CPPFLAGS. In contrast to the KBUILD_CFLAGS and KBUILD_AFLAGS variables, KBUILD_CPPFLAGS is not redefined in the architecture specific Makefiles. Therefore the toplevel KBUILD_CPPFLAGS do apply just fine, to both C and ASM sources. The custom configuration was necessary when it was added in commit 9e2276fa6eb3 ("arch/x86/boot: Use prefix map to avoid embedded paths") but has since become unnecessary in commit a716bd743210 ("kbuild: use -fmacro-prefix-map for .S sources"). Drop the now unnecessary custom prefix map configuration. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-03-22tracing: Disable branch profiling in noinstr codeJosh Poimboeuf
CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update() for each use of likely() or unlikely(). That breaks noinstr rules if the affected function is annotated as noinstr. Disable branch profiling for files with noinstr functions. In addition to some individual files, this also includes the entire arch/x86 subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle directories, all of which are noinstr-heavy. Due to the nature of how sched binaries are built by combining multiple .c files into one, branch profiling is disabled more broadly across the sched code than would otherwise be needed. This fixes many warnings like the following: vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section ... Reported-by: Ingo Molnar <mingo@kernel.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
2025-03-22x86/speculation: Remove the extra #ifdef around CALL_NOSPECPawan Gupta
Commit: 010c4a461c1d ("x86/speculation: Simplify and make CALL_NOSPEC consistent") added an #ifdef CONFIG_MITIGATION_RETPOLINE around the CALL_NOSPEC definition. This is not required as this code is already under a larger #ifdef. Remove the extra #ifdef, no functional change. vmlinux size remains same before and after this change: CONFIG_MITIGATION_RETPOLINE=y: text data bss dec hex filename 25434752 7342290 2301212 35078254 217406e vmlinux.before 25434752 7342290 2301212 35078254 217406e vmlinux.after # CONFIG_MITIGATION_RETPOLINE is not set: text data bss dec hex filename 22943094 6214994 1550152 30708240 1d49210 vmlinux.before 22943094 6214994 1550152 30708240 1d49210 vmlinux.after Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20250320-call-nospec-extra-ifdef-v1-1-d9b084d24820@linux.intel.com
2025-03-22perf/amd/ibs: Prevent leaking sensitive data to userspaceNamhyung Kim
Although IBS "swfilt" can prevent leaking samples with kernel RIP to the userspace, there are few subtle cases where a 'data' address and/or a 'branch target' address can fall under kernel address range although RIP is from userspace. Prevent leaking kernel 'data' addresses by discarding such samples when {exclude_kernel=1,swfilt=1}. IBS can now be invoked by unprivileged user with the introduction of "swfilt". However, this creates a loophole in the interface where an unprivileged user can get physical address of the userspace virtual addresses through IBS register raw dump (PERF_SAMPLE_RAW). Prevent this as well. This upstream commit fixed the most obvious leak: 65a99264f5e5 perf/x86: Check data address for IBS software filter Follow that up with a more complete fix. Fixes: d29e744c7167 ("perf/x86: Relax privilege filter restriction on AMD IBS") Suggested-by: Matteo Rizzo <matteorizzo@google.com> Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250321161251.1033-1-ravi.bangoria@amd.com
2025-03-22x86/Kconfig: Document release year of glibc 2.3.3Mateusz Jończyk
I wonder how many people were checking their glibc version when considering whether to enable this option. Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: David Heidelberg <david@ixit.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-7-b0cbaa6fa338@ixit.cz
2025-03-22x86/Kconfig: Make CONFIG_PCI_CNB20LE_QUIRK depend on X86_32Mateusz Jończyk
I was unable to find a good description of the ServerWorks CNB20LE chipset. However, it was probably exclusively used with the Pentium III processor (this CPU model was used in all references to it that I found where the CPU model was provided: dmesgs in [1] and [2]; [3] page 2; [4]-[7]). As is widely known, the Pentium III processor did not support the 64-bit mode, support for which was introduced by Intel a couple of years later. So it is safe to assume that no systems with the CNB20LE chipset have amd64 and the CONFIG_PCI_CNB20LE_QUIRK may now depend on X86_32. Additionally, I have determined that most computers with the CNB20LE chipset did have ACPI support and this driver was inactive on them. I have submitted a patch to remove this driver, but it was met with resistance [8]. [1] Jim Studt, Re: Problem with ServerWorks CNB20LE and lost interrupts Linux Kernel Mailing List, https://lkml.org/lkml/2002/1/11/111 [2] RedHat Bug 665109 - e100 problems on old Compaq Proliant DL320 https://bugzilla.redhat.com/show_bug.cgi?id=665109 [3] R. Hughes-Jones, S. Dallison, G. Fairey, Performance Measurements on Gigabit Ethernet NICs and Server Quality Motherboards, http://datatag.web.cern.ch/papers/pfldnet2003-rhj.doc [4] "Hardware for Linux", Probe #d6b5151873 of Intel STL2-bd A28808-302 Desktop Computer (STL2) https://linux-hardware.org/?probe=d6b5151873 [5] "Hardware for Linux", Probe #0b5d843f10 of Compaq ProLiant DL380 https://linux-hardware.org/?probe=0b5d843f10 [6] Ubuntu Forums, Dell Poweredge 2400 - Adaptec SCSI Bus AIC-7880 https://ubuntuforums.org/showthread.php?t=1689552 [7] Ira W. Snyder, "BISECTED: 2.6.35 (and -git) fail to boot: APIC problems" https://lkml.org/lkml/2010/8/13/220 [8] Bjorn Helgaas, "Re: [PATCH] x86/pci: drop ServerWorks / Broadcom CNB20LE PCI host bridge driver" https://lore.kernel.org/lkml/20220318165535.GA840063@bhelgaas/T/ Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: David Heideberg <david@ixit.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-6-b0cbaa6fa338@ixit.cz
2025-03-22x86/Kconfig: Document CONFIG_PCI_MMCONFIGMateusz Jończyk
This configuration option had no help text, so add it. CONFIG_EXPERT is enabled on some distribution kernels, so people using a distribution kernel's configuration as a starting point will see this option. [ mingo: Standardized the new Kconfig text a bit. ] Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: David Heideberg <david@ixit.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-5-b0cbaa6fa338@ixit.cz
2025-03-22x86/Kconfig: Update lists in X86_EXTENDED_PLATFORMMateusz Jończyk
The order of the entries matches the order they appear in Kconfig. In 2011, AMD Elan was moved to Kconfig.cpu and the dependency on X86_EXTENDED_PLATFORM was dropped in: ce9c99af8d4b ("x86, cpu: Move AMD Elan Kconfig under "Processor family"") Support for Moorestown MID devices was removed in 2012 in: 1a8359e411eb ("x86/mid: Remove Intel Moorestown") SGI 320/540 (Visual Workstation) was removed in 2014 in: c5f9ee3d665a ("x86, platforms: Remove SGI Visual Workstation") Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: David Heideberg <david@ixit.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-4-b0cbaa6fa338@ixit.cz
2025-03-22x86/Kconfig: Move all X86_EXTENDED_PLATFORM options togetherMateusz Jończyk
So that these options will be displayed together in menuconfig etc. Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: David Heidelberg <david@ixit.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-3-b0cbaa6fa338@ixit.cz
2025-03-22x86/Kconfig: Always enable ARCH_SPARSEMEM_ENABLEMateusz Jończyk
It appears that (X86_64 || X86_32) is always true on x86. This logical OR directive was introduced in: 6ea3038648da ("arch/x86: remove depends on CONFIG_EXPERIMENTAL") By (EXPERIMENTAL && X86_32) turning into (X86_32). Since this change was an identity transformation, nobody noticed that the condition turned into 'true'. [ mingo: Updated changelog ] Fixes: 6ea3038648da ("arch/x86: remove depends on CONFIG_EXPERIMENTAL") Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: David Heideberg <david@ixit.cz> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-2-b0cbaa6fa338@ixit.cz
2025-03-22x86/Kconfig: Enable X86_X2APIC by default and improve help textMateusz Jończyk
As many current platforms (most modern Intel CPUs and QEMU) have x2APIC present, enable CONFIG_X86_X2APIC by default as it gives performance and functionality benefits. Additionally, if the BIOS has already switched APIC to x2APIC mode, but CONFIG_X86_X2APIC is disabled, the kernel will panic in arch/x86/kernel/apic/apic.c . Also improve the help text, which was confusing and really did not describe what the feature is about. Help text references and discussion: Both Intel [1] and AMD [3] spell the name as "x2APIC", not "x2apic". "It allows faster access to the local APIC" [2], chapter 2.1, page 15: "More efficient MSR interface to access APIC registers." "x2APIC was introduced in Intel CPUs around 2008": I was unable to find specific information which Intel CPUs support x2APIC. Wikipedia claims it was "introduced with the Nehalem microarchitecture in November 2008", but I was not able to confirm this independently. At least some Nehalem CPUs do not support x2APIC [1]. The documentation [2] is dated June 2008. Linux kernel also introduced x2APIC support in 2008, so the year seems to be right. "and in AMD EPYC CPUs in 2019": [3], page 15: "AMD introduced an x2APIC in our EPYC 7002 Series processors for the first time." "It is also frequently emulated in virtual machines, even when the host CPU does not support it." [1] "If this configuration option is disabled, the kernel will not boot on some platforms that have x2APIC enabled." According to some BIOS documentation [4], the x2APIC may be "disabled", "enabled", or "force enabled" on this system. I think that "enabled" means "made available to the operating system, but not already turned on" and "force enabled" means "already switched to x2APIC mode when the OS boots". Only in the latter mode a kernel without CONFIG_X86_X2APIC will panic in validate_x2apic() in arch/x86/kernel/apic/apic.c . QEMU 4.2.1 and my Intel HP laptop (bought in 2019) use the "enabled" mode and the kernel does not panic. [1] "Re: [Qemu-devel] [Question] why x2apic's set by default without host sup" https://lists.gnu.org/archive/html/qemu-devel/2013-07/msg03527.html [2] Intel® 64 Architecture x2APIC Specification, ( https://www.naic.edu/~phil/software/intel/318148.pdf ) [3] Workload Tuning Guide for AMD EPYC ™ 7002 Series Processor Based Servers Application Note, https://developer.amd.com/wp-content/resources/56745_0.80.pdf [4] UEFI System Utilities and Shell Command Mobile Help for HPE ProLiant Gen10, ProLiant Gen10 Plus Servers and HPE Synergy: Enabling or disabling Processor x2APIC Support https://techlibrary.hpe.com/docs/iss/proliant-gen10-uefi/s_enable_disable_x2APIC_support.html Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250321-x86_x2apic-v3-1-b0cbaa6fa338@ixit.cz
2025-03-21x86/mm: restore early initialization of high_memory for 32-bitsMike Rapoport (Microsoft)
Kernel test robot reports the following crash on 32-bit system with HIGHMEM and DEBUG_VIRTUAL: [ 0.056128][ T0] kernel BUG at arch/x86/mm/physaddr.c:77! PANIC: early exception 0x06 IP 60:c116539d error 0 cr2 0x0 [ 0.056916][ T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.14.0-rc4-00010-ga4dbe5c71817 #1 [ 0.057570][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 0.058299][ T0] EIP: __phys_addr (arch/x86/mm/physaddr.c:77) [ 0.058633][ T0] Code: 00 74 33 89 f0 e8 d3 8b 2e 00 89 c3 0f b6 d0 b8 58 bb 4b c5 31 c9 6a 00 e8 70 f5 15 00 83 c4 04 84 db 74 25 ff 05 78 de 5d c5 <0f> 0b b8 c8 91 ea c4 e8 e7 6e ea ff b8 58 bb 4b c5 31 d2 31 c9 6a All code [ 0.060017][ T0] EAX: 00000000 EBX: c61f7001 ECX: 00000000 EDX: 00000000 [ 0.060519][ T0] ESI: c61f7000 EDI: 061f7000 EBP: c4e31f04 ESP: c61f7000 [ 0.061016][ T0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: cff4 EFLAGS: 00210002 [ 0.061560][ T0] CR0: 80050033 CR2: 00000000 CR3: 059fc000 CR4: 00000090 [ 0.062060][ T0] Call Trace: [ 0.062288][ T0] ? show_regs (arch/x86/kernel/dumpstack.c:478) [ 0.062588][ T0] ? early_fixup_exception (arch/x86/include/asm/nospec-branch.h:595) [ 0.062968][ T0] ? early_idt_handler_common (arch/x86/kernel/head_32.S:352) [ 0.063360][ T0] ? __phys_addr (arch/x86/mm/physaddr.c:77) [ 0.063677][ T0] ? one_page_table_init (arch/x86/mm/init_32.c:100) [ 0.064037][ T0] ? page_table_range_init (arch/x86/mm/init_32.c:227) [ 0.064411][ T0] ? permanent_kmaps_init (include/linux/pgtable.h:191 include/linux/pgtable.h:196 arch/x86/mm/init_32.c:395) [ 0.064814][ T0] ? paging_init (arch/x86/mm/init_32.c:677) [ 0.065118][ T0] ? native_pagetable_init (arch/x86/mm/init_32.c:481) [ 0.065503][ T0] ? setup_arch (arch/x86/kernel/setup.c:1131) [ 0.065819][ T0] ? start_kernel (include/linux/jump_label.h:267 init/main.c:920) [ 0.066143][ T0] ? i386_start_kernel (arch/x86/kernel/head32.c:79) [ 0.066501][ T0] ? startup_32_smp (arch/x86/kernel/head_32.S:292) The crash happens because commit e120d1bc12da ("arch, mm: set high_memory in free_area_init()") moved initialization of high_memory after __vmalloc_start_set and with high_memory still set to 0 any address passes is_vmalloc_addr() check. Restore early initialization of high_memory on 32-bit systems in initmem_init(). Link: https://lkml.kernel.org/r/20250319122337.1538924-1-rppt@kernel.org Fixes: e120d1bc12da ("arch, mm: set high_memory in free_area_init()") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202503191442.112e954f-lkp@intel.com Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>