summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2025-02-15Merge tag 'uml-for-linus-6.14-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML fixes from Richard Weinberger: - Align signal stack correctly - Convert to raw spinlocks where needed (irq and virtio) - FPU related fixes * tag 'uml-for-linus-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: convert irq_lock to raw spinlock um: virtio_uml: use raw spinlock um: virt-pci: don't use kmalloc() um: fix execve stub execution on old host OSs um: properly align signal stack on x86_64 um: avoid copying FP state from init_task um: add back support for FXSAVE registers
2025-02-15Merge tag 's390-6.14-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix isolated VFs handling by verifying that a VF’s parent PF is locally owned before registering it in an existing PCI domain - Disable arch_test_bit() optimization for PROFILE_ALL_BRANCHES to workaround gcc failure in handling __builtin_constant_p() in this case - Fix CHPID "configure" attribute caching in CIO by not updating the cache when SCLP returns no data, ensuring consistent sysfs output - Remove CONFIG_LSM from default configs and rely on defaults, which enables BPF LSM hook * tag 's390-6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: Fix handling of isolated VFs s390/pci: Pull search for parent PF out of zpci_iov_setup_virtfn() s390/bitops: Disable arch_test_bit() optimization for PROFILE_ALL_BRANCHES s390/cio: Fix CHPID "configure" attribute caching s390/configs: Remove CONFIG_LSM
2025-02-14Merge tag 'alpha-fixes-v6.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha Pull alpha fixes from Matt Turner: "A few changes for alpha, including some important fixes for kernel stack alignment" * tag 'alpha-fixes-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha: alpha: Use str_yes_no() helper in pci_dac_dma_supported() alpha: Replace one-element array with flexible array member alpha: align stack for page fault and user unaligned trap handlers alpha: make stack 16-byte aligned (most cases) alpha: replace hardcoded stack offsets with autogenerated ones
2025-02-14alpha: Use str_yes_no() helper in pci_dac_dma_supported()Thorsten Blum
Remove hard-coded strings by using the str_yes_no() helper function. Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Matt Turner <mattst88@gmail.com>
2025-02-14alpha: Replace one-element array with flexible array memberThorsten Blum
Replace the deprecated one-element array with a modern flexible array member in the struct crb_struct. Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Matt Turner <mattst88@gmail.com>
2025-02-14alpha: align stack for page fault and user unaligned trap handlersIvan Kokshaysky
do_page_fault() and do_entUna() are special because they use non-standard stack frame layout. Fix them manually. Cc: stable@vger.kernel.org Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Maciej W. Rozycki <macro@orcam.me.uk> Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Ivan Kokshaysky <ink@unseen.parts> Signed-off-by: Matt Turner <mattst88@gmail.com>
2025-02-14alpha: make stack 16-byte aligned (most cases)Ivan Kokshaysky
The problem is that GCC expects 16-byte alignment of the incoming stack since early 2004, as Maciej found out [1]: Having actually dug speculatively I can see that the psABI was changed in GCC 3.5 with commit e5e10fb4a350 ("re PR target/14539 (128-bit long double improperly aligned)") back in Mar 2004, when the stack pointer alignment was increased from 8 bytes to 16 bytes, and arch/alpha/kernel/entry.S has various suspicious stack pointer adjustments, starting with SP_OFF which is not a whole multiple of 16. Also, as Magnus noted, "ALPHA Calling Standard" [2] required the same: D.3.1 Stack Alignment This standard requires that stacks be octaword aligned at the time a new procedure is invoked. However: - the "normal" kernel stack is always misaligned by 8 bytes, thanks to the odd number of 64-bit words in 'struct pt_regs', which is the very first thing pushed onto the kernel thread stack; - syscall, fault, interrupt etc. handlers may, or may not, receive aligned stack depending on numerous factors. Somehow we got away with it until recently, when we ended up with a stack corruption in kernel/smp.c:smp_call_function_single() due to its use of 32-byte aligned local data and the compiler doing clever things allocating it on the stack. This adds padding between the PAL-saved and kernel-saved registers so that 'struct pt_regs' have an even number of 64-bit words. This makes the stack properly aligned for most of the kernel code, except two handlers which need special threatment. Note: struct pt_regs doesn't belong in uapi/asm; this should be fixed, but let's put this off until later. Link: https://lore.kernel.org/rcu/alpine.DEB.2.21.2501130248010.18889@angie.orcam.me.uk/ [1] Link: https://bitsavers.org/pdf/dec/alpha/Alpha_Calling_Standard_Rev_2.0_19900427.pdf [2] Cc: stable@vger.kernel.org Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Tested-by: Magnus Lindholm <linmag7@gmail.com> Tested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Ivan Kokshaysky <ink@unseen.parts> Signed-off-by: Matt Turner <mattst88@gmail.com>
2025-02-14alpha: replace hardcoded stack offsets with autogenerated onesIvan Kokshaysky
This allows the assembly in entry.S to automatically keep in sync with changes in the stack layout (struct pt_regs and struct switch_stack). Cc: stable@vger.kernel.org Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Tested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Ivan Kokshaysky <ink@unseen.parts> Signed-off-by: Matt Turner <mattst88@gmail.com>
2025-02-14Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Fix kexec and hibernation when using 5-level page-table configuration - Remove references to non-existent SF8MM4 and SF8MM8 ID register fields, hooking up hwcaps for the FPRCVT, F8MM4 and F8MM8 fields instead - Drop unused .ARM.attributes ELF sections - Fix array indexing when probing CPU cache topology from firmware - Fix potential use-after-free in AMU initialisation code - Work around broken GTDT entries by tolerating excessively large timer arrays - Force use of Rust's "softfloat" target to avoid a threatening warning about the NEON target feature - Typo fix in GCS documentation and removal of duplicate Kconfig select * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: rust: clean Rust 1.85.0 warning using softfloat target arm64: Add missing registrations of hwcaps ACPI: GTDT: Relax sanity checking on Platform Timers array count arm64: amu: Delay allocating cpumask for AMU FIE support arm64: cacheinfo: Avoid out-of-bounds write to cacheinfo array arm64: Handle .ARM.attributes section in linker scripts arm64/hwcap: Remove stray references to SF8MMx arm64/gcs: Fix documentation for HWCAP arm64: Kconfig: Remove selecting replaced HAVE_FUNCTION_GRAPH_RETVAL arm64: Fix 5-level paging support in kexec/hibernate trampoline
2025-02-14Merge tag 'for-linus-6.14-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Three fixes to xen-swiotlb driver: - two fixes for issues coming up due to another fix in 6.12 - addition of an __init annotation" * tag 'for-linus-6.14-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Xen/swiotlb: mark xen_swiotlb_fixup() __init x86/xen: allow larger contiguous memory regions in PV guests xen/swiotlb: relax alignment requirements
2025-02-13Merge tag 'loongarch-fixes-6.14-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix bugs about idle, kernel_page_present(), IP checksum and KVM, plus some trival cleanups" * tag 'loongarch-fixes-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Set host with kernel mode when switch to VM mode LoongArch: KVM: Remove duplicated cache attribute setting LoongArch: KVM: Fix typo issue about GCFG feature detection LoongArch: csum: Fix OoB access in IP checksum code for negative lengths LoongArch: Remove the deprecated notifier hook mechanism LoongArch: Use str_yes_no() helper function for /proc/cpuinfo LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGE LoongArch: Fix idle VS timer enqueue
2025-02-13x86/xen: allow larger contiguous memory regions in PV guestsJuergen Gross
Today a PV guest (including dom0) can create 2MB contiguous memory regions for DMA buffers at max. This has led to problems at least with the megaraid_sas driver, which wants to allocate a 2.3MB DMA buffer. The limiting factor is the frame array used to do the hypercall for making the memory contiguous, which has 512 entries and is just a static array in mmu_pv.c. In order to not waste memory for non-PV guests, put the initial frame array into .init.data section and dynamically allocate an array from the .init_after_bootmem hook of PV guests. In case a contiguous memory area larger than the initially supported 2MB is requested, allocate a larger buffer for the frame list. Note that such an allocation is tried only after memory management has been initialized properly, which is tested via a flag being set in the .init_after_bootmem hook. Fixes: 9f40ec84a797 ("xen/swiotlb: add alignment check for dma buffers") Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Alan Robinson <Alan.Robinson@fujitsu.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-13arm64: rust: clean Rust 1.85.0 warning using softfloat targetMiguel Ojeda
Starting with Rust 1.85.0 (to be released 2025-02-20), `rustc` warns [1] about disabling neon in the aarch64 hardfloat target: warning: target feature `neon` cannot be toggled with `-Ctarget-feature`: unsound on hard-float targets because it changes float ABI | = note: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release! = note: for more information, see issue #116344 <https://github.com/rust-lang/rust/issues/116344> Thus, instead, use the softfloat target instead. While trying it out, I found that the kernel sanitizers were not enabled for that built-in target [2]. Upstream Rust agreed to backport the enablement for the current beta so that it is ready for the 1.85.0 release [3] -- thanks! However, that still means that before Rust 1.85.0, we cannot switch since sanitizers could be in use. Thus conditionally do so. Cc: stable@vger.kernel.org # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Matthew Maurer <mmaurer@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Ralf Jung <post@ralfj.de> Cc: Jubilee Young <workingjubilee@gmail.com> Link: https://github.com/rust-lang/rust/pull/133417 [1] Link: https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/arm64.20neon.20.60-Ctarget-feature.60.20warning/near/495358442 [2] Link: https://github.com/rust-lang/rust/pull/135905 [3] Link: https://github.com/rust-lang/rust/issues/116344 Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Trevor Gross <tmgross@umich.edu> Tested-by: Matthew Maurer <mmaurer@google.com> Reviewed-by: Ralf Jung <post@ralfj.de> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250210163732.281786-1-ojeda@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-02-13arm64: Add missing registrations of hwcapsMark Brown
Commit 819935464cb2 ("arm64/hwcap: Describe 2024 dpISA extensions to userspace") added definitions for HWCAP_FPRCVT, HWCAP_F8MM8 and HWCAP_F8MM4 but did not include the crucial registration in arm64_elf_hwcaps. Add it. Fixes: 819935464cb2 ("arm64/hwcap: Describe 2024 dpISA extensions to userspace") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250212-arm64-fix-2024-dpisa-v2-1-67a1c11d6001@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-02-13arm64: amu: Delay allocating cpumask for AMU FIE supportBeata Michalska
For the time being, the amu_fie_cpus cpumask is being exclusively used by the AMU-related internals of FIE support and is guaranteed to be valid on every access currently made. Still the mask is not being invalidated on one of the error handling code paths, which leaves a soft spot with theoretical risk of UAF for CPUMASK_OFFSTACK cases. To make things sound, delay allocating said cpumask (for CPUMASK_OFFSTACK) avoiding otherwise nasty sanitising case failing to register the cpufreq policy notifications. Signed-off-by: Beata Michalska <beata.michalska@arm.com> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com> Reviewed-by: Sumit Gupta <sumitg@nvidia.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20250131155842.3839098-1-beata.michalska@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-02-13LoongArch: KVM: Set host with kernel mode when switch to VM modeBibo Mao
PRMD register is only meaningful on the beginning stage of exception entry, and it is overwritten with nested irq or exception. When CPU runs in VM mode, interrupt need be enabled on host. And the mode for host had better be kernel mode rather than random or user mode. When VM is running, the running mode with top command comes from CRMD register, and running mode should be kernel mode since kernel function is executing with perf command. It needs be consistent with both top and perf command. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: KVM: Remove duplicated cache attribute settingBibo Mao
Cache attribute comes from GPA->HPA secondary mmu page table and is configured when kvm is enabled. It is the same for all VMs, so remove duplicated cache attribute setting on vCPU context switch. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: KVM: Fix typo issue about GCFG feature detectionBibo Mao
This is typo issue and misusage about GCFG feature macro. The code is wrong, only that it does not cause obvious problem since GCFG is set again on vCPU context switch. Fixes: 0d0df3c99d4f ("LoongArch: KVM: Implement kvm hardware enable, disable interface") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: csum: Fix OoB access in IP checksum code for negative lengthsYuli Wang
Commit 69e3a6aa6be2 ("LoongArch: Add checksum optimization for 64-bit system") would cause an undefined shift and an out-of-bounds read. Commit 8bd795fedb84 ("arm64: csum: Fix OoB access in IP checksum code for negative lengths") fixes the same issue on ARM64. Fixes: 69e3a6aa6be2 ("LoongArch: Add checksum optimization for 64-bit system") Co-developed-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: Remove the deprecated notifier hook mechanismYuli Wang
The notifier hook mechanism in proc and cpuinfo is actually unnecessary for LoongArch because it's not used anywhere. It was originally added to the MIPS code in commit d6d3c9afaab4 ("MIPS: MT: proc: Add support for printing VPE and TC ids"), and LoongArch then inherited it. But as the kernel code stands now, this notifier hook mechanism doesn't really make sense for either LoongArch or MIPS. In addition, the seq_file forward declaration needs to be moved to its proper place, as only the show_ipi_list() function in smp.c requires it. Co-developed-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: Use str_yes_no() helper function for /proc/cpuinfoYuli Wang
Remove hard-coded strings by using the str_yes_no() helper function. Similar to commit c4a0a4a45a45 ("MIPS: kernel: proc: Use str_yes_no() helper function"). Co-developed-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: Fix kernel_page_present() for KPRANGE/XKPRANGEHuacai Chen
Now kernel_page_present() always return true for KPRANGE/XKPRANGE addresses, this isn't correct because hibernation (ACPI S4) use it to distinguish whether a page is saveable. If all KPRANGE/XKPRANGE addresses are considered as saveable, then reserved memory such as EFI_RUNTIME_SERVICES_CODE / EFI_RUNTIME_SERVICES_DATA will also be saved and restored. Fix this by returning true only if the KPRANGE/XKPRANGE address is in memblock.memory. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-13LoongArch: Fix idle VS timer enqueueMarco Crivellari
LoongArch re-enables interrupts on its idle routine and performs a TIF_NEED_RESCHED check afterwards before putting the CPU to sleep. The IRQs firing between the check and the idle instruction may set the TIF_NEED_RESCHED flag. In order to deal with such a race, IRQs interrupting __arch_cpu_idle() rollback their return address to the beginning of __arch_cpu_idle() so that TIF_NEED_RESCHED is checked again before going back to sleep. However idle IRQs can also queue timers that may require a tick reprogramming through a new generic idle loop iteration but those timers would go unnoticed here because __arch_cpu_idle() only checks TIF_NEED_RESCHED. It doesn't check for pending timers. Fix this with fast-forwarding idle IRQs return address to the end of the idle routine instead of the beginning, so that the generic idle loop can handle both TIF_NEED_RESCHED and pending timers. Fixes: 0603839b18f4 ("LoongArch: Add exception/interrupt handling") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-02-12um: convert irq_lock to raw spinlockJohannes Berg
Since this is deep in the architecture, and the code is called nested into other deep management code, this really needs to be a raw spinlock. Convert it. Link: https://patch.msgid.link/20250110125550.32479-8-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-12um: virtio_uml: use raw spinlockJohannes Berg
This is needed because at least in time-travel the code can be called directly from the deep architecture and IRQ handling code. Link: https://patch.msgid.link/20250110125550.32479-7-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-12um: virt-pci: don't use kmalloc()Johannes Berg
This code can be called deep in the IRQ handling, for example, and then cannot normally use kmalloc(). Have its own pre-allocated memory and use from there instead so this doesn't occur. Only in the (very rare) case of memcpy_toio() we'd still need to allocate memory. Link: https://patch.msgid.link/20250110125550.32479-6-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-12um: fix execve stub execution on old host OSsBenjamin Berg
The stub execution uses the somewhat new close_range and execveat syscalls. Of these two, the execveat call is essential, but the close_range call is more about stub process hygiene rather than safety (and its result is ignored). Replace both calls with a raw syscall as older machines might not have a recent enough kernel for close_range (with CLOSE_RANGE_CLOEXEC) or a libc that does not yet expose both of the syscalls. Fixes: 32e8eaf263d9 ("um: use execveat to create userspace MMs") Reported-by: Glenn Washburn <development@efficientek.com> Closes: https://lore.kernel.org/20250108022404.05e0de1e@crass-HP-ZBook-15-G2 Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250113094107.674738-1-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-12um: properly align signal stack on x86_64Benjamin Berg
The stack needs to be properly aligned so 16 byte memory accesses on the stack are correct. This was broken when introducing the dynamic math register sizing as the rounding was not moved appropriately. Fixes: 3f17fed21491 ("um: switch to regset API and depend on XSTATE") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20250107133509.265576-1-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-12um: avoid copying FP state from init_taskBenjamin Berg
The init_task instance of struct task_struct is statically allocated and does not contain the dynamic area for the userspace FP registers. As such, limit the copy to the valid area of init_task and fill the rest with zero. Note that the FP state is only needed for userspace, and as such it is entirely reasonable for init_task to not contain it. Reported-by: Brian Norris <briannorris@chromium.org> Closes: https://lore.kernel.org/Z1ySXmjZm-xOqk90@google.com Fixes: 3f17fed21491 ("um: switch to regset API and depend on XSTATE") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20241217202745.1402932-3-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-12um: add back support for FXSAVE registersBenjamin Berg
It was reported that qemu may not enable the XSTATE CPU extension, which is a requirement after commit 3f17fed21491 ("um: switch to regset API and depend on XSTATE"). Add a fallback to use FXSAVE (FP registers on x86_64 and XFP on i386) which is just a shorter version of the same data. The only difference is that the XSTATE magic should not be set in the signal frame. Note that this still drops support for the older i386 FP register layout as supporting this would require more backward compatibility to build a correct signal frame. Fixes: 3f17fed21491 ("um: switch to regset API and depend on XSTATE") Reported-by: SeongJae Park <sj@kernel.org> Closes: https://lore.kernel.org/r/20241203070218.240797-1-sj@kernel.org Tested-by: SeongJae Park <sj@kernel.org> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20241204074827.1582917-1-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-02-11s390/pci: Fix handling of isolated VFsNiklas Schnelle
In contrast to the commit message of the fixed commit VFs whose parent PF is not configured are not always isolated, that is put on their own PCI domain. This is because for VFs to be added to an existing PCI domain it is enough for that PCI domain to share the same topology ID or PCHID. Such a matching PCI domain without a parent PF may exist when a PF from the same PCI card created the domain with the VF being a child of a different, non accessible, PF. While not causing technical issues it makes the rules which VFs are isolated inconsistent. Fix this by explicitly checking that the parent PF exists on the PCI domain determined by the topology ID or PCHID before registering the VF. This works because a parent PF which is under control of this Linux instance must be enabled and configured at the point where its child VFs appear because otherwise SR-IOV could not have been enabled on the parent. Fixes: 25f39d3dcb48 ("s390/pci: Ignore RID for isolated VFs") Cc: stable@vger.kernel.org Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-02-11s390/pci: Pull search for parent PF out of zpci_iov_setup_virtfn()Niklas Schnelle
This creates a new zpci_iov_find_parent_pf() function which a future commit can use to find if a VF has a configured parent PF. Use zdev->rid instead of zdev->devfn such that the new function can be used before it has been decided if the RID will be exposed and zdev->devfn is set. Also handle the hypotheical case that the RID is not available but there is an otherwise matching zbus. Fixes: 25f39d3dcb48 ("s390/pci: Ignore RID for isolated VFs") Cc: stable@vger.kernel.org Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-02-11s390/bitops: Disable arch_test_bit() optimization for PROFILE_ALL_BRANCHESHeiko Carstens
With PROFILE_ALL_BRANCHES enabled gcc sometimes fails to handle __builtin_constant_p() correctly: In function 'arch_test_bit', inlined from 'node_state' at include/linux/nodemask.h:423:9, inlined from 'warn_if_node_offline' at include/linux/gfp.h:252:2, inlined from '__alloc_pages_node_noprof' at include/linux/gfp.h:267:2, inlined from 'alloc_pages_node_noprof' at include/linux/gfp.h:296:9, inlined from 'vm_area_alloc_pages.constprop' at mm/vmalloc.c:3591:11: >> arch/s390/include/asm/bitops.h:60:17: warning: 'asm' operand 2 probably does not match constraints 60 | asm volatile( | ^~~ >> arch/s390/include/asm/bitops.h:60:17: error: impossible constraint in 'asm' Therefore disable the optimization for this case. This is similar to commit 63678eecec57 ("s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES") Fixes: b2bc1b1a77c0 ("s390/bitops: Provide optimized arch_test_bit()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502091912.xL2xTCGw-lkp@intel.com/ Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-02-11s390/configs: Remove CONFIG_LSMIlya Leoshkevich
s390 defconfig does not have BPF LSM, resulting in systemd[1]: bpf-restrict-fs: BPF LSM hook not enabled in the kernel, BPF LSM not supported. with the respective kernels. The other architectures do not explicitly set it, and the default values have BPF in them, so just drop it. Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-02-11x86/cpu/kvm: SRSO: Fix possible missing IBPB on VM-ExitPatrick Bellasi
In [1] the meaning of the synthetic IBPB flags has been redefined for a better separation of concerns: - ENTRY_IBPB -- issue IBPB on entry only - IBPB_ON_VMEXIT -- issue IBPB on VM-Exit only and the Retbleed mitigations have been updated to match this new semantics. Commit [2] was merged shortly before [1], and their interaction was not handled properly. This resulted in IBPB not being triggered on VM-Exit in all SRSO mitigation configs requesting an IBPB there. Specifically, an IBPB on VM-Exit is triggered only when X86_FEATURE_IBPB_ON_VMEXIT is set. However: - X86_FEATURE_IBPB_ON_VMEXIT is not set for "spec_rstack_overflow=ibpb", because before [1] having X86_FEATURE_ENTRY_IBPB was enough. Hence, an IBPB is triggered on entry but the expected IBPB on VM-exit is not. - X86_FEATURE_IBPB_ON_VMEXIT is not set also when "spec_rstack_overflow=ibpb-vmexit" if X86_FEATURE_ENTRY_IBPB is already set. That's because before [1] this was effectively redundant. Hence, e.g. a "retbleed=ibpb spec_rstack_overflow=bpb-vmexit" config mistakenly reports the machine still vulnerable to SRSO, despite an IBPB being triggered both on entry and VM-Exit, because of the Retbleed selected mitigation config. - UNTRAIN_RET_VM won't still actually do anything unless CONFIG_MITIGATION_IBPB_ENTRY is set. For "spec_rstack_overflow=ibpb", enable IBPB on both entry and VM-Exit and clear X86_FEATURE_RSB_VMEXIT which is made superfluous by X86_FEATURE_IBPB_ON_VMEXIT. This effectively makes this mitigation option similar to the one for 'retbleed=ibpb', thus re-order the code for the RETBLEED_MITIGATION_IBPB option to be less confusing by having all features enabling before the disabling of the not needed ones. For "spec_rstack_overflow=ibpb-vmexit", guard this mitigation setting with CONFIG_MITIGATION_IBPB_ENTRY to ensure UNTRAIN_RET_VM sequence is effectively compiled in. Drop instead the CONFIG_MITIGATION_SRSO guard, since none of the SRSO compile cruft is required in this configuration. Also, check only that the required microcode is present to effectively enabled the IBPB on VM-Exit. Finally, update the KConfig description for CONFIG_MITIGATION_IBPB_ENTRY to list also all SRSO config settings enabled by this guard. Fixes: 864bcaa38ee4 ("x86/cpu/kvm: Provide UNTRAIN_RET_VM") [1] Fixes: d893832d0e1e ("x86/srso: Add IBPB on VMEXIT") [2] Reported-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Patrick Bellasi <derkling@google.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly clean the BSS to the PoC before allowing EL2 to access it on nVHE/hVHE/protected configurations - Propagate ownership of debug registers in protected mode after the rework that landed in 6.14-rc1 - Stop pretending that we can run the protected mode without a GICv3 being present on the host - Fix a use-after-free situation that can occur if a vcpu fails to initialise the NV shadow S2 MMU contexts - Always evaluate the need to arm a background timer for fully emulated guest timers - Fix the emulation of EL1 timers in the absence of FEAT_ECV - Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0 s390: - move some of the guest page table (gmap) logic into KVM itself, inching towards the final goal of completely removing gmap from the non-kvm memory management code. As an initial set of cleanups, move some code from mm/gmap into kvm and start using __kvm_faultin_pfn() to fault-in pages as needed; but especially stop abusing page->index and page->lru to aid in the pgdesc conversion. x86: - Add missing check in the fix to defer starting the huge page recovery vhost_task - SRSO_USER_KERNEL_NO does not need SYNTHESIZED_F" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (31 commits) KVM: x86/mmu: Ensure NX huge page recovery thread is alive before waking KVM: remove kvm_arch_post_init_vm KVM: selftests: Fix spelling mistake "initally" -> "initially" kvm: x86: SRSO_USER_KERNEL_NO is not synthesized KVM: arm64: timer: Don't adjust the EL2 virtual timer offset KVM: arm64: timer: Correctly handle EL1 timer emulation when !FEAT_ECV KVM: arm64: timer: Always evaluate the need for a soft timer KVM: arm64: Fix nested S2 MMU structures reallocation KVM: arm64: Fail protected mode init if no vgic hardware is present KVM: arm64: Flush/sync debug state in protected mode KVM: s390: selftests: Streamline uc_skey test to issue iske after sske KVM: s390: remove the last user of page->index KVM: s390: move PGSTE softbits KVM: s390: remove useless page->index usage KVM: s390: move gmap_shadow_pgt_lookup() into kvm KVM: s390: stop using lists to keep track of used dat tables KVM: s390: stop using page->index for non-shadow gmaps KVM: s390: move some gmap shadowing functions away from mm/gmap.c KVM: s390: get rid of gmap_translate() KVM: s390: get rid of gmap_fault() ...
2025-02-08Merge tag 'execve-v6.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve fix from Kees Cook: "This is an alpha-specific fix, but since it touched ELF I was asked to carry it. - alpha/elf: Fix misc/setarch test of util-linux by removing 32bit support (Eric W. Biederman)" * tag 'execve-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: alpha/elf: Fix misc/setarch test of util-linux by removing 32bit support
2025-02-08Merge tag 'x86-urgent-2025-02-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Ingo Molnar: "Fix a build regression on GCC 15 builds, caused by GCC changing the default C version that is overriden in the main Makefile but not in the x86 boot code Makefile" * tag 'x86-urgent-2025-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Use '-std=gnu11' to fix build with GCC 15
2025-02-07arm64: cacheinfo: Avoid out-of-bounds write to cacheinfo arrayRadu Rendec
The loop that detects/populates cache information already has a bounds check on the array size but does not account for cache levels with separate data/instructions cache. Fix this by incrementing the index for any populated leaf (instead of any populated level). Fixes: 5d425c186537 ("arm64: kernel: add support for cpu cache information") Signed-off-by: Radu Rendec <rrendec@redhat.com> Link: https://lore.kernel.org/r/20250206174420.2178724-1-rrendec@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2025-02-07arm64: Handle .ARM.attributes section in linker scriptsNathan Chancellor
A recent LLVM commit [1] started generating an .ARM.attributes section similar to the one that exists for 32-bit, which results in orphan section warnings (or errors if CONFIG_WERROR is enabled) from the linker because it is not handled in the arm64 linker scripts. ld.lld: error: arch/arm64/kernel/vdso/vgettimeofday.o:(.ARM.attributes) is being placed in '.ARM.attributes' ld.lld: error: arch/arm64/kernel/vdso/vgetrandom.o:(.ARM.attributes) is being placed in '.ARM.attributes' ld.lld: error: vmlinux.a(lib/vsprintf.o):(.ARM.attributes) is being placed in '.ARM.attributes' ld.lld: error: vmlinux.a(lib/win_minmax.o):(.ARM.attributes) is being placed in '.ARM.attributes' ld.lld: error: vmlinux.a(lib/xarray.o):(.ARM.attributes) is being placed in '.ARM.attributes' Discard the new sections in the necessary linker scripts to resolve the warnings, as the kernel and vDSO do not need to retain it, similar to the .note.gnu.property section. Cc: stable@vger.kernel.org Fixes: b3e5d80d0c48 ("arm64/build: Warn on orphan section placement") Link: https://github.com/llvm/llvm-project/commit/ee99c4d4845db66c4daa2373352133f4b237c942 [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250206-arm64-handle-arm-attributes-in-linker-script-v3-1-d53d169913eb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-02-07genirq: Remove leading space from irq_chip::irq_print_chip() callbacksGeert Uytterhoeven
The space separator was factored out from the multiple chip name prints, but several irq_chip::irq_print_chip() callbacks still print a leading space. Remove the superfluous double spaces. Fixes: 9d9f204bdf7243bf ("genirq/proc: Add missing space separator back") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/893f7e9646d8933cd6786d5a1ef3eb076d263768.1738764803.git.geert+renesas@glider.be
2025-02-06Merge tag 'for-linus-6.14-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Three fixes for xen_hypercall_hvm() that was introduced in the 6.13 cycle" * tag 'for-linus-6.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: remove unneeded dummy push from xen_hypercall_hvm() x86/xen: add FRAME_END to xen_hypercall_hvm() x86/xen: fix xen_hypercall_hvm() to not clobber %rbx
2025-02-06alpha/elf: Fix misc/setarch test of util-linux by removing 32bit supportEric W. Biederman
Richard Henderson <richard.henderson@linaro.org> writes[1]: > There was a Spec benchmark (I forget which) which was memory bound and ran > twice as fast with 32-bit pointers. > > I copied the idea from DEC to the ELF abi, but never did all the other work > to allow the toolchain to take advantage. > > Amusingly, a later Spec changed the benchmark data sets to not fit into a > 32-bit address space, specifically because of this. > > I expect one could delete the ELF bit and personality and no one would > notice. Not even the 10 remaining Alpha users. In [2] it was pointed out that parts of setarch weren't working properly on alpha because it has it's own SET_PERSONALITY implementation. In the discussion that followed Richard Henderson pointed out that the 32bit pointer support for alpha was never completed. Fix this by removing alpha's 32bit pointer support. As a bit of paranoia refuse to execute any alpha binaries that have the EF_ALPHA_32BIT flag set. Just in case someone somewhere has binaries that try to use alpha's 32bit pointer support. Link: https://lkml.kernel.org/r/CAFXwXrkgu=4Qn-v1PjnOR4SG0oUb9LSa0g6QXpBq4ttm52pJOQ@mail.gmail.com [1] Link: https://lkml.kernel.org/r/20250103140148.370368-1-glaubitz@physik.fu-berlin.de [2] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/87y0zfs26i.fsf_-_@email.froward.int.ebiederm.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-05x86/xen: remove unneeded dummy push from xen_hypercall_hvm()Juergen Gross
Stack alignment of the kernel in 64-bit mode is 8, not 16, so the dummy push in xen_hypercall_hvm() for aligning the stack to 16 bytes can be removed. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-05x86/xen: add FRAME_END to xen_hypercall_hvm()Juergen Gross
xen_hypercall_hvm() is missing a FRAME_END at the end, add it. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502030848.HTNTTuo9-lkp@intel.com/ Fixes: b4845bb63838 ("x86/xen: add central hypercall functions") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-05x86/xen: fix xen_hypercall_hvm() to not clobber %rbxJuergen Gross
xen_hypercall_hvm(), which is used when running as a Xen PVH guest at most only once during early boot, is clobbering %rbx. Depending on whether the caller relies on %rbx to be preserved across the call or not, this clobbering might result in an early crash of the system. This can be avoided by using an already saved register instead of %rbx. Fixes: b4845bb63838 ("x86/xen: add central hypercall functions") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2025-02-04KVM: x86/mmu: Ensure NX huge page recovery thread is alive before wakingSean Christopherson
When waking a VM's NX huge page recovery thread, ensure the thread is actually alive before trying to wake it. Now that the thread is spawned on-demand during KVM_RUN, a VM without a recovery thread is reachable via the related module params. BUG: kernel NULL pointer dereference, address: 0000000000000040 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:vhost_task_wake+0x5/0x10 Call Trace: <TASK> set_nx_huge_pages+0xcc/0x1e0 [kvm] param_attr_store+0x8a/0xd0 module_attr_store+0x1a/0x30 kernfs_fop_write_iter+0x12f/0x1e0 vfs_write+0x233/0x3e0 ksys_write+0x60/0xd0 do_syscall_64+0x5b/0x160 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f3b52710104 </TASK> Modules linked in: kvm_intel kvm CR2: 0000000000000040 Fixes: 931656b9e2ff ("kvm: defer huge page recovery vhost task to later") Cc: stable@vger.kernel.org Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250124234623.3609069-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-04KVM: remove kvm_arch_post_init_vmPaolo Bonzini
The only statement in a kvm_arch_post_init_vm implementation can be moved into the x86 kvm_arch_init_vm. Do so and remove all traces from architecture-independent code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-04Merge tag 'kvmarm-fixes-6.14-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.14, take #1 - Correctly clean the BSS to the PoC before allowing EL2 to access it on nVHE/hVHE/protected configurations - Propagate ownership of debug registers in protected mode after the rework that landed in 6.14-rc1 - Stop pretending that we can run the protected mode without a GICv3 being present on the host - Fix a use-after-free situation that can occur if a vcpu fails to initialise the NV shadow S2 MMU contexts - Always evaluate the need to arm a background timer for fully emulated guest timers - Fix the emulation of EL1 timers in the absence of FEAT_ECV - Correctly handle the EL2 virtual timer, specially when HCR_EL2.E2H==0
2025-02-04Merge tag 'kvm-s390-next-6.14-2' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - some selftest fixes - move some kvm-related functions from mm into kvm - remove all usage of page->index and page->lru from kvm - fixes and cleanups for vsie