summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-06-26KVM: arm64: Don't free hyp pages with pKVM on GICv2Quentin Perret
Marc reported that enabling protected mode on a device with GICv2 doesn't fail gracefully as one would expect, and leads to a host kernel crash. As it turns out, the first half of pKVM init happens before the vgic probe, and so by the time we find out we have a GICv2 we're already committed to keeping the pKVM vectors installed at EL2 -- pKVM rejects stub HVCs for obvious security reasons. However, the error path on KVM init leads to teardown_hyp_mode() which unconditionally frees hypervisor allocations (including the EL2 stacks and per-cpu pages) under the assumption that a previous cpu_hyp_uninit() execution has reset the vectors back to the stubs, which is false with pKVM. Interestingly, host stage-2 protection is not enabled yet at this point, so this use-after-free may go unnoticed for a while. The issue becomes more obvious after the finalize_pkvm() call. Fix this by keeping track of the CPUs on which pKVM is initialized in the kvm_hyp_initialized per-cpu variable, and use it from teardown_hyp_mode() to skip freeing pages that are in fact used. Fixes: a770ee80e662 ("KVM: arm64: pkvm: Disable GICv2 support") Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20250626101014.1519345-1-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-26KVM: arm64: Fix error path in init_hyp_mode()Mostafa Saleh
In the unlikely case pKVM failed to allocate carveout, the error path tries to access NULL ptr when it de-reference the SVE state from the uninitialized nVHE per-cpu base. [ 1.575420] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 1.576010] pc : teardown_hyp_mode+0xe4/0x180 [ 1.576920] lr : teardown_hyp_mode+0xd0/0x180 [ 1.577308] sp : ffff8000826fb9d0 [ 1.577600] x29: ffff8000826fb9d0 x28: 0000000000000000 x27: ffff80008209b000 [ 1.578383] x26: ffff800081dde000 x25: ffff8000820493c0 x24: ffff80008209eb00 [ 1.579180] x23: 0000000000000040 x22: 0000000000000001 x21: 0000000000000000 [ 1.579881] x20: 0000000000000002 x19: ffff800081d540b8 x18: 0000000000000000 [ 1.580544] x17: ffff800081205230 x16: 0000000000000152 x15: 00000000fffffff8 [ 1.581183] x14: 0000000000000008 x13: fff00000ff7f6880 x12: 000000000000003e [ 1.581813] x11: 0000000000000002 x10: 00000000000000ff x9 : 0000000000000000 [ 1.582503] x8 : 0000000000000000 x7 : 7f7f7f7f7f7f7f7f x6 : 43485e525851ff30 [ 1.583140] x5 : fff00000ff6e9030 x4 : fff00000ff6e8f80 x3 : 0000000000000000 [ 1.583780] x2 : 0000000000000000 x1 : 0000000000000002 x0 : 0000000000000000 [ 1.584526] Call trace: [ 1.584945] teardown_hyp_mode+0xe4/0x180 (P) [ 1.585578] init_hyp_mode+0x920/0x994 [ 1.586005] kvm_arm_init+0xb4/0x25c [ 1.586387] do_one_initcall+0xe0/0x258 [ 1.586819] do_initcall_level+0xa0/0xd4 [ 1.587224] do_initcalls+0x54/0x94 [ 1.587606] do_basic_setup+0x1c/0x28 [ 1.587998] kernel_init_freeable+0xc8/0x130 [ 1.588409] kernel_init+0x20/0x1a4 [ 1.588768] ret_from_fork+0x10/0x20 [ 1.589568] Code: f875db48 8b1c0109 f100011f 9a8903e8 (f9463100) [ 1.590332] ---[ end trace 0000000000000000 ]--- As Quentin pointed, the order of free is also wrong, we need to free SVE state first before freeing the per CPU ptrs. I initially observed this on 6.12, but I could also repro in master. Signed-off-by: Mostafa Saleh <smostafa@google.com> Fixes: 66d5b53e20a6 ("KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM") Reviewed-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20250625123058.875179-1-smostafa@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-26KVM: arm64: Adjust range correctly during host stage-2 faultsQuentin Perret
host_stage2_adjust_range() tries to find the largest block mapping that fits within a memory or mmio region (represented by a kvm_mem_range in this function) during host stage-2 faults under pKVM. To do so, it walks the host stage-2 page-table, finds the faulting PTE and its level, and then progressively increments the level until it finds a granule of the appropriate size. However, the condition in the loop implementing the above is broken as it checks kvm_level_supports_block_mapping() for the next level instead of the current, so pKVM may attempt to map a region larger than can be covered with a single block. This is not a security problem and is quite rare in practice (the kvm_mem_range check usually forces host_stage2_adjust_range() to choose a smaller granule), but this is clearly not the expected behaviour. Refactor the loop to fix the bug and improve readability. Fixes: c4f0935e4d95 ("KVM: arm64: Optimize host memory aborts") Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20250625105548.984572-1-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-26KVM: arm64: nv: Fix MI line level calculation in vgic_v3_nested_update_mi()Wei-Lin Chang
The state of the vcpu's MI line should be asserted when its ICH_HCR_EL2.En is set and ICH_MISR_EL2 is non-zero. Using bitwise AND (&=) directly for this calculation will not give us the correct result when the LSB of the vcpu's ICH_MISR_EL2 isn't set. Correct this by directly computing the line level with a logical AND operation. Signed-off-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw> Link: https://lore.kernel.org/r/20250625084709.3968844-1-r09922117@csie.ntu.edu.tw [maz: drop the level check from the original code] Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: VHE: Centralize ISBs when returning to hostMark Rutland
The VHE hyp code has recently gained a few ISBs. Simplify this to one unconditional ISB in __kvm_vcpu_run_vhe(), and remove the unnecessary ISB from the kvm_call_hyp_ret() macro. While kvm_call_hyp_ret() is also used to invoke __vgic_v3_get_gic_config(), but no ISB is necessary in that case either. For the moment, an ISB is left in kvm_call_hyp(), as there are many more users, and removing the ISB would require a more thorough audit. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250617133718.4014181-8-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: Remove cpacr_clear_set()Mark Rutland
We no longer use cpacr_clear_set(). Remove cpacr_clear_set() and its helper functions. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250617133718.4014181-7-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: Remove ad-hoc CPTR manipulation from kvm_hyp_handle_fpsimd()Mark Rutland
The hyp code FPSIMD/SVE/SME trap handling logic has some rather messy open-coded manipulation of CPTR/CPACR. This is benign for non-nested guests, but broken for nested guests, as the guest hypervisor's CPTR configuration is not taken into account. Consider the case where L0 provides FPSIMD+SVE to an L1 guest hypervisor, and the L1 guest hypervisor only provides FPSIMD to an L2 guest (with L1 configuring CPTR/CPACR to trap SVE usage from L2). If the L2 guest triggers an FPSIMD trap to the L0 hypervisor, kvm_hyp_handle_fpsimd() will see that the vCPU supports FPSIMD+SVE, and will configure CPTR/CPACR to NOT trap FPSIMD+SVE before returning to the L2 guest. Consequently the L2 guest would be able to manipulate SVE state even though the L1 hypervisor had configured CPTR/CPACR to forbid this. Clean this up, and fix the nested virt issue by always using __deactivate_cptr_traps() and __activate_cptr_traps() to manage the CPTR traps. This removes the need for the ad-hoc fixup in kvm_hyp_save_fpsimd_host(), and ensures that any guest hypervisor configuration of CPTR/CPACR is taken into account. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250617133718.4014181-6-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: Remove ad-hoc CPTR manipulation from fpsimd_sve_sync()Mark Rutland
There's no need for fpsimd_sve_sync() to write to CPTR/CPACR. All relevant traps are always disabled earlier within __kvm_vcpu_run(), when __deactivate_cptr_traps() configures CPTR/CPACR. With irrelevant details elided, the flow is: handle___kvm_vcpu_run(...) { flush_hyp_vcpu(...) { fpsimd_sve_flush(...); } __kvm_vcpu_run(...) { __activate_traps(...) { __activate_cptr_traps(...); } do { __guest_enter(...); } while (...); __deactivate_traps(....) { __deactivate_cptr_traps(...); } } sync_hyp_vcpu(...) { fpsimd_sve_sync(...); } } Remove the unnecessary write to CPTR/CPACR. An ISB is still necessary, so a comment is added to describe this requirement. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250617133718.4014181-5-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: Reorganise CPTR trap manipulationMark Rutland
The NVHE/HVHE and VHE modes have separate implementations of __activate_cptr_traps() and __deactivate_cptr_traps() in their respective switch.c files. There's some duplication of logic, and it's not currently possible to reuse this logic elsewhere. Move the logic into the common switch.h header so that it can be reused, and de-duplicate the common logic. This rework changes the way SVE traps are deactivated in VHE mode, aligning it with NVHE/HVHE modes: * Before this patch, VHE's __deactivate_cptr_traps() would unconditionally enable SVE for host EL2 (but not EL0), regardless of whether the ARM64_SVE cpucap was set. * After this patch, VHE's __deactivate_cptr_traps() will take the ARM64_SVE cpucap into account. When ARM64_SVE is not set, SVE will be trapped from EL2 and below. The old and new behaviour are both benign: * When ARM64_SVE is not set, the host will not touch SVE state, and will not reconfigure SVE traps. Host EL0 access to SVE will be trapped as expected. * When ARM64_SVE is set, the host will configure EL0 SVE traps before returning to EL0 as part of reloading the EL0 FPSIMD/SVE/SME state. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250617133718.4014181-4-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: VHE: Synchronize CPTR trap deactivationMark Rutland
Currently there is no ISB between __deactivate_cptr_traps() disabling traps that affect EL2 and fpsimd_lazy_switch_to_host() manipulating registers potentially affected by CPTR traps. When NV is not in use, this is safe because the relevant registers are only accessed when guest_owns_fp_regs() && vcpu_has_sve(vcpu), and this also implies that SVE traps affecting EL2 have been deactivated prior to __guest_entry(). When NV is in use, a guest hypervisor may have configured SVE traps for a nested context, and so it is necessary to have an ISB between __deactivate_cptr_traps() and fpsimd_lazy_switch_to_host(). Due to the current lack of an ISB, when a guest hypervisor enables SVE traps in CPTR, the host can take an unexpected SVE trap from within fpsimd_lazy_switch_to_host(), e.g. | Unhandled 64-bit el1h sync exception on CPU1, ESR 0x0000000066000000 -- SVE | CPU: 1 UID: 0 PID: 164 Comm: kvm-vcpu-0 Not tainted 6.15.0-rc4-00138-ga05e0f012c05 #3 PREEMPT | Hardware name: FVP Base RevC (DT) | pstate: 604023c9 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : __kvm_vcpu_run+0x6f4/0x844 | lr : __kvm_vcpu_run+0x150/0x844 | sp : ffff800083903a60 | x29: ffff800083903a90 x28: ffff000801f4a300 x27: 0000000000000000 | x26: 0000000000000000 x25: ffff000801f90000 x24: ffff000801f900f0 | x23: ffff800081ff7720 x22: 0002433c807d623f x21: ffff000801f90000 | x20: ffff00087f730730 x19: 0000000000000000 x18: 0000000000000000 | x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 | x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 | x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 | x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff000801f90d70 | x5 : 0000000000001000 x4 : ffff8007fd739000 x3 : ffff000801f90000 | x2 : 0000000000000000 x1 : 00000000000003cc x0 : ffff800082f9d000 | Kernel panic - not syncing: Unhandled exception | CPU: 1 UID: 0 PID: 164 Comm: kvm-vcpu-0 Not tainted 6.15.0-rc4-00138-ga05e0f012c05 #3 PREEMPT | Hardware name: FVP Base RevC (DT) | Call trace: | show_stack+0x18/0x24 (C) | dump_stack_lvl+0x60/0x80 | dump_stack+0x18/0x24 | panic+0x168/0x360 | __panic_unhandled+0x68/0x74 | el1h_64_irq_handler+0x0/0x24 | el1h_64_sync+0x6c/0x70 | __kvm_vcpu_run+0x6f4/0x844 (P) | kvm_arm_vcpu_enter_exit+0x64/0xa0 | kvm_arch_vcpu_ioctl_run+0x21c/0x870 | kvm_vcpu_ioctl+0x1a8/0x9d0 | __arm64_sys_ioctl+0xb4/0xf4 | invoke_syscall+0x48/0x104 | el0_svc_common.constprop.0+0x40/0xe0 | do_el0_svc+0x1c/0x28 | el0_svc+0x30/0xcc | el0t_64_sync_handler+0x10c/0x138 | el0t_64_sync+0x198/0x19c | SMP: stopping secondary CPUs | Kernel Offset: disabled | CPU features: 0x0000,000002c0,02df4fb9,97ee773f | Memory Limit: none | ---[ end Kernel panic - not syncing: Unhandled exception ]--- Fix this by adding an ISB between __deactivate_traps() and fpsimd_lazy_switch_to_host(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250617133718.4014181-3-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: VHE: Synchronize restore of host debug registersMark Rutland
When KVM runs in non-protected VHE mode, there's no context synchronization event between __debug_switch_to_host() restoring the host debug registers and __kvm_vcpu_run() unmasking debug exceptions. Due to this, it's theoretically possible for the host to take an unexpected debug exception due to the stale guest configuration. This cannot happen in NVHE/HVHE mode as debug exceptions are masked in the hyp code, and the exception return to the host will provide the necessary context synchronization before debug exceptions can be taken. For now, avoid the problem by adding an ISB after VHE hyp code restores the host debug registers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250617133718.4014181-2-mark.rutland@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: selftests: Close the GIC FD in arch_timer_edge_casesZenghui Yu
Close the GIC FD to free the reference it holds to the VM so that we can correctly clean up the VM. This also gets rid of the "KVM: debugfs: duplicate directory 395722-4" warning when running arch_timer_edge_cases. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Miguel Luis <miguel.luis@oracle.com> Reviewed-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20250608095402.1131-1-yuzenghui@huawei.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: Explicitly treat routing entry type changes as changesSean Christopherson
Explicitly treat type differences as GSI routing changes, as comparing MSI data between two entries could get a false negative, e.g. if userspace changed the type but left the type-specific data as- Note, the same bug was fixed in x86 by commit bcda70c56f3e ("KVM: x86: Explicitly treat routing entry type changes as changes"). Fixes: 4bf3693d36af ("KVM: arm64: Unmap vLPIs affected by changes to GSI routing information") Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20250611224604.313496-3-seanjc@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19KVM: arm64: nv: Fix tracking of shadow list registersMarc Zyngier
Wei-Lin reports that the tracking of shadow list registers is majorly broken when resync'ing the L2 state after a run, as we confuse the guest's LR index with the host's, potentially losing the interrupt state. While this could be fixed by adding yet another side index to track it (Wei-Lin's fix), it may be better to refactor this code to avoid having a side index altogether, limiting the risk to introduce this class of bugs. A key observation is that the shadow index is always the number of bits in the lr_map bitmap. With that, the parallel indexing scheme can be completely dropped. While doing this, introduce a couple of helpers that abstract the index conversion and some of the LR repainting, making the whole exercise much simpler. Reported-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw> Reviewed-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250614145721.2504524-1-r09922117@csie.ntu.edu.tw Link: https://lore.kernel.org/r/86qzzkc5xa.wl-maz@kernel.org
2025-06-15Linux 6.16-rc2v6.16-rc2Linus Torvalds
2025-06-15Merge tag 'kbuild-fixes-v6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Move warnings about linux/export.h from W=1 to W=2 - Fix structure type overrides in gendwarfksyms * tag 'kbuild-fixes-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: gendwarfksyms: Fix structure type overrides kbuild: move warnings about linux/export.h from W=1 to W=2
2025-06-16gendwarfksyms: Fix structure type overridesSami Tolvanen
As we always iterate through the entire die_map when expanding type strings, recursively processing referenced types in type_expand_child() is not actually necessary. Furthermore, the type_string kABI rule added in commit c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings") can fail to override type strings for structures due to a missing kabi_get_type_string() check in this function. Fix the issue by dropping the unnecessary recursion and moving the override check to type_expand(). Note that symbol versions are otherwise unchanged with this patch. Fixes: c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings") Reported-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-16kbuild: move warnings about linux/export.h from W=1 to W=2Masahiro Yamada
This hides excessive warnings, as nobody builds with W=2. Fixes: a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") Fixes: 7d95680d64ac ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com>
2025-06-14Merge tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - SMB3.1.1 POSIX extensions fix for char remapping - Fix for repeated directory listings when directory leases enabled - deferred close handle reuse fix * tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: improve directory cache reuse for readdir operations smb: client: fix perf regression with deferred closes smb: client: disable path remapping with POSIX extensions
2025-06-14Merge tag 'iommu-fixes-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fix from Joerg Roedel: - Fix PTE size calculation for NVidia Tegra * tag 'iommu-fixes-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/tegra: Fix incorrect size calculation
2025-06-14Merge tag 'block-6.16-20250614' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for a deadlock on queue freeze with zoned writes - Fix for zoned append emulation - Two bio folio fixes, for sparsemem and for very large folios - Fix for a performance regression introduced in 6.13 when plug insertion was changed - Fix for NVMe passthrough handling for polled IO - Document the ublk auto registration feature - loop lockdep warning fix * tag 'block-6.16-20250614' of git://git.kernel.dk/linux: nvme: always punt polled uring_cmd end_io work to task_work Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists block: Fix bvec_set_folio() for very large folios bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP block: use plug request list tail for one-shot backmerge attempt block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_work block: Clear BIO_EMULATES_ZONE_APPEND flag on BIO completion ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG) loop: move lo_set_size() out of queue freeze
2025-06-14Merge tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix for a race between SQPOLL exit and fdinfo reading. It's slim and I was only able to reproduce this with an artificial delay in the kernel. Followup sparse fix as well to unify the access to ->thread. - Fix for multiple buffer peeking, avoiding truncation if possible. - Run local task_work for IOPOLL reaping when the ring is exiting. This currently isn't done due to an assumption that polled IO will never need task_work, but a fix on the block side is going to change that. * tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linux: io_uring: run local task_work from ring exit IOPOLL reaping io_uring/kbuf: don't truncate end buffer for multiple buffer peeks io_uring: consistently use rcu semantics with sqpoll thread io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
2025-06-14Merge tag 'rust-fixes-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust fix from Miguel Ojeda: - 'hrtimer': fix future compile error when the 'impl_has_hr_timer!' macro starts to get called * tag 'rust-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: rust: time: Fix compile error in impl_has_hr_timer macro
2025-06-14Merge tag 'mm-hotfixes-stable-2025-06-13-21-56' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "9 hotfixes. 3 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. Only 4 are for MM" * tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: add mmap_prepare() compatibility layer for nested file systems init: fix build warnings about export.h MAINTAINERS: add Barry as a THP reviewer drivers/rapidio/rio_cm.c: prevent possible heap overwrite mm: close theoretical race where stale TLB entries could linger mm/vma: reset VMA iterator on commit_merge() OOM failure docs: proc: update VmFlags documentation in smaps scatterlist: fix extraneous '@'-sign kernel-doc notation selftests/mm: skip failed memfd setups in gup_longterm
2025-06-13Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "All fixes for drivers. The core change in the error handler is simply to translate an ALUA specific sense code into a retry the ALUA components can handle and won't impact any other devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: error: alua: I/O errors for ALUA state transitions scsi: storvsc: Increase the timeouts to storvsc_timeout scsi: s390: zfcp: Ensure synchronous unit_add scsi: iscsi: Fix incorrect error path labels for flashnode operations scsi: mvsas: Fix typos in per-phy comments and SAS cmd port registers scsi: core: ufs: Fix a hang in the error handler
2025-06-13Merge tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Quiet week, only two pull requests came my way, xe has a couple of fixes and then a bunch of fixes across the board, vc4 probably fixes the biggest problem: vc4: - Fix infinite EPROBE_DEFER loop in vc4 probing amdxdna: - Fix amdxdna firmware size meson: - modesetting fixes sitronix: - Kconfig fix for st7171-i2c dma-buf: - Fix -EBUSY WARN_ON_ONCE in dma-buf udmabuf: - Use dma_sync_sgtable_for_cpu in udmabuf xe: - Fix regression disallowing 64K SVM migration - Use a bounce buffer for WA BB" * tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel: drm/xe/lrc: Use a temporary buffer for WA BB udmabuf: use sgtable-based scatterlist wrappers dma-buf: fix compare in WARN_ON_ONCE drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERS drm/meson: fix more rounding issues with 59.94Hz modes drm/meson: use vclk_freq instead of pixel_freq in debug print drm/meson: fix debug log statement when setting the HDMI clocks drm/vc4: fix infinite EPROBE_DEFER loop drm/xe/svm: Fix regression disallowing 64K SVM migration accel/amdxdna: Fix incorrect PSP firmware size
2025-06-13io_uring: run local task_work from ring exit IOPOLL reapingJens Axboe
In preparation for needing to shift NVMe passthrough to always use task_work for polled IO completions, ensure that those are suitably run at exit time. See commit: 9ce6c9875f3e ("nvme: always punt polled uring_cmd end_io work to task_work") for details on why that is necessary. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13nvme: always punt polled uring_cmd end_io work to task_workJens Axboe
Currently NVMe uring_cmd completions will complete locally, if they are polled. This is done because those completions are always invoked from task context. And while that is true, there's no guarantee that it's invoked under the right ring context, or even task. If someone does NVMe passthrough via multiple threads and with a limited number of poll queues, then ringA may find completions from ringB. For that case, completing the request may not be sound. Always just punt the passthrough completions via task_work, which will redirect the completion, if needed. Cc: stable@vger.kernel.org Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13Merge tag 'acpi-6.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix an ACPI APEI error injection driver failure that started to occur after switching it over to using a faux device, address an EC driver issue related to invalid ECDT tables, clean up the usage of mwait_idle_with_hints() in the ACPI PAD driver, add a new IRQ override quirk, and fix a NULL pointer dereference related to nosmp: - Update the faux device handling code in the driver core and address an ACPI APEI error injection driver failure that started to occur after switching it over to using a faux device on top of that (Dan Williams) - Update data types of variables passed as arguments to mwait_idle_with_hints() in the ACPI PAD (processor aggregator device) driver to match the function definition after recent changes (Uros Bizjak) - Fix a NULL pointer dereference in the ACPI CPPC library that occurs when nosmp is passed to the kernel in the command line (Yunhui Cui) - Ignore ECDT tables with an invalid ID string to prevent using an incorrect GPE for signaling events on some systems (Armin Wolf) - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan)" * tag 'acpi-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: Use IRQ override on MACHENIKE 16P ACPI: EC: Ignore ECDT tables with an invalid ID string ACPI: CPPC: Fix NULL pointer dereference when nosmp is used ACPI: PAD: Update arguments of mwait_idle_with_hints() ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failure driver core: faux: Quiet probe failures driver core: faux: Suppress bind attributes
2025-06-13Merge tag 'pm-6.16-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the cpupower utility installation, fix up the recently added Rust abstractions for cpufreq and OPP, restore the x86 update eliminating mwait_play_dead_cpuid_hint() that has been reverted during the 6.16 merge window along with preventing the failure caused by it from happening, and clean up mwait_idle_with_hints() usage in intel_idle: - Implement CpuId Rust abstraction and use it to fix doctest failure related to the recently introduced cpumask abstraction (Viresh Kumar) - Do minor cleanups in the `# Safety` sections for cpufreq abstractions added recently (Viresh Kumar) - Unbreak cpupower systemd service units installation on some systems by adding a unitdir variable for specifying the location to install them (Francesco Poli) - Eliminate mwait_play_dead_cpuid_hint() again after reverting its elimination during the 6.16 merge window due to a problem with handling "dead" SMT siblings, but this time prevent leaving them in C1 after initialization by taking them online and back offline when a proper cpuidle driver for the platform has been registered (Rafael Wysocki) - Update data types of variables passed as arguments to mwait_idle_with_hints() to match the function definition after recent changes (Uros Bizjak)" * tag 'pm-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: rust: cpu: Add CpuId::current() to retrieve current CPU ID rust: Use CpuId in place of raw CPU numbers rust: cpu: Introduce CpuId abstraction intel_idle: Update arguments of mwait_idle_with_hints() cpufreq: Convert `/// SAFETY` lines to `# Safety` sections cpupower: split unitdir from libdir in Makefile Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" ACPI: processor: Rescan "dead" SMT siblings during initialization intel_idle: Rescan "dead" SMT siblings during initialization x86/smp: PM/hibernate: Split arch_resume_nosmt() intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13Merge branches 'acpi-pad', 'acpi-cppc', 'acpi-ec' and 'acpi-resource'Rafael J. Wysocki
Merge assorted ACPI updates for 6.16-rc2: - Update data types of variables passed as arguments to mwait_idle_with_hints() in the ACPI PAD (processor aggregator device) driver to match the function definition after recent changes (Uros Bizjak). - Fix a NULL pointer dereference in the ACPI CPPC library that occurs when nosmp is passed to the kernel in the command line (Yunhui Cui). - Ignore ECDT tables with an invalid ID string to prevent using an incorrect GPE for signaling events on some systems (Armin Wolf). - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan). * acpi-pad: ACPI: PAD: Update arguments of mwait_idle_with_hints() * acpi-cppc: ACPI: CPPC: Fix NULL pointer dereference when nosmp is used * acpi-ec: ACPI: EC: Ignore ECDT tables with an invalid ID string * acpi-resource: ACPI: resource: Use IRQ override on MACHENIKE 16P
2025-06-13Merge branch 'pm-cpuidle'Rafael J. Wysocki
Merge cpuidle updates for 6.16-rc2: - Update data types of variables passed as arguments to mwait_idle_with_hints() to match the function definition after recent changes (Uros Bizjak). - Eliminate mwait_play_dead_cpuid_hint() again after reverting its elimination during the merge window due to a problem with handling "dead" SMT siblings, but this time prevent leaving them in C1 after initialization by taking them online and back offline when a proper cpuidle driver for the platform has been registered (Rafael Wysocki). * pm-cpuidle: intel_idle: Update arguments of mwait_idle_with_hints() Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" ACPI: processor: Rescan "dead" SMT siblings during initialization intel_idle: Rescan "dead" SMT siblings during initialization x86/smp: PM/hibernate: Split arch_resume_nosmt() intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13Merge branch 'pm-tools'Rafael J. Wysocki
Merge a cpupower utility fix for 6.16-rc2 that unbreaks systemd service units installation on some sysems (Francesco Poli). * pm-tools: cpupower: split unitdir from libdir in Makefile
2025-06-13Merge tag 'spi-fix-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A collection of driver specific fixes, most minor apart from the OMAP ones which disable some recent performance optimisations in some non-standard cases where we could start driving the bus incorrectly. The change to the stm32-ospi driver to use the newer reset APIs is a fix for interactions with other IP sharing the same reset line in some SoCs" * tag 'spi-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine spi: stm32-ospi: clean up on error in probe() spi: stm32-ospi: Make usage of reset_control_acquire/release() API spi: offload: check offload ops existence before disabling the trigger spi: spi-pci1xxxx: Fix error code in probe spi: loongson: Fix build warnings about export.h spi: omap2-mcspi: Disable multi-mode when the previous message kept CS asserted spi: omap2-mcspi: Disable multi mode when CS should be kept asserted after message
2025-06-13Merge tag 'regulator-fix-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "One minor fix for a leak in the DT parsing code in the max20086 driver" * tag 'regulator-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: max20086: Fix refcount leak in max20086_parse_regulators_dt()
2025-06-13posix-cpu-timers: fix race between handle_posix_cpu_timers() and ↵Oleg Nesterov
posix_cpu_timer_del() If an exiting non-autoreaping task has already passed exit_notify() and calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent or debugger right after unlock_task_sighand(). If a concurrent posix_cpu_timer_del() runs at that moment, it won't be able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or lock_task_sighand() will fail. Add the tsk->exit_state check into run_posix_cpu_timers() to fix this. This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because exit_task_work() is called before exit_notify(). But the check still makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail anyway in this case. Cc: stable@vger.kernel.org Reported-by: Benoît Sevens <bsevens@google.com> Fixes: 0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-13Merge tag 'trace-v6.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: - Do not free "head" variable in filter_free_subsystem_filters() The first error path jumps to "free_now" label but first frees the newly allocated "head" variable. But the "free_now" code checks this variable, and if it is not NULL, it will iterate the list. As this list variable was already initialized, the "free_now" code will not do anything as it is empty. But freeing it will cause a UAF bug. The error path should simply jump to the "free_now" label and leave the "head" variable alone. * tag 'trace-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Do not free "head" on error path of filter_free_subsystem_filters()
2025-06-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Rework of system register accessors for system registers that are directly writen to memory, so that sanitisation of the in-memory value happens at the correct time (after the read, or before the write). For convenience, RMW-style accessors are also provided. - Multiple fixes for the so-called "arch-timer-edge-cases' selftest, which was always broken. x86: - Make KVM_PRE_FAULT_MEMORY stricter for TDX, allowing userspace to pass only the "untouched" addresses and flipping the shared/private bit in the implementation. - Disable SEV-SNP support on initialization failure * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: Reject direct bits in gpa passed to KVM_PRE_FAULT_MEMORY KVM: x86/mmu: Embed direct bits into gpa for KVM_PRE_FAULT_MEMORY KVM: SEV: Disable SEV-SNP support on initialization failure KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases KVM: arm64: selftests: Fix help text for arch_timer_edge_cases KVM: arm64: Make __vcpu_sys_reg() a pure rvalue operand KVM: arm64: Don't use __vcpu_sys_reg() to get the address of a sysreg KVM: arm64: Add RMW specific sysreg accessor KVM: arm64: Add assignment-specific sysreg accessor
2025-06-13io_uring/kbuf: don't truncate end buffer for multiple buffer peeksJens Axboe
If peeking a bunch of buffers, normally io_ring_buffers_peek() will truncate the end buffer. This isn't optimal as presumably more data will be arriving later, and hence it's better to stop with the last full buffer rather than truncate the end buffer. Cc: stable@vger.kernel.org Fixes: 35c8711c8fc4 ("io_uring/kbuf: add helpers for getting/peeking multiple buffers") Reported-by: Christian Mazakas <christian.mazakas@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13Merge tag 'v6.16-p4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a broken self-test in hkdf (new regression)" * tag 'v6.16-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: hkdf - move to late_initcall
2025-06-13Merge tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "As usual, highlighting the ones users have been noticing: - Fix a small issue with has_case_insensitive not being propagated on snapshot creation; this led to fsck errors, which we're harmless because we're not using this flag yet (it's for overlayfs + casefolding). - Log the error being corrected in the journal when we're doing fsck repair: this was one of the "lessons learned" from the i_nlink 0 -> subvolume deletion bug, where reconstructing what had happened by analyzing the journal was a bit more difficult than it needed to be. - Don't schedule btree node scan to run in the superblock: this fixes a regression from the 6.16 recovery passes rework, and let to it running unnecessarily. The real issue here is that we don't have online, "self healing" style topology repair yet: topology repair currently has to run before we go RW, which means that we may schedule it unnecessarily after a transient error. This will be fixed in the future. - We now track, in btree node flags, the reason it was scheduled to be rewritten. We discovered a deadlock in recovery when many btree nodes need to be rewritten because they're degraded: fully fixing this will take some work but it's now easier to see what's going on. For the bug report where this came up, a device had been kicked RO due to transient errors: manually setting it back to RW was sufficient to allow recovery to succeed. - Mark a few more fsck errors as autofix: as a reminder to users, please do keep reporting cases where something needs to be repaired and is not repaired automatically (i.e. cases where -o fix_errors or fsck -y is required). - rcu_pending.c now works with PREEMPT_RT - 'bcachefs device add', then umount, then remount wasn't working - we now emit a uevent so that the new device's new superblock is correctly picked up - Assorted repair fixes: btree node scan will no longer incorrectly update sb->version_min, - Assorted syzbot fixes" * tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefs: (23 commits) bcachefs: Don't trace should_be_locked unless changing bcachefs: Ensure that snapshot creation propagates has_case_insensitive bcachefs: Print devices we're mounting on multi device filesystems bcachefs: Don't trust sb->nr_devices in members_to_text() bcachefs: Fix version checks in validate_bset() bcachefs: ioctl: avoid stack overflow warning bcachefs: Don't pass trans to fsck_err() in gc_accounting_done bcachefs: Fix leak in bch2_fs_recovery() error path bcachefs: Fix rcu_pending for PREEMPT_RT bcachefs: Fix downgrade_table_extra() bcachefs: Don't put rhashtable on stack bcachefs: Make sure opts.read_only gets propagated back to VFS bcachefs: Fix possible console lock involved deadlock bcachefs: mark more errors autofix bcachefs: Don't persistently run scan_for_btree_nodes bcachefs: Read error message now prints if self healing bcachefs: Only run 'increase_depth' for keys from btree node csan bcachefs: Mark need_discard_freespace_key_bad autofix bcachefs: Update /dev/disk/by-uuid on device add bcachefs: Add more flags to btree nodes for rewrite reason ...
2025-06-13Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublistsBagas Sanjaya
Stephen Rothwell reports htmldocs warning on ublk docs: Documentation/block/ublk.rst:414: ERROR: Unexpected indentation. [docutils] Fix the warning by separating sublists of auto buffer registration fallback behavior from their appropriate parent list item. Fixes: ff20c516485e ("ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250612132638.193de386@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20250613023857.15971-1-bagasdotme@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13iommu/tegra: Fix incorrect size calculationJason Gunthorpe
This driver uses a mixture of ways to get the size of a PTE, tegra_smmu_set_pde() did it as sizeof(*pd) which became wrong when pd switched to a struct tegra_pd. Switch pd back to a u32* in tegra_smmu_set_pde() so the sizeof(*pd) returns 4. Fixes: 50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/62e7f7fe-6200-4e4f-ad42-d58ad272baa6@tecnico.ulisboa.pt/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-13block: Fix bvec_set_folio() for very large foliosMatthew Wilcox (Oracle)
Similarly to 26064d3e2b4d ("block: fix adding folio to bio"), if we attempt to add a folio that is larger than 4GB, we'll silently truncate the offset and len. Widen the parameters to size_t, assert that the length is less than 4GB and set the first page that contains the interesting data rather than the first page of the folio. Fixes: 26db5ee15851 (block: add a bvec_set_folio helper) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144255.2850278-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAPMatthew Wilcox (Oracle)
It is possible for physically contiguous folios to have discontiguous struct pages if SPARSEMEM is enabled and SPARSEMEM_VMEMMAP is not. This is correctly handled by folio_page_idx(), so remove this open-coded implementation. Fixes: 640d1930bef4 (block: Add bio_for_each_folio_all()) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144126.2849931-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engineThangaraj Samynathan
Removes MSI-X from the interrupt request path, as the DMA engine used by the SPI controller does not support MSI-X interrupts. Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com> Link: https://patch.msgid.link/20250612023059.71726-1-thangaraj.s@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-13Merge tag 'drm-misc-fixes-2025-06-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.16-rc2: - Fix infinite EPROBE_DEFER loop in vc4 probing. - Fix amdxdna firmware size. - mode fixes for meson. - Kconfig fix for st7171-i2c. - Fix -EBUSY WARN_ON_ONCE in dma-buf - Use dma_sync_sgtable_for_cpu in udmabuf. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/62c06195-8bc1-4dae-8777-e86d94e4d9d9@linux.intel.com
2025-06-12mm: add mmap_prepare() compatibility layer for nested file systemsLorenzo Stoakes
Nested file systems, that is those which invoke call_mmap() within their own f_op->mmap() handlers, may encounter underlying file systems which provide the f_op->mmap_prepare() hook introduced by commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). We have a chicken-and-egg scenario here - until all file systems are converted to using .mmap_prepare(), we cannot convert these nested handlers, as we can't call f_op->mmap from an .mmap_prepare() hook. So we have to do it the other way round - invoke the .mmap_prepare() hook from an .mmap() one. in order to do so, we need to convert VMA state into a struct vm_area_desc descriptor, invoking the underlying file system's f_op->mmap_prepare() callback passing a pointer to this, and then setting VMA state accordingly and safely. This patch achieves this via the compat_vma_mmap_prepare() function, which we invoke from call_mmap() if f_op->mmap_prepare() is specified in the passed in file pointer. We place the fundamental logic into mm/vma.h where VMA manipulation belongs. We also update the VMA userland tests to accommodate the changes. The compat_vma_mmap_prepare() function and its associated machinery is temporary, and will be removed once the conversion of file systems is complete. We carefully place this code so it can be used with CONFIG_MMU and also with cutting edge nommu silicon. [akpm@linux-foundation.org: export compat_vma_mmap_prepare tp fix build] [lorenzo.stoakes@oracle.com: remove unused declarations] Link: https://lkml.kernel.org/r/ac3ae324-4c65-432a-8c6d-2af988b18ac8@lucifer.local Link: https://lkml.kernel.org/r/20250609165749.344976-1-lorenzo.stoakes@oracle.com Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/ Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-13Merge tag 'drm-xe-fixes-2025-06-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix regression disallowing 64K SVM migration (Maarten) - Use a bounce buffer for WA BB (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/aEsBQoh5Si3ouPgE@fedora
2025-06-12Merge tag 'bitmap-for-6.16-rc2' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap fix from Yury Norov: "Fix for __GENMASK() and __GENMASK_ULL() in UAPI" * tag 'bitmap-for-6.16-rc2' of https://github.com/norov/linux: uapi: bitops: use UAPI-safe variant of BITS_PER_LONG again