git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message	Author
2025-03-04	usb: dwc3: Set SUSPENDENABLE soon after phy init	Thinh Nguyen
	After phy initialization, some phy operations can only be executed while in lower P states. Ensure GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are set soon after initialization to avoid blocking phy ops. Previously the SUSPENDENABLE bits were only set after the controller initialization, which may not happen right away if there's no gadget driver or xhci driver bound. Revise this to clear SUSPENDENABLE bits only when there's mode switching (change in GCTL.PRTCAPDIR). Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Cc: stable <stable@kernel.org> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/633aef0afee7d56d2316f7cc3e1b2a6d518a8cc9.1738280911.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-03	KVM: arm64: nv: Fail KVM init if asking for NV without GICv3	Marc Zyngier
	Although there is nothing in NV that is fundamentally incompatible with the lack of GICv3, there is no HW implementation without one, at least on the virtual side (yes, even fruits have some form of vGICv3). We therefore make the decision to require GICv3, which will only affect models such as QEMU. Booting with a GICv2 or something even more exotic while asking for NV will result in KVM being disabled. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-17-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Allow userland to set VGIC maintenance IRQ	Andre Przywara
	The VGIC maintenance IRQ signals various conditions about the LRs, when the GIC's virtualization extension is used. So far we didn't need it, but nested virtualization needs to know about this interrupt, so add a userland interface to set up the IRQ number. The architecture mandates that it must be a PPI. On top of that, this code only exports a per-device option, so the PPI is the same on all VCPUs. Signed-off-by: Andre Przywara <andre.przywara@arm.com> [added some bits of documentation] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-16-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Fold GICv3 host trapping requirements into guest setup	Marc Zyngier
	Popular HW that is able to use NV also has a broken vgic implementation that requires trapping. On such HW, propagate the host trap bits into the guest's shadow ICH_HCR_EL2 register, making sure we don't allow an L2 guest to bring the system down. This involves a bit of tweaking so that the emulation code correctly picks up the shadow state as needed, and to only partially sync ICH_HCR_EL2 back with the guest state to capture EOIcount. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-15-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Propagate used_lrs between L1 and L0 contexts	Marc Zyngier
	We have so far made sure that L1 and L0 vgic contexts were totally independent. There is however one spot of bother with this approach, and that's in the GICv3 emulation code required by our fruity friends. The issue is that the emulation code needs to know how many LRs are in flight. And while it is easy to reach the L0 version through the vcpu pointer, doing so for the L1 is much more complicated, as these structures are private to the nested code. We could simply expose that structure and pick one or the other depending on the context, but this seems extra complexity for not much benefit. Instead, just propagate the number of used LRs from the nested code into the L0 context, and be done with it. Should this become a burden, it can be easily rectified. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-14-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Request vPE doorbell upon nested ERET to L2	Oliver Upton
	Running an L2 guest with GICv4 enabled goes absolutely nowhere, and gets into a vicious cycle of nested ERET followed by nested exception entry into the L1. When KVM does a put on a runnable vCPU, it marks the vPE as nonresident but does not request a doorbell IRQ. Behind the scenes in the ITS driver's view of the vCPU, its_vpe::pending_last gets set to true to indicate that context is still runnable. This comes to a head when doing the nested ERET into L2. The vPE doesn't get scheduled on the redistributor as it is exclusively part of the L1's VGIC context. kvm_vgic_vcpu_pending_irq() returns true because the vPE appears runnable, and KVM does a nested exception entry into the L1 before L2 ever gets off the ground. This issue can be papered over by requesting a doorbell IRQ when descheduling a vPE as part of a nested ERET. KVM needs this anyway to kick the vCPU out of the L2 when an IRQ becomes pending for the L1. Link: https://lore.kernel.org/r/20240823212703.3576061-4-oliver.upton@linux.dev Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-13-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Respect virtual HCR_EL2.TWx setting	Jintack Lim
	Forward exceptions due to WFI or WFE instructions to the virtual EL2 if they are not coming from the virtual EL2 and virtual HCR_EL2.TWx is set. Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-12-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
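The forwarding rule in the commit above can be sketched as a small predicate. This is only an illustration: `forward_wfx_to_vel2()` and its parameters are hypothetical names, not KVM's actual helpers. The only values taken from the architecture are the TWI and TWE bit positions in HCR_EL2.

```c
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2 trap bits at their architectural positions. */
#define HCR_TWI (UINT64_C(1) << 13) /* trap WFI */
#define HCR_TWE (UINT64_C(1) << 14) /* trap WFE */

/*
 * Hypothetical helper: should a trapped WFI/WFE be forwarded to the
 * guest hypervisor (virtual EL2)?
 *
 * from_vel2: the instruction was executed by virtual EL2 itself
 * is_wfe:    true for WFE, false for WFI
 * vhcr_el2:  the guest hypervisor's view of HCR_EL2
 */
static bool forward_wfx_to_vel2(bool from_vel2, bool is_wfe, uint64_t vhcr_el2)
{
	/* Virtual EL2 never traps its own WFI/WFE to itself. */
	if (from_vel2)
		return false;

	/* Forward only if the matching virtual trap bit is set. */
	return (vhcr_el2 & (is_wfe ? HCR_TWE : HCR_TWI)) != 0;
}
```

Keeping the decision in one side-effect-free predicate makes the two conditions from the commit message (origin of the instruction, virtual TWx bit) easy to check in isolation.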
2025-03-03	KVM: arm64: nv: Add Maintenance Interrupt emulation	Marc Zyngier
	Emulating the vGIC means emulating the dreaded Maintenance Interrupt. This is a two-pronged problem: - while running L2, getting an MI translates into an MI injected in the L1 based on the state of the HW. - while running L1, we must accurately reflect the state of the MI line, based on the in-memory state. The MI INTID is added to the distributor, as expected on any virtualisation-capable implementation, and further patches will allow its configuration. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-11-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Handle L2->L1 transition on interrupt injection	Marc Zyngier
	An interrupt being delivered to L1 while running L2 must result in the correct exception being delivered to L1. This means that if, on entry to L2, we found ourselves with pending interrupts in the L1 distributor, we need to take immediate action. This is done by posting a request which will prevent the entry in L2, and deliver an IRQ exception to L1, forcing the switch. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-10-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Nested GICv3 emulation	Marc Zyngier
	When entering a nested VM, we set up the hypervisor control interface based on what the guest hypervisor has set. In particular, we check each list register written by the guest hypervisor to see whether the HW bit is set. If so, we translate the HW IRQ number from the guest's point of view to the real hardware IRQ number, if there is a mapping. Co-developed-by: Jintack Lim <jintack@cs.columbia.edu> Signed-off-by: Jintack Lim <jintack@cs.columbia.edu> [Christoffer: Redesigned execution flow around vcpu load/put] Co-developed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> [maz: Rewritten to support GICv3 instead of GICv2, NV2 support] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-9-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Sanitise ICH_HCR_EL2 accesses	Marc Zyngier
	As ICH_HCR_EL2 is a VNCR accessor when running NV, add some sanitising to what gets written. Crucially, mark TDIR as RES0 if the HW doesn't support it (unlikely, but hey...), as well as anything GICv4 related, since we only expose a GICv3 to the guest. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-8-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Plumb handling of GICv3 EL2 accesses	Marc Zyngier
	Wire the handling of all GICv3 EL2 registers, and provide emulation for all the non memory-backed registers (ICC_SRE_EL2, ICH_VTR_EL2, ICH_MISR_EL2, ICH_ELRSR_EL2, and ICH_EISR_EL2). Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Add ICH_*_EL2 registers to vcpu_sysreg	Marc Zyngier
	FEAT_NV2 comes with a bunch of register-to-memory redirection involving the ICH_*_EL2 registers (LRs, APRs, VMCR, HCR). Add them to the vcpu_sysreg enumeration. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-6-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	KVM: arm64: nv: Load timer before the GIC	Marc Zyngier
	In order for vgic_v3_load_nested to be able to observe which timer interrupts have the HW bit set for the current context, the timers must have been loaded in the new mode and the right timer mapped to their corresponding HW IRQs. At the moment, we load the GIC first, meaning that timer interrupts injected to an L2 guest will never have the HW bit set (we see the old configuration). Swapping the two loads solves this particular problem. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-5-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	arm64: sysreg: Add layout for ICH_MISR_EL2	Marc Zyngier
	The ICH_MISR_EL2-related macros are missing a number of status bits that we are about to handle. Take this opportunity to fully describe the layout of that register as part of the automatic generation infrastructure. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	arm64: sysreg: Add layout for ICH_VTR_EL2	Marc Zyngier
	The ICH_VTR_EL2-related macros are missing a number of config bits that we are about to handle. Take this opportunity to fully describe the layout of that register as part of the automatic generation infrastructure. This results in a bit of churn to repaint constants that are now generated with a different format. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	arm64: sysreg: Add layout for ICH_HCR_EL2	Marc Zyngier
	The ICH_HCR_EL2-related macros are missing a number of control bits that we are about to handle. Take this opportunity to fully describe the layout of that register as part of the automatic generation infrastructure. This results in a bit of churn, unfortunately. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250225172930.1850838-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03	sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()	Andrea Righi
	If a BPF scheduler provides an invalid CPU (outside the nr_cpu_ids range) as prev_cpu to scx_bpf_select_cpu_dfl() it can cause a kernel crash. To prevent this, validate prev_cpu in scx_bpf_select_cpu_dfl() and trigger an scx error if an invalid CPU is specified. Fixes: f0e1a0643a59b ("sched_ext: Implement BPF extensible scheduler class") Cc: stable@vger.kernel.org # v6.12+ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
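A minimal sketch of the range check described above, written as plain userspace C. The names `cpu_id_valid()` and `select_cpu_checked()` are hypothetical, and returning `-EINVAL` merely stands in for the scx error the real code triggers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* A CPU id is usable only if it lies in [0, nr_cpu_ids). */
static bool cpu_id_valid(int32_t cpu, uint32_t nr_cpu_ids)
{
	return cpu >= 0 && (uint32_t)cpu < nr_cpu_ids;
}

/*
 * Hypothetical wrapper: reject a bad prev_cpu before it is used to
 * index any per-CPU data. The real kernel helper does far more work
 * after the check; this sketch just hands prev_cpu back.
 */
static int32_t select_cpu_checked(int32_t prev_cpu, uint32_t nr_cpu_ids)
{
	if (!cpu_id_valid(prev_cpu, nr_cpu_ids))
		return -EINVAL;

	return prev_cpu;
}
```

Checking for a negative value before the unsigned comparison keeps the intent explicit, since a negative id cast to unsigned would otherwise rely on wrapping to be rejected.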
2025-03-03	Merge tag 'affs-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux	Linus Torvalds
	Pull affs fixes from David Sterba: "Two fixes from Simon Tatham. They're real bugfixes for problems with OFS floppy disks created on linux and then read in the emulated Workbench environment" * tag 'affs-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: affs: don't write overlarge OFS data block size fields affs: generate OFS sequence numbers starting at 1
2025-03-03	Merge tag 'xfs-fixes-6.14-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds
	Pull xfs cleanups from Carlos Maiolino: "Just a few cleanups" * tag 'xfs-fixes-6.14-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove the XBF_STALE check from xfs_buf_rele_cached xfs: remove most in-flight buffer accounting xfs: decouple buffer readahead from the normal buffer read path xfs: reduce context switches for synchronous buffered I/O
2025-03-03	Merge tag 'probes-fixes-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace	Linus Torvalds
	Pull probe events fixes from Masami Hiramatsu: - probe-events: Remove unused MAX_ARG_BUF_LEN macro - it is not used - fprobe-events: Log error for exceeding the number of entry args. Since the max number of entry args is limited, it should be checked and rejected when the parser detects it. - tprobe-events: Reject invalid tracepoint name If a user specifies an invalid tracepoint name (e.g. including '/') then the new event is not defined correctly in the eventfs. - tprobe-events: Fix a memory leak when tprobe defined with $retval There is a memory leak if tprobe is defined with $retval. * tag 'probes-fixes-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: probe-events: Remove unused MAX_ARG_BUF_LEN macro tracing: fprobe-events: Log error for exceeding the number of entry args tracing: tprobe-events: Reject invalid tracepoint name tracing: tprobe-events: Fix a memory leak when tprobe with $retval
2025-03-03	KVM: VMX: Extract checks on entry/exit control pairs to a helper macro	Sean Christopherson
	Extract the checking of entry/exit pairs to a helper macro so that the code can be reused to process the upcoming "secondary" exit controls (the primary exit controls field is out of bits). Use a macro instead of a function to support different sized variables (all secondary exit controls will be optional and so the MSR doesn't have the fixed-0/fixed-1 split). Taking the largest size as input is trivial, but handling the modification of KVM's to-be-used controls is much trickier, e.g. would require bitmap games to clear bits from a 32-bit bitmap vs. a 64-bit bitmap. Opportunistically add sanity checks to ensure the size of the controls match (yay, macro!), e.g. to detect bugs where KVM passes in the pairs for primary exit controls, but its variable for the secondary exit controls. To help users triage mismatches, print the control bits that are checked, not just the actual value. For the foreseeable future, that provides enough information for a user to determine which fields mismatched. E.g. until secondary entry controls comes along, all entry bits and thus all error messages are guaranteed to be unique. To avoid returning from a macro, which can get quite dangerous, simply process all pairs even if error_on_inconsistent_vmcs_config is set. The speed at which KVM rejects module load is not at all interesting. Keep the error message a "once" printk, even though it would be nice to print out all mismatching pairs. In practice, the most likely scenario is that a single pair will mismatch on all CPUs. Printing all mismatches generates redundant messages in that situation, and can be extremely noisy on systems with large numbers of CPUs. If a CPU has multiple mismatches, not printing every bad pair is the least of the user's concerns. Cc: Xin Li (Intel) <xin@zytor.com> Link: https://lore.kernel.org/r/20250227005353.3216123-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: selftests: Fix printf() format goof in SEV smoke test	Sean Christopherson
	Print out the index of mismatching XSAVE bytes using unsigned decimal format. Some versions of clang complain about trying to print an integer as an unsigned char. x86/sev_smoke_test.c:55:51: error: format specifies type 'unsigned char' but the argument has type 'int' [-Werror,-Wformat] Fixes: 8c53183dbaa2 ("selftests: kvm: add test for transferring FPU state into VMSA") Link: https://lore.kernel.org/r/20250228233852.3855676-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage	Sean Christopherson
	During the initial mprotect(RO) stage of mmu_stress_test, keep vCPUs spinning until all vCPUs have hit -EFAULT, i.e. until all vCPUs have tried to write to a read-only page. If a vCPU manages to complete an entire iteration of the loop without hitting a read-only page, and the vCPU observes mprotect_ro_done before starting a second iteration, then the vCPU will prematurely fall through to GUEST_SYNC(3) (on x86 and arm64) and get out of sequence. Replace the "do-while (!r)" loop around the associated _vcpu_run() with a single invocation, as barring a KVM bug, the vCPU is guaranteed to hit -EFAULT, and retrying on success is super confusing, hides KVM bugs, and complicates this fix. The do-while loop was semi-unintentionally added specifically to fudge around a KVM x86 bug, and said bug is unhittable without modifying the test to force x86 down the !(x86||arm64) path. On x86, if forced emulation is enabled, vcpu_arch_put_guest() may trigger emulation of the store to memory. Due to a (very, very) longstanding bug in KVM x86's emulator, emulated writes to guest memory that fail during __kvm_write_guest_page() unconditionally return KVM_EXIT_MMIO. While that is desirable in the !memslot case, it's wrong in this case as the failure happens due to __copy_to_user() hitting a read-only page, not an emulated MMIO region. But as above, x86 only uses vcpu_arch_put_guest() if the __x86_64__ guards are clobbered to force x86 down the common path, and of course the unexpected MMIO is a KVM bug, i.e. should cause a test failure. Fixes: b6c304aec648 ("KVM: selftests: Verify KVM correctly handles mprotect(PROT_READ)") Reported-by: Yan Zhao <yan.y.zhao@intel.com> Closes: https://lore.kernel.org/all/20250208105318.16861-1-yan.y.zhao@intel.com Debugged-by: Yan Zhao <yan.y.zhao@intel.com> Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20250228230804.3845860-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure	Sean Christopherson
	When processing an SNP AP Creation event, invalidate the "next" VMSA GPA even if acquiring the page/pfn for the new VMSA fails. In practice, the next GPA will never be used regardless of whether or not it's invalidated, as the entire flow is guarded by snp_ap_waiting_for_reset, and said guard and snp_vmsa_gpa are always written as a pair. But that's really hard to see in the code. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates	Sean Christopherson
	Use guard(mutex) in sev_snp_init_protected_guest_state() and pull in its lock-protected inner helper. Without an unlock trampoline (and even with one), there is no real need for an inner helper. Eliminating the helper also avoids having to fix up the open-coded "lockdep" WARN_ON(). Opportunistically drop the error message if KVM can't obtain the pfn for the new target VMSA. The error message provides zero information that can't be gleaned from the fact that the vCPU is stuck. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa	Sean Christopherson
	Mark the VMCB dirty, i.e. zero control.clean, prior to handling the new VMSA. Nothing in the VALID_PAGE() case touches control.clean, and isolating the VALID_PAGE() code will allow simplifying the overall logic. Note, the VMCB probably doesn't need to be marked dirty when the VMSA is invalid, as KVM will disallow running the vCPU in such a state. But it also doesn't hurt anything. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling	Sean Christopherson
	Use guard(mutex) in sev_snp_ap_creation() and modify the error paths to return directly instead of jumping to a common exit point. No functional change intended. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
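The effect of a scope-based lock guard on error paths can be sketched in userspace C. This is an approximation under stated assumptions: `MUTEX_GUARD`, `mutex_guard_release()`, `state_lock`, and `update_state()` are hypothetical names, and a pthread mutex stands in for the kernel mutex. Like the kernel's guard(), the sketch relies on the GCC/Clang `cleanup` variable attribute.

```c
#include <pthread.h>

/* Runs automatically when the guard variable goes out of scope. */
static void mutex_guard_release(pthread_mutex_t **m)
{
	if (*m)
		pthread_mutex_unlock(*m);
}

/*
 * Hypothetical stand-in for guard(mutex)(&lock): take the lock now and
 * release it on every exit from the enclosing scope. One per scope.
 */
#define MUTEX_GUARD(lock)                                               \
	pthread_mutex_t *guard_                                         \
		__attribute__((cleanup(mutex_guard_release))) =         \
			(pthread_mutex_lock(lock), (lock))

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int state;

/* Every return below drops the lock; no "goto out" label is needed. */
static int update_state(int request)
{
	MUTEX_GUARD(&state_lock);

	if (request < 0)
		return -1;	/* error path: lock released here */
	if (request == 0)
		return 0;	/* nothing to do: lock released here */

	state = request;
	return 1;		/* success: lock released here */
}
```

Because the unlock is tied to scope exit, adding a new early return cannot leak the lock, which is what lets the error paths return directly.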
2025-03-03	KVM: SVM: Simplify request+kick logic in SNP AP Creation handling	Sean Christopherson
	Drop the local "kick" variable and the unnecessary "fallthrough" logic from sev_snp_ap_creation(), and simply pivot on the request when deciding whether or not to immediately force a state update on the target vCPU. No functional change intended. Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view	Sean Christopherson
	When handling an "AP Create" event, return an error if the "requested" SEV features for the vCPU don't exactly match KVM's view of the VM-scoped features. There is no known use case for heterogeneous SEV features across vCPUs, and while KVM can't actually enforce an exact match since the value in RAX isn't guaranteed to match what the guest shoved into the VMSA, KVM can at least avoid knowingly letting the guest run in an unsupported state. E.g. if a VM is created with DebugSwap disabled, KVM will intercept #DBs and DRs for all vCPUs, even if an AP is "created" with DebugSwap enabled in its VMSA. Note, the GHCB spec only "requires" that "AP use the same interrupt injection mechanism as the BSP", but given the disaster that is DebugSwap and SEV_FEATURES in general, it's safe to say that AMD didn't consider all possible complications with mismatching features between the BSP and APs. Opportunistically fold the check into the relevant request flavors; the "request < AP_DESTROY" check is just a bizarre way of implementing the AP_CREATE_ON_INIT => AP_CREATE fallthrough. Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event") Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error	Sean Christopherson
	If KVM rejects an AP Creation event, leave the target vCPU state as-is. Nothing in the GHCB suggests the hypervisor is allowed to muck with vCPU state on failure, let alone required to do so. Furthermore, kicking only in the !ON_INIT case leads to divergent behavior, and even the "kick" case is non-deterministic. E.g. if an ON_INIT request fails, the guest can successfully retry if the fixed AP Creation request is made prior to sending INIT. And if a !ON_INIT fails, the guest can successfully retry if the fixed AP Creation request is handled before the target vCPU processes KVM's KVM_REQ_UPDATE_PROTECTED_GUEST_STATE. Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event") Cc: stable@vger.kernel.org Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Refuse to attempt VMRUN if an SEV-ES+ guest has an invalid VMSA	Sean Christopherson
	Explicitly reject KVM_RUN with KVM_EXIT_FAIL_ENTRY if userspace "coerces" KVM into running an SEV-ES+ guest with an invalid VMSA, e.g. by modifying a vCPU's mp_state to be RUNNABLE after an SNP vCPU has undergone a Destroy event. On Destroy or failed Create, KVM marks the vCPU HALTED so that KVM doesn't run the vCPU, but nothing prevents a misbehaving VMM from manually making the vCPU RUNNABLE via KVM_SET_MP_STATE. Attempting VMRUN with an invalid VMSA should be harmless, but knowingly executing VMRUN with bad control state is at best dodgy. Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event") Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3	Sean Christopherson
	Never rely on the CPU to restore/load host DR0..DR3 values, even if the CPU supports DebugSwap, as there are no guarantees that SNP guests will actually enable DebugSwap on APs. E.g. if KVM were to rely on the CPU to load DR0..DR3 and skipped them during hw_breakpoint_restore(), KVM would run with clobbered-to-zero DRs if an SNP guest created APs without DebugSwap enabled. Update the comment to explain the dangers, and hopefully prevent breaking KVM in the future. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	KVM: SVM: Save host DR masks on CPUs with DebugSwap	Sean Christopherson
	When running SEV-SNP guests on a CPU that supports DebugSwap, always save the host's DR0..DR3 mask MSR values irrespective of whether or not DebugSwap is enabled, to ensure the host values aren't clobbered by the CPU. And for now, also save DR0..DR3, even though doing so isn't necessary (see below). SVM_VMGEXIT_AP_CREATE is deeply flawed in that it allows the guest to create a VMSA with guest-controlled SEV_FEATURES. A well behaved guest can inform the hypervisor, i.e. KVM, of its "requested" features, but on CPUs without ALLOWED_SEV_FEATURES support, nothing prevents the guest from lying about which SEV features are being enabled (or not!). If a misbehaving guest enables DebugSwap in a secondary vCPU's VMSA, the CPU will load the DR0..DR3 mask MSRs on #VMEXIT, i.e. will clobber the MSRs with '0' if KVM doesn't save its desired value. Note, DR0..DR3 themselves are "ok", as DR7 is reset on #VMEXIT, and KVM restores all DRs in common x86 code as needed via hw_breakpoint_restore(). I.e. there is no risk of host DR0..DR3 being clobbered (when it matters). However, there is a flaw in the opposite direction; because the guest can lie about enabling DebugSwap, i.e. can disable DebugSwap without KVM's knowledge, KVM must not rely on the CPU to restore DRs. Defer fixing that wart, as it's more of a documentation issue than a bug in the code. Note, KVM added support for DebugSwap on commit d1f85fbe836e ("KVM: SEV: Enable data breakpoints in SEV-ES"), but that is not an appropriate Fixes, as the underlying flaw exists in hardware, not in KVM. I.e. all kernels that support SEV-SNP need to be patched, not just kernels with KVM's full support for DebugSwap (ignoring that DebugSwap support landed first). Opportunistically fix an incorrect statement in the comment; on CPUs without DebugSwap, the CPU does NOT save or load debug registers, i.e. 
Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event") Cc: stable@vger.kernel.org Cc: Naveen N Rao <naveen@kernel.org> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Alexey Kardashevskiy <aik@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250227012541.3234589-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-03-03	drm/sched: Fix preprocessor guard	Philipp Stanner
	When writing the header guard for gpu_scheduler_trace.h, a typo, apparently, occurred. Fix the typo and document the scope of the guard. Fixes: 353da3c520b4 ("drm/amdgpu: add tracepoint for scheduler (v2)") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250218124149.118002-2-phasta@kernel.org
2025-03-03	hwmon: fix a NULL vs IS_ERR_OR_NULL() check in xgene_hwmon_probe()	Xinghuo Chen
	The devm_memremap() function returns error pointers on error; it doesn't return NULL. Fixes: c7cefce03e69 ("hwmon: (xgene) access mailbox as RAM") Signed-off-by: Xinghuo Chen <xinghuo.chen@foxmail.com> Link: https://lore.kernel.org/r/tencent_9AD8E7683EC29CAC97496B44F3F865BA070A@qq.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
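To show why a NULL test misses an error pointer, here is a userspace sketch of the error-pointer convention. The lowercase helpers `err_ptr()`, `ptr_err()`, and `is_err()` mimic the kernel's ERR_PTR(), PTR_ERR(), and IS_ERR(); `map_region()` and `probe()` are hypothetical stand-ins, not the xgene driver code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static char backing[16];

/* Hypothetical mapping call that, like devm_memremap(), never returns NULL. */
static void *map_region(bool fail)
{
	return fail ? err_ptr(-ENOMEM) : (void *)backing;
}

static int probe(bool fail)
{
	void *p = map_region(fail);

	/* An "if (!p)" test would let the error pointer through. */
	if (is_err(p))
		return (int)ptr_err(p);

	return 0;
}
```

An error pointer is a non-NULL value, so only a range check like `is_err()` can tell it apart from a valid mapping.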
2025-03-03	llc: do not use skb_get() before dev_queue_xmit()	Eric Dumazet
	syzbot is able to crash hosts [1], using llc and devices not supporting IFF_TX_SKB_SHARING. In this case, e1000 driver calls eth_skb_pad(), while the skb is shared. Simply replace skb_get() by skb_clone() in net/llc/llc_s_ac.c Note that e1000 driver might have an issue with pktgen, because it does not clear IFF_TX_SKB_SHARING, this is an orthogonal change. We need to audit other skb_get() uses in net/llc. [1] kernel BUG at net/core/skbuff.c:2178 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 UID: 0 PID: 16371 Comm: syz.2.2764 Not tainted 6.14.0-rc4-syzkaller-00052-gac9c34d1e45a #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:pskb_expand_head+0x6ce/0x1240 net/core/skbuff.c:2178 Call Trace: <TASK> __skb_pad+0x18a/0x610 net/core/skbuff.c:2466 __skb_put_padto include/linux/skbuff.h:3843 [inline] skb_put_padto include/linux/skbuff.h:3862 [inline] eth_skb_pad include/linux/etherdevice.h:656 [inline] e1000_xmit_frame+0x2d99/0x5800 drivers/net/ethernet/intel/e1000/e1000_main.c:3128 __netdev_start_xmit include/linux/netdevice.h:5151 [inline] netdev_start_xmit include/linux/netdevice.h:5160 [inline] xmit_one net/core/dev.c:3806 [inline] dev_hard_start_xmit+0x9a/0x7b0 net/core/dev.c:3822 sch_direct_xmit+0x1ae/0xc30 net/sched/sch_generic.c:343 __dev_xmit_skb net/core/dev.c:4045 [inline] __dev_queue_xmit+0x13d4/0x43e0 net/core/dev.c:4621 dev_queue_xmit include/linux/netdevice.h:3313 [inline] llc_sap_action_send_test_c+0x268/0x320 net/llc/llc_s_ac.c:144 llc_exec_sap_trans_actions net/llc/llc_sap.c:153 [inline] llc_sap_next_state net/llc/llc_sap.c:182 [inline] llc_sap_state_process+0x239/0x510 net/llc/llc_sap.c:209 llc_ui_sendmsg+0xd0d/0x14e0 net/llc/af_llc.c:993 sock_sendmsg_nosec net/socket.c:718 [inline] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+da65c993ae113742a25f@syzkaller.appspotmail.com Closes: 
https://lore.kernel.org/netdev/67c020c0.050a0220.222324.0011.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
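The difference between taking another reference and cloning can be sketched with a toy packet type. Everything here is hypothetical (`struct pkt`, `pkt_get()`, `pkt_clone()`, `pkt_pad()`), and the model is deliberately simplified: the toy clone copies the payload, whereas the real skb_clone() creates a new header that shares the data buffer. The property it illustrates is only that a header with more than one user must not be modified in place.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct pkt {
	int users;		/* references to this header */
	size_t len;		/* bytes of payload in use */
	unsigned char data[64];
};

static struct pkt *pkt_alloc(size_t len)
{
	struct pkt *p;

	if (len > sizeof(p->data))
		return NULL;
	p = calloc(1, sizeof(*p));
	if (p) {
		p->users = 1;
		p->len = len;
	}
	return p;
}

/* Like skb_get(): same object, one more user. */
static struct pkt *pkt_get(struct pkt *p)
{
	p->users++;
	return p;
}

/* Toy clone: a private header with a single user. */
static struct pkt *pkt_clone(const struct pkt *p)
{
	struct pkt *c = malloc(sizeof(*c));

	if (c) {
		*c = *p;
		c->users = 1;
	}
	return c;
}

static void pkt_put(struct pkt *p)
{
	if (--p->users == 0)
		free(p);
}

/* Zero-pad up to min_len; refuse to touch a shared packet. */
static bool pkt_pad(struct pkt *p, size_t min_len)
{
	if (p->users > 1 || min_len > sizeof(p->data))
		return false;
	if (p->len < min_len) {
		memset(p->data + p->len, 0, min_len - p->len);
		p->len = min_len;
	}
	return true;
}
```

In this model the driver-side padding step fails on a packet obtained via `pkt_get()` and succeeds on one obtained via `pkt_clone()`, which mirrors why the fix hands the transmit path its own header.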
2025-03-03	ASoC: tegra: Fix ADX S24_LE audio format	Thorsten Blum
	Commit 4204eccc7b2a ("ASoC: tegra: Add support for S24_LE audio format") added support for the S24_LE audio format, but duplicated S16_LE in OUT_DAI() for ADX instead. Fix this by adding support for the S24_LE audio format. Compile-tested only. Cc: stable@vger.kernel.org Fixes: 4204eccc7b2a ("ASoC: tegra: Add support for S24_LE audio format") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20250222225700.539673-2-thorsten.blum@linux.dev Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-03	ASoC: codecs: wsa884x: report temps to hwmon in millidegree of Celsius	Alexey Klimov
	Temperatures are reported in units of Celsius however hwmon expects values to be in millidegree of Celsius. Userspace tools observe values close to zero and report it as "Not available" or incorrect values like 0C or 1C. Add a simple conversion to fix that. Before the change: wsa884x-virtual-0 Adapter: Virtual device temp1: +0.0°C -- wsa884x-virtual-0 Adapter: Virtual device temp1: +0.0°C Also reported as N/A before first amplifier power on. After this change and initial wsa884x power on: wsa884x-virtual-0 Adapter: Virtual device temp1: +39.0°C -- wsa884x-virtual-0 Adapter: Virtual device temp1: +37.0°C Tested on sm8550 only. Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20250221044024.1207921-1-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
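The unit conversion described above can be sketched as follows. This is a minimal illustration, not the driver's code: the helper name `example_celsius_to_millicelsius` is hypothetical, and the only fact taken from the commit message is that hwmon expects millidegrees Celsius while the value was reported in whole degrees.

```c
/* Hypothetical helper: scale a whole-degree Celsius reading to the
 * millidegree Celsius unit that hwmon expects. */
static long example_celsius_to_millicelsius(long temp_c)
{
	return temp_c * 1000;
}
```

With this scaling, a reading of 39 degrees is reported as 39000, which userspace tools then display as +39.0°C rather than a value close to zero.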
2025-03-03	ASoC: Intel: sof_sdw: Fix unlikely uninitialized variable use in create_sdw_dailinks()	Peter Ujfalusi
	Initialize current_be_id to 0 to handle the unlikely case when there are no devices connected to a DAI. In this case create_sdw_dailink() would return without touching the passed pointer to current_be_id. Found by gcc -fanalyzer Fixes: 59bf457d8055 ("ASoC: intel: sof_sdw: Factor out SoundWire DAI creation") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20250303065552.78328-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-03	netfilter: nft_ct: Use __refcount_inc() for per-CPU nft_ct_pcpu_template.	Sebastian Andrzej Siewior
	nft_ct_pcpu_template is a per-CPU variable and relies on disabled BH for its locking. The refcounter is read and if its value is set to one then the refcounter is incremented and variable is used - otherwise it is already in use and left untouched. Without per-CPU locking in local_bh_disable() on PREEMPT_RT the read-then-increment operation is not atomic and therefore racy. This can be avoided by using unconditionally __refcount_inc() which will increment counter and return the old value as an atomic operation. In case the returned counter is not one, the variable is in use and we need to decrement counter. Otherwise we can use it. Use __refcount_inc() instead of read and a conditional increment. Fixes: edee4f1e9245 ("netfilter: nft_ct: add zone id set support") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
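The increment-then-inspect pattern described above can be modelled in userspace with C11 atomics. This is only a sketch under stated assumptions: `struct example_tmpl` and `example_tmpl_try_get` are hypothetical names, and `atomic_fetch_add()` stands in for the kernel's `__refcount_inc()`; it is not the nft_ct code itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU template; refs == 1 means idle. */
struct example_tmpl {
	atomic_int refs;
};

/* Increment unconditionally and look at the old value in one atomic step.
 * A separate "read, then increment if one" sequence could be interleaved
 * with another user between the two operations. */
static bool example_tmpl_try_get(struct example_tmpl *t)
{
	int old = atomic_fetch_add(&t->refs, 1);

	if (old == 1)
		return true;	/* the template was idle, the caller now owns it */

	/* Already in use: undo the increment and leave it untouched. */
	atomic_fetch_sub(&t->refs, 1);
	return false;
}
```

A caller that gets `false` back would fall back to allocating its own object instead of using the shared template.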
2025-03-03	ALSA: usx2y: validate nrpacks module parameter on probe	Murad Masimov
	The module parameter defines number of iso packets per one URB. User is allowed to set any value to the parameter of type int, which can lead to various kinds of weird and incorrect behavior like integer overflows, truncations, etc. Number of packets should be a small non-negative number. Since this parameter is read-only, its value can be validated on driver probe. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Link: https://patch.msgid.link/20250303100413.835-1-m.masimov@mt-integration.ru Signed-off-by: Takashi Iwai <tiwai@suse.de>
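A probe-time range check of the kind described above might look like the sketch below. The bound `EXAMPLE_NRPACKS_MAX` and the function name are assumptions made for illustration; the commit message only states that the value should be a small non-negative number.

```c
#include <errno.h>

/* Hypothetical upper bound; the real driver's limit is not given here. */
#define EXAMPLE_NRPACKS_MAX 1024

/* Reject values outside the supported range before they can feed into
 * size calculations that might overflow or truncate. */
static int example_validate_nrpacks(int nrpacks)
{
	if (nrpacks < 0 || nrpacks > EXAMPLE_NRPACKS_MAX)
		return -EINVAL;
	return 0;
}
```

Because the parameter is read-only after load, checking it once at probe is enough; later code can rely on the value being in range.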
2025-03-03	platform/x86/intel/vsec: Add Diamond Rapids support	David E. Box
	Add PCI ID for the Diamond Rapids Platforms Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20250226214728.1256747-1-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-03	platform/x86: thinkpad_acpi: Add battery quirk for ThinkPad X131e	Mingcong Bai
	Based on the dmesg messages from the original reporter: [ 4.964073] ACPI: \_SB_.PCI0.LPCB.EC__.HKEY: BCTG evaluated but flagged as error [ 4.964083] thinkpad_acpi: Error probing battery 2 Lenovo ThinkPad X131e also needs this battery quirk. Reported-by: Fan Yang <804284660@qq.com> Tested-by: Fan Yang <804284660@qq.com> Co-developed-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250221164825.77315-1-jeffbai@aosc.io Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-03	platform/x86: intel-hid: fix volume buttons on Microsoft Surface Go 4 tablet	Dmitry Panchenko
	Volume buttons on Microsoft Surface Go 4 tablet didn't send any events. Add Surface Go 4 DMI match to button_array_table to fix this. Signed-off-by: Dmitry Panchenko <dmitry@d-systems.ee> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20250220154016.3620917-1-dmitry@d-systems.ee Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-03-03	drm/imagination: Fix timestamps in firmware traces	Alessio Belle
	When firmware traces are enabled, the firmware dumps 48-bit timestamps for each trace as two 32-bit values, highest 32 bits (of which only 16 useful) first. The driver was reassembling them the other way round i.e. interpreting the first value in memory as the lowest 32 bits, and the second value as the highest 32 bits (then truncated to 16 bits). Due to this, firmware trace dumps showed very large timestamps even for traces recorded shortly after GPU boot. The timestamps in these dumps would also sometimes jump backwards because of the truncation. Example trace dumped after loading the powervr module and enabling firmware traces, where each line is commented with the timestamp value in hexadecimal to better show both issues: [93540092739584] : Host Sync Partition marker: 1 // 0x551300000000 [28419798597632] : GPU units deinit // 0x19d900000000 [28548647616512] : GPU deinit // 0x19f700000000 Update logic to reassemble the timestamps halves in the correct order. Fixes: cb56cd610866 ("drm/imagination: Add firmware trace to debugfs") Signed-off-by: Alessio Belle <alessio.belle@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221-fix-fw-trace-timestamps-v1-1-dba4aeb030ca@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>
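The two reassembly orders can be compared in a short sketch. The function names are hypothetical and this is not the driver's code; it only follows the layout described above, where the first 32-bit word in memory holds the upper half (16 useful bits) and the second holds the lower 32 bits.

```c
#include <stdint.h>

/* Correct order: first word is the high part (16 useful bits),
 * second word is the low 32 bits. */
static uint64_t example_ts_correct(uint32_t first, uint32_t second)
{
	return ((uint64_t)(first & 0xFFFF) << 32) | second;
}

/* Old behaviour: words swapped, so the low word is truncated to
 * 16 bits and shifted into the high half. */
static uint64_t example_ts_swapped(uint32_t first, uint32_t second)
{
	return ((uint64_t)(second & 0xFFFF) << 32) | first;
}
```

As an illustration, assume the firmware wrote a high word of 0 and a low word of 0x5513 (the real low word's upper 16 bits cannot be recovered from the trace because of the truncation). The swapped order yields 0x551300000000, which is the 93540092739584 shown in the first trace line, while the correct order yields 0x5513.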
2025-03-03	drm/imagination: only init job done fences once	Brendan King
	Ensure job done fences are only initialised once. This fixes a memory manager not clean warning from drm_mm_takedown on module unload. Cc: stable@vger.kernel.org Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-init-done-fences-once-v2-1-c1b2f556b329@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>
2025-03-03	spi: microchip-core: prevent RX overflows when transmit size > FIFO size	Conor Dooley
	When the size of a transfer exceeds the size of the FIFO (32 bytes), RX overflows will be generated and receive data will be corrupted and warnings will be produced. For example, here's an error generated by a transfer of 36 bytes: spi_master spi0: mchp_corespi_interrupt: RX OVERFLOW: rxlen: 4, txlen: 0 The driver is currently split between handling receiving in the interrupt handler, and sending outside of it. Move all handling out of the interrupt handler, and explicitly link the number of bytes read out of the RX FIFO to the number written into the TX one. This both resolves the overflow problems and simplifies the flow of the driver. CC: stable@vger.kernel.org Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20250303-veal-snooper-712c1dfad336@wendy Signed-off-by: Mark Brown <broonie@kernel.org>
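The idea of tying RX reads to TX writes can be sketched with a small userspace model. Everything here is an assumption for illustration: the FIFO is simulated as a loopback buffer, and `example_transfer` is a hypothetical name, not a function from the microchip-core driver. Only the 32-byte FIFO depth comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_FIFO_DEPTH 32	/* FIFO size stated in the commit message */

/* Simulated RX FIFO; in this model every byte written to TX shows up here. */
struct example_fifo {
	uint8_t data[EXAMPLE_FIFO_DEPTH];
	size_t count;
};

/* Move len bytes in chunks no larger than the FIFO, draining exactly as
 * many RX bytes as were just written to TX before writing more.
 * Returns the number of chunks used, or -1 if the RX FIFO would overflow. */
static int example_transfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
	struct example_fifo fifo = { .count = 0 };
	int chunks = 0;

	while (len) {
		size_t n = len < EXAMPLE_FIFO_DEPTH ? len : EXAMPLE_FIFO_DEPTH;

		if (fifo.count + n > EXAMPLE_FIFO_DEPTH)
			return -1;	/* would overflow the RX FIFO */
		memcpy(fifo.data + fifo.count, tx, n);
		fifo.count += n;

		/* Read back the same n bytes so the FIFO is empty again. */
		memcpy(rx, fifo.data, n);
		fifo.count -= n;

		tx += n;
		rx += n;
		len -= n;
		chunks++;
	}
	return chunks;
}
```

In this model a 36-byte transfer is handled as one 32-byte chunk followed by one 4-byte chunk, so the RX side never holds more than the FIFO depth.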
2025-03-03	drm/imagination: Hold drm_gem_gpuva lock for unmap	Brendan King
	Avoid a warning from drm_gem_gpuva_assert_lock_held in drm_gpuva_unlink. The Imagination driver uses the GEM object reservation lock to protect the gpuva list, but the GEM object was not always known in the code paths that ended up calling drm_gpuva_unlink. When the GEM object isn't known, it is found by calling drm_gpuva_find to lookup the object associated with a given virtual address range, or by calling drm_gpuva_find_first when removing all mappings. Cc: stable@vger.kernel.org Fixes: 4bc736f890ce ("drm/imagination: vm: make use of GPUVM's drm_exec helper") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-hold-drm_gem_gpuva-lock-for-unmap-v2-1-3fdacded227f@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>
2025-03-03	drm/imagination: avoid deadlock on fence release	Brendan King
	Do scheduler queue fence release processing on a workqueue, rather than in the release function itself. Fixes deadlock issues such as the following: [ 607.400437] ============================================ [ 607.405755] WARNING: possible recursive locking detected [ 607.415500] -------------------------------------------- [ 607.420817] weston:zfq0/24149 is trying to acquire lock: [ 607.426131] ffff000017d041a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: pvr_gem_object_vunmap+0x40/0xc0 [powervr] [ 607.436728] but task is already holding lock: [ 607.442554] ffff000017d105a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_ioctl+0x250/0x554 [ 607.451727] other info that might help us debug this: [ 607.458245] Possible unsafe locking scenario: [ 607.464155] CPU0 [ 607.466601] ---- [ 607.469044] lock(reservation_ww_class_mutex); [ 607.473584] lock(reservation_ww_class_mutex); [ 607.478114] * DEADLOCK * Cc: stable@vger.kernel.org Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-fence-release-deadlock-v2-1-6fed2fc1fe88@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>