linux.git - Linus' kernel tree

Age	Commit message	Author
2025-04-24	iommu/amd: Return an error if vCPU affinity is set for non-vCPU IRTE	Sean Christopherson
	Return -EINVAL instead of success if amd_ir_set_vcpu_affinity() is invoked without use_vapic; lying to KVM about whether or not the IRTE was configured to post IRQs is all kinds of bad. Fixes: d98de49a53e4 ("iommu/amd: Enable vAPIC interrupt remapping mode by default") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250404193923.1413163-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-04-24	KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer	Sean Christopherson
	Take irqfds.lock when adding/deleting an IRQ bypass producer to ensure irqfd->producer isn't modified while kvm_irq_routing_update() is running. The only lock held when a producer is added/removed is irqbypass's mutex. Fixes: 872768800652 ("KVM: x86: select IRQ_BYPASS_MANAGER") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250404193923.1413163-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-04-24	KVM: x86: Explicitly treat routing entry type changes as changes	Sean Christopherson
	Explicitly treat type differences as GSI routing changes, as comparing MSI data between two entries could get a false negative, e.g. if userspace changed the type but left the type-specific data as-is. Fixes: 515a0c79e796 ("kvm: irqfd: avoid update unmodified entries of the routing") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250404193923.1413163-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
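The false negative described above can be modeled in a few lines. This is a minimal sketch, not the KVM code: `RoutingEntry`, its fields, and both comparison helpers are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RoutingEntry:
    """Hypothetical stand-in for a GSI routing entry (not the kernel struct)."""
    type: str    # e.g. "msi" or "irqchip"
    data: bytes  # type-specific payload


def changed_data_only(old: RoutingEntry, new: RoutingEntry) -> bool:
    """Compare only the payload; a pure type change goes unnoticed."""
    return old.data != new.data


def changed_type_aware(old: RoutingEntry, new: RoutingEntry) -> bool:
    """Treat any type difference as a change before looking at the payload."""
    return old.type != new.type or old.data != new.data


# Userspace changes the type but leaves the type-specific data as-is.
old = RoutingEntry("msi", b"\x01\x02\x03\x04")
new = RoutingEntry("irqchip", b"\x01\x02\x03\x04")

print(changed_data_only(old, new))   # False: the change is missed
print(changed_type_aware(old, new))  # True: the change is detected
```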
2025-04-24	KVM: x86: Reset IRTE to host control if new route isn't postable	Sean Christopherson
	Restore an IRTE back to host control (remapped or posted MSI mode) if the new GSI route prevents posting the IRQ directly to a vCPU, regardless of the GSI routing type. Updating the IRTE if and only if the new GSI is an MSI results in KVM leaving an IRTE posting to a vCPU. The dangling IRTE can result in interrupts being incorrectly delivered to the guest, and in the worst case scenario can result in use-after-free, e.g. if the VM is torn down, but the underlying host IRQ isn't freed. Fixes: efc644048ecd ("KVM: x86: Update IRTE for posted-interrupts") Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250404193923.1413163-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-04-24	KVM: SVM: Allocate IR data using atomic allocation	Sean Christopherson
	Allocate SVM's interrupt remapping metadata using GFP_ATOMIC as svm_ir_list_add() is called with IRQs disabled and irqfds.lock held when kvm_irq_routing_update() reacts to GSI routing changes. Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250404193923.1413163-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-04-24	KVM: SVM: Don't update IRTEs if APICv/AVIC is disabled	Sean Christopherson
	Skip IRTE updates if AVIC is disabled/unsupported, as forcing the IRTE into remapped mode (kvm_vcpu_apicv_active() will never be true) is unnecessary and wasteful. The IOMMU driver is responsible for putting IRTEs into remapped mode when an IRQ is allocated by a device, long before that device is assigned to a VM. I.e. the kernel as a whole has major issues if the IRTE isn't already in remapped mode. Opportunistically use kvm_arch_has_irq_bypass() to query for APICv/AVIC, so that all checks in KVM x86 incorporate the same information. Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250401161804.842968-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-04-24	KVM: arm64, x86: make kvm_arch_has_irq_bypass() inline	Paolo Bonzini
	kvm_arch_has_irq_bypass() is a small function and even though it does not appear in any really hot paths, it's also not entirely rare. Make it inline---it also works out nicely in preparation for using it in kvm-intel.ko and kvm-amd.ko, since the function is not currently exported. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-04-24	block: don't autoload drivers on blk-cgroup configuration	Christoph Hellwig
	Loading a driver just to configure blk-cgroup doesn't make sense, as that assumes an already existing device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250423053810.1683309-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24	block: don't autoload drivers on stat	Christoph Hellwig
	blkdev_get_no_open can trigger the legacy autoload of block drivers. A simple stat of a block device has not historically done that, so disable this behavior again. Fixes: 9abcfbd235f5 ("block: Add atomic write support for statx") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250423053810.1683309-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24	block: remove the backing_inode variable in bdev_statx	Christoph Hellwig
	backing_inode is only used once, so remove it and update the comment describing the bdev lookup to be a bit more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250423053810.1683309-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24	block: move blkdev_{get,put} _no_open prototypes out of blkdev.h	Christoph Hellwig
	These are only to be used by block internal code. Remove the comment as we grew more users due to reworking block device node opening. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250423053810.1683309-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24	block: never reduce ra_pages in blk_apply_bdi_limits	Christoph Hellwig
	When the user increases the read-ahead size through sysfs, this value currently gets lost if the device is reprobed, including on a resume from suspend. As there is no hardware limitation for the read-ahead size there is no real need to reset it or track a separate hardware limitation like for max_sectors. This restores the pre-atomic queue limit behavior in the sd driver as sd did not use blk_queue_io_opt and thus never updated the read-ahead size to the value based on the optimal I/O, but changes behavior for all other drivers. As the new behavior seems useful and sd is the driver for which the read-ahead size tweaks are most useful, that seems like a worthwhile trade-off. Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API") Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20250424082521.1967286-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
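The "never reduce" policy from the commit above can be sketched as a toy model. Everything here is an assumption made for illustration and is not taken from the patch: the 4 KiB page size, the 128 KiB default read-ahead, the "twice the optimal I/O size" rule, and both function names.

```python
PAGE_SIZE = 4096                                # assumed page size in bytes
DEFAULT_RA_PAGES = (128 * 1024) // PAGE_SIZE    # assumed default read-ahead: 128 KiB


def ra_pages_resetting(current_ra_pages: int, io_opt_bytes: int) -> int:
    """Model of a policy that recomputes the value and ignores user tuning."""
    return max(io_opt_bytes * 2 // PAGE_SIZE, DEFAULT_RA_PAGES)


def ra_pages_never_reduce(current_ra_pages: int, io_opt_bytes: int) -> int:
    """Model of a policy that may grow the value but never shrinks it."""
    return max(current_ra_pages, io_opt_bytes * 2 // PAGE_SIZE, DEFAULT_RA_PAGES)


user_tuned = 1024  # pages, as if raised through sysfs before a reprobe

print(ra_pages_resetting(user_tuned, 0))     # 32: the tuned value is lost
print(ra_pages_never_reduce(user_tuned, 0))  # 1024: the tuned value survives
```

In this model a larger optimal I/O size can still raise the value: with `io_opt_bytes` of 8 MiB, `ra_pages_never_reduce(1024, 8 * 1024 * 1024)` returns 4096.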
2025-04-24	selftests: ublk: common: fix _get_disk_dev_t for pre-9.0 coreutils	Uday Shankar
	Some distributions, such as centos stream 9, still have a version of coreutils which does not yet support the %Hr and %Lr formats for stat(1) [1, 2]. Running ublk selftests on these distributions results in the following error in tests that use the _get_disk_dev_t helper: line 23: ?r: syntax error: operand expected (error token is "?r") To better accommodate older distributions, rewrite _get_disk_dev_t to use the much older %t and %T formats for stat instead. [1] https://github.com/coreutils/coreutils/blob/v9.0/NEWS#L114 [2] https://pkgs.org/download/coreutils Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250423-ublk_selftests-v1-2-7d060e260e76@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
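As a rough illustration of the older formats: `stat -c '%t'` and `stat -c '%T'` print the major and minor device numbers in hexadecimal, so a caller has to convert them before doing arithmetic. The sketch below only shows that conversion; the function names are hypothetical, and how the selftest helper packs the two numbers into a dev_t is not shown in this log, so it is not reproduced here.

```python
def parse_stat_hex(major_hex: str, minor_hex: str) -> tuple[int, int]:
    """Convert the hex strings printed by stat's %t and %T into integers."""
    return int(major_hex, 16), int(minor_hex, 16)


def format_dev(major_hex: str, minor_hex: str) -> str:
    """Render the pair in the decimal major:minor form used under /sys/dev/block."""
    major, minor = parse_stat_hex(major_hex, minor_hex)
    return f"{major}:{minor}"


print(format_dev("103", "0"))  # 259:0
print(format_dev("fb", "10"))  # 251:16
```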
2025-04-24	io_uring: don't duplicate flushing in io_req_post_cqe	Pavel Begunkov
	io_req_post_cqe() sets submit_state.cq_flush so that *flush_completions() can take care of batch committing CQEs. Don't commit it twice by using __io_cq_unlock_post(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/41c416660c509cee676b6cad96081274bcb459f3.1745493861.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-24	Merge tag 'nvme-6.15-2025-04-24' of git://git.infradead.org/nvme into block-6.15	Jens Axboe
	Pull NVMe fix from Christoph: "nvme fixes for Linux 6.15 - fix an out-of-bounds access in nvmet_enable_port (Richard Weinberger)" * tag 'nvme-6.15-2025-04-24' of git://git.infradead.org/nvme: nvmet: fix out-of-bounds access in nvmet_enable_port
2025-04-24	spi: spi-qpic-snand: propagate errors from qcom_spi_block_erase()	Gabor Juhos
	The qcom_spi_block_erase() function returns with error in case of failure. Change the qcom_spi_send_cmdaddr() function to propagate these errors to the callers instead of returning with success. Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface") Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://patch.msgid.link/20250423-qpic-snand-propagate-error-v1-1-4b26ed45fdb5@gmail.com Reviewed-by: Md Sadre Alam <quic_mdalam@quicinc.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-24	ASoC: renesas: rz-ssi: Use NOIRQ_SYSTEM_SLEEP_PM_OPS()	Claudiu Beznea
	In the latest kernel versions system crashes were noticed occasionally during suspend/resume. This occurs because the RZ SSI suspend trigger (called from snd_soc_suspend()) is executed after rz_ssi_pm_ops->suspend() and it accesses IP registers. After the rz_ssi_pm_ops->suspend() is executed the IP clocks are disabled and its reset line is asserted. Since snd_soc_suspend() is invoked through snd_soc_pm_ops->suspend(), snd_soc_pm_ops is associated with soc_driver (defined in sound/soc/soc-core.c), and there is no parent-child relationship between soc_driver and rz_ssi_driver the power management subsystem does not enforce a specific suspend/resume order between the RZ SSI platform driver and soc_driver. To ensure that the suspend/resume function of rz-ssi is executed after snd_soc_suspend(), use NOIRQ_SYSTEM_SLEEP_PM_OPS(). Fixes: 1fc778f7c833 ("ASoC: renesas: rz-ssi: Add suspend to RAM support") Cc: stable@vger.kernel.org Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://patch.msgid.link/20250410141525.4126502-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-24	Merge drm/drm-next into drm-xe-next	Thomas Hellström
	Backmerge to bring in linux 6.15-rc. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-04-24	drm/imagination: Add reset controller support for GPU initialization	Michal Wilczynski
	All IMG Rogue GPUs include a reset line that participates in the power-up sequence. On some SoCs (e.g., T-Head TH1520 and Banana Pi BPI-F3), this reset line is exposed and must be driven explicitly to ensure proper initialization. On others, such as the currently supported TI SoC, the reset logic is handled in hardware or firmware without exposing the line directly. In platforms where the reset line is externally accessible, if it is not driven correctly, the GPU may remain in an undefined state, leading to instability or performance issues. This commit adds a dedicated reset controller to the drm/imagination driver. By managing the reset line (where applicable) as part of normal GPU bring-up, the driver ensures reliable initialization across platforms regardless of whether the reset is controlled externally or handled internally. Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250418-apr_18_reset_img-v6-2-85a06757b698@samsung.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>
2025-04-24	dt-bindings: gpu: Add 'resets' property for GPU initialization	Michal Wilczynski
	All IMG Rogue GPUs include a reset line that participates in the power-up sequence. On some SoCs (e.g., T-Head TH1520 and Banana Pi BPI-F3), this reset line is exposed and must be driven explicitly to ensure proper initialization. To support this, add a 'resets' property to the GPU device tree bindings. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250418-apr_18_reset_img-v6-1-85a06757b698@samsung.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>
2025-04-24	drm/imagination: avoid unused-const-variable warning	Arnd Bergmann
	When CONFIG_DEBUG_FS is disabled, the stid_fmts[] array is not referenced anywhere, causing a W=1 warning with gcc: In file included from drivers/gpu/drm/imagination/pvr_fw_trace.c:7: drivers/gpu/drm/imagination/pvr_rogue_fwif_sf.h:75:39: error: 'stid_fmts' defined but not used [-Werror=unused-const-variable=] 75 | static const struct rogue_km_stid_fmt stid_fmts[] = { | ^~~~~~~~~ Rather than adding more #ifdef blocks, address this by changing the existing #ifdef into equivalent IS_ENABLED() checks so gcc can see where the symbol is used but still eliminate it from the object file. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250409122314.2848028-1-arnd@kernel.org Signed-off-by: Matt Coster <matt.coster@imgtec.com>
2025-04-24	Merge branch 'net-stmmac-fix-timestamp-snapshots-on-dwmac1000'	Paolo Abeni
	Alexis Lothore says: ==================== net: stmmac: fix timestamp snapshots on dwmac1000 this is the v2 of a small series containing two small fixes for the timestamp snapshot feature on stmmac, especially on dwmac1000 version. Those issues have been detected on a socfpga (Cyclone V) platform. They kind of follow the big rework sent by Maxime at the end of last year to properly split this feature support between different versions of the DWMAC IP. v1: https://lore.kernel.org/r/20250422-stmmac_ts-v1-0-b59c9f406041@bootlin.com Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> ==================== Link: https://patch.msgid.link/20250423-stmmac_ts-v2-0-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	net: stmmac: fix multiplication overflow when reading timestamp	Alexis Lothoré
	The current way of reading a timestamp snapshot in stmmac can lead to integer overflow, as the computation is done on 32 bits. The issue has been observed on a dwmac-socfpga platform returning chaotic timestamp values due to this overflow. The corresponding multiplication is done with a MUL instruction, which returns 32 bit values. Explicitly casting the value to 64 bits replaced the MUL with a UMLAL, which computes and returns the result on 64 bits, and so returns correctly the timestamps. Prevent this overflow by explicitly casting the intermediate value to u64 to make sure that the whole computation is made on u64. While at it, apply the same cast on the other dwmac variant (GMAC4) method for snapshot retrieval. Fixes: 477c3e1f6363 ("net: stmmac: Introduce dwmac1000 timestamping operations") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423-stmmac_ts-v2-2-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
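The overflow can be reproduced with a small model of fixed-width arithmetic. This is a sketch under stated assumptions, not the driver code: the function names are hypothetical, and the snapshot is assumed to be assembled as seconds times nanoseconds-per-second plus a nanoseconds field.

```python
NSEC_PER_SEC = 1_000_000_000
U32_MASK = 0xFFFF_FFFF
U64_MASK = 0xFFFF_FFFF_FFFF_FFFF


def snapshot_ns_32bit(sec: int, nsec: int) -> int:
    """Model of a multiplication carried out on 32 bits: the product wraps."""
    return ((sec * NSEC_PER_SEC) & U32_MASK) + nsec


def snapshot_ns_64bit(sec: int, nsec: int) -> int:
    """Model of the same computation with the operand widened to 64 bits first."""
    return ((sec * NSEC_PER_SEC) & U64_MASK) + nsec


# Up to 4 seconds the product still fits in 32 bits, so both agree.
print(snapshot_ns_32bit(4, 0), snapshot_ns_64bit(4, 0))  # 4000000000 4000000000

# From 5 seconds on, the 32-bit product wraps around.
print(snapshot_ns_32bit(5, 123))  # 705032827
print(snapshot_ns_64bit(5, 123))  # 5000000123
```

Since 2^32 is roughly 4.29 billion, any seconds value above 4 is enough to trigger the wrap in this model, which is consistent with the seemingly chaotic values the commit describes.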
2025-04-24	net: stmmac: fix dwmac1000 ptp timestamp status offset	Alexis Lothore
	When a PTP interrupt occurs, the driver accesses the wrong offset to learn about the number of available snapshots in the FIFO for dwmac1000: it should be accessing bits 29..25, while it is currently reading bits 19..16 (those are bits about the auxiliary triggers which have generated the timestamps). As a consequence, it does not correctly compute the number of available snapshots, and so possibly does not generate the corresponding clock events if the bogus value ends up being 0. Fix clock events generation by reading the correct bits in the timestamp register for dwmac1000. Fixes: 477c3e1f6363 ("net: stmmac: Introduce dwmac1000 timestamping operations") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423-stmmac_ts-v2-1-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
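The effect of reading the wrong bit range can be shown with a generic field extractor. This is an illustrative sketch: `extract_bits` and the sample register value are hypothetical, and only the two bit ranges (29..25 and 19..16) come from the commit message.

```python
def extract_bits(value: int, high: int, low: int) -> int:
    """Return the field occupying bits high..low (inclusive) of value."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


# Hypothetical status word: 3 snapshots pending in bits 29..25,
# and nothing set in the auxiliary trigger field at bits 19..16.
status = 3 << 25

print(extract_bits(status, 29, 25))  # 3: the snapshot count
print(extract_bits(status, 19, 16))  # 0: wrong field, so no events would be generated
```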
2025-04-24	net: dp83822: Fix OF_MDIO config check	Johannes Schneider
	When CONFIG_OF_MDIO is set to be a module the code block is not compiled. Use the IS_ENABLED macro that checks for both built in as well as module. Fixes: 5dc39fd5ef35 ("net: phy: DP83822: Add ability to advertise Fiber connection") Signed-off-by: Johannes Schneider <johannes.schneider@leica-geosystems.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423044724.1284492-1-johannes.schneider@leica-geosystems.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	selftests/fs/mount-notify: test also remove/flush of mntns marks	Amir Goldstein
	Regression test for FAN_MARK_MNTNS | FAN_MARK_FLUSH bug. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250418193903.2607617-3-amir73il@gmail.com
2025-04-24	fanotify: fix flush of mntns marks	Amir Goldstein
	fanotify_mark(fd, FAN_MARK_FLUSH | FAN_MARK_MNTNS, ...) incorrectly ends up causing removal of inode marks. Fixes: 0f46d81f2bce ("fanotify: notify on mount attach and detach") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250418193903.2607617-2-amir73il@gmail.com
2025-04-24	drm/panel: himax-hx8279: Always initialize goa_{even,odd}_valid in hx8279_check_goa_config()	Nathan Chancellor
	Clang warns (or errors with CONFIG_WERROR=y): drivers/gpu/drm/panel/panel-himax-hx8279.c:838:6: error: variable 'goa_even_valid' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] 838 | if (num_zero == ARRAY_SIZE(desc->goa_even_timing)) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/panel/panel-himax-hx8279.c:842:23: note: uninitialized use occurs here 842 | if (goa_odd_valid != goa_even_valid) | ^~~~~~~~~~~~~~ drivers/gpu/drm/panel/panel-himax-hx8279.c:838:2: note: remove the 'if' if its condition is always true 838 | if (num_zero == ARRAY_SIZE(desc->goa_even_timing)) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 839 | goa_even_valid = false; drivers/gpu/drm/panel/panel-himax-hx8279.c:818:36: note: initialize the variable 'goa_even_valid' to silence this warning 818 | bool goa_odd_valid, goa_even_valid; | ^ | = 0 Even though only the even valid variable gets flagged, both valid variables appear to have the same issue of possibly being used uninitialized if the if statement initializing them to false is not taken. Turn the if statement then variable assignment into a single variable assignment, which states that the configuration is valid when there are not all zeros, clearing up the warning since the variable will always be initialized. Fixes: 38d42c261389 ("drm: panel: Add driver for Himax HX8279 DDIC panels") Suggested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250423-panel-himax-hx8279-fix-sometimes-uninitialized-v2-1-fc501c6558d9@kernel.org Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2025-04-24	powerpc/boot: Fix dash warning	Madhavan Srinivasan
	Commit b2accfe7ca5b ("powerpc/boot: Check for ld-option support") suppressed linker warnings, but the expression used did not work well with POSIX shell (dash), resulting in this warning: arch/powerpc/boot/wrapper: 237: [: 0: unexpected operator ld: warning: arch/powerpc/boot/zImage.epapr has a LOAD segment with RWX permissions Fix the check to handle the reported warning. The patch also fixes a couple of shellcheck-reported errors for the same line. In arch/powerpc/boot/wrapper line 237: if [ $(${CROSS}ld -v --no-warn-rwx-segments &>/dev/null; echo $?) -eq 0 ]; then ^-- SC2046 (warning): Quote this to prevent word splitting. ^------^ SC2086 (info): Double quote to prevent globbing and word splitting. ^---------^ SC3020 (warning): In POSIX sh, &> is undefined. Fixes: b2accfe7ca5b ("powerpc/boot: Check for ld-option support") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250423082154.30625-1-maddy@linux.ibm.com
2025-04-23	Merge branch 'pds_core-updates-and-fixes'	Jakub Kicinski
	Shannon Nelson says: ==================== pds_core: updates and fixes This patchset has fixes for issues seen in recent internal testing of error conditions and stress handling. Note that the first patch in this series is a leftover from an earlier patchset that was abandoned: Link: https://lore.kernel.org/netdev/20250129004337.36898-2-shannon.nelson@amd.com/ ==================== Link: https://patch.msgid.link/20250421174606.3892-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	pds_core: make wait_context part of q_info	Shannon Nelson
	Make the wait_context a full part of the q_info struct rather than a stack variable that goes away after pdsc_adminq_post() is done so that the context is still available after the wait loop has given up. There was a case where a slow development firmware caused the adminq request to time out, but then later the FW finally finished the request and sent the interrupt. The handler tried to complete_all() the completion context that had been created on the stack in pdsc_adminq_post() but no longer existed. This caused bad pointer usage, kernel crashes, and much wailing and gnashing of teeth. Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-5-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	pds_core: Remove unnecessary check in pds_client_adminq_cmd()	Brett Creeley
	When the pds_core driver was first created there were some race conditions around using the adminq, especially for client drivers. To reduce the possibility of a race condition there's a check against pf->state in pds_client_adminq_cmd(). This is problematic for a couple of reasons: 1. The PDSC_S_INITING_DRIVER bit is set during probe, but not cleared until after everything in probe is complete, which includes creating the auxiliary devices. For pds_fwctl this means it can't make any adminq commands until after pds_core's probe is complete even though the adminq is fully up by the time pds_fwctl's auxiliary device is created. 2. The race conditions around using the adminq have been fixed and this path is already protected against client drivers calling pds_client_adminq_cmd() if the adminq isn't ready, i.e. see pdsc_adminq_post() -> pdsc_adminq_inc_if_up(). Fix this by removing the pf->state check in pds_client_adminq_cmd() because invalid accesses to pds_core's adminq is already handled by pdsc_adminq_post()->pdsc_adminq_inc_if_up(). Fixes: 10659034c622 ("pds_core: add the aux client API") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-4-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	pds_core: handle unsupported PDS_CORE_CMD_FW_CONTROL result	Brett Creeley
	If the FW doesn't support the PDS_CORE_CMD_FW_CONTROL command the driver might at the least print garbage and at the worst crash when the user runs the "devlink dev info" devlink command. This happens because the stack variable fw_list is not 0 initialized which results in fw_list.num_fw_slots being a garbage value from the stack. Then the driver tries to access fw_list.fw_names[i] with i >= ARRAY_SIZE and runs off the end of the array. Fix this by initializing the fw_list and by not failing completely if the devcmd fails because other useful information is printed via devlink dev info even if the devcmd fails. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-3-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	pds_core: Prevent possible adminq overflow/stuck condition	Brett Creeley
	The pds_core's adminq is protected by the adminq_lock, which prevents more than 1 command from being posted onto it at any one time. This makes it so the client drivers cannot simultaneously post adminq commands. However, the completions happen in a different context, which means multiple adminq commands can be posted sequentially and all be waiting on completion. On the FW side, the backing adminq request queue is only 16 entries long and the retry mechanism and/or overflow/stuck prevention is lacking. This can cause the adminq to get stuck, so commands are no longer processed and completions are no longer sent by the FW. As an initial fix, prevent more than 16 outstanding adminq commands so there's no way to cause the adminq to get stuck. This works because the backing adminq request queue will never have more than 16 pending adminq commands, so it will never overflow. This is done by reducing the adminq depth to 16. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	net: dsa: mt7530: sync driver-specific behavior of MT7531 variants	Daniel Golle
	MT7531 standalone and MMIO variants found in MT7988 and EN7581 share most basic properties. Despite that, assisted_learning_on_cpu_port and mtu_enforcement_ingress were only applied for MT7531 but not for MT7988 or EN7581, causing the expected issues on MMIO devices. Apply both settings equally also for MT7988 and EN7581 by moving both assignments from mt7531_setup() to mt7531_setup_common(). This fixes unwanted flooding of packets due to unknown unicast during DA lookup, as well as issues with heterogeneous MTU settings. Fixes: 7f54cc9772ce ("net: dsa: mt7530: split-off common parts from mt7531_setup") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/89ed7ec6d4fa0395ac53ad2809742bb1ce61ed12.1745290867.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	Merge branch 'net_sched-fix-uaf-vulnerability-in-hfsc-qdisc'	Jakub Kicinski
	Cong Wang says: ==================== net_sched: Fix UAF vulnerability in HFSC qdisc This patchset contains two bug fixes and a selftest for the first one, for which we have a reliable reproducer. Please check each patch description for details. ==================== Link: https://patch.msgid.link/20250417184732.943057-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	selftests/tc-testing: Add test for HFSC queue emptying during peek operation	Cong Wang
	Add a selftest to exercise the condition where qdisc implementations like netem or codel might empty the queue during a peek operation. This tests the defensive code path in HFSC that checks the queue length again after peeking to handle this case. Based on the reproducer from Gerrard, improved by Jamal. Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20250417184732.943057-4-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	net_sched: hfsc: Fix a potential UAF in hfsc_dequeue() too	Cong Wang
	Similarly to the previous patch, we need to safeguard hfsc_dequeue() too. But for this one, we don't have a reliable reproducer. Fixes: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 ("Linux-2.6.12-rc2") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20250417184732.943057-3-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	net_sched: hfsc: Fix a UAF vulnerability in class handling	Cong Wang
	This patch fixes a Use-After-Free vulnerability in the HFSC qdisc class handling. The issue occurs due to a time-of-check/time-of-use condition in hfsc_change_class() when working with certain child qdiscs like netem or codel. The vulnerability works as follows: 1. hfsc_change_class() checks if a class has packets (q.qlen != 0) 2. It then calls qdisc_peek_len(), which for certain qdiscs (e.g., codel, netem) might drop packets and empty the queue 3. The code continues assuming the queue is still non-empty, adding the class to vttree 4. This breaks HFSC scheduler assumptions that only non-empty classes are in vttree 5. Later, when the class is destroyed, this can lead to a Use-After-Free The fix adds a second queue length check after qdisc_peek_len() to verify the queue wasn't emptied. Fixes: 21f4d5cc25ec ("net_sched/hfsc: fix curve activation in hfsc_change_class()") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20250417184732.943057-2-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
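The time-of-check/time-of-use sequence in steps 1 to 5 can be modeled with a toy queue whose peek operation may drop packets. This is a simplified sketch, not the scheduler code: `DroppingChildQueue`, the `active` set standing in for vttree, and both activation functions are hypothetical.

```python
class DroppingChildQueue:
    """Toy child qdisc whose peek may empty the queue, as netem or codel might."""

    def __init__(self, packet_lengths, drop_on_peek):
        self.packets = list(packet_lengths)
        self.drop_on_peek = drop_on_peek

    def qlen(self) -> int:
        return len(self.packets)

    def peek_len(self) -> int:
        if self.drop_on_peek:
            self.packets.clear()  # side effect: all queued packets are dropped
        return self.packets[0] if self.packets else 0


def activate_without_recheck(queue, active, class_id):
    """Check once, peek, then assume the queue is still non-empty."""
    if queue.qlen() != 0:
        queue.peek_len()
        active.add(class_id)  # may register a class that is now empty


def activate_with_recheck(queue, active, class_id):
    """Check again after the peek, mirroring the fix described above."""
    if queue.qlen() != 0:
        queue.peek_len()
        if queue.qlen() != 0:
            active.add(class_id)


buggy, fixed = set(), set()
activate_without_recheck(DroppingChildQueue([100], drop_on_peek=True), buggy, 1)
activate_with_recheck(DroppingChildQueue([100], drop_on_peek=True), fixed, 1)

print(buggy)  # {1}: an empty class ended up in the active set
print(fixed)  # set(): the second check kept it out
```

The invariant being protected is that only non-empty classes appear in the active set; the second length check restores it at the cost of one extra comparison.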
2025-04-23	Merge branch 'mptcp-pm-defer-freeing-userspace-pm-entries'	Jakub Kicinski
	Matthieu Baerts says: ==================== mptcp: pm: Defer freeing userspace pm entries Here are two unrelated fixes for MPTCP: - Patch 1: free userspace PM entry with RCU helpers. A fix for v6.14. - Patch 2: avoid a warning when running diag.sh selftest. A fix for v6.15-rc1. ==================== Link: https://patch.msgid.link/20250421-net-mptcp-pm-defer-freeing-v1-0-e731dc6e86b9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	selftests: mptcp: diag: use mptcp_lib_get_info_value	Geliang Tang
	When running diag.sh in a loop, chk_dump_one will report the following "grep: write error": 13 ....chk 2 cestab [ OK ] grep: write error 14 ....chk dump_one [ OK ] 15 ....chk 2->0 msk in use after flush [ OK ] 16 ....chk 2->0 cestab after flush [ OK ] This error is caused by a broken pipe. When the output of 'ss' is processed by grep, 'head -n 1' will exit immediately after getting the first line, causing the subsequent pipe to close. At this time, if 'grep' is still trying to write data to the closed pipe, it will trigger a SIGPIPE signal, causing a write error. One solution is not to use this problematic "head -n 1" command, but to use mptcp_lib_get_info_value() helper defined in mptcp_lib.sh to get the value of 'token'. Fixes: ba2400166570 ("selftests: mptcp: add a test for mptcp_diag_dump_one") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Tested-by: Gang Yan <yangang@kylinos.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250421-net-mptcp-pm-defer-freeing-v1-2-e731dc6e86b9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	mptcp: pm: Defer freeing of MPTCP userspace path manager entries	Mat Martineau
	When path manager entries are deleted from the local address list, they are first unlinked from the address list using list_del_rcu(). The entries must not be freed until after the RCU grace period, but the existing code immediately frees the entry. Use kfree_rcu_mightsleep() and adjust sk_omem_alloc in open code instead of using the sock_kfree_s() helper. This code path is only called in a netlink handler, so the "might sleep" function is preferable to adding a rarely-used rcu_head member to struct mptcp_pm_addr_entry. Fixes: 88d097316371 ("mptcp: drop free_list for deleting entries") Cc: stable@vger.kernel.org Signed-off-by: Mat Martineau <martineau@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250421-net-mptcp-pm-defer-freeing-v1-1-e731dc6e86b9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23	misc: pci_endpoint_test: Defer IRQ allocation until ioctl(PCITEST_SET_IRQTYPE)	Niklas Cassel
	Commit a402006d48a9 ("misc: pci_endpoint_test: Remove global 'irq_type' and 'no_msi'") changed so that the default IRQ vector requested by pci_endpoint_test_probe() was no longer the module param 'irq_type', but instead test->irq_type. test->irq_type is by default IRQ_TYPE_UNDEFINED (until someone calls ioctl(PCITEST_SET_IRQTYPE)). However, the commit also changed so that after initializing test->irq_type to IRQ_TYPE_UNDEFINED, it also overrides it with driver_data->irq_type, if the PCI device and vendor ID provides driver_data. This causes a regression for PCI device and vendor IDs that do not provide driver_data, and the host side pci_endpoint_test_driver driver fails to probe on such platforms: pci-endpoint-test 0001:01:00.0: Invalid IRQ type selected pci-endpoint-test 0001:01:00.0: probe with driver pci-endpoint-test failed with error -22 Considering that the pci endpoint selftests and the old pcitest.sh always call ioctl(PCITEST_SET_IRQTYPE) before performing any test that requires IRQs, fix the regression by removing the allocation of IRQs in pci_endpoint_test_probe(). The IRQ allocation will occur when ioctl(PCITEST_SET_IRQTYPE) is called. A positive side effect of this is that even if the endpoint controller has issues with IRQs, the user can still do all the tests/ioctls() that do not require working IRQs, e.g. PCITEST_BAR and PCITEST_BARS. This also means that we can remove the now unused irq_type from driver_data. The irq_type will always be the one configured by the user using ioctl(PCITEST_SET_IRQTYPE). (A user that does not know or care which irq_type is used can use PCITEST_IRQ_TYPE_AUTO. This has superseded the need for a default irq_type in driver_data.) 
[bhelgaas: add probe failure details] Fixes: a402006d48a9c ("misc: pci_endpoint_test: Remove global 'irq_type' and 'no_msi'") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20250416142825.336554-2-cassel@kernel.org
2025-04-24	drm/ttm/xe: drop unused force_alloc flag	Dave Airlie
	This flag used to be used in the old memory tracking code, that code got migrated into the vmwgfx driver[1], and then got removed from the tree[2], but this piece got left behind. [1] f07069da6b4c ("drm/ttm: move memory accounting into vmwgfx v4") [2] 8aadeb8ad874 ("drm/vmwgfx: Remove the dedicated memory accounting") Cleanup the dead code. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-04-23	selftests: ublk: remove useless 'delay_us' from 'struct dev_ctx'	Ming Lei
	'delay_us' shouldn't be added to 'struct dev_ctx' since now it is handled by per-target command line & 'struct fault_inject_ctx'. So remove it. Fixes: 81586652bb1f ("selftests: ublk: add generic_06 for covering fault inject") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Uday Shankar <ushankar@purestorage.com> Link: https://lore.kernel.org/r/20250421235947.715272-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-23	selftests: ublk: fix recover test	Ming Lei
	When the recovery test was added: - the 'break' for handling the '-g' argument was missed - the test name of test_generic_05.sh was wrong So fix the two. Fixes: 57e13a2e8cd2 ("selftests: ublk: support user recovery") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Uday Shankar <ushankar@purestorage.com> Link: https://lore.kernel.org/r/20250421235947.715272-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-23	block: hoist block size validation code to a separate function	Darrick J. Wong
	Hoist the block size validation code to bdev_validate_blocksize so that we can call it from filesystems that don't care about the bdev pagecache manipulations of set_blocksize. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/174543795720.4139148.840349813093799165.stgit@frogsfrogsfrogs Signed-off-by: Jens Axboe <axboe@kernel.dk>
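The commit message does not list which checks the hoisted helper performs, so the following is only a sketch of what a standalone block size validator might look like. Everything here is an assumption made for illustration: the name `toy_validate_blocksize()`, the 512-byte lower bound, the `TOY_MAX_BLOCK_SIZE` upper bound, and the rule that the size must be a power of two no smaller than the device's logical block size.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_MIN_BLOCK_SIZE 512u    /* assumed lower bound */
#define TOY_MAX_BLOCK_SIZE 65536u  /* assumed upper bound */

/* True if n has exactly one bit set. */
bool toy_is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Validate a requested block size against assumed generic limits and
 * against the device's logical block size. Returns 0 or -EINVAL.
 * It deliberately touches no page cache state, only the numbers.
 */
int toy_validate_blocksize(unsigned int logical_block_size,
			   unsigned int block_size)
{
	if (block_size < TOY_MIN_BLOCK_SIZE ||
	    block_size > TOY_MAX_BLOCK_SIZE ||
	    !toy_is_power_of_2(block_size))
		return -EINVAL;

	/* The block size cannot be smaller than what the device supports. */
	if (block_size < logical_block_size)
		return -EINVAL;

	return 0;
}
```

Keeping the validation free of side effects is what would let a caller use it without also triggering the page cache manipulation mentioned in the commit message.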
2025-04-23	block: fix race between set_blocksize and read paths	Darrick J. Wong
	With the new large sector size support, it's now the case that set_blocksize can change i_blksize and the folio order in a manner that conflicts with a concurrent reader and causes a kernel crash. Specifically, let's say that udev-worker calls libblkid to detect the labels on a block device. The read call can create an order-0 folio to read the first 4096 bytes from the disk. But then udev is preempted. Next, someone tries to mount an 8k-sectorsize filesystem from the same block device. The filesystem calls set_blocksize, which sets i_blksize to 8192 and the minimum folio order to 1. Now udev resumes, still holding the order-0 folio it allocated. It then tries to schedule a read bio and do_mpage_readahead tries to create bufferheads for the folio. Unfortunately, blocks_per_folio == 0 because the page size is 4096 but the blocksize is 8192 so no bufferheads are attached and the bh walk never sets bdev. We then submit the bio with a NULL block device and crash. Therefore, truncate the page cache after flushing but before updating i_blksize. However, that's not enough -- we also need to lock out file IO and page faults during the update. Take both the i_rwsem and the invalidate_lock in exclusive mode for invalidations, and in shared mode for read/write operations. I don't know if this is the correct fix, but xfs/259 found it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/174543795699.4139148.2086129139322431423.stgit@frogsfrogsfrogs Signed-off-by: Jens Axboe <axboe@kernel.dk>
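The blocks_per_folio == 0 case above comes down to integer division. A minimal sketch of that arithmetic follows; `toy_blocks_per_folio()` is a hypothetical helper written for illustration, not a kernel function, and it assumes a folio of order n spans page_size << n bytes.

```c
/*
 * Number of whole blocks that fit in a folio of the given order.
 * Integer division truncates, so a folio smaller than one block
 * yields 0 and a walk over "each block in the folio" never runs.
 */
unsigned int toy_blocks_per_folio(unsigned int folio_order,
				  unsigned int page_size,
				  unsigned int block_size)
{
	unsigned int folio_size = page_size << folio_order;

	return folio_size / block_size;
}
```

For the scenario in the commit message, an order-0 folio with 4096-byte pages and an 8192-byte block size gives 4096 / 8192 = 0, while an order-1 folio gives 8192 / 8192 = 1.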
2025-04-23	MAINTAINERS: add entry for Sitronix ST7571 LCD Controller	Marcus Folkesson
	Add MAINTAINERS entry for the Sitronix ST7571 dot matrix LCD controller. Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://lore.kernel.org/r/20250423-st7571-v6-3-e9519e3c4ec4@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2025-04-23	drm/st7571-i2c: add support for Sitronix ST7571 LCD controller	Marcus Folkesson
	Sitronix ST7571 is a 4bit gray scale dot matrix LCD controller. The controller has a SPI, I2C and 8bit parallel interface, this driver is for the I2C interface only. Reviewed-by: Thomas Zimmermann <tzimmrmann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://lore.kernel.org/r/20250423-st7571-v6-2-e9519e3c4ec4@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>