summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-04-16nvmet: pci-epf: cleanup link state managementDamien Le Moal
Since the link_up boolean field of struct nvmet_pci_epf_ctrl is always set to true when nvmet_pci_epf_start_ctrl() is called, assign true to this field in nvmet_pci_epf_start_ctrl(). Conversely, since this field is set to false when nvmet_pci_epf_stop_ctrl() is called, set this field to false directly inside that function. While at it, also add information messages to notify the user of the PCI link state changes to help troubleshoot any link stability issues without needing to enable debug messages. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-16nvmet: pci-epf: clear CC and CSTS when disabling the controllerDamien Le Moal
When a host shuts down the controller when shutting down but does so without first disabling the controller, the enable bit remains set in the controller configuration register. When the host restarts and attempts to enable the controller again, the nvmet_pci_epf_poll_cc_work() function is unable to detect the change from 0 to 1 of the enable bit, and thus the controller is not enabled again, which result in a device scan timeout on the host. This problem also occurs if the host shuts down uncleanly or if the PCIe link goes down: as the CC.EN value is not reset, the controller is not enabled again when the host restarts. Fix this by introducing the function nvmet_pci_epf_clear_ctrl_config() to clear the CC and CSTS registers of the controller when the PCIe link is lost (nvmet_pci_epf_stop_ctrl() function), or when starting the controller fails (nvmet_pci_epf_enable_ctrl() fails). Also use this function in nvmet_pci_epf_init_bar() to simplify the initialization of the CC and CSTS registers. Furthermore, modify the function nvmet_pci_epf_disable_ctrl() to clear the CC.EN bit and write this updated value to the BAR register when the controller is shutdown by the host, to ensure that upon restart, we can detect the host setting CC.EN. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-16nvmet: pci-epf: always fully initialize completion entriesDamien Le Moal
For a command that is normally processed through the command request execute() function, the completion entry for the command is initialized by __nvmet_req_complete() and nvmet_pci_epf_cq_work() only needs to set the status field and the phase of the completion entry before posting the entry to the completion queue. However, for commands that are failed due to an internal error (e.g. the command data buffer allocation fails), the command request execute() function is not called and __nvmet_req_complete() is never executed for the command, leaving the command completion entry uninitialized. For such command failed before calling req->execute(), the host ends up seeing completion entries with an invalid submission queue ID and command ID. Avoid such issue by always fully initilizing a command completion entry in nvmet_pci_epf_cq_work(), setting the entry submission queue head, ID and command ID. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-16nvmet: auth: use NULL to clear a pointer in nvmet_auth_sq_free()Damien Le Moal
When compiling with C=1, the following sparse warning is generated: auth.c:243:23: warning: Using plain integer as NULL pointer Avoid this warning by using NULL to instead of 0 to set the sq tls_key pointer. Fixes: fa2e0f8bbc68 ("nvmet-tcp: support secure channel concatenation") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-16nvme-multipath: sysfs links may not be created for devicesHannes Reinecke
When rapidly rescanning for new namespaces nvme_mpath_add_sysfs_link() may be called for a block device not added to sysfs. But NVME_NS_SYSFS_ATTR_LINK had already been set, so when checking this device a second time we will fail to create the link. Fix this by exchanging the order of the block device check and the NVME_NS_SYSFS_ATTR_LINK bit check. Fixes: 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin io-policy") Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>** Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-16nvme: fixup scan failure for non-ANA multipath controllersHannes Reinecke
Commit 62baf70c3274 caused the ANA log page to be re-read, even on controllers that do not support ANA. While this should generally harmless, some controllers hang on the unsupported log page and never finish probing. Fixes: 62baf70c3274 ("nvme: re-read ANA log page after ns scan completes") Signed-off-by: Hannes Reinecke <hare@kernel.org> Tested-by: Srikanth Aithal <sraithal@amd.com> [hch: more detailed commit message] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2025-04-15Merge tag 'linux-can-fixes-for-6.15-20250415' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2025-04-15 The first patch is by Davide Caratti and fixes the missing derement in the protocol inuse counter for the J1939 CAN protocol. The last patch is by Weizhao Ouyang and fixes a broken quirks check in the rockchip CAN-FD driver. * tag 'linux-can-fixes-for-6.15-20250415' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: rockchip_canfd: fix broken quirks checks can: fix missing decrement of j1939_proto.inuse_idx ==================== Link: https://patch.msgid.link/20250415103401.445981-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15ublk: don't suggest CONFIG_BLK_DEV_UBLK=YCaleb Sander Mateos
The CONFIG_BLK_DEV_UBLK help text suggests setting the config option to Y so task_work_add() can be used to dispatch I/O, improving performance. However, this mechanism was removed in commit 29dc5d06613f2 ("ublk: kill queuing request by task_work_add"). So remove this paragraph from the config help text. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Uday Shankar <ushankar@purestorage.com> Link: https://lore.kernel.org/r/20250416004111.3242817-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15loop: stop using vfs_iter_{read,write} for buffered I/OChristoph Hellwig
vfs_iter_{read,write} always perform direct I/O when the file has the O_DIRECT flag set, which breaks disabling direct I/O using the LOOP_SET_STATUS / LOOP_SET_STATUS64 ioctls. This was recenly reported as a regression, but as far as I can tell was only uncovered by better checking for block sizes and has been around since the direct I/O support was added. Fix this by using the existing aio code that calls the raw read/write iter methods instead. Note that despite the comments there is no need for block drivers to ever call flush_dcache_page themselves, and the call is a left-over from prehistoric times. Fixes: ab1cb278bc70 ("block: loop: introduce ioctl command of LOOP_SET_DIRECT_IO") Reported-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Tested-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20250409130940.3685677-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15eth: bnxt: fix missing ring index trim on error pathJakub Kicinski
Commit under Fixes converted tx_prod to be free running but missed masking it on the Tx error path. This crashes on error conditions, for example when DMA mapping fails. Fixes: 6d1add95536b ("bnxt_en: Modify TX ring indexing logic.") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250414143210.458625-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15net: ethernet: ti: am65-cpsw: fix port_np reference countingMichael Walle
A reference to the device tree node is stored in a private struct, thus the reference count has to be incremented. Also, decrement the count on device removal and in the error path. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Michael Walle <mwalle@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250414083942.4015060-1-mwalle@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15octeontx2-pf: handle otx2_mbox_get_rsp errorsChenyuan Yang
Adding error pointer check after calling otx2_mbox_get_rsp(). This is similar to the commit bd3110bc102a ("octeontx2-pf: handle otx2_mbox_get_rsp errors in otx2_flows.c"). Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Fixes: 6c40ca957fe5 ("octeontx2-pf: Adds TC offload support") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250412183327.3550970-1-chenyuan0y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15Revert "PCI: Avoid reset when disabled via sysfs"Alex Williamson
This reverts commit 479380efe1625e251008d24b2810283db60d6fcd. The reset_method attribute on a PCI device is only intended to manage the availability of function scoped resets for a device. It was never intended to restrict resets targeting the bus or slot. In introducing a restriction that each device must support function level reset by testing pci_reset_supported(), we essentially create a catch-22, that a device must have a function scope reset in order to support bus/slot reset, when we use bus/slot reset to effect a reset of a device that does not support a function scoped reset, especially multi-function devices. This breaks the majority of uses cases where vfio-pci uses bus/slot resets to manage multifunction devices that do not support function scoped resets. Fixes: 479380efe162 ("PCI: Avoid reset when disabled via sysfs") Reported-by: Cal Peake <cp@absolutedigital.net> Closes: https://lore.kernel.org/all/808e1111-27b7-f35b-6d5c-5b275e73677b@absolutedigital.net Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220010 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250414211828.3530741-1-alex.williamson@redhat.com
2025-04-15loop: LOOP_SET_FD: send uevents for partitionsThomas Weißschuh
Remove the suppression of the uevents before scanning for partitions. The partitions inherit their suppression settings from their parent device, which lead to the uevents being dropped. This is similar to the same changes for LOOP_CONFIGURE done in commit bb430b694226 ("loop: LOOP_CONFIGURE: send uevents for partitions"). Fixes: 498ef5c777d9 ("loop: suppress uevents while reconfiguring the device") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20250415-loop-uevent-changed-v3-1-60ff69ac6088@linutronix.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15spi: tegra210-quad: add rate limiting and simplify timeout error messageBreno Leitao
On malfunctioning hardware, timeout error messages can appear thousands of times, creating unnecessary system pressure and log bloat. This patch makes two improvements: 1. Replace dev_err() with dev_err_ratelimited() to prevent log flooding when hardware errors persist 2. Remove the redundant timeout value parameter from the error message, as 'ret' is always zero in this error path These changes reduce logging overhead while maintaining necessary error reporting for debugging purposes. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250401-tegra-v2-2-126c293ec047@debian.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-15spi: tegra210-quad: use WARN_ON_ONCE instead of WARN_ON for timeoutsBreno Leitao
Some machines with tegra_qspi_combined_seq_xfer hardware issues generate excessive kernel warnings, severely polluting the logs: dmesg | grep -i "WARNING:.*tegra_qspi_transfer_one_message" | wc -l 94451 This patch replaces WARN_ON with WARN_ON_ONCE for timeout conditions to reduce log spam. The subsequent error message still prints on each occurrence, providing sufficient information about the failure, while the stack trace is only needed once for debugging purposes. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250401-tegra-v2-1-126c293ec047@debian.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-15thermal: intel: int340x: Fix Panther Lake DLVR supportSrinivas Pandruvada
Panther Lake uses the same DLVR register offsets as Lunar Lake, but the driver uses the default register offsets table for it by mistake. Move the selection of register offsets table from the actual attribute read/write callbacks to proc_thermal_rfim_add() and make it handle Panther Lake the same way as Lunar Lake. This way it is clean and in the future such issues can be avoided. Fixes: e50eeababa94 ("thermal: intel: int340x: Panther Lake DLVR support") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20250411115438.594114-1-srinivas.pandruvada@linux.intel.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-15thermal: intel: int340x: Add missing DVFS support flagsSrinivas Pandruvada
DVFS (Dynamic Voltage Frequency Scaling) is still supported for DDR memory on Lunar Lake and Panther Lake. Add the missing flag PROC_THERMAL_FEATURE_DVFS. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20250410172943.577913-1-srinivas.pandruvada@linux.intel.com [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-15firmware: stratix10-svc: Add of_platform_default_populate()Mahesh Rao
Add of_platform_default_populate() to stratix10-svc driver as the firmware/svc node was moved out of soc. This fixes the failed probing of child drivers of svc node. Cc: stable@vger.kernel.org Fixes: 23c3ebed382a ("arm64: dts: socfpga: agilex: move firmware out of soc node") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Mahesh Rao <mahesh.rao@intel.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Link: https://lore.kernel.org/r/20250326115446.36123-1-dinguyen@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15mei: vsc: Use struct vsc_tp_packet as vsc-tp tx_buf and rx_buf typeHans de Goede
vsc_tp.tx_buf and vsc_tp.rx_buf point to a struct vsc_tp_packet, use the correct type instead of "void *" and use sizeof(*ptr) when allocating memory for these buffers. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Link: https://lore.kernel.org/r/20250318141203.94342-3-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15mei: vsc: Fix fortify-panic caused by invalid counted_by() useHans de Goede
gcc 15 honors the __counted_by(len) attribute on vsc_tp_packet.buf[] and the vsc-tp.c code is using this in a wrong way. len does not contain the available size in the buffer, it contains the actual packet length *without* the crc. So as soon as vsc_tp_xfer() tries to add the crc to buf[] the fortify-panic handler gets triggered: [ 80.842193] memcpy: detected buffer overflow: 4 byte write of buffer size 0 [ 80.842243] WARNING: CPU: 4 PID: 272 at lib/string_helpers.c:1032 __fortify_report+0x45/0x50 ... [ 80.843175] __fortify_panic+0x9/0xb [ 80.843186] vsc_tp_xfer.cold+0x67/0x67 [mei_vsc_hw] [ 80.843210] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90 [ 80.843229] ? lockdep_hardirqs_on+0x7c/0x110 [ 80.843250] mei_vsc_hw_start+0x98/0x120 [mei_vsc] [ 80.843270] mei_reset+0x11d/0x420 [mei] The easiest fix would be to just drop the counted-by but with the exception of the ack buffer in vsc_tp_xfer_helper() which only contains enough room for the packet-header, all other uses of vsc_tp_packet always use a buffer of VSC_TP_MAX_XFER_SIZE bytes for the packet. Instead of just dropping the counted-by, split the vsc_tp_packet struct definition into a header and a full-packet definition and use a fixed size buf[] in the packet definition, this way fortify-source buffer overrun checking still works when enabled. Fixes: 566f5ca97680 ("mei: Add transport driver for IVSC device") Cc: stable@kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Link: https://lore.kernel.org/r/20250318141203.94342-2-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15pps: generators: tio: fix platform_set_drvdata()Raag Jadav
platform_set_drvdata() is setting a double pointer to struct pps_tio as driver_data, which will point to the local stack of probe function instead of intended data. Set driver_data correctly and fix illegal memory access by its user. BUG: unable to handle page fault for address: ffffc9000117b738 RIP: 0010:hrtimer_active+0x2b/0x60 Call Trace: ? hrtimer_active+0x2b/0x60 hrtimer_cancel+0x19/0x50 pps_gen_tio_remove+0x1e/0x80 [pps_gen_tio] Fixes: c89755d1111f ("pps: generators: Add PPS Generator TIO Driver") Signed-off-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Rodolfo Giometti <giometti@enneenne.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250318114038.2058677-1-raag.jadav@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15mcb: fix a double free bug in chameleon_parse_gdd()Haoxiang Li
In chameleon_parse_gdd(), if mcb_device_register() fails, 'mdev' would be released in mcb_device_register() via put_device(). Thus, goto 'err' label and free 'mdev' again causes a double free. Just return if mcb_device_register() fails. Fixes: 3764e82e5150 ("drivers: Introduce MEN Chameleon Bus") Cc: stable <stable@kernel.org> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Signed-off-by: Johannes Thumshirn <jth@kernel.org> Link: https://lore.kernel.org/r/6201d09e2975ae5789879f79a6de4c38de9edd4a.1741596225.git.jth@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15drivers/base/memory: Avoid overhead from for_each_present_section_nr()Gavin Shan
for_each_present_section_nr() was introduced to add_boot_memory_block() by commit 61659efdb35c ("drivers/base/memory: improve add_boot_memory_block()"). It causes unnecessary overhead when the present sections are really sparse. next_present_section_nr() called by the macro to find the next present section, which is far away from the spanning sections in the specified block. Too much time consumed by next_present_section_nr() in this case, which can lead to softlockup as observed by Aditya Gupta on IBM Power10 machine. watchdog: BUG: soft lockup - CPU#248 stuck for 22s! [swapper/248:1] Modules linked in: CPU: 248 UID: 0 PID: 1 Comm: swapper/248 Not tainted 6.15.0-rc1-next-20250408 #1 VOLUNTARY Hardware name: 9105-22A POWER10 (raw) 0x800200 opal:v7.1-107-gfda75d121942 PowerNV NIP: c00000000209218c LR: c000000002092204 CTR: 0000000000000000 REGS: c00040000418fa30 TRAP: 0900 Not tainted (6.15.0-rc1-next-20250408) MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 28000428 XER: 00000000 CFAR: 0000000000000000 IRQMASK: 0 GPR00: c000000002092204 c00040000418fcd0 c000000001b08100 0000000000000040 GPR04: 0000000000013e00 c000c03ffebabb00 0000000000c03fff c000400fff587f80 GPR08: 0000000000000000 00000000001196f7 0000000000000000 0000000028000428 GPR12: 0000000000000000 c000000002e80000 c00000000001007c 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28: c000000002df7f70 0000000000013dc0 c0000000011dd898 0000000008000000 NIP [c00000000209218c] memory_dev_init+0x114/0x1e0 LR [c000000002092204] memory_dev_init+0x18c/0x1e0 Call Trace: [c00040000418fcd0] [c000000002092204] memory_dev_init+0x18c/0x1e0 (unreliable) [c00040000418fd50] [c000000002091348] driver_init+0x78/0xa4 [c00040000418fd70] [c0000000020063ac] kernel_init_freeable+0x22c/0x370 [c00040000418fde0] [c0000000000100a8] kernel_init+0x34/0x25c [c00040000418fe50] [c00000000000cd94] ret_from_kernel_user_thread+0x14/0x1c Avoid the overhead by folding for_each_present_section_nr() to the outer loop. add_boot_memory_block() is dropped after that. Fixes: 61659efdb35c ("drivers/base/memory: improve add_boot_memory_block()") Closes: https://lore.kernel.org/linux-mm/20250409180344.477916-1-adityag@linux.ibm.com Reported-by: Aditya Gupta <adityag@linux.ibm.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Acked-by: Oscar Salvador <osalvador@suse.de> Tested-by: Aditya Gupta <adityag@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20250410125110.1232329-1-gshan@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15software node: Prevent link creation failure from causing kobj reference ↵Lizhi Xu
count imbalance syzbot reported a uaf in software_node_notify_remove. [1] When any of the two sysfs_create_link() in software_node_notify() fails, the swnode->kobj reference count will not increase normally, which will cause swnode to be released incorrectly due to the imbalance of kobj reference count when executing software_node_notify_remove(). Increase the reference count of kobj before creating the link to avoid uaf. [1] BUG: KASAN: slab-use-after-free in software_node_notify_remove+0x1bc/0x1c0 drivers/base/swnode.c:1108 Read of size 1 at addr ffff888033c08908 by task syz-executor105/5844 Freed by task 5844: software_node_notify_remove+0x159/0x1c0 drivers/base/swnode.c:1106 device_platform_notify_remove drivers/base/core.c:2387 [inline] Fixes: 9eb59204d519 ("iommufd/selftest: Add set_dev_pasid in mock iommu") Reported-by: syzbot+2ff22910687ee0dfd48e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2ff22910687ee0dfd48e Tested-by: syzbot+2ff22910687ee0dfd48e@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250414071123.1228331-1-lizhi.xu@windriver.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15drivers/base: Extend documentation with preferred way to use auxbusLeon Romanovsky
Document the preferred way to use auxiliary bus. Signed-off-by: Leon Romanovsky <leon@kernel.org> Link: https://lore.kernel.org/r/206e8c249f630abd3661deb36b84b26282241040.1743510317.git.leon@kernel.org [ reworded the text a bit - gregkh ] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15Merge tag 'edac_urgent_for_v6.15_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: "Two fixes to the AMD translation library for the MI300 side of things: - Use the row[13] bit when calculating the memory row to retire - Mask the physical row address in order to avoid creating duplicate error records" * tag 'edac_urgent_for_v6.15_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: RAS/AMD/FMPM: Get masked address RAS/AMD/ATL: Include row[13] bit in row retirement
2025-04-15driver core: fix potential NULL pointer dereference in dev_uevent()Dmitry Torokhov
If userspace reads "uevent" device attribute at the same time as another threads unbinds the device from its driver, change to dev->driver from a valid pointer to NULL may result in crash. Fix this by using READ_ONCE() when fetching the pointer, and take bus' drivers klist lock to make sure driver instance will not disappear while we access it. Use WRITE_ONCE() when setting the driver pointer to ensure there is no tearing. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20250311052417.1846985-3-dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15driver core: introduce device_set_driver() helperDmitry Torokhov
In preparation to closing a race when reading driver pointer in dev_uevent() code, instead of setting device->driver pointer directly introduce device_set_driver() helper. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20250311052417.1846985-2-dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15Revert "drivers: core: synchronize really_probe() and dev_uevent()"Dmitry Torokhov
This reverts commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0. Probing a device can take arbitrary long time. In the field we observed that, for example, probing a bad micro-SD cards in an external USB card reader (or maybe cards were good but cables were flaky) sometimes takes longer than 2 minutes due to multiple retries at various levels of the stack. We can not block uevent_show() method for that long because udev is reading that attribute very often and that blocks udev and interferes with booting of the system. The change that introduced locking was concerned with dev_uevent() racing with unbinding the driver. However we can handle it without locking (which will be done in subsequent patch). There was also claim that synchronization with probe() is needed to properly load USB drivers, however this is a red herring: the change adding the lock was introduced in May of last year and USB loading and probing worked properly for many years before that. Revert the harmful locking. Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20250311052417.1846985-1-dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15loop: properly send KOBJ_CHANGED uevent for disk deviceThomas Weißschuh
The original commit message and the wording "uncork" in the code comment indicate that it is expected that the suppressed event instances are automatically sent after unsuppressing. This is not the case, instead they are discarded. In effect this means that no "changed" events are emitted on the device itself by default. While each discovered partition does trigger a changed event on the device, devices without partitions don't have any event emitted. This makes udev miss the device creation and prompted workarounds in userspace. See the linked util-linux/losetup bug. Explicitly emit the events and drop the confusingly worded comments. Link: https://github.com/util-linux/util-linux/issues/2434 Fixes: 498ef5c777d9 ("loop: suppress uevents while reconfiguring the device") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20250415-loop-uevent-changed-v2-1-0c4e6a923b2a@linutronix.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15loop: aio inherit the ioprio of original requestYunlong Xing
Set cmd->iocb.ki_ioprio to the ioprio of loop device's request. The purpose is to inherit the original request ioprio in the aio flow. Signed-off-by: Yunlong Xing <yunlong.xing@unisoc.com> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250414030159.501180-1-yunlong.xing@unisoc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15misc: microchip: pci1xxxx: Fix incorrect IRQ status handling during ackRengarajan S
Under irq_ack, pci1xxxx_assign_bit reads the current interrupt status, modifies and writes the entire value back. Since, the IRQ status bit gets cleared on writing back, the better approach is to directly write the bitmask to the register in order to preserve the value. Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20250313170856.20868-3-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15misc: microchip: pci1xxxx: Fix Kernel panic during IRQ handler registrationRengarajan S
Resolve kernel panic while accessing IRQ handler associated with the generated IRQ. This is done by acquiring the spinlock and storing the current interrupt state before handling the interrupt request using generic_handle_irq. A previous fix patch was submitted where 'generic_handle_irq' was replaced with 'handle_nested_irq'. However, this change also causes the kernel panic where after determining which GPIO triggered the interrupt and attempting to call handle_nested_irq with the mapped IRQ number, leads to a failure in locating the registered handler. Fixes: 194f9f94a516 ("misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20250313170856.20868-2-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15char: misc: register chrdev region with all possible minorsThadeu Lima de Souza Cascardo
register_chrdev will only register the first 256 minors of a major chrdev. That means that dynamically allocated misc devices with minor above 255 will fail to open with -ENXIO. This was found by kernel test robot when testing a different change that makes all dynamically allocated minors be above 255. This has, however, been separately tested by creating 256 serio_raw devices with the help of userio driver. Ever since allowing misc devices with minors above 128, this has been possible. Fix it by registering all minor numbers from 0 to MINORMASK + 1 for MISC_MAJOR. Reported-by: kernel test robot <oliver.sang@intel.com> Cc: stable <stable@kernel.org> Closes: https://lore.kernel.org/oe-lkp/202503171507.6c8093d0-lkp@intel.com Fixes: ab760791c0cf ("char: misc: Increase the maximum number of dynamic misc devices to 1048448") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Tested-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Link: https://lore.kernel.org/r/20250317-misc-chrdev-v1-1-6cd05da11aef@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15mei: me: add panther lake H DIDAlexander Usyskin
Add Panther Lake H device id. Cc: stable <stable@kernel.org> Co-developed-by: Tomas Winkler <tomasw@gmail.com> Signed-off-by: Tomas Winkler <tomasw@gmail.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Link: https://lore.kernel.org/r/20250408130005.1358140-1-alexander.usyskin@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15net: ngbe: fix memory leak in ngbe_probe() error pathAbdun Nihaal
When ngbe_sw_init() is called, memory is allocated for wx->rss_key in wx_init_rss_key(). However, in ngbe_probe() function, the subsequent error paths after ngbe_sw_init() don't free the rss_key. Fix that by freeing it in error path along with wx->mac_table. Also change the label to which execution jumps when ngbe_sw_init() fails, because otherwise, it could lead to a double free for rss_key, when the mac_table allocation fails in wx_sw_init(). Fixes: 02338c484ab6 ("net: ngbe: Initialize sw info and register netdev") Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250412154927.25908-1-abdun.nihaal@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-15platform/x86: msi-wmi-platform: Rename "data" variableArmin Wolf
Rename the "data" variable inside msi_wmi_platform_read() to avoid a name collision when the driver adds support for a state container struct (that is to be called "data" too) in the future. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20250414140453.7691-1-W_Armin@gmx.de Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-15platform/x86: alienware-wmi-wmax: Extend support to more laptopsKurt Borja
Extend thermal control support to: - Alienware Area-51m R2 - Alienware m16 R1 - Alienware m16 R2 - Dell G16 7630 - Dell G5 5505 SE Cc: stable@vger.kernel.org Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250411-awcc-support-v1-2-09a130ec4560@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-15platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1Kurt Borja
Some users report the Alienware m16 R1 models, support G-Mode. This was manually verified by inspecting their ACPI tables. Cc: stable@vger.kernel.org Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250411-awcc-support-v1-1-09a130ec4560@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-15comedi: jr3_pci: Fix synchronous deletion of timerIan Abbott
When `jr3_pci_detach()` is called during device removal, it calls `timer_delete_sync()` to stop the timer, but the timer expiry function always reschedules the timer, so the synchronization is ineffective. Call `timer_shutdown_sync()` instead. It does not matter that the timer expiry function pointer is cleared, because the device is being removed. Fixes: 07b509e6584a5 ("Staging: comedi: add jr3_pci driver") Cc: stable <stable@kernel.org> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20250415123901.13483-1-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15binder: fix offset calculation in debug logCarlos Llamas
The vma start address should be substracted from the buffer's user data address and not the other way around. Cc: Tiffany Y. Yang <ynaffit@google.com> Cc: stable <stable@kernel.org> Fixes: 162c79731448 ("binder: avoid user addresses in debug logs") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Tiffany Y. Yang <ynaffit@google.com> Link: https://lore.kernel.org/r/20250325184902.587138-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15serial: sifive: lock port in startup()/shutdown() callbacksRyo Takakura
startup()/shutdown() callbacks access SIFIVE_SERIAL_IE_OFFS. The register is also accessed from write() callback. If console were printing and startup()/shutdown() callback gets called, its access to the register could be overwritten. Add port->lock to startup()/shutdown() callbacks to make sure their access to SIFIVE_SERIAL_IE_OFFS is synchronized against write() callback. Fixes: 45c054d0815b ("tty: serial: add driver for the SiFive UART") Signed-off-by: Ryo Takakura <ryotkkr98@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: stable@vger.kernel.org Reviewed-by: John Ogness <john.ogness@linutronix.de> Rule: add Link: https://lore.kernel.org/stable/20250330003522.386632-1-ryotkkr98%40gmail.com Link: https://lore.kernel.org/r/20250412001847.183221-1-ryotkkr98@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15usb: typec: class: Unlocked on error in typec_register_partner()Dan Carpenter
We recently added some locking to this function but this error path was accidentally missed. Unlock before returning. Fixes: ec27386de23a ("usb: typec: class: Fix NULL pointer access") Cc: stable <stable@kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/Z_44tOtmml89wQcM@stanley.mountain Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15usb: quirks: Add delay init quirk for SanDisk 3.2Gen1 Flash DriveMiao Li
The SanDisk 3.2Gen1 Flash Drive, which VID:PID is in 0781:55a3, just like Silicon Motion Flash Drive: https://lore.kernel.org/r/20250401023027.44894-1-limiao870622@163.com also needs the DELAY_INIT quirk, or it will randomly work incorrectly (e.g.: lsusb and can't list this device info) when connecting Huawei hisi platforms and doing thousand of reboot test circles. Cc: stable <stable@kernel.org> Signed-off-by: Miao Li <limiao@kylinos.cn> Signed-off-by: Lei Huang <huanglei@kylinos.cn> Link: https://lore.kernel.org/r/20250414062935.159024-1-limiao870622@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15intel_th: avoid using deprecated page->mapping, index fieldsLorenzo Stoakes
The struct page->mapping, index fields are deprecated and soon to be only available as part of a folio. It is likely the intel_th code which sets page->mapping, index is was implemented out of concern that some aspect of the page fault logic may encounter unexpected problems should they not. However, the appropriate interface for inserting kernel-allocated memory is vm_insert_page() in a VM_MIXEDMAP. By using the helper function vmf_insert_mixed() we can do this with minimal churn in the existing fault handler. By doing so, we bypass the remainder of the faulting logic. The pages are still pinned so there is no possibility of anything unexpected being done with the pages once established. It would also be reasonable to pre-map everything on fault, however to minimise churn we retain the fault handler. We also eliminate all code which clears page->mapping on teardown as this has now become unnecessary. The MSU code relies on faulting to function correctly, so is by definition dependent on CONFIG_MMU. We avoid spurious reports about compilation failure for unsupported platforms by making this requirement explicit in Kconfig as part of this change too. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lore.kernel.org/r/20250331125608.60300-1-lorenzo.stoakes@oracle.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15can: rockchip_canfd: fix broken quirks checksWeizhao Ouyang
First get the devtype_data then check quirks. Fixes: bbdffb341498 ("can: rockchip_canfd: add quirk for broken CAN-FD support") Signed-off-by: Weizhao Ouyang <o451686892@gmail.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20250324114416.10160-1-o451686892@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-04-15pinctrl: airoha: fix wrong PHY LED mapping and PHY2 LED definesChristian Marangi
The current PHY2 LED define are wrong and actually set BITs outside the related mask. Fix it and set the correct value. While at it, also use FIELD_PREP_CONST macro to make it simple to understand what values are actually applied for the mask. Also fix wrong PHY LED mapping. The SoC Switch supports up to 4 port but the register define mapping for 5 PHY port, starting from 0. The mapping was wrongly defined starting from PHY1. Reorder the function group to start from PHY0. PHY4 is actually never supported as we don't have a GPIO pin to assign. Cc: stable@vger.kernel.org Fixes: 1c8ace2d0725 ("pinctrl: airoha: Add support for EN7581 SoC") Reviewed-by: Benjamin Larsson <benjamin.larsson@genexis.eu> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/20250401135026.18018-1-ansuelsmth@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-04-15pinctrl: meson: define the pull up/down resistor value as 60 kOhmMartin Blumenstingl
The public datasheets of the following Amlogic SoCs describe a typical resistor value for the built-in pull up/down resistor: - Meson8/8b/8m2: not documented - GXBB (S905): 60 kOhm - GXL (S905X): 60 kOhm - GXM (S912): 60 kOhm - G12B (S922X): 60 kOhm - SM1 (S905D3): 60 kOhm The public G12B and SM1 datasheets additionally state min and max values: - min value: 50 kOhm for both, pull-up and pull-down - max value for the pull-up: 70 kOhm - max value for the pull-down: 130 kOhm Use 60 kOhm in the pinctrl-meson driver as well so it's shown in the debugfs output. It may not be accurate for Meson8/8b/8m2 but in reality 60 kOhm is closer to the actual value than 1 Ohm. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/20250329190132.855196-1-martin.blumenstingl@googlemail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-04-15pinctrl: imx: Return NULL if no group is matched and foundHui Wang
Currently if no group is matched and found, this function will return the last grp to the caller, this is not expected, it is supposed to return NULL in this case. Fixes: e566fc11ea76 ("pinctrl: imx: use generic pinctrl helpers for managing groups") Signed-off-by: Hui Wang <hui.wang@canonical.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/20250327031600.99723-1-hui.wang@canonical.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>