summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-29platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive driversSuma Hegde
amd_hsmp and hsmp_acpi are intended to be mutually exclusive drivers and amd_hsmp is for legacy platforms. To achieve this, it is essential to check for the presence of the ACPI device in plat.c. If the hsmp ACPI device entry is found, allow the hsmp_acpi driver to manage the hsmp and return an error from plat.c. Additionally, rename the driver from amd_hsmp to hsmp_acpi to prevent "Driver 'amd_hsmp' is already registered, aborting..." error in case both drivers are loaded simultaneously. Also, support both platform device based and ACPI based probing for family 0x1A models 0x00 to 0x0F, implement only ACPI based probing for family 0x1A, models 0x10 to 0x1F. Return false from legacy_hsmp_support() for this platform. This aligns with the condition check in is_f1a_m0h(). Link: https://lore.kernel.org/platform-driver-x86/aALZxvHWmphNL1wa@gourry-fedora-PF4VCD3F/ Fixes: 7d3135d16356 ("platform/x86/amd/hsmp: Create separate ACPI, plat and common drivers") Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com> Co-developed-by: Gregory Price <gourry@gourry.net> Signed-off-by: Gregory Price <gourry@gourry.net> Signed-off-by: Suma Hegde <suma.hegde@amd.com> Link: https://lore.kernel.org/r/20250425102357.266790-1-suma.hegde@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-29drivers/platform/x86/amd: pmf: Check for invalid Smart PC PoliciesMario Limonciello
commit 376a8c2a14439 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA") added support for platforms that support an updated TA, however it also exposed a number of platforms that although they have support for the updated TA don't actually populate a policy binary. Add an explicit check that the policy binary isn't empty before initializing the TA. Reported-by: Christian Heusel <christian@heusel.eu> Closes: https://lore.kernel.org/platform-driver-x86/ae644428-5bf2-4b30-81ba-0b259ed3449b@heusel.eu/ Fixes: 376a8c2a14439 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Christian Heusel <christian@heusel.eu> Link: https://lore.kernel.org/r/20250423132002.3984997-3-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-29drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC PoliciesMario Limonciello
If a policy is passed into amd_pmf_get_pb_data() that causes the engine to fail to start there is a memory leak. Free the memory in this failure path. Fixes: 10817f28e5337 ("platform/x86/amd/pmf: Add capability to sideload of policy binary") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250423132002.3984997-2-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-29Merge branch 'ip-improve-tcp-sock-multipath-routing'Paolo Abeni
Willem de Bruijn says: ==================== ip: improve tcp sock multipath routing From: Willem de Bruijn <willemb@google.com> Improve layer 4 multipath hash policy for local tcp connections: patch 1: Select a source address that matches the nexthop device. Due to tcp_v4_connect making separate route lookups for saddr and route, the two can currently be inconsistent. patch 2: Use all paths when opening multiple local tcp connections to the same ip address and port. patch 3: Test the behavior. Extend the fib_tests.sh testsuite with one opening many connections, and count SYNs on both egress devices, for packets matching the source address of the dev. Changelog in the individual patches ==================== Link: https://patch.msgid.link/20250424143549.669426-1-willemdebruijn.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-29selftests/net: test tcp connection load balancingWillem de Bruijn
Verify that TCP connections use both routes when connecting multiple times to a remote service over a two nexthop multipath route. Use socat to create the connections. Use tc prio + tc filter to count routes taken, counting SYN packets across the two egress devices. Also verify that the saddr matches that of the device. To avoid flaky tests when testing inherently randomized behavior, set a low bar and pass if even a single SYN is observed on each device. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250424143549.669426-4-willemdebruijn.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-29ip: load balance tcp connections to single dst addr and portWillem de Bruijn
Load balance new TCP connections across nexthops also when they connect to the same service at a single remote address and port. This affects only port-based multipath hashing: fib_multipath_hash_policy 1 or 3. Local connections must choose both a source address and port when connecting to a remote service, in ip_route_connect. This "chicken-and-egg problem" (commit 2d7192d6cbab ("ipv4: Sanitize and simplify ip_route_{connect,newports}()")) is resolved by first selecting a source address, by looking up a route using the zero wildcard source port and address. As a result multiple connections to the same destination address and port have no entropy in fib_multipath_hash. This is not a problem when forwarding, as skb-based hashing has a 4-tuple. Nor when establishing UDP connections, as autobind there selects a port before reaching ip_route_connect. Load balance also TCP, by using a random port in fib_multipath_hash. Port assignment in inet_hash_connect is not atomic with ip_route_connect. Thus ports are unpredictable, effectively random. Implementation details: Do not actually pass a random fl4_sport, as that affects not only hashing, but routing more broadly, and can match a source port based policy route, which existing wildcard port 0 will not. Instead, define a new wildcard flowi flag that is used only for hashing. Selecting a random source is equivalent to just selecting a random hash entirely. But for code clarity, follow the normal 4-tuple hash process and only update this field. fib_multipath_hash can be reached with zero sport from other code paths, so explicitly pass this flowi flag, rather than trying to infer this case in the function itself. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250424143549.669426-3-willemdebruijn.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-29ipv4: prefer multipath nexthop that matches source addressWillem de Bruijn
With multipath routes, try to ensure that packets leave on the device that is associated with the source address. Avoid the following tcpdump example: veth0 Out IP 10.1.0.2.38640 > 10.2.0.3.8000: Flags [S] veth1 Out IP 10.1.0.2.38648 > 10.2.0.3.8000: Flags [S] Which can happen easily with the most straightforward setup: ip addr add 10.0.0.1/24 dev veth0 ip addr add 10.1.0.1/24 dev veth1 ip route add 10.2.0.3 nexthop via 10.0.0.2 dev veth0 \ nexthop via 10.1.0.2 dev veth1 This is apparently considered WAI, based on the comment in ip_route_output_key_hash_rcu: * 2. Moreover, we are allowed to send packets with saddr * of another iface. --ANK It may be ok for some uses of multipath, but not all. For instance, when using two ISPs, a router may drop packets with unknown source. The behavior occurs because tcp_v4_connect makes three route lookups when establishing a connection: 1. ip_route_connect calls to select a source address, with saddr zero. 2. ip_route_connect calls again now that saddr and daddr are known. 3. ip_route_newports calls again after a source port is also chosen. With a route with multiple nexthops, each lookup may make a different choice depending on available entropy to fib_select_multipath. So it is possible for 1 to select the saddr from the first entry, but 3 to select the second entry. Leading to the above situation. Address this by preferring a match that matches the flowi4 saddr. This will make 2 and 3 make the same choice as 1. Continue to update the backup choice until a choice that matches saddr is found. Do this in fib_select_multipath itself, rather than passing an fl4_oif constraint, to avoid changing non-multipath route selection. Commit e6b45241c57a ("ipv4: reset flowi parameters on route connect") shows how that may cause regressions. Also read ipv4.sysctl_fib_multipath_use_neigh only once. No need to refresh in the loop. This does not happen in IPv6, which performs only one lookup. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250424143549.669426-2-willemdebruijn.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-29nvme-pci: add quirks for WDC Blue SN550 15b7:5009Wentao Guan
Add two quirks for the WDC Blue SN550 (PCI ID 15b7:5009) based on user reports and hardware analysis: - NVME_QUIRK_NO_DEEPEST_PS: liaozw talked to me the problem and solved with nvme_core.default_ps_max_latency_us=0, so add the quirk. I also found some reports in the following link. - NVME_QUIRK_BROKEN_MSI: after get the lspci from Jack Rio. I think that the disk also have NVME_QUIRK_BROKEN_MSI. described in commit d5887dc6b6c0 ("nvme-pci: Add quirk for broken MSIs") as sean said in link which match the MSI 1/32 and MSI-X 17. Log: lspci -nn | grep -i memory 03:00.0 Non-Volatile memory controller [0108]: Sandisk Corp SanDisk Ultra 3D / WD PC SN530, IX SN530, Blue SN550 NVMe SSD (DRAM-less) [15b7:5009] (rev 01) lspci -v -d 15b7:5009 03:00.0 Non-Volatile memory controller: Sandisk Corp SanDisk Ultra 3D / WD PC SN530, IX SN530, Blue SN550 NVMe SSD (DRAM-less) (rev 01) (prog-if 02 [NVM Express]) Subsystem: Sandisk Corp WD Blue SN550 NVMe SSD Flags: bus master, fast devsel, latency 0, IRQ 35, IOMMU group 10 Memory at fe800000 (64-bit, non-prefetchable) [size=16K] Memory at fe804000 (64-bit, non-prefetchable) [size=256] Capabilities: [80] Power Management version 3 Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+ Capabilities: [b0] MSI-X: Enable+ Count=17 Masked- Capabilities: [c0] Express Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Capabilities: [150] Device Serial Number 00-00-00-00-00-00-00-00 Capabilities: [1b8] Latency Tolerance Reporting Capabilities: [300] Secondary PCI Express Capabilities: [900] L1 PM Substates Kernel driver in use: nvme dmesg | grep nvme [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.12.20-amd64-desktop-rolling root=UUID= ro splash quiet nvme_core.default_ps_max_latency_us=0 DEEPIN_GFXMODE= [ 0.059301] Kernel command line: BOOT_IMAGE=/vmlinuz-6.12.20-amd64-desktop-rolling root=UUID= ro splash quiet nvme_core.default_ps_max_latency_us=0 DEEPIN_GFXMODE= [ 0.542430] nvme nvme0: pci function 0000:03:00.0 [ 0.560426] nvme nvme0: allocated 32 MiB host memory buffer. [ 0.562491] nvme nvme0: 16/0/0 default/read/poll queues [ 0.567764] nvme0n1: p1 p2 p3 p4 p5 p6 p7 p8 p9 [ 6.388726] EXT4-fs (nvme0n1p7): mounted filesystem ro with ordered data mode. Quota mode: none. [ 6.893421] EXT4-fs (nvme0n1p7): re-mounted r/w. Quota mode: none. [ 7.125419] Adding 16777212k swap on /dev/nvme0n1p8. Priority:-2 extents:1 across:16777212k SS [ 7.157588] EXT4-fs (nvme0n1p6): mounted filesystem r/w with ordered data mode. Quota mode: none. [ 7.165021] EXT4-fs (nvme0n1p9): mounted filesystem r/w with ordered data mode. Quota mode: none. [ 8.036932] nvme nvme0: using unchecked data buffer [ 8.096023] block nvme0n1: No UUID available providing old NGUID Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d5887dc6b6c054d0da3cd053afc15b7be1f45ff6 Link: https://lore.kernel.org/all/20240422162822.3539156-1-sean.anderson@linux.dev/ Reported-by: liaozw <hedgehog-002@163.com> Closes: https://bbs.deepin.org.cn/post/286300 Reported-by: rugk <rugk+github@posteo.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=208123 Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-29nvme-pci: add quirks for device 126f:1001Wentao Guan
This commit adds NVME_QUIRK_NO_DEEPEST_PS and NVME_QUIRK_BOGUS_NID for device [126f:1001]. It is similar to commit e89086c43f05 ("drivers/nvme: Add quirks for device 126f:2262") Diff is according the dmesg, use NVME_QUIRK_IGNORE_DEV_SUBNQN. dmesg | grep -i nvme0: nvme nvme0: pci function 0000:01:00.0 nvme nvme0: missing or invalid SUBNQN field. nvme nvme0: 12/0/0 default/read/poll queues Link:https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e89086c43f0500bc7c4ce225495b73b8ce234c1f Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-29nvme-pci: fix queue unquiesce check on slot_resetKeith Busch
A zero return means the reset was successfully scheduled. We don't want to unquiesce the queues while the reset_work is pending, as that will just flush out requeued requests to a failed completion. Fixes: 71a5bb153be104 ("nvme: ensure disabling pairs with unquiesce") Reported-by: Dhankaran Singh Ajravat <dhankaran@meta.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2025-04-29ALSA: ump: Fix buffer overflow at UMP SysEx message conversionTakashi Iwai
The conversion function from MIDI 1.0 to UMP packet contains an internal buffer to keep the incoming MIDI bytes, and its size is 4, as it was supposed to be the max size for a MIDI1 UMP packet data. However, the implementation overlooked that SysEx is handled in a different format, and it can be up to 6 bytes, as found in do_convert_to_ump(). It leads eventually to a buffer overflow, and may corrupt the memory when a longer SysEx message is received. The fix is simply to extend the buffer size to 6 to fit with the SysEx UMP message. Fixes: 0b5288f5fe63 ("ALSA: ump: Add legacy raw MIDI support") Reported-by: Argusee <vr@darknavy.com> Link: https://patch.msgid.link/20250429124845.25128-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-04-29ublk: remove the check of ublk_need_req_ref() from __ublk_check_and_get_reqMing Lei
__ublk_check_and_get_req() is only called from ublk_check_and_get_req() and ublk_register_io_buf(), the same check has been covered in the two calling sites. So remove the check from __ublk_check_and_get_req(). Suggested-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250429022941.1718671-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-29ublk: enhance check for register/unregister io buffer commandMing Lei
The simple check of UBLK_IO_FLAG_OWNED_BY_SRV can avoid incorrect register/unregister io buffer easily, so check it before calling starting to register/un-register io buffer. Also only allow io buffer register/unregister uring_cmd in case of UBLK_F_SUPPORT_ZERO_COPY. Also mark argument 'ublk_queue *' of ublk_register_io_buf as const. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 1f6540e2aabb ("ublk: zc register/unregister bvec") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250429022941.1718671-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-29ublk: decouple zero copy from user copyMing Lei
UBLK_F_USER_COPY and UBLK_F_SUPPORT_ZERO_COPY are two different features, and shouldn't be coupled together. Commit 1f6540e2aabb ("ublk: zc register/unregister bvec") enables user copy automatically in case of UBLK_F_SUPPORT_ZERO_COPY, this way isn't correct. So decouple zero copy from user copy, and use independent helper to check each one. Fixes: 1f6540e2aabb ("ublk: zc register/unregister bvec") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250429022941.1718671-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-29selftests: ublk: fix UBLK_F_NEED_GET_DATAMing Lei
Commit 57e13a2e8cd2 ("selftests: ublk: support user recovery") starts to support UBLK_F_NEED_GET_DATA for covering recovery feature, however the ublk utility implementation isn't done correctly. Fix it by supporting UBLK_F_NEED_GET_DATA correctly. Also add test generic_07 for covering UBLK_F_NEED_GET_DATA. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 57e13a2e8cd2 ("selftests: ublk: support user recovery") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250429022941.1718671-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-29pinctrl: qcom: Fix PINGROUP definition for sm8750Maulik Shah
On newer SoCs intr_target_bit position is at 8 instead of 5. Fix it. Also add missing intr_wakeup_present_bit and intr_wakeup_enable_bit which enables forwarding of GPIO interrupts to parent PDC interrupt controller. Fixes: afe9803e3b82 ("pinctrl: qcom: Add sm8750 pinctrl driver") Signed-off-by: Maulik Shah <maulik.shah@oss.qualcomm.com> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Melody Olvera <melody.olvera@oss.qualcomm.com> Link: https://lore.kernel.org/20250429-pinctrl_sm8750-v2-1-87d45dd3bd82@oss.qualcomm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-04-29i2c: imx-lpi2c: Fix clock count when probe defersClark Wang
Deferred probe with pm_runtime_put() may delay clock disable, causing incorrect clock usage count. Use pm_runtime_put_sync() to ensure the clock is disabled immediately. Fixes: 13d6eb20fc79 ("i2c: imx-lpi2c: add runtime pm support") Signed-off-by: Clark Wang <xiaoning.wang@nxp.com> Signed-off-by: Carlos Song <carlos.song@nxp.com> Cc: <stable@vger.kernel.org> # v4.16+ Link: https://lore.kernel.org/r/20250421062341.2471922-1-carlos.song@nxp.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-04-28drm/xe/guc: Fix capture of steering registersJohn Harrison
The list of registers to capture on a GPU hang includes some that require steering. Unfortunately, the flag to say this was being wiped to due a missing OR on the assignment of the next flag field. Fix that. Fixes: b170d696c1e2 ("drm/xe/guc: Add XE_LP steered register lists") Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Link: https://lore.kernel.org/r/20250417195215.3002210-2-John.C.Harrison@Intel.com (cherry picked from commit 532da44b54a10d50ebad14a8a02bd0b78ec23e8b) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-28drm/xe/svm: fix dereferencing error pointer in drm_gpusvm_range_alloc()Harshit Mogalapalli
xe_svm_range_alloc() returns ERR_PTR(-ENOMEM) on failure and there is a dereference of "range" after that: --> range->gpusvm = gpusvm; In xe_svm_range_alloc(), when memory allocation fails return NULL instead to handle this situation. Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/adaef4dd-5866-48ca-bc22-4a1ddef20381@stanley.mountain/ Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250323124907.3946370-1-harshit.m.mogalapalli@oracle.com (cherry picked from commit 7a0322122cfdd9a6f10fc7701023d75c98eb3d22) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-29fs/erofs/fileio: call erofs_onlinefolio_split() after bio_add_folio()Max Kellermann
If bio_add_folio() fails (because it is full), erofs_fileio_scan_folio() needs to submit the I/O request via erofs_fileio_rq_submit() and allocate a new I/O request with an empty `struct bio`. Then it retries the bio_add_folio() call. However, at this point, erofs_onlinefolio_split() has already been called which increments `folio->private`; the retry will call erofs_onlinefolio_split() again, but there will never be a matching erofs_onlinefolio_end() call. This leaves the folio locked forever and all waiters will be stuck in folio_wait_bit_common(). This bug has been added by commit ce63cb62d794 ("erofs: support unencoded inodes for fileio"), but was practically unreachable because there was room for 256 folios in the `struct bio` - until commit 9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts") which reduced the array capacity to 16 folios. It was now trivial to trigger the bug by manually invoking readahead from userspace, e.g.: posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED); This should be fixed by invoking erofs_onlinefolio_split() only after bio_add_folio() has succeeded. This is safe: asynchronous completions invoking erofs_onlinefolio_end() will not unlock the folio because erofs_fileio_scan_folio() is still holding a reference to be released by erofs_onlinefolio_end() at the end. Fixes: ce63cb62d794 ("erofs: support unencoded inodes for fileio") Fixes: 9f74ae8c9ac9 ("erofs: shorten bvecs[] for file-backed mounts") Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Gao Xiang <xiang@kernel.org> Tested-by: Hongbo Li <lihongbo22@huawei.com> Link: https://lore.kernel.org/r/20250428230933.3422273-1-max.kellermann@ionos.com Signed-off-by: Gao Xiang <xiang@kernel.org>
2025-04-28bcachefs: Topology error after insert is now an EROKent Overstreet
A user hit this, and this will naturally be easier to debug if we don't panic. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28bcachefs: Use bch2_kvmalloc() for journal keys arrayKent Overstreet
We can hit this limit fairly easy when we have to reconstuct large amounts of alloc info on large filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28bcachefs: More informative error message when shutting down due to errorKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28bcachefs: btree_root_unreadable_and_scan_found_nothing autofix for non data ↵Kent Overstreet
btrees If loosing a btree won't cause data loss - i.e. it's an alloc btree, or we can easily reconstruct it - we shouldn't require user action to continue repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28scsi: ufs: core: Remove redundant query_complete traceKeoseong Park
The query_complete trace was not removed after ufshcd_issue_dev_cmd() was called from the bsg path, resulting in duplicate output. Below is an example of the trace: ufs-utils-773 [000] ..... 218.176933: ufshcd_upiu: query_send: 0000:00:04.0: HDR:16 00 00 1f 00 01 00 00 00 00 00 00, OSF:03 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ufs-utils-773 [000] ..... 218.177145: ufshcd_upiu: query_complete: 0000:00:04.0: HDR:36 00 00 1f 00 01 00 00 00 00 00 00, OSF:03 07 00 00 00 00 00 00 00 00 00 08 00 00 00 00 ufs-utils-773 [000] ..... 218.177146: ufshcd_upiu: query_complete: 0000:00:04.0: HDR:36 00 00 1f 00 01 00 00 00 00 00 00, OSF:03 07 00 00 00 00 00 00 00 00 00 08 00 00 00 00 Remove the redundant trace call in the bsg path, preventing duplication. Signed-off-by: Keoseong Park <keosung.park@samsung.com> Link: https://lore.kernel.org/r/20250425010605epcms2p67e89b351398832fe0fd547404d3afc65@epcms2p6 Fixes: 71aabb747d5f ("scsi: ufs: core: Reuse exec_dev_cmd") Reviewed-by: Avri Altman <avri.altman@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-04-28scsi: myrb: Fix spelling mistake "statux" -> "status"Colin Ian King
There is a spelling mistake in a dev_err() message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20250422170347.66792-1-colin.i.king@gmail.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-04-28net: ti: icssg-prueth: Add ICSSG FW StatsMD Danish Anwar
The ICSSG firmware maintains set of stats called PA_STATS. Currently the driver only dumps 4 stats. Add support for dumping more stats. The offset for different stats are defined as MACROs in icssg_switch_map.h file. All the offsets are for Slice0. Slice1 offsets are slice0 + 4. The offset calculation is taken care while reading the stats in emac_update_hardware_stats(). The statistics are documented in Documentation/networking/device_drivers/icssg_prueth.rst Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20250424095316.2643573-1-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28tools/Makefile: Add ynl targetJoe Damato
Add targets to build, clean, and install ynl headers, libynl.a, and python tooling. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250423204647.190784-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28rtase: Modify the format specifier in snprintf to %uJustin Lai
Modify the format specifier in snprintf to %u. Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425064057.30035-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28Merge tag 'v6.15-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - Fix three potential use after frees: in session logoff, in krb5 auth, and in RPC open - Fix missing rc check in session setup authentication * tag 'v6.15-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix use-after-free in session logoff ksmbd: fix use-after-free in kerberos authentication ksmbd: fix use-after-free in ksmbd_session_rpc_open smb: server: smb2pdu: check return value of xa_store()
2025-04-28Merge branch 'phase-out-hybrid-pci-devres-api'Jakub Kicinski
Philipp Stanner says: ==================== Phase out hybrid PCI devres API Fixes a number of minor issues with the usage of the PCI API in net. Notbaly, it replaces calls to the sometimes-managed pci_request_regions() to the always-managed pcim_request_all_regions(), enabling us to remove that hybrid functionality from PCI. ==================== Link: https://patch.msgid.link/20250425085740.65304-2-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: thunder_bgx: Don't disable PCI device manuallyPhilipp Stanner
thunder_bgx's PCI device is enabled with pcim_enable_device(), a managed devres function which ensures that the device gets enabled on driver detach automatically. Remove the calls to pci_disable_device(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-10-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: thunder_bgx: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-9-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: mdio: thunder: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-8-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: ethernet: sis900: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Daniele Venzano <venza@brownhat.org> Link: https://patch.msgid.link/20250425085740.65304-7-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: ethernet: natsemi: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-6-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: tulip: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-5-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: octeontx2: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patch.msgid.link/20250425085740.65304-4-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: prestera: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Acked-by: Elad Nachman <enachman@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-3-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28Merge branch 'intel-net-queue-100GbE'Jakub Kicinski
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-04-22 (ice, idpf) For ice: Paul removes setting of ICE_AQ_FLAG_RD in ice_get_set_tx_topo() on E830 devices. Xuanqiang Luo adds error check for NULL VF VSI. For idpf: Madhu fixes misreporting of, currently, unsupported encapsulated packets. ==================== Link: https://patch.msgid.link/20250425222636.3188441-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28idpf: fix offloads support for encapsulated packetsMadhu Chittim
Split offloads into csum, tso and other offloads so that tunneled packets do not by default have all the offloads enabled. Stateless offloads for encapsulated packets are not yet supported in firmware/software but in the driver we were setting the features same as non encapsulated features. Fixed naming to clarify CSUM bits are being checked for Tx. Inherit netdev features to VLAN interfaces as well. Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Tested-by: Zachary Goldstein <zachmgoldstein@google.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250425222636.3188441-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr()Xuanqiang Luo
As mentioned in the commit baeb705fd6a7 ("ice: always check VF VSI pointer values"), we need to perform a null pointer check on the return value of ice_get_vf_vsi() before using it. Fixes: 6ebbe97a4881 ("ice: Add a per-VF limit on number of FDIR filters") Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250425222636.3188441-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28ice: fix Get Tx Topology AQ command error on E830Paul Greenwalt
The Get Tx Topology AQ command (opcode 0x0418) has different read flag requirements depending on the hardware/firmware. For E810, E822, and E823 firmware the read flag must be set, and for newer hardware (E825 and E830) it must not be set. This results in failure to configure Tx topology and the following warning message during probe: DDP package does not support Tx scheduling layers switching feature - please update to the latest DDP package and try again The current implementation only handles E825-C but not E830. It is confusing as we first check ice_is_e825c() and then set the flag in the set case. Finally, we check ice_is_e825c() again and set the flag for all other hardware in both the set and get case. Instead, notice that we always need the read flag for set, but only need the read flag for get on E810, E822, and E823 firmware. Fix the logic to check the MAC type and set the read flag in get only on the older devices which require it. Fixes: ba1124f58afd ("ice: Add E830 device IDs, MAC type and registers") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250425222636.3188441-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28Merge branch 'net_sched-adapt-qdiscs-for-reentrant-enqueue-cases'Jakub Kicinski
Victor Nogueira says: ==================== net_sched: Adapt qdiscs for reentrant enqueue cases As described in Gerrard's report [1], there are cases where netem can make the qdisc enqueue callback reentrant. Some qdiscs (drr, hfsc, ets, qfq) break whenever the enqueue callback has reentrant behaviour. This series addresses these issues by adding extra checks that cater for these reentrant corner cases. This series has passed all relevant test cases in the TDC suite. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ ==================== Link: https://patch.msgid.link/20250425220710.3964791-1-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28selftests: tc-testing: Add TDC tests that exercise reentrant enqueue behaviourVictor Nogueira
Add 5 TDC tests that exercise the reentrant enqueue behaviour in drr, ets, qfq, and hfsc: - Test DRR's enqueue reentrant behaviour with netem (which caused a double list add) - Test ETS's enqueue reentrant behaviour with netem (which caused a double list add) - Test QFQ's enqueue reentrant behaviour with netem (which caused a double list add) - Test HFSC's enqueue reentrant behaviour with netem (which caused a UAF) - Test nested DRR's enqueue reentrant behaviour with netem (which caused a double list add) Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-6-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net_sched: qfq: Fix double list add in class with netem as child qdiscVictor Nogueira
As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of qfq, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. This patch checks whether the class was already added to the agg->active list (cl_is_active) before doing the addition to cater for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-5-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net_sched: ets: Fix double list add in class with netem as child qdiscVictor Nogueira
As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of ets, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before doing the addition to cater for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-4-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net_sched: hfsc: Fix a UAF vulnerability in class with netem as child qdiscVictor Nogueira
As described in Gerrard's report [1], we have a UAF case when an hfsc class has a netem child qdisc. The crux of the issue is that hfsc is assuming that checking for cl->qdisc->q.qlen == 0 guarantees that it hasn't inserted the class in the vttree or eltree (which is not true for the netem duplicate case). This patch checks the n_active class variable to make sure that the code won't insert the class in the vttree or eltree twice, catering for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-3-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net_sched: drr: Fix double list add in class with netem as child qdiscVictor Nogueira
As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of drr, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before adding to the list to cover for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-2-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28pds_core: remove write-after-free of client_idShannon Nelson
A use-after-free error popped up in stress testing: [Mon Apr 21 21:21:33 2025] BUG: KFENCE: use-after-free write in pdsc_auxbus_dev_del+0xef/0x160 [pds_core] [Mon Apr 21 21:21:33 2025] Use-after-free write at 0x000000007013ecd1 (in kfence-#47): [Mon Apr 21 21:21:33 2025] pdsc_auxbus_dev_del+0xef/0x160 [pds_core] [Mon Apr 21 21:21:33 2025] pdsc_remove+0xc0/0x1b0 [pds_core] [Mon Apr 21 21:21:33 2025] pci_device_remove+0x24/0x70 [Mon Apr 21 21:21:33 2025] device_release_driver_internal+0x11f/0x180 [Mon Apr 21 21:21:33 2025] driver_detach+0x45/0x80 [Mon Apr 21 21:21:33 2025] bus_remove_driver+0x83/0xe0 [Mon Apr 21 21:21:33 2025] pci_unregister_driver+0x1a/0x80 The actual device uninit usually happens on a separate thread scheduled after this code runs, but there is no guarantee of order of thread execution, so this could be a problem. There's no actual need to clear the client_id at this point, so simply remove the offending code. Fixes: 10659034c622 ("pds_core: add the aux client API") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250425203857.71547-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>