Age | Commit message | Author
2025-01-06 | net/mlx5: HWS, separate SQ that HWS uses from the usual traffic SQs | Yevgeny Kliteynik
Mark the HWS SQ as 'non_wire' so that 'Flow Update' flow won't mix with network traffic. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, num_of_rules counter on matcher should be atomic | Yevgeny Kliteynik
The rule counter in the matcher's struct is used in two places: 1. As a heuristic to decide when the number of rules has crossed a certain percentage threshold and the matcher should be resized. We don't mind here if the number is off by 1-2 due to concurrency. 2. When destroying a matcher, the counter value is checked and the user is warned if it is not 0. Here we lock all the queues, so the counter will be correct. We don't need to always have the *exact* number, but we do need this number to not be corrupted, which is what happens when the counter isn't atomic and is updated by different threads. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
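The counter usage described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: the struct, field, and function names are hypothetical, and C11 `<stdatomic.h>` stands in for the kernel's own `atomic_t` API, which the driver would use instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the matcher; not the driver's real struct. */
struct matcher_sketch {
	atomic_uint num_of_rules;  /* updated concurrently from several queues */
	unsigned int capacity;     /* rule slots currently available */
};

/* Relaxed ordering is enough: the value only has to stay uncorrupted. */
static void matcher_rule_added(struct matcher_sketch *m)
{
	atomic_fetch_add_explicit(&m->num_of_rules, 1, memory_order_relaxed);
}

static void matcher_rule_removed(struct matcher_sketch *m)
{
	unsigned int prev;

	prev = atomic_fetch_sub_explicit(&m->num_of_rules, 1, memory_order_relaxed);
	assert(prev > 0); /* removing from an empty matcher would be a bug */
}

/* Resize heuristic: being off by one or two rules is acceptable here. */
static bool matcher_needs_resize(struct matcher_sketch *m, unsigned int percent)
{
	unsigned int n = atomic_load_explicit(&m->num_of_rules, memory_order_relaxed);

	return (unsigned long long)n * 100 >= (unsigned long long)m->capacity * percent;
}
```

A plain `unsigned int` updated with `++` from several threads can lose or tear updates; the atomic read-modify-write avoids that without taking a lock on the fast path.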
2025-01-06 | net/mlx5: HWS, reduce memory consumption of a matcher struct | Yevgeny Kliteynik
Instead of having a large array of action templates allocated with kmalloc, use a smaller array and allocate it with kvmalloc. The size of the array represents the max number of AT attach operations for the same matcher. This number is not expected to be very high. In any case, when the limit is reached, the next attempt to attach a new AT will result in the creation of a new matcher and moving all the rules to this matcher. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, remove wrong deletion of the miss table list | Yevgeny Kliteynik
Remove wrong cleanup of the old miss table list and simplify the error flow in the function. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, change error flow on matcher disconnect | Yevgeny Kliteynik
Currently, when firmware failure occurs during matcher disconnect flow, the error flow of the function reconnects the matcher back and returns an error, which continues running the calling function and eventually frees the matcher that is being disconnected. This leads to a case where we have a freed matcher on the matchers list, which in turn leads to use-after-free and eventual crash. This patch fixes that by not trying to reconnect the matcher back when some FW command fails during disconnect. Note that we're dealing here with FW error. We can't overcome this problem. This might lead to bad steering state (e.g. wrong connection between matchers), and will also lead to resource leakage, as it is the case with any other error handling during resource destruction. However, the goal here is to allow the driver to continue and not crash the machine with use-after-free error. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, add error message on failure to move rules | Yevgeny Kliteynik
Add error message for failure to move rules from old matcher to new one during rehash. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, simplify allocations as we support only FDB | Yevgeny Kliteynik
In pools, STCs and actions: no need to allocate array for various table types, as HWS is used to manage only FDB flow tables. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, denote how refcounts are protected | Yevgeny Kliteynik
Some HWS structs have refcounts that are just u32. Comment how they are protected and add '__must_hold()' annotation where applicable. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, remove implementation of unused FW commands | Yevgeny Kliteynik
Remove functions that manage alias objects - they are not used. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net/mlx5: HWS, remove the use of duplicated structs | Yevgeny Kliteynik
Remove definition in HWS of structs that are already defined in mlx5_ifc.h, and fix the usage of these structs. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250102181415.1477316-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | Merge branch 'net-pcs-add-supported_interfaces-bitmap-for-pcs' | Jakub Kicinski
Russell King says: ==================== net: pcs: add supported_interfaces bitmap for PCS This series adds supported_interfaces for PCS, which gives MAC code a way to determine the interface modes that the PCS supports without having to implement functions such as xpcs_get_interfaces(), or workarounds such as in https://lore.kernel.org/20241213090526.71516-3-maxime.chevallier@bootlin.com Patch 1 adds the new bitmask to struct phylink_pcs, and code within phylink to validate that the PCS returned by the MAC driver supports the interface mode - but only if this bitmask is non-empty. Patches 2 through 4 fill in the interface modes for XPCS, Mediatek LynxI and Lynx PCS. Patch 5 adds support to stmmac to make use of this bitmask when filling in phylink_config.supported_interfaces, eliminating the call to xpcs_get_interfaces(). As xpcs_get_interfaces() is now unused outside of pcs-xpcs.c, patch 6 makes this function static and removes it from the header file. ==================== Link: https://patch.msgid.link/Z3fG9oTY9F9fCYHv@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net: pcs: xpcs: make xpcs_get_interfaces() static | Russell King (Oracle)
xpcs_get_interfaces() should no longer be used outside of the XPCS code, so make it static. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tTffk-007Roi-JM@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net: stmmac: use PCS supported_interfaces | Russell King (Oracle)
Use the PCS' supported_interfaces member to build the MAC level supported_interfaces bitmap. Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tTfff-007Roc-Ff@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net: pcs: lynx: fill in PCS supported_interfaces | Russell King (Oracle)
Fill in the new PCS supported_interfaces member with the interfaces that Lynx supports. Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tTffa-007RoV-Bo@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net: pcs: mtk-lynxi: fill in PCS supported_interfaces | Russell King (Oracle)
Fill in the new PCS supported_interfaces member with the interfaces that the Mediatek LynxI supports. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tTffV-007RoP-8D@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net: pcs: xpcs: fill in PCS supported_interfaces | Russell King (Oracle)
Fill in the new PCS supported_interfaces member with the interfaces that XPCS supports. Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tTffQ-007RoJ-4u@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | net: phylink: add support for PCS supported_interfaces bitmap | Russell King (Oracle)
Add support for the PCS to specify which interfaces it supports, which can be used by MAC drivers to build the main supported_interfaces bitmap. Phylink also validates that the PCS returned by the MAC driver supports the interface that the MAC was asked for. An empty supported_interfaces bitmap from the PCS indicates that it does not provide this information, and we handle that appropriately. Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tTffL-007RoD-1Y@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
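The validation rule described here (check the requested mode against the PCS bitmap, but treat an empty bitmap as "no information") can be sketched as follows. The enum, struct, and function names are hypothetical, and a single `unsigned long` stands in for the kernel's bitmap type; this is not phylink's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface modes; phylink uses its own phy_interface_t values. */
enum iface_sketch {
	IFACE_SGMII,
	IFACE_1000BASEX,
	IFACE_2500BASEX,
	IFACE_10GBASER,
	IFACE_MAX
};

/* Hypothetical PCS: bit n set means mode n is supported; 0 means "not reported". */
struct pcs_sketch {
	unsigned long supported_interfaces;
};

static bool pcs_interface_ok(const struct pcs_sketch *pcs, enum iface_sketch mode)
{
	assert(mode < IFACE_MAX);

	/* An empty bitmap carries no information, so the check is skipped. */
	if (pcs->supported_interfaces == 0)
		return true;

	return (pcs->supported_interfaces >> mode) & 1UL;
}
```

Treating the empty bitmap as "unknown" rather than "supports nothing" keeps PCS drivers that have not been converted working unchanged.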
2025-01-06 | net: hsr: remove one synchronize_rcu() from hsr_del_port() | Eric Dumazet
Use kfree_rcu() instead of synchronize_rcu()+kfree(). This might allow syzbot to fuzz HSR a bit faster... Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250103101148.3594545-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | pds_core: limit loop over fw name list | Shannon Nelson
Add an array size limit to the for-loop to be sure we don't try to reference a fw_version string off the end of the fw info names array. We know that our firmware only has a limited number of firmware slot names, but we shouldn't leave this unchecked. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
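A minimal sketch of the bounded loop this commit describes, assuming a device-reported slot count that may exceed the size of the driver's name table. The slot names, the table, and the function name are placeholders, not the driver's real identifiers.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder slot names; the real driver has its own table. */
static const char *const fw_slot_names[] = { "slot-a", "slot-b", "slot-c" };

/*
 * Walk the slots the device claims to have, but never index past the end
 * of the name table, whatever count the device reports.
 */
static size_t count_nameable_slots(size_t num_fw_slots)
{
	size_t i, listed = 0;

	for (i = 0; i < num_fw_slots && i < ARRAY_SIZE(fw_slot_names); i++) {
		assert(fw_slot_names[i] != NULL);
		listed++;
	}

	return listed;
}
```

The second loop condition is the whole fix in this sketch: the device-supplied count is treated as untrusted input rather than as a guarantee.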
2025-01-06 | ax25: rcu protect dev->ax25_ptr | Eric Dumazet
syzbot found a lockdep issue [1].

We should remove ax25 RTNL dependency in ax25_setsockopt()

This should also fix a variety of possible UAF in ax25.

[1]

WARNING: possible circular locking dependency detected
6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0 Not tainted
------------------------------------------------------
syz.5.1818/12806 is trying to acquire lock:
ffffffff8fcb3988 (rtnl_mutex){+.+.}-{4:4}, at: ax25_setsockopt+0xa55/0xe90 net/ax25/af_ax25.c:680

but task is already holding lock:
ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1618 [inline]
ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: ax25_setsockopt+0x209/0xe90 net/ax25/af_ax25.c:574

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (sk_lock-AF_AX25){+.+.}-{0:0}:
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
  lock_sock_nested+0x48/0x100 net/core/sock.c:3642
  lock_sock include/net/sock.h:1618 [inline]
  ax25_kill_by_device net/ax25/af_ax25.c:101 [inline]
  ax25_device_event+0x24d/0x580 net/ax25/af_ax25.c:146
  notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
  __dev_notify_flags+0x207/0x400
  dev_change_flags+0xf0/0x1a0 net/core/dev.c:9026
  dev_ifsioc+0x7c8/0xe70 net/core/dev_ioctl.c:563
  dev_ioctl+0x719/0x1340 net/core/dev_ioctl.c:820
  sock_do_ioctl+0x240/0x460 net/socket.c:1234
  sock_ioctl+0x626/0x8e0 net/socket.c:1339
  vfs_ioctl fs/ioctl.c:51 [inline]
  __do_sys_ioctl fs/ioctl.c:906 [inline]
  __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (rtnl_mutex){+.+.}-{4:4}:
  check_prev_add kernel/locking/lockdep.c:3161 [inline]
  check_prevs_add kernel/locking/lockdep.c:3280 [inline]
  validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
  __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
  __mutex_lock_common kernel/locking/mutex.c:585 [inline]
  __mutex_lock+0x1ac/0xee0 kernel/locking/mutex.c:735
  ax25_setsockopt+0xa55/0xe90 net/ax25/af_ax25.c:680
  do_sock_setsockopt+0x3af/0x720 net/socket.c:2324
  __sys_setsockopt net/socket.c:2349 [inline]
  __do_sys_setsockopt net/socket.c:2355 [inline]
  __se_sys_setsockopt net/socket.c:2352 [inline]
  __x64_sys_setsockopt+0x1ee/0x280 net/socket.c:2352
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sk_lock-AF_AX25);
                               lock(rtnl_mutex);
                               lock(sk_lock-AF_AX25);
  lock(rtnl_mutex);

 *** DEADLOCK ***

1 lock held by syz.5.1818/12806:
 #0: ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1618 [inline]
 #0: ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: ax25_setsockopt+0x209/0xe90 net/ax25/af_ax25.c:574

stack backtrace:
CPU: 1 UID: 0 PID: 12806 Comm: syz.5.1818 Not tainted 6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:94 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
  print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
  check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
  check_prev_add kernel/locking/lockdep.c:3161 [inline]
  check_prevs_add kernel/locking/lockdep.c:3280 [inline]
  validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
  __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
  __mutex_lock_common kernel/locking/mutex.c:585 [inline]
  __mutex_lock+0x1ac/0xee0 kernel/locking/mutex.c:735
  ax25_setsockopt+0xa55/0xe90 net/ax25/af_ax25.c:680
  do_sock_setsockopt+0x3af/0x720 net/socket.c:2324
  __sys_setsockopt net/socket.c:2349 [inline]
  __do_sys_setsockopt net/socket.c:2355 [inline]
  __se_sys_setsockopt net/socket.c:2352 [inline]
  __x64_sys_setsockopt+0x1ee/0x280 net/socket.c:2352
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7b62385d29

Fixes: c433570458e4 ("ax25: fix a use-after-free in ax25_fillin_cb()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250103210514.87290-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | sctp: Prepare sctp_v4_get_dst() to dscp_t conversion. | Guillaume Nault
Define inet_sk_dscp() to get a dscp_t value from struct inet_sock, so that sctp_v4_get_dst() can easily set ->flowi4_tos from a dscp_t variable. For the SCTP_DSCP_SET_MASK case, we can just use inet_dsfield_to_dscp() to get a dscp_t value. Then, when converting ->flowi4_tos from __u8 to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() conversion function. Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/1a645f4a0bc60ad18e7c0916642883ce8a43c013.1735835456.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
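The idea behind a dedicated DSCP type can be sketched in plain C. This is an illustration only: the type and helper names are hypothetical, a wrapper struct stands in for the kernel's sparse-checked `dscp_t`, and the 0xfc mask is an assumption based on DSCP occupying the upper six bits of the DS field.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask: the upper six bits carry DSCP, the low two bits carry ECN. */
#define DSCP_MASK_SKETCH 0xfc

/* Wrapper type so a raw DS field byte cannot be passed where DSCP is expected. */
typedef struct {
	uint8_t v;
} dscp_sketch_t;

/* Convert a raw DS field (TOS byte) to the DSCP type, dropping the ECN bits. */
static dscp_sketch_t dsfield_to_dscp(uint8_t dsfield)
{
	dscp_sketch_t d = { (uint8_t)(dsfield & DSCP_MASK_SKETCH) };

	return d;
}

/* Convert back to a raw byte for fields that have not been converted yet. */
static uint8_t dscp_to_dsfield(dscp_sketch_t dscp)
{
	return dscp.v;
}
```

Once the destination field itself holds the DSCP type, the back-conversion call can simply be dropped at each call site, which is the staged approach the commit message outlines.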
2025-01-06 | Merge branch 'igc-deadcoding' | Jakub Kicinski
Dr. David Alan Gilbert says: ==================== igc deadcoding This set removes some functions that are entirely unused and have been since ~2018. ==================== Link: https://patch.msgid.link/20250102174142.200700-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | igc: Remove unused igc_read/write_pcie_cap_reg | Dr. David Alan Gilbert
The last uses of igc_read_pcie_cap_reg() and igc_write_pcie_cap_reg() were removed in 2019 by commit 16ecd8d9af26 ("igc: Remove the obsolete workaround") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102174142.200700-4-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | igc: Remove unused igc_read/write_pci_cfg wrappers | Dr. David Alan Gilbert
igc_read_pci_cfg() and igc_write_pci_cfg were added in 2018 as part of commit 146740f9abc4 ("igc: Add support for PF") but have remained unused. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102174142.200700-3-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | igc: Remove unused igc_acquire/release_nvm | Dr. David Alan Gilbert
igc_acquire_nvm() and igc_release_nvm() were added in 2018 as part of commit ab4056126813 ("igc: Add NVM support") but never used. Remove them. The igc_1225.c has its own specific implementations. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102174142.200700-2-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | Merge branch 'i40e-deadcoding' | Jakub Kicinski
Dr. David Alan Gilbert says: ==================== i40e deadcoding This is a bunch of deadcoding of functions that are entirely uncalled in the i40e driver. Build tested only. ==================== Link: https://patch.msgid.link/20250102173717.200359-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_dcb_hw_get_num_tc | Dr. David Alan Gilbert
The last use of i40e_dcb_hw_get_num_tc() was removed in 2022 by commit fe20371578ef ("Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC"") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-10-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_asq_send_command_v2 | Dr. David Alan Gilbert
i40e_asq_send_command_v2() was added in 2022 by commit 74073848b0d7 ("i40e: Add new versions of send ASQ command functions") but hasn't been used. Remove it. (The _atomic_v2 version of the function is used, so leave it). Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-9-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_commit_partition_bw_setting | Dr. David Alan Gilbert
i40e_commit_partition_bw_setting() was added in 2017 by commit 4fc8c6763957 ("i40e: genericize the partition bandwidth control") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-8-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_del_filter | Dr. David Alan Gilbert
The last use of i40e_del_filter() was removed in 2016 by commit 9569a9a4547d ("i40e: when adding or removing MAC filters, correctly handle VLANs") Remove it. Fix up a comment that referenced it. Note: The __ version of this function is still used. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-7-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_get_cur_guaranteed_fd_count | Dr. David Alan Gilbert
The last use of i40e_get_cur_guaranteed_fd_count() was removed in 2015 by commit 04294e38a451 ("i40e: FD filters flush policy changes") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-6-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Deadcode profile code | Dr. David Alan Gilbert
i40e_add_pinfo_to_list() was added in 2017 by commit 1d5c960c5ef5 ("i40e: new AQ commands") i40e_find_section_in_profile() was added in 2019 by commit cdc594e00370 ("i40e: Implement DDP support in i40e driver") Neither has been used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-5-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_(read|write)_phy_register | Dr. David Alan Gilbert
i40e_read_phy_register() and i40e_write_phy_register() were added in 2016 by commit f62ba91458b5 ("i40e: Add functions which apply correct PHY access method for read and write operation") but haven't been used. Remove them. (There are more specific _clause* variants of these functions that are still used.) Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-4-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Remove unused i40e_blink_phy_link_led | Dr. David Alan Gilbert
i40e_blink_phy_link_led() was added in 2016 by commit fd077cd3399b ("i40e: Add functions to blink led on 10GBaseT PHY") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-3-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | i40e: Deadcode i40e_aq_* | Dr. David Alan Gilbert
i40e_aq_add_mirrorrule(), i40e_aq_delete_mirrorrule() and i40e_aq_set_vsi_vlan_promisc() were added in 2016 by commit 7bd6875bef70 ("i40e: APIs to Add/remove port mirroring rules") but haven't been used. They were the last user of i40e_mirrorrule_op(). i40e_aq_rearrange_nvm() was added in 2018 by commit f05798b4ff82 ("i40e: Add AQ command for rearrange NVM structure") but hasn't been used. i40e_aq_restore_lldp() was added in 2019 by commit c65e78f87f81 ("i40e: Further implementation of LLDP") but hasn't been used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-2-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 | Merge tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs | Linus Torvalds
Pull vfs fixes from Christian Brauner:

 - Relax assertions on failure to encode file handles

   The ->encode_fh() method can fail for various reasons. None of them warrant a WARN_ON().

 - Fix overlayfs file handle encoding by allowing encoding an fid from an inode without an alias

 - Make sure fuse_dir_open() handles FOPEN_KEEP_CACHE. If it's not specified fuse needs to invalidate the directory inode page cache

 - Fix qnx6 so it builds with gcc-15

 - Various fixes for netfslib and ceph and nfs filesystems:
    - Ignore silly rename files from afs and nfs when building header archives
    - Fix read result collection in netfslib with multiple subrequests
    - Handle ENOMEM for netfslib buffered reads
    - Fix oops in nfs_netfs_init_request()
    - Parse the secctx command immediately in cachefiles
    - Remove a redundant smp_rmb() in netfslib
    - Handle recursion in read retry in netfslib
    - Fix clearing of folio_queue
    - Fix missing cancellation of copy-to_cache when the cache for a file is temporarily disabled in netfslib

 - Sanity check the hfs root record

 - Fix zero padding data issues in concurrent write scenarios

 - Fix is_mnt_ns_file() after converting nsfs to path_from_stashed()

 - Fix missing declaration of init_files

 - Increase I/O priority when writing revoke records in jbd2

 - Flush filesystem device before updating tail sequence in jbd2

* tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
  ovl: support encoding fid from inode with no alias
  ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
  fuse: respect FOPEN_KEEP_CACHE on opendir
  netfs: Fix is-caching check in read-retry
  netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled
  netfs: Fix ceph copy to cache on write-begin
  netfs: Work around recursion by abandoning retry if nothing read
  netfs: Fix missing barriers by using clear_and_wake_up_bit()
  netfs: Remove redundant use of smp_rmb()
  cachefiles: Parse the "secctx" immediately
  nfs: Fix oops in nfs_netfs_init_request() when copying to cache
  netfs: Fix enomem handling in buffered reads
  netfs: Fix non-contiguous donation between completed reads
  kheaders: Ignore silly-rename files
  fs: relax assertions on failure to encode file handles
  fs: fix missing declaration of init_files
  fs: fix is_mnt_ns_file()
  iomap: fix zero padding data issue in concurrent append writes
  iomap: pass byte granular end position to iomap_add_to_ioend
  jbd2: flush filesystem device before updating tail sequence
  ...
2025-01-06 | btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path | Mikhail Zaslonko
Since the input data length passed to zlib_compress_folios() can be arbitrary, always setting strm.avail_in to a multiple of PAGE_SIZE may cause read-in bytes to exceed the input range. Currently this triggers an assert in btrfs_compress_folios() on the debug kernel (see below). Fix strm.avail_in calculation for S390 hardware acceleration path. assertion failed: *total_in <= orig_len, in fs/btrfs/compression.c:1041 ------------[ cut here ]------------ kernel BUG at fs/btrfs/compression.c:1041! monitor event: 0040 ilc:2 [#1] PREEMPT SMP CPU: 16 UID: 0 PID: 325 Comm: kworker/u273:3 Not tainted 6.13.0-20241204.rc1.git6.fae3b21430ca.300.fc41.s390x+debug #1 Hardware name: IBM 3931 A01 703 (z/VM 7.4.0) Workqueue: btrfs-delalloc btrfs_work_helper Krnl PSW : 0704d00180000000 0000021761df6538 (btrfs_compress_folios+0x198/0x1a0) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 Krnl GPRS: 0000000080000000 0000000000000001 0000000000000047 0000000000000000 0000000000000006 ffffff01757bb000 000001976232fcc0 000000000000130c 000001976232fcd0 000001976232fcc8 00000118ff4a0e30 0000000000000001 00000111821ab400 0000011100000000 0000021761df6534 000001976232fb58 Krnl Code: 0000021761df6528: c020006f5ef4 larl %r2,0000021762be2310 0000021761df652e: c0e5ffbd09d5 brasl %r14,00000217615978d8 #0000021761df6534: af000000 mc 0,0 >0000021761df6538: 0707 bcr 0,%r7 0000021761df653a: 0707 bcr 0,%r7 0000021761df653c: 0707 bcr 0,%r7 0000021761df653e: 0707 bcr 0,%r7 0000021761df6540: c004004bb7ec brcl 0,000002176276d518 Call Trace: [<0000021761df6538>] btrfs_compress_folios+0x198/0x1a0 ([<0000021761df6534>] btrfs_compress_folios+0x194/0x1a0) [<0000021761d97788>] compress_file_range+0x3b8/0x6d0 [<0000021761dcee7c>] btrfs_work_helper+0x10c/0x160 [<0000021761645760>] process_one_work+0x2b0/0x5d0 [<000002176164637e>] worker_thread+0x20e/0x3e0 [<000002176165221a>] kthread+0x15a/0x170 [<00000217615b859c>] __ret_from_fork+0x3c/0x60 [<00000217626e72d2>] ret_from_fork+0xa/0x38 INFO: lockdep is 
turned off. Last Breaking-Event-Address: [<0000021761597924>] _printk+0x4c/0x58 Kernel panic - not syncing: Fatal exception: panic_on_oops Fixes: fd1e75d0105d ("btrfs: make compression path to be subpage compatible") CC: stable@vger.kernel.org # 6.12+ Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
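The avail_in problem described in this commit comes down to clamping each chunk to the bytes that actually remain. A minimal sketch, assuming a hypothetical helper and parameter names; this is not the code from the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many input bytes to hand to the compressor next.
 * orig_len is the total input length, consumed is what has been fed so far,
 * and chunk is the buffer size (typically a multiple of the page size).
 */
static size_t next_avail_in(size_t orig_len, size_t consumed, size_t chunk)
{
	size_t left;

	assert(consumed <= orig_len);
	left = orig_len - consumed;

	/* Never advertise more input than remains, even if the buffer is larger. */
	return left < chunk ? left : chunk;
}
```

With a 10000-byte input and 4096-byte chunks, the last call yields 1808 bytes rather than a full 4096, so the total read-in count cannot exceed the original length.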
2025-01-06 | btrfs: zoned: calculate max_extent_size properly on non-zoned setup | Christoph Hellwig
Since commit 559218d43ec9 ("block: pre-calculate max_zone_append_sectors"), queue_limits' max_zone_append_sectors defaults to 0 and is only updated when there is a zoned device. So, we have lim->max_zone_append_sectors = 0 when there is no zoned device in the filesystem. That leads to fs_info->max_zone_append_size and thus fs_info->max_extent_size being 0, which is wrong and can for example lead to a divide by zero in count_max_extents(). Fix this by only capping fs_info->max_extent_size to fs_info->max_zone_append_size when it is non-zero. Based on a patch from Naohiro Aota <naohiro.aota@wdc.com>, from which much of this commit message is stolen as well. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: 559218d43ec9 ("block: pre-calculate max_zone_append_sectors") Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
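The capping rule from this commit message can be sketched as below. The function names and the 128 MiB default used in the usage note are illustrative assumptions, not the filesystem's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cap the extent size by the zone append limit only when that limit is
 * non-zero; zero means "no zoned device", not "limit of zero bytes".
 */
static uint64_t effective_max_extent_size(uint64_t max_extent_size,
					  uint64_t max_zone_append_size)
{
	if (max_zone_append_size && max_zone_append_size < max_extent_size)
		return max_zone_append_size;

	return max_extent_size;
}

/* The kind of rounding-up division that breaks when the divisor is 0. */
static uint64_t count_extents_sketch(uint64_t size, uint64_t max_extent_size)
{
	assert(max_extent_size != 0);

	return (size + max_extent_size - 1) / max_extent_size;
}
```

For example, with an assumed 128 MiB default and no zoned device (limit 0), the effective size stays at 128 MiB, so a range one byte larger than that counts as two extents instead of triggering a division by zero.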
2025-01-06 | btrfs: avoid NULL pointer dereference if no valid extent tree | Qu Wenruo
[BUG] Syzbot reported a crash with the following call trace: BTRFS info (device loop0): scrub: started on devid 1 BUG: kernel NULL pointer dereference, address: 0000000000000208 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G O 6.13.0-rc4-custom+ #206 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022 RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs] Call Trace: <TASK> scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs] scrub_simple_mirror+0x175/0x260 [btrfs] scrub_stripe+0x5d4/0x6c0 [btrfs] scrub_chunk+0xbb/0x170 [btrfs] scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs] btrfs_scrub_dev+0x240/0x600 [btrfs] btrfs_ioctl+0x1dc8/0x2fa0 [btrfs] ? do_sys_openat2+0xa5/0xf0 __x64_sys_ioctl+0x97/0xc0 do_syscall_64+0x4f/0x120 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> [CAUSE] The reproducer is using a corrupted image where extent tree root is corrupted, thus forcing to use "rescue=all,ro" mount option to mount the image. Then it triggered a scrub, but since scrub relies on extent tree to find where the data/metadata extents are, scrub_find_fill_first_stripe() relies on an non-empty extent root. But unfortunately scrub_find_fill_first_stripe() doesn't really expect an NULL pointer for extent root, it use extent_root to grab fs_info and triggered a NULL pointer dereference. [FIX] Add an extra check for a valid extent root at the beginning of scrub_find_fill_first_stripe(). The new error path is introduced by 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots"), but that's pretty old, and later commit b979547513ff ("btrfs: scrub: introduce helper to find and fill sector info for a scrub_stripe") changed how we do scrub. So for kernels older than 6.6, the fix will need manual backport. 
Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/67756935.050a0220.25abdd.0a12.GAE@google.com/ Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots") Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
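The guard described in [FIX] can be illustrated with a small userspace model. This is only a sketch: `struct fs_info`, `struct root`, `find_fill_first_stripe()` and the `-EINVAL` return value are hypothetical stand-ins, not the real btrfs types or the error code the patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the btrfs types). */
struct fs_info { int id; };
struct root { struct fs_info *fs_info; };

/*
 * Model of the early check: when the extent root could not be loaded
 * (for example after mounting with rescue=all,ro), return an error
 * before anything dereferences extent_root.
 * The error code is an assumption chosen for this sketch.
 */
static int find_fill_first_stripe(const struct root *extent_root)
{
	if (!extent_root)
		return -EINVAL;

	/* In this model a loaded root always carries its fs_info. */
	assert(extent_root->fs_info != NULL);
	return 0;
}
```

Calling `find_fill_first_stripe(NULL)` in this model returns an error instead of crashing, which is the behaviour the commit asks for.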
2025-01-06Merge tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio fix from Alex Williamson: - Fix a missed order alignment requirement of the pfn when inserting mappings through the new huge fault handler introduced in v6.12 (Alex Williamson) * tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfio: vfio/pci: Fallback huge faults for unaligned pfn
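As a rough illustration of the alignment requirement, the sketch below checks that a pfn is a multiple of 2^order pages before a huge mapping is attempted, and falls back otherwise. The names `pfn_order_aligned()`, `huge_fault()` and the `fault_result` enum are hypothetical; the real handler works on kernel fault structures that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes for this model only. */
enum fault_result { FAULT_INSERT, FAULT_FALLBACK };

/* True when pfn is a multiple of 2^order pages. */
static bool pfn_order_aligned(uint64_t pfn, unsigned int order)
{
	assert(order < 64);
	return (pfn & ((UINT64_C(1) << order) - 1)) == 0;
}

/*
 * Model of the decision: a mapping of a given order may only be
 * inserted at a pfn aligned to that order; otherwise the fault falls
 * back so that a smaller order can be tried.
 */
static enum fault_result huge_fault(uint64_t pfn, unsigned int order)
{
	if (order && !pfn_order_aligned(pfn, order))
		return FAULT_FALLBACK;
	return FAULT_INSERT;
}
```

With order 9 (512 pages), pfn 0x200 is accepted while pfn 0x201 falls back; an order-0 fault is always accepted in this model.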
2025-01-06Merge patch series "Fix encoding overlayfs fid for fanotify delete events"Christian Brauner
Amir Goldstein <amir73il@gmail.com> says: This is a followup fix to the reported regression [1] that was introduced by overlayfs non-decodable file handles support in v6.6. The first fix posted two weeks ago [2] was a quick band aid which is justified on its own and is still queued on your vfs.fixes branch. This followup fix fixes the root cause of overlayfs file handle encoding failure and it also solves a bug with fanotify FAN_DELETE_SELF events on overlayfs, that was discovered from analysis of the first report. The fix to fanotify delete events was verified with a new LTP test [3]. [1] https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/ [2] https://lore.kernel.org/linux-fsdevel/20241219115301.465396-1-amir73il@gmail.com/ [3] https://github.com/amir73il/ltp/commits/ovl_encode_fid/ * patches from https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com: ovl: support encoding fid from inode with no alias ovl: pass realinode to ovl_encode_real_fh() instead of realdentry Link: https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06ovl: support encoding fid from inode with no aliasAmir Goldstein
Dmitry Safonov reported that a WARN_ON() assertion can be triggered by userspace when calling inotify_show_fdinfo() for an overlayfs watched inode, whose dentry aliases were discarded with drop_caches. The WARN_ON() assertion in inotify_show_fdinfo() was removed, because it is possible for encoding a file handle to fail for other reasons, but the impact of failing to encode an overlayfs file handle goes beyond this assertion. As shown in the LTP test case mentioned in the link below, failure to encode an overlayfs file handle from a non-aliased inode also leads to failure to report an fid with FAN_DELETE_SELF fanotify events. As Dmitry notes in his analysis of the problem, ovl_encode_fh() fails if it cannot find an alias for the inode, but this failure can be fixed. ovl_encode_fh() seldom uses the alias and in the case of non-decodable file handles, as is often the case with fanotify fid info, ovl_encode_fh() never needs to use the alias to encode a file handle. Defer finding an alias until it is actually needed so ovl_encode_fh() will not fail in the common case of FAN_DELETE_SELF fanotify events. Fixes: 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles") Reported-by: Dmitry Safonov <dima@arista.com> Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
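The idea of deferring the alias lookup can be sketched in a simplified userspace model. Everything here is hypothetical (`struct inode_model`, `find_alias()`, `encode_fh()`, the `-ENOENT` return value), and the model assumes for simplicity that every decodable handle needs the alias, which is stricter than the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an inode that may have lost its dentry aliases. */
struct inode_model { int ino; bool has_alias; };

/* Counts how often the (potentially failing) alias lookup runs. */
static int alias_lookups;

static bool find_alias(const struct inode_model *inode)
{
	alias_lookups++;
	return inode->has_alias;
}

/*
 * Model of the deferred lookup: the alias is only searched for when the
 * handle must be decodable. Non-decodable handles are encoded from the
 * inode alone, so a missing alias cannot make them fail.
 * The returned int stands in for the encoded handle.
 */
static int encode_fh(const struct inode_model *inode, bool decodable)
{
	assert(inode != NULL);
	if (decodable && !find_alias(inode))
		return -ENOENT;
	return inode->ino;
}
```

In this model an inode without aliases still encodes successfully for the non-decodable case, and the lookup counter stays at zero.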
2025-01-06ovl: pass realinode to ovl_encode_real_fh() instead of realdentryAmir Goldstein
We want to be able to encode an fid from an inode with no alias. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06Merge tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfatLinus Torvalds
Pull exfat fixes from Namjae Jeon: "All fixes are for issues reported by syzbot: - Fix wrong error return in exfat_find_empty_entry() - Fix an endless loop by self-linked chain - Fix a KMSAN uninit-value issue in exfat_extend_valid_size()" * tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix the infinite loop in __exfat_free_cluster() exfat: fix the new buffer was not zeroed before writing exfat: fix the infinite loop in exfat_readdir() exfat: fix exfat_find_empty_entry() not returning error on failure
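One common way to defend against a self-linked cluster chain is to bound the walk by the total number of clusters. The sketch below shows that technique on a toy FAT array; `chain_length()`, `EOF_CLUSTER` and the `-EIO` return value are hypothetical and are not taken from the exfat patches themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical end-of-chain marker for this toy FAT. */
#define EOF_CLUSTER UINT32_MAX

/*
 * Walk the chain starting at 'start' and store its length in *len.
 * A valid chain can visit at most num_clusters entries, so taking more
 * steps than that means the chain links back to itself.
 */
static int chain_length(const uint32_t *fat, uint32_t num_clusters,
			uint32_t start, uint32_t *len)
{
	uint32_t clu = start;
	uint32_t steps = 0;

	assert(fat != NULL && len != NULL);
	while (clu != EOF_CLUSTER) {
		if (clu >= num_clusters)
			return -EIO;	/* link points outside the table */
		if (++steps > num_clusters)
			return -EIO;	/* loop detected */
		clu = fat[clu];
	}
	*len = steps;
	return 0;
}
```

A chain 0 -> 1 -> 2 -> end reports length 3, while a table in which cluster 1 links to itself is rejected instead of looping forever.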
2025-01-06Revert "vmstat: disable vmstat_work on vmstat_cpu_down_prep()"Linus Torvalds
This reverts commit adcfb264c3ed51fbbf5068ddf10d309a63683868. It turns out this just causes a different warning splat instead that seems to be much easier to trigger, so let's revert ASAP. Reported-and-bisected-by: Borislav Petkov <bp@alien8.de> Tested-by: Breno Leitao <leitao@debian.org> Reported-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/all/20250106131817.GAZ3vYGVr3-hWFFPLj@fat_crate.local/ Cc: Koichiro Den <koichiro.den@canonical.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-01-06btrfs: don't read from userspace twice in btrfs_uring_encoded_read()Mark Harmstone
If we return -EAGAIN the first time because we need to block, btrfs_uring_encoded_read() will get called twice. Take a copy of args, the iovs, and the iter the first time, as by the time we are called the second time these may have gone out of scope. Reported-by: Jens Axboe <axboe@kernel.dk> Fixes: 34310c442e17 ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
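The copy-once pattern described above can be modelled in userspace as follows. This is a sketch under stated assumptions: `struct args`, `struct cmd`, `encoded_read()` and `cmd_cleanup()` are hypothetical, `malloc()` stands in for the per-request storage, and the `may_block` flag stands in for the retry that follows -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical argument block supplied by the caller ("userspace"). */
struct args { long offset; long len; };

/* Hypothetical request with stable per-op storage. */
struct cmd { struct args *op_data; };

/*
 * First call: copy the arguments into storage owned by the request.
 * Later calls: reuse that copy and never read user_args again, since
 * the caller's memory may have changed or gone out of scope.
 */
static int encoded_read(struct cmd *cmd, const struct args *user_args,
			bool may_block, struct args *out)
{
	assert(cmd != NULL && out != NULL);
	if (!cmd->op_data) {
		assert(user_args != NULL);
		cmd->op_data = malloc(sizeof(*cmd->op_data));
		if (!cmd->op_data)
			return -ENOMEM;
		memcpy(cmd->op_data, user_args, sizeof(*user_args));
	}
	if (!may_block)
		return -EAGAIN;	/* caller retries from a blocking context */

	*out = *cmd->op_data;
	return 0;
}

/* Release the per-op storage when the request is cleaned up. */
static void cmd_cleanup(struct cmd *cmd)
{
	free(cmd->op_data);
	cmd->op_data = NULL;
}
```

If the caller's arguments are modified between the two calls, the second call in this model still sees the values captured by the first.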
2025-01-06io_uring: add io_uring_cmd_get_async_data helperMark Harmstone
Add a helper function in include/linux/io_uring/cmd.h to read the async_data pointer from a struct io_uring_cmd. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06io_uring/cmd: add per-op data to struct io_uring_cmd_dataJens Axboe
In case an op handler for ->uring_cmd() needs stable storage for user data, it can allocate io_uring_cmd_data->op_data and use it for the duration of the request. When the request gets cleaned up, uring_cmd will free it automatically. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06io_uring/cmd: rename struct uring_cache to io_uring_cmd_dataJens Axboe
In preparation for making this more generically available for ->uring_cmd() usage that needs stable command data, rename it and move it to io_uring/cmd.h instead. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06selftests/bpf: Extend netkit tests to validate set {head,tail}roomDaniel Borkmann
Extend the netkit selftests to specify and validate the {head,tail}room on the netdevice:

  # ./vmtest.sh -- ./test_progs -t netkit
  [...]
  ./test_progs -t netkit
  [ 1.174147] bpf_testmod: loading out-of-tree module taints kernel.
  [ 1.174585] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  [ 1.422307] tsc: Refined TSC clocksource calibration: 3407.983 MHz
  [ 1.424511] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc3e5084, max_idle_ns: 440795359833 ns
  [ 1.428092] clocksource: Switched to clocksource tsc
  #363 tc_netkit_basic:OK
  #364 tc_netkit_device:OK
  #365 tc_netkit_multi_links:OK
  #366 tc_netkit_multi_opts:OK
  #367 tc_netkit_neigh_links:OK
  #368 tc_netkit_pkt_type:OK
  #369 tc_netkit_scrub:OK
  Summary: 7/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/bpf/20241220234658.490686-3-daniel@iogearbox.net