summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-10xsk: Bring back busy polling supportStanislav Fomichev
Commit 86e25f40aa1e ("net: napi: Add napi_config") moved napi->napi_id assignment to a later point in time (napi_hash_add_with_id). This breaks __xdp_rxq_info_reg which copies napi_id at an earlier time and now stores 0 napi_id. It also makes sk_mark_napi_id_once_xdp and __sk_mark_napi_id_once useless because they now work against 0 napi_id. Since sk_busy_loop requires valid napi_id to busy-poll on, there is no way to busy-poll AF_XDP sockets anymore. Bring back the ability to busy-poll on XSK by resolving socket's napi_id at bind time. This relies on relatively recent netif_queue_set_napi, but (assume) at this point most popular drivers should have been converted. This also removes per-tx/rx cycles which used to check and/or set the napi_id value. Confirmed by running a busy-polling AF_XDP socket (github.com/fomichev/xskrtt) on mlx5 and looking at BusyPollRxPackets from /proc/net/netstat. Fixes: 86e25f40aa1e ("net: napi: Add napi_config") Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250109003436.2829560-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10eth: bnxt: always recalculate features after XDP clearing, fix null-derefJakub Kicinski
Recalculate features when XDP is detached. Before: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev eth0 xdp off # ethtool -k eth0 | grep gro rx-gro-hw: off [requested on] After: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev eth0 xdp off # ethtool -k eth0 | grep gro rx-gro-hw: on The fact that HW-GRO doesn't get re-enabled automatically is just a minor annoyance. The real issue is that the features will randomly come back during another reconfiguration which just happens to invoke netdev_update_features(). The driver doesn't handle reconfiguring two things at a time very robustly. Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") we only reconfigure the RSS hash table if the "effective" number of Rx rings has changed. If HW-GRO is enabled "effective" number of rings is 2x what user sees. So if we are in the bad state, with HW-GRO re-enablement "pending" after XDP off, and we lower the rings by / 2 - the HW-GRO rings doing 2x and the ethtool -L doing / 2 may cancel each other out, and the: if (old_rx_rings != bp->hw_resc.resv_rx_rings && condition in __bnxt_reserve_rings() will be false. The RSS map won't get updated, and we'll crash with: BUG: kernel NULL pointer dereference, address: 0000000000000168 RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0 bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180 __bnxt_setup_vnic_p5+0x58/0x110 bnxt_init_nic+0xb72/0xf50 __bnxt_open_nic+0x40d/0xab0 bnxt_open_nic+0x2b/0x60 ethtool_set_channels+0x18c/0x1d0 As we try to access a freed ring. The issue is present since XDP support was added, really, but prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") it wasn't causing major issues. Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.") Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20250109043057.2888953-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10bpf: Fix bpf_sk_select_reuseport() memory leakMichal Luczaj
As pointed out in the original comment, lookup in sockmap can return a TCP ESTABLISHED socket. Such TCP socket may have had SO_ATTACH_REUSEPORT_EBPF set before it was ESTABLISHED. In other words, a non-NULL sk_reuseport_cb does not imply a non-refcounted socket. Drop sk's reference in both error paths. unreferenced object 0xffff888101911800 (size 2048): comm "test_progs", pid 44109, jiffies 4297131437 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 80 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 9336483b): __kmalloc_noprof+0x3bf/0x560 __reuseport_alloc+0x1d/0x40 reuseport_alloc+0xca/0x150 reuseport_attach_prog+0x87/0x140 sk_reuseport_attach_bpf+0xc8/0x100 sk_setsockopt+0x1181/0x1990 do_sock_setsockopt+0x12b/0x160 __sys_setsockopt+0x7b/0xc0 __x64_sys_setsockopt+0x1b/0x30 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 64d85290d79c ("bpf: Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250110-reuseport-memleak-v1-1-fa1ddab0adfe@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10Merge branch 'net-stmmac-clean-up-and-fix-eee-implementation'Jakub Kicinski
Russell King says: ==================== net: stmmac: clean up and fix EEE implementation This is a rework of stmmac's EEE support in light of the addition of EEE management to phylib. It's slightly more than 15 patches, but I think it makes sense to be so. Patch 1 adds configuration of the receive clock phy_eee_rx_clock_stop() (which was part of another series, but is necessary for this patch set.) Patch 2 converts stmmac to use phylib's tracking of tx_lpi_timer. Patch 3 corrects the data type used for things involving the LPI timer. The user API uses u32, so stmmac should do too, rather than blindly converting it to "int". eee_timer is left for patch 4. Patch 4 (new) uses an unsigned int for eee_timer. Patch 5 makes stmmac EEE state depend on phylib's enable_tx_lpi flag, thus using phylib's resolution of EEE state. Patch 6 removes redundant code from the ethtool EEE operations. Patch 7 removes some redundant code in stmmac_disable_eee_mode() and renames it to stmmac_disable_sw_eee_mode() to better reflect its purpose. Patch 8 removes the driver private tx_lpi_enabled, which is managed by phylib since patch 4. Patch 9 removes the dependence of EEE error statistics on the EEE enable state, instead depending on whether EEE is supported by the hardware. Patch 10 removes phy_init_eee(), instead using phy_eee_rx_clock_stop() to configure whether the PHY may stop the receive clock. Patch 11 removes priv->eee_tw_timer, which is only ever set to one value at probe time, effectively it is a constant. Hence this is unnecessary complexity. Patch 12 moves priv->eee_enabled into stmmac_eee_init(), and placing it under the protection of priv->lock, except when EEE is not supported (where it becomes constant-false.) Patch 13 moves priv->eee_active also into stmmac_eee_init(), so the indication whether EEE should be enabled or not is passed in to this function. Since both priv->eee_enabled and priv->eee_active are assigned true/false values, they should be typed "bool". Make it sew in patch 14. No Singer machine required. Patch 15 moves the initialisation of priv->eee_ctrl_timer to the probe function - it makes no sense to re-initialise the timer each time we want to start using it. Patch 16 removes the unnecessary EEE handling in the driver tear-down method. The core net code will have brought the interface down already, meaning EEE has already been disabled. Patch 17 reorganises the code to split the hardware LPI timer control paths from the software LPI timer paths. Patch 18 works on this further by eliminating stmmac_lpi_entry_timer_config() and making direct calls to the new functions. This reveals a potential bug where priv->eee_sw_timer_en is set true when EEE is disabled. This is not addressed in this series, but will be in a future separate patch - so that if fixing that causes a regression, it can be handled separately. ==================== Link: https://patch.msgid.link/Z36sHIlnExQBuFJE@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: remove stmmac_lpi_entry_timer_config()Russell King (Oracle)
Remove stmmac_lpi_entry_timer_config(), setting priv->eee_sw_timer_en at the original call sites, and calling the appropriate stmmac_xxx_hw_lpi_timer() function. No functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEq-0002LQ-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: split hardware LPI timer controlRussell King (Oracle)
Provide stmmac_disable_hw_lpi_timer() and stmmac_enable_hw_lpi_timer() to control the hardware transmit LPI timer. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEl-0002LK-LA@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: remove unnecessary EEE handling in stmmac_release()Russell King (Oracle)
phylink_stop() will cause phylink to call the mac_link_down() operation before phylink_stop() returns. As mac_link_down() will call stmmac_eee_init(false), this will set both priv->eee_active and priv->eee_enabled to be false, deleting the eee_ctrl_timer if priv->eee_enabled was previously set. As stmmac_release() calls phylink_stop() before checking whether priv->eee_enabled is true, this is a condition that can never be satisfied, and thus the code within this if() block will never be executed. Remove it. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEg-0002LE-HH@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: move setup of eee_ctrl_timer to stmmac_dvr_probe()Russell King (Oracle)
Move the initialisation of the EEE software timer to the probe function as it is unnecessary to do this each time we enable software LPI. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEb-0002L8-DJ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: use boolean for eee_enabled and eee_activeRussell King (Oracle)
priv->eee_enabled and priv->eee_active are both assigned using boolean values. Type them as bool rather than int. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEW-0002L2-9w@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: move priv->eee_active into stmmac_eee_init()Russell King (Oracle)
Since all call sites of stmmac_eee_init() assign priv->eee_active immediately before, pass this state into stmmac_eee_init() and assign priv->eee_active within this function. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZER-0002Kv-5O@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: move priv->eee_enabled into stmmac_eee_init()Russell King (Oracle)
All call sites for stmmac_eee_init() assign the return code to priv->eee_enabled. Rather than having this coded at each call site, move the assignment inside stmmac_eee_init(). Since stmmac_init_eee() takes priv->lock before checking the state of priv->eee_enabled, move the assignment within the locked region. Also, stmmac_suspend() checks the state of this member under the lock. While two concurrent calls to stmmac_init_eee() aren't possible, there is a possibility that stmmac_suspend() may run concurrently with a change of priv->eee_enabled unless we modify it under the lock. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEM-0002Kq-2Z@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: remove priv->eee_tw_timerRussell King (Oracle)
priv->eee_tw_timer is only assigned during initialisation to a constant value (STMMAC_DEFAULT_TWT_LS) and then never changed. Remove priv->eee_tw_timer, and instead use STMMAC_DEFAULT_TWT_LS for both uses in stmmac_eee_init(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEG-0002Kk-VH@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: convert to use phy_eee_rx_clock_stop()Russell King (Oracle)
Convert stmmac to use phy_eee_rx_clock_stop() to set the PHY receive clock stop in LPI setting, rather than calling the legacy phy_init_eee() function. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZEB-0002Ke-RZ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: report EEE error statistics if EEE is supportedRussell King (Oracle)
Report the number of EEE error statistics in the xstats even when EEE is not enabled in hardware, but is supported. The PHY maintains this counter even when EEE is not enabled. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZE6-0002KY-Nx@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: remove priv->tx_lpi_enabledRussell King (Oracle)
Through using phylib's EEE state, priv->tx_lpi_enabled has become a write-only variable. Remove it. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZE1-0002KS-K1@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: clean up stmmac_disable_eee_mode()Russell King (Oracle)
stmmac_disable_eee_mode() is now only called from stmmac_xmit() when both priv->tx_path_in_lpi_mode and priv->eee_sw_timer_en are true. Therefore: if (!priv->eee_sw_timer_en) in stmmac_disable_eee_mode() will never be true, so this is dead code. Remove it, and rename the function to indicate that it now only deals with software based EEE mode. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDw-0002KL-Gg@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: remove redundant code from ethtool EEE opsRussell King (Oracle)
Setting edata->tx_lpi_enabled in stmmac_ethtool_op_get_eee() gets overwritten by phylib, so there's no point setting this. In stmmac_ethtool_op_set_eee(), now that stmmac is using the result of phylib's evaluation of EEE, there is no need to handle anything in the ethtool EEE ops other than calling through to the appropriate phylink function, which will pass on to phylib the users request. As stmmac_disable_eee_mode() is now no longer called from outside stmmac_main.c, make it static. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDr-0002KF-Cv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: make EEE depend on phy->enable_tx_lpiRussell King (Oracle)
Make stmmac EEE depend on phylib's evaluation of user settings and PHY negotiation, as indicated by phy->enable_tx_lpi. This will ensure when phylib has evaluated that the user has disabled LPI, phy_init_eee() will not be called, and priv->eee_active will be false, causing LPI/EEE to be disabled. This is an interim measure - phy_init_eee() will be removed in a later patch. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDm-0002K9-9w@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: use unsigned int for eee_timerRussell King (Oracle)
Since eee_timer is used to initialise priv->tx_lpi_timer, this also should be unsigned to avoid a negative number being interpreted as a very large positive number. Note that this makes the check for negative numbers passed in as a module parameter redundant, and passing a negative number will now produce a large delay rather than the default. Since the default is used without an argument, passing a negative number would be quite obscure. However, if users do, then this will need to be revisited. Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDh-0002K3-6y@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: use correct type for tx_lpi_timerRussell King (Oracle)
The ethtool interface uses u32 for tx_lpi_timer, and so does phylib. Use u32 to store this internally within stmmac rather than "int" which could misinterpret large values. Correct "value" in dwmac4_set_eee_lpi_entry_timer() to use u32 rather than int, which is derived from tx_lpi_timer. Even though this path won't be used with values larger than STMMAC_ET_MAX, this brings consistency of type usage to the stmmac code for this variable. We leave eee_timer unchanged for now, with the assumption that values up to INT_MAX will safely fit in a u32. Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDc-0002Jx-3b@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: stmmac: move tx_lpi_timer tracking to phylibRussell King (Oracle)
When stmmac_ethtool_op_get_eee() is called, stmmac sets the tx_lpi_timer and tx_lpi_enabled members, and then calls into phylink and thus phylib. phylib overwrites these members. phylib will also cause a link down/link up transition when settings that impact the MAC have been changed. Convert stmmac to use the tx_lpi_timer setting in struct phy_device, updating priv->tx_lpi_timer each time when the link comes up, rather than trying to maintain this user setting itself. We initialise the phylib tx_lpi_timer setting by doing a get_ee-modify-set_eee sequence with the last known priv->tx_lpi_timer value. In order for this to work correctly, we also need this member to be initialised earlier. As stmmac_eee_init() is no longer called outside of stmmac_main.c, make it static. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDW-0002Jr-W3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10net: phy: add configuration of rx clock stop modeRussell King (Oracle)
Add a function to allow configuration of the PCS's clock stop enable bit, used to configure whether the xMII receive clock can be stopped during LPI mode. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tVZDR-0002Jl-Ry@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10Merge branch 'selftests-bpf-migrate-test_xdp_redirect-sh-to-test_progs'Martin KaFai Lau
Bastien Curutchet says: ==================== This patch series continues the work to migrate the *.sh tests into prog_tests. test_xdp_redirect.sh tests the XDP redirections done through bpf_redirect(). These XDP redirections are already tested by prog_tests/xdp_do_redirect.c but IMO it doesn't cover the exact same code path because xdp_do_redirect.c uses bpf_prog_test_run_opts() to trigger redirections of 'fake packets' while test_xdp_redirect.sh redirects packets coming from the network. Also, the test_xdp_redirect.sh script tests the redirections with both SKB and DRV modes while xdp_do_redirect.c only tests the DRV mode. The patch series adds two new test cases in prog_tests/xdp_do_redirect.c to replace the test_xdp_redirect.sh script. ==================== Link: https://patch.msgid.link/20250110-xdp_redirect-v2-0-b8f3ae53e894@bootlin.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-01-10selftests/bpf: Migrate test_xdp_redirect.c to test_xdp_do_redirect.cBastien Curutchet (eBPF Foundation)
prog_tests/xdp_do_redirect.c is the only user of the BPF programs located in progs/test_xdp_do_redirect.c and progs/test_xdp_redirect.c. There is no need to keep both files with such close names. Move test_xdp_redirect.c contents to test_xdp_do_redirect.c and remove progs/test_xdp_redirect.c Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250110-xdp_redirect-v2-3-b8f3ae53e894@bootlin.com
2025-01-10selftests/bpf: Migrate test_xdp_redirect.sh to xdp_do_redirect.cBastien Curutchet (eBPF Foundation)
test_xdp_redirect.sh can't be used by the BPF CI. Migrate test_xdp_redirect.sh into a new test case in xdp_do_redirect.c. It uses the same network topology and the same BPF programs located in progs/test_xdp_redirect.c and progs/xdp_dummy.c. Remove test_xdp_redirect.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250110-xdp_redirect-v2-2-b8f3ae53e894@bootlin.com
2025-01-10selftests/bpf: test_xdp_redirect: Rename BPF sectionsBastien Curutchet (eBPF Foundation)
SEC("redirect_to_111") and SEC("redirect_to_222") can't be loaded by the __load() helper. Rename both sections SEC("xdp") so it can be interpreted by the __load() helper in upcoming patch. Update the test_xdp_redirect.sh to use the program name instead of the section name to load the BPF program. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://patch.msgid.link/20250110-xdp_redirect-v2-1-b8f3ae53e894@bootlin.com
2025-01-10MAINTAINERS: powerpc: Update my statusMichael Ellerman
Maddy is taking over the day-to-day maintenance of powerpc. I will still be around to help, and as a backup. Re-order the main POWERPC list to put Maddy first to reflect that. KVM/powerpc patches will be handled by Maddy via the powerpc tree with review from Nick, so replace myself with Maddy there. Remove myself from BPF, leaving Hari & Christophe as maintainers. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-01-10io_uring: expose read/write attribute capabilityAnuj Gupta
After commit 9a213d3b80c0, we can pass additional attributes along with read/write. However, userspace doesn't know that. Add a new feature flag IORING_FEAT_RW_ATTR, to notify the userspace that the kernel has this ability. Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Tested-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20241205062109.1788-1-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10smb: client: sync the root session and superblock context passwords before ↵Meetakshi Setiya
automounting In some cases, when password2 becomes the working password, the client swaps the two password fields in the root session struct, but not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit fs context from their parent mounts. Therefore, they might end up getting the passwords in the stale order. The automount should succeed, because the mount function will end up retrying with the actual password anyway. But to reduce these unnecessary session setup retries for automounts, we can sync the parent context's passwords with the root session's passwords before duplicating it to the child's fs context. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-10scsi: fnic: Propagate SCSI error code from fnic_scsi_drv_init()Arun Easi
Propagate scsi_add_host() error instead of returning -1. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Signed-off-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250110091956.17749-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Test for memory allocation failure and return error codeKaran Tilak Kumar
Fix kernel test robot warning. Test for memory allocation failure, and free memory for queues allocated in a multiqueue and non-multiqueue scenario. Return appropriate error code. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202412312347.FE4ZgEoM-lkp@intel.com/ Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202412312347.FE4ZgEoM-lkp@intel.com/ Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Reviewed-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250110091924.17729-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Return appropriate error code from failure of scsi drv initKaran Tilak Kumar
Return appropriate error code from fnic_probe caused by failure of fnic_scsi_drv_init. Fix bug report. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Reviewed-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250110091842.17711-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Return appropriate error code for mem alloc failureKaran Tilak Kumar
Return appropriate error code from fnic_probe when memory create slab pool fails. Fix bug report. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Reviewed-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250110091746.17671-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Remove always-true IS_FNIC_FCP_INITIATOR macroArun Easi
IS_FNIC_FCP_INITIATOR macro is not applicable at this time. Delete the macro. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Signed-off-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250110091655.17643-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Fix use of uninitialized value in debug messageDheeraj Reddy Jonnalagadda
The oxid variable in fdls_process_abts_req() was only being initialized inside the if (tport) block, but was being used in a debug print statement after that block. If tport was NULL, oxid would remain uninitialized. Move the oxid initialization to happen at declaration using FNIC_STD_GET_OX_ID(fchdr). Fixes: f828af44b8dd ("scsi: fnic: Add support for unsolicited requests and responses") Closes: https://scan7.scan.coverity.com/#/project-view/52337/11354?selectedIssue=1602772 Signed-off-by: Dheeraj Reddy Jonnalagadda <dheeraj.linuxdev@gmail.com> Link: https://lore.kernel.org/r/20250108050916.52721-1-dheeraj.linuxdev@gmail.com Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Delete incorrect debugfs error handlingDan Carpenter
Debugfs functions are not supposed to require error checking and, in fact, adding checks would normally lead to the driver refusing to load when CONFIG_DEBUGFS is disabled. What saves us here is that this code checks for NULL instead of error pointers so the error checking is all dead code. Delete it. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/a5c237cd-449b-4f9d-bcff-6285fb7c28d1@stanley.mountain Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Remove unnecessary else to fix warning in FDLS FIPKaran Tilak Kumar
Implement review comments from Martin: Remove unnecessary else from fip.c to fix a warning. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Reviewed-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250106224451.3597-3-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Remove extern definition from .c filesKaran Tilak Kumar
Implement review comments from Martin: Remove extern definition of fnic_fip_queue from .c files Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Reviewed-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250106224451.3597-2-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: fnic: Remove unnecessary else and unnecessary break in FDLSKaran Tilak Kumar
Incorporate review comments from Martin: Remove unnecessary else and unnecessary break to fix warnings in the FDLS code. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com> Reviewed-by: Arun Easi <aeasi@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20250106224451.3597-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10Merge tag 'sched_ext-for-6.13-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix corner case bug where ops.dispatch() couldn't extend the execution of the current task if SCX_OPS_ENQ_LAST is set. - Fix ops.cpu_release() not being called when a SCX task is preempted by a higher priority sched class task. - Fix buitin idle mask being incorrectly left as busy after an idle CPU is picked and kicked. - scx_ops_bypass() was unnecessarily using rq_lock() which comes with rq pinning related sanity checks which could trigger spuriously. Switch to raw_spin_rq_lock(). * tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: idle: Refresh idle masks during idle-to-idle transitions sched_ext: switch class when preempted by higher priority scheduler sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass() sched_ext: keep running prev when prev->scx.slice != 0
2025-01-10scsi: mpi3mr: Fix possible crash when setting up bsg failsGuixin Liu
If bsg_setup_queue() fails, the bsg_queue is assigned a non-NULL value. Consequently, in mpi3mr_bsg_exit(), the condition "if(!mrioc->bsg_queue)" will not be satisfied, preventing execution from entering bsg_remove_queue(), which could lead to the following crash: BUG: kernel NULL pointer dereference, address: 000000000000041c Call Trace: <TASK> mpi3mr_bsg_exit+0x1f/0x50 [mpi3mr] mpi3mr_remove+0x6f/0x340 [mpi3mr] pci_device_remove+0x3f/0xb0 device_release_driver_internal+0x19d/0x220 unbind_store+0xa4/0xb0 kernfs_fop_write_iter+0x11f/0x200 vfs_write+0x1fc/0x3e0 ksys_write+0x67/0xe0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Fixes: 4268fa751365 ("scsi: mpi3mr: Add bsg device support") Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Link: https://lore.kernel.org/r/20250107022032.24006-1-kanie@linux.alibaba.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: ufs: bsg: Set bsg_queue to NULL after removalGuixin Liu
Currently, this does not cause any issues, but I believe it is necessary to set bsg_queue to NULL after removing it to prevent potential use-after-free (UAF) access. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Link: https://lore.kernel.org/r/20241218014214.64533-3-kanie@linux.alibaba.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: ufs: bsg: Delete bsg_dev when setting up bsg failsGuixin Liu
We should remove the bsg device when bsg_setup_queue() fails to release the resources. Fixes: df032bf27a41 ("scsi: ufs: Add a bsg endpoint that supports UPIUs") Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Link: https://lore.kernel.org/r/20241218014214.64533-2-kanie@linux.alibaba.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10Merge tag 'cgroup-for-6.13-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Cpuset fixes: - Fix isolated CPUs leaking into sched domains - Remove now unnecessary kernfs active break which can trigger a warning - Comment updates" * tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: remove kernfs active break cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains cgroup/cpuset: Remove stale text
2025-01-10scsi: st: Don't set pos_unknown just after device recognitionKai Mäkisara
Commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling") in v6.6 added new code to handle the Power On/Reset Unit Attention (POR UA) sense data. This was in addition to the existing method. When this Unit Attention is received, the driver blocks attempts to read, write and some other operations because the reset may have rewinded the tape. Because of the added code, also the initial POR UA resulted in blocking operations, including those that are used to set the driver options after the device is recognized. Also, reading and writing are refused, whereas they succeeded before this commit. Add code to not set pos_unknown to block operations if the POR UA is received from the first test_ready() call after the st device has been created. This restores the behavior before v6.6. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20241216113755.30415-1-Kai.Makisara@kolumbus.fi Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling") CC: stable@vger.kernel.org Closes: https://lore.kernel.org/linux-scsi/2201CF73-4795-4D3B-9A79-6EE5215CF58D@kolumbus.fi/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10scsi: aic7xxx: Fix build 'aicasm' warningwangdicheng
When building with CONFIG_AIC7XXX_BUILD_FIRMWARE=y or CONFIG_AIC79XX_BUILD_FIRMWARE=y, the warning messages are as follows: aicasm_gram.tab.c:1722:16: warning: implicit declaration of function ‘yylex’ [-Wimplicit-function-declaration] aicasm_macro_gram.c:68:25: warning: implicit declaration of function ‘mmlex’ [-Wimplicit-function-declaration] aicasm_scan.l:417:6: warning: implicit declaration of function ‘mm_switch_to_buffer’ aicasm_scan.l:418:6: warning: implicit declaration of function ‘mmparse’ aicasm_scan.l:421:6: warning: implicit declaration of function ‘mm_delete_buffer’ The solution is to add the corresponding function declaration to the corresponding file. Signed-off-by: wangdicheng <wangdicheng@kylinos.cn> Signed-off-by: huanglei <huanglei@kylinos.cn> Link: https://lore.kernel.org/r/20241206071926.63832-1-wangdich9700@163.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-01-10Merge tag 'wq-for-6.13-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: - Add a WARN_ON_ONCE() on queue_delayed_work_on() on an offline CPU as such work items won't get executed till the CPU comes back online * tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: warn if delayed_work is queued to an offlined cpu.
2025-01-10Merge tag 'thermal-6.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Fix an OF node leak in the code parsing thermal zone DT properties (Joe Hattori)" * tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: of: fix OF node leak in of_thermal_zone_find()
2025-01-10wifi: ath9k: cleanup ath9k_hw_get_nf_hist_mid()Dmitry Antipov
In 'ath9k_hw_get_nf_hist_mid()', prefer 'memcpy()' and 'sort()' over an ad-hoc things. Briefly tested as a separate module. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://patch.msgid.link/20250109080703.106692-1-dmantipov@yandex.ru Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-10Merge tag 'acpi-6.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Add two more ACPI IRQ override quirks and update the code using them to avoid unnecessary overhead (Hans de Goede)" * tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: acpi_dev_irq_override(): Check DMI match last ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[] ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]