summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2025-08-20ixgbe: fix ndo_xdp_xmit() workloadsMaciej Fijalkowski
Currently ixgbe driver checks periodically in its watchdog subtask if there is anything to be transmitted (considering both Tx and XDP rings) under state of carrier not being 'ok'. Such event is interpreted as Tx hang and therefore results in interface reset. This is currently problematic for ndo_xdp_xmit() as it is allowed to produce descriptors when interface is going through reset or its carrier is turned off. Furthermore, XDP rings should not really be objects of Tx hang detection. This mechanism is rather a matter of ndo_tx_timeout() being called from dev_watchdog against Tx rings exposed to networking stack. Taking into account issues described above, let us have a two fold fix - do not respect XDP rings in local ixgbe watchdog and do not produce Tx descriptors in ndo_xdp_xmit callback when there is some problem with carrier currently. For now, keep the Tx hang checks in clean Tx irq routine, but adjust it to not execute for XDP rings. Cc: Tobias Böhm <tobias.boehm@hetzner-cloud.de> Reported-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Closes: https://lore.kernel.org/netdev/eca1880f-253a-4955-afe6-732d7c6926ee@hetzner-cloud.de/ Fixes: 6453073987ba ("ixgbe: add initial support for xdp redirect") Fixes: 33fdc82f0883 ("ixgbe: add support for XDP_TX action") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250819222000.3504873-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-20ixgbe: xsk: resolve the negative overflow of budget in ixgbe_xmit_zcJason Xing
Resolve the budget negative overflow which leads to returning true in ixgbe_xmit_zc even when the budget of descs are thoroughly consumed. Before this patch, when the budget is decreased to zero and finishes sending the last allowed desc in ixgbe_xmit_zc, it will always turn back and enter into the while() statement to see if it should keep processing packets, but in the meantime it unexpectedly decreases the value again to 'unsigned int (0--)', namely, UINT_MAX. Finally, the ixgbe_xmit_zc returns true, showing 'we complete cleaning the budget'. That also means 'clean_complete = true' in ixgbe_poll. The true theory behind this is if that budget number of descs are consumed, it implies that we might have more descs to be done. So we should return false in ixgbe_xmit_zc to tell napi poll to find another chance to start polling to handle the rest of descs. On the contrary, returning true here means job done and we know we finish all the possible descs this time and we don't intend to start a new napi poll. It is apparently against our expectations. Please also see how ixgbe_clean_tx_irq() handles the problem: it uses do..while() statement to make sure the budget can be decreased to zero at most and the negative overflow never happens. The patch adds 'likely' because we rarely would not hit the loop condition since the standard budget is 256. Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250819222000.3504873-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19microchip: lan865x: fix missing Timer Increment config for Rev.B0/B1Parthiban Veerasooran
Fix missing configuration for LAN865x silicon revisions B0 and B1 as per Microchip Application Note AN1760 (Rev F, June 2024). The Timer Increment register was not being set, which is required for accurate timestamping. As per the application note, configure the MAC to set timestamping at the end of the Start of Frame Delimiter (SFD), and set the Timer Increment register to 40 ns (corresponding to a 25 MHz internal clock). Link: https://www.microchip.com/en-us/application-notes/an1760 Fixes: 5cd2340cb6a3 ("microchip: lan865x: add driver support for Microchip's LAN865X MAC-PHY") Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250818060514.52795-3-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19microchip: lan865x: fix missing netif_start_queue() call on device openParthiban Veerasooran
This fixes an issue where the transmit queue is started implicitly only the very first time the device is registered. When the device is taken down and brought back up again (using `ip` or `ifconfig`), the transmit queue is not restarted, causing packet transmission to hang. Adding an explicit call to netif_start_queue() in lan865x_net_open() ensures the transmit queue is properly started every time the device is reopened. Fixes: 5cd2340cb6a3 ("microchip: lan865x: add driver support for Microchip's LAN865X MAC-PHY") Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Link: https://patch.msgid.link/20250818060514.52795-2-parthiban.veerasooran@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: CT: Use the correct counter offsetVlad Dogaru
Specifying the counter action is not enough, as it is used by multiple counters that were allocated in a bulk. By omitting the offset, rules will be associated with a different counter from the same bulk. Subsequently, the CT subsystem checks the correct counter, assumes that no traffic has triggered the rule, and ages out the rule. The end result is intermittent offloading of long lived connections, as rules are aged out then promptly re-added. Fix this by specifying the correct offset along with the counter rule. Fixes: 34eea5b12a10 ("net/mlx5e: CT: Add initial support for Hardware Steering") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-8-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: HWS, Fix table creation UIDAlex Vesker
During table creation, caller passes a UID using ft_attr. The UID value was ignored, which leads to problems when the caller sets the UID to a non-zero value, such as SHARED_RESOURCE_UID (0xffff) - the internal FT objects will be created with UID=0. Fixes: 0869701cba3d ("net/mlx5: HWS, added FW commands handling") Signed-off-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-7-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: HWS, don't rehash on every kind of insertion failureYevgeny Kliteynik
If rule creation failed due to a full queue, due to timeout in polling for completion, or due to matcher being in resize, don't try to initiate rehash sequence - rehash would have failed anyway. Fixes: 2111bb970c78 ("net/mlx5: HWS, added backward-compatible API handling") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: HWS, prevent rehash from filling up the queuesYevgeny Kliteynik
While moving the rules during rehash, CQ is not drained. The flush and drain happens only when all the rules of a certain queue have been moved. This behaviour can lead to accumulating large quantity of rules that haven't got their completion yet, and eventually will fill up the queue and will cause the rehash to fail. Fix this problem by requiring drain once the number of outstanding completions reaches a certain threshold. Fixes: ef94799a8741 ("net/mlx5: HWS, rework rehash loop") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: HWS, fix complex rules rehash error flowYevgeny Kliteynik
Moving rules from matcher to matcher should not fail. However, if it does fail due to various reasons, the error flow should allow the kernel to continue functioning (albeit with broken steering rules) instead of going into series of soft lock-ups or some other problematic behaviour. Similar to the simple rules, complex rules rehash logic suffers from the same problems. This patch fixes the error flow for moving complex rules: - If new rule creation fails before it was even enqeued, do not poll for completion - If TIMEOUT happened while moving the rule, no point trying to poll for completions for other rules. Something is broken, completion won't come, just abort the rehash sequence. - If some other completion with error received, don't give up. Continue handling rest of the rules to minimize the damage. - Make sure that the first error code that was received will be actually returned to the caller instead of replacing it with the generic error code. All the aforementioned issues stem from the same bad error flow, so no point fixing them one by one and leaving partially broken code - fixing them in one patch. Fixes: 17e0accac577 ("net/mlx5: HWS, support complex matchers") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: HWS, fix simple rules rehash error flowYevgeny Kliteynik
Moving rules from matcher to matcher should not fail. However, if it does fail due to various reasons, the error flow should allow the kernel to continue functioning (albeit with broken steering rules) instead of going into series of soft lock-ups or some other problematic behaviour. This patch fixes the error flow for moving simple rules: - If new rule creation fails before it was even enqeued, do not poll for completion - If TIMEOUT happened while moving the rule, no point trying to poll for completions for other rules. Something is broken, completion won't come, just abort the rehash sequence. - If some other completion with error received, don't give up. Continue handling rest of the rules to minimize the damage. - Make sure that the first error code that was received will be actually returned to the caller instead of replacing it with the generic error code. All the aforementioned issues stem from the same bad error flow, so no point fixing them one by one and leaving partially broken code - fixing them in one patch. Fixes: ef94799a8741 ("net/mlx5: HWS, rework rehash loop") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net/mlx5: HWS, fix bad parameter in CQ creationYevgeny Kliteynik
'cqe_sz' valid value should be 0 for 64-byte CQE. Fixes: 2ca62599aa0b ("net/mlx5: HWS, added send engine and context handling") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250817202323.308604-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net: stmmac: thead: Enable TX clock before MAC initializationYao Zi
The clk_tx_i clock must be supplied to the MAC for successful initialization. On TH1520 SoC, the clock is provided by an internal divider configured through GMAC_PLLCLK_DIV register when using RGMII interface. However, currently we don't setup the divider before initialization of the MAC, resulting in DMA reset failures if the bootloader/firmware doesn't enable the divider, [ 7.839601] thead-dwmac ffe7060000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 7.938338] thead-dwmac ffe7060000.ethernet eth0: PHY [stmmac-0:02] driver [RTL8211F Gigabit Ethernet] (irq=POLL) [ 8.160746] thead-dwmac ffe7060000.ethernet eth0: Failed to reset the dma [ 8.170118] thead-dwmac ffe7060000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed [ 8.179384] thead-dwmac ffe7060000.ethernet eth0: __stmmac_open: Hw setup failed Let's simply write GMAC_PLLCLK_DIV_EN to GMAC_PLLCLK_DIV to enable the divider before MAC initialization. Note that for reconfiguring the divisor, the divider must be disabled first and re-enabled later to make sure the new divisor take effect. The exact clock rate doesn't affect MAC's initialization according to my test. It's set to the speed required by RGMII when the linkspeed is 1Gbps and could be reclocked later after link is up if necessary. Fixes: 33a1a01e3afa ("net: stmmac: Add glue layer for T-HEAD TH1520 SoC") Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Drew Fustini <fustini@kernel.org> Link: https://patch.msgid.link/20250815104803.55294-1-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19gve: prevent ethtool ops after shutdownJordan Rhee
A crash can occur if an ethtool operation is invoked after shutdown() is called. shutdown() is invoked during system shutdown to stop DMA operations without performing expensive deallocations. It is discouraged to unregister the netdev in this path, so the device may still be visible to userspace and kernel helpers. In gve, shutdown() tears down most internal data structures. If an ethtool operation is dispatched after shutdown(), it will dereference freed or NULL pointers, leading to a kernel panic. While graceful shutdown normally quiesces userspace before invoking the reboot syscall, forced shutdowns (as observed on GCP VMs) can still trigger this path. Fix by calling netif_device_detach() in shutdown(). This marks the device as detached so the ethtool ioctl handler will skip dispatching operations to the driver. Fixes: 974365e51861 ("gve: Implement suspend/resume/shutdown") Signed-off-by: Jordan Rhee <jordanrhee@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250818211245.1156919-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-19net: ti: icssg-prueth: Fix HSR and switch offload Enablement during firwmare ↵MD Danish Anwar
reload. To enable HSR / Switch offload, certain configurations are needed. Currently they are done inside icssg_change_mode(). This function only gets called if we move from one mode to another without bringing the links up / down. Once in HSR / Switch mode, if we bring the links down and bring it back up again. The callback sequence is, - emac_ndo_stop() Firmwares are stopped - emac_ndo_open() Firmwares are loaded In this path icssg_change_mode() doesn't get called and as a result the configurations needed for HSR / Switch is not done. To fix this, put all these configurations in a separate function icssg_enable_fw_offload() and call this from both icssg_change_mode() and emac_ndo_open() Fixes: 56375086d093 ("net: ti: icssg-prueth: Enable HSR Tx duplication, Tx Tag and Rx Tag offload") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20250814105106.1491871-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-19net: ethernet: mtk_ppe: add RCU lock around dev_fill_forward_pathQingfang Deng
Ensure ndo_fill_forward_path() is called with RCU lock held. Fixes: 2830e314778d ("net: ethernet: mtk-ppe: fix traffic offload with bridged wlan") Signed-off-by: Qingfang Deng <dqfext@gmail.com> Link: https://patch.msgid.link/20250814012559.3705-1-dqfext@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-18bnxt_en: Fix lockdep warning during rmmodMichael Chan
The commit under the Fixes tag added a netdev_assert_locked() in bnxt_free_ntp_fltrs(). The lock should be held during normal run-time but the assert will be triggered (see below) during bnxt_remove_one() which should not need the lock. The netdev is already unregistered by then. Fix it by calling netdev_assert_locked_or_invisible() which will not assert if the netdev is unregistered. WARNING: CPU: 5 PID: 2241 at ./include/net/netdev_lock.h:17 bnxt_free_ntp_fltrs+0xf8/0x100 [bnxt_en] Modules linked in: rpcrdma rdma_cm iw_cm ib_cm configfs ib_core bnxt_en(-) bridge stp llc x86_pkg_temp_thermal xfs tg3 [last unloaded: bnxt_re] CPU: 5 UID: 0 PID: 2241 Comm: rmmod Tainted: G S W 6.16.0 #2 PREEMPT(voluntary) Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 RIP: 0010:bnxt_free_ntp_fltrs+0xf8/0x100 [bnxt_en] Code: 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 8b 47 60 be ff ff ff ff 48 8d b8 28 0c 00 00 e8 d0 cf 41 c3 85 c0 0f 85 2e ff ff ff <0f> 0b e9 27 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffa92082387da0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff9e5b593d8000 RCX: 0000000000000001 RDX: 0000000000000001 RSI: ffffffff83dc9a70 RDI: ffffffff83e1a1cf RBP: ffff9e5b593d8c80 R08: 0000000000000000 R09: ffffffff8373a2b3 R10: 000000008100009f R11: 0000000000000001 R12: 0000000000000001 R13: ffffffffc01c4478 R14: dead000000000122 R15: dead000000000100 FS: 00007f3a8a52c740(0000) GS:ffff9e631ad1c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055bb289419c8 CR3: 000000011274e001 CR4: 00000000003706f0 Call Trace: <TASK> bnxt_remove_one+0x57/0x180 [bnxt_en] pci_device_remove+0x39/0xc0 device_release_driver_internal+0xa5/0x130 driver_detach+0x42/0x90 bus_remove_driver+0x61/0xc0 pci_unregister_driver+0x38/0x90 bnxt_exit+0xc/0x7d0 [bnxt_en] Fixes: 004b5008016a ("eth: bnxt: remove most dependencies on RTNL") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250816183850.4125033-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-15net: libwx: Fix the size in RSS hash key populationChandra Mohan Sundar
While trying to fill a random RSS key, the size of the pointer is being used rather than the actual size of the RSS key. Fix by passing an appropriate value of the RSS key. This issue was reported by static coverity analyser. Fixes: eb4898fde1de8 ("net: libwx: add wangxun vf common api") Signed-off-by: Chandra Mohan Sundar <chandramohan.explore@gmail.com> Link: https://patch.msgid.link/20250814163014.613004-1-chandramohan.explore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-15mlxsw: spectrum: Forward packets with an IPv4 link-local source IPIdo Schimmel
By default, the device does not forward IPv4 packets with a link-local source IP (i.e., 169.254.0.0/16). This behavior does not align with the kernel which does forward them. Fix by instructing the device to forward such packets instead of dropping them. Fixes: ca360db4b825 ("mlxsw: spectrum: Disable DIP_LINK_LOCAL check in hardware pipeline") Reported-by: Zoey Mertes <zoey@cloudflare.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/6721e6b2c96feb80269e72ce8d0b426e2f32d99c.1755174341.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-14rtase: Fix Rx descriptor CRC error bit definitionJustin Lai
The CRC error bit is located at bit 17 in the Rx descriptor, but the driver was incorrectly using bit 16. Fix it. Fixes: a36e9f5cfe9e ("rtase: Add support for a pci table in this module") Signed-off-by: Justin Lai <justinlai0215@realtek.com> Link: https://patch.msgid.link/20250813071631.7566-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-14net: xilinx: axienet: Fix RX skb ring management in DMAengine modeSuraj Gupta
Submit multiple descriptors in axienet_rx_cb() to fill Rx skb ring. This ensures the ring "catches up" on previously missed allocations. Increment Rx skb ring head pointer after BD is successfully allocated. Previously, head pointer was incremented before verifying if descriptor is successfully allocated and has valid entries, which could lead to ring state inconsistency if descriptor setup failed. These changes improve reliability by maintaining adequate descriptor availability and ensuring proper ring buffer state management. Fixes: 6a91b846af85 ("net: axienet: Introduce dmaengine support") Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Link: https://patch.msgid.link/20250813135559.1555652-1-suraj.gupta2@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-13Merge branch '10GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== ixgbe: bypass devlink phys_port_name generation Jedrzej adds option to skip phys_port_name generation and opts ixgbe into it as some configurations rely on pre-devlink naming which could end up broken as a result. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ixgbe: prevent from unwanted interface name changes devlink: let driver opt out of automatic phys_port_name generation ==================== Link: https://patch.msgid.link/20250812205226.1984369-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-13bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZEDavid Wei
The data page pool always fills the HW rx ring with pages. On arm64 with 64K pages, this will waste _at least_ 32K of memory per entry in the rx ring. Fix by fragmenting the pages if PAGE_SIZE > BNXT_RX_PAGE_SIZE. This makes the data page pool the same as the header pool. Tested with iperf3 with a small (64 entries) rx ring to encourage buffer circulation. Fixes: cd1fafe7da1f ("eth: bnxt: add support rx side device memory TCP") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250812182907.1540755-1-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-12ixgbe: prevent from unwanted interface name changesJedrzej Jagielski
Users of the ixgbe driver report that after adding devlink support by the commit a0285236ab93 ("ixgbe: add initial devlink support") their configs got broken due to unwanted changes of interface names. It's caused by automatic phys_port_name generation during devlink port initialization flow. To prevent from that set no_phys_port_name flag for ixgbe devlink ports. Reported-by: David Howells <dhowells@redhat.com> Closes: https://lore.kernel.org/netdev/3452224.1745518016@warthog.procyon.org.uk/ Reported-by: David Kaplan <David.Kaplan@amd.com> Closes: https://lore.kernel.org/netdev/LV3PR12MB92658474624CCF60220157199470A@LV3PR12MB9265.namprd12.prod.outlook.com/ Fixes: a0285236ab93 ("ixgbe: add initial devlink support") Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-12net: stmmac: thead: Get and enable APB clock on initializationYao Zi
It's necessary to adjust the MAC TX clock when the linkspeed changes, but it's noted such adjustment always fails on TH1520 SoC, and reading back from APB glue registers that control clock generation results in garbage, causing broken link. With some testing, it's found a clock must be ungated for access to APB glue registers. Without any consumer, the clock is automatically disabled during late kernel startup. Let's get and enable it if it's described in devicetree. For backward compatibility with older devicetrees, probing won't fail if the APB clock isn't found. In this case, we emit a warning since the link will break if the speed changes. Fixes: 33a1a01e3afa ("net: stmmac: Add glue layer for T-HEAD TH1520 SoC") Signed-off-by: Yao Zi <ziyao@disroot.org> Tested-by: Drew Fustini <fustini@kernel.org> Reviewed-by: Drew Fustini <fustini@kernel.org> Link: https://patch.msgid.link/20250808093655.48074-4-ziyao@disroot.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-11net: stmmac: dwc-qos: fix clk prepare/enable leak on probe failureRussell King (Oracle)
dwc_eth_dwmac_probe() gets bulk clocks, and then prepares and enables them. Unfortunately, if dwc_eth_dwmac_config_dt() or stmmac_dvr_probe() fail, we leave the clocks prepared and enabled. Fix this by using devm_clk_bulk_get_all_enabled() to combine the steps and provide devm based release of the prepare and enable state. This also fixes a similar leakin dwc_eth_dwmac_remove() which wasn't correctly retrieving the struct plat_stmmacenet_data. This becomes unnecessary. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: a045e40645df ("net: stmmac: refactor clock management in EQoS driver") Link: https://patch.msgid.link/E1ukM1X-0086qu-Td@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-11net: stmmac: rk: put the PHY clock on removeRussell King (Oracle)
The PHY clock (bsp_priv->clk_phy) is obtained using of_clk_get(), which doesn't take part in the devm release. Therefore, when a device is unbound, this clock needs to be explicitly put. Fix this. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: fecd4d7eef8b ("net: stmmac: dwmac-rk: Add integrated PHY support") Link: https://patch.msgid.link/E1ukM1S-0086qo-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08net: ti: icss-iep: Fix incorrect type for return value in extts_enable()Alok Tiwari
The variable ret in icss_iep_extts_enable() was incorrectly declared as u32, while the function returns int and may return negative error codes. This will cause sign extension issues and incorrect error propagation. Update ret to be int to fix error handling. This change corrects the declaration to avoid potential type mismatch. Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250805142323.1949406-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08net: page_pool: allow enabling recycling late, fix false positive warningJakub Kicinski
Page pool can have pages "directly" (locklessly) recycled to it, if the NAPI that owns the page pool is scheduled to run on the same CPU. To make this safe we check that the NAPI is disabled while we destroy the page pool. In most cases NAPI and page pool lifetimes are tied together so this happens naturally. The queue API expects the following order of calls: -> mem_alloc alloc new pp -> stop napi_disable -> start napi_enable -> mem_free free old pp Here we allocate the page pool in ->mem_alloc and free in ->mem_free. But the NAPIs are only stopped between ->stop and ->start. We created page_pool_disable_direct_recycling() to safely shut down the recycling in ->stop. This way the page_pool_destroy() call in ->mem_free doesn't have to worry about recycling any more. Unfortunately, the page_pool_disable_direct_recycling() is not enough to deal with failures which necessitate freeing the _new_ page pool. If we hit a failure in ->mem_alloc or ->stop the new page pool has to be freed while the NAPI is active (assuming driver attaches the page pool to an existing NAPI instance and doesn't reallocate NAPIs). Freeing the new page pool is technically safe because it hasn't been used for any packets, yet, so there can be no recycling. But the check in napi_assert_will_not_race() has no way of knowing that. We could check if page pool is empty but that'd make the check much less likely to trigger during development. Add page_pool_enable_direct_recycling(), pairing with page_pool_disable_direct_recycling(). It will allow us to create the new page pools in "disabled" state and only enable recycling when we know the reconfig operation will not fail. Coincidentally it will also let us re-enable the recycling for the old pool, if the reconfig failed: -> mem_alloc (new) -> stop (old) # disables direct recycling for old -> start (new) # fail!! -> start (old) # go back to old pp but direct recycling is lost :( -> mem_free (new) The new helper is idempotent to make the life easier for drivers, which can operate in HDS mode and support zero-copy Rx. The driver can call the helper twice whether there are two pools or it has multiple references to a single pool. Fixes: 40eca00ae605 ("bnxt_en: unlink page pool when stopping Rx queue") Tested-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250805003654.2944974-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08net: ti: icssg-prueth: Fix emac link speed handlingMD Danish Anwar
When link settings are changed emac->speed is populated by emac_adjust_link(). The link speed and other settings are then written into the DRAM. However if both ports are brought down after this and brought up again or if the operating mode is changed and a firmware reload is needed, the DRAM is cleared by icssg_config(). As a result the link settings are lost. Fix this by calling emac_adjust_link() after icssg_config(). This re populates the settings in the DRAM after a new firmware load. Fixes: 9facce84f406 ("net: ti: icssg-prueth: Fix firmware load sequence.") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Message-ID: <20250805173812.2183161-1-danishanwar@ti.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08net: hibmcge: fix the np_link_fail error reporting issueJijie Shao
Currently, after modifying device port mode, the np_link_ok state is immediately checked. At this point, the device may not yet ready, leading to the querying of an intermediate state. This patch will poll to check if np_link is ok after modifying device port mode, and only report np_link_fail upon timeout. Fixes: e0306637e85d ("net: hibmcge: Add support for mac link exception handling feature") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08net: hibmcge: fix the division by zero issueJijie Shao
When the network port is down, the queue is released, and ring->len is 0. In debugfs, hbg_get_queue_used_num() will be called, which may lead to a division by zero issue. This patch adds a check, if ring->len is 0, hbg_get_queue_used_num() directly returns 0. Fixes: 40735e7543f9 ("net: hibmcge: Implement .ndo_start_xmit function") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08net: hibmcge: fix rtnl deadlock issueJijie Shao
Currently, the hibmcge netdev acquires the rtnl_lock in pci_error_handlers.reset_prepare() and releases it in pci_error_handlers.reset_done(). However, in the PCI framework: pci_reset_bus - __pci_reset_slot - pci_slot_save_and_disable_locked - pci_dev_save_and_disable - err_handler->reset_prepare(dev); In pci_slot_save_and_disable_locked(): list_for_each_entry(dev, &slot->bus->devices, bus_list) { if (!dev->slot || dev->slot!= slot) continue; pci_dev_save_and_disable(dev); if (dev->subordinate) pci_bus_save_and_disable_locked(dev->subordinate); } This will iterate through all devices under the current bus and execute err_handler->reset_prepare(), causing two devices of the hibmcge driver to sequentially request the rtnl_lock, leading to a deadlock. Since the driver now executes netif_device_detach() before the reset process, it will not concurrently with other netdev APIs, so there is no need to hold the rtnl_lock now. Therefore, this patch removes the rtnl_lock during the reset process and adjusts the position of HBG_NIC_STATE_RESETTING to ensure that multiple resets are not executed concurrently. Fixes: 3f5a61f6d504f ("net: hibmcge: Add reset supported in this module") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-08Merge tag 'net-6.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: Previous releases - regressions: - netlink: avoid infinite retry looping in netlink_unicast() Previous releases - always broken: - packet: fix a race in packet_set_ring() and packet_notifier() - ipv6: reject malicious packets in ipv6_gso_segment() - sched: mqprio: fix stack out-of-bounds write in tc entry parsing - net: drop UFO packets (injected via virtio) in udp_rcv_segment() - eth: mlx5: correctly set gso_segs when LRO is used, avoid false positive checksum validation errors - netpoll: prevent hanging NAPI when netcons gets enabled - phy: mscc: fix parsing of unicast frames for PTP timestamping - a number of device tree / OF reference leak fixes" * tag 'net-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits) pptp: fix pptp_xmit() error path net: ti: icssg-prueth: Fix skb handling for XDP_PASS net: Update threaded state in napi config in netif_set_threaded selftests: netdevsim: Xfail nexthop test on slow machines eth: fbnic: Lock the tx_dropped update eth: fbnic: Fix tx_dropped reporting eth: fbnic: remove the debugging trick of super high page bias net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect dt-bindings: net: Replace bouncing Alexandru Tachici emails dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing Revert "net: mdio_bus: Use devm for getting reset GPIO" selftests: net: packetdrill: xfail all problems on slow machines net/packet: fix a race in packet_set_ring() and packet_notifier() benet: fix BUG when creating VFs net: airoha: npu: Add missing MODULE_FIRMWARE macros net: devmem: fix DMA direction on unmapping ipa: fix compile-testing with qcom-mdt=m eth: fbnic: unlink NAPIs from queues on error to open net: Add locking to protect skb->dev access in ip_output ...
2025-08-05net: ti: icssg-prueth: Fix skb handling for XDP_PASSMeghana Malladi
emac_rx_packet() is a common function for handling traffic for both xdp and non-xdp use cases. Use common logic for handling skb with or without xdp to prevent any incorrect packet processing. This patch fixes ping working with XDP_PASS for icssg driver. Fixes: 62aa3246f4623 ("net: ti: icssg-prueth: Add XDP support") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250803180216.3569139-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05eth: fbnic: Lock the tx_dropped updateMohsin Bashir
Wrap copying of drop stats on TX path from fbd->hw_stats by the hw_stats_lock. Currently, it is being performed outside the lock and another thread accessing fbd->hw_stats can lead to inconsistencies. Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250802024636.679317-3-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05eth: fbnic: Fix tx_dropped reportingMohsin Bashir
Correctly copy the tx_dropped stats from the fbd->hw_stats to the rtnl_link_stats64 struct. Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250802024636.679317-2-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05eth: fbnic: remove the debugging trick of super high page biasJakub Kicinski
Alex added page bias of LONG_MAX, which is admittedly quite a clever way of catching overflows of the pp ref count. The page pool code was "optimized" to leave the ref at 1 for freed pages so it can't catch basic bugs by itself any more. (Something we should probably address under DEBUG_NET...) Unfortunately for fbnic since commit f7dc3248dcfb ("skbuff: Optimization of SKB coalescing for page pool") core _may_ actually take two extra pp refcounts, if one of them is returned before driver gives up the bias the ret < 0 check in page_pool_unref_netmem() will trigger. While at it add a FBNIC_ to the name of the driver constant. Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free") Link: https://patch.msgid.link/20250801170754.2439577-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-05net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnectHeiner Kallweit
After the call to phy_disconnect() netdev->phydev is reset to NULL. So fixed_phy_unregister() would be called with a NULL pointer as argument. Therefore cache the phy_device before this call. Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Link: https://patch.msgid.link/2b80a77a-06db-4dd7-85dc-3a8e0de55a1d@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04benet: fix BUG when creating VFsMichal Schmidt
benet crashes as soon as SRIOV VFs are created: kernel BUG at mm/vmalloc.c:3457! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 4 UID: 0 PID: 7408 Comm: test.sh Kdump: loaded Not tainted 6.16.0+ #1 PREEMPT(voluntary) [...] RIP: 0010:vunmap+0x5f/0x70 [...] Call Trace: <TASK> __iommu_dma_free+0xe8/0x1c0 be_cmd_set_mac_list+0x3fe/0x640 [be2net] be_cmd_set_mac+0xaf/0x110 [be2net] be_vf_eth_addr_config+0x19f/0x330 [be2net] be_vf_setup+0x4f7/0x990 [be2net] be_pci_sriov_configure+0x3a1/0x470 [be2net] sriov_numvfs_store+0x20b/0x380 kernfs_fop_write_iter+0x354/0x530 vfs_write+0x9b9/0xf60 ksys_write+0xf3/0x1d0 do_syscall_64+0x8c/0x3d0 be_cmd_set_mac_list() calls dma_free_coherent() under a spin_lock_bh. Fix it by freeing only after the lock has been released. Fixes: 1a82d19ca2d6 ("be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250801101338.72502-1-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04net: airoha: npu: Add missing MODULE_FIRMWARE macrosLorenzo Bianconi
Introduce missing MODULE_FIRMWARE definitions for firmware autoload. Fixes: 23290c7bc190d ("net: airoha: Introduce Airoha NPU support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250801-airoha-npu-missing-module-firmware-v2-1-e860c824d515@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-04eth: fbnic: unlink NAPIs from queues on error to openJakub Kicinski
CI hit a UaF in fbnic in the AF_XDP portion of the queues.py test. The UaF is in the __sk_mark_napi_id_once() call in xsk_bind(), NAPI has been freed. Looks like the device failed to open earlier, and we lack clearing the NAPI pointer from the queue. Fixes: 557d02238e05 ("eth: fbnic: centralize the queue count and NAPI<>queue setting") Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250728163129.117360-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-03Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Significant patch series in this pull request: - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets us closer to being able to remove page->mapping - "relayfs: misc changes" (Jason Xing) does some maintenance and minor feature addition work in relayfs - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches us from static preallocation of the kdump crashkernel's working memory over to dynamic allocation. So the difficulty of a-priori estimation of the second kernel's needs is removed and the first kernel obtains extra memory - "generalize panic_print's dump function to be used by other kernel parts" (Feng Tang) implements some consolidation and rationalization of the various ways in which a failing kernel splats information at the operator * tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits) tools/getdelays: add backward compatibility for taskstats version kho: add test for kexec handover delaytop: enhance error logging and add PSI feature description samples: Kconfig: fix spelling mistake "instancess" -> "instances" fat: fix too many log in fat_chain_add() scripts/spelling.txt: add notifer||notifier to spelling.txt xen/xenbus: fix typo "notifer" net: mvneta: fix typo "notifer" drm/xe: fix typo "notifer" cxl: mce: fix typo "notifer" KVM: x86: fix typo "notifer" MAINTAINERS: add maintainers for delaytop ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() ucount: fix atomic_long_inc_below() argument type kexec: enable CMA based contiguous allocation stackdepot: make max number of pools boot-time configurable lib/xxhash: remove unused functions init/Kconfig: restore CONFIG_BROKEN help text lib/raid6: update recov_rvv.c zero page usage docs: update docs after introducing delaytop ...
2025-08-02net: mvneta: fix typo "notifer"WangYuli
There is a spelling mistake of 'notifer' in the comment which should be 'notifier'. Link: https://lkml.kernel.org/r/0CB4300CB6F49007+20250722073431.21983-4-wangyuli@uniontech.com Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-01net/mlx5: Correctly set gso_segs when LRO is usedChristoph Paasch
When gso_segs is left at 0, a number of assumptions will end up being incorrect throughout the stack. For example, in the GRO-path, we set NAPI_GRO_CB()->count to gso_segs. So, if a non-LRO'ed packet followed by an LRO'ed packet is being processed in GRO, the first one will have NAPI_GRO_CB()->count set to 1 and the next one to 0 (in dev_gro_receive()). Since commit 531d0d32de3e ("net/mlx5: Correctly set gso_size when LRO is used") these packets will get merged (as their gso_size now matches). So, we end up in gro_complete() with NAPI_GRO_CB()->count == 1 and thus don't call inet_gro_complete(). Meaning, checksum-validation in tcp_checksum_complete() will fail with a "hw csum failure". Even before the above mentioned commit, incorrect gso_segs means that other things like TCP's accounting of incoming packets (tp->segs_in, data_segs_in, rcv_ooopack) will be incorrect. Which means that if one does bytes_received/data_segs_in, the result will be bigger than the MTU. Fix this by initializing gso_segs correctly when LRO is used. Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Reported-by: Gal Pressman <gal@nvidia.com> Closes: https://lore.kernel.org/netdev/6583783f-f0fb-4fb1-a415-feec8155bc69@nvidia.com/ Signed-off-by: Christoph Paasch <cpaasch@openai.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250729-mlx5_gso_segs-v1-1-b48c480c1c12@openai.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-01sfc: unfix not-a-typo in commentEdward Cree
Commit fe09560f8241 ("net: Fix typos") removed duplicated word 'fallback', but this was not a typo and change altered the semantic meaning of the comment. Partially revert, using the phrase 'fallback of the fallback' to make the meaning more clear to future readers so that they won't try to change it again. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250731144138.2637949-1-edward.cree@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-01net: airoha: Fix PPE table access in airoha_ppe_debugfs_foe_show()Lorenzo Bianconi
In order to avoid any possible race we need to hold the ppe_lock spinlock accessing the hw PPE table. airoha_ppe_foe_get_entry routine is always executed holding ppe_lock except in airoha_ppe_debugfs_foe_show routine. Fix the problem introducing airoha_ppe_foe_get_entry_locked routine. Fixes: 3fe15c640f380 ("net: airoha: Introduce PPE debugfs support") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250731-airoha_ppe_foe_get_entry_locked-v2-1-50efbd8c0fd6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: - Various minor code cleanups and fixes for hns, iser, cxgb4, hfi1, rxe, erdma, mana_ib - Prefetch supprot for rxe ODP - Remove memory window support from hns as new device FW is no longer support it - Remove qib, it is very old and obsolete now, Cornelis wishes to restructure the hfi1/qib shared layer - Fix a race in destroying CQs where we can still end up with work running because the work is cancled before the driver stops triggering it - Improve interaction with namespaces: * Follow the devlink namespace for newly spawned RDMA devices * Create iopoib net devces in the parent IB device's namespace * Allow CAP_NET_RAW checks to pass in user namespaces - A new flow control scheme for IB MADs to try and avoid queue overflows in the network - Fix 2G message sizes in bnxt_re - Optimize mkey layout for mlx5 DMABUF - New "DMA Handle" concept to allow controlling PCI TPH and steering tags * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (71 commits) RDMA/siw: Change maintainer email address RDMA/mana_ib: add support of multiple ports RDMA/mlx5: Refactor optional counters steering code RDMA/mlx5: Add DMAH support for reg_user_mr/reg_user_dmabuf_mr IB: Extend UVERBS_METHOD_REG_MR to get DMAH RDMA/mlx5: Add DMAH object support RDMA/core: Introduce a DMAH object and its alloc/free APIs IB/core: Add UVERBS_METHOD_REG_MR on the MR object net/mlx5: Add support for device steering tag net/mlx5: Expose IFC bits for TPH PCI/TPH: Expose pcie_tph_get_st_table_size() RDMA/mlx5: Fix incorrect MKEY masking RDMA/mlx5: Fix returned type from _mlx5r_umr_zap_mkey() RDMA/mlx5: remove redundant check on err on return expression RDMA/mana_ib: add additional port counters RDMA/mana_ib: Fix DSCP value in modify QP RDMA/efa: Add CQ with external memory support RDMA/core: Add umem "is_contiguous" and "start_dma_addr" helpers RDMA/uverbs: Add a common way to create CQ with umem RDMA/mlx5: Optimize DMABUF mkey page size ...
2025-07-30net: ti: icss-iep: fix device and OF node leaks at probeJohan Hovold
Make sure to drop the references to the IEP OF node and device taken by of_parse_phandle() and of_find_device_by_node() when looking up IEP devices during probe. Drop the bogus additional reference taken on successful lookup so that the device is released correctly by icss_iep_put(). Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver") Cc: stable@vger.kernel.org # 6.6 Cc: Roger Quadros <rogerq@kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250725171213.880-6-johan@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-30net: mtk_eth_soc: fix device leak at probeJohan Hovold
The reference count to the WED devices has already been incremented when looking them up using of_find_device_by_node() so drop the bogus additional reference taken during probe. Fixes: 804775dfc288 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)") Cc: stable@vger.kernel.org # 5.19 Cc: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250725171213.880-5-johan@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-30net: gianfar: fix device leak when querying time stamp infoJohan Hovold
Make sure to drop the reference to the ptp device taken by of_find_device_by_node() when querying the time stamping capabilities. Note that holding a reference to the ptp device does not prevent its driver data from going away. Fixes: 7349a74ea75c ("net: ethernet: gianfar_ethtool: get phc index through drvdata") Cc: stable@vger.kernel.org # 4.18 Cc: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250725171213.880-4-johan@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>