summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2025-01-06i40e: Remove unused i40e_(read|write)_phy_registerDr. David Alan Gilbert
i40e_read_phy_register() and i40e_write_phy_register() were added in 2016 by commit f62ba91458b5 ("i40e: Add functions which apply correct PHY access method for read and write operation") but haven't been used. Remove them. (There are more specific _clause* variants of these functions that are still used.) Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-4-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06i40e: Remove unused i40e_blink_phy_link_ledDr. David Alan Gilbert
i40e_blink_phy_link_led() was added in 2016 by commit fd077cd3399b ("i40e: Add functions to blink led on 10GBaseT PHY") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-3-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06i40e: Deadcode i40e_aq_*Dr. David Alan Gilbert
i40e_aq_add_mirrorrule(), i40e_aq_delete_mirrorrule() and i40e_aq_set_vsi_vlan_promisc() were added in 2016 by commit 7bd6875bef70 ("i40e: APIs to Add/remove port mirroring rules") but haven't been used. They were the last user of i40e_mirrorrule_op(). i40e_aq_rearrange_nvm() was added in 2018 by commit f05798b4ff82 ("i40e: Add AQ command for rearrange NVM structure") but hasn't been used. i40e_aq_restore_lldp() was added in 2019 by commit c65e78f87f81 ("i40e: Further implementation of LLDP") but hasn't been used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250102173717.200359-2-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04net: libwx: fix firmware mailbox abnormal returnJiawen Wu
The existing SW-FW interaction flow on the driver is wrong. Follow this wrong flow, driver would never return error if there is a unknown command. Since firmware writes back 'firmware ready' and 'unknown command' in the mailbox message if there is an unknown command sent by driver. So reading 'firmware ready' does not timeout. Then driver would mistakenly believe that the interaction has completed successfully. It tends to happen with the use of custom firmware. Move the check for 'unknown command' out of the poll timeout for 'firmware ready'. And adjust the debug log so that mailbox messages are always printed when commands timeout. Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04net: stmmac: TSO: Simplify the code flow of DMA descriptor allocationsFurong Xu
The TCP Segmentation Offload (TSO) engine is an optional function in DWMAC cores, it is implemented for dwmac4 and dwxgmac2 only, ancient dwmac100 and dwmac1000 are not supported by hardware. Current driver code checks priv->dma_cap.tsoen which is read from MAC_HW_Feature1 register to determine if TSO is enabled in hardware configurations, if (!priv->dma_cap.tsoen) driver never sets NETIF_F_TSO for net_device. This patch never affects dwmac100/dwmac1000 and their stmmac_desc_ops: ndesc_ops/enh_desc_ops, since TSO is never supported by them two. The DMA AXI address width of DWMAC cores can be configured to 32-bit/40-bit/48-bit, then the format of DMA transmit descriptors get a little different between 32-bit and 40-bit/48-bit. Current driver code checks priv->dma_cap.addr64 to use certain format with certain configuration. This patch converts the format of DMA transmit descriptors on dwmac4 and dwxgmac2 that the DMA AXI address width is configured to 32-bit (as described by function comments of stmmac_tso_xmit() in current code) to a more generic format (see updated function comments after this patch) which is actually already used on 40-bit/48-bit platforms to provide better compatibility and make code flow cleaner in TSO TX routine. Another interesting finding, struct stmmac_desc_ops is a common abstract interface to maintain descriptors, we should avoid the direct assignment of descriptor members (e.g. desc->des0), stmmac_set_desc_addr() is the proper method yet. This patch tries to improve this by the way. Tested and verified on: DWMAC CORE 5.00a with 32-bit DMA AXI address width DWMAC CORE 5.10a with 32-bit DMA AXI address width DWXGMAC CORE 3.20a with 40-bit DMA AXI address width Signed-off-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/20241220080726.1733837-1-0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.13-rc6). No conflicts. Adjacent changes: include/linux/if_vlan.h f91a5b808938 ("af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK") 3f330db30638 ("net: reformat kdoc return statements") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-03net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_initMeghana Malladi
When ICSSG interfaces are brought down and brought up again, the pru cores are shut down and booted again, flushing out all the memories and start again in a clean state. Hence it is expected that the IEP_CMP_CFG register needs to be flushed during iep_init() to ensure that the existing residual configuration doesn't cause any unusual behavior. If the register is not cleared, existing IEP_CMP_CFG set for CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values. After bringing the interface up, calling PPS enable doesn't work as the driver believes PPS is already enabled, (iep->pps_enabled is not cleared during interface bring down) and driver will just return true even though there is no signal. Fix this by disabling pps and perout. Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-03net: ti: icssg-prueth: Fix firmware load sequence.MD Danish Anwar
Timesync related operations are ran in PRU0 cores for both ICSSG SLICE0 and SLICE1. Currently whenever any ICSSG interface comes up we load the respective firmwares to PRU cores and whenever interface goes down, we stop the resective cores. Due to this, when SLICE0 goes down while SLICE1 is still active, PRU0 firmwares are unloaded and PRU0 core is stopped. This results in clock jump for SLICE1 interface as the timesync related operations are no longer running. As there are interdependencies between SLICE0 and SLICE1 firmwares, fix this by running both PRU0 and PRU1 firmwares as long as at least 1 ICSSG interface is up. Add new flag in prueth struct to check if all firmwares are running and remove the old flag (fw_running). Use emacs_initialized as reference count to load the firmwares for the first and last interface up/down. Moving init_emac_mode and fw_offload_mode API outside of icssg_config to icssg_common_start API as they need to be called only once per firmware boot. Change prueth_emac_restart() to return error code and add error prints inside the caller of this functions in case of any failures. Move prueth_emac_stop() from common to sr1 driver. sr1 and sr2 drivers have different logic handling for stopping the firmwares. While sr1 driver is dependent on emac structure to stop the corresponding pru cores for that slice, for sr2 all the pru cores of both the slices are stopped and is not dependent on emac. So the prueth_emac_stop() function is no longer common and can be moved to sr1 driver. Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Meghana Malladi <m-malladi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-02net: dwmac-imx: add imx93 clock input support in RMII modeMathieu Othacehe
If the rmii_refclk_ext boolean is set, configure the ENET QOS TX_CLK pin direction to input. Otherwise, it defaults to output. That mirrors what is already happening for the imx8mp in the imx8mp_set_intf_mode function. Signed-off-by: Mathieu Othacehe <othacehe@gnu.org> Link: https://patch.msgid.link/20241227095923.4414-1-othacehe@gnu.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-02net: sfc: Correct key_len for efx_tc_ct_zone_ht_paramsLiang Jie
In efx_tc_ct_zone_ht_params, the key_len was previously set to offsetof(struct efx_tc_ct_zone, linkage). This calculation is incorrect because it includes any padding between the zone field and the linkage field due to structure alignment, which can vary between systems. This patch updates key_len to use sizeof_field(struct efx_tc_ct_zone, zone) , ensuring that the hash table correctly uses the zone as the key. This fix prevents potential hash lookup errors and improves connection tracking reliability. Fixes: c3bb5c6acd4e ("sfc: functions to register for conntrack zone offload") Signed-off-by: Liang Jie <liangjie@lixiang.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20241230093709.3226854-1-buaajxlj@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30sky2: Add device ID 11ab:4373 for Marvell 88E8075Pascal Hambourg
A Marvell 88E8075 ethernet controller has this device ID instead of 11ab:4370 and works fine with the sky2 driver. Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/10165a62-99fb-4be6-8c64-84afd6234085@plouf.fr.eu.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30net: mv643xx_eth: fix an OF node reference leakJoe Hattori
Current implementation of mv643xx_eth_shared_of_add_port() calls of_parse_phandle(), but does not release the refcount on error. Call of_node_put() in the error path and in mv643xx_eth_shared_of_remove(). This bug was found by an experimental verification tool that I am developing. Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241221081448.3313163-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeupJoshua Washington
Commit ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI") moved XSK TX processing to be part of the RX NAPI. However, that commit did not include triggering the RX NAPI in gve_xsk_wakeup. This is necessary because the TX NAPI only processes TX completions, meaning that a TX wakeup would not actually trigger XSK descriptor processing. Also, the branch on XDP_WAKEUP_TX was supposed to have been removed, as the NAPI should be scheduled whether the wakeup is for RX or TX. Fixes: ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://patch.msgid.link/20241221032807.302244-1-pkaligineedi@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-30eth: bcmsysport: fix call balance of priv->clk handling routinesVitalii Mordan
Check the return value of clk_prepare_enable to ensure that priv->clk has been successfully enabled. If priv->clk was not enabled during bcm_sysport_probe, bcm_sysport_resume, or bcm_sysport_open, it must not be disabled in any subsequent execution paths. Fixes: 31bc72d97656 ("net: systemport: fetch and use clock resources") Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241227123007.2333397-1-mordan@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: lan969x: add RGMII implementationDaniel Machon
The lan969x switch device includes two RGMII port interfaces (port 28 and 29) supporting data speeds of 1 Gbps, 100 Mbps and 10 Mbps. MAC level delays are configurable through the HSIO_WRAP target, by choosing a phase shift selector, corresponding to a certain time delay in nano seconds. Add new file: lan969x_rgmii.c that contains the implementation for configuring the RGMII port devices. MAC level delays are configured using the "{rx,tx}-internal-delay-ps" properties. These properties must be specified independently of the phy-mode. If missing, or set to zero, the MAC will not apply any delay. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-8-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: lan969x: add RGMII registersDaniel Machon
Configuration of RGMII is done by configuring the GPIO and clock settings in the HSIOWRAP target, and configuring the RGMII port devices in the DEVRGMII target. Both targets contain registers replicated for the number of RGMII port devices, which is two. Add said targets and register macros required to configure RGMII. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-7-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: sparx5: verify RGMII speedsDaniel Machon
When doing a port config, we verify the port speed against the PHY mode and supported speeds of that PHY mode. Add checks for the four RGMII phy modes: RGMII, RGMII_ID, RGMII_TXID and RGMII_RXID. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-6-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: sparx5: only return PCS for modes that require itDaniel Machon
The RGMII ports have no PCS to configure. Make sure we only return the PCS for port modes that require it. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-5-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: sparx5: skip low-speed configuration when port is RGMIIDaniel Machon
When doing a port config, we configure low-speed port devices, among other things. We have a check to ensure, that the device is indeed a low-speed device, an not a high-speed device. Add an additional check, to ensure that the device is not an RGMII device. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-4-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: sparx5: use is_port_rgmii() throughoutDaniel Machon
Now that we can check if a given port is an RGMII port, use it in the following cases: - To set RGMII PHY modes for RGMII port devices. - To avoid checking for a SerDes node in the devicetree, when the port is an RGMII port. - To bail out of sparx5_port_init() when the common configuration is done. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-3-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: sparx5: add function for RGMII port checkDaniel Machon
The lan969x device contains two RGMII port interfaces, sitting at port 28 and 29. Add function: is_port_rgmii() to the match data ops, that checks if a given port is an RGMII port or not. For Sparx5, this function always returns false. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-2-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: sparx5: do some preparation workDaniel Machon
The sparx5_port_init() does initial configuration of a variety of different features and options for each port. Some are shared for all types of devices, some are not. As it is now, common configuration is done after configuration of low-speed devices. This will not work when adding RGMII support in a subsequent patch. In preparation for lan969x RGMII support, move a block of code, that configures 2g5 devices, down. This ensures that the configuration common to all devices is done before configuration of 2g5, 5g, 10g and 25g devices. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Robert Marko <robert.marko@sartura.hr> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241220-sparx5-lan969x-switch-driver-4-v5-1-fa8ba5dff732@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5e: Keep netdev when leave switchdev for devlink set legacy onlyJianbo Liu
In the cited commit, when changing from switchdev to legacy mode, uplink representor's netdev is kept, and its profile is replaced with nic profile, so netdev is detached from old profile, then attach to new profile. During profile change, the hardware resources allocated by the old profile will be cleaned up. However, the cleanup is relying on the related kernel modules. And they may need to flush themselves first, which is triggered by netdev events, for example, NETDEV_UNREGISTER. However, netdev is kept, or netdev_register is called after the cleanup, which may cause troubles because the resources are still referred by kernel modules. The same process applies to all the caes when uplink is leaving switchdev mode, including devlink eswitch mode set legacy, driver unload and devlink reload. For the first one, it can be blocked and returns failure to users, whenever possible. But it's hard for the others. Besides, the attachment to nic profile is unnecessary as the netdev will be unregistered anyway for such cases. So in this patch, the original behavior is kept only for devlink eswitch set mode legacy. For the others, moves netdev unregistration before the profile change. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5e: Skip restore TC rules for vport rep without loaded flagJianbo Liu
During driver unload, unregister_netdev is called after unloading vport rep. So, the mlx5e_rep_priv is already freed while trying to get rpriv->netdev, or walk rpriv->tc_ht, which results in use-after-free. So add the checking to make sure access the data of vport rep which is still loaded. Fixes: d1569537a837 ("net/mlx5e: Modify and restore TC rules for IPSec TX rules") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5e: macsec: Maintain TX SA from encoding_saDragos Tatulea
In MACsec, it is possible to create multiple active TX SAs on a SC, but only one such SA can be used at a time for transmission. This SA is selected through the encoding_sa link parameter. When there are 2 or more active TX SAs configured (encoding_sa=0): ip macsec add macsec0 tx sa 0 pn 1 on key 00 <KEY1> ip macsec add macsec0 tx sa 1 pn 1 on key 00 <KEY2> ... the traffic should be still sent via TX SA 0 as the encoding_sa was not changed. However, the driver ignores the encoding_sa and overrides it to SA 1 by installing the flow steering id of the newly created TX SA into the SCI -> flow steering id hash map. The future packet tx descriptors will point to the incorrect flow steering rule (SA 1). This patch fixes the issue by avoiding the creation of the flow steering rule for an active TX SA that is not the encoding_sa. The driver side tx_sa object and the FW side macsec object are still created. When the encoding_sa link parameter is changed to another active TX SA, only the new flow steering rule will be created in the mlx5e_macsec_upd_txsa() handler. Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: DR, select MSIX vector 0 for completion queue creationShahar Shitrit
When creating a software steering completion queue (CQ), an arbitrary MSIX vector n is selected. This results in the CQ sharing the same Ethernet traffic channel n associated with the chosen vector. However, the value of n is often unpredictable, which can introduce complications for interrupt monitoring and verification tools. Moreover, SW steering uses polling rather than event-driven interrupts. Therefore, there is no need to select any MSIX vector other than the existing vector 0 for CQ creation. In light of these factors, and to enhance predictability, we modify the code to consistently select MSIX vector 0 for CQ creation. Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23Merge branch '10GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ixgbe, ixgbevf: Add support for Intel(R) E610 device Piotr Kwapulinski says: Add initial support for Intel(R) E610 Series of network devices. The E610 is based on X550 but adds firmware managed link, enhanced security capabilities and support for updated server manageability. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ixgbevf: Add support for Intel(R) E610 device PCI: Add PCI_VDEVICE_SUB helper macro ixgbe: Enable link management in E610 device ixgbe: Clean up the E610 link management related code ixgbe: Add ixgbe_x540 multiple header inclusion protection ixgbe: Add support for EEPROM dump in E610 device ixgbe: Add support for NVM handling in E610 device ixgbe: Add link management support for E610 device ixgbe: Add support for E610 device capabilities detection ixgbe: Add support for E610 FW Admin Command Interface ==================== Link: https://patch.msgid.link/20241220201521.3363985-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net: ethernet: ti: am65-cpsw: default to round-robin for host port receiveSiddharth Vadapalli
The Host Port (i.e. CPU facing port) of CPSW receives traffic from Linux via TX DMA Channels which are Hardware Queues consisting of traffic categorized according to their priority. The Host Port is configured to dequeue traffic from these Hardware Queues on the basis of priority i.e. as long as traffic exists on a Hardware Queue of a higher priority, the traffic on Hardware Queues of lower priority isn't dequeued. An alternate operation is also supported wherein traffic can be dequeued by the Host Port in a Round-Robin manner. Until commit under Fixes, the am65-cpsw driver enabled a single TX DMA Channel, due to which, unless modified by user via "ethtool", all traffic from Linux is transmitted on DMA Channel 0. Therefore, configuring the Host Port for priority based dequeuing or Round-Robin operation is identical since there is a single DMA Channel. Since commit under Fixes, all 8 TX DMA Channels are enabled by default. Additionally, the default "tc mapping" doesn't take into account the possibility of different traffic profiles which various users might have. This results in traffic starvation at the Host Port due to the priority based dequeuing which has been enabled by default since the inception of the driver. The traffic starvation triggers NETDEV WATCHDOG timeout for all TX DMA Channels that haven't been serviced due to the presence of traffic on the higher priority TX DMA Channels. Fix this by defaulting to Round-Robin dequeuing at the Host Port, which shall ensure that traffic is dequeued from all TX DMA Channels irrespective of the traffic profile. This will address the NETDEV WATCHDOG timeouts. At the same time, users can still switch from Round-Robin to Priority based dequeuing at the Host Port with the help of the "p0-rx-ptype-rrobin" private flag of "ethtool". Users are expected to setup an appropriate "tc mapping" that suits their traffic profile when switching to priority based dequeuing at the Host Port. Fixes: be397ea3473d ("net: ethernet: am65-cpsw: Set default TX channels to maximum") Cc: <stable@vger.kernel.org> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Link: https://patch.msgid.link/20241220075618.228202-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: support ring channel set while upJakub Kicinski
Implement the channel count changes. Copy the netdev priv, allocate new channels using it. Stop, swap, start. Then free the copy of the priv along with the channels it holds, which are now the channels that used to be on the real priv. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20241220025241.1522781-11-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: support ring channel get and set while downJakub Kicinski
Trivial implementation of ethtool channel get and set. Set is only supported when device is closed, next patch will add code for live reconfig. Asymmetric configurations are supported (combined + extra Tx or Rx), so are configurations with independent IRQs for Rx and Tx. Having all 3 NAPI types (combined, Tx, Rx) is not supported. We used to only call fbnic_reset_indir_tbl() during init. Now that we call it after device had been register must be careful not to override user config. Link: https://patch.msgid.link/20241220025241.1522781-10-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: centralize the queue count and NAPI<>queue settingAlexander Duyck
To simplify dealing with RTNL_ASSERT() requirements further down the line, move setting queue count and NAPI<>queue association to their own helpers. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/20241220025241.1522781-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: add IRQ reuse supportJakub Kicinski
Change our method of swapping NAPIs without disturbing existing config. This is primarily needed for "live reconfiguration" such as changing the channel count when interface is already up. Previously we were planning to use a trick of using shared interrupts. We would install a second IRQ handler for the new NAPI, and make it return IRQ_NONE until we were ready for it to take over. This works fine functionally but breaks IRQ naming. The IRQ subsystem uses the IRQ name to create the procfs entry, since both handlers used the same name the second handler wouldn't get a proc directory registered. When first one gets removed on success full ring count change it would remove its directory and we would be left with none. New approach uses a double pointer to the NAPI. The IRQ handler needs to know how to locate the NAPI to schedule. We register a single IRQ handler and give it a pointer to a pointer. We can then change what it points to without re-registering. This may have a tiny perf impact, but really really negligible. Link: https://patch.msgid.link/20241220025241.1522781-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: store NAPIs in an array instead of the listJakub Kicinski
We will need an array for storing NAPIs in the upcoming IRQ handler reuse rework. Replace the current list we have, so that we are able to reuse it later. In a few places replace i as the iterator with t when we iterate over triads, this seems slightly less confusing than having i, j, k variables. Link: https://patch.msgid.link/20241220025241.1522781-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: let user control the RSS hash fieldsAlexander Duyck
Support setting the fields over which RSS computes its hash. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/20241220025241.1522781-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: support setting RSS configurationAlexander Duyck
Let the user program the RSS indirection table and the RSS key. Straightforward implementation. Track the changes and don't bother poking the HW if user asked for a config identical to what's already programmed. The device only supports Toeplitz hash. Similarly to the GET support - all the real code that does the programming was part of initial driver submission, already. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/20241220025241.1522781-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: don't reset the secondary RSS indir tableJakub Kicinski
Secondary RSS indirection table is for additional contexts. It can / should be initialized when such context is created. Since we don't support creating RSS contexts, yet, this change has no user visible effect. Link: https://patch.msgid.link/20241220025241.1522781-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: support querying RSS configAlexander Duyck
The initial driver submission already added all the RSS state, as part of multi-queue support. Expose the configuration via the ethtool APIs. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20241220025241.1522781-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23eth: fbnic: reorder ethtool codeJakub Kicinski
Define ethtool callback handlers in order in which they are defined in the ops struct. It doesn't really matter what the order is, but it's good to have an order. Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://patch.msgid.link/20241220025241.1522781-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: fs, Add support for RDMA RX steering over IB link layerPatrisious Haddad
Relax the capability check for creating the RDMA RX steering domain by considering only the capabilities reported by the firmware as necessary for its creation, which in turn allows RDMA RX creation over devices with IB link layer as well. The table_miss_action_domain capability is required only for a specific priority, which is handled in mlx5_rdma_enable_roce_steering(). The additional capability check for this case is already in place. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-12-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: Remove PTM support log messageCarolina Jubran
The absence of Precision Time Measurement support should not emit a message, as it can be misleading in contexts where PTM is not required. Remove the log message indicating the lack of PCIe PTM support. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: DR, add support for ConnectX-8 steeringItamar Gozlan
Add support for a new steering format version that is implemented by ConnectX-8. Except for several differences, the STEv3 is identical to STEv2, so for most callbacks STEv3 context struct will call STEv2 functions. Signed-off-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: DR, expand SWS STE callbacks and consolidate common structsItamar Gozlan
Expand SWS STE callbacks to support ConnectX-8 hardware. Move common enums and structures to a shared header file. Signed-off-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: HWS, do not initialize native API queuesYevgeny Kliteynik
HWS has two types of APIs: - Native: fastest and slimmest, async API. The user of this API is required to manage rule handles memory, and to poll for completion for each rule. - BWC: backward compatible API, similar semantics to SWS API. This layer is implemented above native API and it does all the work for the user, so that it is easy to switch between SWS and HWS. Right now the existing users of HWS require only BWC API. Therefore, in order to not waste resources, this patch disables send queues allocation for native API. If in the future support for faster HWS rule insertion will be required (such as for Connection Tracking), native queues can be enabled. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-8-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: HWS, no need to expose mlx5hws_send_queues_open/closeYevgeny Kliteynik
No need to have mlx5hws_send_queues_open/close in header. Make them static and remove from header. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: fs, retry insertion to hash table on EBUSYMark Bloch
When inserting into an rhashtable faster than it can grow, an -EBUSY error may be encountered. Modify the insertion logic to retry on -EBUSY until either a successful insertion or a genuine error is returned. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20241219175841.1094544-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: fs, add mlx5_fs_pool APIMoshe Shemesh
Refactor fc_pool API to create generic fs_pool API, as HW steering has more flow steering elements which can take advantage of the same pool of bulks API. Change fs_counters code to use the fs_pool API. Note, removed __counted_by from struct mlx5_fc_bulk as bulk_len is now inner struct member. It will be added back once __counted_by can support inner struct members. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: fs, add counter object to flow destinationMoshe Shemesh
Currently mlx5_flow_destination includes counter_id which is assigned in case we use flow counter on the flow steering rule. However, counter_id is not enough data in case of using HW Steering. Thus, have mlx5_fc object as part of mlx5_flow_destination instead of counter_id and assign it where needed. In case counter_id is received from user space, create a local counter object to represent it. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: LAG, Support LAG over Multi-Host NICsRongwei Liu
New multi-host NICs provide each host with partial ports, allowing each host to maintain its unique LAG configuration. On these multi-host NICs, the 'native_port_num' capability is no longer continuous on each host and can exceed the 'num_lag_ports' capability. Therefore, it is necessary to skip the PFs with ldev->pf[i].dev == NULL when querying/modifying the lag devices' information. There is no need to check dev.native_port_num against ldev->ports. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23net/mlx5: LAG, Refactor lag logicRongwei Liu
Wrap the lag pf access into two new macros: 1. ldev_for_each() 2. ldev_for_each_reverse() The maximum number of lag ports and the index to `natvie_port_num` mapping will be handled by the two new macros. Users shouldn't use the for loop anymore. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241219175841.1094544-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23sfc: Use netdev refcount tracking in struct efx_async_filter_insertionYiFei Zhu
I was debugging some netdev refcount issues in OpenOnload, and one of the places I was looking at was in the sfc driver. Only struct efx_async_filter_insertion was not using netdev refcount tracker, so add it here. GFP_ATOMIC because this code path is called by ndo_rx_flow_steer which holds RCU. This patch should be a no-op if !CONFIG_NET_DEV_REFCNT_TRACKER Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241219173004.2615655-1-zhuyifei@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>