summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)Author
2020-03-26net: atlantic: add XPN handlingMark Starovoytov
This patch adds XPN handling. Our driver doesn't support XPN, but we should still update a couple of places in the code, because the size of 'next_pn' field has changed. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec offload statistics implementationDmitry Bogdanov
This patch adds support for MACSec statistics on Atlantic network cards. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec offload statistics HW bindingsDmitry Bogdanov
This patch adds the Atlantic HW-specific bindings for MACSec statistics, e.g. register addresses / structs, helper function, etc, which will be used by actual callback implementations. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec ingress offload implementationMark Starovoytov
This patch adds support for MACSec ingress HW offloading on Atlantic network cards. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec ingress offload HW bindingsMark Starovoytov
This patch adds the Atlantic HW-specific bindings for MACSec ingress, e.g. register addresses / structs, helper function, etc, which will be used by actual callback implementations. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec egress offload implementationDmitry Bogdanov
This patch adds support for MACSec egress HW offloading on Atlantic network cards. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec egress offload HW bindingsDmitry Bogdanov
This patch adds the Atlantic HW-specific bindings for MACSec egress, e.g. register addresses / structs, helper function, etc, which will be used by actual callback implementations. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: atlantic: MACSec offload skeletonDmitry Bogdanov
This patch adds basic functionality for MACSec offloading for Atlantic NICs. MACSec offloading functionality is enabled if network card has appropriate FW that has MACSec offloading enabled in config. Actual functionality (ingress, egress, etc) will be added in follow-up patches. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: macsec: report real_dev features when HW offloading is enabledMark Starovoytov
This patch makes real_dev_feature propagation by MACSec offloaded device. Issue description: real_dev features are disabled upon macsec creation. Root cause: Features limitation (specific to SW MACSec limitation) is being applied to HW offloaded case as well. This causes 'set_features' request on the real_dev with reduced feature set due to chain propagation. Proposed solution: Report real_dev features when HW offloading is enabled. NB! MACSec offloaded device does not propagate VLAN offload features at the moment. This can potentially be added later on as a separate patch. Note: this patch requires HW offloading to be enabled by default in order to function properly. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: macsec: add support for getting offloaded statsDmitry Bogdanov
When HW offloading is enabled, offloaded stats should be used, because s/w stats are wrong and out of sync with the HW in this case. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: macsec: support multicast/broadcast when offloadingMark Starovoytov
The idea is simple. If the frame is an exact match for the controlled port (based on DA comparison), then we simply divert this skb to matching port. Multicast/broadcast messages are delivered to all ports. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: macsec: allow multiple macsec devices with offloadDmitry Bogdanov
Offload engine can setup several SecY. Each macsec interface shall have its own mac address. It will filter a traffic by dest mac address. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: macsec: init secy pointer in macsec_contextDmitry Bogdanov
This patch adds secy pointer initialization in the macsec_context. It will be used by MAC drivers in offloading operations. Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: macsec: add support for offloading to the MACAntoine Tenart
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC, allowing a user to select a MAC as a provider for MACsec offloading operations. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driverGrygorii Strashko
The TI AM65x/J721E SoCs Gigabit Ethernet Switch subsystem (CPSW2G NUSS) has two ports - One Ethernet port (port 1) with selectable RGMII and RMII interfaces and an internal Communications Port Programming Interface (CPPI) port (Host port 0) and with ALE in between. It also contains - Management Data Input/Output (MDIO) interface for physical layer device (PHY) management; - Updated Address Lookup Engine (ALE) module; - (TBD) New version of Common platform time sync (CPTS) module. On the TI am65x/J721E SoCs CPSW NUSS Ethernet subsystem into device MCU domain named MCU_CPSW0. Host Port 0 CPPI Packet Streaming Interface interface supports 8 TX channels and one RX channels operating by TI am654 NAVSS Unified DMA Peripheral Root Complex (UDMA-P) controller. Introduced driver provides standard Linux net_device to user space and supports: - ifconfig up/down - MAC address configuration - ethtool operation: --driver --change --register-dump --negotiate phy --statistics --set-eee phy --show-ring --show-channels --set-channels - net_device ioctl mii-control - promisc mode - rx checksum offload for non-fragmented IPv4/IPv6 TCP/UDP packets. The CPSW NUSS can verify IPv4/IPv6 TCP/UDP packets checksum and fills csum information for each packet in psdata[2] word: - BIT(16) CHECKSUM_ERROR - indicates csum error - BIT(17) FRAGMENT - indicates fragmented packet - BIT(18) TCP_UDP_N - Indicates TCP packet was detected - BIT(19) IPV6_VALID, BIT(20) IPV4_VALID - indicates IPv6/IPv4 packet - BIT(15, 0) CHECKSUM_ADD - This is the value that was summed during the checksum computation. This value is FFFFh for non fragmented IPV4/6 UDP/TCP packets with no checksum error. RX csum offload can be disabled: ethtool -K <dev> rx-checksum on|off - tx checksum offload support for IPv4/IPv6 TCP/UDP packets (J721E only). TX csum HW offload can be enabled/disabled: ethtool -K <dev> tx-checksum-ip-generic on|off - multiq and switch between round robin/prio modes for cppi tx queues by using Netdev private flag "p0-rx-ptype-rrobin" to switch between Round Robin and Fixed priority modes: # ethtool --show-priv-flags eth0 Private flags for eth0: p0-rx-ptype-rrobin: on # ethtool --set-priv-flags eth0 p0-rx-ptype-rrobin off Number of TX DMA channels can be changed using "ethtool -L eth0 tx <N>". - GRO support: the napi_gro_receive() and napi_complete_done() are used. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: ethernet: ti: ale: am65: add support for default thread cfgGrygorii Strashko
Add support for default thread configuration for AM65x CPSW NUSS ALE to allow route all ingress packets to one default RX UDMA flow. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: ethernet: ti: ale: add support for mac-only modeGrygorii Strashko
The new CPSW ALE version, available on TI K3 AM654/J721E SoCs family, allows to switch any external port to MAC only mode. When MAC only mode enabled this port be treated like a MAC port for the host. All traffic received is only sent to the host. The host must direct traffic to this port as the lookup engine will not send traffic to the ports with the p0_maconly bit set and the p0_no_learn also set. If p0_maconly bit is set and the p0_no_learn is not set, the host can send non-directed packets that can be sent to the destination of a MacOnly port. It is also possible that The host can broadcast to all ports including MacOnly ports in this mode. This patch add ALE supprt for MAC only mode. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and ↵Grygorii Strashko
allmulti disabled On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS the unregistered multicast packets are still can be received with promisc and allmulti disabled. This happens, because ALE VLAN entries on these SoCs do not contain port masks for reg/unreg mcast packets, but instead store indexes of ALE_VLAN_MASK_MUXx_REG registers which intended for store port masks for reg/unreg mcast packets. ALE VLAN entry:UNREG_MCAST_FLOOD_INDEX -> ALE_VLAN_MASK_MUXx ALE VLAN entry:REG_MCAST_FLOOD_INDEX -> ALE_VLAN_MASK_MUXy The commit b361da837392 ("net: netcp: ale: add proper ale entry mask bits for netcp switch ALE") update ALE code to support such ALE entries, it is always used ALE_VLAN_MASK_MUX0_REG index in ALE VLAN entry for unreg mcast packets mask configuration, which is read-only, at least for AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS. As result unreg mcast packets are allowed always. Hence, update ALE code to use ALE_VLAN_MASK_MUX1_REG index for ALE VLAN entries to configure unreg mcast port mask. Fixes: b361da837392 ("net: netcp: ale: add proper ale entry mask bits for netcp switch ALE") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26phy: ti: gmii-sel: simplify config dependencies between net drivers and gmii phyGrygorii Strashko
The phy-gmii-sel can be only auto selected in Kconfig and now the pretty complex Kconfig dependencies are defined for phy-gmii-sel driver, which also need to be updated every time phy-gmii-sel is re-used for any new networking driver. Simplify Kconfig definition for phy-gmii-sel PHY driver - drop all dependencies and from networking drivers and rely on using 'imply PHY_TI_GMII_SEL' in Kconfig definitions for networking drivers instead. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: phy: add marvell usb to mdio controllerTobias Waldekranz
An MDIO controller present on development boards for Marvell switches from the Link Street (88E6xxx) family. Using this module, you can use the following setup as a development platform for switchdev and DSA related work. .-------. .-----------------. | USB----USB | | SoC | | 88E6390X-DB ETH1-10 | ETH----ETH0 | '-------' '-----------------' Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26net: phy: probe PHY drivers synchronouslyHeiner Kallweit
If we have scenarios like mdiobus_register() -> loads PHY driver module(s) -> registers PHY driver(s) -> may schedule async probe phydev = mdiobus_get_phy() <phydev action involving PHY driver> or phydev = phy_device_create() -> loads PHY driver module -> registers PHY driver -> may schedule async probe <phydev action involving PHY driver> then we expect the PHY driver to be bound to the phydev when triggering the action. This may not be the case in case of asynchronous probing. Therefore ensure that PHY drivers are probed synchronously. Default still is sync probing, except async probing is explicitly requested. I saw some comments that the intention is to promote async probing for more parallelism in boot process and want to be prepared for the case that the default is changed to async probing. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26ice: add a devlink region for dumping NVM contentsJacob Keller
Add a devlink region for exposing the device's Non Volatime Memory flash contents. Support the recently added .snapshot operation, enabling userspace to request a snapshot of the NVM contents via DEVLINK_CMD_REGION_NEW. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26netdevsim: support taking immediate snapshot via devlinkJacob Keller
Implement the .snapshot region operation for the dummy data region. This enables a region snapshot to be taken upon request via the new DEVLINK_CMD_REGION_SNAPSHOT command. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: track snapshot id usage count using an xarrayJacob Keller
Each snapshot created for a devlink region must have an id. These ids are supposed to be unique per "event" that caused the snapshot to be created. Drivers call devlink_region_snapshot_id_get to obtain a new id to use for a new event trigger. The id values are tracked per devlink, so that the same id number can be used if a triggering event creates multiple snapshots on different regions. There is no mechanism for snapshot ids to ever be reused. Introduce an xarray to store the count of how many snapshots are using a given id, replacing the snapshot_id field previously used for picking the next id. The devlink_region_snapshot_id_get() function will use xa_alloc to insert an initial value of 1 value at an available slot between 0 and U32_MAX. The new __devlink_snapshot_id_increment() and __devlink_snapshot_id_decrement() functions will be used to track how many snapshots currently use an id. Drivers must now call devlink_snapshot_id_put() in order to release their reference of the snapshot id after adding region snapshots. By tracking the total number of snapshots using a given id, it is possible for the decrement() function to erase the id from the xarray when it is not in use. With this method, a snapshot id can become reused again once all snapshots that referred to it have been deleted via DEVLINK_CMD_REGION_DEL, and the driver has finished adding snapshots. This work also paves the way to introduce a mechanism for userspace to request a snapshot. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: report error once U32_MAX snapshot ids have been usedJacob Keller
The devlink_snapshot_id_get() function returns a snapshot id. The snapshot id is a u32, so there is no way to indicate an error code. A future change is going to possibly add additional cases where this function could fail. Refactor the function to return the snapshot id in an argument, so that it can return zero or an error value. This ensures that snapshot ids cannot be confused with error values, and aids in the future refactor of snapshot id allocation management. Because there is no current way to release previously used snapshot ids, add a simple check ensuring that an error is reported in case the snapshot_id would over flow. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: convert snapshot destructor callback to region opJacob Keller
It does not makes sense that two snapshots for a given region would use different destructors. Simplify snapshot creation by adding a .destructor op for regions. This operation will replace the data_destructor for the snapshot creation, and makes snapshot creation easier. Noticed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26devlink: prepare to support region operationsJacob Keller
Modify the devlink region code in preparation for adding new operations on regions. Create a devlink_region_ops structure, and move the name pointer from within the devlink_region structure into the ops structure (similar to the devlink_health_reporter_ops). This prepares the regions to enable support of additional operations in the future such as requesting snapshots, or accessing the region directly without a snapshot. In order to re-use the constant strings in the mlx4 driver their declaration must be changed to 'const char * const' to ensure the compiler realizes that both the data and the pointer cannot change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26veth: rely on peer veth_rq for ndo_xdp_xmit accountingLorenzo Bianconi
Rely on 'remote' veth_rq to account ndo_xdp_xmit ethtool counters. Move XDP_TX accounting to veth_xdp_flush_bq routine. Remove 'rx' prefix in rx xdp ethool counters Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26veth: rely on veth_rq in veth_xdp_flush_bq signatureLorenzo Bianconi
Substitute net_device point with veth_rq one in veth_xdp_flush_bq, veth_xdp_flush and veth_xdp_tx signature. This is a preliminary patch to account xdp_xmit counter on 'receiving' veth_rq Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26sfc: falcon: convert to use i2c_new_client_device()Wolfram Sang
Move away from the deprecated API and return the shiny new ERRPTR where useful. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26igb: convert to use i2c_new_client_device()Wolfram Sang
Move away from the deprecated API and return the shiny new ERRPTR where useful. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26mlxsw: spectrum_flower: Offload FLOW_ACTION_MANGLEPetr Machata
Offload action pedit ex munge when used with a flower classifier. Only allow setting of DSCP, ECN, or the whole DSField in IPv4 and IPv6 packets. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26mlxsw: core: Add DSCP, ECN, dscp_rw to QOS_ACTIONPetr Machata
The QOS_ACTION is used for manipulating the QOS attributes of the packet. Add the defines and helpers related to DSCP and ECN fields, and dscp_rw. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26mlxsw: core: Rename mlxsw_afa_qos_cmd to mlxsw_afa_qos_switch_prio_cmdPetr Machata
The original idea was to reuse this set of actions for ECN rewrite as well, but on second look, it's not such a great idea. These two items should each have its own command. Rename the existing enum to make it obvious that it belongs to switch_prio_cmd. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26Merge tag 'mlx5-updates-2020-03-25' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-03-25 1) Cleanups from Dan Carpenter and wenxu. 2) Paul and Roi, Some minor updates and fixes to E-Switch to address issues introduced in the previous reg_c0 updates series. 3) Eli Cohen simplifies and improves flow steering matching group searches and flow table entries version management. 4) Parav Pandit, improves devlink eswitch mode changes thread safety. By making devlink rely on driver for thread safety and introducing mlx5 eswitch mode change protection. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-26atl2: remove unused variable 'atl2_driver_string'YueHaibing
drivers/net/ethernet/atheros/atlx/atl2.c:40:19: warning: ‘atl2_driver_string’ defined but not used [-Wunused-const-variable=] static const char atl2_driver_string[] = "Atheros(R) L2 Ethernet Driver"; ^~~~~~~~~~~~~~~~~~ commit ea973742140b ("net/atheros: Clean atheros code from driver version") left behind this, remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25net/mlx5: E-switch, Protect eswitch mode changesParav Pandit
Currently eswitch mode change is occurring from 2 different execution contexts as below. 1. sriov sysfs enable/disable 2. devlink eswitch set commands Both of them need to access eswitch related data structures in synchronized manner. Without any synchronization below race condition exist. SR-IOV enable/disable with devlink eswitch mode change: cpu-0 cpu-1 ----- ----- mlx5_device_disable_sriov() mlx5_devlink_eswitch_mode_set() mlx5_eswitch_disable() esw_offloads_stop() esw_offloads_disable() mlx5_eswitch_disable() esw_offloads_disable() Hence, they are synchronized using a new mode_lock. eswitch's state_lock is not used as it can lead to a deadlock scenario below and state_lock is only for vport and fdb exclusive access. ip link set vf <param> netlink rcv_msg() - Lock A rtnl_lock vfinfo() esw->state_lock() - Lock B devlink eswitch_set devlink_mutex esw->state_lock() - Lock B attach_netdev() register_netdev() rtnl_lock - Lock A Alternatives considered: 1. Acquiring rtnl lock before taking esw->state_lock to follow similar locking sequence as ip link flow during eswitch mode set. rtnl lock is not good idea for two reasons. (a) Holding rtnl lock for several hundred device commands is not good idea. (b) It leads to below and more similar deadlocks. devlink eswitch_set devlink_mutex rtnl_lock - Lock A esw->state_lock() - Lock B eswitch_disable() reload() ib_register_device() ib_cache_setup_one() rtnl_lock() 2. Exporting devlink lock may lead to undesired use of it in vendor driver(s) in future. 3. Unloading representors outside of the mode_lock requires serialization with other process trying to enable the eswitch. 4. Differing the representors life cycle to a different workqueue requires synchronization with func_change_handler workqueue. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-switch, Extend eswitch enable to handle num_vfs changeParav Pandit
Subsequent patch protects eswitch mode changes across sriov and devlink interfaces. It is desirable for eswitch to provide thread safe eswitch enable and disable APIs. Hence, extend eswitch enable API to optionally update num_vfs when requested. In subsequent patch, eswitch num_vfs are updated after all the eswitch users eswitch drops its reference count. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Split eswitch mode check to different helper functionParav Pandit
In order to check eswitch state under a lock, prepare code to split capability check and eswitch state check into two helper functions. Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Simplify mlx5_unload_one() and its callersParav Pandit
mlx5_unload_one() always returns 0. Simplify callers of mlx5_unload_one() and remove the dead code. Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Simplify mlx5_register_device to return voidParav Pandit
mlx5_register_device() doesn't check for any error and always returns 0. Simplify mlx5_register_device() to return void and its caller, remove dead code related to it. Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Avoid group version scan when not necessaryEli Cohen
Group version is used when modifying a rule is allowed (FLOW_ACT_NO_APPEND is clear) to detect a case where the rule was found but while the groups where unlocked a new FTE was added. In this case, the added FTE could be one with identical match value so we need to attempt again with group lock held. Change the code so version is retrieved only when FLOW_ACT_NO_APPEND is cleared. As result, later compare can also be avoided if FLOW_ACT_NO_APPEND is cleared. Also improve comments text. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Avoid incrementing FTE versionEli Cohen
FTE version is not used anywhere in the code so avoid incrementing it. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Fix group version managementEli Cohen
When adding a rule to a flow group we need increment the version of the group. Current code fails to do that and as a result, when trying to add a rule, we will fail to discover a case where an FTE with the same match value was added while we scanned the groups of the same match criteria, thus we may try to add an FTE that was already added. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: Simplify matching group searchesEli Cohen
Instead of using two different structs for searching groups with the same match, use a single struct and thus simplify the code, make it more readable and smaller size which means less code cache misses. text data bss dec hex before: 35524 2744 0 38268 957c after: 35038 2744 0 37782 9396 When testing add 70000 rules, delete all the rules, and repeat three times taking the average, we get (time in seconds): Before the change: insert 16.80, delete 11.02 After the change: insert 16.55, delete 10.95 Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, Use correct type for chain, prio and level valuesRoi Dayan
The correct type is u32. Fixes: d18296ffd9cc ("net/mlx5: E-Switch, Introduce global tables") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, free flow_group_in after creating the restore tableRoi Dayan
We allocate a temporary memory but forget to free it. Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, Enable chains only if regs loopback is enabledPaul Blakey
Register c0 loopback is needed to fully support chains and prios. Enable chains and prio only if loopback (of reg c1 which came together with c0), is enabled. To be able to check that, move enabling of loopback before eswitch chains init. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5: E-Switch, Enable restore table only if reg_c1 is supportedPaul Blakey
Reg c0/c1 matching, rewrite of regs c0/c1, and copy header of regs c1,B is needed for the restore table to function, might not be supported by firmware, and creation of the restore table or the copy header will fail. Check reg_c1 loopback support, as firmware which supports this, should have all of the above. Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25net/mlx5e: remove duplicated check chain_index in mlx5e_rep_setup_ft_cbwenxu
The function mlx5e_rep_setup_ft_cb check chain_index is zero twice. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>