summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlxsw
AgeCommit message (Collapse)Author
2022-06-22mlxsw: pci: Query resources before and after issuing 'CONFIG_PROFILE' commandAmit Cohen
Currently, as part of mlxsw_pci_init(), resources are queried from firmware before issuing the 'CONFIG_PROFILE' command. There are resources whose size depend on the enablement of the unified bridge model that is performed via 'CONFIG_PROFILE' command. As a preparation for unified bridge model, add an additional query after issuing this command. Both queries are required as KVD sizes are read from firmware and then are configured as part of 'CONFIG_PROFILE' command. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: cmd: Increase 'config_profile.flood_mode' lengthAmit Cohen
Currently, the length of 'config_profile.flood_mode' is defined as 2 bits, while the correct length is 3 bits. As preparation for unified bridge model, which will use the whole field length, fix it and increase the field to the correct size. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: Add enumerator for 'config_profile.flood_mode'Amit Cohen
Currently magic constant is used for setting 'flood_mode' as part of config profile. As preparation for unified bridge model, which will require different 'flood_mode', add a dedicated enumerator for this field and use it as part of 'struct mlxsw_config_profile'. Add the relevant value for unified bridge model. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: spectrum_switchdev: Handle error in mlxsw_sp_bridge_mdb_mc_enable_sync()Amit Cohen
The above mentioned function calls two functions which return values, but it ignores them. Check if the return value is an error, handle it in such case and return an error back to the caller. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: spectrum_switchdev: Convert mlxsw_sp_mc_write_mdb_entry() to return intAmit Cohen
The function returns 'false' upon failure and 'true' upon success. Convert it to the usual scheme of returning integer error codes and align the callers. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: spectrum_switchdev: Add error path in mlxsw_sp_port_mc_disabled_set()Amit Cohen
The above mentioned function just returns an error in case that mlxsw_sp_bridge_ports_flood_table_set() fails. That means that the previous configurations are not cleaned. Fix it by adding error path to clean the configurations in case of error. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: spectrum_switchdev: Simplify mlxsw_sp_port_mc_disabled_set()Amit Cohen
The above mentioned function is called for each port in bridge when the multicast behavior is changed. Currently, it updates all the relevant MID entries only for the first call, but the update of flooding per port is done only for the port that it was called for it. To simplify this behavior, it is possible to handle flooding for all the ports in the first call, then, the other calls will do nothing. For that, new functions are required to set flooding for all ports, no 'struct mlxsw_sp_port' is required. This issue was found while extending this function for unified bridge model, the above mentioned change will ease the extending later. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: spectrum_switchdev: Do not set 'multicast_enabled' twiceAmit Cohen
The function mlxsw_sp_port_mc_disabled_set() sets 'bridge_device->multicast_enabled' twice. Remove the unnecessary setting. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: spectrum_switchdev: Pass 'struct mlxsw_sp' to ↵Amit Cohen
mlxsw_sp_bridge_mdb_mc_enable_sync() Currently the above mentioned function gets 'struct mlxsw_sp_port', while the only use of this structure is to get 'mlxsw_sp_port->mlxsw_sp'. Instead, pass 'struct mlxsw_sp'. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-22mlxsw: Remove lag_vid_valid indicationAmit Cohen
Currently 'struct mlxsw_sp_fid_family' has a field which indicates if 'lag_vid' is valid for use in SFD register. This is a leftover from using .1Q FIDs instead of emulating them using .1D FIDs. Currently when .1Q FIDs are emulated using .1D FIDs, this field is true for both families, so there is no reason to maintain it. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add support for VLAN RIF as part of RITR registerAmit Cohen
Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of "VLAN" type, whereas RIFs constructed on top of VLAN-unaware bridges of "FID" type. In other words, the RIF type is derived from the underlying FID type. VLAN RIFs are used on top of 802.1Q FIDs, whereas FID RIFs are used on top of 802.1D FIDs. Currently 802.1Q FIDs are emulated using 802.1D FIDs, and therefore VLAN RIFs are emulated using FID RIFs. As part of converting the driver to use unified bridge, 802.1Q FIDs and VLAN RIFs will be used. Add the relevant fields to RITR register, add pack() function for VLAN RIF and rename one field to fit the internal name. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: Add support for egress FID classification after decapsulationAmit Cohen
As preparation for unified bridge model, add support for VNI->FID mapping via SVFA register. When performing VXLAN encapsulation, the VXLAN header needs to contain a VNI. This VNI is derived from the FID classification performed on ingress, through which the ingress RIF is also determined. Similarly, when performing VXLAN decapsulation, the FID of the packet needs to be determined. This FID is derived from VNI classification performed during decapsulation. In the old model, both entries (i.e., FID->VNI and VNI->FID) were configured via SFMR.vni. In the new model, where ingress is separated from egress, ingress configuration (VNI->FID) is performed via SVFA, while SFMR only configures egress (FID->VNI). Add 'vni' field to SVFA, add new mapping table - VNI to FID, add new pack() function for VNI mapping and edit the comment in SFMR. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add egress FID field to RITR registerAmit Cohen
RITR configures the router interface table. As preparation for unified bridge model, add egress FID field to RITR. After routing, a packet has to perform a layer-2 lookup using the destination MAC it got from the routing and a FID. In the new model, the egress FID is configured by RITR for both sub-port and FID RIFs. Add 'efid' field to sub-port router interface and update FID router interface related comment. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add Router Egress Interface to VID RegisterAmit Cohen
The REIV maps {egress router interface (eRIF), egress_port} -> {vlan ID}. As preparation for unified bridge model, add REIV register for future use. In the past, firmware would take care of the above mentioned mapping, but in the new model this should be done by software using REIV register. REIV register supports a simultaneous update of 256 ports using 'port_page' field. When 'port_page'=0 the records represent ports 0-255, when 'port_page'=1 the records represent ports 256-511 and so on. The register is reserved while using the legacy model. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Replace MID related fields in SFGC registerAmit Cohen
SFGC register maps {packet type, bridge type} -> {MID base, table type}. As preparation for unified bridge model, remove 'mid' field and add 'mid_base' field. The MID index (index to PGT table which maps MID to local port list and SMPE index) is a result of 'mid_base' + 'fid_offset'. Using the legacy bridge model, firmware configures 'mid_base'. However, using the new model, software is responsible to configure it via SFGC register. The 'mid_base' is configured per {packet type, bridge type}, for example, for {Unicast, .1Q}, {Broadcast, .1D}. Add the field 'mid_base' to SFGC register and increase the length of the register accordingly. Remove the field 'mid' as currently it is ignored by the device, its use is an old leftover. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add flood related field to SFMR registerAmit Cohen
SFMR register creates and configures FIDs. As preparation for unified bridge model, add a required field for future use. The PGT (Port Group) table maps multicast ID (MID) to {local port list, SMPE index} on Spectrum-1 and to {local port list} on the other ASICs. In the legacy model, software did not interact with this table directly. Instead, it was accessed by firmware in response to registers such as SFTR and SMID. In the new model, the SFTR register is deprecated and software has full control over the PGT table using the SMID register. The configuration of MDB entries (using SFD) is unchanged, but flooding configuration is completely different. SFGC register maps {packet type, bridge type} -> {MID base, table type}, then with FID and FID-offset which are configured via SFMR, the MID index is obtained. Add the field 'flood_bridge_type' to SFMR, software can separate between 802.1q FIDs and vFIDs using two types which are supported. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add VID related fields to SFD registerAmit Cohen
SFD register configures FDB table. As preparation for unified bridge model, add some required fields for future use. In the new model, firmware no longer configures the egress VID, this responsibility is moved to software. For layer 2 this means that software needs to determine the egress VID for both unicast and multicast. For unicast FDB records and unicast LAG FDB records, the VID needs to be set via new fields in SFD - 'set_vid' and 'vid'. Add the two mentioned fields for future use. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add SMPE related fields to SFMR registerAmit Cohen
SFMR register creates and configures FIDs. As preparation unified bridge model, add some required fields for future use. The device includes two main tables to support layer 2 multicast (i.e., MDB and flooding). These are the PGT (Port Group Table) and the MPE (Multicast Port Egress) table. - PGT is {MID -> (bitmap of local_port, SPME index)} - MPE is {(Local port, SMPE index) -> eVID} In Spectrum-2 and later ASICs, the SMPE index is an attribute of the FID and programmed via new fields in SFMR register - 'smpe_valid' and 'smpe'. Add the two mentioned fields for future use and increase the length of the register accordingly. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: Add SMPE related fields to SMID2 registerAmit Cohen
SMID register maps multicast ID (MID) into a list of local ports. As preparation for unified bridge model, add some required fields for future use. The device includes two main tables to support layer 2 multicast (i.e., MDB and flooding). These are the PGT (Port Group Table) and the MPE (Multicast Port Egress) table. - PGT is {MID -> (bitmap of local_port, SPME index)} - MPE is {(Local port, SMPE index) -> eVID} In Spectrum-1, both indexes into the MPE table (local port and SMPE) are derived from the PGT table. Therefore, the SMPE index needs to be programmed as part of the PGT entry via new fields in SMID - 'smpe_valid' and 'smpe'. Add the two mentioned fields for future use and align the callers of mlxsw_reg_smid2_pack() to pass zeros for SMPE fields. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add Switch Multicast Port to Egress VID RegisterAmit Cohen
The SMPE register maps {egress_port, SMPE index} -> VID. The device includes two main tables to support layer 2 multicast (i.e., MDB and flooding). These are the PGT (Port Group Table) and the MPE (Multicast Port Egress) table. - PGT is {MID -> (bitmap of local_port, SPME index)} - MPE is {(Local port, SMPE index) -> eVID} In Spectrum-1, the index into the MPE table - called switch multicast to port egress VID (SMPE) - is derived from the PGT entry, whereas in Spectrum-2 and later ASICs it is derived from the FID. In the legacy model, software did not interact with this table as it was completely hidden in firmware. In the new model, software needs to populate the table itself in order to map from {Local port, SMPE index} to an egress VID. This is done using the SMPE register. Add the register for future use. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add ingress RIF related fields to SVFA registerAmit Cohen
SVFA register controls the VID to FID mapping and {Port, VID} to FID mapping for virtualized ports. As preparation for unified bridge model, add some required fields for future use. On ingress, after ingress ACL, a packet needs to be classified to a FID. The key for this lookup can be one of: 1. VID. When port is not in virtual mode. 2. {RQ, VID}. When port is in virtual mode. 3. FID. When FID was set by ingress ACL. Since RITR no longer performs ingress configuration, the ingress RIF for the first two entry types needs to be set via new fields in SVFA - 'irif_v' and 'irif'. Add the two mentioned fields for future use and increase the length of the register accordingly. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add ingress RIF related fields to SFMR registerAmit Cohen
SFMR register creates and configures FIDs. As preparation for unified bridge model, add some required fields for future use. On ingress, after ingress ACL, a packet needs to be classified to a FID. The key for this lookup can be one of: 1. VID. When port is not in virtual mode. 2. {RQ, VID}. When port is in virtual mode. 3. FID. When FID was set by ingress ACL. For example, via VR_AND_FID_ACTION. Since RITR no longer performs ingress configuration, the ingress RIF for the last entry type needs to be set via new fields in SFMR - 'irif_v' and 'irif'. Add the two mentioned fields for future use. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20mlxsw: reg: Add 'flood_rsp' field to SFMR registerAmit Cohen
SFMR register creates and configures FIDs. As preparation for unified bridge model, add a field for future use. In the new model, RITR no longer configures the rFID used for sub-port RIFs and it has to be created by software via SFMR. Such FIDs need to be created with special flood indication using 'flood_rsp' field. When set, this bit instructs the device to manage the flooding entries for this FID in a reserved part of the port group table (PGT). Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17mlxsw: Add a resource describing number of RIFsPetr Machata
The Spectrum ASIC has a limit on how many L3 devices (called RIFs) can be created. The limit depends on the ASIC and FW revision, and mlxsw reads it from the FW. In order to communicate both the number of RIFs that there can be, and how many are taken now (i.e. occupancy), introduce a corresponding devlink resource. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17mlxsw: Keep track of number of allocated RIFsPetr Machata
In order to expose number of RIFs as a resource, it is going to be handy to have the number of currently-allocated RIFs as a single number. Introduce such. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-17mlxsw: Trap ARP packets at layer 3 instead of layer 2Amit Cohen
Currently, the traps 'ARP_REQUEST' and 'ARP_RESPONSE' occur at layer 2. To allow the packets to be flooded, they are configured with the action 'MIRROR_TO_CPU' which means that the CPU receives a replica of the packet. Today, Spectrum ASICs also support trapping ARP packets at layer 3. This behavior is better, then the packets can just be trapped and there is no need to mirror them. An additional motivation is that using the traps at layer 2, the ARP packets are dropped in the router as they do not have an IP header, then they are counted as error packets, which might confuse users. Add the relevant traps for layer 3 and use them instead of the existing traps. There is no visible change to user space. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14mlxsw: Revert "Prepare for XM implementation - LPM trees"Petr Machata
This reverts commit 923ba95ea22d ("Merge branch 'mlxsw-spectrum-prepare-for-xm-implementation-lpm-trees'"). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14mlxsw: Revert "Prepare for XM implementation - prefix insertion and removal"Petr Machata
This reverts commit e7086213f7b4 ("Merge branch 'mlxsw-spectrum-prepare-for-xm-implementation-prefix-insertion-and-removal'"). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14mlxsw: Revert "Introduce initial XM router support"Petr Machata
This reverts commit 75c2a8fe8e39 ("Merge branch 'mlxsw-introduce-initial-xm-router-support'"). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14mlxsw: spectrum_cnt: Reorder counter poolsPetr Machata
Both RIF and ACL flow counters use a 24-bit SW-managed counter address to communicate which counter they want to bind. In a number of Spectrum FW releases, binding a RIF counter is broken and slices the counter index to 16 bits. As a result, on Spectrum-2 and above, no more than about 410 RIF counters can be effectively used. This translates to 205 netdevices for which L3 HW stats can be enabled. (This does not happen on Spectrum-1, because there are fewer counters available overall and the counter index never exceeds 16 bits.) Binding counters to ACLs does not have this issue. Therefore reorder the counter allocation scheme so that RIF counters come first and therefore get lower indices that are below the 16-bit barrier. Fixes: 98e60dce4da1 ("Merge branch 'mlxsw-Introduce-initial-Spectrum-2-support'") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220613125017.2018162-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Build issue in drivers/net/ethernet/sfc/ptp.c 54fccfdd7c66 ("sfc: efx_default_channel_type APIs can be static") 49e6123c65da ("net: sfc: fix memory leak due to ptp channel") https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12mlxsw: Avoid warning during ip6gre device removalAmit Cohen
IPv6 addresses which are used for tunnels are stored in a hash table with reference counting. When a new GRE tunnel is configured, the driver is notified and configures it in hardware. Currently, any change in the tunnel is not applied in the driver. It means that if the remote address is changed, the driver is not aware of this change and the first address will be used. This behavior results in a warning [1] in scenarios such as the following: # ip link add name gre1 type ip6gre local 2000::3 remote 2000::fffe tos inherit ttl inherit # ip link set name gre1 type ip6gre local 2000::3 remote 2000::ffff ttl inherit # ip link delete gre1 The change of the address is not applied in the driver. Currently, the driver uses the remote address which is stored in the 'parms' of the overlay device. When the tunnel is removed, the new IPv6 address is used, the driver tries to release it, but as it is not aware of the change, this address is not configured and it warns about releasing non existing IPv6 address. Fix it by using the IPv6 address which is cached in the IPIP entry, this address is the last one that the driver used, so even in cases such the above, the first address will be released, without any warning. [1]: WARNING: CPU: 1 PID: 2197 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2920 mlxsw_sp_ipv6_addr_put+0x146/0x220 [mlxsw_spectrum] ... CPU: 1 PID: 2197 Comm: ip Not tainted 5.17.0-rc8-custom-95062-gc1e5ded51a9a #84 Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021 RIP: 0010:mlxsw_sp_ipv6_addr_put+0x146/0x220 [mlxsw_spectrum] ... Call Trace: <TASK> mlxsw_sp2_ipip_rem_addr_unset_gre6+0xf1/0x120 [mlxsw_spectrum] mlxsw_sp_netdevice_ipip_ol_event+0xdb/0x640 [mlxsw_spectrum] mlxsw_sp_netdevice_event+0xc4/0x850 [mlxsw_spectrum] raw_notifier_call_chain+0x3c/0x50 call_netdevice_notifiers_info+0x2f/0x80 unregister_netdevice_many+0x311/0x6d0 rtnl_dellink+0x136/0x360 rtnetlink_rcv_msg+0x12f/0x380 netlink_rcv_skb+0x49/0xf0 netlink_unicast+0x233/0x340 netlink_sendmsg+0x202/0x440 ____sys_sendmsg+0x1f3/0x220 ___sys_sendmsg+0x70/0xb0 __sys_sendmsg+0x54/0xa0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: e846efe2737b ("mlxsw: spectrum: Add hash table for IPv6 address mapping") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220511115747.238602-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-08mlxsw: spectrum_router: Take router lock in router notifier handlerPetr Machata
For notifications that the router needs to handle, router lock is taken. Further, at least to determine whether an event is related to a tunnel underlay, router lock also needs to be taken. Due to this, the router lock is always taken for each unhandled event, and also for some handled events, even if they are not related to underlay. Thus each event implies at least one router lock, sometimes two. Instead of deferring the locking to the leaf handlers, take the lock in the router notifier handler always. This simplifies thinking about the locking state, and in some cases saves one lock cycle. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Update a commentPetr Machata
The position of netdevice notifier registration no longer depends on the router initialization, because the event handler no longer dispatches to the router code. Update the comment at the registration to that effect. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of tunnel events to router codePetr Machata
The events related to IPIP tunnels are handled by the router code. Move the handling from the central dispatcher in spectrum.c to the new notifier handler in the router module. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of router events to router codePetr Machata
The events NETDEV_PRE_CHANGEADDR, NETDEV_CHANGEADDR and NETDEV_CHANGEMTU have implications for in-ASIC router interface objects, and as such are handled in the router module. Move the handling from the central dispatcher in spectrum.c to the new notifier handler in the router module. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of HW stats events to router codePetr Machata
L3 HW stats are implemented in mlxsw as RIF counters, and therefore the code resides in spectrum_router. Exclude the offload xstats events from the mlxsw_sp_netdevice_event_is_router() predicate, and instead recreate the glue code in the router module. Previously, the order of dispatch was that for events on tunnels, a dedicated handler was called, which however did not handle HW stats events. But there is nothing special about tunnel devices as far as HW stats: there is a RIF associated with the tunnel netdevice, and that RIF is where the counter should be installed. Therefore now, HW stats events are tested first, independent of netdevice type. The upshot is that as of this commit, mlxsw supports L3 HW stats work on GRE tunnels. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Move handling of VRF events to router codePetr Machata
Events involving VRF, as L3 concern, are handled in the router code, by the helper mlxsw_sp_netdevice_vrf_event(). The handler is currently invoked from the centralized dispatcher in spectrum.c. Instead, move the call to the newly-introduced router-specific notifier handler. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum_router: Add a dedicated notifier blockPetr Machata
Currently all netdevice events are handled in the centralized notifier handler maintained by spectrum.c. Since a number of events are involving router code, spectrum.c needs to dispatch them to spectrum_router.c. The spectrum module therefore needs to know more about the router code than it should have, and there is are several API points through which the two modules communicate. To simplify the notifier handlers, introduce a new notifier into the router module. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08mlxsw: spectrum: Tolerate enslaving of various devices to VRFPetr Machata
Enslaving netdevices to VRF is currently handled through an mlxsw_sp_is_vrf_event() conditional in mlxsw_sp_netdevice_event(). In the following patch sets, VRF enslavement will be handled purely in the router code. Therefore make handlers of NETDEV_PRECHANGEUPPER tolerant of enslaving to VRF, so that they do not bounce the change. For NETDEV_CHANGEUPPER, drop the WARN_ON(1) and bounce from mlxsw_sp_netdevice_port_vlan_event(). This is the only handler that warned and bounces even in the CHANGEUPPER code, other handler quietly do nothing when they encounter an unfamiliar upper. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-05Revert "Merge branch 'mlxsw-line-card-model'"Jakub Kicinski
This reverts commit 5e927a9f4b9f29d78a7c7d66ea717bb5c8bbad8e, reversing changes made to cfc1d91a7d78cf9de25b043d81efcc16966d55b3. The discussion is still ongoing so let's remove the uAPI until the discussion settles. Link: https://lore.kernel.org/all/20220425090021.32e9a98f@kernel.org/ Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220504154037.539442-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04mlxsw: spectrum_router: Only query neighbour activity when necessaryIdo Schimmel
The driver periodically queries the device for activity of neighbour entries in order to report it to the kernel's neighbour table. Avoid unnecessary activity query when no neighbours are installed. Use an atomic variable to track the number of neighbours, as it is read without any locks. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mlxsw: spectrum_switchdev: Only query FDB notifications when necessaryIdo Schimmel
The driver periodically queries the device for FDB notifications (e.g., learned, aged-out) in order to update the bridge driver. These notifications can only be generated when bridges are offloaded to the device. Avoid unnecessary queries by starting to query upon installation of the first bridge and stop querying upon removal of the last bridge. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mlxsw: spectrum_acl: Do not report activity for multicast routesIdo Schimmel
The driver periodically queries the device for activity of ACL rules in order to report it to tc upon 'FLOW_CLS_STATS'. In Spectrum-2 and later ASICs, multicast routes are programmed as ACL rules, but unlike rules installed by tc, their activity is of no interest. Avoid unnecessary activity query for such rules by always reporting them as inactive. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mlxsw: Treat LLDP packets as controlPetr Machata
When trapping packets for on-CPU processing, Spectrum machines differentiate between control and non-control traps. Traffic trapped through non-control traps is treated as data and kept in shared buffer in pools 0-4. Traffic trapped through control traps is kept in the dedicated control buffer 9. The advantage of marking traps as control is that pressure in the data plane does not prevent the control traffic to be processed. When the LLDP trap was introduced, it was marked as a control trap. But then in commit aed4b5721143 ("mlxsw: spectrum: PTP: Hook into packet receive path"), PTP traps were introduced. Because Ethernet-encapsulated PTP packets look to the Spectrum-1 ASIC as LLDP traffic and are trapped under the LLDP trap, this trap was reconfigured as non-control, in sync with the PTP traps. There is however no requirement that PTP traffic be handled as data. Besides, the usual encapsulation for PTP traffic is UDP, not bare Ethernet, and that is in deployments that even need PTP, which is far less common than LLDP. This is reflected by the default policer, which was not bumped up to the 19Kpps / 24Kpps that is the expected load of a PTP-enabled Spectrum-1 switch. Marking of LLDP trap as non-control was therefore probably misguided. In this patch, change it back to control. Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mlxsw: spectrum_dcb: Do not warn about priority changesPetr Machata
The idea behind the warnings is that the user would get warned in case when more than one priority is configured for a given DSCP value on a netdevice. The warning is currently wrong, because dcb_ieee_getapp_mask() returns the first matching entry, not all of them, and the warning will then claim that some priority is "current", when in fact it is not. But more importantly, the warning is misleading in general. Consider the following commands: # dcb app flush dev swp19 dscp-prio # dcb app add dev swp19 dscp-prio 24:3 # dcb app replace dev swp19 dscp-prio 24:2 The last command will issue the following warning: mlxsw_spectrum3 0000:07:00.0 swp19: Ignoring new priority 2 for DSCP 24 in favor of current value of 3 The reason is that the "replace" command works by first adding the new value, and then removing all old values. This is the only way to make the replacement without causing the traffic to be prioritized to whatever the chip defaults to. The warning is issued in response to adding the new priority, and then no warning is shown when the old priority is removed. The upshot is that the canonical way to change traffic prioritization always produces a warning about ignoring the new priority, but what gets configured is in fact what the user intended. An option to just emit warning every time that the prioritization changes just to make it clear that it happened is obviously unsatisfactory. Therefore, in this patch, remove the warnings. Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-03mlxsw: Configure descriptor buffersPetr Machata
Spectrum machines have two resources related to keeping packets in an internal buffer: bytes (allocated in cell-sized units) for packet payload, and descriptors, for keeping metadata. Currently, mlxsw only configures the bytes part of the resource management. Spectrum switches permit a full parallel configuration for the descriptor resources, including port-pool and port-TC-pool quotas. By default, these are all configured to use pool 14, with an infinite quota. The ingress pool 14 is then infinite in size. However, egress pool 14 has finite size by default. The size is chip dependent, but always much lower than what the chip actually permits. As a result, we can easily construct workloads that exhaust the configured descriptor limit. Fix the issue by configuring the egress descriptor pool to be infinite in size as well. This will maintain the configuration philosophy of the default configuration, but will unlock all chip resources to be usable. In the code, include both the configuration of ingress and ingress, mostly for clarity. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03mlxsw: reg: Add "desc" field to SBPRPetr Machata
SBPR, or Shared Buffer Pools Register, configures and retrieves the shared buffer pools and configuration. The desc field determines whether the configuration relates to the byte pool or the descriptor pool. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-25mlxsw: core_linecards: Expose device FW version over device infoJiri Pirko
Extend MDDQ to obtain FW version of line card device and implement device_info_get() op to fill up the info with that. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>