author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-06-30 15:51:09 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-06-30 15:51:09 -0700 |
commit | dbe69e43372212527abf48609aba7fc39a6daa27 | |
tree | 96cfafdf70f5325ceeac1054daf7deca339c9730 /drivers/net/ethernet/mellanox/mlx5 | |
parent | a6eaf3850cb171c328a8b0db6d3c79286a1eba9d | |
parent | b6df00789e2831fff7a2c65aa7164b2a4dcbe599 | |
Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener to
another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improve performance (for
pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require jump
labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsockopt to retrieve netns cookie
- ip: treat lowest address of an IPv4 subnet as ordinary unicast
address allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: Packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks in NAPI
context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and
NXP (our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support"
* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
tcp: change ICSK_CA_PRIV_SIZE definition
tcp_yeah: check struct yeah size at compile time
gve: DQO: Fix off by one in gve_rx_dqo()
stmmac: intel: set PCI_D3hot in suspend
stmmac: intel: Enable PHY WOL option in EHL
net: stmmac: option to enable PHY WOL with PMT enabled
net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
net: use netdev_info in ndo_dflt_fdb_{add,del}
ptp: Set lookup cookie when creating a PTP PPS source.
net: sock: add trace for socket errors
net: sock: introduce sk_error_report
net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
net: dsa: include fdb entries pointing to bridge in the host fdb list
net: dsa: include bridge addresses which are local in the host fdb list
net: dsa: sync static FDB entries on foreign interfaces to hardware
net: dsa: install the host MDB and FDB entries in the master's RX filter
net: dsa: reference count the FDB addresses at the cross-chip notifier level
net: dsa: introduce a separate cross-chip notifier type for host FDBs
net: dsa: reference count the MDB entries at the cross-chip notifier level
...
Diffstat (limited to 'drivers/net/ethernet/mellanox/mlx5')
68 files changed, 3798 insertions, 840 deletions
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index 461a43f338e6..e1a5a79e27c7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -12,7 +12,6 @@ config MLX5_CORE
 	depends on MLXFW || !MLXFW
 	depends on PTP_1588_CLOCK || !PTP_1588_CLOCK
 	depends on PCI_HYPERV_INTERFACE || !PCI_HYPERV_INTERFACE
-	default n
 	help
 	  Core driver for low level functionality of the ConnectX-4 and
 	  Connect-IB cards by Mellanox Technologies.
@@ -36,7 +35,6 @@ config MLX5_CORE_EN
 	depends on NETDEVICES && ETHERNET && INET && PCI && MLX5_CORE
 	select PAGE_POOL
 	select DIMLIB
-	default n
 	help
 	  Ethernet support in Mellanox Technologies ConnectX-4 NIC.
 
@@ -79,6 +77,16 @@ config MLX5_ESWITCH
 	        Legacy SRIOV mode (L2 mac vlan steering based).
 	        Switchdev mode (eswitch offloads).
 
+config MLX5_BRIDGE
+	bool
+	depends on MLX5_ESWITCH && BRIDGE
+	default y
+	help
+	  mlx5 ConnectX offloads support for Ethernet Bridging (BRIDGE).
+	  Enable adding representors of mlx5 uplink and VF ports to Bridge and
+	  offloading rules for traffic between such ports. Supports VLANs (trunk and
+	  access modes).
+
 config MLX5_CLS_ACT
 	bool "MLX5 TC classifier action support"
 	depends on MLX5_ESWITCH && NET_CLS_ACT
@@ -131,7 +139,6 @@ config MLX5_CORE_EN_DCB
 config MLX5_CORE_IPOIB
 	bool "Mellanox 5th generation network adapters (connectX series) IPoIB offloads support"
 	depends on MLX5_CORE_EN
-	default n
 	help
 	  MLX5 IPoIB offloads & acceleration support.
 
@@ -139,7 +146,6 @@ config MLX5_FPGA_IPSEC
 	bool "Mellanox Technologies IPsec Innova support"
 	depends on MLX5_CORE
 	depends on MLX5_FPGA
-	default n
 	help
 	  Build IPsec support for the Innova family of network cards by Mellanox
 	  Technologies. Innova network cards are comprised of a ConnectX chip
@@ -153,7 +159,6 @@ config MLX5_IPSEC
 	depends on XFRM_OFFLOAD
 	depends on INET_ESP_OFFLOAD || INET6_ESP_OFFLOAD
 	select MLX5_ACCEL
-	default n
 	help
 	  Build IPsec support for the Connect-X family of network cards by Mellanox
 	  Technologies.
@@ -166,7 +171,6 @@ config MLX5_EN_IPSEC
 	depends on XFRM_OFFLOAD
 	depends on INET_ESP_OFFLOAD || INET6_ESP_OFFLOAD
 	depends on MLX5_FPGA_IPSEC || MLX5_IPSEC
-	default n
 	help
 	  Build support for IPsec cryptography-offload acceleration in the NIC.
 	  Note: Support for hardware with this capability needs to be selected
@@ -179,7 +183,6 @@ config MLX5_FPGA_TLS
 	depends on MLX5_CORE_EN
 	depends on MLX5_FPGA
 	select MLX5_EN_TLS
-	default n
 	help
 	  Build TLS support for the Innova family of network cards by Mellanox
 	  Technologies. Innova network cards are comprised of a ConnectX chip
@@ -194,7 +197,6 @@ config MLX5_TLS
 	depends on MLX5_CORE_EN
 	select MLX5_ACCEL
 	select MLX5_EN_TLS
-	default n
 	help
 	  Build TLS support for the Connect-X family of network cards by Mellanox
 	  Technologies.
@@ -217,7 +219,6 @@ config MLX5_SW_STEERING
 config MLX5_SF
 	bool "Mellanox Technologies subfunction device support using auxiliary device"
 	depends on MLX5_CORE && MLX5_CORE_EN
-	default n
 	help
 	  Build support for subfuction device in the NIC. A Mellanox subfunction
 	  device can support RDMA, netdevice and vdpa device.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index a1223e904190..b5072a3a2585 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -14,7 +14,7 @@ obj-$(CONFIG_MLX5_CORE) += mlx5_core.o mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \ health.o mcg.o cq.o alloc.o port.o mr.o pd.o \ transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \ - fs_counters.o rl.o lag.o dev.o events.o wq.o lib/gid.o \ + fs_counters.o fs_ft_pool.o rl.o lag.o dev.o events.o wq.o lib/gid.o \ lib/devcom.o lib/pci_vsc.o lib/dm.o diag/fs_tracepoint.o \ diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o \ fw_reset.o qos.o @@ -56,6 +56,7 @@ mlx5_core-$(CONFIG_MLX5_ESWITCH) += esw/acl/helper.o \ esw/acl/ingress_lgcy.o esw/acl/ingress_ofld.o \ esw/devlink_port.o esw/vporttbl.o mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += esw/sample.o +mlx5_core-$(CONFIG_MLX5_BRIDGE) += esw/bridge.o en/rep/bridge.o mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o mlx5_core-$(CONFIG_VXLAN) += lib/vxlan.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index 44c458443428..d791d351b489 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -63,6 +63,11 @@ mlx5_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req, err = devlink_info_version_running_put(req, "fw.version", version_str); if (err) return err; + err = devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW, + version_str); + if (err) + return err; /* no pending version, return running (stored) version */ if (stored_fw == 0) @@ -74,8 +79,9 @@ mlx5_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req, err = devlink_info_version_stored_put(req, "fw.version", version_str); if (err) return err; - - return 0; + return 
devlink_info_version_stored_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW, + version_str); } static int mlx5_devlink_reload_fw_activate(struct devlink *devlink, struct netlink_ext_ack *extack) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index b636d63358d2..b1b51bbba054 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -974,7 +974,6 @@ int mlx5e_open_rq(struct mlx5e_params *params, struct mlx5e_rq_param *param, struct mlx5e_xsk_param *xsk, int node, struct mlx5e_rq *rq); int mlx5e_wait_for_min_rx_wqes(struct mlx5e_rq *rq, int wait_time); -void mlx5e_deactivate_rq(struct mlx5e_rq *rq); void mlx5e_close_rq(struct mlx5e_rq *rq); int mlx5e_create_rq(struct mlx5e_rq *rq, struct mlx5e_rq_param *param); void mlx5e_destroy_rq(struct mlx5e_rq *rq); @@ -1163,6 +1162,13 @@ mlx5e_calc_max_nch(struct mlx5e_priv *priv, const struct mlx5e_profile *profile) return priv->netdev->num_rx_queues / max_t(u8, profile->rq_groups, 1); } +static inline bool +mlx5e_tx_mpwqe_supported(struct mlx5_core_dev *mdev) +{ + return !is_kdump_kernel() && + MLX5_CAP_ETH(mdev, enhanced_multi_pkt_send_wqe); +} + int mlx5e_priv_init(struct mlx5e_priv *priv, struct net_device *netdev, struct mlx5_core_dev *mdev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c index f410c1268422..150c8e82c738 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c @@ -201,7 +201,7 @@ int mlx5e_validate_params(struct mlx5_core_dev *mdev, struct mlx5e_params *param static struct dim_cq_moder mlx5e_get_def_tx_moderation(u8 cq_period_mode) { - struct dim_cq_moder moder; + struct dim_cq_moder moder = {}; moder.cq_period_mode = cq_period_mode; moder.pkts = MLX5E_PARAMS_DEFAULT_TX_CQ_MODERATION_PKTS; @@ -214,7 +214,7 @@ static struct dim_cq_moder mlx5e_get_def_tx_moderation(u8 
cq_period_mode) static struct dim_cq_moder mlx5e_get_def_rx_moderation(u8 cq_period_mode) { - struct dim_cq_moder moder; + struct dim_cq_moder moder = {}; moder.cq_period_mode = cq_period_mode; moder.pkts = MLX5E_PARAMS_DEFAULT_RX_CQ_MODERATION_PKTS; @@ -614,7 +614,7 @@ static u8 mlx5e_build_icosq_log_wq_sz(struct mlx5e_params *params, static u8 mlx5e_build_async_icosq_log_wq_sz(struct mlx5_core_dev *mdev) { - if (mlx5_accel_is_ktls_rx(mdev)) + if (mlx5e_accel_is_ktls_rx(mdev)) return MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE; return MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE; @@ -643,7 +643,7 @@ static void mlx5e_build_async_icosq_param(struct mlx5_core_dev *mdev, mlx5e_build_sq_param_common(mdev, param); param->stop_room = mlx5e_stop_room_for_wqe(1); /* for XSK NOP */ - param->is_tls = mlx5_accel_is_ktls_rx(mdev); + param->is_tls = mlx5e_accel_is_ktls_rx(mdev); if (param->is_tls) param->stop_room += mlx5e_stop_room_for_wqe(1); /* for TLS RX resync NOP */ MLX5_SET(sqc, sqc, reg_umr, MLX5_CAP_ETH(mdev, reg_umr_sq)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c new file mode 100644 index 000000000000..3c0032c9647c --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c @@ -0,0 +1,427 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2021 Mellanox Technologies. 
*/ + +#include <linux/netdevice.h> +#include <linux/if_bridge.h> +#include <net/netevent.h> +#include <net/switchdev.h> +#include "bridge.h" +#include "esw/bridge.h" +#include "en_rep.h" + +#define MLX5_ESW_BRIDGE_UPDATE_INTERVAL 1000 + +struct mlx5_bridge_switchdev_fdb_work { + struct work_struct work; + struct switchdev_notifier_fdb_info fdb_info; + struct net_device *dev; + bool add; +}; + +static int mlx5_esw_bridge_port_changeupper(struct notifier_block *nb, void *ptr) +{ + struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb, + struct mlx5_esw_bridge_offloads, + netdev_nb); + struct net_device *dev = netdev_notifier_info_to_dev(ptr); + struct netdev_notifier_changeupper_info *info = ptr; + struct netlink_ext_ack *extack; + struct mlx5e_rep_priv *rpriv; + struct mlx5_eswitch *esw; + struct mlx5_vport *vport; + struct net_device *upper; + struct mlx5e_priv *priv; + u16 vport_num; + + if (!mlx5e_eswitch_rep(dev)) + return 0; + + upper = info->upper_dev; + if (!netif_is_bridge_master(upper)) + return 0; + + esw = br_offloads->esw; + priv = netdev_priv(dev); + if (esw != priv->mdev->priv.eswitch) + return 0; + + rpriv = priv->ppriv; + vport_num = rpriv->rep->vport; + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + extack = netdev_notifier_info_to_extack(&info->info); + + return info->linking ? 
+ mlx5_esw_bridge_vport_link(upper->ifindex, br_offloads, vport, extack) : + mlx5_esw_bridge_vport_unlink(upper->ifindex, br_offloads, vport, extack); +} + +static int mlx5_esw_bridge_switchdev_port_event(struct notifier_block *nb, + unsigned long event, void *ptr) +{ + int err = 0; + + switch (event) { + case NETDEV_PRECHANGEUPPER: + break; + + case NETDEV_CHANGEUPPER: + err = mlx5_esw_bridge_port_changeupper(nb, ptr); + break; + } + + return notifier_from_errno(err); +} + +static int mlx5_esw_bridge_port_obj_add(struct net_device *dev, + const void *ctx, + const struct switchdev_obj *obj, + struct netlink_ext_ack *extack) +{ + const struct switchdev_obj_port_vlan *vlan; + struct mlx5e_rep_priv *rpriv; + struct mlx5_eswitch *esw; + struct mlx5_vport *vport; + struct mlx5e_priv *priv; + u16 vport_num; + int err = 0; + + priv = netdev_priv(dev); + rpriv = priv->ppriv; + vport_num = rpriv->rep->vport; + esw = priv->mdev->priv.eswitch; + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + switch (obj->id) { + case SWITCHDEV_OBJ_ID_PORT_VLAN: + vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); + err = mlx5_esw_bridge_port_vlan_add(vlan->vid, vlan->flags, esw, vport, extack); + break; + default: + return -EOPNOTSUPP; + } + return err; +} + +static int mlx5_esw_bridge_port_obj_del(struct net_device *dev, + const void *ctx, + const struct switchdev_obj *obj) +{ + const struct switchdev_obj_port_vlan *vlan; + struct mlx5e_rep_priv *rpriv; + struct mlx5_eswitch *esw; + struct mlx5_vport *vport; + struct mlx5e_priv *priv; + u16 vport_num; + + priv = netdev_priv(dev); + rpriv = priv->ppriv; + vport_num = rpriv->rep->vport; + esw = priv->mdev->priv.eswitch; + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + switch (obj->id) { + case SWITCHDEV_OBJ_ID_PORT_VLAN: + vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); + mlx5_esw_bridge_port_vlan_del(vlan->vid, esw, vport); + break; + default: + return -EOPNOTSUPP; + 
} + return 0; +} + +static int mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev, + const void *ctx, + const struct switchdev_attr *attr, + struct netlink_ext_ack *extack) +{ + struct mlx5e_rep_priv *rpriv; + struct mlx5_eswitch *esw; + struct mlx5_vport *vport; + struct mlx5e_priv *priv; + u16 vport_num; + int err = 0; + + priv = netdev_priv(dev); + rpriv = priv->ppriv; + vport_num = rpriv->rep->vport; + esw = priv->mdev->priv.eswitch; + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + switch (attr->id) { + case SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS: + if (attr->u.brport_flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)) { + NL_SET_ERR_MSG_MOD(extack, "Flag is not supported"); + err = -EINVAL; + } + break; + case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS: + break; + case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME: + err = mlx5_esw_bridge_ageing_time_set(attr->u.ageing_time, esw, vport); + break; + case SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING: + err = mlx5_esw_bridge_vlan_filtering_set(attr->u.vlan_filtering, esw, vport); + break; + default: + err = -EOPNOTSUPP; + } + + return err; +} + +static int mlx5_esw_bridge_event_blocking(struct notifier_block *unused, + unsigned long event, void *ptr) +{ + struct net_device *dev = switchdev_notifier_info_to_dev(ptr); + int err; + + switch (event) { + case SWITCHDEV_PORT_OBJ_ADD: + err = switchdev_handle_port_obj_add(dev, ptr, + mlx5e_eswitch_rep, + mlx5_esw_bridge_port_obj_add); + break; + case SWITCHDEV_PORT_OBJ_DEL: + err = switchdev_handle_port_obj_del(dev, ptr, + mlx5e_eswitch_rep, + mlx5_esw_bridge_port_obj_del); + break; + case SWITCHDEV_PORT_ATTR_SET: + err = switchdev_handle_port_attr_set(dev, ptr, + mlx5e_eswitch_rep, + mlx5_esw_bridge_port_obj_attr_set); + break; + default: + err = 0; + } + + return notifier_from_errno(err); +} + +static void +mlx5_esw_bridge_cleanup_switchdev_fdb_work(struct mlx5_bridge_switchdev_fdb_work *fdb_work) +{ + dev_put(fdb_work->dev); 
+ kfree(fdb_work->fdb_info.addr); + kfree(fdb_work); +} + +static void mlx5_esw_bridge_switchdev_fdb_event_work(struct work_struct *work) +{ + struct mlx5_bridge_switchdev_fdb_work *fdb_work = + container_of(work, struct mlx5_bridge_switchdev_fdb_work, work); + struct switchdev_notifier_fdb_info *fdb_info = + &fdb_work->fdb_info; + struct net_device *dev = fdb_work->dev; + struct mlx5e_rep_priv *rpriv; + struct mlx5_eswitch *esw; + struct mlx5_vport *vport; + struct mlx5e_priv *priv; + u16 vport_num; + + rtnl_lock(); + + priv = netdev_priv(dev); + rpriv = priv->ppriv; + vport_num = rpriv->rep->vport; + esw = priv->mdev->priv.eswitch; + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + goto out; + + if (fdb_work->add) + mlx5_esw_bridge_fdb_create(dev, esw, vport, fdb_info); + else + mlx5_esw_bridge_fdb_remove(dev, esw, vport, fdb_info); + +out: + rtnl_unlock(); + mlx5_esw_bridge_cleanup_switchdev_fdb_work(fdb_work); +} + +static struct mlx5_bridge_switchdev_fdb_work * +mlx5_esw_bridge_init_switchdev_fdb_work(struct net_device *dev, bool add, + struct switchdev_notifier_fdb_info *fdb_info) +{ + struct mlx5_bridge_switchdev_fdb_work *work; + u8 *addr; + + work = kzalloc(sizeof(*work), GFP_ATOMIC); + if (!work) + return ERR_PTR(-ENOMEM); + + INIT_WORK(&work->work, mlx5_esw_bridge_switchdev_fdb_event_work); + memcpy(&work->fdb_info, fdb_info, sizeof(work->fdb_info)); + + addr = kzalloc(ETH_ALEN, GFP_ATOMIC); + if (!addr) { + kfree(work); + return ERR_PTR(-ENOMEM); + } + ether_addr_copy(addr, fdb_info->addr); + work->fdb_info.addr = addr; + + dev_hold(dev); + work->dev = dev; + work->add = add; + return work; +} + +static int mlx5_esw_bridge_switchdev_event(struct notifier_block *nb, + unsigned long event, void *ptr) +{ + struct mlx5_esw_bridge_offloads *br_offloads = container_of(nb, + struct mlx5_esw_bridge_offloads, + nb); + struct net_device *dev = switchdev_notifier_info_to_dev(ptr); + struct switchdev_notifier_fdb_info *fdb_info; + struct 
mlx5_bridge_switchdev_fdb_work *work; + struct switchdev_notifier_info *info = ptr; + struct net_device *upper; + struct mlx5e_priv *priv; + + if (!mlx5e_eswitch_rep(dev)) + return NOTIFY_DONE; + priv = netdev_priv(dev); + if (priv->mdev->priv.eswitch != br_offloads->esw) + return NOTIFY_DONE; + + if (event == SWITCHDEV_PORT_ATTR_SET) { + int err = switchdev_handle_port_attr_set(dev, ptr, + mlx5e_eswitch_rep, + mlx5_esw_bridge_port_obj_attr_set); + return notifier_from_errno(err); + } + + upper = netdev_master_upper_dev_get_rcu(dev); + if (!upper) + return NOTIFY_DONE; + if (!netif_is_bridge_master(upper)) + return NOTIFY_DONE; + + switch (event) { + case SWITCHDEV_FDB_ADD_TO_DEVICE: + case SWITCHDEV_FDB_DEL_TO_DEVICE: + fdb_info = container_of(info, + struct switchdev_notifier_fdb_info, + info); + + work = mlx5_esw_bridge_init_switchdev_fdb_work(dev, + event == SWITCHDEV_FDB_ADD_TO_DEVICE, + fdb_info); + if (IS_ERR(work)) { + WARN_ONCE(1, "Failed to init switchdev work, err=%ld", + PTR_ERR(work)); + return notifier_from_errno(PTR_ERR(work)); + } + + queue_work(br_offloads->wq, &work->work); + break; + default: + break; + } + return NOTIFY_DONE; +} + +static void mlx5_esw_bridge_update_work(struct work_struct *work) +{ + struct mlx5_esw_bridge_offloads *br_offloads = container_of(work, + struct mlx5_esw_bridge_offloads, + update_work.work); + + rtnl_lock(); + mlx5_esw_bridge_update(br_offloads); + rtnl_unlock(); + + queue_delayed_work(br_offloads->wq, &br_offloads->update_work, + msecs_to_jiffies(MLX5_ESW_BRIDGE_UPDATE_INTERVAL)); +} + +void mlx5e_rep_bridge_init(struct mlx5e_priv *priv) +{ + struct mlx5_esw_bridge_offloads *br_offloads; + struct mlx5_core_dev *mdev = priv->mdev; + struct mlx5_eswitch *esw = + mdev->priv.eswitch; + int err; + + rtnl_lock(); + br_offloads = mlx5_esw_bridge_init(esw); + rtnl_unlock(); + if (IS_ERR(br_offloads)) { + esw_warn(mdev, "Failed to init esw bridge (err=%ld)\n", PTR_ERR(br_offloads)); + return; + } + + br_offloads->wq = 
alloc_ordered_workqueue("mlx5_bridge_wq", 0); + if (!br_offloads->wq) { + esw_warn(mdev, "Failed to allocate bridge offloads workqueue\n"); + goto err_alloc_wq; + } + INIT_DELAYED_WORK(&br_offloads->update_work, mlx5_esw_bridge_update_work); + queue_delayed_work(br_offloads->wq, &br_offloads->update_work, + msecs_to_jiffies(MLX5_ESW_BRIDGE_UPDATE_INTERVAL)); + + br_offloads->nb.notifier_call = mlx5_esw_bridge_switchdev_event; + err = register_switchdev_notifier(&br_offloads->nb); + if (err) { + esw_warn(mdev, "Failed to register switchdev notifier (err=%d)\n", err); + goto err_register_swdev; + } + + br_offloads->nb_blk.notifier_call = mlx5_esw_bridge_event_blocking; + err = register_switchdev_blocking_notifier(&br_offloads->nb_blk); + if (err) { + esw_warn(mdev, "Failed to register blocking switchdev notifier (err=%d)\n", err); + goto err_register_swdev_blk; + } + + br_offloads->netdev_nb.notifier_call = mlx5_esw_bridge_switchdev_port_event; + err = register_netdevice_notifier(&br_offloads->netdev_nb); + if (err) { + esw_warn(mdev, "Failed to register bridge offloads netdevice notifier (err=%d)\n", + err); + goto err_register_netdev; + } + return; + +err_register_netdev: + unregister_switchdev_blocking_notifier(&br_offloads->nb_blk); +err_register_swdev_blk: + unregister_switchdev_notifier(&br_offloads->nb); +err_register_swdev: + destroy_workqueue(br_offloads->wq); +err_alloc_wq: + mlx5_esw_bridge_cleanup(esw); +} + +void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv) +{ + struct mlx5_esw_bridge_offloads *br_offloads; + struct mlx5_core_dev *mdev = priv->mdev; + struct mlx5_eswitch *esw = + mdev->priv.eswitch; + + br_offloads = esw->br_offloads; + if (!br_offloads) + return; + + unregister_netdevice_notifier(&br_offloads->netdev_nb); + unregister_switchdev_blocking_notifier(&br_offloads->nb_blk); + unregister_switchdev_notifier(&br_offloads->nb); + cancel_delayed_work(&br_offloads->update_work); + destroy_workqueue(br_offloads->wq); + rtnl_lock(); + 
mlx5_esw_bridge_cleanup(esw); + rtnl_unlock(); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h new file mode 100644 index 000000000000..fbeb64242831 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2021 Mellanox Technologies. */ + +#ifndef __MLX5_EN_REP_BRIDGE__ +#define __MLX5_EN_REP_BRIDGE__ + +#include "en.h" + +#if IS_ENABLED(CONFIG_MLX5_BRIDGE) + +void mlx5e_rep_bridge_init(struct mlx5e_priv *priv); +void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv); + +#else /* CONFIG_MLX5_BRIDGE */ + +static inline void mlx5e_rep_bridge_init(struct mlx5e_priv *priv) {} +static inline void mlx5e_rep_bridge_cleanup(struct mlx5e_priv *priv) {} + +#endif /* CONFIG_MLX5_BRIDGE */ + +#endif /* __MLX5_EN_REP_BRIDGE__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c index 85eaadc989df..059799e4f483 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c @@ -613,7 +613,7 @@ static bool mlx5e_restore_skb(struct sk_buff *skb, u32 chain, u32 reg_c1, struct mlx5e_tc_update_priv *tc_priv) { struct mlx5e_priv *priv = netdev_priv(skb->dev); - u32 tunnel_id = reg_c1 >> ESW_TUN_OFFSET; + u32 tunnel_id = (reg_c1 >> ESW_TUN_OFFSET) & TUNNEL_ID_MASK; if (chain) { struct mlx5_rep_uplink_priv *uplink_priv; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index 5da5e5323a44..91e7a01e32be 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -23,7 +23,7 @@ #include "en_tc.h" #include "en_rep.h" -#define MLX5_CT_ZONE_BITS (mlx5e_tc_attr_to_reg_mappings[ZONE_TO_REG].mlen * 8) +#define MLX5_CT_ZONE_BITS 
(mlx5e_tc_attr_to_reg_mappings[ZONE_TO_REG].mlen) #define MLX5_CT_ZONE_MASK GENMASK(MLX5_CT_ZONE_BITS - 1, 0) #define MLX5_CT_STATE_ESTABLISHED_BIT BIT(1) #define MLX5_CT_STATE_TRK_BIT BIT(2) @@ -32,11 +32,11 @@ #define MLX5_CT_STATE_RELATED_BIT BIT(5) #define MLX5_CT_STATE_INVALID_BIT BIT(6) -#define MLX5_FTE_ID_BITS (mlx5e_tc_attr_to_reg_mappings[FTEID_TO_REG].mlen * 8) +#define MLX5_FTE_ID_BITS (mlx5e_tc_attr_to_reg_mappings[FTEID_TO_REG].mlen) #define MLX5_FTE_ID_MAX GENMASK(MLX5_FTE_ID_BITS - 1, 0) #define MLX5_FTE_ID_MASK MLX5_FTE_ID_MAX -#define MLX5_CT_LABELS_BITS (mlx5e_tc_attr_to_reg_mappings[LABELS_TO_REG].mlen * 8) +#define MLX5_CT_LABELS_BITS (mlx5e_tc_attr_to_reg_mappings[LABELS_TO_REG].mlen) #define MLX5_CT_LABELS_MASK GENMASK(MLX5_CT_LABELS_BITS - 1, 0) #define ct_dbg(fmt, args...)\ @@ -150,6 +150,11 @@ struct mlx5_ct_entry { unsigned long flags; }; +static void +mlx5_tc_ct_entry_destroy_mod_hdr(struct mlx5_tc_ct_priv *ct_priv, + struct mlx5_flow_attr *attr, + struct mlx5e_mod_hdr_handle *mh); + static const struct rhashtable_params cts_ht_params = { .head_offset = offsetof(struct mlx5_ct_entry, node), .key_offset = offsetof(struct mlx5_ct_entry, cookie), @@ -458,8 +463,7 @@ mlx5_tc_ct_entry_del_rule(struct mlx5_tc_ct_priv *ct_priv, ct_dbg("Deleting ct entry rule in zone %d", entry->tuple.zone); mlx5_tc_rule_delete(netdev_priv(ct_priv->netdev), zone_rule->rule, attr); - mlx5e_mod_hdr_detach(ct_priv->dev, - ct_priv->mod_hdr_tbl, zone_rule->mh); + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, zone_rule->mh); mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); kfree(attr); } @@ -686,15 +690,27 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv, if (err) goto err_mapping; - *mh = mlx5e_mod_hdr_attach(ct_priv->dev, - ct_priv->mod_hdr_tbl, - ct_priv->ns_type, - &mod_acts); - if (IS_ERR(*mh)) { - err = PTR_ERR(*mh); - goto err_mapping; + if (nat) { + attr->modify_hdr = mlx5_modify_header_alloc(ct_priv->dev, 
ct_priv->ns_type, + mod_acts.num_actions, + mod_acts.actions); + if (IS_ERR(attr->modify_hdr)) { + err = PTR_ERR(attr->modify_hdr); + goto err_mapping; + } + + *mh = NULL; + } else { + *mh = mlx5e_mod_hdr_attach(ct_priv->dev, + ct_priv->mod_hdr_tbl, + ct_priv->ns_type, + &mod_acts); + if (IS_ERR(*mh)) { + err = PTR_ERR(*mh); + goto err_mapping; + } + attr->modify_hdr = mlx5e_mod_hdr_get(*mh); } - attr->modify_hdr = mlx5e_mod_hdr_get(*mh); dealloc_mod_hdr_actions(&mod_acts); return 0; @@ -705,6 +721,17 @@ err_mapping: return err; } +static void +mlx5_tc_ct_entry_destroy_mod_hdr(struct mlx5_tc_ct_priv *ct_priv, + struct mlx5_flow_attr *attr, + struct mlx5e_mod_hdr_handle *mh) +{ + if (mh) + mlx5e_mod_hdr_detach(ct_priv->dev, ct_priv->mod_hdr_tbl, mh); + else + mlx5_modify_header_dealloc(ct_priv->dev, attr->modify_hdr); +} + static int mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv, struct flow_rule *flow_rule, @@ -767,8 +794,7 @@ mlx5_tc_ct_entry_add_rule(struct mlx5_tc_ct_priv *ct_priv, return 0; err_rule: - mlx5e_mod_hdr_detach(ct_priv->dev, - ct_priv->mod_hdr_tbl, zone_rule->mh); + mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, zone_rule->attr, zone_rule->mh); mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); err_mod_hdr: kfree(attr); @@ -918,7 +944,7 @@ mlx5_tc_ct_shared_counter_get(struct mlx5_tc_ct_priv *ct_priv, } if (rev_entry && refcount_inc_not_zero(&rev_entry->counter->refcount)) { - ct_dbg("Using shared counter entry=0x%p rev=0x%p\n", entry, rev_entry); + ct_dbg("Using shared counter entry=0x%p rev=0x%p", entry, rev_entry); shared_counter = rev_entry->counter; spin_unlock_bh(&ct_priv->ht_lock); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h index 69e618d17071..644cf1641cde 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h @@ -33,15 +33,15 @@ struct mlx5_ct_attr { #define zone_to_reg_ct {\ .mfield = 
MLX5_ACTION_IN_FIELD_METADATA_REG_C_2,\ .moffset = 0,\ - .mlen = 2,\ + .mlen = 16,\ .soffset = MLX5_BYTE_OFF(fte_match_param,\ - misc_parameters_2.metadata_reg_c_2) + 2,\ + misc_parameters_2.metadata_reg_c_2),\ } #define ctstate_to_reg_ct {\ .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_2,\ - .moffset = 2,\ - .mlen = 2,\ + .moffset = 16,\ + .mlen = 16,\ .soffset = MLX5_BYTE_OFF(fte_match_param,\ misc_parameters_2.metadata_reg_c_2),\ } @@ -49,7 +49,7 @@ struct mlx5_ct_attr { #define mark_to_reg_ct {\ .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_3,\ .moffset = 0,\ - .mlen = 4,\ + .mlen = 32,\ .soffset = MLX5_BYTE_OFF(fte_match_param,\ misc_parameters_2.metadata_reg_c_3),\ } @@ -57,7 +57,7 @@ struct mlx5_ct_attr { #define labels_to_reg_ct {\ .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_4,\ .moffset = 0,\ - .mlen = 4,\ + .mlen = 32,\ .soffset = MLX5_BYTE_OFF(fte_match_param,\ misc_parameters_2.metadata_reg_c_4),\ } @@ -65,7 +65,7 @@ struct mlx5_ct_attr { #define fteid_to_reg_ct {\ .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_5,\ .moffset = 0,\ - .mlen = 4,\ + .mlen = 32,\ .soffset = MLX5_BYTE_OFF(fte_match_param,\ misc_parameters_2.metadata_reg_c_5),\ } @@ -73,20 +73,19 @@ struct mlx5_ct_attr { #define zone_restore_to_reg_ct {\ .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_1,\ .moffset = 0,\ - .mlen = (ESW_ZONE_ID_BITS / 8),\ + .mlen = ESW_ZONE_ID_BITS,\ .soffset = MLX5_BYTE_OFF(fte_match_param,\ - misc_parameters_2.metadata_reg_c_1) + 3,\ + misc_parameters_2.metadata_reg_c_1),\ } #define nic_zone_restore_to_reg_ct {\ .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_B,\ - .moffset = 2,\ - .mlen = (ESW_ZONE_ID_BITS / 8),\ + .moffset = 16,\ + .mlen = ESW_ZONE_ID_BITS,\ } #define REG_MAPPING_MLEN(reg) (mlx5e_tc_attr_to_reg_mappings[reg].mlen) #define REG_MAPPING_MOFFSET(reg) (mlx5e_tc_attr_to_reg_mappings[reg].moffset) -#define REG_MAPPING_SHIFT(reg) (REG_MAPPING_MOFFSET(reg) * 8) #if IS_ENABLED(CONFIG_MLX5_TC_CT) diff --git 
a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
index 172e0474f2e6..8f79f04eccd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c
@@ -212,6 +212,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5e_neigh m_neigh = {};
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	int ipv4_encap_size;
@@ -295,9 +296,12 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 		 */
 		goto release_neigh;
 	}
-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv4_encap_size, encap_header,
+
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv4_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
@@ -324,6 +328,7 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	int ipv4_encap_size;
 	char *encap_header;
@@ -396,9 +401,12 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv,
 		 */
 		goto release_neigh;
 	}
-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv4_encap_size, encap_header,
+
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv4_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
@@ -471,6 +479,7 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5e_neigh m_neigh = {};
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	struct ipv6hdr *ip6h;
@@ -553,9 +562,11 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
 		goto release_neigh;
 	}

-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv6_encap_size, encap_header,
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv6_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
@@ -582,6 +593,7 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv,
 {
 	int max_encap_size = MLX5_CAP_ESW(priv->mdev, max_encap_header_size);
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
+	struct mlx5_pkt_reformat_params reformat_params;
 	TC_TUN_ROUTE_ATTR_INIT(attr);
 	struct ipv6hdr *ip6h;
 	int ipv6_encap_size;
@@ -654,9 +666,11 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv,
 		goto release_neigh;
 	}

-	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     ipv6_encap_size, encap_header,
+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = ipv6_encap_size;
+	reformat_params.data = encap_header;
+	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		err = PTR_ERR(e->pkt_reformat);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index 490131e06efb..2e846b741280 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -120,6 +120,7 @@ void mlx5e_tc_encap_flows_add(struct mlx5e_priv *priv,
 			      struct list_head *flow_list)
 {
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5_esw_flow_attr *esw_attr;
 	struct mlx5_flow_handle *rule;
 	struct mlx5_flow_attr *attr;
@@ -130,9 +131,12 @@ void mlx5e_tc_encap_flows_add(struct mlx5e_priv *priv,
 	if (e->flags & MLX5_ENCAP_ENTRY_NO_ROUTE)
 		return;

+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = e->reformat_type;
+	reformat_params.size = e->encap_size;
+	reformat_params.data = e->encap_header;
 	e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     e->reformat_type,
-						     e->encap_size, e->encap_header,
+						     &reformat_params,
 						     MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(e->pkt_reformat)) {
 		mlx5_core_warn(priv->mdev, "Failed to offload cached encapsulation header, %lu\n",
@@ -839,6 +843,7 @@ int mlx5e_attach_decap(struct mlx5e_priv *priv,
 {
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	struct mlx5_esw_flow_attr *attr = flow->attr->esw_attr;
+	struct mlx5_pkt_reformat_params reformat_params;
 	struct mlx5e_tc_flow_parse_attr *parse_attr;
 	struct mlx5e_decap_entry *d;
 	struct mlx5e_decap_key key;
@@ -880,10 +885,12 @@ int mlx5e_attach_decap(struct mlx5e_priv *priv,
 	hash_add_rcu(esw->offloads.decap_tbl, &d->hlist, hash_key);
 	mutex_unlock(&esw->offloads.decap_tbl_lock);

+	memset(&reformat_params, 0, sizeof(reformat_params));
+	reformat_params.type = MLX5_REFORMAT_TYPE_L3_TUNNEL_TO_L2;
+	reformat_params.size = sizeof(parse_attr->eth);
+	reformat_params.data = &parse_attr->eth;
 	d->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev,
-						     MLX5_REFORMAT_TYPE_L3_TUNNEL_TO_L2,
-						     sizeof(parse_attr->eth),
-						     &parse_attr->eth,
+						     &reformat_params,
MLX5_FLOW_NAMESPACE_FDB);
 	if (IS_ERR(d->pkt_reformat)) {
 		err = PTR_ERR(d->pkt_reformat);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
index 00af0b831a28..d964665eaa63 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h
@@ -162,7 +162,7 @@ static inline unsigned int mlx5e_accel_tx_ids_len(struct mlx5e_txqsq *sq,
 /* Part of the eseg touched by TX offloads */
 #define MLX5E_ACCEL_ESEG_LEN offsetof(struct mlx5_wqe_eth_seg, mss)

-static inline bool mlx5e_accel_tx_eseg(struct mlx5e_priv *priv,
+static inline void mlx5e_accel_tx_eseg(struct mlx5e_priv *priv,
 				       struct sk_buff *skb,
 				       struct mlx5_wqe_eth_seg *eseg, u16 ihs)
 {
@@ -175,8 +175,6 @@ static inline bool mlx5e_accel_tx_eseg(struct mlx5e_priv *priv,
 	if (skb->encapsulation && skb->ip_summed == CHECKSUM_PARTIAL)
 		mlx5e_tx_tunnel_accel(skb, eseg, ihs);
 #endif
-
-	return true;
 }

 static inline void mlx5e_accel_tx_finish(struct mlx5e_txqsq *sq,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index 26f7fab109d9..7cab08a2f715 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -428,7 +428,6 @@ int mlx5e_ipsec_init(struct mlx5e_priv *priv)
 	spin_lock_init(&ipsec->sadb_rx_lock);
 	ida_init(&ipsec->halloc);
 	ipsec->en_priv = priv;
-	ipsec->en_priv->ipsec = ipsec;
 	ipsec->no_trailer = !!(mlx5_accel_ipsec_device_caps(priv->mdev) &
 			       MLX5_ACCEL_IPSEC_CAP_RX_NO_TRAILER);
 	ipsec->wq = alloc_ordered_workqueue("mlx5e_ipsec: %s", 0,
@@ -438,6 +437,7 @@ int mlx5e_ipsec_init(struct mlx5e_priv *priv)
 		return -ENOMEM;
 	}

+	priv->ipsec = ipsec;
 	mlx5e_accel_ipsec_fs_init(priv);
 	netdev_dbg(priv->netdev, "IPSec attached to netdevice\n");
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
index a97e8d205094..33de8f0092a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
@@ -136,8 +136,6 @@ static void mlx5e_ipsec_set_swp(struct sk_buff *skb,
 				struct mlx5_wqe_eth_seg *eseg, u8 mode,
 				struct xfrm_offload *xo)
 {
-	struct mlx5e_swp_spec swp_spec = {};
-
 	/* Tunnel Mode:
 	 * SWP: OutL3 InL3 InL4
 	 * Pkt: MAC IP ESP IP L4
@@ -146,23 +144,58 @@ static void mlx5e_ipsec_set_swp(struct sk_buff *skb,
 	 * SWP: OutL3 InL4
 	 * InL3
 	 * Pkt: MAC IP ESP L4
+	 *
+	 * Tunnel(VXLAN TCP/UDP) over Transport Mode
+	 * SWP: OutL3 InL3 InL4
+	 * Pkt: MAC IP ESP UDP VXLAN IP L4
 	 */
-	swp_spec.l3_proto = skb->protocol;
-	swp_spec.is_tun = mode == XFRM_MODE_TUNNEL;
-	if (swp_spec.is_tun) {
-		if (xo->proto == IPPROTO_IPV6) {
-			swp_spec.tun_l3_proto = htons(ETH_P_IPV6);
-			swp_spec.tun_l4_proto = inner_ipv6_hdr(skb)->nexthdr;
-		} else {
-			swp_spec.tun_l3_proto = htons(ETH_P_IP);
-			swp_spec.tun_l4_proto = inner_ip_hdr(skb)->protocol;
-		}
-	} else {
-		swp_spec.tun_l3_proto = skb->protocol;
-		swp_spec.tun_l4_proto = xo->proto;
+
+	/* Shared settings */
+	eseg->swp_outer_l3_offset = skb_network_offset(skb) / 2;
+	if (skb->protocol == htons(ETH_P_IPV6))
+		eseg->swp_flags |= MLX5_ETH_WQE_SWP_OUTER_L3_IPV6;
+
+	/* Tunnel mode */
+	if (mode == XFRM_MODE_TUNNEL) {
+		eseg->swp_inner_l3_offset = skb_inner_network_offset(skb) / 2;
+		eseg->swp_inner_l4_offset = skb_inner_transport_offset(skb) / 2;
+		if (xo->proto == IPPROTO_IPV6)
+			eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L3_IPV6;
+		if (inner_ip_hdr(skb)->protocol == IPPROTO_UDP)
+			eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L4_UDP;
+		return;
+	}
+
+	/* Transport mode */
+	if (mode != XFRM_MODE_TRANSPORT)
+		return;
+
+	if (!xo->inner_ipproto) {
+		eseg->swp_inner_l3_offset = skb_network_offset(skb) / 2;
+		eseg->swp_inner_l4_offset = skb_inner_transport_offset(skb) / 2;
+		if (skb->protocol ==
htons(ETH_P_IPV6))
+			eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L3_IPV6;
+		if (xo->proto == IPPROTO_UDP)
+			eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L4_UDP;
+		return;
+	}
+
+	/* Tunnel(VXLAN TCP/UDP) over Transport Mode */
+	switch (xo->inner_ipproto) {
+	case IPPROTO_UDP:
+		eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L4_UDP;
+		fallthrough;
+	case IPPROTO_TCP:
+		eseg->swp_inner_l3_offset = skb_inner_network_offset(skb) / 2;
+		eseg->swp_inner_l4_offset = (skb->csum_start + skb->head - skb->data) / 2;
+		if (skb->protocol == htons(ETH_P_IPV6))
+			eseg->swp_flags |= MLX5_ETH_WQE_SWP_INNER_L3_IPV6;
+		break;
+	default:
+		break;
 	}

-	mlx5e_set_eseg_swp(skb, eseg, &swp_spec);
+	return;
 }

 void mlx5e_ipsec_set_iv_esn(struct sk_buff *skb, struct xfrm_state *x,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h
index 3e80742a3caf..5120a59361e6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h
@@ -93,18 +93,38 @@ static inline bool mlx5e_ipsec_eseg_meta(struct mlx5_wqe_eth_seg *eseg)
 void mlx5e_ipsec_tx_build_eseg(struct mlx5e_priv *priv, struct sk_buff *skb,
 			       struct mlx5_wqe_eth_seg *eseg);

-static inline bool mlx5e_ipsec_feature_check(struct sk_buff *skb, struct net_device *netdev,
-					     netdev_features_t features)
+static inline netdev_features_t
+mlx5e_ipsec_feature_check(struct sk_buff *skb, netdev_features_t features)
 {
+	struct xfrm_offload *xo = xfrm_offload(skb);
 	struct sec_path *sp = skb_sec_path(skb);

-	if (sp && sp->len) {
+	if (sp && sp->len && xo) {
 		struct xfrm_state *x = sp->xvec[0];

-		if (x && x->xso.offload_handle)
-			return true;
+		if (!x || !x->xso.offload_handle)
+			goto out_disable;
+
+		if (xo->inner_ipproto) {
+			/* Cannot support tunnel packet over IPsec tunnel mode
+			 * because we cannot offload three IP header csum
+			 */
+			if (x->props.mode == XFRM_MODE_TUNNEL)
+				goto out_disable;
+
+			/*
Only support UDP or TCP L4 checksum */
+			if (xo->inner_ipproto != IPPROTO_UDP &&
+			    xo->inner_ipproto != IPPROTO_TCP)
+				goto out_disable;
+		}
+
+		return features;
+
 	}
-	return false;
+
+	/* Disable CSUM and GSO for software IPsec */
+out_disable:
+	return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
 }

 #else
@@ -120,8 +140,9 @@ static inline bool mlx5e_ipsec_eseg_meta(struct mlx5_wqe_eth_seg *eseg)
 }

 static inline bool mlx5_ipsec_is_rx_flow(struct mlx5_cqe64 *cqe) { return false; }
-static inline bool mlx5e_ipsec_feature_check(struct sk_buff *skb, struct net_device *netdev,
-					     netdev_features_t features) { return false; }
+static inline netdev_features_t
+mlx5e_ipsec_feature_check(struct sk_buff *skb, netdev_features_t features)
+{ return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK); }
 #endif /* CONFIG_MLX5_EN_IPSEC */

 #endif /* __MLX5E_IPSEC_RXTX_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
index 95293ee0d38d..d93aadbf10da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
@@ -59,12 +59,15 @@ void mlx5e_ktls_build_netdev(struct mlx5e_priv *priv)
 	struct net_device *netdev = priv->netdev;
 	struct mlx5_core_dev *mdev = priv->mdev;

-	if (mlx5_accel_is_ktls_tx(mdev)) {
+	if (!mlx5e_accel_is_ktls_tx(mdev) && !mlx5e_accel_is_ktls_rx(mdev))
+		return;
+
+	if (mlx5e_accel_is_ktls_tx(mdev)) {
 		netdev->hw_features |= NETIF_F_HW_TLS_TX;
 		netdev->features |= NETIF_F_HW_TLS_TX;
 	}

-	if (mlx5_accel_is_ktls_rx(mdev))
+	if (mlx5e_accel_is_ktls_rx(mdev))
 		netdev->hw_features |= NETIF_F_HW_TLS_RX;

 	netdev->tlsdev_ops = &mlx5e_ktls_ops;
@@ -89,7 +92,7 @@ int mlx5e_ktls_init_rx(struct mlx5e_priv *priv)
 {
 	int err;

-	if (!mlx5_accel_is_ktls_rx(priv->mdev))
+	if (!mlx5e_accel_is_ktls_rx(priv->mdev))
 		return 0;

 	priv->tls->rx_wq = create_singlethread_workqueue("mlx5e_tls_rx");
@@ -109,7 +112,7 @@ int
mlx5e_ktls_init_rx(struct mlx5e_priv *priv)

 void mlx5e_ktls_cleanup_rx(struct mlx5e_priv *priv)
 {
-	if (!mlx5_accel_is_ktls_rx(priv->mdev))
+	if (!mlx5e_accel_is_ktls_rx(priv->mdev))
 		return;

 	if (priv->netdev->features & NETIF_F_HW_TLS_RX)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h
index aaa579bf9a39..5833deb2354c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.h
@@ -15,6 +15,25 @@ int mlx5e_ktls_set_feature_rx(struct net_device *netdev, bool enable);
 struct mlx5e_ktls_resync_resp *
 mlx5e_ktls_rx_resync_create_resp_list(void);
 void mlx5e_ktls_rx_resync_destroy_resp_list(struct mlx5e_ktls_resync_resp *resp_list);
+
+static inline bool mlx5e_accel_is_ktls_tx(struct mlx5_core_dev *mdev)
+{
+	return !is_kdump_kernel() &&
+		mlx5_accel_is_ktls_tx(mdev);
+}
+
+static inline bool mlx5e_accel_is_ktls_rx(struct mlx5_core_dev *mdev)
+{
+	return !is_kdump_kernel() &&
+		mlx5_accel_is_ktls_rx(mdev);
+}
+
+static inline bool mlx5e_accel_is_ktls_device(struct mlx5_core_dev *mdev)
+{
+	return !is_kdump_kernel() &&
+		mlx5_accel_is_ktls_device(mdev);
+}
+
 #else

 static inline void mlx5e_ktls_build_netdev(struct mlx5e_priv *priv)
@@ -44,6 +63,11 @@ mlx5e_ktls_rx_resync_create_resp_list(void)

 static inline void
 mlx5e_ktls_rx_resync_destroy_resp_list(struct mlx5e_ktls_resync_resp *resp_list) {}
+
+static inline bool mlx5e_accel_is_ktls_tx(struct mlx5_core_dev *mdev) { return false; }
+static inline bool mlx5e_accel_is_ktls_rx(struct mlx5_core_dev *mdev) { return false; }
+static inline bool mlx5e_accel_is_ktls_device(struct mlx5_core_dev *mdev) { return false; }
+
 #endif

 #endif /* __MLX5E_TLS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
index 51bdf71073f3..9ad3459fb63a 100644
---
a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -23,10 +23,13 @@ mlx5e_ktls_dumps_num_wqes(struct mlx5e_params *params, unsigned int nfrags,
 	return nfrags + DIV_ROUND_UP(sync_len, MLX5E_SW2HW_MTU(params, params->sw_mtu));
 }

-u16 mlx5e_ktls_get_stop_room(struct mlx5e_params *params)
+u16 mlx5e_ktls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
 {
 	u16 num_dumps, stop_room = 0;

+	if (!mlx5e_accel_is_ktls_tx(mdev))
+		return 0;
+
 	num_dumps = mlx5e_ktls_dumps_num_wqes(params, MAX_SKB_FRAGS, TLS_MAX_PAYLOAD_SIZE);

 	stop_room += mlx5e_stop_room_for_wqe(MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS);
@@ -135,6 +138,7 @@ void mlx5e_ktls_del_tx(struct net_device *netdev, struct tls_context *tls_ctx)
 	priv = netdev_priv(netdev);
 	mdev = priv->mdev;

+	atomic64_inc(&priv_tx->sw_stats->tx_tls_del);
 	mlx5e_destroy_tis(mdev, priv_tx->tisn);
 	mlx5_ktls_destroy_key(mdev, priv_tx->key_id);
 	kfree(priv_tx);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.h
index 8f79335057dc..08c9d5134479 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.h
@@ -14,7 +14,7 @@ struct mlx5e_accel_tx_tls_state {
 	u32 tls_tisn;
 };

-u16 mlx5e_ktls_get_stop_room(struct mlx5e_params *params);
+u16 mlx5e_ktls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params);

 bool mlx5e_ktls_handle_tx_skb(struct tls_context *tls_ctx, struct mlx5e_txqsq *sq,
 			      struct sk_buff *skb, int datalen,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
index d6b21b899dbc..b8fc863aa68d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.c
@@ -192,13 +192,13 @@ void mlx5e_tls_build_netdev(struct mlx5e_priv *priv)
 	struct net_device *netdev = priv->netdev;
 	u32 caps;

-	if (mlx5_accel_is_ktls_device(priv->mdev)) {
+	if (mlx5e_accel_is_ktls_device(priv->mdev)) {
 		mlx5e_ktls_build_netdev(priv);
 		return;
 	}

 	/* FPGA */
-	if (!mlx5_accel_is_tls_device(priv->mdev))
+	if (!mlx5e_accel_is_tls_device(priv->mdev))
 		return;

 	caps = mlx5_accel_tls_device_caps(priv->mdev);
@@ -224,7 +224,7 @@ int mlx5e_tls_init(struct mlx5e_priv *priv)
 {
 	struct mlx5e_tls *tls;

-	if (!mlx5_accel_is_tls_device(priv->mdev))
+	if (!mlx5e_accel_is_tls_device(priv->mdev))
 		return 0;

 	tls = kzalloc(sizeof(*tls), GFP_KERNEL);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.h
index 4c9274d390da..62ecf14bf86a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.h
@@ -42,6 +42,7 @@

 struct mlx5e_tls_sw_stats {
 	atomic64_t tx_tls_ctx;
+	atomic64_t tx_tls_del;
 	atomic64_t tx_tls_drop_metadata;
 	atomic64_t tx_tls_drop_resync_alloc;
 	atomic64_t tx_tls_drop_no_sync_data;
@@ -103,11 +104,18 @@ int mlx5e_tls_get_count(struct mlx5e_priv *priv);
 int mlx5e_tls_get_strings(struct mlx5e_priv *priv, uint8_t *data);
 int mlx5e_tls_get_stats(struct mlx5e_priv *priv, u64 *data);

+static inline bool mlx5e_accel_is_tls_device(struct mlx5_core_dev *mdev)
+{
+	return !is_kdump_kernel() &&
+		mlx5_accel_is_tls_device(mdev);
+}
+
 #else

 static inline void mlx5e_tls_build_netdev(struct mlx5e_priv *priv)
 {
-	if (mlx5_accel_is_ktls_device(priv->mdev))
+	if (!is_kdump_kernel() &&
+	    mlx5_accel_is_ktls_device(priv->mdev))
 		mlx5e_ktls_build_netdev(priv);
 }

@@ -117,6 +125,7 @@ static inline void mlx5e_tls_cleanup(struct mlx5e_priv *priv) { }
 static inline int mlx5e_tls_get_count(struct mlx5e_priv *priv) { return 0; }
 static inline int mlx5e_tls_get_strings(struct mlx5e_priv *priv, uint8_t *data) { return 0; }
 static inline int mlx5e_tls_get_stats(struct mlx5e_priv *priv, u64 *data) { return 0; }
+static inline bool
mlx5e_accel_is_tls_device(struct mlx5_core_dev *mdev) { return false; }

 #endif

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.c
index 82dc09aaa7fc..7a700f913582 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.c
@@ -273,7 +273,7 @@ bool mlx5e_tls_handle_tx_skb(struct net_device *netdev, struct mlx5e_txqsq *sq,
 	if (WARN_ON_ONCE(tls_ctx->netdev != netdev))
 		goto err_out;

-	if (mlx5_accel_is_ktls_tx(sq->mdev))
+	if (mlx5e_accel_is_ktls_tx(sq->mdev))
 		return mlx5e_ktls_handle_tx_skb(tls_ctx, sq, skb, datalen, state);

 	/* FPGA */
@@ -378,11 +378,11 @@ void mlx5e_tls_handle_rx_skb_metadata(struct mlx5e_rq *rq, struct sk_buff *skb,

 u16 mlx5e_tls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
 {
-	if (!mlx5_accel_is_tls_device(mdev))
+	if (!mlx5e_accel_is_tls_device(mdev))
 		return 0;

-	if (mlx5_accel_is_ktls_device(mdev))
-		return mlx5e_ktls_get_stop_room(params);
+	if (mlx5e_accel_is_ktls_device(mdev))
+		return mlx5e_ktls_get_stop_room(mdev, params);

 	/* FPGA */
 	/* Resync SKB.
*/
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_stats.c
index 29463bdb7715..56e7b2aee85f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_stats.c
@@ -47,6 +47,7 @@ static const struct counter_desc mlx5e_tls_sw_stats_desc[] = {

 static const struct counter_desc mlx5e_ktls_sw_stats_desc[] = {
 	{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, tx_tls_ctx) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, tx_tls_del) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, rx_tls_ctx) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_tls_sw_stats, rx_tls_del) },
 };
@@ -58,7 +59,7 @@ static const struct counter_desc *get_tls_atomic_stats(struct mlx5e_priv *priv)
 {
 	if (!priv->tls)
 		return NULL;
-	if (mlx5_accel_is_ktls_device(priv->mdev))
+	if (mlx5e_accel_is_ktls_device(priv->mdev))
 		return mlx5e_ktls_sw_stats_desc;
 	return mlx5e_tls_sw_stats_desc;
 }
@@ -67,7 +68,7 @@ int mlx5e_tls_get_count(struct mlx5e_priv *priv)
 {
 	if (!priv->tls)
 		return 0;
-	if (mlx5_accel_is_ktls_device(priv->mdev))
+	if (mlx5e_accel_is_ktls_device(priv->mdev))
 		return ARRAY_SIZE(mlx5e_ktls_sw_stats_desc);
 	return ARRAY_SIZE(mlx5e_tls_sw_stats_desc);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index d6513aef5cd4..bd72572e03d1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1992,7 +1992,7 @@ static int set_pflag_tx_mpwqe_common(struct net_device *netdev, u32 flag, bool e
 	struct mlx5_core_dev *mdev = priv->mdev;
 	struct mlx5e_params new_params;

-	if (enable && !MLX5_CAP_ETH(mdev, enhanced_multi_pkt_send_wqe))
+	if (enable && !mlx5e_tx_mpwqe_supported(mdev))
 		return -EOPNOTSUPP;

 	new_params = priv->channels.params;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d26b8ed51195..414a73d16619 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -91,12 +91,16 @@ void mlx5e_update_carrier(struct mlx5e_priv *priv)
 {
 	struct mlx5_core_dev *mdev = priv->mdev;
 	u8 port_state;
+	bool up;

 	port_state = mlx5_query_vport_state(mdev,
 					    MLX5_VPORT_STATE_OP_MOD_VNIC_VPORT,
 					    0);

-	if (port_state == VPORT_STATE_UP) {
+	up = port_state == VPORT_STATE_UP;
+	if (up == netif_carrier_ok(priv->netdev))
+		netif_carrier_event(priv->netdev);
+	if (up) {
 		netdev_info(priv->netdev, "Link up\n");
 		netif_carrier_on(priv->netdev);
 	} else {
@@ -853,7 +857,7 @@ int mlx5e_open_rq(struct mlx5e_params *params, struct mlx5e_rq_param *param,
 	if (err)
 		goto err_destroy_rq;

-	if (mlx5e_is_tls_on(rq->priv) && !mlx5_accel_is_ktls_device(mdev))
+	if (mlx5e_is_tls_on(rq->priv) && !mlx5e_accel_is_ktls_device(mdev))
 		__set_bit(MLX5E_RQ_STATE_FPGA_TLS, &rq->state); /* must be FPGA */

 	if (MLX5_CAP_ETH(mdev, cqe_checksum_full))
@@ -4327,6 +4331,11 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
 		if (port == GENEVE_UDP_PORT && mlx5_geneve_tx_allowed(priv->mdev))
 			return features;
 #endif
+		break;
+#ifdef CONFIG_MLX5_EN_IPSEC
+	case IPPROTO_ESP:
+		return mlx5e_ipsec_feature_check(skb, features);
+#endif
 	}

 out:
@@ -4343,9 +4352,6 @@ netdev_features_t mlx5e_features_check(struct sk_buff *skb,
 	features = vlan_features_check(skb, features);
 	features = vxlan_features_check(skb, features);

-	if (mlx5e_ipsec_feature_check(skb, netdev, features))
-		return features;
-
 	/* Validate if the tunneled packet is being offloaded by HW */
 	if (skb->encapsulation &&
 	    (features & NETIF_F_CSUM_MASK || features & NETIF_F_GSO_MASK))
@@ -4661,12 +4667,10 @@ void mlx5e_build_nic_params(struct mlx5e_priv *priv, struct mlx5e_xsk *xsk, u16
 	params->log_sq_size = is_kdump_kernel() ?
MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE :
 		MLX5E_PARAMS_DEFAULT_LOG_SQ_SIZE;
-	MLX5E_SET_PFLAG(params, MLX5E_PFLAG_SKB_TX_MPWQE,
-			MLX5_CAP_ETH(mdev, enhanced_multi_pkt_send_wqe));
+	MLX5E_SET_PFLAG(params, MLX5E_PFLAG_SKB_TX_MPWQE, mlx5e_tx_mpwqe_supported(mdev));

 	/* XDP SQ */
-	MLX5E_SET_PFLAG(params, MLX5E_PFLAG_XDP_TX_MPWQE,
-			MLX5_CAP_ETH(mdev, enhanced_multi_pkt_send_wqe));
+	MLX5E_SET_PFLAG(params, MLX5E_PFLAG_XDP_TX_MPWQE, mlx5e_tx_mpwqe_supported(mdev));

 	/* set CQE compression */
 	params->rx_cqe_compress_def = false;
@@ -5103,7 +5107,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
 	mlx5e_set_netdev_mtu_boundaries(priv);
 	mlx5e_set_dev_port_mtu(priv);

-	mlx5_lag_add(mdev, netdev);
+	mlx5_lag_add_netdev(mdev, netdev);

 	mlx5e_enable_async_events(priv);
 	mlx5e_enable_blocking_events(priv);
@@ -5151,7 +5155,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
 		priv->en_trap = NULL;
 	}
 	mlx5e_disable_async_events(priv);
-	mlx5_lag_remove(mdev);
+	mlx5_lag_remove_netdev(mdev, priv->netdev);
 	mlx5_vxlan_reset_to_default(mdev->vxlan);
 }

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 34eb1118670f..bf94bcb6fa5d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -45,11 +45,13 @@
 #include "en_tc.h"
 #include "en/rep/tc.h"
 #include "en/rep/neigh.h"
+#include "en/rep/bridge.h"
 #include "en/devlink.h"
 #include "fs_core.h"
 #include "lib/mlx5.h"
 #define CREATE_TRACE_POINTS
 #include "diag/en_rep_tracepoint.h"
+#include "en_accel/ipsec.h"

 #define MLX5E_REP_PARAMS_DEF_LOG_SQ_SIZE \
 	max(0x7, MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE)
@@ -536,13 +538,13 @@ static const struct net_device_ops mlx5e_netdev_ops_rep = {
 	.ndo_change_carrier = mlx5e_rep_change_carrier,
 };

-bool mlx5e_eswitch_uplink_rep(struct net_device *netdev)
+bool mlx5e_eswitch_uplink_rep(const struct net_device *netdev)
 {
 	return netdev->netdev_ops == &mlx5e_netdev_ops &&
mlx5e_is_uplink_rep(netdev_priv(netdev));
 }

-bool mlx5e_eswitch_vf_rep(struct net_device *netdev)
+bool mlx5e_eswitch_vf_rep(const struct net_device *netdev)
 {
 	return netdev->netdev_ops == &mlx5e_netdev_ops_rep;
 }
@@ -629,6 +631,11 @@ static int mlx5e_init_ul_rep(struct mlx5_core_dev *mdev,
 			     struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
+	int err;
+
+	err = mlx5e_ipsec_init(priv);
+	if (err)
+		mlx5_core_err(mdev, "Uplink rep IPsec initialization failed, %d\n", err);

 	mlx5e_vxlan_set_netdev_info(priv);
 	return mlx5e_init_rep(mdev, netdev);
@@ -636,6 +643,7 @@ static int mlx5e_init_ul_rep(struct mlx5_core_dev *mdev,

 static void mlx5e_cleanup_rep(struct mlx5e_priv *priv)
 {
+	mlx5e_ipsec_cleanup(priv);
 }

 static int mlx5e_create_rep_ttc_table(struct mlx5e_priv *priv)
@@ -975,12 +983,13 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv)
 	if (MLX5_CAP_GEN(mdev, uplink_follow))
 		mlx5_modify_vport_admin_state(mdev, MLX5_VPORT_STATE_OP_MOD_UPLINK,
 					      0, 0, MLX5_VPORT_ADMIN_STATE_AUTO);
-	mlx5_lag_add(mdev, netdev);
+	mlx5_lag_add_netdev(mdev, netdev);
 	priv->events_nb.notifier_call = uplink_rep_async_event;
 	mlx5_notifier_register(mdev, &priv->events_nb);
 	mlx5e_dcbnl_initialize(priv);
 	mlx5e_dcbnl_init_app(priv);
 	mlx5e_rep_neigh_init(rpriv);
+	mlx5e_rep_bridge_init(priv);

 	netdev->wanted_features |= NETIF_F_HW_TC;

@@ -1002,11 +1011,12 @@ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv)
 	netif_device_detach(priv->netdev);
 	rtnl_unlock();

+	mlx5e_rep_bridge_cleanup(priv);
 	mlx5e_rep_neigh_cleanup(rpriv);
 	mlx5e_dcbnl_delete_app(priv);
 	mlx5_notifier_unregister(mdev, &priv->events_nb);
 	mlx5e_rep_tc_disable(priv);
-	mlx5_lag_remove(mdev);
+	mlx5_lag_remove_netdev(mdev, priv->netdev);
 }

 static MLX5E_DEFINE_STATS_GRP(sw_rep, 0);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
index 22585015c7a7..47a2dfb7792a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
+++
b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h
@@ -231,9 +231,9 @@ void mlx5e_remove_sqs_fwd_rules(struct mlx5e_priv *priv);

 void mlx5e_rep_queue_neigh_stats_work(struct mlx5e_priv *priv);

-bool mlx5e_eswitch_vf_rep(struct net_device *netdev);
-bool mlx5e_eswitch_uplink_rep(struct net_device *netdev);
-static inline bool mlx5e_eswitch_rep(struct net_device *netdev)
+bool mlx5e_eswitch_vf_rep(const struct net_device *netdev);
+bool mlx5e_eswitch_uplink_rep(const struct net_device *netdev);
+static inline bool mlx5e_eswitch_rep(const struct net_device *netdev)
 {
 	return mlx5e_eswitch_vf_rep(netdev) ||
 	       mlx5e_eswitch_uplink_rep(netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index f90894eea9e0..3c65fd0bcf31 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -579,6 +579,9 @@ INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
 	if (mlx5_wq_cyc_missing(wq) < wqe_bulk)
 		return false;

+	if (rq->page_pool)
+		page_pool_nid_changed(rq->page_pool, numa_mem_id());
+
 	do {
 		u16 head = mlx5_wq_cyc_get_head(wq);

@@ -734,6 +737,9 @@ INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
 	if (likely(missing < UMR_WQE_BULK))
 		return false;

+	if (rq->page_pool)
+		page_pool_nid_changed(rq->page_pool, numa_mem_id());
+
 	head = rq->mpwqe.actual_wq_head;
 	i = missing;
 	do {
@@ -1310,7 +1316,8 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
 	if (rep->vlan && skb_vlan_tag_present(skb))
 		skb_vlan_pop(skb);

-	if (!mlx5e_rep_tc_update_skb(cqe, skb, &tc_priv)) {
+	if (unlikely(!mlx5_ipsec_is_rx_flow(cqe) &&
+		     !mlx5e_rep_tc_update_skb(cqe, skb, &tc_priv))) {
 		dev_kfree_skb_any(skb);
 		goto free_wqe;
 	}
@@ -1367,7 +1374,8 @@ static void mlx5e_handle_rx_cqe_mpwrq_rep(struct mlx5e_rq *rq, struct mlx5_cqe64

 	mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb);

-	if (!mlx5e_rep_tc_update_skb(cqe, skb, &tc_priv)) {
+
	if (unlikely(!mlx5_ipsec_is_rx_flow(cqe) &&
+		     !mlx5e_rep_tc_update_skb(cqe, skb, &tc_priv))) {
 		dev_kfree_skb_any(skb);
 		goto mpwrq_cqe_out;
 	}
@@ -1553,12 +1561,9 @@ int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget)
 	if (unlikely(!test_bit(MLX5E_RQ_STATE_ENABLED, &rq->state)))
 		return 0;

-	if (rq->page_pool)
-		page_pool_nid_changed(rq->page_pool, numa_mem_id());
-
 	if (rq->cqd.left) {
 		work_done += mlx5e_decompress_cqes_cont(rq, cqwq, 0, budget);
-		if (rq->cqd.left || work_done >= budget)
+		if (work_done >= budget)
 			goto out;
 	}

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index d4b0f270b6bb..629a61e8022f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -83,17 +83,17 @@ struct mlx5e_tc_attr_to_reg_mapping mlx5e_tc_attr_to_reg_mappings[] = {
 	[CHAIN_TO_REG] = {
 		.mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_0,
 		.moffset = 0,
-		.mlen = 2,
+		.mlen = 16,
 	},
 	[VPORT_TO_REG] = {
 		.mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_0,
-		.moffset = 2,
-		.mlen = 2,
+		.moffset = 16,
+		.mlen = 16,
 	},
 	[TUNNEL_TO_REG] = {
 		.mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_1,
-		.moffset = 1,
-		.mlen = ((ESW_TUN_OPTS_BITS + ESW_TUN_ID_BITS) / 8),
+		.moffset = 8,
+		.mlen = ESW_TUN_OPTS_BITS + ESW_TUN_ID_BITS,
 		.soffset = MLX5_BYTE_OFF(fte_match_param,
 					 misc_parameters_2.metadata_reg_c_1),
 	},
@@ -110,7 +110,7 @@ struct mlx5e_tc_attr_to_reg_mapping mlx5e_tc_attr_to_reg_mappings[] = {
 	[NIC_CHAIN_TO_REG] = {
 		.mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_B,
 		.moffset = 0,
-		.mlen = 2,
+		.mlen = 16,
 	},
 	[NIC_ZONE_RESTORE_TO_REG] = nic_zone_restore_to_reg_ct,
 };
@@ -128,23 +128,46 @@ static void mlx5e_put_flow_tunnel_id(struct mlx5e_tc_flow *flow);
 void
 mlx5e_tc_match_to_reg_match(struct mlx5_flow_spec *spec,
 			    enum mlx5e_tc_attr_to_reg type,
-			    u32 data,
+			    u32 val,
 			    u32 mask)
 {
+	void *headers_c = spec->match_criteria, *headers_v = spec->match_value, *fmask, *fval;
 	int soffset =
mlx5e_tc_attr_to_reg_mappings[type].soffset; + int moffset = mlx5e_tc_attr_to_reg_mappings[type].moffset; int match_len = mlx5e_tc_attr_to_reg_mappings[type].mlen; - void *headers_c = spec->match_criteria; - void *headers_v = spec->match_value; - void *fmask, *fval; + u32 max_mask = GENMASK(match_len - 1, 0); + __be32 curr_mask_be, curr_val_be; + u32 curr_mask, curr_val; fmask = headers_c + soffset; fval = headers_v + soffset; - mask = (__force u32)(cpu_to_be32(mask)) >> (32 - (match_len * 8)); - data = (__force u32)(cpu_to_be32(data)) >> (32 - (match_len * 8)); + memcpy(&curr_mask_be, fmask, 4); + memcpy(&curr_val_be, fval, 4); + + curr_mask = be32_to_cpu(curr_mask_be); + curr_val = be32_to_cpu(curr_val_be); + + //move to correct offset + WARN_ON(mask > max_mask); + mask <<= moffset; + val <<= moffset; + max_mask <<= moffset; + + //zero val and mask + curr_mask &= ~max_mask; + curr_val &= ~max_mask; - memcpy(fmask, &mask, match_len); - memcpy(fval, &data, match_len); + //add current to mask + curr_mask |= mask; + curr_val |= val; + + //back to be32 and write + curr_mask_be = cpu_to_be32(curr_mask); + curr_val_be = cpu_to_be32(curr_val); + + memcpy(fmask, &curr_mask_be, 4); + memcpy(fval, &curr_val_be, 4); spec->match_criteria_enable |= MLX5_MATCH_MISC_PARAMETERS_2; } @@ -152,23 +175,28 @@ mlx5e_tc_match_to_reg_match(struct mlx5_flow_spec *spec, void mlx5e_tc_match_to_reg_get_match(struct mlx5_flow_spec *spec, enum mlx5e_tc_attr_to_reg type, - u32 *data, + u32 *val, u32 *mask) { + void *headers_c = spec->match_criteria, *headers_v = spec->match_value, *fmask, *fval; int soffset = mlx5e_tc_attr_to_reg_mappings[type].soffset; + int moffset = mlx5e_tc_attr_to_reg_mappings[type].moffset; int match_len = mlx5e_tc_attr_to_reg_mappings[type].mlen; - void *headers_c = spec->match_criteria; - void *headers_v = spec->match_value; - void *fmask, *fval; + u32 max_mask = GENMASK(match_len - 1, 0); + __be32 curr_mask_be, curr_val_be; + u32 curr_mask, curr_val; fmask = headers_c 
+ soffset; fval = headers_v + soffset; - memcpy(mask, fmask, match_len); - memcpy(data, fval, match_len); + memcpy(&curr_mask_be, fmask, 4); + memcpy(&curr_val_be, fval, 4); + + curr_mask = be32_to_cpu(curr_mask_be); + curr_val = be32_to_cpu(curr_val_be); - *mask = be32_to_cpu((__force __be32)(*mask << (32 - (match_len * 8)))); - *data = be32_to_cpu((__force __be32)(*data << (32 - (match_len * 8)))); + *mask = (curr_mask >> moffset) & max_mask; + *val = (curr_val >> moffset) & max_mask; } int @@ -192,13 +220,13 @@ mlx5e_tc_match_to_reg_set_and_get_id(struct mlx5_core_dev *mdev, (mod_hdr_acts->num_actions * MLX5_MH_ACT_SZ); /* Firmware has 5bit length field and 0 means 32bits */ - if (mlen == 4) + if (mlen == 32) mlen = 0; MLX5_SET(set_action_in, modact, action_type, MLX5_ACTION_TYPE_SET); MLX5_SET(set_action_in, modact, field, mfield); - MLX5_SET(set_action_in, modact, offset, moffset * 8); - MLX5_SET(set_action_in, modact, length, mlen * 8); + MLX5_SET(set_action_in, modact, offset, moffset); + MLX5_SET(set_action_in, modact, length, mlen); MLX5_SET(set_action_in, modact, data, data); err = mod_hdr_acts->num_actions; mod_hdr_acts->num_actions++; @@ -296,13 +324,13 @@ void mlx5e_tc_match_to_reg_mod_hdr_change(struct mlx5_core_dev *mdev, modact = mod_hdr_acts->actions + (act_id * MLX5_MH_ACT_SZ); /* Firmware has 5bit length field and 0 means 32bits */ - if (mlen == 4) + if (mlen == 32) mlen = 0; MLX5_SET(set_action_in, modact, action_type, MLX5_ACTION_TYPE_SET); MLX5_SET(set_action_in, modact, field, mfield); - MLX5_SET(set_action_in, modact, offset, moffset * 8); - MLX5_SET(set_action_in, modact, length, mlen * 8); + MLX5_SET(set_action_in, modact, offset, moffset); + MLX5_SET(set_action_in, modact, length, mlen); MLX5_SET(set_action_in, modact, data, data); } @@ -818,7 +846,7 @@ static int mlx5e_hairpin_flow_add(struct mlx5e_priv *priv, hash_hairpin_info(peer_id, match_prio)); mutex_unlock(&priv->fs.tc.hairpin_tbl_lock); - params.log_data_size = 15; + 
params.log_data_size = 16; params.log_data_size = min_t(u8, params.log_data_size, MLX5_CAP_GEN(priv->mdev, log_max_hairpin_wq_data_sz)); params.log_data_size = max_t(u8, params.log_data_size, @@ -5105,7 +5133,7 @@ bool mlx5e_tc_update_skb(struct mlx5_cqe64 *cqe, tc_skb_ext->chain = chain; - zone_restore_id = (reg_b >> REG_MAPPING_SHIFT(NIC_ZONE_RESTORE_TO_REG)) & + zone_restore_id = (reg_b >> REG_MAPPING_MOFFSET(NIC_ZONE_RESTORE_TO_REG)) & ESW_ZONE_ID_MASK; if (!mlx5e_tc_ct_restore_flow(tc->ct, skb, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h index 17027536efba..f7cbeb0b66d2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h @@ -129,7 +129,7 @@ struct tunnel_match_enc_opts { */ #define TUNNEL_INFO_BITS 12 #define TUNNEL_INFO_BITS_MASK GENMASK(TUNNEL_INFO_BITS - 1, 0) -#define ENC_OPTS_BITS 12 +#define ENC_OPTS_BITS 11 #define ENC_OPTS_BITS_MASK GENMASK(ENC_OPTS_BITS - 1, 0) #define TUNNEL_ID_BITS (TUNNEL_INFO_BITS + ENC_OPTS_BITS) #define TUNNEL_ID_MASK GENMASK(TUNNEL_ID_BITS - 1, 0) @@ -201,10 +201,10 @@ enum mlx5e_tc_attr_to_reg { struct mlx5e_tc_attr_to_reg_mapping { int mfield; /* rewrite field */ - int moffset; /* offset of mfield */ - int mlen; /* bytes to rewrite/match */ + int moffset; /* bit offset of mfield */ + int mlen; /* bits to rewrite/match */ - int soffset; /* offset of spec for match */ + int soffset; /* byte offset of spec for match */ }; extern struct mlx5e_tc_attr_to_reg_mapping mlx5e_tc_attr_to_reg_mappings[]; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c index 320fe0cda917..c63d78eda606 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c @@ -687,16 +687,12 @@ void mlx5e_tx_mpwqe_ensure_complete(struct mlx5e_txqsq *sq) mlx5e_tx_mpwqe_session_complete(sq); } -static bool 
mlx5e_txwqe_build_eseg(struct mlx5e_priv *priv, struct mlx5e_txqsq *sq, +static void mlx5e_txwqe_build_eseg(struct mlx5e_priv *priv, struct mlx5e_txqsq *sq, struct sk_buff *skb, struct mlx5e_accel_tx_state *accel, struct mlx5_wqe_eth_seg *eseg, u16 ihs) { - if (unlikely(!mlx5e_accel_tx_eseg(priv, skb, eseg, ihs))) - return false; - + mlx5e_accel_tx_eseg(priv, skb, eseg, ihs); mlx5e_txwqe_build_eseg_csum(sq, skb, accel, eseg); - - return true; } netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev) @@ -725,10 +721,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev) if (mlx5e_tx_skb_supports_mpwqe(skb, &attr)) { struct mlx5_wqe_eth_seg eseg = {}; - if (unlikely(!mlx5e_txwqe_build_eseg(priv, sq, skb, &accel, &eseg, - attr.ihs))) - return NETDEV_TX_OK; - + mlx5e_txwqe_build_eseg(priv, sq, skb, &accel, &eseg, attr.ihs); mlx5e_sq_xmit_mpwqe(sq, skb, &eseg, netdev_xmit_more()); return NETDEV_TX_OK; } @@ -743,9 +736,7 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev) /* May update the WQE, but may not post other WQEs. */ mlx5e_accel_tx_finish(sq, wqe, &accel, (struct mlx5_wqe_inline_seg *)(wqe->data + wqe_attr.ds_cnt_inl)); - if (unlikely(!mlx5e_txwqe_build_eseg(priv, sq, skb, &accel, &wqe->eth, attr.ihs))) - return NETDEV_TX_OK; - + mlx5e_txwqe_build_eseg(priv, sq, skb, &accel, &wqe->eth, attr.ihs); mlx5e_sq_xmit_wqe(sq, skb, &attr, &wqe_attr, wqe, pi, netdev_xmit_more()); return NETDEV_TX_OK; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index 940333410267..6e074cc457de 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -1,33 +1,6 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* - * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. 
You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. + * Copyright (c) 2013-2021, Mellanox Technologies inc. All rights reserved. 
*/ #include <linux/interrupt.h> @@ -45,6 +18,7 @@ #include "eswitch.h" #include "lib/clock.h" #include "diag/fw_tracer.h" +#include "mlx5_irq.h" enum { MLX5_EQE_OWNER_INIT_VAL = 0x1, @@ -84,6 +58,9 @@ struct mlx5_eq_table { struct mutex lock; /* sync async eqs creations */ int num_comp_eqs; struct mlx5_irq_table *irq_table; +#ifdef CONFIG_RFS_ACCEL + struct cpu_rmap *rmap; +#endif }; #define MLX5_ASYNC_EVENT_MASK ((1ull << MLX5_EVENT_TYPE_PATH_MIG) | \ @@ -288,7 +265,7 @@ create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u32 out[MLX5_ST_SZ_DW(create_eq_out)] = {0}; u8 log_eq_stride = ilog2(MLX5_EQE_SIZE); struct mlx5_priv *priv = &dev->priv; - u8 vecidx = param->irq_index; + u16 vecidx = param->irq_index; __be64 *pas; void *eqc; int inlen; @@ -311,13 +288,20 @@ create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, mlx5_init_fbc(eq->frag_buf.frags, log_eq_stride, log_eq_size, &eq->fbc); init_eq_buf(eq); + eq->irq = mlx5_irq_request(dev, vecidx, param->affinity); + if (IS_ERR(eq->irq)) { + err = PTR_ERR(eq->irq); + goto err_buf; + } + + vecidx = mlx5_irq_get_index(eq->irq); inlen = MLX5_ST_SZ_BYTES(create_eq_in) + MLX5_FLD_SZ_BYTES(create_eq_in, pas[0]) * eq->frag_buf.npages; in = kvzalloc(inlen, GFP_KERNEL); if (!in) { err = -ENOMEM; - goto err_buf; + goto err_irq; } pas = (__be64 *)MLX5_ADDR_OF(create_eq_in, in, pas); @@ -361,6 +345,8 @@ err_eq: err_in: kvfree(in); +err_irq: + mlx5_irq_release(eq->irq); err_buf: mlx5_frag_buf_free(dev, &eq->frag_buf); return err; @@ -379,10 +365,9 @@ err_buf: int mlx5_eq_enable(struct mlx5_core_dev *dev, struct mlx5_eq *eq, struct notifier_block *nb) { - struct mlx5_eq_table *eq_table = dev->priv.eq_table; int err; - err = mlx5_irq_attach_nb(eq_table->irq_table, eq->vecidx, nb); + err = mlx5_irq_attach_nb(eq->irq, nb); if (!err) eq_update_ci(eq, 1); @@ -401,9 +386,7 @@ EXPORT_SYMBOL(mlx5_eq_enable); void mlx5_eq_disable(struct mlx5_core_dev *dev, struct mlx5_eq *eq, struct notifier_block *nb) { - struct 
mlx5_eq_table *eq_table = dev->priv.eq_table; - - mlx5_irq_detach_nb(eq_table->irq_table, eq->vecidx, nb); + mlx5_irq_detach_nb(eq->irq, nb); } EXPORT_SYMBOL(mlx5_eq_disable); @@ -417,10 +400,9 @@ static int destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq) if (err) mlx5_core_warn(dev, "failed to destroy a previously created eq: eqn %d\n", eq->eqn); - synchronize_irq(eq->irqn); + mlx5_irq_release(eq->irq); mlx5_frag_buf_free(dev, &eq->frag_buf); - return err; } @@ -492,14 +474,7 @@ static int create_async_eq(struct mlx5_core_dev *dev, int err; mutex_lock(&eq_table->lock); - /* Async EQs must share irq index 0 */ - if (param->irq_index != 0) { - err = -EINVAL; - goto unlock; - } - err = create_map_eq(dev, eq, param); -unlock: mutex_unlock(&eq_table->lock); return err; } @@ -618,8 +593,11 @@ setup_async_eq(struct mlx5_core_dev *dev, struct mlx5_eq_async *eq, eq->irq_nb.notifier_call = mlx5_eq_async_int; spin_lock_init(&eq->lock); + if (!zalloc_cpumask_var(&param->affinity, GFP_KERNEL)) + return -ENOMEM; err = create_async_eq(dev, &eq->core, param); + free_cpumask_var(param->affinity); if (err) { mlx5_core_warn(dev, "failed to create %s EQ %d\n", name, err); return err; @@ -654,7 +632,6 @@ static int create_async_eqs(struct mlx5_core_dev *dev) mlx5_eq_notifier_register(dev, &table->cq_err_nb); param = (struct mlx5_eq_param) { - .irq_index = 0, .nent = MLX5_NUM_CMD_EQE, .mask[0] = 1ull << MLX5_EVENT_TYPE_CMD, }; @@ -667,7 +644,6 @@ static int create_async_eqs(struct mlx5_core_dev *dev) mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL); param = (struct mlx5_eq_param) { - .irq_index = 0, .nent = MLX5_NUM_ASYNC_EQE, }; @@ -677,7 +653,6 @@ static int create_async_eqs(struct mlx5_core_dev *dev) goto err2; param = (struct mlx5_eq_param) { - .irq_index = 0, .nent = /* TODO: sriov max_vf + */ 1, .mask[0] = 1ull << MLX5_EVENT_TYPE_PAGE_REQUEST, }; @@ -737,6 +712,9 @@ mlx5_eq_create_generic(struct mlx5_core_dev *dev, struct mlx5_eq *eq = kvzalloc(sizeof(*eq), 
GFP_KERNEL); int err; + if (!cpumask_available(param->affinity)) + return ERR_PTR(-EINVAL); + if (!eq) return ERR_PTR(-ENOMEM); @@ -847,16 +825,21 @@ static int create_comp_eqs(struct mlx5_core_dev *dev) .irq_index = vecidx, .nent = nent, }; - err = create_map_eq(dev, &eq->core, &param); - if (err) { - kfree(eq); - goto clean; + + if (!zalloc_cpumask_var(&param.affinity, GFP_KERNEL)) { + err = -ENOMEM; + goto clean_eq; } + cpumask_set_cpu(cpumask_local_spread(i, dev->priv.numa_node), + param.affinity); + err = create_map_eq(dev, &eq->core, &param); + free_cpumask_var(param.affinity); + if (err) + goto clean_eq; err = mlx5_eq_enable(dev, &eq->core, &eq->irq_nb); if (err) { destroy_unmap_eq(dev, &eq->core); - kfree(eq); - goto clean; + goto clean_eq; } mlx5_core_dbg(dev, "allocated completion EQN %d\n", eq->core.eqn); @@ -865,7 +848,8 @@ static int create_comp_eqs(struct mlx5_core_dev *dev) } return 0; - +clean_eq: + kfree(eq); clean: destroy_comp_eqs(dev); return err; @@ -901,17 +885,23 @@ EXPORT_SYMBOL(mlx5_comp_vectors_count); struct cpumask * mlx5_comp_irq_get_affinity_mask(struct mlx5_core_dev *dev, int vector) { - int vecidx = vector + MLX5_IRQ_VEC_COMP_BASE; + struct mlx5_eq_table *table = dev->priv.eq_table; + struct mlx5_eq_comp *eq, *n; + int i = 0; - return mlx5_irq_get_affinity_mask(dev->priv.eq_table->irq_table, - vecidx); + list_for_each_entry_safe(eq, n, &table->comp_eqs_list, list) { + if (i++ == vector) + break; + } + + return mlx5_irq_get_affinity_mask(eq->core.irq); } EXPORT_SYMBOL(mlx5_comp_irq_get_affinity_mask); #ifdef CONFIG_RFS_ACCEL struct cpu_rmap *mlx5_eq_table_get_rmap(struct mlx5_core_dev *dev) { - return mlx5_irq_get_rmap(dev->priv.eq_table->irq_table); + return dev->priv.eq_table->rmap; } #endif @@ -928,12 +918,57 @@ struct mlx5_eq_comp *mlx5_eqn2comp_eq(struct mlx5_core_dev *dev, int eqn) return ERR_PTR(-ENOENT); } +static void clear_rmap(struct mlx5_core_dev *dev) +{ +#ifdef CONFIG_RFS_ACCEL + struct mlx5_eq_table *eq_table = dev->priv.eq_table; 
+ + free_irq_cpu_rmap(eq_table->rmap); +#endif +} + +static int set_rmap(struct mlx5_core_dev *mdev) +{ + int err = 0; +#ifdef CONFIG_RFS_ACCEL + struct mlx5_eq_table *eq_table = mdev->priv.eq_table; + int vecidx; + + eq_table->rmap = alloc_irq_cpu_rmap(eq_table->num_comp_eqs); + if (!eq_table->rmap) { + err = -ENOMEM; + mlx5_core_err(mdev, "Failed to allocate cpu_rmap. err %d", err); + goto err_out; + } + + vecidx = MLX5_IRQ_VEC_COMP_BASE; + for (; vecidx < eq_table->num_comp_eqs + MLX5_IRQ_VEC_COMP_BASE; + vecidx++) { + err = irq_cpu_rmap_add(eq_table->rmap, + pci_irq_vector(mdev->pdev, vecidx)); + if (err) { + mlx5_core_err(mdev, "irq_cpu_rmap_add failed. err %d", + err); + goto err_irq_cpu_rmap_add; + } + } + return 0; + +err_irq_cpu_rmap_add: + clear_rmap(mdev); +err_out: +#endif + return err; +} + /* This function should only be called after mlx5_cmd_force_teardown_hca */ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev) { struct mlx5_eq_table *table = dev->priv.eq_table; mutex_lock(&table->lock); /* sync with create/destroy_async_eq */ + if (!mlx5_core_is_sf(dev)) + clear_rmap(dev); mlx5_irq_table_destroy(dev); mutex_unlock(&table->lock); } @@ -950,12 +985,19 @@ int mlx5_eq_table_create(struct mlx5_core_dev *dev) int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ? MLX5_CAP_GEN(dev, max_num_eqs) : 1 << MLX5_CAP_GEN(dev, log_max_eq); + int max_eqs_sf; int err; eq_table->num_comp_eqs = min_t(int, - mlx5_irq_get_num_comp(eq_table->irq_table), + mlx5_irq_table_get_num_comp(eq_table->irq_table), num_eqs - MLX5_MAX_ASYNC_EQS); + if (mlx5_core_is_sf(dev)) { + max_eqs_sf = min_t(int, MLX5_COMP_EQS_PER_SF, + mlx5_irq_table_get_sfs_vec(eq_table->irq_table)); + eq_table->num_comp_eqs = min_t(int, eq_table->num_comp_eqs, + max_eqs_sf); + } err = create_async_eqs(dev); if (err) { @@ -963,6 +1005,18 @@ int mlx5_eq_table_create(struct mlx5_core_dev *dev) goto err_async_eqs; } + if (!mlx5_core_is_sf(dev)) { + /* rmap is a mapping between irq number and queue number. 
+ * each irq can be assign only to a single rmap. + * since SFs share IRQs, rmap mapping cannot function correctly + * for irqs that are shared for different core/netdev RX rings. + * Hence we don't allow netdev rmap for SFs + */ + err = set_rmap(dev); + if (err) + goto err_rmap; + } + err = create_comp_eqs(dev); if (err) { mlx5_core_err(dev, "Failed to create completion EQs\n"); @@ -971,6 +1025,9 @@ int mlx5_eq_table_create(struct mlx5_core_dev *dev) return 0; err_comp_eqs: + if (!mlx5_core_is_sf(dev)) + clear_rmap(dev); +err_rmap: destroy_async_eqs(dev); err_async_eqs: return err; @@ -978,6 +1035,8 @@ err_async_eqs: void mlx5_eq_table_destroy(struct mlx5_core_dev *dev) { + if (!mlx5_core_is_sf(dev)) + clear_rmap(dev); destroy_comp_eqs(dev); destroy_async_eqs(dev); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c new file mode 100644 index 000000000000..a6e1d4f78268 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c @@ -0,0 +1,1299 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2021 Mellanox Technologies. 
*/ + +#include <linux/list.h> +#include <linux/notifier.h> +#include <net/netevent.h> +#include <net/switchdev.h> +#include "bridge.h" +#include "eswitch.h" +#include "bridge_priv.h" +#define CREATE_TRACE_POINTS +#include "diag/bridge_tracepoint.h" + +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE 64000 +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM 0 +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE / 4 - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE / 2 - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE - 1) + +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE 64000 +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_FROM 0 +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO (MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE / 2 - 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO (MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE - 1) + +#define MLX5_ESW_BRIDGE_SKIP_TABLE_SIZE 0 + +enum { + MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE, + MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE, + MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE, +}; + +static const struct rhashtable_params fdb_ht_params = { + .key_offset = offsetof(struct mlx5_esw_bridge_fdb_entry, key), + .key_len = sizeof(struct mlx5_esw_bridge_fdb_key), + .head_offset = offsetof(struct mlx5_esw_bridge_fdb_entry, ht_node), + .automatic_shrinking = true, +}; + +enum { + MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG = BIT(0), +}; + +struct mlx5_esw_bridge { + int ifindex; + int refcnt; + struct list_head list; + struct mlx5_esw_bridge_offloads *br_offloads; + + struct list_head fdb_list; + struct rhashtable 
fdb_ht; + struct xarray vports; + + struct mlx5_flow_table *egress_ft; + struct mlx5_flow_group *egress_vlan_fg; + struct mlx5_flow_group *egress_mac_fg; + unsigned long ageing_time; + u32 flags; +}; + +static void +mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *addr, u16 vid, + unsigned long val) +{ + struct switchdev_notifier_fdb_info send_info; + + send_info.addr = addr; + send_info.vid = vid; + send_info.offloaded = true; + call_switchdev_notifiers(val, dev, &send_info.info, NULL); +} + +static struct mlx5_flow_table * +mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw) +{ + struct mlx5_flow_table_attr ft_attr = {}; + struct mlx5_core_dev *dev = esw->dev; + struct mlx5_flow_namespace *ns; + struct mlx5_flow_table *fdb; + + ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB); + if (!ns) { + esw_warn(dev, "Failed to get FDB namespace\n"); + return ERR_PTR(-ENOENT); + } + + ft_attr.flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT; + ft_attr.max_fte = max_fte; + ft_attr.level = level; + ft_attr.prio = FDB_BR_OFFLOAD; + fdb = mlx5_create_flow_table(ns, &ft_attr); + if (IS_ERR(fdb)) + esw_warn(dev, "Failed to create bridge FDB Table (err=%ld)\n", PTR_ERR(fdb)); + + return fdb; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_ingress_vlan_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *ingress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, + MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_47_16); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_15_0); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag); + 
MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.first_vid); + + MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0, + mlx5_eswitch_get_vport_metadata_mask()); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO); + + fg = mlx5_create_flow_group(ingress_ft, in); + kvfree(in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create VLAN flow group for bridge ingress table (err=%ld)\n", + PTR_ERR(fg)); + + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_ingress_filter_fg_create(struct mlx5_eswitch *esw, + struct mlx5_flow_table *ingress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, + MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_47_16); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_15_0); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag); + + MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0, + mlx5_eswitch_get_vport_metadata_mask()); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_FILTER_GRP_IDX_TO); + + fg = mlx5_create_flow_group(ingress_ft, in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create bridge ingress table VLAN filter flow group (err=%ld)\n", + PTR_ERR(fg)); + + kvfree(in); + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_ingress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table 
*ingress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, + MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_47_16); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.smac_15_0); + + MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0, + mlx5_eswitch_get_vport_metadata_mask()); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO); + + fg = mlx5_create_flow_group(ingress_ft, in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create MAC flow group for bridge ingress table (err=%ld)\n", + PTR_ERR(fg)); + + kvfree(in); + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_egress_vlan_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *egress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_47_16); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_15_0); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.first_vid); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + 
MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO); + + fg = mlx5_create_flow_group(egress_ft, in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create VLAN flow group for bridge egress table (err=%ld)\n", + PTR_ERR(fg)); + kvfree(in); + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_egress_mac_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *egress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_47_16); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.dmac_15_0); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO); + + fg = mlx5_create_flow_group(egress_ft, in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create bridge egress table MAC flow group (err=%ld)\n", + PTR_ERR(fg)); + kvfree(in); + return fg; +} + +static int +mlx5_esw_bridge_ingress_table_init(struct mlx5_esw_bridge_offloads *br_offloads) +{ + struct mlx5_flow_group *mac_fg, *filter_fg, *vlan_fg; + struct mlx5_flow_table *ingress_ft, *skip_ft; + int err; + + if (!mlx5_eswitch_vport_match_metadata_enabled(br_offloads->esw)) + return -EOPNOTSUPP; + + ingress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE, + MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE, + br_offloads->esw); + if (IS_ERR(ingress_ft)) + return PTR_ERR(ingress_ft); + + skip_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_SKIP_TABLE_SIZE, + MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE, + br_offloads->esw); + if (IS_ERR(skip_ft)) { + err = PTR_ERR(skip_ft); + goto err_skip_tbl; + } + + 
vlan_fg = mlx5_esw_bridge_ingress_vlan_fg_create(br_offloads->esw, ingress_ft); + if (IS_ERR(vlan_fg)) { + err = PTR_ERR(vlan_fg); + goto err_vlan_fg; + } + + filter_fg = mlx5_esw_bridge_ingress_filter_fg_create(br_offloads->esw, ingress_ft); + if (IS_ERR(filter_fg)) { + err = PTR_ERR(filter_fg); + goto err_filter_fg; + } + + mac_fg = mlx5_esw_bridge_ingress_mac_fg_create(br_offloads->esw, ingress_ft); + if (IS_ERR(mac_fg)) { + err = PTR_ERR(mac_fg); + goto err_mac_fg; + } + + br_offloads->ingress_ft = ingress_ft; + br_offloads->skip_ft = skip_ft; + br_offloads->ingress_vlan_fg = vlan_fg; + br_offloads->ingress_filter_fg = filter_fg; + br_offloads->ingress_mac_fg = mac_fg; + return 0; + +err_mac_fg: + mlx5_destroy_flow_group(filter_fg); +err_filter_fg: + mlx5_destroy_flow_group(vlan_fg); +err_vlan_fg: + mlx5_destroy_flow_table(skip_ft); +err_skip_tbl: + mlx5_destroy_flow_table(ingress_ft); + return err; +} + +static void +mlx5_esw_bridge_ingress_table_cleanup(struct mlx5_esw_bridge_offloads *br_offloads) +{ + mlx5_destroy_flow_group(br_offloads->ingress_mac_fg); + br_offloads->ingress_mac_fg = NULL; + mlx5_destroy_flow_group(br_offloads->ingress_filter_fg); + br_offloads->ingress_filter_fg = NULL; + mlx5_destroy_flow_group(br_offloads->ingress_vlan_fg); + br_offloads->ingress_vlan_fg = NULL; + mlx5_destroy_flow_table(br_offloads->skip_ft); + br_offloads->skip_ft = NULL; + mlx5_destroy_flow_table(br_offloads->ingress_ft); + br_offloads->ingress_ft = NULL; +} + +static int +mlx5_esw_bridge_egress_table_init(struct mlx5_esw_bridge_offloads *br_offloads, + struct mlx5_esw_bridge *bridge) +{ + struct mlx5_flow_group *mac_fg, *vlan_fg; + struct mlx5_flow_table *egress_ft; + int err; + + egress_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE, + MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE, + br_offloads->esw); + if (IS_ERR(egress_ft)) + return PTR_ERR(egress_ft); + + vlan_fg = mlx5_esw_bridge_egress_vlan_fg_create(br_offloads->esw, egress_ft); + if 
(IS_ERR(vlan_fg)) { + err = PTR_ERR(vlan_fg); + goto err_vlan_fg; + } + + mac_fg = mlx5_esw_bridge_egress_mac_fg_create(br_offloads->esw, egress_ft); + if (IS_ERR(mac_fg)) { + err = PTR_ERR(mac_fg); + goto err_mac_fg; + } + + bridge->egress_ft = egress_ft; + bridge->egress_vlan_fg = vlan_fg; + bridge->egress_mac_fg = mac_fg; + return 0; + +err_mac_fg: + mlx5_destroy_flow_group(vlan_fg); +err_vlan_fg: + mlx5_destroy_flow_table(egress_ft); + return err; +} + +static void +mlx5_esw_bridge_egress_table_cleanup(struct mlx5_esw_bridge *bridge) +{ + mlx5_destroy_flow_group(bridge->egress_mac_fg); + mlx5_destroy_flow_group(bridge->egress_vlan_fg); + mlx5_destroy_flow_table(bridge->egress_ft); +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_ingress_flow_create(u16 vport_num, const unsigned char *addr, + struct mlx5_esw_bridge_vlan *vlan, u32 counter_id, + struct mlx5_esw_bridge *bridge) +{ + struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads; + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_COUNT, + .flags = FLOW_ACT_NO_APPEND, + }; + struct mlx5_flow_destination dests[2] = {}; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + u8 *smac_v, *smac_c; + + rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2; + + smac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value, + outer_headers.smac_47_16); + ether_addr_copy(smac_v, addr); + smac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria, + outer_headers.smac_47_16); + eth_broadcast_addr(smac_c); + + MLX5_SET(fte_match_param, rule_spec->match_criteria, + misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_mask()); + MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0, + 
mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
+
+	if (vlan && vlan->pkt_reformat_push) {
+		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+		flow_act.pkt_reformat = vlan->pkt_reformat_push;
+	} else if (vlan) {
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.first_vid);
+		MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.first_vid,
+			 vlan->vid);
+	}
+
+	dests[0].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+	dests[0].ft = bridge->egress_ft;
+	dests[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+	dests[1].counter_id = counter_id;
+
+	handle = mlx5_add_flow_rules(br_offloads->ingress_ft, rule_spec, &flow_act, dests,
+				     ARRAY_SIZE(dests));
+
+	kvfree(rule_spec);
+	return handle;
+}
+
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_ingress_filter_flow_create(u16 vport_num, const unsigned char *addr,
+					   struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads;
+	struct mlx5_flow_destination dest = {
+		.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE,
+		.ft = br_offloads->skip_ft,
+	};
+	struct mlx5_flow_act flow_act = {
+		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
+		.flags = FLOW_ACT_NO_APPEND,
+	};
+	struct mlx5_flow_spec *rule_spec;
+	struct mlx5_flow_handle *handle;
+	u8 *smac_v, *smac_c;
+
+	rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL);
+	if (!rule_spec)
+		return ERR_PTR(-ENOMEM);
+
+	rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_2;
+
+	smac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value,
+			      outer_headers.smac_47_16);
+	ether_addr_copy(smac_v, addr);
+	smac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria,
+			      outer_headers.smac_47_16);
+	eth_broadcast_addr(smac_c);
+
+
MLX5_SET(fte_match_param, rule_spec->match_criteria,
+		 misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_mask());
+	MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0,
+		 mlx5_eswitch_get_vport_metadata_for_match(br_offloads->esw, vport_num));
+
+	MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+			 outer_headers.cvlan_tag);
+	MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
+			 outer_headers.cvlan_tag);
+
+	handle = mlx5_add_flow_rules(br_offloads->ingress_ft, rule_spec, &flow_act, &dest, 1);
+
+	kvfree(rule_spec);
+	return handle;
+}
+
+static struct mlx5_flow_handle *
+mlx5_esw_bridge_egress_flow_create(u16 vport_num, const unsigned char *addr,
+				   struct mlx5_esw_bridge_vlan *vlan,
+				   struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_flow_destination dest = {
+		.type = MLX5_FLOW_DESTINATION_TYPE_VPORT,
+		.vport.num = vport_num,
+	};
+	struct mlx5_flow_act flow_act = {
+		.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
+		.flags = FLOW_ACT_NO_APPEND,
+	};
+	struct mlx5_flow_spec *rule_spec;
+	struct mlx5_flow_handle *handle;
+	u8 *dmac_v, *dmac_c;
+
+	rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL);
+	if (!rule_spec)
+		return ERR_PTR(-ENOMEM);
+
+	rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
+
+	dmac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value,
+			      outer_headers.dmac_47_16);
+	ether_addr_copy(dmac_v, addr);
+	dmac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria,
+			      outer_headers.dmac_47_16);
+	eth_broadcast_addr(dmac_c);
+
+	if (vlan) {
+		if (vlan->pkt_reformat_pop) {
+			flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+			flow_act.pkt_reformat = vlan->pkt_reformat_pop;
+		}
+
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value,
+				 outer_headers.cvlan_tag);
+		MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria,
+				 outer_headers.first_vid);
+
MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.first_vid,
+			 vlan->vid);
+	}
+
+	handle = mlx5_add_flow_rules(bridge->egress_ft, rule_spec, &flow_act, &dest, 1);
+
+	kvfree(rule_spec);
+	return handle;
+}
+
+static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
+						      struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge *bridge;
+	int err;
+
+	bridge = kvzalloc(sizeof(*bridge), GFP_KERNEL);
+	if (!bridge)
+		return ERR_PTR(-ENOMEM);
+
+	bridge->br_offloads = br_offloads;
+	err = mlx5_esw_bridge_egress_table_init(br_offloads, bridge);
+	if (err)
+		goto err_egress_tbl;
+
+	err = rhashtable_init(&bridge->fdb_ht, &fdb_ht_params);
+	if (err)
+		goto err_fdb_ht;
+
+	INIT_LIST_HEAD(&bridge->fdb_list);
+	xa_init(&bridge->vports);
+	bridge->ifindex = ifindex;
+	bridge->refcnt = 1;
+	bridge->ageing_time = BR_DEFAULT_AGEING_TIME;
+	list_add(&bridge->list, &br_offloads->bridges);
+
+	return bridge;
+
+err_fdb_ht:
+	mlx5_esw_bridge_egress_table_cleanup(bridge);
+err_egress_tbl:
+	kvfree(bridge);
+	return ERR_PTR(err);
+}
+
+static void mlx5_esw_bridge_get(struct mlx5_esw_bridge *bridge)
+{
+	bridge->refcnt++;
+}
+
+static void mlx5_esw_bridge_put(struct mlx5_esw_bridge_offloads *br_offloads,
+				struct mlx5_esw_bridge *bridge)
+{
+	if (--bridge->refcnt)
+		return;
+
+	mlx5_esw_bridge_egress_table_cleanup(bridge);
+	WARN_ON(!xa_empty(&bridge->vports));
+	list_del(&bridge->list);
+	rhashtable_destroy(&bridge->fdb_ht);
+	kvfree(bridge);
+
+	if (list_empty(&br_offloads->bridges))
+		mlx5_esw_bridge_ingress_table_cleanup(br_offloads);
+}
+
+static struct mlx5_esw_bridge *
+mlx5_esw_bridge_lookup(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge *bridge;
+
+	ASSERT_RTNL();
+
+	list_for_each_entry(bridge, &br_offloads->bridges, list) {
+		if (bridge->ifindex == ifindex) {
+			mlx5_esw_bridge_get(bridge);
+			return bridge;
+		}
+	}
+
+	if (!br_offloads->ingress_ft) {
+		int err =
mlx5_esw_bridge_ingress_table_init(br_offloads);
+
+		if (err)
+			return ERR_PTR(err);
+	}
+
+	bridge = mlx5_esw_bridge_create(ifindex, br_offloads);
+	if (IS_ERR(bridge) && list_empty(&br_offloads->bridges))
+		mlx5_esw_bridge_ingress_table_cleanup(br_offloads);
+	return bridge;
+}
+
+static int mlx5_esw_bridge_port_insert(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge *bridge)
+{
+	return xa_insert(&bridge->vports, port->vport_num, port, GFP_KERNEL);
+}
+
+static struct mlx5_esw_bridge_port *
+mlx5_esw_bridge_port_lookup(u16 vport_num, struct mlx5_esw_bridge *bridge)
+{
+	return xa_load(&bridge->vports, vport_num);
+}
+
+static void mlx5_esw_bridge_port_erase(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge *bridge)
+{
+	xa_erase(&bridge->vports, port->vport_num);
+}
+
+static void mlx5_esw_bridge_fdb_entry_refresh(unsigned long lastuse,
+					      struct mlx5_esw_bridge_fdb_entry *entry)
+{
+	trace_mlx5_esw_bridge_fdb_entry_refresh(entry);
+
+	entry->lastuse = lastuse;
+	mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+					   entry->key.vid,
+					   SWITCHDEV_FDB_ADD_TO_BRIDGE);
+}
+
+static void
+mlx5_esw_bridge_fdb_entry_cleanup(struct mlx5_esw_bridge_fdb_entry *entry,
+				  struct mlx5_esw_bridge *bridge)
+{
+	trace_mlx5_esw_bridge_fdb_entry_cleanup(entry);
+
+	rhashtable_remove_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
+	mlx5_del_flow_rules(entry->egress_handle);
+	if (entry->filter_handle)
+		mlx5_del_flow_rules(entry->filter_handle);
+	mlx5_del_flow_rules(entry->ingress_handle);
+	mlx5_fc_destroy(bridge->br_offloads->esw->dev, entry->ingress_counter);
+	list_del(&entry->vlan_list);
+	list_del(&entry->list);
+	kvfree(entry);
+}
+
+static void mlx5_esw_bridge_fdb_flush(struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
+		if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+			mlx5_esw_bridge_fdb_offload_notify(entry->dev,
entry->key.addr,
+							   entry->key.vid,
+							   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+	}
+}
+
+static struct mlx5_esw_bridge_vlan *
+mlx5_esw_bridge_vlan_lookup(u16 vid, struct mlx5_esw_bridge_port *port)
+{
+	return xa_load(&port->vlans, vid);
+}
+
+static int
+mlx5_esw_bridge_vlan_push_create(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	struct {
+		__be16 h_vlan_proto;
+		__be16 h_vlan_TCI;
+	} vlan_hdr = { htons(ETH_P_8021Q), htons(vlan->vid) };
+	struct mlx5_pkt_reformat_params reformat_params = {};
+	struct mlx5_pkt_reformat *pkt_reformat;
+
+	if (!BIT(MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat_insert)) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_insert_size) < sizeof(vlan_hdr) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_insert_offset) <
+	    offsetof(struct vlan_ethhdr, h_vlan_proto)) {
+		esw_warn(esw->dev, "Packet reformat INSERT_HEADER is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	reformat_params.type = MLX5_REFORMAT_TYPE_INSERT_HDR;
+	reformat_params.param_0 = MLX5_REFORMAT_CONTEXT_ANCHOR_MAC_START;
+	reformat_params.param_1 = offsetof(struct vlan_ethhdr, h_vlan_proto);
+	reformat_params.size = sizeof(vlan_hdr);
+	reformat_params.data = &vlan_hdr;
+	pkt_reformat = mlx5_packet_reformat_alloc(esw->dev,
+						  &reformat_params,
+						  MLX5_FLOW_NAMESPACE_FDB);
+	if (IS_ERR(pkt_reformat)) {
+		esw_warn(esw->dev, "Failed to alloc packet reformat INSERT_HEADER (err=%ld)\n",
+			 PTR_ERR(pkt_reformat));
+		return PTR_ERR(pkt_reformat);
+	}
+
+	vlan->pkt_reformat_push = pkt_reformat;
+	return 0;
+}
+
+static void
+mlx5_esw_bridge_vlan_push_cleanup(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	mlx5_packet_reformat_dealloc(esw->dev, vlan->pkt_reformat_push);
+	vlan->pkt_reformat_push = NULL;
+}
+
+static int
+mlx5_esw_bridge_vlan_pop_create(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	struct mlx5_pkt_reformat_params reformat_params = {};
+	struct mlx5_pkt_reformat *pkt_reformat;
+
+
if (!BIT(MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat_remove)) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_remove_size) < sizeof(struct vlan_hdr) ||
+	    MLX5_CAP_GEN_2(esw->dev, max_reformat_remove_offset) <
+	    offsetof(struct vlan_ethhdr, h_vlan_proto)) {
+		esw_warn(esw->dev, "Packet reformat REMOVE_HEADER is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	reformat_params.type = MLX5_REFORMAT_TYPE_REMOVE_HDR;
+	reformat_params.param_0 = MLX5_REFORMAT_CONTEXT_ANCHOR_MAC_START;
+	reformat_params.param_1 = offsetof(struct vlan_ethhdr, h_vlan_proto);
+	reformat_params.size = sizeof(struct vlan_hdr);
+	pkt_reformat = mlx5_packet_reformat_alloc(esw->dev,
+						  &reformat_params,
+						  MLX5_FLOW_NAMESPACE_FDB);
+	if (IS_ERR(pkt_reformat)) {
+		esw_warn(esw->dev, "Failed to alloc packet reformat REMOVE_HEADER (err=%ld)\n",
+			 PTR_ERR(pkt_reformat));
+		return PTR_ERR(pkt_reformat);
+	}
+
+	vlan->pkt_reformat_pop = pkt_reformat;
+	return 0;
+}
+
+static void
+mlx5_esw_bridge_vlan_pop_cleanup(struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
+{
+	mlx5_packet_reformat_dealloc(esw->dev, vlan->pkt_reformat_pop);
+	vlan->pkt_reformat_pop = NULL;
+}
+
+static struct mlx5_esw_bridge_vlan *
+mlx5_esw_bridge_vlan_create(u16 vid, u16 flags, struct mlx5_esw_bridge_port *port,
+			    struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_vlan *vlan;
+	int err;
+
+	vlan = kvzalloc(sizeof(*vlan), GFP_KERNEL);
+	if (!vlan)
+		return ERR_PTR(-ENOMEM);
+
+	vlan->vid = vid;
+	vlan->flags = flags;
+	INIT_LIST_HEAD(&vlan->fdb_list);
+
+	if (flags & BRIDGE_VLAN_INFO_PVID) {
+		err = mlx5_esw_bridge_vlan_push_create(vlan, esw);
+		if (err)
+			goto err_vlan_push;
+	}
+	if (flags & BRIDGE_VLAN_INFO_UNTAGGED) {
+		err = mlx5_esw_bridge_vlan_pop_create(vlan, esw);
+		if (err)
+			goto err_vlan_pop;
+	}
+
+	err = xa_insert(&port->vlans, vid, vlan, GFP_KERNEL);
+	if (err)
+		goto err_xa_insert;
+
+	trace_mlx5_esw_bridge_vlan_create(vlan);
+	return vlan;
+
+err_xa_insert:
+	if (vlan->pkt_reformat_pop)
+
mlx5_esw_bridge_vlan_pop_cleanup(vlan, esw);
+err_vlan_pop:
+	if (vlan->pkt_reformat_push)
+		mlx5_esw_bridge_vlan_push_cleanup(vlan, esw);
+err_vlan_push:
+	kvfree(vlan);
+	return ERR_PTR(err);
+}
+
+static void mlx5_esw_bridge_vlan_erase(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge_vlan *vlan)
+{
+	xa_erase(&port->vlans, vlan->vid);
+}
+
+static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_vlan *vlan,
+				       struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+
+	list_for_each_entry_safe(entry, tmp, &vlan->fdb_list, vlan_list) {
+		if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+			mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+							   entry->key.vid,
+							   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+		mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+	}
+
+	if (vlan->pkt_reformat_pop)
+		mlx5_esw_bridge_vlan_pop_cleanup(vlan, bridge->br_offloads->esw);
+	if (vlan->pkt_reformat_push)
+		mlx5_esw_bridge_vlan_push_cleanup(vlan, bridge->br_offloads->esw);
+}
+
+static void mlx5_esw_bridge_vlan_cleanup(struct mlx5_esw_bridge_port *port,
+					 struct mlx5_esw_bridge_vlan *vlan,
+					 struct mlx5_esw_bridge *bridge)
+{
+	trace_mlx5_esw_bridge_vlan_cleanup(vlan);
+	mlx5_esw_bridge_vlan_flush(vlan, bridge);
+	mlx5_esw_bridge_vlan_erase(port, vlan);
+	kvfree(vlan);
+}
+
+static void mlx5_esw_bridge_port_vlans_flush(struct mlx5_esw_bridge_port *port,
+					     struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_vlan *vlan;
+	unsigned long index;
+
+	xa_for_each(&port->vlans, index, vlan)
+		mlx5_esw_bridge_vlan_cleanup(port, vlan, bridge);
+}
+
+static struct mlx5_esw_bridge_vlan *
+mlx5_esw_bridge_port_vlan_lookup(u16 vid, u16 vport_num, struct mlx5_esw_bridge *bridge,
+				 struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge_vlan *vlan;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, bridge);
+	if (!port) {
+		/* FDB is added asynchronously on wq while port might have been deleted
+		 *
concurrently. Report on 'info' logging level and skip the FDB offload.
+		 */
+		esw_info(esw->dev, "Failed to lookup bridge port (vport=%u)\n", vport_num);
+		return ERR_PTR(-EINVAL);
+	}
+
+	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
+	if (!vlan) {
+		/* FDB is added asynchronously on wq while vlan might have been deleted
+		 * concurrently. Report on 'info' logging level and skip the FDB offload.
+		 */
+		esw_info(esw->dev, "Failed to lookup bridge port vlan metadata (vport=%u)\n",
+			 vport_num);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return vlan;
+}
+
+static struct mlx5_esw_bridge_fdb_entry *
+mlx5_esw_bridge_fdb_entry_init(struct net_device *dev, u16 vport_num, const unsigned char *addr,
+			       u16 vid, bool added_by_user, struct mlx5_eswitch *esw,
+			       struct mlx5_esw_bridge *bridge)
+{
+	struct mlx5_esw_bridge_vlan *vlan = NULL;
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	struct mlx5_flow_handle *handle;
+	struct mlx5_fc *counter;
+	struct mlx5e_priv *priv;
+	int err;
+
+	if (bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG && vid) {
+		vlan = mlx5_esw_bridge_port_vlan_lookup(vid, vport_num, bridge, esw);
+		if (IS_ERR(vlan))
+			return ERR_CAST(vlan);
+	}
+
+	priv = netdev_priv(dev);
+	entry = kvzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return ERR_PTR(-ENOMEM);
+
+	ether_addr_copy(entry->key.addr, addr);
+	entry->key.vid = vid;
+	entry->dev = dev;
+	entry->vport_num = vport_num;
+	entry->lastuse = jiffies;
+	if (added_by_user)
+		entry->flags |= MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER;
+
+	counter = mlx5_fc_create(priv->mdev, true);
+	if (IS_ERR(counter)) {
+		err = PTR_ERR(counter);
+		goto err_ingress_fc_create;
+	}
+	entry->ingress_counter = counter;
+
+	handle = mlx5_esw_bridge_ingress_flow_create(vport_num, addr, vlan, mlx5_fc_id(counter),
+						     bridge);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		esw_warn(esw->dev, "Failed to create ingress flow(vport=%u,err=%d)\n",
+			 vport_num, err);
+		goto err_ingress_flow_create;
+	}
+	entry->ingress_handle = handle;
+
+	if
(bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG) {
+		handle = mlx5_esw_bridge_ingress_filter_flow_create(vport_num, addr, bridge);
+		if (IS_ERR(handle)) {
+			err = PTR_ERR(handle);
+			esw_warn(esw->dev, "Failed to create ingress filter(vport=%u,err=%d)\n",
+				 vport_num, err);
+			goto err_ingress_filter_flow_create;
+		}
+		entry->filter_handle = handle;
+	}
+
+	handle = mlx5_esw_bridge_egress_flow_create(vport_num, addr, vlan, bridge);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		esw_warn(esw->dev, "Failed to create egress flow(vport=%u,err=%d)\n",
+			 vport_num, err);
+		goto err_egress_flow_create;
+	}
+	entry->egress_handle = handle;
+
+	err = rhashtable_insert_fast(&bridge->fdb_ht, &entry->ht_node, fdb_ht_params);
+	if (err) {
+		esw_warn(esw->dev, "Failed to insert FDB flow(vport=%u,err=%d)\n", vport_num, err);
+		goto err_ht_init;
+	}
+
+	if (vlan)
+		list_add(&entry->vlan_list, &vlan->fdb_list);
+	else
+		INIT_LIST_HEAD(&entry->vlan_list);
+	list_add(&entry->list, &bridge->fdb_list);
+
+	trace_mlx5_esw_bridge_fdb_entry_init(entry);
+	return entry;
+
+err_ht_init:
+	mlx5_del_flow_rules(entry->egress_handle);
+err_egress_flow_create:
+	if (entry->filter_handle)
+		mlx5_del_flow_rules(entry->filter_handle);
+err_ingress_filter_flow_create:
+	mlx5_del_flow_rules(entry->ingress_handle);
+err_ingress_flow_create:
+	mlx5_fc_destroy(priv->mdev, entry->ingress_counter);
+err_ingress_fc_create:
+	kvfree(entry);
+	return ERR_PTR(err);
+}
+
+int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
+				    struct mlx5_vport *vport)
+{
+	if (!vport->bridge)
+		return -EINVAL;
+
+	vport->bridge->ageing_time = ageing_time;
+	return 0;
+}
+
+int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
+				       struct mlx5_vport *vport)
+{
+	struct mlx5_esw_bridge *bridge;
+	bool filtering;
+
+	if (!vport->bridge)
+		return -EINVAL;
+
+	bridge = vport->bridge;
+	filtering = bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
+	if (filtering ==
enable)
+		return 0;
+
+	mlx5_esw_bridge_fdb_flush(bridge);
+	if (enable)
+		bridge->flags |= MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
+	else
+		bridge->flags &= ~MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
+
+	return 0;
+}
+
+static int mlx5_esw_bridge_vport_init(struct mlx5_esw_bridge_offloads *br_offloads,
+				      struct mlx5_esw_bridge *bridge,
+				      struct mlx5_vport *vport)
+{
+	struct mlx5_eswitch *esw = br_offloads->esw;
+	struct mlx5_esw_bridge_port *port;
+	int err;
+
+	port = kvzalloc(sizeof(*port), GFP_KERNEL);
+	if (!port) {
+		err = -ENOMEM;
+		goto err_port_alloc;
+	}
+
+	port->vport_num = vport->vport;
+	xa_init(&port->vlans);
+	err = mlx5_esw_bridge_port_insert(port, bridge);
+	if (err) {
+		esw_warn(esw->dev, "Failed to insert port metadata (vport=%u,err=%d)\n",
+			 vport->vport, err);
+		goto err_port_insert;
+	}
+	trace_mlx5_esw_bridge_vport_init(port);
+
+	vport->bridge = bridge;
+	return 0;
+
+err_port_insert:
+	kvfree(port);
+err_port_alloc:
+	mlx5_esw_bridge_put(br_offloads, bridge);
+	return err;
+}
+
+static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_offloads,
+					 struct mlx5_vport *vport)
+{
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+	struct mlx5_esw_bridge_port *port;
+
+	list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list)
+		if (entry->vport_num == vport->vport)
+			mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+
+	port = mlx5_esw_bridge_port_lookup(vport->vport, bridge);
+	if (!port) {
+		WARN(1, "Vport %u metadata not found on bridge", vport->vport);
+		return -EINVAL;
+	}
+
+	trace_mlx5_esw_bridge_vport_cleanup(port);
+	mlx5_esw_bridge_port_vlans_flush(port, bridge);
+	mlx5_esw_bridge_port_erase(port, bridge);
+	kvfree(port);
+	mlx5_esw_bridge_put(br_offloads, bridge);
+	vport->bridge = NULL;
+	return 0;
+}
+
+int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+{
+	struct
mlx5_esw_bridge *bridge;
+	int err;
+
+	WARN_ON(vport->bridge);
+
+	bridge = mlx5_esw_bridge_lookup(ifindex, br_offloads);
+	if (IS_ERR(bridge)) {
+		NL_SET_ERR_MSG_MOD(extack, "Error checking for existing bridge with same ifindex");
+		return PTR_ERR(bridge);
+	}
+
+	err = mlx5_esw_bridge_vport_init(br_offloads, bridge, vport);
+	if (err)
+		NL_SET_ERR_MSG_MOD(extack, "Error initializing port");
+	return err;
+}
+
+int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+{
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	int err;
+
+	if (!bridge) {
+		NL_SET_ERR_MSG_MOD(extack, "Port is not attached to any bridge");
+		return -EINVAL;
+	}
+	if (bridge->ifindex != ifindex) {
+		NL_SET_ERR_MSG_MOD(extack, "Port is attached to another bridge");
+		return -EINVAL;
+	}
+
+	err = mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+	if (err)
+		NL_SET_ERR_MSG_MOD(extack, "Port cleanup failed");
+	return err;
+}
+
+int mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
+				  struct mlx5_vport *vport, struct netlink_ext_ack *extack)
+{
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge_vlan *vlan;
+
+	port = mlx5_esw_bridge_port_lookup(vport->vport, vport->bridge);
+	if (!port)
+		return -EINVAL;
+
+	vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
+	if (vlan) {
+		if (vlan->flags == flags)
+			return 0;
+		mlx5_esw_bridge_vlan_cleanup(port, vlan, vport->bridge);
+	}
+
+	vlan = mlx5_esw_bridge_vlan_create(vid, flags, port, esw);
+	if (IS_ERR(vlan)) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to create VLAN entry");
+		return PTR_ERR(vlan);
+	}
+	return 0;
+}
+
+void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx5_vport *vport)
+{
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge_vlan *vlan;
+
+	port = mlx5_esw_bridge_port_lookup(vport->vport, vport->bridge);
+	if (!port)
+		return;
+
+	vlan = mlx5_esw_bridge_vlan_lookup(vid,
port);
+	if (!vlan)
+		return;
+	mlx5_esw_bridge_vlan_cleanup(port, vlan, vport->bridge);
+}
+
+void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	u16 vport_num = vport->vport;
+
+	if (!bridge) {
+		esw_info(esw->dev, "Vport is not assigned to bridge (vport=%u)\n", vport_num);
+		return;
+	}
+
+	entry = mlx5_esw_bridge_fdb_entry_init(dev, vport_num, fdb_info->addr, fdb_info->vid,
+					       fdb_info->added_by_user, esw, bridge);
+	if (IS_ERR(entry))
+		return;
+
+	if (entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER)
+		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+						   SWITCHDEV_FDB_OFFLOADED);
+	else
+		/* Take over dynamic entries to prevent kernel bridge from aging them out. */
+		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+						   SWITCHDEV_FDB_ADD_TO_BRIDGE);
+}
+
+void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info)
+{
+	struct mlx5_esw_bridge *bridge = vport->bridge;
+	struct mlx5_esw_bridge_fdb_entry *entry;
+	struct mlx5_esw_bridge_fdb_key key;
+	u16 vport_num = vport->vport;
+
+	if (!bridge) {
+		esw_warn(esw->dev, "Vport is not assigned to bridge (vport=%u)\n", vport_num);
+		return;
+	}
+
+	ether_addr_copy(key.addr, fdb_info->addr);
+	key.vid = fdb_info->vid;
+	entry = rhashtable_lookup_fast(&bridge->fdb_ht, &key, fdb_ht_params);
+	if (!entry) {
+		esw_warn(esw->dev,
+			 "FDB entry with specified key not found (MAC=%pM,vid=%u,vport=%u)\n",
+			 key.addr, key.vid, vport_num);
+		return;
+	}
+
+	if (!(entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER))
+		mlx5_esw_bridge_fdb_offload_notify(dev, entry->key.addr, entry->key.vid,
+						   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+	mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+}
+
+void
mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge_fdb_entry *entry, *tmp;
+	struct mlx5_esw_bridge *bridge;
+
+	list_for_each_entry(bridge, &br_offloads->bridges, list) {
+		list_for_each_entry_safe(entry, tmp, &bridge->fdb_list, list) {
+			unsigned long lastuse =
+				(unsigned long)mlx5_fc_query_lastuse(entry->ingress_counter);
+
+			if (entry->flags & MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER)
+				continue;
+
+			if (time_after(lastuse, entry->lastuse)) {
+				mlx5_esw_bridge_fdb_entry_refresh(lastuse, entry);
+			} else if (time_is_before_jiffies(entry->lastuse + bridge->ageing_time)) {
+				mlx5_esw_bridge_fdb_offload_notify(entry->dev, entry->key.addr,
+								   entry->key.vid,
+								   SWITCHDEV_FDB_DEL_TO_BRIDGE);
+				mlx5_esw_bridge_fdb_entry_cleanup(entry, bridge);
+			}
+		}
+	}
+}
+
+static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_eswitch *esw = br_offloads->esw;
+	struct mlx5_vport *vport;
+	unsigned long i;
+
+	mlx5_esw_for_each_vport(esw, i, vport)
+		if (vport->bridge)
+			mlx5_esw_bridge_vport_cleanup(br_offloads, vport);
+
+	WARN_ONCE(!list_empty(&br_offloads->bridges),
+		  "Cleaning up bridge offloads while still having bridges attached\n");
+}
+
+struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads;
+
+	br_offloads = kvzalloc(sizeof(*br_offloads), GFP_KERNEL);
+	if (!br_offloads)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&br_offloads->bridges);
+	br_offloads->esw = esw;
+	esw->br_offloads = br_offloads;
+
+	return br_offloads;
+}
+
+void mlx5_esw_bridge_cleanup(struct mlx5_eswitch *esw)
+{
+	struct mlx5_esw_bridge_offloads *br_offloads = esw->br_offloads;
+
+	if (!br_offloads)
+		return;
+
+	mlx5_esw_bridge_flush(br_offloads);
+
+	esw->br_offloads = NULL;
+	kvfree(br_offloads);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
new file mode 100644
index 000000000000..d826942b27fc
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#ifndef __MLX5_ESW_BRIDGE_H__
+#define __MLX5_ESW_BRIDGE_H__
+
+#include <linux/notifier.h>
+#include <linux/list.h>
+#include <linux/workqueue.h>
+#include "eswitch.h"
+
+struct mlx5_flow_table;
+struct mlx5_flow_group;
+
+struct mlx5_esw_bridge_offloads {
+	struct mlx5_eswitch *esw;
+	struct list_head bridges;
+	struct notifier_block netdev_nb;
+	struct notifier_block nb_blk;
+	struct notifier_block nb;
+	struct workqueue_struct *wq;
+	struct delayed_work update_work;
+
+	struct mlx5_flow_table *ingress_ft;
+	struct mlx5_flow_group *ingress_vlan_fg;
+	struct mlx5_flow_group *ingress_filter_fg;
+	struct mlx5_flow_group *ingress_mac_fg;
+
+	struct mlx5_flow_table *skip_ft;
+};
+
+struct mlx5_esw_bridge_offloads *mlx5_esw_bridge_init(struct mlx5_eswitch *esw);
+void mlx5_esw_bridge_cleanup(struct mlx5_eswitch *esw);
+int mlx5_esw_bridge_vport_link(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+			       struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+int mlx5_esw_bridge_vport_unlink(int ifindex, struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_fdb_create(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info);
+void mlx5_esw_bridge_fdb_remove(struct net_device *dev, struct mlx5_eswitch *esw,
+				struct mlx5_vport *vport,
+				struct switchdev_notifier_fdb_info *fdb_info);
+void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads);
+int mlx5_esw_bridge_ageing_time_set(unsigned long ageing_time, struct mlx5_eswitch *esw,
+				    struct mlx5_vport *vport);
+int mlx5_esw_bridge_vlan_filtering_set(bool enable, struct mlx5_eswitch *esw,
+				       struct mlx5_vport *vport);
+int
mlx5_esw_bridge_port_vlan_add(u16 vid, u16 flags, struct mlx5_eswitch *esw,
+				  struct mlx5_vport *vport, struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_port_vlan_del(u16 vid, struct mlx5_eswitch *esw, struct mlx5_vport *vport);
+
+#endif /* __MLX5_ESW_BRIDGE_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
new file mode 100644
index 000000000000..d9ab2e8bc2cb
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#ifndef _MLX5_ESW_BRIDGE_PRIVATE_
+#define _MLX5_ESW_BRIDGE_PRIVATE_
+
+#include <linux/netdevice.h>
+#include <linux/if_bridge.h>
+#include <linux/if_vlan.h>
+#include <linux/if_ether.h>
+#include <linux/rhashtable.h>
+#include <linux/xarray.h>
+#include "fs_core.h"
+
+struct mlx5_esw_bridge_fdb_key {
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+};
+
+enum {
+	MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER = BIT(0),
+};
+
+struct mlx5_esw_bridge_fdb_entry {
+	struct mlx5_esw_bridge_fdb_key key;
+	struct rhash_head ht_node;
+	struct net_device *dev;
+	struct list_head list;
+	struct list_head vlan_list;
+	u16 vport_num;
+	u16 flags;
+
+	struct mlx5_flow_handle *ingress_handle;
+	struct mlx5_fc *ingress_counter;
+	unsigned long lastuse;
+	struct mlx5_flow_handle *egress_handle;
+	struct mlx5_flow_handle *filter_handle;
+};
+
+struct mlx5_esw_bridge_vlan {
+	u16 vid;
+	u16 flags;
+	struct list_head fdb_list;
+	struct mlx5_pkt_reformat *pkt_reformat_push;
+	struct mlx5_pkt_reformat *pkt_reformat_pop;
+};
+
+struct mlx5_esw_bridge_port {
+	u16 vport_num;
+	struct xarray vlans;
+};
+
+#endif /* _MLX5_ESW_BRIDGE_PRIVATE_ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
new file mode 100644
index 000000000000..227964b7d3b9
---
/dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM mlx5
+
+#if !defined(_MLX5_ESW_BRIDGE_TRACEPOINT_) || defined(TRACE_HEADER_MULTI_READ)
+#define _MLX5_ESW_BRIDGE_TRACEPOINT_
+
+#include <linux/tracepoint.h>
+#include "../bridge_priv.h"
+
+DECLARE_EVENT_CLASS(mlx5_esw_bridge_fdb_template,
+		    TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+		    TP_ARGS(fdb),
+		    TP_STRUCT__entry(
+			    __array(char, dev_name, IFNAMSIZ)
+			    __array(unsigned char, addr, ETH_ALEN)
+			    __field(u16, vid)
+			    __field(u16, flags)
+			    __field(unsigned int, used)
+			    ),
+		    TP_fast_assign(
+			    strncpy(__entry->dev_name,
+				    netdev_name(fdb->dev),
+				    IFNAMSIZ);
+			    memcpy(__entry->addr, fdb->key.addr, ETH_ALEN);
+			    __entry->vid = fdb->key.vid;
+			    __entry->flags = fdb->flags;
+			    __entry->used = jiffies_to_msecs(jiffies - fdb->lastuse)
+			    ),
+		    TP_printk("net_device=%s addr=%pM vid=%hu flags=%hx used=%u",
+			      __entry->dev_name,
+			      __entry->addr,
+			      __entry->vid,
+			      __entry->flags,
+			      __entry->used / 1000)
+	);
+
+DEFINE_EVENT(mlx5_esw_bridge_fdb_template,
+	     mlx5_esw_bridge_fdb_entry_init,
+	     TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_fdb_template,
+	     mlx5_esw_bridge_fdb_entry_refresh,
+	     TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_fdb_template,
+	     mlx5_esw_bridge_fdb_entry_cleanup,
+	     TP_PROTO(const struct mlx5_esw_bridge_fdb_entry *fdb),
+	     TP_ARGS(fdb)
+	);
+
+DECLARE_EVENT_CLASS(mlx5_esw_bridge_vlan_template,
+		    TP_PROTO(const struct mlx5_esw_bridge_vlan *vlan),
+		    TP_ARGS(vlan),
+		    TP_STRUCT__entry(
+			    __field(u16, vid)
+			    __field(u16, flags)
+			    ),
+		    TP_fast_assign(
+			    __entry->vid = vlan->vid;
+			    __entry->flags = vlan->flags;
+			    ),
+		    TP_printk("vid=%hu flags=%hx",
+			      __entry->vid,
+			      __entry->flags)
+	);
+
+DEFINE_EVENT(mlx5_esw_bridge_vlan_template,
+	     mlx5_esw_bridge_vlan_create,
+	     TP_PROTO(const struct mlx5_esw_bridge_vlan *vlan),
+	     TP_ARGS(vlan)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_vlan_template,
+	     mlx5_esw_bridge_vlan_cleanup,
+	     TP_PROTO(const struct mlx5_esw_bridge_vlan *vlan),
+	     TP_ARGS(vlan)
+	);
+
+DECLARE_EVENT_CLASS(mlx5_esw_bridge_port_template,
+		    TP_PROTO(const struct mlx5_esw_bridge_port *port),
+		    TP_ARGS(port),
+		    TP_STRUCT__entry(
+			    __field(u16, vport_num)
+			    ),
+		    TP_fast_assign(
+			    __entry->vport_num = port->vport_num;
+			    ),
+		    TP_printk("vport_num=%hu", __entry->vport_num)
+	);
+
+DEFINE_EVENT(mlx5_esw_bridge_port_template,
+	     mlx5_esw_bridge_vport_init,
+	     TP_PROTO(const struct mlx5_esw_bridge_port *port),
+	     TP_ARGS(port)
+	);
+DEFINE_EVENT(mlx5_esw_bridge_port_template,
+	     mlx5_esw_bridge_vport_cleanup,
+	     TP_PROTO(const struct mlx5_esw_bridge_port *port),
+	     TP_ARGS(port)
+	);
+
+#endif
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH esw/diag
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE bridge_tracepoint
+#include <trace/define_trace.h>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 64ccb2bc0b58..48cac5bf606d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -150,6 +150,8 @@ enum mlx5_eswitch_vport_event {
 	MLX5_VPORT_PROMISC_CHANGE = BIT(3),
 };
 
+struct mlx5_esw_bridge;
+
 struct mlx5_vport {
 	struct mlx5_core_dev *dev;
 	struct hlist_head uc_list[MLX5_L2_ADDR_HASH_SIZE];
@@ -178,6 +180,7 @@ struct mlx5_vport {
 	enum mlx5_eswitch_vport_event enabled_events;
 	int index;
 	struct devlink_port *dl_port;
+	struct mlx5_esw_bridge *bridge;
 };
 
 struct mlx5_esw_indir_table;
@@ -196,6 +199,7 @@ struct mlx5_eswitch_fdb {
 
 		struct offloads_fdb {
 			struct mlx5_flow_namespace *ns;
+			struct mlx5_flow_table *tc_miss_table;
 			struct mlx5_flow_table *slow_fdb;
 			struct mlx5_flow_group
*send_to_vport_grp; struct mlx5_flow_group *send_to_vport_meta_grp; @@ -270,6 +274,8 @@ enum { MLX5_ESWITCH_REG_C1_LOOPBACK_ENABLED = BIT(1), }; +struct mlx5_esw_bridge_offloads; + struct mlx5_eswitch { struct mlx5_core_dev *dev; struct mlx5_nb nb; @@ -299,6 +305,7 @@ struct mlx5_eswitch { u32 root_tsar_id; } qos; + struct mlx5_esw_bridge_offloads *br_offloads; struct mlx5_esw_offload offloads; int mode; u16 manager_vport; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index d18a28a6e9a6..7579f3402776 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1634,7 +1634,21 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw) } esw->fdb_table.offloads.slow_fdb = fdb; - err = esw_chains_create(esw, fdb); + /* Create empty TC-miss managed table. This allows plugging in following + * priorities without directly exposing their level 0 table to + * eswitch_offloads and passing it as miss_fdb to following call to + * esw_chains_create(). 
+ */ + memset(&ft_attr, 0, sizeof(ft_attr)); + ft_attr.prio = FDB_TC_MISS; + esw->fdb_table.offloads.tc_miss_table = mlx5_create_flow_table(root_ns, &ft_attr); + if (IS_ERR(esw->fdb_table.offloads.tc_miss_table)) { + err = PTR_ERR(esw->fdb_table.offloads.tc_miss_table); + esw_warn(dev, "Failed to create TC miss FDB Table err %d\n", err); + goto tc_miss_table_err; + } + + err = esw_chains_create(esw, esw->fdb_table.offloads.tc_miss_table); if (err) { esw_warn(dev, "Failed to open fdb chains err(%d)\n", err); goto fdb_chains_err; @@ -1779,6 +1793,8 @@ send_vport_meta_err: send_vport_err: esw_chains_destroy(esw, esw_chains(esw)); fdb_chains_err: + mlx5_destroy_flow_table(esw->fdb_table.offloads.tc_miss_table); +tc_miss_table_err: mlx5_destroy_flow_table(esw->fdb_table.offloads.slow_fdb); slow_fdb_err: /* Holds true only as long as DMFS is the default */ @@ -1806,6 +1822,7 @@ static void esw_destroy_offloads_fdb_tables(struct mlx5_eswitch *esw) esw_chains_destroy(esw, esw_chains(esw)); + mlx5_destroy_flow_table(esw->fdb_table.offloads.tc_miss_table); mlx5_destroy_flow_table(esw->fdb_table.offloads.slow_fdb); /* Holds true only as long as DMFS is the default */ mlx5_flow_namespace_set_mode(esw->fdb_table.offloads.ns, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c index 8e06731d3cb3..896a6c3dbdb7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c @@ -36,6 +36,7 @@ #include "fs_core.h" #include "fs_cmd.h" +#include "fs_ft_pool.h" #include "mlx5_core.h" #include "eswitch.h" @@ -49,9 +50,11 @@ static int mlx5_cmd_stub_update_root_ft(struct mlx5_flow_root_namespace *ns, static int mlx5_cmd_stub_create_flow_table(struct mlx5_flow_root_namespace *ns, struct mlx5_flow_table *ft, - unsigned int log_size, + unsigned int size, struct mlx5_flow_table *next_ft) { + ft->max_fte = size ? 
roundup_pow_of_two(size) : 1; + return 0; } @@ -108,9 +111,7 @@ static int mlx5_cmd_stub_delete_fte(struct mlx5_flow_root_namespace *ns, } static int mlx5_cmd_stub_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns, - int reformat_type, - size_t size, - void *reformat_data, + struct mlx5_pkt_reformat_params *params, enum mlx5_flow_namespace_type namespace, struct mlx5_pkt_reformat *pkt_reformat) { @@ -181,7 +182,7 @@ static int mlx5_cmd_update_root_ft(struct mlx5_flow_root_namespace *ns, static int mlx5_cmd_create_flow_table(struct mlx5_flow_root_namespace *ns, struct mlx5_flow_table *ft, - unsigned int log_size, + unsigned int size, struct mlx5_flow_table *next_ft) { int en_encap = !!(ft->flags & MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT); @@ -192,12 +193,18 @@ static int mlx5_cmd_create_flow_table(struct mlx5_flow_root_namespace *ns, struct mlx5_core_dev *dev = ns->dev; int err; + if (size != POOL_NEXT_SIZE) + size = roundup_pow_of_two(size); + size = mlx5_ft_pool_get_avail_sz(dev, ft->type, size); + if (!size) + return -ENOSPC; + MLX5_SET(create_flow_table_in, in, opcode, MLX5_CMD_OP_CREATE_FLOW_TABLE); MLX5_SET(create_flow_table_in, in, table_type, ft->type); MLX5_SET(create_flow_table_in, in, flow_table_context.level, ft->level); - MLX5_SET(create_flow_table_in, in, flow_table_context.log_size, log_size); + MLX5_SET(create_flow_table_in, in, flow_table_context.log_size, size ? 
ilog2(size) : 0); MLX5_SET(create_flow_table_in, in, vport_number, ft->vport); MLX5_SET(create_flow_table_in, in, other_vport, !!(ft->flags & MLX5_FLOW_TABLE_OTHER_VPORT)); @@ -234,9 +241,14 @@ static int mlx5_cmd_create_flow_table(struct mlx5_flow_root_namespace *ns, } err = mlx5_cmd_exec_inout(dev, create_flow_table, in, out); - if (!err) + if (!err) { ft->id = MLX5_GET(create_flow_table_out, out, table_id); + ft->max_fte = size; + } else { + mlx5_ft_pool_put_sz(ns->dev, size); + } + return err; } @@ -245,6 +257,7 @@ static int mlx5_cmd_destroy_flow_table(struct mlx5_flow_root_namespace *ns, { u32 in[MLX5_ST_SZ_DW(destroy_flow_table_in)] = {}; struct mlx5_core_dev *dev = ns->dev; + int err; MLX5_SET(destroy_flow_table_in, in, opcode, MLX5_CMD_OP_DESTROY_FLOW_TABLE); @@ -254,7 +267,11 @@ static int mlx5_cmd_destroy_flow_table(struct mlx5_flow_root_namespace *ns, MLX5_SET(destroy_flow_table_in, in, other_vport, !!(ft->flags & MLX5_FLOW_TABLE_OTHER_VPORT)); - return mlx5_cmd_exec_in(dev, destroy_flow_table, in); + err = mlx5_cmd_exec_in(dev, destroy_flow_table, in); + if (!err) + mlx5_ft_pool_put_sz(ns->dev, ft->max_fte); + + return err; } static int mlx5_cmd_modify_flow_table(struct mlx5_flow_root_namespace *ns, @@ -682,9 +699,7 @@ int mlx5_cmd_fc_bulk_query(struct mlx5_core_dev *dev, u32 base_id, int bulk_len, } static int mlx5_cmd_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns, - int reformat_type, - size_t size, - void *reformat_data, + struct mlx5_pkt_reformat_params *params, enum mlx5_flow_namespace_type namespace, struct mlx5_pkt_reformat *pkt_reformat) { @@ -702,14 +717,14 @@ static int mlx5_cmd_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns, else max_encap_size = MLX5_CAP_FLOWTABLE(dev, max_encap_header_size); - if (size > max_encap_size) { + if (params->size > max_encap_size) { mlx5_core_warn(dev, "encap size %zd too big, max supported is %d\n", - size, max_encap_size); + params->size, max_encap_size); return -EINVAL; } - in = 
kzalloc(MLX5_ST_SZ_BYTES(alloc_packet_reformat_context_in) + size, - GFP_KERNEL); + in = kzalloc(MLX5_ST_SZ_BYTES(alloc_packet_reformat_context_in) + + params->size, GFP_KERNEL); if (!in) return -ENOMEM; @@ -718,15 +733,20 @@ static int mlx5_cmd_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns, reformat = MLX5_ADDR_OF(packet_reformat_context_in, packet_reformat_context_in, reformat_data); - inlen = reformat - (void *)in + size; + inlen = reformat - (void *)in + params->size; MLX5_SET(alloc_packet_reformat_context_in, in, opcode, MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT); MLX5_SET(packet_reformat_context_in, packet_reformat_context_in, - reformat_data_size, size); + reformat_data_size, params->size); + MLX5_SET(packet_reformat_context_in, packet_reformat_context_in, + reformat_type, params->type); + MLX5_SET(packet_reformat_context_in, packet_reformat_context_in, + reformat_param_0, params->param_0); MLX5_SET(packet_reformat_context_in, packet_reformat_context_in, - reformat_type, reformat_type); - memcpy(reformat, reformat_data, size); + reformat_param_1, params->param_1); + if (params->data && params->size) + memcpy(reformat, params->data, params->size); err = mlx5_cmd_exec(dev, in, inlen, out, sizeof(out)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h index d62de642eca9..5ecd33cdc087 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h @@ -38,7 +38,7 @@ struct mlx5_flow_cmds { int (*create_flow_table)(struct mlx5_flow_root_namespace *ns, struct mlx5_flow_table *ft, - unsigned int log_size, + unsigned int size, struct mlx5_flow_table *next_ft); int (*destroy_flow_table)(struct mlx5_flow_root_namespace *ns, struct mlx5_flow_table *ft); @@ -77,9 +77,7 @@ struct mlx5_flow_cmds { bool disconnect); int (*packet_reformat_alloc)(struct mlx5_flow_root_namespace *ns, - int reformat_type, - size_t size, - void *reformat_data, + struct 
mlx5_pkt_reformat_params *params, enum mlx5_flow_namespace_type namespace, struct mlx5_pkt_reformat *pkt_reformat); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index f74d2c834037..d7bf0a3e4a52 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -38,6 +38,7 @@ #include "mlx5_core.h" #include "fs_core.h" #include "fs_cmd.h" +#include "fs_ft_pool.h" #include "diag/fs_tracepoint.h" #include "accel/ipsec.h" #include "fpga/ipsec.h" @@ -752,7 +753,7 @@ static struct mlx5_flow_group *alloc_insert_flow_group(struct mlx5_flow_table *f return fg; } -static struct mlx5_flow_table *alloc_flow_table(int level, u16 vport, int max_fte, +static struct mlx5_flow_table *alloc_flow_table(int level, u16 vport, enum fs_flow_table_type table_type, enum fs_flow_table_op_mod op_mod, u32 flags) @@ -775,7 +776,6 @@ static struct mlx5_flow_table *alloc_flow_table(int level, u16 vport, int max_ft ft->op_mod = op_mod; ft->type = table_type; ft->vport = vport; - ft->max_fte = max_fte; ft->flags = flags; INIT_LIST_HEAD(&ft->fwd_rules); mutex_init(&ft->lock); @@ -1070,7 +1070,6 @@ static struct mlx5_flow_table *__mlx5_create_flow_table(struct mlx5_flow_namespa struct mlx5_flow_table *next_ft; struct fs_prio *fs_prio = NULL; struct mlx5_flow_table *ft; - int log_table_sz; int err; if (!root) { @@ -1101,7 +1100,6 @@ static struct mlx5_flow_table *__mlx5_create_flow_table(struct mlx5_flow_namespa */ ft = alloc_flow_table(ft_attr->level, vport, - ft_attr->max_fte ? roundup_pow_of_two(ft_attr->max_fte) : 0, root->table_type, op_mod, ft_attr->flags); if (IS_ERR(ft)) { @@ -1110,12 +1108,11 @@ static struct mlx5_flow_table *__mlx5_create_flow_table(struct mlx5_flow_namespa } tree_init_node(&ft->node, del_hw_flow_table, del_sw_flow_table); - log_table_sz = ft->max_fte ? ilog2(ft->max_fte) : 0; next_ft = unmanaged ? 
ft_attr->next_ft : find_next_chained_ft(fs_prio); ft->def_miss_action = ns->def_miss_action; ft->ns = ns; - err = root->cmds->create_flow_table(root, ft, log_table_sz, next_ft); + err = root->cmds->create_flow_table(root, ft, ft_attr->max_fte, next_ft); if (err) goto free_ft; @@ -1170,28 +1167,36 @@ mlx5_create_lag_demux_flow_table(struct mlx5_flow_namespace *ns, ft_attr.level = level; ft_attr.prio = prio; + ft_attr.max_fte = 1; + return __mlx5_create_flow_table(ns, &ft_attr, FS_FT_OP_MOD_LAG_DEMUX, 0); } EXPORT_SYMBOL(mlx5_create_lag_demux_flow_table); +#define MAX_FLOW_GROUP_SIZE BIT(24) struct mlx5_flow_table* mlx5_create_auto_grouped_flow_table(struct mlx5_flow_namespace *ns, struct mlx5_flow_table_attr *ft_attr) { int num_reserved_entries = ft_attr->autogroup.num_reserved_entries; - int autogroups_max_fte = ft_attr->max_fte - num_reserved_entries; int max_num_groups = ft_attr->autogroup.max_num_groups; struct mlx5_flow_table *ft; - - if (max_num_groups > autogroups_max_fte) - return ERR_PTR(-EINVAL); - if (num_reserved_entries > ft_attr->max_fte) - return ERR_PTR(-EINVAL); + int autogroups_max_fte; ft = mlx5_create_flow_table(ns, ft_attr); if (IS_ERR(ft)) return ft; + autogroups_max_fte = ft->max_fte - num_reserved_entries; + if (max_num_groups > autogroups_max_fte) + goto err_validate; + if (num_reserved_entries > ft->max_fte) + goto err_validate; + + /* Align the number of groups according to the largest group size */ + if (autogroups_max_fte / (max_num_groups + 1) > MAX_FLOW_GROUP_SIZE) + max_num_groups = (autogroups_max_fte / MAX_FLOW_GROUP_SIZE) - 1; + ft->autogroup.active = true; ft->autogroup.required_groups = max_num_groups; ft->autogroup.max_fte = autogroups_max_fte; @@ -1199,6 +1204,10 @@ mlx5_create_auto_grouped_flow_table(struct mlx5_flow_namespace *ns, ft->autogroup.group_size = autogroups_max_fte / (max_num_groups + 1); return ft; + +err_validate: + mlx5_destroy_flow_table(ft); + return ERR_PTR(-ENOSPC); } 
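The reordered validation and the new group-size cap in mlx5_create_auto_grouped_flow_table() above can be sketched in userspace as follows. This is a minimal illustration, not the driver code: `struct autogroup_layout` and `autogroup_layout_compute()` are hypothetical names, a plain -1 stands in for the ERR_PTR(-ENOSPC) path, and `table_max_fte` is assumed to be the size the table really got after creation (ft->max_fte), not the size the caller asked for.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors "#define MAX_FLOW_GROUP_SIZE BIT(24)" from the hunk above. */
#define MAX_FLOW_GROUP_SIZE (1 << 24)

/* Hypothetical container for the values the hunk stores in ft->autogroup. */
struct autogroup_layout {
	int required_groups;
	int max_fte;
	int group_size;
};

/*
 * Hypothetical helper: validate against the actual table size, then cap
 * the per-group size at MAX_FLOW_GROUP_SIZE by raising the group count.
 * Returns 0 on success and -1 where the kernel code destroys the table
 * and returns ERR_PTR(-ENOSPC).
 */
static int autogroup_layout_compute(int table_max_fte,
				    int num_reserved_entries,
				    int max_num_groups,
				    struct autogroup_layout *out)
{
	int autogroups_max_fte = table_max_fte - num_reserved_entries;

	assert(out != NULL);

	if (max_num_groups > autogroups_max_fte)
		return -1;
	if (num_reserved_entries > table_max_fte)
		return -1;

	/* Align the number of groups according to the largest group size. */
	if (autogroups_max_fte / (max_num_groups + 1) > MAX_FLOW_GROUP_SIZE)
		max_num_groups = (autogroups_max_fte / MAX_FLOW_GROUP_SIZE) - 1;

	out->required_groups = max_num_groups;
	out->max_fte = autogroups_max_fte;
	out->group_size = autogroups_max_fte / (max_num_groups + 1);
	return 0;
}
```

With a hypothetical table of 1 << 30 entries, no reserved entries and 3 requested groups, the uncapped group size would be 1 << 28, so the sketch raises the group count to 63 and each group ends up with exactly 1 << 24 entries. For a 64K-entry table the cap never triggers and the requested group count is kept.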
EXPORT_SYMBOL(mlx5_create_auto_grouped_flow_table); @@ -1495,7 +1504,9 @@ static bool mlx5_flow_dests_cmp(struct mlx5_flow_destination *d1, (d1->type == MLX5_FLOW_DESTINATION_TYPE_TIR && d1->tir_num == d2->tir_num) || (d1->type == MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE_NUM && - d1->ft_num == d2->ft_num)) + d1->ft_num == d2->ft_num) || + (d1->type == MLX5_FLOW_DESTINATION_TYPE_FLOW_SAMPLER && + d1->sampler_id == d2->sampler_id)) return true; } @@ -2592,6 +2603,7 @@ void mlx5_cleanup_fs(struct mlx5_core_dev *dev) mlx5_cleanup_fc_stats(dev); kmem_cache_destroy(steering->ftes_cache); kmem_cache_destroy(steering->fgs_cache); + mlx5_ft_pool_destroy(dev); kfree(steering); } @@ -2770,6 +2782,18 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering) if (err) goto out_err; + maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_TC_MISS, 1); + if (IS_ERR(maj_prio)) { + err = PTR_ERR(maj_prio); + goto out_err; + } + + maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_BR_OFFLOAD, 3); + if (IS_ERR(maj_prio)) { + err = PTR_ERR(maj_prio); + goto out_err; + } + maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_SLOW_PATH, 1); if (IS_ERR(maj_prio)) { err = PTR_ERR(maj_prio); @@ -2942,9 +2966,16 @@ int mlx5_init_fs(struct mlx5_core_dev *dev) if (err) return err; + err = mlx5_ft_pool_init(dev); + if (err) + return err; + steering = kzalloc(sizeof(*steering), GFP_KERNEL); - if (!steering) - return -ENOMEM; + if (!steering) { + err = -ENOMEM; + goto err; + } + steering->dev = dev; dev->priv.steering = steering; @@ -3151,9 +3182,7 @@ void mlx5_modify_header_dealloc(struct mlx5_core_dev *dev, EXPORT_SYMBOL(mlx5_modify_header_dealloc); struct mlx5_pkt_reformat *mlx5_packet_reformat_alloc(struct mlx5_core_dev *dev, - int reformat_type, - size_t size, - void *reformat_data, + struct mlx5_pkt_reformat_params *params, enum mlx5_flow_namespace_type ns_type) { struct mlx5_pkt_reformat *pkt_reformat; @@ -3169,9 +3198,8 @@ struct mlx5_pkt_reformat 
*mlx5_packet_reformat_alloc(struct mlx5_core_dev *dev, return ERR_PTR(-ENOMEM); pkt_reformat->ns_type = ns_type; - pkt_reformat->reformat_type = reformat_type; - err = root->cmds->packet_reformat_alloc(root, reformat_type, size, - reformat_data, ns_type, + pkt_reformat->reformat_type = params->type; + err = root->cmds->packet_reformat_alloc(root, params, ns_type, pkt_reformat); if (err) { kfree(pkt_reformat); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h index e577a2c424af..7317cdeab661 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h @@ -331,6 +331,7 @@ void mlx5_fs_ingress_acls_cleanup(struct mlx5_core_dev *dev); #define MLX5_CAP_FLOWTABLE_TYPE(mdev, cap, type) ( \ (type == FS_FT_NIC_RX) ? MLX5_CAP_FLOWTABLE_NIC_RX(mdev, cap) : \ + (type == FS_FT_NIC_TX) ? MLX5_CAP_FLOWTABLE_NIC_TX(mdev, cap) : \ (type == FS_FT_ESW_EGRESS_ACL) ? MLX5_CAP_ESW_EGRESS_ACL(mdev, cap) : \ (type == FS_FT_ESW_INGRESS_ACL) ? MLX5_CAP_ESW_INGRESS_ACL(mdev, cap) : \ (type == FS_FT_FDB) ? MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, cap) : \ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_ft_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_ft_pool.c new file mode 100644 index 000000000000..c14590acc772 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_ft_pool.c @@ -0,0 +1,85 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2021 Mellanox Technologies. */ + +#include "fs_ft_pool.h" + +/* Firmware currently has 4 pool of 4 sizes that it supports (FT_POOLS), + * and a virtual memory region of 16M (MLX5_FT_SIZE), this region is duplicated + * for each flow table pool. We can allocate up to 16M of each pool, + * and we keep track of how much we used via mlx5_ft_pool_get_avail_sz. + * Firmware doesn't report any of this for now. + * ESW_POOL is expected to be sorted from large to small and match firmware + * pools. 
+ */ +#define FT_SIZE (16 * 1024 * 1024) +static const unsigned int FT_POOLS[] = { 4 * 1024 * 1024, + 1 * 1024 * 1024, + 64 * 1024, + 128, + 1 /* size for termination tables */ }; +struct mlx5_ft_pool { + int ft_left[ARRAY_SIZE(FT_POOLS)]; +}; + +int mlx5_ft_pool_init(struct mlx5_core_dev *dev) +{ + struct mlx5_ft_pool *ft_pool; + int i; + + ft_pool = kzalloc(sizeof(*ft_pool), GFP_KERNEL); + if (!ft_pool) + return -ENOMEM; + + for (i = ARRAY_SIZE(FT_POOLS) - 1; i >= 0; i--) + ft_pool->ft_left[i] = FT_SIZE / FT_POOLS[i]; + + dev->priv.ft_pool = ft_pool; + return 0; +} + +void mlx5_ft_pool_destroy(struct mlx5_core_dev *dev) +{ + kfree(dev->priv.ft_pool); +} + +int +mlx5_ft_pool_get_avail_sz(struct mlx5_core_dev *dev, enum fs_flow_table_type table_type, + int desired_size) +{ + u32 max_ft_size = 1 << MLX5_CAP_FLOWTABLE_TYPE(dev, log_max_ft_size, table_type); + int i, found_i = -1; + + for (i = ARRAY_SIZE(FT_POOLS) - 1; i >= 0; i--) { + if (dev->priv.ft_pool->ft_left[i] && FT_POOLS[i] >= desired_size && + FT_POOLS[i] <= max_ft_size) { + found_i = i; + if (desired_size != POOL_NEXT_SIZE) + break; + } + } + + if (found_i != -1) { + --dev->priv.ft_pool->ft_left[found_i]; + return FT_POOLS[found_i]; + } + + return 0; +} + +void +mlx5_ft_pool_put_sz(struct mlx5_core_dev *dev, int sz) +{ + int i; + + if (!sz) + return; + + for (i = ARRAY_SIZE(FT_POOLS) - 1; i >= 0; i--) { + if (sz == FT_POOLS[i]) { + ++dev->priv.ft_pool->ft_left[i]; + return; + } + } + + WARN_ONCE(1, "Couldn't find size %d in flow table size pool", sz); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_ft_pool.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_ft_pool.h new file mode 100644 index 000000000000..25f4274b372b --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_ft_pool.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2021 Mellanox Technologies. 
*/ + +#ifndef __MLX5_FS_FT_POOL_H__ +#define __MLX5_FS_FT_POOL_H__ + +#include <linux/mlx5/driver.h> +#include "fs_core.h" + +#define POOL_NEXT_SIZE 0 + +int mlx5_ft_pool_init(struct mlx5_core_dev *dev); +void mlx5_ft_pool_destroy(struct mlx5_core_dev *dev); + +int +mlx5_ft_pool_get_avail_sz(struct mlx5_core_dev *dev, enum fs_flow_table_type table_type, + int desired_size); +void +mlx5_ft_pool_put_sz(struct mlx5_core_dev *dev, int sz); + +#endif /* __MLX5_FS_FT_POOL_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c index 02558ac2ace6..016d26f809a5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c @@ -148,6 +148,12 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev) if (err) return err; + if (MLX5_CAP_GEN(dev, hca_cap_2)) { + err = mlx5_core_get_caps(dev, MLX5_CAP_GENERAL_2); + if (err) + return err; + } + if (MLX5_CAP_GEN(dev, eth_net_offloads)) { err = mlx5_core_get_caps(dev, MLX5_CAP_ETHERNET_OFFLOADS); if (err) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c index 97d96fc38a65..0e487ec57d5c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c @@ -150,6 +150,7 @@ enum mlx5_ptys_rate { MLX5_PTYS_RATE_FDR = 1 << 4, MLX5_PTYS_RATE_EDR = 1 << 5, MLX5_PTYS_RATE_HDR = 1 << 6, + MLX5_PTYS_RATE_NDR = 1 << 7, }; static inline int mlx5_ptys_rate_enum_to_int(enum mlx5_ptys_rate rate) @@ -162,6 +163,7 @@ static inline int mlx5_ptys_rate_enum_to_int(enum mlx5_ptys_rate rate) case MLX5_PTYS_RATE_FDR: return 14000; case MLX5_PTYS_RATE_EDR: return 25000; case MLX5_PTYS_RATE_HDR: return 50000; + case MLX5_PTYS_RATE_NDR: return 100000; default: return -1; } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index b8748390335f..5c043c5cc403 100644 
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -93,6 +93,64 @@ int mlx5_cmd_destroy_vport_lag(struct mlx5_core_dev *dev) } EXPORT_SYMBOL(mlx5_cmd_destroy_vport_lag); +static int mlx5_lag_netdev_event(struct notifier_block *this, + unsigned long event, void *ptr); +static void mlx5_do_bond_work(struct work_struct *work); + +static void mlx5_ldev_free(struct kref *ref) +{ + struct mlx5_lag *ldev = container_of(ref, struct mlx5_lag, ref); + + if (ldev->nb.notifier_call) + unregister_netdevice_notifier_net(&init_net, &ldev->nb); + mlx5_lag_mp_cleanup(ldev); + cancel_delayed_work_sync(&ldev->bond_work); + destroy_workqueue(ldev->wq); + kfree(ldev); +} + +static void mlx5_ldev_put(struct mlx5_lag *ldev) +{ + kref_put(&ldev->ref, mlx5_ldev_free); +} + +static void mlx5_ldev_get(struct mlx5_lag *ldev) +{ + kref_get(&ldev->ref); +} + +static struct mlx5_lag *mlx5_lag_dev_alloc(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; + int err; + + ldev = kzalloc(sizeof(*ldev), GFP_KERNEL); + if (!ldev) + return NULL; + + ldev->wq = create_singlethread_workqueue("mlx5_lag"); + if (!ldev->wq) { + kfree(ldev); + return NULL; + } + + kref_init(&ldev->ref); + INIT_DELAYED_WORK(&ldev->bond_work, mlx5_do_bond_work); + + ldev->nb.notifier_call = mlx5_lag_netdev_event; + if (register_netdevice_notifier_net(&init_net, &ldev->nb)) { + ldev->nb.notifier_call = NULL; + mlx5_core_err(dev, "Failed to register LAG netdev notifier\n"); + } + + err = mlx5_lag_mp_init(ldev); + if (err) + mlx5_core_err(dev, "Failed to init multipath lag err=%d\n", + err); + + return ldev; +} + int mlx5_lag_dev_get_netdev_idx(struct mlx5_lag *ldev, struct net_device *ndev) { @@ -118,17 +176,24 @@ static bool __mlx5_lag_is_sriov(struct mlx5_lag *ldev) static void mlx5_infer_tx_affinity_mapping(struct lag_tracker *tracker, u8 *port1, u8 *port2) { + bool p1en; + bool p2en; + + p1en = tracker->netdev_state[MLX5_LAG_P1].tx_enabled && + 
tracker->netdev_state[MLX5_LAG_P1].link_up; + + p2en = tracker->netdev_state[MLX5_LAG_P2].tx_enabled && + tracker->netdev_state[MLX5_LAG_P2].link_up; + *port1 = 1; *port2 = 2; - if (!tracker->netdev_state[MLX5_LAG_P1].tx_enabled || - !tracker->netdev_state[MLX5_LAG_P1].link_up) { - *port1 = 2; + if ((!p1en && !p2en) || (p1en && p2en)) return; - } - if (!tracker->netdev_state[MLX5_LAG_P2].tx_enabled || - !tracker->netdev_state[MLX5_LAG_P2].link_up) + if (p1en) *port2 = 1; + else + *port1 = 2; } void mlx5_modify_lag(struct mlx5_lag *ldev, @@ -251,6 +316,10 @@ static void mlx5_lag_add_devices(struct mlx5_lag *ldev) if (!ldev->pf[i].dev) continue; + if (ldev->pf[i].dev->priv.flags & + MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV) + continue; + ldev->pf[i].dev->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(ldev->pf[i].dev); } @@ -269,6 +338,31 @@ static void mlx5_lag_remove_devices(struct mlx5_lag *ldev) } } +static void mlx5_disable_lag(struct mlx5_lag *ldev) +{ + struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; + struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + bool roce_lag; + int err; + + roce_lag = __mlx5_lag_is_roce(ldev); + + if (roce_lag) { + if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) { + dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; + mlx5_rescan_drivers_locked(dev0); + } + mlx5_nic_vport_disable_roce(dev1); + } + + err = mlx5_deactivate_lag(ldev); + if (err) + return; + + if (roce_lag) + mlx5_lag_add_devices(ldev); +} + static void mlx5_do_bond(struct mlx5_lag *ldev) { struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; @@ -280,9 +374,7 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) if (!mlx5_lag_is_ready(ldev)) return; - spin_lock(&lag_lock); tracker = ldev->tracker; - spin_unlock(&lag_lock); do_bond = tracker.is_bonded && mlx5_lag_check_prereq(ldev); @@ -291,8 +383,9 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) !mlx5_sriov_is_enabled(dev1); #ifdef CONFIG_MLX5_ESWITCH - roce_lag &= 
dev0->priv.eswitch->mode == MLX5_ESWITCH_NONE && - dev1->priv.eswitch->mode == MLX5_ESWITCH_NONE; + roce_lag = roce_lag && + dev0->priv.eswitch->mode == MLX5_ESWITCH_NONE && + dev1->priv.eswitch->mode == MLX5_ESWITCH_NONE; #endif if (roce_lag) @@ -316,20 +409,7 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) } else if (do_bond && __mlx5_lag_is_active(ldev)) { mlx5_modify_lag(ldev, &tracker); } else if (!do_bond && __mlx5_lag_is_active(ldev)) { - roce_lag = __mlx5_lag_is_roce(ldev); - - if (roce_lag) { - dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; - mlx5_rescan_drivers_locked(dev0); - mlx5_nic_vport_disable_roce(dev1); - } - - err = mlx5_deactivate_lag(ldev); - if (err) - return; - - if (roce_lag) - mlx5_lag_add_devices(ldev); + mlx5_disable_lag(ldev); } } @@ -481,9 +561,7 @@ static int mlx5_lag_netdev_event(struct notifier_block *this, break; } - spin_lock(&lag_lock); ldev->tracker = tracker; - spin_unlock(&lag_lock); if (changed) mlx5_queue_bond_work(ldev, 0); @@ -491,55 +569,52 @@ static int mlx5_lag_netdev_event(struct notifier_block *this, return NOTIFY_DONE; } -static struct mlx5_lag *mlx5_lag_dev_alloc(void) +static void mlx5_ldev_add_netdev(struct mlx5_lag *ldev, + struct mlx5_core_dev *dev, + struct net_device *netdev) { - struct mlx5_lag *ldev; - - ldev = kzalloc(sizeof(*ldev), GFP_KERNEL); - if (!ldev) - return NULL; - - ldev->wq = create_singlethread_workqueue("mlx5_lag"); - if (!ldev->wq) { - kfree(ldev); - return NULL; - } + unsigned int fn = PCI_FUNC(dev->pdev->devfn); - INIT_DELAYED_WORK(&ldev->bond_work, mlx5_do_bond_work); + if (fn >= MLX5_MAX_PORTS) + return; - return ldev; + spin_lock(&lag_lock); + ldev->pf[fn].netdev = netdev; + ldev->tracker.netdev_state[fn].link_up = 0; + ldev->tracker.netdev_state[fn].tx_enabled = 0; + spin_unlock(&lag_lock); } -static void mlx5_lag_dev_free(struct mlx5_lag *ldev) +static void mlx5_ldev_remove_netdev(struct mlx5_lag *ldev, + struct net_device *netdev) { - destroy_workqueue(ldev->wq); - 
kfree(ldev); + int i; + + spin_lock(&lag_lock); + for (i = 0; i < MLX5_MAX_PORTS; i++) { + if (ldev->pf[i].netdev == netdev) { + ldev->pf[i].netdev = NULL; + break; + } + } + spin_unlock(&lag_lock); } -static int mlx5_lag_dev_add_pf(struct mlx5_lag *ldev, - struct mlx5_core_dev *dev, - struct net_device *netdev) +static void mlx5_ldev_add_mdev(struct mlx5_lag *ldev, + struct mlx5_core_dev *dev) { unsigned int fn = PCI_FUNC(dev->pdev->devfn); if (fn >= MLX5_MAX_PORTS) - return -EPERM; - - spin_lock(&lag_lock); - ldev->pf[fn].dev = dev; - ldev->pf[fn].netdev = netdev; - ldev->tracker.netdev_state[fn].link_up = 0; - ldev->tracker.netdev_state[fn].tx_enabled = 0; + return; + ldev->pf[fn].dev = dev; dev->priv.lag = ldev; - - spin_unlock(&lag_lock); - - return fn; } -static void mlx5_lag_dev_remove_pf(struct mlx5_lag *ldev, - struct mlx5_core_dev *dev) +/* Must be called with intf_mutex held */ +static void mlx5_ldev_remove_mdev(struct mlx5_lag *ldev, + struct mlx5_core_dev *dev) { int i; @@ -550,19 +625,15 @@ static void mlx5_lag_dev_remove_pf(struct mlx5_lag *ldev, if (i == MLX5_MAX_PORTS) return; - spin_lock(&lag_lock); - memset(&ldev->pf[i], 0, sizeof(*ldev->pf)); - + ldev->pf[i].dev = NULL; dev->priv.lag = NULL; - spin_unlock(&lag_lock); } /* Must be called with intf_mutex held */ -void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev) +static void __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev = NULL; struct mlx5_core_dev *tmp_dev; - int i, err; if (!MLX5_CAP_GEN(dev, vport_group_manager) || !MLX5_CAP_GEN(dev, lag_master) || @@ -574,67 +645,77 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev) ldev = tmp_dev->priv.lag; if (!ldev) { - ldev = mlx5_lag_dev_alloc(); + ldev = mlx5_lag_dev_alloc(dev); if (!ldev) { mlx5_core_err(dev, "Failed to alloc lag dev\n"); return; } + } else { + mlx5_ldev_get(ldev); } - if (mlx5_lag_dev_add_pf(ldev, dev, netdev) < 0) - return; + mlx5_ldev_add_mdev(ldev, dev); - 
for (i = 0; i < MLX5_MAX_PORTS; i++) - if (!ldev->pf[i].dev) - break; + return; +} - if (i >= MLX5_MAX_PORTS) - ldev->flags |= MLX5_LAG_FLAG_READY; +void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; - if (!ldev->nb.notifier_call) { - ldev->nb.notifier_call = mlx5_lag_netdev_event; - if (register_netdevice_notifier_net(&init_net, &ldev->nb)) { - ldev->nb.notifier_call = NULL; - mlx5_core_err(dev, "Failed to register LAG netdev notifier\n"); - } - } + ldev = mlx5_lag_dev(dev); + if (!ldev) + return; - err = mlx5_lag_mp_init(ldev); - if (err) - mlx5_core_err(dev, "Failed to init multipath lag err=%d\n", - err); + mlx5_dev_list_lock(); + mlx5_ldev_remove_mdev(ldev, dev); + mlx5_dev_list_unlock(); + mlx5_ldev_put(ldev); +} + +void mlx5_lag_add_mdev(struct mlx5_core_dev *dev) +{ + mlx5_dev_list_lock(); + __mlx5_lag_dev_add_mdev(dev); + mlx5_dev_list_unlock(); } /* Must be called with intf_mutex held */ -void mlx5_lag_remove(struct mlx5_core_dev *dev) +void mlx5_lag_remove_netdev(struct mlx5_core_dev *dev, + struct net_device *netdev) { struct mlx5_lag *ldev; - int i; - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); if (!ldev) return; if (__mlx5_lag_is_active(ldev)) - mlx5_deactivate_lag(ldev); - - mlx5_lag_dev_remove_pf(ldev, dev); + mlx5_disable_lag(ldev); + mlx5_ldev_remove_netdev(ldev, netdev); ldev->flags &= ~MLX5_LAG_FLAG_READY; +} + +/* Must be called with intf_mutex held */ +void mlx5_lag_add_netdev(struct mlx5_core_dev *dev, + struct net_device *netdev) +{ + struct mlx5_lag *ldev; + int i; + + ldev = mlx5_lag_dev(dev); + if (!ldev) + return; + + mlx5_ldev_add_netdev(ldev, dev, netdev); for (i = 0; i < MLX5_MAX_PORTS; i++) - if (ldev->pf[i].dev) + if (!ldev->pf[i].dev) break; - if (i == MLX5_MAX_PORTS) { - if (ldev->nb.notifier_call) { - unregister_netdevice_notifier_net(&init_net, &ldev->nb); - ldev->nb.notifier_call = NULL; - } - mlx5_lag_mp_cleanup(ldev); - cancel_delayed_work_sync(&ldev->bond_work); - 
mlx5_lag_dev_free(ldev); - } + if (i >= MLX5_MAX_PORTS) + ldev->flags |= MLX5_LAG_FLAG_READY; } bool mlx5_lag_is_roce(struct mlx5_core_dev *dev) @@ -643,7 +724,7 @@ bool mlx5_lag_is_roce(struct mlx5_core_dev *dev) bool res; spin_lock(&lag_lock); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); res = ldev && __mlx5_lag_is_roce(ldev); spin_unlock(&lag_lock); @@ -657,7 +738,7 @@ bool mlx5_lag_is_active(struct mlx5_core_dev *dev) bool res; spin_lock(&lag_lock); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); res = ldev && __mlx5_lag_is_active(ldev); spin_unlock(&lag_lock); @@ -671,7 +752,7 @@ bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev) bool res; spin_lock(&lag_lock); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); res = ldev && __mlx5_lag_is_sriov(ldev); spin_unlock(&lag_lock); @@ -684,7 +765,7 @@ void mlx5_lag_update(struct mlx5_core_dev *dev) struct mlx5_lag *ldev; mlx5_dev_list_lock(); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); if (!ldev) goto unlock; @@ -700,7 +781,7 @@ struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev) struct mlx5_lag *ldev; spin_lock(&lag_lock); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); if (!(ldev && __mlx5_lag_is_roce(ldev))) goto unlock; @@ -729,7 +810,7 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev, u8 port = 0; spin_lock(&lag_lock); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); if (!(ldev && __mlx5_lag_is_roce(ldev))) goto unlock; @@ -765,7 +846,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev, memset(values, 0, sizeof(*values) * num_counters); spin_lock(&lag_lock); - ldev = mlx5_lag_dev_get(dev); + ldev = mlx5_lag_dev(dev); if (ldev && __mlx5_lag_is_active(ldev)) { num_ports = MLX5_MAX_PORTS; mdev[MLX5_LAG_P1] = ldev->pf[MLX5_LAG_P1].dev; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index 8d8cf2d0bc6d..191392c37558 100644 --- 
a/drivers/net/ethernet/mellanox/mlx5/core/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h
@@ -40,6 +40,7 @@ struct lag_tracker {
 struct mlx5_lag {
 	u8 flags;
 	u8 v2p_map[MLX5_MAX_PORTS];
+	struct kref ref;
 	struct lag_func pf[MLX5_MAX_PORTS];
 	struct lag_tracker tracker;
 	struct workqueue_struct *wq;
@@ -49,7 +50,7 @@ struct mlx5_lag {
 };
 
 static inline struct mlx5_lag *
-mlx5_lag_dev_get(struct mlx5_core_dev *dev)
+mlx5_lag_dev(struct mlx5_core_dev *dev)
 {
 	return dev->priv.lag;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
index fd6196b5e163..c4bf8b679541 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
@@ -28,7 +28,7 @@ bool mlx5_lag_is_multipath(struct mlx5_core_dev *dev)
 	struct mlx5_lag *ldev;
 	bool res;
 
-	ldev = mlx5_lag_dev_get(dev);
+	ldev = mlx5_lag_dev(dev);
 	res = ldev && __mlx5_lag_is_multipath(ldev);
 
 	return res;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
index f607a3858ef5..624cedebb510 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
-/* Copyright (c) 2018 Mellanox Technologies */
+/* Copyright (c) 2018-2021, Mellanox Technologies inc. All rights reserved. */
 
 #ifndef __LIB_MLX5_EQ_H__
 #define __LIB_MLX5_EQ_H__
@@ -32,6 +32,7 @@ struct mlx5_eq {
 	unsigned int irqn;
 	u8 eqn;
 	struct mlx5_rsc_debug *dbg;
+	struct mlx5_irq *irq;
 };
 
 struct mlx5_eq_async {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
index 20a4047f2737..97e5845b4cfd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
@@ -6,6 +6,7 @@
 #include <linux/mlx5/fs.h>
 
 #include "lib/fs_chains.h"
+#include "fs_ft_pool.h"
 #include "en/mapping.h"
 #include "fs_core.h"
 #include "en_tc.h"
@@ -13,25 +14,10 @@
 #define chains_lock(chains) ((chains)->lock)
 #define chains_ht(chains) ((chains)->chains_ht)
 #define prios_ht(chains) ((chains)->prios_ht)
-#define ft_pool_left(chains) ((chains)->ft_left)
 #define tc_default_ft(chains) ((chains)->tc_default_ft)
 #define tc_end_ft(chains) ((chains)->tc_end_ft)
 #define ns_to_chains_fs_prio(ns) ((ns) == MLX5_FLOW_NAMESPACE_FDB ? \
 				  FDB_TC_OFFLOAD : MLX5E_TC_PRIO)
-
-/* Firmware currently has 4 pool of 4 sizes that it supports (FT_POOLS),
- * and a virtual memory region of 16M (MLX5_FT_SIZE), this region is duplicated
- * for each flow table pool. We can allocate up to 16M of each pool,
- * and we keep track of how much we used via get_next_avail_sz_from_pool.
- * Firmware doesn't report any of this for now.
- * ESW_POOL is expected to be sorted from large to small and match firmware
- * pools.
- */
-#define FT_SIZE (16 * 1024 * 1024)
-static const unsigned int FT_POOLS[] = { 4 * 1024 * 1024,
-					  1 * 1024 * 1024,
-					  64 * 1024,
-					  128 };
 #define FT_TBL_SZ (64 * 1024)
 
 struct mlx5_fs_chains {
@@ -49,8 +35,6 @@ struct mlx5_fs_chains {
 	enum mlx5_flow_namespace_type ns;
 	u32 group_num;
 	u32 flags;
-
-	int ft_left[ARRAY_SIZE(FT_POOLS)];
 };
 
 struct fs_chain {
@@ -160,54 +144,6 @@ mlx5_chains_set_end_ft(struct mlx5_fs_chains *chains,
 	tc_end_ft(chains) = ft;
 }
 
-#define POOL_NEXT_SIZE 0
-static int
-mlx5_chains_get_avail_sz_from_pool(struct mlx5_fs_chains *chains,
-				   int desired_size)
-{
-	int i, found_i = -1;
-
-	for (i = ARRAY_SIZE(FT_POOLS) - 1; i >= 0; i--) {
-		if (ft_pool_left(chains)[i] && FT_POOLS[i] > desired_size) {
-			found_i = i;
-			if (desired_size != POOL_NEXT_SIZE)
-				break;
-		}
-	}
-
-	if (found_i != -1) {
-		--ft_pool_left(chains)[found_i];
-		return FT_POOLS[found_i];
-	}
-
-	return 0;
-}
-
-static void
-mlx5_chains_put_sz_to_pool(struct mlx5_fs_chains *chains, int sz)
-{
-	int i;
-
-	for (i = ARRAY_SIZE(FT_POOLS) - 1; i >= 0; i--) {
-		if (sz == FT_POOLS[i]) {
-			++ft_pool_left(chains)[i];
-			return;
-		}
-	}
-
-	WARN_ONCE(1, "Couldn't find size %d in flow table size pool", sz);
-}
-
-static void
-mlx5_chains_init_sz_pool(struct mlx5_fs_chains *chains, u32 ft_max)
-{
-	int i;
-
-	for (i = ARRAY_SIZE(FT_POOLS) - 1; i >= 0; i--)
-		ft_pool_left(chains)[i] =
-			FT_POOLS[i] <= ft_max ? FT_SIZE / FT_POOLS[i] : 0;
-}
-
 static struct mlx5_flow_table *
 mlx5_chains_create_table(struct mlx5_fs_chains *chains,
 			 u32 chain, u32 prio, u32 level)
@@ -221,11 +157,7 @@ mlx5_chains_create_table(struct mlx5_fs_chains *chains,
 		ft_attr.flags |= (MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT |
 				  MLX5_FLOW_TABLE_TUNNEL_EN_DECAP);
 
-	sz = (chain == mlx5_chains_get_nf_ft_chain(chains)) ?
-	     mlx5_chains_get_avail_sz_from_pool(chains, FT_TBL_SZ) :
-	     mlx5_chains_get_avail_sz_from_pool(chains, POOL_NEXT_SIZE);
-	if (!sz)
-		return ERR_PTR(-ENOSPC);
+	sz = (chain == mlx5_chains_get_nf_ft_chain(chains)) ? FT_TBL_SZ : POOL_NEXT_SIZE;
 	ft_attr.max_fte = sz;
 
 	/* We use tc_default_ft(chains) as the table's next_ft till
@@ -266,21 +198,12 @@ mlx5_chains_create_table(struct mlx5_fs_chains *chains,
 	if (IS_ERR(ft)) {
 		mlx5_core_warn(chains->dev, "Failed to create chains table err %d (chain: %d, prio: %d, level: %d, size: %d)\n",
 			       (int)PTR_ERR(ft), chain, prio, level, sz);
-		mlx5_chains_put_sz_to_pool(chains, sz);
 		return ft;
 	}
 
 	return ft;
 }
 
-static void
-mlx5_chains_destroy_table(struct mlx5_fs_chains *chains,
-			  struct mlx5_flow_table *ft)
-{
-	mlx5_chains_put_sz_to_pool(chains, ft->max_fte);
-	mlx5_destroy_flow_table(ft);
-}
-
 static int
 create_chain_restore(struct fs_chain *chain)
 {
@@ -336,9 +259,10 @@ create_chain_restore(struct fs_chain *chain)
 	MLX5_SET(set_action_in, modact, field,
 		 mlx5e_tc_attr_to_reg_mappings[chain_to_reg].mfield);
 	MLX5_SET(set_action_in, modact, offset,
-		 mlx5e_tc_attr_to_reg_mappings[chain_to_reg].moffset * 8);
+		 mlx5e_tc_attr_to_reg_mappings[chain_to_reg].moffset);
 	MLX5_SET(set_action_in, modact, length,
-		 mlx5e_tc_attr_to_reg_mappings[chain_to_reg].mlen * 8);
+		 mlx5e_tc_attr_to_reg_mappings[chain_to_reg].mlen == 32 ?
+		 0 : mlx5e_tc_attr_to_reg_mappings[chain_to_reg].mlen);
 	MLX5_SET(set_action_in, modact, data, chain->id);
 	mod_hdr = mlx5_modify_header_alloc(chains->dev, chains->ns,
 					   1, modact);
@@ -636,7 +560,7 @@ err_insert:
 err_miss_rule:
 	mlx5_destroy_flow_group(miss_group);
 err_group:
-	mlx5_chains_destroy_table(chains, ft);
+	mlx5_destroy_flow_table(ft);
 err_create:
 err_alloc:
 	kvfree(prio_s);
@@ -659,7 +583,7 @@ mlx5_chains_destroy_prio(struct mlx5_fs_chains *chains,
 			       prio_params);
 	mlx5_del_flow_rules(prio->miss_rule);
 	mlx5_destroy_flow_group(prio->miss_group);
-	mlx5_chains_destroy_table(chains, prio->ft);
+	mlx5_destroy_flow_table(prio->ft);
 	mlx5_chains_put_chain(chain);
 	kvfree(prio);
 }
@@ -784,7 +708,7 @@ void
 mlx5_chains_destroy_global_table(struct mlx5_fs_chains *chains,
 				 struct mlx5_flow_table *ft)
 {
-	mlx5_chains_destroy_table(chains, ft);
+	mlx5_destroy_flow_table(ft);
 }
 
 static struct mlx5_fs_chains *
@@ -816,8 +740,6 @@ mlx5_chains_init(struct mlx5_core_dev *dev, struct mlx5_chains_attr *attr)
 		       mlx5_chains_get_chain_range(chains_priv),
 		       mlx5_chains_get_prio_range(chains_priv));
 
-	mlx5_chains_init_sz_pool(chains_priv, attr->max_ft_sz);
-
 	err = rhashtable_init(&chains_ht(chains_priv), &chain_params);
 	if (err)
 		goto init_chains_ht_err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/sf.h
new file mode 100644
index 000000000000..84e5683861be
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sf.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies Ltd */
+
+#ifndef __LIB_MLX5_SF_H__
+#define __LIB_MLX5_SF_H__
+
+#include <linux/mlx5/driver.h>
+
+static inline u16 mlx5_sf_start_function_id(const struct mlx5_core_dev *dev)
+{
+	return MLX5_CAP_GEN(dev, sf_base_id);
+}
+
+#ifdef CONFIG_MLX5_SF
+
+static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev)
+{
+	return MLX5_CAP_GEN(dev, sf);
+}
+
+static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev)
+{
+	if (!mlx5_sf_supported(dev))
+		return 0;
+	if (MLX5_CAP_GEN(dev, max_num_sf))
+		return MLX5_CAP_GEN(dev, max_num_sf);
+	else
+		return 1 << MLX5_CAP_GEN(dev, log_max_sf);
+}
+
+#else
+
+static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev)
+{
+	return false;
+}
+
+static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev)
+{
+	return 0;
+}
+
+#endif
+
+#endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 0d0f63a27aba..eb1b316560a8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -76,6 +76,7 @@
 #include "sf/vhca_event.h"
 #include "sf/dev/dev.h"
 #include "sf/sf.h"
+#include "mlx5_irq.h"
 
 MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
 MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver");
@@ -1185,6 +1186,7 @@ static int mlx5_load(struct mlx5_core_dev *dev)
 	}
 
 	mlx5_sf_dev_table_create(dev);
+	mlx5_lag_add_mdev(dev);
 
 	return 0;
 
@@ -1220,6 +1222,7 @@ err_irq_table:
 
 static void mlx5_unload(struct mlx5_core_dev *dev)
 {
+	mlx5_lag_remove_mdev(dev);
 	mlx5_sf_dev_table_destroy(dev);
 	mlx5_sriov_detach(dev);
 	mlx5_ec_cleanup(dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index a22b706eebd3..343807ac2036 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -164,27 +164,10 @@ int mlx5_query_mcam_reg(struct mlx5_core_dev *dev, u32 *mcap, u8 feature_group,
 int mlx5_query_qcam_reg(struct mlx5_core_dev *mdev, u32 *qcam, u8 feature_group,
 			u8 access_reg_group);
 
-void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev);
-void mlx5_lag_remove(struct mlx5_core_dev *dev);
-
-int mlx5_irq_table_init(struct mlx5_core_dev *dev);
-void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev);
-int mlx5_irq_table_create(struct mlx5_core_dev *dev);
-void mlx5_irq_table_destroy(struct mlx5_core_dev *dev);
-int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx,
-		       struct notifier_block *nb);
-int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx,
-		       struct notifier_block *nb);
-
-int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn,
-			    int msix_vec_count);
-int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
-
-struct cpumask *
-mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx);
-struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table);
-int mlx5_irq_get_num_comp(struct mlx5_irq_table *table);
-struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev);
+void mlx5_lag_add_netdev(struct mlx5_core_dev *dev, struct net_device *netdev);
+void mlx5_lag_remove_netdev(struct mlx5_core_dev *dev, struct net_device *netdev);
+void mlx5_lag_add_mdev(struct mlx5_core_dev *dev);
+void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev);
 
 int mlx5_events_init(struct mlx5_core_dev *dev);
 void mlx5_events_cleanup(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
new file mode 100644
index 000000000000..abd024173c42
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2021 Mellanox Technologies.
*/ + +#ifndef __MLX5_IRQ_H__ +#define __MLX5_IRQ_H__ + +#include <linux/mlx5/driver.h> + +#define MLX5_COMP_EQS_PER_SF 8 + +#define MLX5_IRQ_EQ_CTRL (0) + +struct mlx5_irq; + +int mlx5_irq_table_init(struct mlx5_core_dev *dev); +void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev); +int mlx5_irq_table_create(struct mlx5_core_dev *dev); +void mlx5_irq_table_destroy(struct mlx5_core_dev *dev); +int mlx5_irq_table_get_num_comp(struct mlx5_irq_table *table); +int mlx5_irq_table_get_sfs_vec(struct mlx5_irq_table *table); +struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev); + +int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn, + int msix_vec_count); +int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs); + +struct mlx5_irq *mlx5_irq_request(struct mlx5_core_dev *dev, u16 vecidx, + struct cpumask *affinity); +void mlx5_irq_release(struct mlx5_irq *irq); +int mlx5_irq_attach_nb(struct mlx5_irq *irq, struct notifier_block *nb); +int mlx5_irq_detach_nb(struct mlx5_irq *irq, struct notifier_block *nb); +struct cpumask *mlx5_irq_get_affinity_mask(struct mlx5_irq *irq); +int mlx5_irq_get_index(struct mlx5_irq *irq); + +#endif /* __MLX5_IRQ_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index c3373fb1cd7f..b25f764daa08 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -6,60 +6,52 @@ #include <linux/module.h> #include <linux/mlx5/driver.h> #include "mlx5_core.h" +#include "mlx5_irq.h" +#include "lib/sf.h" #ifdef CONFIG_RFS_ACCEL #include <linux/cpu_rmap.h> #endif #define MLX5_MAX_IRQ_NAME (32) +/* max irq_index is 255. 
three chars */ +#define MLX5_MAX_IRQ_IDX_CHARS (3) + +#define MLX5_SFS_PER_CTRL_IRQ 64 +#define MLX5_IRQ_CTRL_SF_MAX 8 +/* min num of vectores for SFs to be enabled */ +#define MLX5_IRQ_VEC_COMP_BASE_SF 2 + +#define MLX5_EQ_SHARE_IRQ_MAX_COMP (8) +#define MLX5_EQ_SHARE_IRQ_MAX_CTRL (UINT_MAX) +#define MLX5_EQ_SHARE_IRQ_MIN_COMP (1) +#define MLX5_EQ_SHARE_IRQ_MIN_CTRL (4) +#define MLX5_EQ_REFS_PER_IRQ (2) struct mlx5_irq { + u32 index; struct atomic_notifier_head nh; cpumask_var_t mask; char name[MLX5_MAX_IRQ_NAME]; + struct kref kref; + int irqn; + struct mlx5_irq_pool *pool; }; -struct mlx5_irq_table { - struct mlx5_irq *irq; - int nvec; -#ifdef CONFIG_RFS_ACCEL - struct cpu_rmap *rmap; -#endif +struct mlx5_irq_pool { + char name[MLX5_MAX_IRQ_NAME - MLX5_MAX_IRQ_IDX_CHARS]; + struct xa_limit xa_num_irqs; + struct mutex lock; /* sync IRQs creations */ + struct xarray irqs; + u32 max_threshold; + u32 min_threshold; + struct mlx5_core_dev *dev; }; -int mlx5_irq_table_init(struct mlx5_core_dev *dev) -{ - struct mlx5_irq_table *irq_table; - - if (mlx5_core_is_sf(dev)) - return 0; - - irq_table = kvzalloc(sizeof(*irq_table), GFP_KERNEL); - if (!irq_table) - return -ENOMEM; - - dev->priv.irq_table = irq_table; - return 0; -} - -void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev) -{ - if (mlx5_core_is_sf(dev)) - return; - - kvfree(dev->priv.irq_table); -} - -int mlx5_irq_get_num_comp(struct mlx5_irq_table *table) -{ - return table->nvec - MLX5_IRQ_VEC_COMP_BASE; -} - -static struct mlx5_irq *mlx5_irq_get(struct mlx5_core_dev *dev, int vecidx) -{ - struct mlx5_irq_table *irq_table = dev->priv.irq_table; - - return &irq_table->irq[vecidx]; -} +struct mlx5_irq_table { + struct mlx5_irq_pool *pf_pool; + struct mlx5_irq_pool *sf_ctrl_pool; + struct mlx5_irq_pool *sf_comp_pool; +}; /** * mlx5_get_default_msix_vec_count - Get the default number of MSI-X vectors @@ -146,34 +138,46 @@ out: return ret; } -int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx, - 
struct notifier_block *nb) +static void irq_release(struct kref *kref) { - struct mlx5_irq *irq; + struct mlx5_irq *irq = container_of(kref, struct mlx5_irq, kref); + struct mlx5_irq_pool *pool = irq->pool; - irq = &irq_table->irq[vecidx]; - return atomic_notifier_chain_register(&irq->nh, nb); + xa_erase(&pool->irqs, irq->index); + /* free_irq requires that affinity and rmap will be cleared + * before calling it. This is why there is asymmetry with set_rmap + * which should be called after alloc_irq but before request_irq. + */ + irq_set_affinity_hint(irq->irqn, NULL); + free_cpumask_var(irq->mask); + free_irq(irq->irqn, &irq->nh); + kfree(irq); } -int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx, - struct notifier_block *nb) +static void irq_put(struct mlx5_irq *irq) { - struct mlx5_irq *irq; + struct mlx5_irq_pool *pool = irq->pool; - irq = &irq_table->irq[vecidx]; - return atomic_notifier_chain_unregister(&irq->nh, nb); + mutex_lock(&pool->lock); + kref_put(&irq->kref, irq_release); + mutex_unlock(&pool->lock); } -static irqreturn_t mlx5_irq_int_handler(int irq, void *nh) +static irqreturn_t irq_int_handler(int irq, void *nh) { atomic_notifier_call_chain(nh, 0, NULL); return IRQ_HANDLED; } +static void irq_sf_set_name(struct mlx5_irq_pool *pool, char *name, int vecidx) +{ + snprintf(name, MLX5_MAX_IRQ_NAME, "%s%d", pool->name, vecidx); +} + static void irq_set_name(char *name, int vecidx) { if (vecidx == 0) { - snprintf(name, MLX5_MAX_IRQ_NAME, "mlx5_async"); + snprintf(name, MLX5_MAX_IRQ_NAME, "mlx5_async%d", vecidx); return; } @@ -181,251 +185,431 @@ static void irq_set_name(char *name, int vecidx) vecidx - MLX5_IRQ_VEC_COMP_BASE); } -static int request_irqs(struct mlx5_core_dev *dev, int nvec) +static struct mlx5_irq *irq_request(struct mlx5_irq_pool *pool, int i) { + struct mlx5_core_dev *dev = pool->dev; char name[MLX5_MAX_IRQ_NAME]; + struct mlx5_irq *irq; int err; - int i; - - for (i = 0; i < nvec; i++) { - struct mlx5_irq *irq = 
mlx5_irq_get(dev, i); - int irqn = pci_irq_vector(dev->pdev, i); + irq = kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + irq->irqn = pci_irq_vector(dev->pdev, i); + if (!pool->name[0]) irq_set_name(name, i); - ATOMIC_INIT_NOTIFIER_HEAD(&irq->nh); - snprintf(irq->name, MLX5_MAX_IRQ_NAME, - "%s@pci:%s", name, pci_name(dev->pdev)); - err = request_irq(irqn, mlx5_irq_int_handler, 0, irq->name, - &irq->nh); - if (err) { - mlx5_core_err(dev, "Failed to request irq\n"); - goto err_request_irq; - } + else + irq_sf_set_name(pool, name, i); + ATOMIC_INIT_NOTIFIER_HEAD(&irq->nh); + snprintf(irq->name, MLX5_MAX_IRQ_NAME, + "%s@pci:%s", name, pci_name(dev->pdev)); + err = request_irq(irq->irqn, irq_int_handler, 0, irq->name, + &irq->nh); + if (err) { + mlx5_core_err(dev, "Failed to request irq. err = %d\n", err); + goto err_req_irq; } - return 0; + if (!zalloc_cpumask_var(&irq->mask, GFP_KERNEL)) { + mlx5_core_warn(dev, "zalloc_cpumask_var failed\n"); + err = -ENOMEM; + goto err_cpumask; + } + kref_init(&irq->kref); + irq->index = i; + err = xa_err(xa_store(&pool->irqs, irq->index, irq, GFP_KERNEL)); + if (err) { + mlx5_core_err(dev, "Failed to alloc xa entry for irq(%u). err = %d\n", + irq->index, err); + goto err_xa; + } + irq->pool = pool; + return irq; +err_xa: + free_cpumask_var(irq->mask); +err_cpumask: + free_irq(irq->irqn, &irq->nh); +err_req_irq: + kfree(irq); + return ERR_PTR(err); +} -err_request_irq: - while (i--) { - struct mlx5_irq *irq = mlx5_irq_get(dev, i); - int irqn = pci_irq_vector(dev->pdev, i); +int mlx5_irq_attach_nb(struct mlx5_irq *irq, struct notifier_block *nb) +{ + int err; - free_irq(irqn, &irq->nh); - } - return err; + err = kref_get_unless_zero(&irq->kref); + if (WARN_ON_ONCE(!err)) + /* Something very bad happens here, we are enabling EQ + * on non-existing IRQ. 
+ */ + return -ENOENT; + err = atomic_notifier_chain_register(&irq->nh, nb); + if (err) + irq_put(irq); + return err; } -static void irq_clear_rmap(struct mlx5_core_dev *dev) +int mlx5_irq_detach_nb(struct mlx5_irq *irq, struct notifier_block *nb) { -#ifdef CONFIG_RFS_ACCEL - struct mlx5_irq_table *irq_table = dev->priv.irq_table; + irq_put(irq); + return atomic_notifier_chain_unregister(&irq->nh, nb); +} - free_irq_cpu_rmap(irq_table->rmap); -#endif +struct cpumask *mlx5_irq_get_affinity_mask(struct mlx5_irq *irq) +{ + return irq->mask; } -static int irq_set_rmap(struct mlx5_core_dev *mdev) +int mlx5_irq_get_index(struct mlx5_irq *irq) { - int err = 0; -#ifdef CONFIG_RFS_ACCEL - struct mlx5_irq_table *irq_table = mdev->priv.irq_table; - int num_affinity_vec; - int vecidx; + return irq->index; +} - num_affinity_vec = mlx5_irq_get_num_comp(irq_table); - irq_table->rmap = alloc_irq_cpu_rmap(num_affinity_vec); - if (!irq_table->rmap) { - err = -ENOMEM; - mlx5_core_err(mdev, "Failed to allocate cpu_rmap. 
err %d", err); - goto err_out; +/* irq_pool API */ + +/* creating an irq from irq_pool */ +static struct mlx5_irq *irq_pool_create_irq(struct mlx5_irq_pool *pool, + struct cpumask *affinity) +{ + struct mlx5_irq *irq; + u32 irq_index; + int err; + + err = xa_alloc(&pool->irqs, &irq_index, NULL, pool->xa_num_irqs, + GFP_KERNEL); + if (err) + return ERR_PTR(err); + irq = irq_request(pool, irq_index); + if (IS_ERR(irq)) + return irq; + cpumask_copy(irq->mask, affinity); + irq_set_affinity_hint(irq->irqn, irq->mask); + return irq; +} + +/* looking for the irq with the smallest refcount and the same affinity */ +static struct mlx5_irq *irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, + struct cpumask *affinity) +{ + int start = pool->xa_num_irqs.min; + int end = pool->xa_num_irqs.max; + struct mlx5_irq *irq = NULL; + struct mlx5_irq *iter; + unsigned long index; + + lockdep_assert_held(&pool->lock); + xa_for_each_range(&pool->irqs, index, iter, start, end) { + if (!cpumask_equal(iter->mask, affinity)) + continue; + if (kref_read(&iter->kref) < pool->min_threshold) + return iter; + if (!irq || kref_read(&iter->kref) < + kref_read(&irq->kref)) + irq = iter; } + return irq; +} + +/* requesting an irq from a given pool according to given affinity */ +static struct mlx5_irq *irq_pool_request_affinity(struct mlx5_irq_pool *pool, + struct cpumask *affinity) +{ + struct mlx5_irq *least_loaded_irq, *new_irq; - vecidx = MLX5_IRQ_VEC_COMP_BASE; - for (; vecidx < irq_table->nvec; vecidx++) { - err = irq_cpu_rmap_add(irq_table->rmap, - pci_irq_vector(mdev->pdev, vecidx)); - if (err) { - mlx5_core_err(mdev, "irq_cpu_rmap_add failed. 
err %d", - err); - goto err_irq_cpu_rmap_add; + mutex_lock(&pool->lock); + least_loaded_irq = irq_pool_find_least_loaded(pool, affinity); + if (least_loaded_irq && + kref_read(&least_loaded_irq->kref) < pool->min_threshold) + goto out; + new_irq = irq_pool_create_irq(pool, affinity); + if (IS_ERR(new_irq)) { + if (!least_loaded_irq) { + mlx5_core_err(pool->dev, "Didn't find IRQ for cpu = %u\n", + cpumask_first(affinity)); + mutex_unlock(&pool->lock); + return new_irq; } + /* We failed to create a new IRQ for the requested affinity, + * sharing existing IRQ. + */ + goto out; } - return 0; + least_loaded_irq = new_irq; + goto unlock; +out: + kref_get(&least_loaded_irq->kref); + if (kref_read(&least_loaded_irq->kref) > pool->max_threshold) + mlx5_core_dbg(pool->dev, "IRQ %u overloaded, pool_name: %s, %u EQs on this irq\n", + least_loaded_irq->irqn, pool->name, + kref_read(&least_loaded_irq->kref) / MLX5_EQ_REFS_PER_IRQ); +unlock: + mutex_unlock(&pool->lock); + return least_loaded_irq; +} -err_irq_cpu_rmap_add: - irq_clear_rmap(mdev); -err_out: -#endif - return err; +/* requesting an irq from a given pool according to given index */ +static struct mlx5_irq * +irq_pool_request_vector(struct mlx5_irq_pool *pool, int vecidx, + struct cpumask *affinity) +{ + struct mlx5_irq *irq; + + mutex_lock(&pool->lock); + irq = xa_load(&pool->irqs, vecidx); + if (irq) { + kref_get(&irq->kref); + goto unlock; + } + irq = irq_request(pool, vecidx); + if (IS_ERR(irq) || !affinity) + goto unlock; + cpumask_copy(irq->mask, affinity); + irq_set_affinity_hint(irq->irqn, irq->mask); +unlock: + mutex_unlock(&pool->lock); + return irq; } -/* Completion IRQ vectors */ +static struct mlx5_irq_pool *find_sf_irq_pool(struct mlx5_irq_table *irq_table, + int i, struct cpumask *affinity) +{ + if (cpumask_empty(affinity) && i == MLX5_IRQ_EQ_CTRL) + return irq_table->sf_ctrl_pool; + return irq_table->sf_comp_pool; +} -static int set_comp_irq_affinity_hint(struct mlx5_core_dev *mdev, int i) +/** + * 
mlx5_irq_release - release an IRQ back to the system. + * @irq: irq to be released. + */ +void mlx5_irq_release(struct mlx5_irq *irq) { - int vecidx = MLX5_IRQ_VEC_COMP_BASE + i; + synchronize_irq(irq->irqn); + irq_put(irq); +} + +/** + * mlx5_irq_request - request an IRQ for mlx5 device. + * @dev: mlx5 device that requesting the IRQ. + * @vecidx: vector index of the IRQ. This argument is ignore if affinity is + * provided. + * @affinity: cpumask requested for this IRQ. + * + * This function returns a pointer to IRQ, or ERR_PTR in case of error. + */ +struct mlx5_irq *mlx5_irq_request(struct mlx5_core_dev *dev, u16 vecidx, + struct cpumask *affinity) +{ + struct mlx5_irq_table *irq_table = mlx5_irq_table_get(dev); + struct mlx5_irq_pool *pool; struct mlx5_irq *irq; - int irqn; - irq = mlx5_irq_get(mdev, vecidx); - irqn = pci_irq_vector(mdev->pdev, vecidx); - if (!zalloc_cpumask_var(&irq->mask, GFP_KERNEL)) { - mlx5_core_warn(mdev, "zalloc_cpumask_var failed"); - return -ENOMEM; + if (mlx5_core_is_sf(dev)) { + pool = find_sf_irq_pool(irq_table, vecidx, affinity); + if (!pool) + /* we don't have IRQs for SFs, using the PF IRQs */ + goto pf_irq; + if (cpumask_empty(affinity) && !strcmp(pool->name, "mlx5_sf_comp")) + /* In case an SF user request IRQ with vecidx */ + irq = irq_pool_request_vector(pool, vecidx, NULL); + else + irq = irq_pool_request_affinity(pool, affinity); + goto out; } +pf_irq: + pool = irq_table->pf_pool; + irq = irq_pool_request_vector(pool, vecidx, affinity); +out: + if (IS_ERR(irq)) + return irq; + mlx5_core_dbg(dev, "irq %u mapped to cpu %*pbl, %u EQs on this irq\n", + irq->irqn, cpumask_pr_args(affinity), + kref_read(&irq->kref) / MLX5_EQ_REFS_PER_IRQ); + return irq; +} - cpumask_set_cpu(cpumask_local_spread(i, mdev->priv.numa_node), - irq->mask); - if (IS_ENABLED(CONFIG_SMP) && - irq_set_affinity_hint(irqn, irq->mask)) - mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x", - irqn); - - return 0; +static struct mlx5_irq_pool * 
+irq_pool_alloc(struct mlx5_core_dev *dev, int start, int size, char *name, + u32 min_threshold, u32 max_threshold) +{ + struct mlx5_irq_pool *pool = kvzalloc(sizeof(*pool), GFP_KERNEL); + + if (!pool) + return ERR_PTR(-ENOMEM); + pool->dev = dev; + xa_init_flags(&pool->irqs, XA_FLAGS_ALLOC); + pool->xa_num_irqs.min = start; + pool->xa_num_irqs.max = start + size - 1; + if (name) + snprintf(pool->name, MLX5_MAX_IRQ_NAME - MLX5_MAX_IRQ_IDX_CHARS, + name); + pool->min_threshold = min_threshold * MLX5_EQ_REFS_PER_IRQ; + pool->max_threshold = max_threshold * MLX5_EQ_REFS_PER_IRQ; + mutex_init(&pool->lock); + mlx5_core_dbg(dev, "pool->name = %s, pool->size = %d, pool->start = %d", + name, size, start); + return pool; } -static void clear_comp_irq_affinity_hint(struct mlx5_core_dev *mdev, int i) +static void irq_pool_free(struct mlx5_irq_pool *pool) { - int vecidx = MLX5_IRQ_VEC_COMP_BASE + i; struct mlx5_irq *irq; - int irqn; + unsigned long index; - irq = mlx5_irq_get(mdev, vecidx); - irqn = pci_irq_vector(mdev->pdev, vecidx); - irq_set_affinity_hint(irqn, NULL); - free_cpumask_var(irq->mask); + xa_for_each(&pool->irqs, index, irq) + irq_release(&irq->kref); + xa_destroy(&pool->irqs); + kvfree(pool); } -static int set_comp_irq_affinity_hints(struct mlx5_core_dev *mdev) +static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pf_vec) { - int nvec = mlx5_irq_get_num_comp(mdev->priv.irq_table); + struct mlx5_irq_table *table = dev->priv.irq_table; + int num_sf_ctrl_by_msix; + int num_sf_ctrl_by_sfs; + int num_sf_ctrl; int err; - int i; - for (i = 0; i < nvec; i++) { - err = set_comp_irq_affinity_hint(mdev, i); - if (err) - goto err_out; + /* init pf_pool */ + table->pf_pool = irq_pool_alloc(dev, 0, pf_vec, NULL, + MLX5_EQ_SHARE_IRQ_MIN_COMP, + MLX5_EQ_SHARE_IRQ_MAX_COMP); + if (IS_ERR(table->pf_pool)) + return PTR_ERR(table->pf_pool); + if (!mlx5_sf_max_functions(dev)) + return 0; + if (sf_vec < MLX5_IRQ_VEC_COMP_BASE_SF) { + mlx5_core_err(dev, "Not enough 
IRQs for SFs. SF may run at lower performance\n"); + return 0; } + /* init sf_ctrl_pool */ + num_sf_ctrl_by_msix = DIV_ROUND_UP(sf_vec, MLX5_COMP_EQS_PER_SF); + num_sf_ctrl_by_sfs = DIV_ROUND_UP(mlx5_sf_max_functions(dev), + MLX5_SFS_PER_CTRL_IRQ); + num_sf_ctrl = min_t(int, num_sf_ctrl_by_msix, num_sf_ctrl_by_sfs); + num_sf_ctrl = min_t(int, MLX5_IRQ_CTRL_SF_MAX, num_sf_ctrl); + table->sf_ctrl_pool = irq_pool_alloc(dev, pf_vec, num_sf_ctrl, + "mlx5_sf_ctrl", + MLX5_EQ_SHARE_IRQ_MIN_CTRL, + MLX5_EQ_SHARE_IRQ_MAX_CTRL); + if (IS_ERR(table->sf_ctrl_pool)) { + err = PTR_ERR(table->sf_ctrl_pool); + goto err_pf; + } + /* init sf_comp_pool */ + table->sf_comp_pool = irq_pool_alloc(dev, pf_vec + num_sf_ctrl, + sf_vec - num_sf_ctrl, "mlx5_sf_comp", + MLX5_EQ_SHARE_IRQ_MIN_COMP, + MLX5_EQ_SHARE_IRQ_MAX_COMP); + if (IS_ERR(table->sf_comp_pool)) { + err = PTR_ERR(table->sf_comp_pool); + goto err_sf_ctrl; + } return 0; - -err_out: - for (i--; i >= 0; i--) - clear_comp_irq_affinity_hint(mdev, i); - +err_sf_ctrl: + irq_pool_free(table->sf_ctrl_pool); +err_pf: + irq_pool_free(table->pf_pool); return err; } -static void clear_comp_irqs_affinity_hints(struct mlx5_core_dev *mdev) +static void irq_pools_destroy(struct mlx5_irq_table *table) { - int nvec = mlx5_irq_get_num_comp(mdev->priv.irq_table); - int i; - - for (i = 0; i < nvec; i++) - clear_comp_irq_affinity_hint(mdev, i); + if (table->sf_ctrl_pool) { + irq_pool_free(table->sf_comp_pool); + irq_pool_free(table->sf_ctrl_pool); + } + irq_pool_free(table->pf_pool); } -struct cpumask * -mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx) +/* irq_table API */ + +int mlx5_irq_table_init(struct mlx5_core_dev *dev) { - return irq_table->irq[vecidx].mask; + struct mlx5_irq_table *irq_table; + + if (mlx5_core_is_sf(dev)) + return 0; + + irq_table = kvzalloc(sizeof(*irq_table), GFP_KERNEL); + if (!irq_table) + return -ENOMEM; + + dev->priv.irq_table = irq_table; + return 0; } -#ifdef CONFIG_RFS_ACCEL -struct cpu_rmap 
*mlx5_irq_get_rmap(struct mlx5_irq_table *irq_table) +void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev) { - return irq_table->rmap; + if (mlx5_core_is_sf(dev)) + return; + + kvfree(dev->priv.irq_table); } -#endif -static void unrequest_irqs(struct mlx5_core_dev *dev) +int mlx5_irq_table_get_num_comp(struct mlx5_irq_table *table) { - struct mlx5_irq_table *table = dev->priv.irq_table; - int i; - - for (i = 0; i < table->nvec; i++) - free_irq(pci_irq_vector(dev->pdev, i), - &mlx5_irq_get(dev, i)->nh); + return table->pf_pool->xa_num_irqs.max - table->pf_pool->xa_num_irqs.min; } int mlx5_irq_table_create(struct mlx5_core_dev *dev) { - struct mlx5_priv *priv = &dev->priv; - struct mlx5_irq_table *table = priv->irq_table; int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ? MLX5_CAP_GEN(dev, max_num_eqs) : 1 << MLX5_CAP_GEN(dev, log_max_eq); - int nvec; + int total_vec; + int pf_vec; int err; if (mlx5_core_is_sf(dev)) return 0; - nvec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + - MLX5_IRQ_VEC_COMP_BASE; - nvec = min_t(int, nvec, num_eqs); - if (nvec <= MLX5_IRQ_VEC_COMP_BASE) - return -ENOMEM; - - table->irq = kcalloc(nvec, sizeof(*table->irq), GFP_KERNEL); - if (!table->irq) + pf_vec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + + MLX5_IRQ_VEC_COMP_BASE; + pf_vec = min_t(int, pf_vec, num_eqs); + if (pf_vec <= MLX5_IRQ_VEC_COMP_BASE) return -ENOMEM; - nvec = pci_alloc_irq_vectors(dev->pdev, MLX5_IRQ_VEC_COMP_BASE + 1, - nvec, PCI_IRQ_MSIX); - if (nvec < 0) { - err = nvec; - goto err_free_irq; - } - - table->nvec = nvec; + total_vec = pf_vec; + if (mlx5_sf_max_functions(dev)) + total_vec += MLX5_IRQ_CTRL_SF_MAX + + MLX5_COMP_EQS_PER_SF * mlx5_sf_max_functions(dev); - err = irq_set_rmap(dev); - if (err) - goto err_set_rmap; + total_vec = pci_alloc_irq_vectors(dev->pdev, MLX5_IRQ_VEC_COMP_BASE + 1, + total_vec, PCI_IRQ_MSIX); + if (total_vec < 0) + return total_vec; + pf_vec = min(pf_vec, total_vec); - err = request_irqs(dev, nvec); + err = 
irq_pools_init(dev, total_vec - pf_vec, pf_vec); if (err) - goto err_request_irqs; - - err = set_comp_irq_affinity_hints(dev); - if (err) { - mlx5_core_err(dev, "Failed to alloc affinity hint cpumask\n"); - goto err_set_affinity; - } - - return 0; + pci_free_irq_vectors(dev->pdev); -err_set_affinity: - unrequest_irqs(dev); -err_request_irqs: - irq_clear_rmap(dev); -err_set_rmap: - pci_free_irq_vectors(dev->pdev); -err_free_irq: - kfree(table->irq); return err; } void mlx5_irq_table_destroy(struct mlx5_core_dev *dev) { struct mlx5_irq_table *table = dev->priv.irq_table; - int i; if (mlx5_core_is_sf(dev)) return; - /* free_irq requires that affinity and rmap will be cleared - * before calling it. This is why there is asymmetry with set_rmap - * which should be called after alloc_irq but before request_irq. + /* There are cases where IRQs still will be in used when we reaching + * to here. Hence, making sure all the irqs are realeased. */ - irq_clear_rmap(dev); - clear_comp_irqs_affinity_hints(dev); - for (i = 0; i < table->nvec; i++) - free_irq(pci_irq_vector(dev->pdev, i), - &mlx5_irq_get(dev, i)->nh); + irq_pools_destroy(table); pci_free_irq_vectors(dev->pdev); - kfree(table->irq); +} + +int mlx5_irq_table_get_sfs_vec(struct mlx5_irq_table *table) +{ + if (table->sf_comp_pool) + return table->sf_comp_pool->xa_num_irqs.max - + table->sf_comp_pool->xa_num_irqs.min + 1; + else + return mlx5_irq_table_get_num_comp(table); } struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c index ef5f892aafad..d9c69123c1ab 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c @@ -6,7 +6,6 @@ #include "sf.h" #include "mlx5_ifc_vhca_event.h" #include "ecpf.h" -#include "vhca_event.h" #include "mlx5_core.h" #include "eswitch.h" @@ -74,26 +73,29 @@ static int 
mlx5_sf_hw_table_id_alloc(struct mlx5_sf_hw_table *table, u32 control
 					     u32 usr_sfnum)
 {
 	struct mlx5_sf_hwc_table *hwc;
+	int free_idx = -1;
 	int i;
 	hwc = mlx5_sf_controller_to_hwc(table->dev, controller);
 	if (!hwc->sfs)
 		return -ENOSPC;
-	/* Check if sf with same sfnum already exists or not. */
 	for (i = 0; i < hwc->max_fn; i++) {
+		if (!hwc->sfs[i].allocated && free_idx == -1) {
+			free_idx = i;
+			continue;
+		}
+
 		if (hwc->sfs[i].allocated && hwc->sfs[i].usr_sfnum == usr_sfnum)
 			return -EEXIST;
 	}
-	/* Find the free entry and allocate the entry from the array */
-	for (i = 0; i < hwc->max_fn; i++) {
-		if (!hwc->sfs[i].allocated) {
-			hwc->sfs[i].usr_sfnum = usr_sfnum;
-			hwc->sfs[i].allocated = true;
-			return i;
-		}
-	}
-	return -ENOSPC;
+
+	if (free_idx == -1)
+		return -ENOSPC;
+
+	hwc->sfs[free_idx].usr_sfnum = usr_sfnum;
+	hwc->sfs[free_idx].allocated = true;
+	return free_idx;
 }
 static void mlx5_sf_hw_table_id_free(struct mlx5_sf_hw_table *table, u32 controller, int id)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
index 0b6aea1e6a94..81ce13b19ee8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
@@ -5,42 +5,7 @@
 #define __MLX5_SF_H__
 #include <linux/mlx5/driver.h>
-
-static inline u16 mlx5_sf_start_function_id(const struct mlx5_core_dev *dev)
-{
-	return MLX5_CAP_GEN(dev, sf_base_id);
-}
-
-#ifdef CONFIG_MLX5_SF
-
-static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev)
-{
-	return MLX5_CAP_GEN(dev, sf);
-}
-
-static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev)
-{
-	if (!mlx5_sf_supported(dev))
-		return 0;
-	if (MLX5_CAP_GEN(dev, max_num_sf))
-		return MLX5_CAP_GEN(dev, max_num_sf);
-	else
-		return 1 << MLX5_CAP_GEN(dev, log_max_sf);
-}
-
-#else
-
-static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev)
-{
-	return false;
-}
-
-static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev
*dev)
-{
-	return 0;
-}
-
-#endif
+#include "lib/sf.h"
 #ifdef CONFIG_MLX5_SF_MANAGER
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index 2338989d4403..e8185b69ac6c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -34,6 +34,7 @@
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/vport.h>
 #include "mlx5_core.h"
+#include "mlx5_irq.h"
 #include "eswitch.h"
 static int sriov_restore_guids(struct mlx5_core_dev *dev, int vf)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
index 949879cf2092..6475ba35cf6b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c
@@ -2,6 +2,7 @@
 /* Copyright (c) 2019 Mellanox Technologies. */
 #include "dr_types.h"
+#include "dr_ste.h"
 enum dr_action_domain {
 	DR_ACTION_DOMAIN_NIC_INGRESS,
@@ -14,7 +15,8 @@ enum dr_action_domain {
 enum dr_action_valid_state {
 	DR_ACTION_STATE_ERR,
 	DR_ACTION_STATE_NO_ACTION,
-	DR_ACTION_STATE_REFORMAT,
+	DR_ACTION_STATE_ENCAP,
+	DR_ACTION_STATE_DECAP,
 	DR_ACTION_STATE_MODIFY_HDR,
 	DR_ACTION_STATE_MODIFY_VLAN,
 	DR_ACTION_STATE_NON_TERM,
@@ -29,46 +31,74 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG] = DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+
[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_DECAP] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_TAG] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_TAG] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 		},
+		[DR_ACTION_STATE_ENCAP] = {
+			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_TAG] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP,
+		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG] =
DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_TAG] = DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -80,39 +110,48 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 		[DR_ACTION_STATE_NO_ACTION] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_ENCAP] = {
 			[DR_ACTION_TYP_DROP] =
DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 		},
@@ -124,41 +163,69 @@
next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 		[DR_ACTION_STATE_NO_ACTION] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_DECAP] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
+		},
+		[DR_ACTION_STATE_ENCAP] = {
+			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] =
DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
-			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
 			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
@@ -171,44 +238,53 @@ next_action_state[DR_ACTION_DOMAIN_MAX][DR_ACTION_STATE_MAX][DR_ACTION_TYP_MAX]
 		[DR_ACTION_STATE_NO_ACTION] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] =
DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 		},
-		[DR_ACTION_STATE_REFORMAT] = {
+		[DR_ACTION_STATE_ENCAP] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
-			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 		},
 		[DR_ACTION_STATE_MODIFY_HDR] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 		},
 		[DR_ACTION_STATE_MODIFY_VLAN] = {
 			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_VLAN,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 		},
 		[DR_ACTION_STATE_NON_TERM] = {
 			[DR_ACTION_TYP_DROP] =
DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
+			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
 			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
 			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
-			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_REFORMAT,
-			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_REFORMAT,
+			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
+			[DR_ACTION_TYP_INSERT_HDR] = DR_ACTION_STATE_ENCAP,
 			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_MODIFY_VLAN,
 			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
 		},
@@ -235,6 +311,9 @@ dr_action_reformat_to_action_type(enum mlx5dr_action_reformat_type reformat_type
 	case DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L3:
 		*action_type = DR_ACTION_TYP_L2_TO_TNL_L3;
 		break;
+	case DR_ACTION_REFORMAT_TYP_INSERT_HDR:
+		*action_type = DR_ACTION_TYP_INSERT_HDR;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -454,8 +533,17 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 			break;
 		case DR_ACTION_TYP_L2_TO_TNL_L2:
 		case DR_ACTION_TYP_L2_TO_TNL_L3:
-			attr.reformat_size = action->reformat->reformat_size;
-			attr.reformat_id = action->reformat->reformat_id;
+			if (rx_rule &&
+			    !(dmn->ste_ctx->actions_caps & DR_STE_CTX_ACTION_CAP_RX_ENCAP)) {
+				mlx5dr_info(dmn, "Device doesn't support Encap on RX\n");
+				goto out_invalid_arg;
+			}
+			attr.reformat.size = action->reformat->size;
+			attr.reformat.id = action->reformat->id;
+			break;
+		case DR_ACTION_TYP_SAMPLER:
+			attr.final_icm_addr = rx_rule ?
action->sampler->rx_icm_addr :
+					      action->sampler->tx_icm_addr;
 			break;
 		case DR_ACTION_TYP_VPORT:
 			attr.hit_gvmi = action->vport->caps->vhca_gvmi;
@@ -481,6 +569,12 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher,
 			attr.vlans.headers[attr.vlans.count++] = action->push_vlan->vlan_hdr;
 			break;
+		case DR_ACTION_TYP_INSERT_HDR:
+			attr.reformat.size = action->reformat->size;
+			attr.reformat.id = action->reformat->id;
+			attr.reformat.param_0 = action->reformat->param_0;
+			attr.reformat.param_1 = action->reformat->param_1;
+			break;
 		default:
 			goto out_invalid_arg;
 		}
@@ -543,6 +637,8 @@ static unsigned int action_size[DR_ACTION_TYP_MAX] = {
 	[DR_ACTION_TYP_MODIFY_HDR] = sizeof(struct mlx5dr_action_rewrite),
 	[DR_ACTION_TYP_VPORT] = sizeof(struct mlx5dr_action_vport),
 	[DR_ACTION_TYP_PUSH_VLAN] = sizeof(struct mlx5dr_action_push_vlan),
+	[DR_ACTION_TYP_INSERT_HDR] = sizeof(struct mlx5dr_action_reformat),
+	[DR_ACTION_TYP_SAMPLER] = sizeof(struct mlx5dr_action_sampler),
 };
 static struct mlx5dr_action *
@@ -651,7 +747,7 @@ mlx5dr_action_create_mult_dest_tbl(struct mlx5dr_domain *dmn,
 			if (reformat_action) {
 				reformat_req = true;
 				hw_dests[i].vport.reformat_id =
-					reformat_action->reformat->reformat_id;
+					reformat_action->reformat->id;
 				ref_actions[num_of_ref++] = reformat_action;
 				hw_dests[i].vport.flags |= MLX5_FLOW_DEST_VPORT_REFORMAT_ID;
 			}
@@ -755,14 +851,43 @@ struct mlx5dr_action *mlx5dr_action_create_tag(u32 tag_value)
 	return action;
 }
+struct mlx5dr_action *
+mlx5dr_action_create_flow_sampler(struct mlx5dr_domain *dmn, u32 sampler_id)
+{
+	struct mlx5dr_action *action;
+	u64 icm_rx, icm_tx;
+	int ret;
+
+	ret = mlx5dr_cmd_query_flow_sampler(dmn->mdev, sampler_id,
+					    &icm_rx, &icm_tx);
+	if (ret)
+		return NULL;
+
+	action = dr_action_create_generic(DR_ACTION_TYP_SAMPLER);
+	if (!action)
+		return NULL;
+
+	action->sampler->dmn = dmn;
+	action->sampler->sampler_id = sampler_id;
+	action->sampler->rx_icm_addr = icm_rx;
+	action->sampler->tx_icm_addr = icm_tx;
+
+
refcount_inc(&dmn->refcount);
+	return action;
+}
+
 static int
 dr_action_verify_reformat_params(enum mlx5dr_action_type reformat_type,
 				 struct mlx5dr_domain *dmn,
+				 u8 reformat_param_0,
+				 u8 reformat_param_1,
 				 size_t data_sz,
 				 void *data)
 {
-	if ((!data && data_sz) || (data && !data_sz) || reformat_type >
-	    DR_ACTION_TYP_L2_TO_TNL_L3) {
+	if ((!data && data_sz) || (data && !data_sz) ||
+	    ((reformat_param_0 || reformat_param_1) &&
+	     reformat_type != DR_ACTION_TYP_INSERT_HDR) ||
+	    reformat_type > DR_ACTION_TYP_INSERT_HDR) {
 		mlx5dr_dbg(dmn, "Invalid reformat parameter!\n");
 		goto out_err;
 	}
@@ -794,6 +919,7 @@ out_err:
 static int
 dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
+				 u8 reformat_param_0, u8 reformat_param_1,
 				 size_t data_sz, void *data,
 				 struct mlx5dr_action *action)
 {
@@ -811,13 +937,14 @@ dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
 		else
 			rt = MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL;
-		ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev, rt, data_sz, data,
+		ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev, rt, 0, 0,
+						     data_sz, data,
 						     &reformat_id);
 		if (ret)
 			return ret;
-		action->reformat->reformat_id = reformat_id;
-		action->reformat->reformat_size = data_sz;
+		action->reformat->id = reformat_id;
+		action->reformat->size = data_sz;
 		return 0;
 	}
 	case DR_ACTION_TYP_TNL_L2_TO_L2:
@@ -859,6 +986,23 @@ dr_action_create_reformat_action(struct mlx5dr_domain *dmn,
 		}
 		return 0;
 	}
+	case DR_ACTION_TYP_INSERT_HDR:
+	{
+		ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev,
+						     MLX5_REFORMAT_TYPE_INSERT_HDR,
+						     reformat_param_0,
+						     reformat_param_1,
+						     data_sz, data,
+						     &reformat_id);
+		if (ret)
+			return ret;
+
+		action->reformat->id = reformat_id;
+		action->reformat->size = data_sz;
+		action->reformat->param_0 = reformat_param_0;
+		action->reformat->param_1 = reformat_param_1;
+		return 0;
+	}
 	default:
 		mlx5dr_info(dmn, "Reformat type is not supported %d\n", action->action_type);
 		return -EINVAL;
@@ -896,6 +1040,8 @@ struct mlx5dr_action
*mlx5dr_action_create_push_vlan(struct mlx5dr_domain *dmn,
 struct mlx5dr_action *
 mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 				     enum mlx5dr_action_reformat_type reformat_type,
+				     u8 reformat_param_0,
+				     u8 reformat_param_1,
 				     size_t data_sz,
 				     void *data)
 {
@@ -912,7 +1058,9 @@ mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 		goto dec_ref;
 	}
-	ret = dr_action_verify_reformat_params(action_type, dmn, data_sz, data);
+	ret = dr_action_verify_reformat_params(action_type, dmn,
+					       reformat_param_0, reformat_param_1,
+					       data_sz, data);
 	if (ret)
 		goto dec_ref;
@@ -923,6 +1071,8 @@ mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 	action->reformat->dmn = dmn;
 	ret = dr_action_create_reformat_action(dmn,
+					       reformat_param_0,
+					       reformat_param_1,
 					       data_sz,
 					       data,
 					       action);
@@ -1516,8 +1666,9 @@ int mlx5dr_action_destroy(struct mlx5dr_action *action)
 		break;
 	case DR_ACTION_TYP_L2_TO_TNL_L2:
 	case DR_ACTION_TYP_L2_TO_TNL_L3:
+	case DR_ACTION_TYP_INSERT_HDR:
 		mlx5dr_cmd_destroy_reformat_ctx((action->reformat->dmn)->mdev,
-						action->reformat->reformat_id);
+						action->reformat->id);
 		refcount_dec(&action->reformat->dmn->refcount);
 		break;
 	case DR_ACTION_TYP_MODIFY_HDR:
@@ -1525,6 +1676,9 @@ int mlx5dr_action_destroy(struct mlx5dr_action *action)
 		kfree(action->rewrite->data);
 		refcount_dec(&action->rewrite->dmn->refcount);
 		break;
+	case DR_ACTION_TYP_SAMPLER:
+		refcount_dec(&action->sampler->dmn->refcount);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
index 5970cb8fc0c0..54e1f5438bbe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
@@ -228,6 +228,36 @@ int mlx5dr_cmd_query_flow_table(struct mlx5_core_dev *dev,
 	return 0;
 }
+int mlx5dr_cmd_query_flow_sampler(struct mlx5_core_dev *dev,
+				  u32 sampler_id,
+				  u64 *rx_icm_addr,
+				  u64 *tx_icm_addr)
+{
+	u32
out[MLX5_ST_SZ_DW(query_sampler_obj_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
+	void *attr;
+	int ret;
+
+	MLX5_SET(general_obj_in_cmd_hdr, in, opcode,
+		 MLX5_CMD_OP_QUERY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_type,
+		 MLX5_GENERAL_OBJECT_TYPES_SAMPLER);
+	MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, sampler_id);
+
+	ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+	if (ret)
+		return ret;
+
+	attr = MLX5_ADDR_OF(query_sampler_obj_out, out, sampler_object);
+
+	*rx_icm_addr = MLX5_GET64(sampler_obj, attr,
+				  sw_steering_icm_address_rx);
+	*tx_icm_addr = MLX5_GET64(sampler_obj, attr,
+				  sw_steering_icm_address_tx);
+
+	return 0;
+}
+
 int mlx5dr_cmd_sync_steering(struct mlx5_core_dev *mdev)
 {
 	u32 in[MLX5_ST_SZ_DW(sync_steering_in)] = {};
@@ -460,6 +490,8 @@ int mlx5dr_cmd_destroy_flow_table(struct mlx5_core_dev *mdev,
 int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
 				   enum mlx5_reformat_ctx_type rt,
+				   u8 reformat_param_0,
+				   u8 reformat_param_1,
 				   size_t reformat_size,
 				   void *reformat_data,
 				   u32 *reformat_id)
@@ -486,8 +518,11 @@ int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
 	pdata = MLX5_ADDR_OF(packet_reformat_context_in, prctx, reformat_data);
 	MLX5_SET(packet_reformat_context_in, prctx, reformat_type, rt);
+	MLX5_SET(packet_reformat_context_in, prctx, reformat_param_0, reformat_param_0);
+	MLX5_SET(packet_reformat_context_in, prctx, reformat_param_1, reformat_param_1);
 	MLX5_SET(packet_reformat_context_in, prctx, reformat_data_size, reformat_size);
-	memcpy(pdata, reformat_data, reformat_size);
+	if (reformat_data && reformat_size)
+		memcpy(pdata, reformat_data, reformat_size);
 	err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out));
 	if (err)
@@ -706,6 +741,9 @@ int mlx5dr_cmd_set_fte(struct mlx5_core_dev *dev,
 					 fte->dest_arr[i].vport.reformat_id);
 			}
 			break;
+		case MLX5_FLOW_DESTINATION_TYPE_FLOW_SAMPLER:
+			id = fte->dest_arr[i].sampler_id;
+			break;
 		default:
 			id = fte->dest_arr[i].tir_num;
 		}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h
index 992b591bf0c5..12a8bbbf944b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h
@@ -156,6 +156,7 @@ struct mlx5dr_ste_ctx {
 	u16 (*get_byte_mask)(u8 *hw_ste_p);
 	/* Actions */
+	u32 actions_caps;
 	void (*set_actions_rx)(struct mlx5dr_domain *dmn,
 			       u8 *action_type_set,
 			       u8 *hw_ste_arr,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
index 0757a4e8540e..f1950e4968da 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.c
@@ -437,8 +437,8 @@ dr_ste_v0_set_actions_tx(struct mlx5dr_domain *dmn,
 					      attr->gvmi);
 		dr_ste_v0_set_tx_encap(last_ste,
-				       attr->reformat_id,
-				       attr->reformat_size,
+				       attr->reformat.id,
+				       attr->reformat.size,
 				       action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]);
 		/* Whenever prio_tag_required enabled, we can be sure that the
 		 * previous table (ACL) already push vlan to our packet,
@@ -1893,6 +1893,7 @@ struct mlx5dr_ste_ctx ste_ctx_v0 = {
 	.get_byte_mask = &dr_ste_v0_get_byte_mask,
 	/* Actions */
+	.actions_caps = DR_STE_CTX_ACTION_CAP_NONE,
 	.set_actions_rx = &dr_ste_v0_set_actions_rx,
 	.set_actions_tx = &dr_ste_v0_set_actions_tx,
 	.modify_field_arr_sz = ARRAY_SIZE(dr_ste_v0_action_modify_field_arr),
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
index 7466f016375c..4aaca8eb7597 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c
@@ -116,6 +116,8 @@ enum {
 	DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f,
 	DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e,
 	DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f,
+
DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_0 = 0x6f,
+	DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_1 = 0x70,
 	DR_STE_V1_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b,
 	DR_STE_V1_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c,
 	DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2 = 0x8c,
@@ -246,6 +248,12 @@ static const struct mlx5dr_ste_action_modify_field dr_ste_v1_action_modify_field
 	[MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = {
 		.hw_field = DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_2, .start = 0, .end = 15,
 	},
+	[MLX5_ACTION_IN_FIELD_OUT_EMD_31_0] = {
+		.hw_field = DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_1, .start = 0, .end = 31,
+	},
+	[MLX5_ACTION_IN_FIELD_OUT_EMD_47_32] = {
+		.hw_field = DR_STE_V1_ACTION_MDFY_FLD_CFG_HDR_0_0, .start = 0, .end = 15,
+	},
 };
 static void dr_ste_v1_set_entry_type(u8 *hw_ste_p, u8 entry_type)
@@ -361,8 +369,8 @@ static void dr_ste_v1_set_reparse(u8 *hw_ste_p)
 	MLX5_SET(ste_match_bwc_v1, hw_ste_p, reparse, 1);
 }
-static void dr_ste_v1_set_tx_encap(u8 *hw_ste_p, u8 *d_action,
-				   u32 reformat_id, int size)
+static void dr_ste_v1_set_encap(u8 *hw_ste_p, u8 *d_action,
+				u32 reformat_id, int size)
 {
 	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, action_id,
 		 DR_STE_V1_ACTION_ID_INSERT_POINTER);
@@ -374,6 +382,26 @@ static void dr_ste_v1_set_tx_encap(u8 *hw_ste_p, u8 *d_action,
 	dr_ste_v1_set_reparse(hw_ste_p);
 }
+static void dr_ste_v1_set_insert_hdr(u8 *hw_ste_p, u8 *d_action,
+				     u32 reformat_id,
+				     u8 anchor, u8 offset,
+				     int size)
+{
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action,
+		 action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER);
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, start_anchor, anchor);
+
+	/* The hardware expects here size and offset in words (2 byte) */
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, size, size / 2);
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, start_offset, offset / 2);
+
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, pointer, reformat_id);
+	MLX5_SET(ste_double_action_insert_with_ptr_v1, d_action, attributes,
+		 DR_STE_V1_ACTION_INSERT_PTR_ATTR_NONE);
+
+	dr_ste_v1_set_reparse(hw_ste_p);
+}
+
 static void dr_ste_v1_set_tx_push_vlan(u8 *hw_ste_p, u8 *d_action,
 				       u32 vlan_hdr)
 {
@@ -401,11 +429,11 @@ static void dr_ste_v1_set_rx_pop_vlan(u8 *hw_ste_p, u8 *s_action, u8 vlans_num)
 	dr_ste_v1_set_reparse(hw_ste_p);
 }
-static void dr_ste_v1_set_tx_encap_l3(u8 *hw_ste_p,
-				      u8 *frst_s_action,
-				      u8 *scnd_d_action,
-				      u32 reformat_id,
-				      int size)
+static void dr_ste_v1_set_encap_l3(u8 *hw_ste_p,
+				   u8 *frst_s_action,
+				   u8 *scnd_d_action,
+				   u32 reformat_id,
+				   int size)
 {
 	/* Remove L2 headers */
 	MLX5_SET(ste_single_action_remove_header_v1, frst_s_action, action_id,
@@ -519,9 +547,9 @@ static void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn,
 			action_sz = DR_STE_ACTION_TRIPLE_SZ;
 			allow_encap = true;
 		}
-		dr_ste_v1_set_tx_encap(last_ste, action,
-				       attr->reformat_id,
-				       attr->reformat_size);
+		dr_ste_v1_set_encap(last_ste, action,
+				    attr->reformat.id,
+				    attr->reformat.size);
 		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
 		action += DR_STE_ACTION_DOUBLE_SZ;
 	} else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) {
@@ -532,12 +560,25 @@ static void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn,
 		action_sz = DR_STE_ACTION_TRIPLE_SZ;
 		d_action = action + DR_STE_ACTION_SINGLE_SZ;
-		dr_ste_v1_set_tx_encap_l3(last_ste,
-					  action, d_action,
-					  attr->reformat_id,
-					  attr->reformat_size);
+		dr_ste_v1_set_encap_l3(last_ste,
+				       action, d_action,
+				       attr->reformat.id,
+				       attr->reformat.size);
 		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
 		action += DR_STE_ACTION_TRIPLE_SZ;
+	} else if (action_type_set[DR_ACTION_TYP_INSERT_HDR]) {
+		if (!allow_encap || action_sz < DR_STE_ACTION_DOUBLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+		dr_ste_v1_set_insert_hdr(last_ste, action,
+					 attr->reformat.id,
+					 attr->reformat.param_0,
+					 attr->reformat.param_1,
+					 attr->reformat.size);
+		action_sz -=
DR_STE_ACTION_DOUBLE_SZ;
+		action += DR_STE_ACTION_DOUBLE_SZ;
 	}
 	dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi);
@@ -616,7 +657,9 @@ static void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn,
 	}
 	if (action_type_set[DR_ACTION_TYP_CTR]) {
-		/* Counter action set after decap to exclude decaped header */
+		/* Counter action set after decap and before insert_hdr
+		 * to exclude decaped / encaped header respectively.
+		 */
 		if (!allow_ctr) {
 			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
 			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
@@ -627,6 +670,52 @@ static void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn,
 		dr_ste_v1_set_counter_id(last_ste, attr->ctr_id);
 	}
+	if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L2]) {
+		if (action_sz < DR_STE_ACTION_DOUBLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+		dr_ste_v1_set_encap(last_ste, action,
+				    attr->reformat.id,
+				    attr->reformat.size);
+		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
+		action += DR_STE_ACTION_DOUBLE_SZ;
+		allow_modify_hdr = false;
+	} else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) {
+		u8 *d_action;
+
+		if (action_sz < DR_STE_ACTION_TRIPLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1, last_ste, action);
+			action_sz = DR_STE_ACTION_TRIPLE_SZ;
+		}
+
+		d_action = action + DR_STE_ACTION_SINGLE_SZ;
+
+		dr_ste_v1_set_encap_l3(last_ste,
+				       action, d_action,
+				       attr->reformat.id,
+				       attr->reformat.size);
+		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
+		allow_modify_hdr = false;
+	} else if (action_type_set[DR_ACTION_TYP_INSERT_HDR]) {
+		/* Modify header, decap, and encap must use different STEs */
+		if (!allow_modify_hdr || action_sz < DR_STE_ACTION_DOUBLE_SZ) {
+			dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi);
+			action = MLX5_ADDR_OF(ste_mask_and_match_v1,
last_ste, action); + action_sz = DR_STE_ACTION_TRIPLE_SZ; + } + dr_ste_v1_set_insert_hdr(last_ste, action, + attr->reformat.id, + attr->reformat.param_0, + attr->reformat.param_1, + attr->reformat.size); + action_sz -= DR_STE_ACTION_DOUBLE_SZ; + action += DR_STE_ACTION_DOUBLE_SZ; + allow_modify_hdr = false; + } + dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi); dr_ste_v1_set_hit_addr(last_ste, attr->final_icm_addr, 1); } @@ -1871,6 +1960,7 @@ struct mlx5dr_ste_ctx ste_ctx_v1 = { .set_byte_mask = &dr_ste_v1_set_byte_mask, .get_byte_mask = &dr_ste_v1_get_byte_mask, /* Actions */ + .actions_caps = DR_STE_CTX_ACTION_CAP_RX_ENCAP, .set_actions_rx = &dr_ste_v1_set_actions_rx, .set_actions_tx = &dr_ste_v1_set_actions_tx, .modify_field_arr_sz = ARRAY_SIZE(dr_ste_v1_action_modify_field_arr), diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h index 67460c42a99b..f5e93fa87aff 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h @@ -89,6 +89,11 @@ enum { DR_STE_SIZE_REDUCED = DR_STE_SIZE - DR_STE_SIZE_MASK, }; +enum mlx5dr_ste_ctx_action_cap { + DR_STE_CTX_ACTION_CAP_NONE = 0, + DR_STE_CTX_ACTION_CAP_RX_ENCAP = 1 << 0, +}; + enum { DR_MODIFY_ACTION_SIZE = 8, }; @@ -118,6 +123,8 @@ enum mlx5dr_action_type { DR_ACTION_TYP_VPORT, DR_ACTION_TYP_POP_VLAN, DR_ACTION_TYP_PUSH_VLAN, + DR_ACTION_TYP_INSERT_HDR, + DR_ACTION_TYP_SAMPLER, DR_ACTION_TYP_MAX, }; @@ -261,8 +268,12 @@ struct mlx5dr_ste_actions_attr { u32 ctr_id; u16 gvmi; u16 hit_gvmi; - u32 reformat_id; - u32 reformat_size; + struct { + u32 id; + u32 size; + u8 param_0; + u8 param_1; + } reformat; struct { int count; u32 headers[MLX5DR_MAX_VLANS]; @@ -903,8 +914,17 @@ struct mlx5dr_action_rewrite { struct mlx5dr_action_reformat { struct mlx5dr_domain *dmn; - u32 reformat_id; - u32 reformat_size; + u32 id; + u32 size; + u8 param_0; + u8 param_1; +}; + 
+struct mlx5dr_action_sampler {
+	struct mlx5dr_domain *dmn;
+	u64 rx_icm_addr;
+	u64 tx_icm_addr;
+	u32 sampler_id;
 };
 
 struct mlx5dr_action_dest_tbl {
@@ -950,6 +970,7 @@ struct mlx5dr_action {
 		void *data;
 		struct mlx5dr_action_rewrite *rewrite;
 		struct mlx5dr_action_reformat *reformat;
+		struct mlx5dr_action_sampler *sampler;
 		struct mlx5dr_action_dest_tbl *dest_tbl;
 		struct mlx5dr_action_ctr *ctr;
 		struct mlx5dr_action_vport *vport;
@@ -1104,6 +1125,10 @@ int mlx5dr_cmd_query_gvmi(struct mlx5_core_dev *mdev,
 			  bool other_vport, u16 vport_number, u16 *gvmi);
 int mlx5dr_cmd_query_esw_caps(struct mlx5_core_dev *mdev,
 			      struct mlx5dr_esw_caps *caps);
+int mlx5dr_cmd_query_flow_sampler(struct mlx5_core_dev *dev,
+				  u32 sampler_id,
+				  u64 *rx_icm_addr,
+				  u64 *tx_icm_addr);
 int mlx5dr_cmd_sync_steering(struct mlx5_core_dev *mdev);
 int mlx5dr_cmd_set_fte_modify_and_vport(struct mlx5_core_dev *mdev,
 					u32 table_type,
@@ -1142,6 +1167,8 @@ int mlx5dr_cmd_query_flow_table(struct mlx5_core_dev *dev,
 				struct mlx5dr_cmd_query_flow_table_details *output);
 int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
 				   enum mlx5_reformat_ctx_type rt,
+				   u8 reformat_param_0,
+				   u8 reformat_param_1,
 				   size_t reformat_size,
 				   void *reformat_data,
 				   u32 *reformat_id);
@@ -1252,7 +1279,6 @@ struct mlx5dr_send_ring {
 	u32 tx_head;
 	void *buf;
 	u32 buf_size;
-	struct ib_wc wc[MAX_SEND_CQE];
 	u8 sync_buff[MIN_READ_SYNC];
 	struct mlx5dr_mr *sync_mr;
 	spinlock_t lock; /* Protect the data path of the send ring */
@@ -1290,6 +1316,7 @@ struct mlx5dr_cmd_flow_destination_hw_info {
 	u32 ft_num;
 	u32 ft_id;
 	u32 counter_id;
+	u32 sampler_id;
 	struct {
 		u16 num;
 		u16 vhca_id;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
index 96c39a17d026..d5926dd7e972 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -62,7 +62,7 @@ static int set_miss_action(struct mlx5_flow_root_namespace *ns,
 
 static int mlx5_cmd_dr_create_flow_table(struct mlx5_flow_root_namespace *ns,
 					 struct mlx5_flow_table *ft,
-					 unsigned int log_size,
+					 unsigned int size,
 					 struct mlx5_flow_table *next_ft)
 {
 	struct mlx5dr_table *tbl;
@@ -71,7 +71,7 @@ static int mlx5_cmd_dr_create_flow_table(struct mlx5_flow_root_namespace *ns,
 
 	if (mlx5_dr_is_fw_table(ft->flags))
 		return mlx5_fs_cmd_get_fw_cmds()->create_flow_table(ns, ft,
-								    log_size,
+								    size,
 								    next_ft);
 	flags = ft->flags;
 	/* turn off encap/decap if not supported for sw-str by fw */
@@ -97,6 +97,8 @@ static int mlx5_cmd_dr_create_flow_table(struct mlx5_flow_root_namespace *ns,
 		}
 	}
 
+	ft->max_fte = INT_MAX;
+
 	return 0;
 }
 
@@ -287,7 +289,8 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 				DR_ACTION_REFORMAT_TYP_TNL_L2_TO_L2;
 
 		tmp_action = mlx5dr_action_create_packet_reformat(domain,
-								  decap_type, 0,
+								  decap_type,
+								  0, 0, 0,
 								  NULL);
 		if (!tmp_action) {
 			err = -ENOMEM;
@@ -384,7 +387,7 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 	if (fte->action.action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
 		list_for_each_entry(dst, &fte->node.children, node.list) {
 			enum mlx5_flow_destination_type type = dst->dest_attr.type;
-			u32 ft_id;
+			u32 id;
 
 			if (num_actions == MLX5_FLOW_CONTEXT_ACTION_MAX ||
 			    num_term_actions >= MLX5_FLOW_CONTEXT_ACTION_MAX) {
@@ -422,9 +425,20 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns,
 				num_term_actions++;
 				break;
 			case MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE_NUM:
-				ft_id = dst->dest_attr.ft_num;
+				id = dst->dest_attr.ft_num;
 				tmp_action = mlx5dr_action_create_dest_table_num(domain,
-										 ft_id);
+										 id);
+				if (!tmp_action) {
+					err = -ENOMEM;
+					goto free_actions;
+				}
+				fs_dr_actions[fs_dr_num_actions++] = tmp_action;
+				term_actions[num_term_actions++].dest = tmp_action;
+				break;
+			case MLX5_FLOW_DESTINATION_TYPE_FLOW_SAMPLER:
+				id = dst->dest_attr.sampler_id;
+				tmp_action = mlx5dr_action_create_flow_sampler(domain,
+									       id);
 				if (!tmp_action) {
 					err = -ENOMEM;
 					goto free_actions;
@@ -520,9 +534,7 @@ out_err:
 }
 
 static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns,
-					     int reformat_type,
-					     size_t size,
-					     void *reformat_data,
+					     struct mlx5_pkt_reformat_params *params,
 					     enum mlx5_flow_namespace_type namespace,
 					     struct mlx5_pkt_reformat *pkt_reformat)
 {
@@ -530,7 +542,7 @@ static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns
 	struct mlx5dr_action *action;
 	int dr_reformat;
 
-	switch (reformat_type) {
+	switch (params->type) {
 	case MLX5_REFORMAT_TYPE_L2_TO_VXLAN:
 	case MLX5_REFORMAT_TYPE_L2_TO_NVGRE:
 	case MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL:
@@ -542,16 +554,21 @@ static int mlx5_cmd_dr_packet_reformat_alloc(struct mlx5_flow_root_namespace *ns
 	case MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL:
 		dr_reformat = DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L3;
 		break;
+	case MLX5_REFORMAT_TYPE_INSERT_HDR:
+		dr_reformat = DR_ACTION_REFORMAT_TYP_INSERT_HDR;
+		break;
 	default:
 		mlx5_core_err(ns->dev, "Packet-reformat not supported(%d)\n",
-			      reformat_type);
+			      params->type);
 		return -EOPNOTSUPP;
 	}
 
 	action = mlx5dr_action_create_packet_reformat(dr_domain,
 						      dr_reformat,
-						      size,
-						      reformat_data);
+						      params->param_0,
+						      params->param_1,
+						      params->size,
+						      params->data);
 	if (!action) {
 		mlx5_core_err(ns->dev, "Failed allocating packet-reformat action\n");
 		return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
index 9737565cd8d4..bbfe101d4e57 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5dr.h
@@ -26,6 +26,7 @@ enum mlx5dr_action_reformat_type {
 	DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L2,
 	DR_ACTION_REFORMAT_TYP_TNL_L3_TO_L2,
 	DR_ACTION_REFORMAT_TYP_L2_TO_TNL_L3,
+	DR_ACTION_REFORMAT_TYP_INSERT_HDR,
 };
 
 struct mlx5dr_match_parameters {
@@ -100,11 +101,16 @@ struct mlx5dr_action *mlx5dr_action_create_drop(void);
 
 struct mlx5dr_action *mlx5dr_action_create_tag(u32 tag_value);
 
 struct mlx5dr_action *
+mlx5dr_action_create_flow_sampler(struct mlx5dr_domain *dmn, u32 sampler_id);
+
+struct mlx5dr_action *
 mlx5dr_action_create_flow_counter(u32 counter_id);
 
 struct mlx5dr_action *
 mlx5dr_action_create_packet_reformat(struct mlx5dr_domain *dmn,
 				     enum mlx5dr_action_reformat_type reformat_type,
+				     u8 reformat_param_0,
+				     u8 reformat_param_1,
 				     size_t data_sz,
 				     void *data);
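
The encap and insert_hdr branches added to dr_ste_v1_set_actions_rx() and dr_ste_v1_set_actions_tx() in the dr_ste_v1.c hunks share one bookkeeping pattern: each action needs a fixed amount of action space in the current STE, and when the remaining space is too small (or a flag such as allow_encap / allow_modify_hdr forbids sharing) the code opens a new match STE with a full triple-sized budget before subtracting. The following is a minimal standalone sketch of that accounting only, not driver code. The names `ste_budget`, `ste_budget_init` and `ste_budget_reserve` are hypothetical, and the sizes 1/2/3 are illustrative units assumed for the example rather than the driver's DR_STE_ACTION_*_SZ values.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative units only (assumption); the driver defines its own
 * DR_STE_ACTION_SINGLE_SZ / DOUBLE_SZ / TRIPLE_SZ constants.
 */
enum {
	ACTION_SINGLE_SZ = 1,
	ACTION_DOUBLE_SZ = 2,
	ACTION_TRIPLE_SZ = 3,
};

/* Hypothetical model of the per-rule state tracked while placing actions */
struct ste_budget {
	int stes_used;	/* number of STEs consumed so far */
	int action_sz;	/* action space left in the current STE */
};

/* Start with one STE that has a full triple-sized action area */
static void ste_budget_init(struct ste_budget *b)
{
	b->stes_used = 1;
	b->action_sz = ACTION_TRIPLE_SZ;
}

/* Reserve `need` units for one action. A new STE is opened first when the
 * caller forces it (modelled on the !allow_encap / !allow_modify_hdr checks)
 * or when the current STE has too little space left.
 */
static void ste_budget_reserve(struct ste_budget *b, int need, bool force_new_ste)
{
	if (force_new_ste || b->action_sz < need) {
		b->stes_used++;
		b->action_sz = ACTION_TRIPLE_SZ;
	}
	b->action_sz -= need;
}
```

In this model, two double-sized actions in a row do not fit in one STE, so the second reservation opens a new one, which mirrors why the diff checks `action_sz < DR_STE_ACTION_DOUBLE_SZ` before calling dr_ste_v1_set_encap() or dr_ste_v1_set_insert_hdr().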