Age | Commit message (Collapse) | Author |
|
The current code causes problems when the unregistering netdevice could
be different then the registering one.
Since the check in mlx5_lag_netdev_event() does not allow any other
network namespace anyway, fix this by registerting the lag notifier
per init network namespace only.
Fixes: d48834f9d4b4 ("mlx5: Use dev_net netdevice notifier registrations")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Aya Levin <ayal@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On flow table creation, send the relevant flags according to what the FW
currently supports.
When FW doesn't support reformat option over SW-steering managed table,
the driver shouldn't pass this.
Fixes: 988fd6b32d07 ("net/mlx5: DR, Pass table flags at creation to lower layer")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The pool sizes represent the pool sizes in the fw. when we request
a pool size from fw, it will return the next possible group.
We track how many pools the fw has left and start requesting groups
from the big to the small.
When we start request 4k group, which doesn't exists in fw, fw
wants to allocate the next possible size, 64k, but will fail since
its exhausted. The correct smallest pool size in fw is 128 and not 4k.
Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
There is no need to reset all vf config (except link state) between
legacy and switchdev modes changes.
Also, set link state to AUTO, when legacy enabled.
Fixes: 3b83b6c2e024 ("net/mlx5e: Clear VF config when switching modes")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Set vport gvmi in the tag, only when source gvmi is set in the bit mask.
Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
When health reporters are not supported, recovery function is invoked
directly, not via devlink health reporters.
In this direct flow, the recover function input parameter was passed
incorrectly and is causing a kernel oops. This patch is fixing the input
parameter.
Following call trace is observed on rx error health reporting.
Internal error: Oops: 96000007 [#1] PREEMPT SMP
Process kworker/u16:4 (pid: 4584, stack limit = 0x00000000c9e45703)
Call trace:
mlx5e_rx_reporter_err_rq_cqe_recover+0x30/0x164 [mlx5_core]
mlx5e_health_report+0x60/0x6c [mlx5_core]
mlx5e_reporter_rq_cqe_err+0x6c/0x90 [mlx5_core]
mlx5e_rq_err_cqe_work+0x20/0x2c [mlx5_core]
process_one_work+0x168/0x3d0
worker_thread+0x58/0x3d0
kthread+0x108/0x134
Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Initialize RQ doorbell counters to zero prior to moving an RQ from RST
to RDY state. Per HW spec, when RQ is back to RDY state, the descriptor
ID on the completion is reset. The doorbell record must comply.
Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
rtnl_bridge_getlink is protected by rcu lock, so mlx5_eswitch_get_vepa
cannot take mutex lock. Two possible issues can happen:
1. User at the same time change vepa mode via RTM_SETLINK command.
2. User at the same time change the switchdev mode via devlink netlink
interface.
Case 1 cannot happen because rtnl executes one message in order.
Case 2 can happen but we do not expect user to change the switchdev mode
when changing vepa. Even if a user does it, so he will read a value
which is no longer valid.
Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Deprecate the generic TLS cap bit, use the new TX-specific
TLS cap bit instead.
Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
For a cyclic work queue, when not requesting a completion per WQE,
a single CQE might indicate the completion of several WQEs.
However, in case some WQE in the batch causes an error, then an error
completion is issued, breaking the batch, and pointing to the offending
WQE in the wqe_counter field.
Hence, WQE-specific error CQE handling (like printing, breaking, etc...)
should be performed only for the last WQE in batch.
Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events")
Fixes: fd9b4be8002c ("net/mlx5e: RX, Support multiple outstanding UMR posts")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx,
however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function
nullifies sa_ctx pointer without freeing the memory allocated,
hence the memory leak.
Fix by free SA context when the SA is released.
Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The function mlx5_fpga_esp_validate_xfrm_attrs is wrongly used
with negative negation as zero value indicates success but it
used as failure return value instead.
Fix by remove the unary not negation operator.
Fixes: 05564d0ae075 ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
free_match_list could be called when the flow table is already
locked. We need to pass this notation to tree_put_node.
It fixes the following lockdep warnning:
[ 1797.268537] ============================================
[ 1797.276837] WARNING: possible recursive locking detected
[ 1797.285101] 5.5.0-rc5+ #10 Not tainted
[ 1797.291641] --------------------------------------------
[ 1797.299917] handler10/9296 is trying to acquire lock:
[ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at:
tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.319694]
[ 1797.319694] but task is already holding lock:
[ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at:
nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
[ 1797.344707]
[ 1797.344707] other info that might help us debug this:
[ 1797.356952] Possible unsafe locking scenario:
[ 1797.356952]
[ 1797.368333] CPU0
[ 1797.373357] ----
[ 1797.378364] lock(&node->lock);
[ 1797.384222] lock(&node->lock);
[ 1797.390031]
[ 1797.390031] *** DEADLOCK ***
[ 1797.390031]
[ 1797.403003] May be due to missing lock nesting notation
[ 1797.403003]
[ 1797.414691] 3 locks held by handler10/9296:
[ 1797.421465] #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at:
tc_setup_cb_add+0x70/0x250
[ 1797.432810] #1: ffff88a030081490 (&comp->sem){++++}, at:
mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core]
[ 1797.445829] #2: ffff889ad399a0a0 (&node->lock){++++}, at:
nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
[ 1797.459913]
[ 1797.459913] stack backtrace:
[ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not
tainted 5.5.0-rc5+ #10
[ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS
2.4.3 01/17/2017
[ 1797.491480] Call Trace:
[ 1797.496701] dump_stack+0x96/0xe0
[ 1797.502864] __lock_acquire.cold.63+0xf8/0x212
[ 1797.510301] ? lockdep_hardirqs_on+0x250/0x250
[ 1797.517701] ? mark_held_locks+0x55/0xa0
[ 1797.524547] ? quarantine_put+0xb7/0x160
[ 1797.531422] ? lockdep_hardirqs_on+0x17d/0x250
[ 1797.538913] lock_acquire+0xd6/0x1f0
[ 1797.545529] ? tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.553701] down_write+0x94/0x140
[ 1797.560206] ? tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.568464] ? down_write_killable_nested+0x170/0x170
[ 1797.576925] ? del_hw_flow_group+0xde/0x1f0 [mlx5_core]
[ 1797.585629] tree_put_node+0x1d5/0x210 [mlx5_core]
[ 1797.593891] ? free_match_list.part.25+0x147/0x170 [mlx5_core]
[ 1797.603389] free_match_list.part.25+0xe0/0x170 [mlx5_core]
[ 1797.612654] _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core]
[ 1797.621838] ? lock_acquire+0xd6/0x1f0
[ 1797.629028] ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core]
[ 1797.637981] ? alloc_insert_flow_group+0x420/0x420 [mlx5_core]
[ 1797.647459] ? try_to_wake_up+0x4c7/0xc70
[ 1797.654881] ? lock_downgrade+0x350/0x350
[ 1797.662271] ? __mutex_unlock_slowpath+0xb1/0x3f0
[ 1797.670396] ? find_held_lock+0xac/0xd0
[ 1797.677540] ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
[ 1797.686467] mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
[ 1797.695134] ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core]
[ 1797.704270] ? irq_exit+0xa5/0x170
[ 1797.710764] ? retint_kernel+0x10/0x10
[ 1797.717698] ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230
[mlx5_core]
[ 1797.728708] mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core]
[ 1797.738713] ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core]
[ 1797.748384] ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core]
[ 1797.757400] mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core]
[ 1797.767665] mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core]
[ 1797.776886] ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core]
[ 1797.785562] ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core]
[ 1797.795353] __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core]
[ 1797.804558] ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0
[mlx5_core]
[ 1797.815093] ? wait_for_completion+0x260/0x260
[ 1797.823272] mlx5e_configure_flower+0xe94/0x1620 [mlx5_core]
[ 1797.832792] ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core]
[ 1797.842096] ? down_read+0x11a/0x2e0
[ 1797.849090] ? down_write+0x140/0x140
[ 1797.856142] ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core]
[ 1797.866027] tc_setup_cb_add+0x11a/0x250
[ 1797.873339] fl_hw_replace_filter+0x25e/0x320 [cls_flower]
[ 1797.882385] ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower]
[ 1797.891607] fl_change+0x1d54/0x1fb6 [cls_flower]
[ 1797.899772] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
[cls_flower]
[ 1797.910728] ? lock_downgrade+0x350/0x350
[ 1797.918187] ? __radix_tree_lookup+0xa5/0x130
[ 1797.926046] ? fl_set_key+0x1590/0x1590 [cls_flower]
[ 1797.934611] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
[cls_flower]
[ 1797.945673] tc_new_tfilter+0xcd1/0x1240
[ 1797.953138] ? tc_del_tfilter+0xb10/0xb10
[ 1797.960688] ? avc_has_perm_noaudit+0x92/0x320
[ 1797.968721] ? avc_has_perm_noaudit+0x1df/0x320
[ 1797.976816] ? avc_has_extended_perms+0x990/0x990
[ 1797.985090] ? mark_lock+0xaa/0x9e0
[ 1797.991988] ? match_held_lock+0x1b/0x240
[ 1797.999457] ? match_held_lock+0x1b/0x240
[ 1798.006859] ? find_held_lock+0xac/0xd0
[ 1798.014045] ? symbol_put_addr+0x40/0x40
[ 1798.021317] ? rcu_read_lock_sched_held+0xd0/0xd0
[ 1798.029460] ? tc_del_tfilter+0xb10/0xb10
[ 1798.036810] rtnetlink_rcv_msg+0x4d5/0x620
[ 1798.044236] ? rtnl_bridge_getlink+0x460/0x460
[ 1798.052034] ? lockdep_hardirqs_on+0x250/0x250
[ 1798.059837] ? match_held_lock+0x1b/0x240
[ 1798.067146] ? find_held_lock+0xac/0xd0
[ 1798.074246] netlink_rcv_skb+0xc6/0x1f0
[ 1798.081339] ? rtnl_bridge_getlink+0x460/0x460
[ 1798.089104] ? netlink_ack+0x440/0x440
[ 1798.096061] netlink_unicast+0x2d4/0x3b0
[ 1798.103189] ? netlink_attachskb+0x3f0/0x3f0
[ 1798.110724] ? _copy_from_iter_full+0xda/0x370
[ 1798.118415] netlink_sendmsg+0x3ba/0x6a0
[ 1798.125478] ? netlink_unicast+0x3b0/0x3b0
[ 1798.132705] ? netlink_unicast+0x3b0/0x3b0
[ 1798.139880] sock_sendmsg+0x94/0xa0
[ 1798.146332] ____sys_sendmsg+0x36c/0x3f0
[ 1798.153251] ? copy_msghdr_from_user+0x165/0x230
[ 1798.160941] ? kernel_sendmsg+0x30/0x30
[ 1798.167738] ___sys_sendmsg+0xeb/0x150
[ 1798.174411] ? sendmsg_copy_msghdr+0x30/0x30
[ 1798.181649] ? lock_downgrade+0x350/0x350
[ 1798.188559] ? rcu_read_lock_sched_held+0xd0/0xd0
[ 1798.196239] ? __fget+0x21d/0x320
[ 1798.202335] ? do_dup2+0x2a0/0x2a0
[ 1798.208499] ? lock_downgrade+0x350/0x350
[ 1798.215366] ? __fget_light+0xd6/0xf0
[ 1798.221808] ? syscall_trace_enter+0x369/0x5d0
[ 1798.229112] __sys_sendmsg+0xd3/0x160
[ 1798.235511] ? __sys_sendmsg_sock+0x60/0x60
[ 1798.242478] ? syscall_trace_enter+0x233/0x5d0
[ 1798.249721] ? syscall_slow_exit_work+0x280/0x280
[ 1798.257211] ? do_syscall_64+0x1e/0x2e0
[ 1798.263680] do_syscall_64+0x72/0x2e0
[ 1798.269950] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Register the dev_net notifier and allow the per-net notifier to follow
the device into different namespace.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Minor conflict in mlx5 because changes happened to code that has
moved meanwhile.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When TCP out-of-order is identified (unexpected tcp seq mismatch), driver
analyzes the packet and decides what handling should it get:
1. go to accelerated path (to be encrypted in HW),
2. go to regular xmit path (send w/o encryption),
3. drop.
Packets marked with skb->decrypted by the TLS stack in the TX flow skips
SW encryption, and rely on the HW offload.
Verify that such packets are never sent un-encrypted on the wire.
Add a WARN to catch such bugs, and prefer dropping the packet in these cases.
Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The call to tx_post_resync_params() is done earlier in the flow,
the post of the control WQEs is unnecessarily repeated. Remove it.
Fixes: 700ec4974240 ("net/mlx5e: kTLS, Fix missing SQ edge fill")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
There are the following cases:
1. Packet ends before start marker: bypass offload.
2. Packet starts before start marker and ends after it: drop,
not supported, breaks contract with kernel.
3. packet ends before tls record info starts: drop,
this packet was already acknowledged and its record info
was released.
Add the above as comment in code.
Mind possible wraparounds of the TCP seq, replace the simple comparison
with a call to the TCP before() method.
In addition, remove logic that handles negative sync_len values,
as it became impossible.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS
mode, when switching to it first time, VF can go up independently to
his representor, which is not expected.
Perform clearing of VF config when switching modes and set link state
to AUTO as default value. Also, when switching to OFFLOADS mode set
link state to DOWN, which allow VF link state to be controlled by its
REP.
Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore")
Fixes: 556b9d16d3f5 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Use raw_smp_processor_id instead of smp_processor_id() otherwise we will
get the following trace in debug-kernel:
BUG: using smp_processor_id() in preemptible [00000000] code: devlink
caller is dr_create_cq.constprop.2+0x31d/0x970 [mlx5_core]
Call Trace:
dump_stack+0x9a/0xf0
debug_smp_processor_id+0x1f3/0x200
dr_create_cq.constprop.2+0x31d/0x970
genl_family_rcv_msg+0x5fd/0x1170
genl_rcv_msg+0xb8/0x160
netlink_rcv_skb+0x11e/0x340
Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Since the implementation relies on limiting the VF transmit rate to
simulate ingress rate limiting, and since either uplink representor or
ecpf are not associated with a VF, we limit the rate limit configuration
for those ports.
Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The current code handles only counters that attached to dest, we still
have the cases where we have counter on non-dest, like over drop etc.
Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add the upcoming ConnectX-7 device ID.
Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The pool sizes represent the pool sizes in the fw. when we request
a pool size from fw, it will return the next possible group.
We track how many pools the fw has left and start requesting groups
from the big to the small.
When we start request 4k group, which doesn't exists in fw, fw
wants to allocate the next possible size, 64k, but will fail since
its exhausted. The correct smallest pool size in fw is 128 and not 4k.
Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Extend stats group array of uplink representor with all stats that are
available for PF in legacy mode, besides ipsec and TLS which are not
supported.
Don't output vport stats for uplink representor because they are already
handled by 802_3 group (with different names: {tx|rx}_{bytes|packets}_phy).
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Q counters were disabled for all types of representors to prevent an issue
where there is not enough resources to init q counters for 127 representor
instances. Enable q counters only for uplink representors to support
"rx_out_of_buffer", "rx_if_down_packets" counters in ethtool.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In order to support all of the supported stats that are available in legacy
mode for switchdev uplink representors, convert rep stats infrastructure to
reuse struct mlx5e_stats_grp that is already used when device is in legacy
mode. Refactor rep code to use array of mlx5e_stats_grp
structures (constructed using macros provided by stats infra) to
fill/update stats, instead of fixed hardcoded set of values. This approach
allows to easily extend representors with new stats types.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Don't copy all of the stats groups used for mlx5e ethernet NIC profile,
have a separate stats groups for IPoIB with the set of the needed stats
only.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Convert stats groups array to array of "stats group" pointers to allow
sharing and individual selection of groups per profile as illustrated in
the next patches.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
|
|
Introduce new macros to declare stats callbacks and groups, for better
code reuse and for individual groups selection per profile which will be
introduced in next patches.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
|
|
Attach stats groups array to the profiles and make the stats utility
functions (get_num, update, fill, fill_strings) generic and use the
profile->stats_grps rather the hardcoded NIC stats groups.
This will allow future extension to have per profile stats groups.
In this patch mlx5e NIC and IPoIB will still share the same stats
groups.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
|
|
Clean up the code and allows to call uplink rep init/cleanup
from different location later.
To be used later for a new uplink representor mode.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Allow connecting SW steering source table to a lower/same level
destination table.
Lifting this limitation is required to support Connection Tracking.
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Modify header supports ADD/SET and from this patch
also COPY. Copy allows to copy header fields and
metadata.
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Modify set actions are not supported on both tx
and rx, added a check for that.
Also refactored the code in a way that every modify
action has his own functions, this needed so in the
future we could add copy action more smoothly.
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In the flowtables offload all the devices in the flowtables
share the same flow_block. An offload rule will be installed on
all the devices. This scenario is not correct.
It is no problem if there are only two devices in the flowtable,
The rule with ingress and egress on the same device can be reject
by driver.
But more than two devices in the flowtable will install the wrong
rules on hardware.
For example:
Three devices in a offload flowtables: dev_a, dev_b, dev_c
A rule ingress from dev_a and egress to dev_b:
The rule will install on device dev_a.
The rule will try to install on dev_b but failed for ingress
and egress on the same device.
The rule will install on dev_c. This is not correct.
The flowtables offload avoid this case through restricting the ingress dev
with FLOW_DISSECTOR_KEY_META.
So the mlx5e driver also should support the FLOW_DISSECTOR_KEY_META parse.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Fix the following sparse warning:
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c:35:20: warning: symbol 'ESW_POOLS' was not declared. Should it be static?
Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Acked-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
since mlx5 hardware can segment correctly TSO packets on VXLAN over VLAN
topologies, CPU usage can improve significantly if we enable tunnel
offloads in dev->vlan_features, like it was done in the past with other
NIC drivers (e.g. mlx4, be2net and ixgbe).
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Use "%zu" for size_t. Seen on ARM allmodconfig:
drivers/net/ethernet/mellanox/mlx5/core/wq.c: In function 'mlx5_wq_cyc_wqe_dump':
include/linux/kern_levels.h:5:18: warning: format '%ld' expects argument of type 'long int', but argument 5 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events")
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Increase the number of chains and priorities to support
the whole range available in tc.
We use unmanaged tables and ignore flow level to create more
tables than what we declared to fs_core steering, and we manage
the connections between the tables themselves.
To support that we need FW with ignore_flow_level capability.
Otherwise the old behaviour will be used, where we are limited
by the number of levels we declared (4 chains, 16 prios).
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
To support the entire chain and prio range (32bit + 16bit),
instead of a using a static array of chains/prios of limited size, create
them dynamically, and use a rhashtable to search for existing chains/prio
combinations.
This will be used in next patch to actually increase the number using
unamanged tables support and ignore flow level capability.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Before changing the chain from original chain to ft offload chain,
make sure user doesn't actually use chains.
While here, normalize the prio range to that which we support.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
FT chain is defined as the next chain after tc.
To prepare for next patches that will increase the number of tc
chains available at runtime, use a getter function to get this
value.
The define is still used in static fs_core allocation,
to calculate the number of chains. This static allocation
will be used if the relevant capabilities won't be available
to support dynamic chains.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Exclude the last n entries for an autogrouped flow table.
Reserving entries at the end of the FT will ensure that this FG will be
the last to be evaluated. This will be used in the next patch to create
a miss group enabling custom actions on FT miss.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
If user sets ignore flow level flag on a rule, that rule can point to
a flow table of any level, including those with levels equal or less
than the level of the flow table it is added on.
This with unamanged tables will be used to create a FDB chain/prio
hierarchy much larger than currently supported level range.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Currently, Most of the steering tree is statically declared ahead of time,
with steering prios instances allocated for each fdb chain to assign max
number of levels for each of them. This allows fs_core to manage the
connections and levels of the flow tables hierarcy to prevent loops, but
restricts us with the number of supported chains and priorities.
Introduce unmananged flow tables, allowing the user to manage the flow
table connections. A unamanged table is detached from the fs_core flow
table hierarcy, and is only connected back to the hierarchy by explicit
FTEs forward actions.
This will be used together with firmware that supports ignoring the flow
table levels to increase the number of supported chains and prios.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
This merge syncs with mlx5-next latest HW bits and layout updates for next
features, in addition one patch that improves
mlx5_create_auto_grouped_flow_table() API across all mlx5 users.
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
net/mlx5e: Add discard counters per priority
net/mlx5e: Expose FEC feilds and related capability bit
net/mlx5: Add mlx5_ifc definitions for connection tracking support
net/mlx5: Add copy header action struct layout
net/mlx5: Expose resource dump register mapping
net/mlx5: Add structures and defines for MIRC register
net/mlx5: Read MCAM register groups 1 and 2
net/mlx5: Add structures layout for new MCAM access reg groups
net/mlx5: Expose vDPA emulation device capabilities
net/mlx5: Add Virtio Emulation related device capabilities
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param
which already carries the max_fte, prio and flags memebers, and is
used the same in similar mlx5_create_flow_table() function.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add counters that count (per priority) the number of received
packets that dropped due to lack of buffers on a physical port. If
this counter is increasing, it implies that the adapter is
congested and cannot absorb the traffic coming from the network.
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
On load, Driver caches MCAM (Management Capabilities Mask Register)
registers. in addition to the only MCAM register group (0) the driver
already reads, here we add support for reading groups 1 and 2.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|