summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx5
AgeCommit message (Collapse)Author
2020-03-24net/mlx5e: Do not recover from a non-fatal syndromeAya Levin
For non-fatal syndromes like LOCAL_LENGTH_ERR, recovery shouldn't be triggered. In these scenarios, the RQ is not actually in ERR state. This misleads the recovery flow which assumes that the RQ is really in error state and no more completions arrive, causing crashes on bad page state. Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24net/mlx5e: Fix ICOSQ recovery flow with Striding RQAya Levin
In striding RQ mode, the buffers of an RX WQE are first prepared and posted to the HW using a UMR WQEs via the ICOSQ. We maintain the state of these in-progress WQEs in the RQ SW struct. In the flow of ICOSQ recovery, the corresponding RQ is not in error state, hence: - The buffers of the in-progress WQEs must be released and the RQ metadata should reflect it. - Existing RX WQEs in the RQ should not be affected. For this, wrap the dealloc of the in-progress WQEs in a function, and use it in the ICOSQ recovery flow instead of mlx5e_free_rx_descs(). Fixes: be5323c8379f ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24net/mlx5e: Fix missing reset of SW metadata in Striding RQ resetAya Levin
When resetting the RQ (moving RQ state from RST to RDY), the driver resets the WQ's SW metadata. In striding RQ mode, we maintain a field that reflects the actual expected WQ head (including in progress WQEs posted to the ICOSQ). It was mistakenly not reset together with the WQ. Fix this here. Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24net/mlx5e: Enhance ICOSQ WQE info fieldsAya Levin
Add number of WQEBBs (WQE's Basic Block) to WQE info struct. Set the number of WQEBBs on WQE post, and increment the consumer counter (cc) on completion. In case of error completions, the cc was mistakenly not incremented, keeping a gap between cc and pc (producer counter). This failed the recovery flow on the ICOSQ from a CQE error which timed-out waiting for the cc and pc to meet. Fixes: be5323c8379f ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failureLeon Romanovsky
The cap_mask1 isn't protected by field_select and not listed among RW fields, but it is required to be written to properly initialize ports in IB virtualization mode. Link: https://lore.kernel.org/linux-rdma/88bab94d2fd72f3145835b4518bc63dda587add6.camel@redhat.com Fixes: ab118da4c10a ("net/mlx5: Don't write read-only fields in MODIFY_HCA_VPORT_CONTEXT command") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05net/mlx5: Clear LAG notifier pointer after unregisterEli Cohen
After returning from unregister_netdevice_notifier_dev_net(), set the notifier_call field to NULL so successive call to mlx5_lag_add() will function as expected. Fixes: 7907f23adc18 ("net/mlx5: Implement RoCE LAG feature") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05net/mlx5e: Fix endianness handling in pedit maskSebastian Hense
The mask value is provided as 64 bit and has to be casted in either 32 or 16 bit. On big endian systems the wrong half was casted which resulted in an all zero mask. Fixes: 2b64beba0251 ("net/mlx5e: Support header re-write of partial fields in TC pedit offload") Signed-off-by: Sebastian Hense <sebastian.hense1@ibm.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05net/mlx5e: kTLS, Fix wrong value in record tracker enumTariq Toukan
Fix to match the HW spec: TRACKING state is 1, SEARCHING is 2. No real issue for now, as these values are not currently used. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05net/mlx5e: kTLS, Fix TCP seq off-by-1 issue in TX resync flowTariq Toukan
We have an off-by-1 issue in the TCP seq comparison. The last sequence number that belongs to the TCP packet's payload is not "start_seq + len", but one byte before it. Fix it so the 'ends_before' is evaluated properly. This fixes a bug that results in error completions in the kTLS HW offload flows. Fixes: ffbd9ca94e2e ("net/mlx5e: kTLS, Fix corner-case checks in TX resync flow") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-05net/mlx5: DR, Fix postsend actions write lengthHamdan Igbaria
Fix the send info write length to be (actions x action) size in bytes. Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-27mlx5: register lag notifier for init network namespace onlyJiri Pirko
The current code causes problems when the unregistering netdevice could be different then the registering one. Since the check in mlx5_lag_netdev_event() does not allow any other network namespace anyway, fix this by registerting the lag notifier per init network namespace only. Fixes: d48834f9d4b4 ("mlx5: Use dev_net netdevice notifier registrations") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Aya Levin <ayal@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18net/mlx5: DR, Handle reformat capability over sw-steering tablesErez Shitrit
On flow table creation, send the relevant flags according to what the FW currently supports. When FW doesn't support reformat option over SW-steering managed table, the driver shouldn't pass this. Fixes: 988fd6b32d07 ("net/mlx5: DR, Pass table flags at creation to lower layer") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix lowest FDB pool sizePaul Blakey
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Don't clear the whole vf config when switching modesDmytro Linkin
There is no need to reset all vf config (except link state) between legacy and switchdev modes changes. Also, set link state to AUTO, when legacy enabled. Fixes: 3b83b6c2e024 ("net/mlx5e: Clear VF config when switching modes") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: DR, Fix matching on vport gvmiHamdan Igbaria
Set vport gvmi in the tag, only when source gvmi is set in the bit mask. Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Fix crash in recovery flow without devlink reporterAya Levin
When health reporters are not supported, recovery function is invoked directly, not via devlink health reporters. In this direct flow, the recover function input parameter was passed incorrectly and is causing a kernel oops. This patch is fixing the input parameter. Following call trace is observed on rx error health reporting. Internal error: Oops: 96000007 [#1] PREEMPT SMP Process kworker/u16:4 (pid: 4584, stack limit = 0x00000000c9e45703) Call trace: mlx5e_rx_reporter_err_rq_cqe_recover+0x30/0x164 [mlx5_core] mlx5e_health_report+0x60/0x6c [mlx5_core] mlx5e_reporter_rq_cqe_err+0x6c/0x90 [mlx5_core] mlx5e_rq_err_cqe_work+0x20/0x2c [mlx5_core] process_one_work+0x168/0x3d0 worker_thread+0x58/0x3d0 kthread+0x108/0x134 Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDYAya Levin
Initialize RQ doorbell counters to zero prior to moving an RQ from RST to RDY state. Per HW spec, when RQ is back to RDY state, the descriptor ID on the completion is reset. The doorbell record must comply. Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reported-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepaHuy Nguyen
rtnl_bridge_getlink is protected by rcu lock, so mlx5_eswitch_get_vepa cannot take mutex lock. Two possible issues can happen: 1. User at the same time change vepa mode via RTM_SETLINK command. 2. User at the same time change the switchdev mode via devlink netlink interface. Case 1 cannot happen because rtnl executes one message in order. Case 2 can happen but we do not expect user to change the switchdev mode when changing vepa. Even if a user does it, so he will read a value which is no longer valid. Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: Deprecate usage of generic TLS HW capability bitTariq Toukan
Deprecate the generic TLS cap bit, use the new TX-specific TLS cap bit instead. Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5e: TX, Error completion is for last WQE in batchTariq Toukan
For a cyclic work queue, when not requesting a completion per WQE, a single CQE might indicate the completion of several WQEs. However, in case some WQE in the batch causes an error, then an error completion is issued, breaking the batch, and pointing to the offending WQE in the wqe_counter field. Hence, WQE-specific error CQE handling (like printing, breaking, etc...) should be performed only for the last WQE in batch. Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events") Fixes: fd9b4be8002c ("net/mlx5e: RX, Support multiple outstanding UMR posts") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctxRaed Salem
SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx, however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function nullifies sa_ctx pointer without freeing the memory allocated, hence the memory leak. Fix by free SA context when the SA is released. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: IPsec, Fix esp modify function attributeRaed Salem
The function mlx5_fpga_esp_validate_xfrm_attrs is wrongly used with negative negation as zero value indicates success but it used as failure return value instead. Fix by remove the unary not negation operator. Fixes: 05564d0ae075 ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: Fix deadlock in fs_coreMaor Gottlieb
free_match_list could be called when the flow table is already locked. We need to pass this notation to tree_put_node. It fixes the following lockdep warnning: [ 1797.268537] ============================================ [ 1797.276837] WARNING: possible recursive locking detected [ 1797.285101] 5.5.0-rc5+ #10 Not tainted [ 1797.291641] -------------------------------------------- [ 1797.299917] handler10/9296 is trying to acquire lock: [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at: tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.319694] [ 1797.319694] but task is already holding lock: [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.344707] [ 1797.344707] other info that might help us debug this: [ 1797.356952] Possible unsafe locking scenario: [ 1797.356952] [ 1797.368333] CPU0 [ 1797.373357] ---- [ 1797.378364] lock(&node->lock); [ 1797.384222] lock(&node->lock); [ 1797.390031] [ 1797.390031] *** DEADLOCK *** [ 1797.390031] [ 1797.403003] May be due to missing lock nesting notation [ 1797.403003] [ 1797.414691] 3 locks held by handler10/9296: [ 1797.421465] #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at: tc_setup_cb_add+0x70/0x250 [ 1797.432810] #1: ffff88a030081490 (&comp->sem){++++}, at: mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core] [ 1797.445829] #2: ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.459913] [ 1797.459913] stack backtrace: [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not tainted 5.5.0-rc5+ #10 [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 [ 1797.491480] Call Trace: [ 1797.496701] dump_stack+0x96/0xe0 [ 1797.502864] __lock_acquire.cold.63+0xf8/0x212 [ 1797.510301] ? lockdep_hardirqs_on+0x250/0x250 [ 1797.517701] ? mark_held_locks+0x55/0xa0 [ 1797.524547] ? quarantine_put+0xb7/0x160 [ 1797.531422] ? lockdep_hardirqs_on+0x17d/0x250 [ 1797.538913] lock_acquire+0xd6/0x1f0 [ 1797.545529] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.553701] down_write+0x94/0x140 [ 1797.560206] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.568464] ? down_write_killable_nested+0x170/0x170 [ 1797.576925] ? del_hw_flow_group+0xde/0x1f0 [mlx5_core] [ 1797.585629] tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.593891] ? free_match_list.part.25+0x147/0x170 [mlx5_core] [ 1797.603389] free_match_list.part.25+0xe0/0x170 [mlx5_core] [ 1797.612654] _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core] [ 1797.621838] ? lock_acquire+0xd6/0x1f0 [ 1797.629028] ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core] [ 1797.637981] ? alloc_insert_flow_group+0x420/0x420 [mlx5_core] [ 1797.647459] ? try_to_wake_up+0x4c7/0xc70 [ 1797.654881] ? lock_downgrade+0x350/0x350 [ 1797.662271] ? __mutex_unlock_slowpath+0xb1/0x3f0 [ 1797.670396] ? find_held_lock+0xac/0xd0 [ 1797.677540] ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.686467] mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.695134] ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core] [ 1797.704270] ? irq_exit+0xa5/0x170 [ 1797.710764] ? retint_kernel+0x10/0x10 [ 1797.717698] ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230 [mlx5_core] [ 1797.728708] mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core] [ 1797.738713] ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core] [ 1797.748384] ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core] [ 1797.757400] mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core] [ 1797.767665] mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core] [ 1797.776886] ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core] [ 1797.785562] ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core] [ 1797.795353] __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core] [ 1797.804558] ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0 [mlx5_core] [ 1797.815093] ? wait_for_completion+0x260/0x260 [ 1797.823272] mlx5e_configure_flower+0xe94/0x1620 [mlx5_core] [ 1797.832792] ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core] [ 1797.842096] ? down_read+0x11a/0x2e0 [ 1797.849090] ? down_write+0x140/0x140 [ 1797.856142] ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core] [ 1797.866027] tc_setup_cb_add+0x11a/0x250 [ 1797.873339] fl_hw_replace_filter+0x25e/0x320 [cls_flower] [ 1797.882385] ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower] [ 1797.891607] fl_change+0x1d54/0x1fb6 [cls_flower] [ 1797.899772] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.910728] ? lock_downgrade+0x350/0x350 [ 1797.918187] ? __radix_tree_lookup+0xa5/0x130 [ 1797.926046] ? fl_set_key+0x1590/0x1590 [cls_flower] [ 1797.934611] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.945673] tc_new_tfilter+0xcd1/0x1240 [ 1797.953138] ? tc_del_tfilter+0xb10/0xb10 [ 1797.960688] ? avc_has_perm_noaudit+0x92/0x320 [ 1797.968721] ? avc_has_perm_noaudit+0x1df/0x320 [ 1797.976816] ? avc_has_extended_perms+0x990/0x990 [ 1797.985090] ? mark_lock+0xaa/0x9e0 [ 1797.991988] ? match_held_lock+0x1b/0x240 [ 1797.999457] ? match_held_lock+0x1b/0x240 [ 1798.006859] ? find_held_lock+0xac/0xd0 [ 1798.014045] ? symbol_put_addr+0x40/0x40 [ 1798.021317] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.029460] ? tc_del_tfilter+0xb10/0xb10 [ 1798.036810] rtnetlink_rcv_msg+0x4d5/0x620 [ 1798.044236] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.052034] ? lockdep_hardirqs_on+0x250/0x250 [ 1798.059837] ? match_held_lock+0x1b/0x240 [ 1798.067146] ? find_held_lock+0xac/0xd0 [ 1798.074246] netlink_rcv_skb+0xc6/0x1f0 [ 1798.081339] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.089104] ? netlink_ack+0x440/0x440 [ 1798.096061] netlink_unicast+0x2d4/0x3b0 [ 1798.103189] ? netlink_attachskb+0x3f0/0x3f0 [ 1798.110724] ? _copy_from_iter_full+0xda/0x370 [ 1798.118415] netlink_sendmsg+0x3ba/0x6a0 [ 1798.125478] ? netlink_unicast+0x3b0/0x3b0 [ 1798.132705] ? netlink_unicast+0x3b0/0x3b0 [ 1798.139880] sock_sendmsg+0x94/0xa0 [ 1798.146332] ____sys_sendmsg+0x36c/0x3f0 [ 1798.153251] ? copy_msghdr_from_user+0x165/0x230 [ 1798.160941] ? kernel_sendmsg+0x30/0x30 [ 1798.167738] ___sys_sendmsg+0xeb/0x150 [ 1798.174411] ? sendmsg_copy_msghdr+0x30/0x30 [ 1798.181649] ? lock_downgrade+0x350/0x350 [ 1798.188559] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.196239] ? __fget+0x21d/0x320 [ 1798.202335] ? do_dup2+0x2a0/0x2a0 [ 1798.208499] ? lock_downgrade+0x350/0x350 [ 1798.215366] ? __fget_light+0xd6/0xf0 [ 1798.221808] ? syscall_trace_enter+0x369/0x5d0 [ 1798.229112] __sys_sendmsg+0xd3/0x160 [ 1798.235511] ? __sys_sendmsg_sock+0x60/0x60 [ 1798.242478] ? syscall_trace_enter+0x233/0x5d0 [ 1798.249721] ? syscall_slow_exit_work+0x280/0x280 [ 1798.257211] ? do_syscall_64+0x1e/0x2e0 [ 1798.263680] do_syscall_64+0x72/0x2e0 [ 1798.269950] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-27mlx5: Use dev_net netdevice notifier registrationsJiri Pirko
Register the dev_net notifier and allow the per-net notifier to follow the device into different namespace. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel pathTariq Toukan
When TCP out-of-order is identified (unexpected tcp seq mismatch), driver analyzes the packet and decides what handling should it get: 1. go to accelerated path (to be encrypted in HW), 2. go to regular xmit path (send w/o encryption), 3. drop. Packets marked with skb->decrypted by the TLS stack in the TX flow skips SW encryption, and rely on the HW offload. Verify that such packets are never sent un-encrypted on the wire. Add a WARN to catch such bugs, and prefer dropping the packet in these cases. Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5e: kTLS, Remove redundant posts in TX resync flowTariq Toukan
The call to tx_post_resync_params() is done earlier in the flow, the post of the control WQEs is unnecessarily repeated. Remove it. Fixes: 700ec4974240 ("net/mlx5e: kTLS, Fix missing SQ edge fill") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5e: kTLS, Fix corner-case checks in TX resync flowTariq Toukan
There are the following cases: 1. Packet ends before start marker: bypass offload. 2. Packet starts before start marker and ends after it: drop, not supported, breaks contract with kernel. 3. packet ends before tls record info starts: drop, this packet was already acknowledged and its record info was released. Add the above as comment in code. Mind possible wraparounds of the TCP seq, replace the simple comparison with a call to the TCP before() method. In addition, remove logic that handles negative sync_len values, as it became impossible. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5e: Clear VF config when switching modesDmytro Linkin
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS mode, when switching to it first time, VF can go up independently to his representor, which is not expected. Perform clearing of VF config when switching modes and set link state to AUTO as default value. Also, when switching to OFFLOADS mode set link state to DOWN, which allow VF link state to be controlled by its REP. Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore") Fixes: 556b9d16d3f5 ("net/mlx5: Clear VF's configuration on disabling SRIOV") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: DR, use non preemptible call to get the current cpu numberErez Shitrit
Use raw_smp_processor_id instead of smp_processor_id() otherwise we will get the following trace in debug-kernel: BUG: using smp_processor_id() in preemptible [00000000] code: devlink caller is dr_create_cq.constprop.2+0x31d/0x970 [mlx5_core] Call Trace: dump_stack+0x9a/0xf0 debug_smp_processor_id+0x1f3/0x200 dr_create_cq.constprop.2+0x31d/0x970 genl_family_rcv_msg+0x5fd/0x1170 genl_rcv_msg+0xb8/0x160 netlink_rcv_skb+0x11e/0x340 Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: E-Switch, Prevent ingress rate configuration of uplink repEli Cohen
Since the implementation relies on limiting the VF transmit rate to simulate ingress rate limiting, and since either uplink representor or ecpf are not associated with a VF, we limit the rate limit configuration for those ports. Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: DR, Enable counter on non-fwd-dest objectsErez Shitrit
The current code handles only counters that attached to dest, we still have the cases where we have counter on non-dest, like over drop etc. Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: Update the list of the PCI supported devicesMeir Lichtinger
Add the upcoming ConnectX-7 device ID. Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: Fix lowest FDB pool sizePaul Blakey
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Enable all available stats for uplink repsVlad Buslov
Extend stats group array of uplink representor with all stats that are available for PF in legacy mode, besides ipsec and TLS which are not supported. Don't output vport stats for uplink representor because they are already handled by 802_3 group (with different names: {tx|rx}_{bytes|packets}_phy). Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Create q counters on uplink representorsVlad Buslov
Q counters were disabled for all types of representors to prevent an issue where there is not enough resources to init q counters for 127 representor instances. Enable q counters only for uplink representors to support "rx_out_of_buffer", "rx_if_down_packets" counters in ethtool. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infraVlad Buslov
In order to support all of the supported stats that are available in legacy mode for switchdev uplink representors, convert rep stats infrastructure to reuse struct mlx5e_stats_grp that is already used when device is in legacy mode. Refactor rep code to use array of mlx5e_stats_grp structures (constructed using macros provided by stats infra) to fill/update stats, instead of fixed hardcoded set of values. This approach allows to easily extend representors with new stats types. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: IPoIB, use separate stats groupsSaeed Mahameed
Don't copy all of the stats groups used for mlx5e ethernet NIC profile, have a separate stats groups for IPoIB with the set of the needed stats only. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Convert stats groups array to array of group pointersSaeed Mahameed
Convert stats groups array to array of "stats group" pointers to allow sharing and individual selection of groups per profile as illustrated in the next patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22net/mlx5e: Declare stats groups via macroSaeed Mahameed
Introduce new macros to declare stats callbacks and groups, for better code reuse and for individual groups selection per profile which will be introduced in next patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22net/mlx5e: Profile specific stats groupsSaeed Mahameed
Attach stats groups array to the profiles and make the stats utility functions (get_num, update, fill, fill_strings) generic and use the profile->stats_grps rather the hardcoded NIC stats groups. This will allow future extension to have per profile stats groups. In this patch mlx5e NIC and IPoIB will still share the same stats groups. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22net/mlx5e: Move uplink rep init/cleanup code into own functionsRoi Dayan
Clean up the code and allows to call uplink rep init/cleanup from different location later. To be used later for a new uplink representor mode. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5: DR, Allow connecting flow table to a lower/same level tableYevgeny Kliteynik
Allow connecting SW steering source table to a lower/same level destination table. Lifting this limitation is required to support Connection Tracking. Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5: DR, Modify header copy supportHamdan Igbaria
Modify header supports ADD/SET and from this patch also COPY. Copy allows to copy header fields and metadata. Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5: DR, Modify set action limitation extensionHamdan Igbaria
Modify set actions are not supported on both tx and rx, added a check for that. Also refactored the code in a way that every modify action has his own functions, this needed so in the future we could add copy action more smoothly. Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Add mlx5e_flower_parse_meta supportwenxu
In the flowtables offload all the devices in the flowtables share the same flow_block. An offload rule will be installed on all the devices. This scenario is not correct. It is no problem if there are only two devices in the flowtable, The rule with ingress and egress on the same device can be reject by driver. But more than two devices in the flowtable will install the wrong rules on hardware. For example: Three devices in a offload flowtables: dev_a, dev_b, dev_c A rule ingress from dev_a and egress to dev_b: The rule will install on device dev_a. The rule will try to install on dev_b but failed for ingress and egress on the same device. The rule will install on dev_c. This is not correct. The flowtables offload avoid this case through restricting the ingress dev with FLOW_DISSECTOR_KEY_META. So the mlx5e driver also should support the FLOW_DISSECTOR_KEY_META parse. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5: make the symbol 'ESW_POOLS' staticChen Wandun
Fix the following sparse warning: drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c:35:20: warning: symbol 'ESW_POOLS' was not declared. Should it be static? Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Acked-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: allow TSO on VXLAN over VLAN topologiesDavide Caratti
since mlx5 hardware can segment correctly TSO packets on VXLAN over VLAN topologies, CPU usage can improve significantly if we enable tunnel offloads in dev->vlan_features, like it was done in the past with other NIC drivers (e.g. mlx4, be2net and ixgbe). Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Fix printk format warningOlof Johansson
Use "%zu" for size_t. Seen on ARM allmodconfig: drivers/net/ethernet/mellanox/mlx5/core/wq.c: In function 'mlx5_wq_cyc_wqe_dump': include/linux/kern_levels.h:5:18: warning: format '%ld' expects argument of type 'long int', but argument 5 has type 'size_t' {aka 'unsigned int'} [-Wformat=] Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events") Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: E-Switch, Increase number of chains and prioritiesPaul Blakey
Increase the number of chains and priorities to support the whole range available in tc. We use unmanaged tables and ignore flow level to create more tables than what we declared to fs_core steering, and we manage the connections between the tables themselves. To support that we need FW with ignore_flow_level capability. Otherwise the old behaviour will be used, where we are limited by the number of levels we declared (4 chains, 16 prios). Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>