summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
AgeCommit message (Collapse)Author
2019-11-20net/mlx5e: Add missing capability bit check for IP-in-IPMarina Varshaver
Device that doesn't support IP-in-IP offloads has to filter csum and gso offload support, otherwise kernel will conclude that device is capable of offloading csum and gso for IP-in-IP tunnels and that might result in IP-in-IP tunnel not functioning. Fixes: 25948b87dda2 ("net/mlx5e: Support TSO and TX checksum offloads for IP-in-IP") Signed-off-by: Marina Varshaver <marinav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-10-18net/mlx5e: kTLS, Limit DUMP wqe sizeTariq Toukan
HW expects the data size in DUMP WQEs to be up to MTU. Make sure they are in range. We elevate the frag page refcount by 'n-1', in addition to the one obtained in tx_sync_info_get(), having an overall of 'n' references. We bulk increments by using a single page_ref_add() command, to optimize perfermance. The refcounts are released one by one, by the corresponding completions. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-10-18net/mlx5e: Tx, Fix assumption of single WQEBB of NOP in cleanup flowTariq Toukan
Cited patch removed the assumption only in datapath. Here we remove it also form control/cleanup flow. Fixes: 9ab0233728ca ("net/mlx5e: Tx, Don't implicitly assume SKB-less wqe has one WQEBB") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-07Merge tag 'mlx5-updates-2019-09-05' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-09-05 1) Allover mlx5 cleanups 2) Added port congestion counters to ethtool stats: Add 3 counters per priority to ethtool using PPCNT: 2.1) rx_prio[p]_buf_discard - the number of packets discarded by device due to lack of per host receive buffers 2.2) rx_prio[p]_cong_discard - the number of packets discarded by device due to per host congestion 2.3) rx_prio[p]_marked - the number of packets ECN marked by device due to per host congestion ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add the ability to use unaligned chunks in the AF_XDP umem. By relaxing where the chunks can be placed, it allows to use an arbitrary buffer size and place whenever there is a free address in the umem. Helps more seamless DPDK AF_XDP driver integration. Support for i40e, ixgbe and mlx5e, from Kevin and Maxim. 2) Addition of a wakeup flag for AF_XDP tx and fill rings so the application can wake up the kernel for rx/tx processing which avoids busy-spinning of the latter, useful when app and driver is located on the same core. Support for i40e, ixgbe and mlx5e, from Magnus and Maxim. 3) bpftool fixes for printf()-like functions so compiler can actually enforce checks, bpftool build system improvements for custom output directories, and addition of 'bpftool map freeze' command, from Quentin. 4) Support attaching/detaching XDP programs from 'bpftool net' command, from Daniel. 5) Automatic xskmap cleanup when AF_XDP socket is released, and several barrier/{read,write}_once fixes in AF_XDP code, from Björn. 6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf inclusion as well as libbpf versioning improvements, from Andrii. 7) Several new BPF kselftests for verifier precision tracking, from Alexei. 8) Several BPF kselftest fixes wrt endianess to run on s390x, from Ilya. 9) And more BPF kselftest improvements all over the place, from Stanislav. 10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub. 11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan. 12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni. 13) Small optimization in arm64 JIT to spare 1 insns for BPF_MOD, from Jerin. 14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar. 15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari, Peter, Wei, Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05net/mlx5e: Remove unnecessary clear_bit()sMaxim Mikityanskiy
Don't clear MLX5E_SQ_STATE_ENABLED on error in mlx5e_open_txqsq and mlx5e_open_icosq, because it's not set there, and is 0 by default. Fixes: acc6c5953af1 ("net/mlx5e: Split open/close channels to stages") Fixes: 9d18b5144a0a ("net/mlx5e: Split open/close ICOSQ into stages") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-30net/mlx5e: Move local var definition into ifdef blockVlad Buslov
New local variable "struct flow_block_offload *f" was added to mlx5e_setup_tc() in recent rtnl lock removal patches. The variable is used in code that is only compiled when CONFIG_MLX5_ESWITCH is enabled. This results compilation warning about unused variable when CONFIG_MLX5_ESWITCH is not set. Move the variable definition into eswitch-specific code block from the beginning of mlx5e_setup_tc() function. Fixes: c9f14470d048 ("net: sched: add API for registering unlocked offload block callbacks") Reported-by: tanhuazhong <tanhuazhong@huawei.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-28net/mlx5e: Support TSO and TX checksum offloads for IP-in-IPMarina Varshaver
tunnels Add TX offloads support for IP-in-IP tunneled packets by reporting the needed netdev features. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Signed-off-by: Avihu Hagag <avihuh@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-28net/mlx5e: Improve stateless offload capability checkMarina Varshaver
Use generic function for checking tunnel stateless offload capability instead of separate macros. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-28net/mlx5e: Support LAG TX port affinity distributionMaxim Mikityanskiy
When the VF LAG is in use, round-robin the TX affinity of channels among the different ports, if supported by the firmware. Create a set of TISes per port, while doing round-robin of the channels over the different sets. Let all SQs of a channel share the same set of TISes. If lag_tx_port_affinity HCA cap bit is supported, num_lag_ports > 1 and we aren't the LACP owner (PF in the regular use), assign the affinities, otherwise use tx_affinity == 0 in TIS context to let the FW assign the affinities itself. The TISes of the LACP owner are mapped only to the native physical port. For VFs, the starting port for round-robin is determined by its vhca_id, because a VF may have only one channel if attached to a single-core VM. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-28net/mlx5e: Expose new function for TIS destroy loopTariq Toukan
For better modularity and code sharing. Function internal change to be introduced in the next patches. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-26net: sched: add API for registering unlocked offload block callbacksVlad Buslov
Extend struct flow_block_offload with "unlocked_driver_cb" flag to allow registering and unregistering block hardware offload callbacks that do not require caller to hold rtnl lock. Extend tcf_block with additional lockeddevcnt counter that is incremented for each non-unlocked driver callback attached to device. This counter is necessary to conditionally obtain rtnl lock before calling hardware callbacks in following patches. Register mlx5 tc block offload callbacks as "unlocked". Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-22net/mlx5e: Add mlx5e HV VHCA stats agentEran Ben Elisha
HV VHCA stats agent is responsible on running a preiodic rx/tx packets/bytes stats update. Currently the supported format is version MLX5_HV_VHCA_STATS_VERSION. Block ID 1 is dedicated for statistics data transfer from the VF to the PF. The reporter fetch the statistics data from all opened channels, fill it in a buffer and send it to mlx5_hv_vhca_write_agent. As the stats layer should include some metadata per block (sequence and offset), the HV VHCA layer shall modify the buffer before actually send it over block 1. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-20net/mlx5e: Report and recover from CQE with error on RQAya Levin
Add support for report and recovery from error on completion on RQ by setting the queue back to ready state. Handle only errors with a syndrome indicating the RQ might enter error state and could be recovered. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Report and recover from rx timeoutAya Levin
Add support for report and recovery from rx timeout. On driver open we post NOP work request on the rx channels to trigger napi in order to fillup the rx rings. In case napi wasn't scheduled due to a lost interrupt, perform EQ recovery. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Report and recover from CQE error on ICOSQAya Levin
Add support for report and recovery from error on completion on ICOSQ. Deactivate RQ and flush, then deactivate ICOSQ. Set the queue back to ready state (firmware) and reset the ICOSQ and the RQ (software resources). Finally, activate the ICOSQ and the RQ. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Split open/close ICOSQ into stagesAya Levin
Align ICOSQ open/close behaviour with RQ and SQ. Split open flow into open and activate where open handles creation and activate enables the queue. Do a symmetric thing in close flow: split into close and deactivate. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Add support to rx reporter diagnoseAya Levin
Add rx reporter, which supports diagnose call-back. Diagnostics output include: information common to all RQs: RQ type, RQ size, RQ stride size, CQ size and CQ stride size. In addition advertise information per RQ and its related icosq and attached CQ. $ devlink health diagnose pci/0000:00:0b.0 reporter rx Common config: RQ: type: 2 stride size: 2048 size: 8 CQ: stride size: 64 size: 1024 RQs: channel ix: 0 rqn: 4308 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1 CQ: cqn: 1032 HW status: 0 channel ix: 1 rqn: 4313 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1 CQ: cqn: 1036 HW status: 0 channel ix: 2 rqn: 4318 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1 CQ: cqn: 1040 HW status: 0 channel ix: 3 rqn: 4323 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1 CQ: cqn: 1044 HW status: 0 $ devlink health diagnose pci/0000:00:0b.0 reporter rx -jp { "Common config": { "RQ": { "type": 2, "stride size": 2048, "size": 8 }, "CQ": { "stride size": 64, "size": 1024 } }, "RQs": [ { "channel ix": 0, "rqn": 4308, "HW state": 1, "SW state": 3, "posted WQEs": 7, "cc": 7, "ICOSQ HW state": 1, "CQ": { "cqn": 1032, "HW status": 0 } },{ "channel ix": 1, "rqn": 4313, "HW state": 1, "SW state": 3, "posted WQEs": 7, "cc": 7, "ICOSQ HW state": 1, "CQ": { "cqn": 1036, "HW status": 0 } },{ "channel ix": 2, "rqn": 4318, "HW state": 1, "SW state": 3, "posted WQEs": 7, "cc": 7, "ICOSQ HW state": 1, "CQ": { "cqn": 1040, "HW status": 0 } },{ "channel ix": 3, "rqn": 4323, "HW state": 1, "SW state": 3, "posted WQEs": 7, "cc": 7, "ICOSQ HW state": 1, "CQ": { "cqn": 1044, "HW status": 0 } } ] } Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Add helper functions for reporter's basicsAya Levin
Introduce helper functions for create and destroy reporters and update channels. In the following patch, rx reporter is added and it will use these helpers too. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Change naming convention for reporter's functionsAya Levin
Change from mlx5e_tx_reporter_* to mlx5e_reporter_tx_*. In the following patches in the set rx reporter is added, the new naming convention is more uniformed. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-20net/mlx5e: Rename reporter header fileAya Levin
Rename reporter.h -> health.h so patches in the set can use it for health related functionality. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Merge conflict of mlx5 resolved using instructions in merge commit 9566e650bf7fdf58384bb06df634f7531ca3a97e. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeupMagnus Karlsson
This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new ndo provides the same functionality as before but with the addition of a new flags field that is used to specifiy if Rx, Tx or both should be woken up. The previous ndo only woke up Tx, as implied by the name. The i40e and ixgbe drivers (which are all the supported ones) are updated with this new interface. This new ndo will be used by the new need_wakeup functionality of XDP sockets that need to be able to wake up both Rx and Tx driver processing. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-08net/mlx5e: Fix error flow of CQE recovery on tx reporterAya Levin
CQE recovery function begins with test and set of recovery bit. Add an error flow which ensures clearing of this bit when leaving the recovery function, to allow further recoveries to take place. This allows removal of clearing recovery bit on sq activate. Fixes: de8650a82071 ("net/mlx5e: Add tx reporter support") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Just minor overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-05net/mlx5e: always initialize frag->last_in_pageQian Cai
The commit 069d11465a80 ("net/mlx5e: RX, Enhance legacy Receive Queue memory scheme") introduced an undefined behaviour below due to "frag->last_in_page" is only initialized in mlx5e_init_frags_partition() when, if (next_frag.offset + frag_info[f].frag_stride > PAGE_SIZE) or after bailed out the loop, for (i = 0; i < mlx5_wq_cyc_get_size(&rq->wqe.wq); i++) As the result, there could be some "frag" have uninitialized value of "last_in_page". Later, get_frag() obtains those "frag" and check "frag->last_in_page" in mlx5e_put_rx_frag() and triggers the error during boot. Fix it by always initializing "frag->last_in_page" to "false" in mlx5e_init_frags_partition(). UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:325:12 load of value 170 is not a valid value for type 'bool' (aka '_Bool') Call trace: dump_backtrace+0x0/0x264 show_stack+0x20/0x2c dump_stack+0xb0/0x104 __ubsan_handle_load_invalid_value+0x104/0x128 mlx5e_handle_rx_cqe+0x8e8/0x12cc [mlx5_core] mlx5e_poll_rx_cq+0xca8/0x1a94 [mlx5_core] mlx5e_napi_poll+0x17c/0xa30 [mlx5_core] net_rx_action+0x248/0x940 __do_softirq+0x350/0x7b8 irq_exit+0x200/0x26c __handle_domain_irq+0xc8/0x128 gic_handle_irq+0x138/0x228 el1_irq+0xb8/0x140 arch_cpu_idle+0x1a4/0x348 do_idle+0x114/0x1b0 cpu_startup_entry+0x24/0x28 rest_init+0x1ac/0x1dc arch_call_rest_init+0x10/0x18 start_kernel+0x4d4/0x57c Fixes: 069d11465a80 ("net/mlx5e: RX, Enhance legacy Receive Queue memory scheme") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01net/mlx5e: Set tx reporter only on successful creationAya Levin
When failing to create tx reporter, don't set the reporter's pointer. Creating a reporter is not mandatory for driver load, avoid garbage/error pointer. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01net/mlx5e: Tx, Soften inline mode VLAN dependenciesTariq Toukan
If capable, use zero inline mode in TX WQE for non-VLAN packets. For VLAN ones, keep the enforcement of at least L2 inline mode, unless the WQE VLAN insertion offload cap is on. Performance: Tested single core packet rate of 64Bytes. NIC: ConnectX-5 CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz pktgen: Before: 12.46 Mpps After: 14.65 Mpps (+17.5%) XDP_TX: The MPWQE flow is not affected, as it already has this optimization. So we test with priv-flag xdp_tx_mpwqe: off. Before: 9.90 Mpps After: 10.20 Mpps (+3%) Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Tested-by: Noam Stolero <noams@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-29net/mlx5e: Change flow flags type to unsigned longVlad Buslov
To remove dependency on rtnl lock and allow concurrent modification of 'flags' field of tc flow structure, change flow flag type to unsigned long and use atomic bit ops for reading and changing the flags. Implement auxiliary functions for setting, resetting and getting specific flag, and for checking most often used flag values. Always set flags with smp_mb__before_atomic() to ensure that all mlx5e_tc_flow are updated before concurrent readers can read new flags value. Rearrange all code paths to actually set flow->rule[] pointers before setting the OFFLOADED flag. On read side, use smp_mb__after_atomic() when accessing flags to ensure that offload-related flow fields are only read after the flags. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-29net/mlx5e: Avoid warning print when not requiredSaeed Mahameed
When disabling CQE compression in favor of time-stamping, don't show a warning when CQE compression is already disabled. Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-29net/mlx5e: Print a warning when LRO feature is dropped or not allowedHuy Nguyen
When user enables LRO via ethtool and if the RQ mode is legacy, mlx5e_fix_features drops the request without any explanation. Add netdev_warn to cover this case. Fixes: 6c3a823e1e9c ("net/mlx5e: RX, Remove HW LRO support in legacy RQ") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-25net/mlx5e: Fix wrong max num channels indicationTariq Toukan
No XSK support in the enhanced IPoIB driver and representors. Add a profile property to specify this, and enhance the logic that calculates the max number of channels to take it into account. Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-11Merge tag 'mlx5-fixes-2019-07-11' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2019-07-11 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v4.15 ('net/mlx5e: IPoIB, Add error path in mlx5_rdma_setup_rn') For -stable v5.1 ('net/mlx5e: Fix port tunnel GRE entropy control') ('net/mlx5e: Rx, Fix checksum calculation for new hardware') ('net/mlx5e: Fix return value from timeout recover function') ('net/mlx5e: Fix error flow in tx reporter diagnose') For -stable v5.2 ('net/mlx5: E-Switch, Fix default encap mode') Conflict note: This pull request will produce a small conflict when merged with net-next. In drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c Take the hunk from net and replace: esw_offloads_steering_init(esw, vf_nvports, total_nvports); with: esw_offloads_steering_init(esw); ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-11net/mlx5e: Fix unused variable warning when CONFIG_MLX5_ESWITCH is offSaeed Mahameed
In mlx5e_setup_tc "priv" variable is not being used if CONFIG_MLX5_ESWITCH is off, one way to fix this is to actually use it. mlx5e_setup_tc_mqprio also needs the "priv" variable and it extracts it on its own. We can simply pass priv to mlx5e_setup_tc_mqprio instead of netdev and avoid extracting the priv var, which will also resolve the compiler warning. Fixes: 4e95bc268b91 ("net: flow_offload: add flow_block_cb_setup_simple()") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> CC: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-11net/mlx5e: Rx, Fix checksum calculation for new hardwareSaeed Mahameed
CQE checksum full mode in new HW, provides a full checksum of rx frame. Covering bytes starting from eth protocol up to last byte in the received frame (frame_size - ETH_HLEN), as expected by the stack. Fixing up skb->csum by the driver is not required in such case. This fix is to avoid wrong checksum calculation in drivers which already support the new hardware with the new checksum mode. Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-09net: flow_offload: rename tc_cls_flower_offload to flow_cls_offloadPablo Neira Ayuso
And any other existing fields in this structure that refer to tc. Specifically: * tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule(). * TC_CLSFLOWER_* to FLOW_CLS_*. * tc_cls_common_offload to tc_cls_common_offload. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09drivers: net: use flow block APIPablo Neira Ayuso
This patch updates flow_block_cb_setup_simple() to use the flow block API. Several drivers are also adjusted to use it. This patch introduces the per-driver list of flow blocks to account for blocks that are already in use. Remove tc_block_offload alias. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09net: flow_offload: add flow_block_cb_setup_simple()Pablo Neira Ayuso
Most drivers do the same thing to set up the flow block callbacks, this patch adds a helper function to do this. This preparation patch reduces the number of changes to adapt the existing drivers to use the flow block callback API. This new helper function takes a flow block list per-driver, which is set to NULL until this driver list is used. This patch also introduces the flow_block_command and flow_block_binder_type enumerations, which are renamed to use FLOW_BLOCK_* in follow up patches. There are three definitions (aliases) in order to reduce the number of updates in this patch, which go away once drivers are fully adapted to use this flow block API. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08net: core: page_pool: add user refcnt and reintroduce page_pool_destroyIvan Khoronzhuk
Jesper recently removed page_pool_destroy() (from driver invocation) and moved shutdown and free of page_pool into xdp_rxq_info_unreg(), in-order to handle in-flight packets/pages. This created an asymmetry in drivers create/destroy pairs. This patch reintroduce page_pool_destroy and add page_pool user refcnt. This serves the purpose to simplify drivers error handling as driver now drivers always calls page_pool_destroy() and don't need to track if xdp_rxq_info_reg_mem_model() was unsuccessful. This could be used for a special cases where a single RX-queue (with a single page_pool) provides packets for two net_device'es, and thus needs to register the same page_pool twice with two xdp_rxq_info structures. This patch is primarily to ease API usage for drivers. The recently merged netsec driver, actually have a bug in this area, which is solved by this API change. This patch is a modified version of Ivan Khoronzhuk's original patch. Link: https://lore.kernel.org/netdev/20190625175948.24771-2-ivan.khoronzhuk@linaro.org/ Fixes: 5c67bf0ec4d0 ("net: netsec: Use page_pool API") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net/mlx5e: Add kTLS TX HW offload supportTariq Toukan
Add support for transmit side kernel-TLS acceleration. Offload the crypto encryption to HW. Per TLS connection: - Use a separate TIS to maintain the HW context. - Use a separate encryption key. - Maintain static and progress HW contexts by posting the proper WQEs at creation time, or upon resync. - Use a special DUMP opcode to replay the previous frags and sync the HW context. To make sure the SQ is able to serve an xmit request, increase SQ stop room to cover: - static params WQE, - progress params WQE, and - resync DUMP per frag. Currently supporting TLS 1.2, and key size 128bit. Tested over SimX simulator. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net/mlx5e: Re-work TIS creation functionsTariq Toukan
Let the EN TIS creation function (mlx5e_create_tis) be responsible for applying common mdev related fields. Other specific fields must be set by the caller and passed within the inbox. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net/mlx5e: Tx, Unconstify SQ stop roomTariq Toukan
Use an SQ field for stop_room, and use the larger value only if TLS is supported. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net/mlx5e: Move helper functions to a new txrx datapath headerTariq Toukan
Take datapath helper functions to a new header file en/txrx.h. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge tag 'mlx5-updates-2019-07-04-v2' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-update-2019-07-04 This series adds mlx5 support for devlink fw versions query. 1) Implement the required low level firmware commands 2) Implement the devlink knobs and callbacks for fw versions query. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-04Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Misc updates from mlx5-next branch: 1) Add the required HW definitions and structures for upcoming TLS support. 2) Add support for MCQI and MCQS hardware registers for fw version query. 3) Added hardware bits and structures definitions for sub-functions 4) Small code cleanup and improvement for PF pci driver. 5) Bluefield (ECPF) updates and refactoring for better E-Switch management on ECPF embedded CPU NIC: 5.1) Consolidate querying eswitch number of VFs 5.2) Register event handler at the correct E-Switch init stage 5.3) Setup PF's inline mode and vlan pop when the ECPF is the E-Swtich manager ( the host PF is basically a VF ). 5.4) Handle Vport UC address changes in switchdev mode. 6) Cleanup the rep and netdev reference when unloading IB rep. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> i# All conflicts fixed but you are still merging.
2019-07-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-07-03 The following pull-request contains BPF updates for your *net-next* tree. There is a minor merge conflict in mlx5 due to 8960b38932be ("linux/dim: Rename externally used net_dim members") which has been pulled into your tree in the meantime, but resolution seems not that bad ... getting current bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed just so he's aware of the resolution below: ** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c: <<<<<<< HEAD static int mlx5e_open_cq(struct mlx5e_channel *c, struct dim_cq_moder moder, struct mlx5e_cq_param *param, struct mlx5e_cq *cq) ======= int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder, struct mlx5e_cq_param *param, struct mlx5e_cq *cq) >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497 Resolution is to take the second chunk and rename net_dim_cq_moder into dim_cq_moder. Also the signature for mlx5e_open_cq() in ... drivers/net/ethernet/mellanox/mlx5/core/en.h +977 ... and in mlx5e_open_xsk() ... drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64 ... needs the same rename from net_dim_cq_moder into dim_cq_moder. ** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c: <<<<<<< HEAD int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix)); struct dim_cq_moder icocq_moder = {0, 0}; struct net_device *netdev = priv->netdev; struct mlx5e_channel *c; unsigned int irq; ======= struct net_dim_cq_moder icocq_moder = {0, 0}; >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497 Take the second chunk and rename net_dim_cq_moder into dim_cq_moder as well. Let me know if you run into any issues. Anyway, the main changes are: 1) Long-awaited AF_XDP support for mlx5e driver, from Maxim. 2) Addition of two new per-cgroup BPF hooks for getsockopt and setsockopt along with a new sockopt program type which allows more fine-grained pass/reject settings for containers. Also add a sock_ops callback that can be selectively enabled on a per-socket basis and is executed for every RTT to help tracking TCP statistics, both features from Stanislav. 3) Follow-up fix from loops in precision tracking which was not propagating precision marks and as a result verifier assumed that some branches were not taken and therefore wrongly removed as dead code, from Alexei. 4) Fix BPF cgroup release synchronization race which could lead to a double-free if a leaf's cgroup_bpf object is released and a new BPF program is attached to the one of ancestor cgroups in parallel, from Roman. 5) Support for bulking XDP_TX on veth devices which improves performance in some cases by around 9%, from Toshiaki. 6) Allow for lookups into BPF devmap and improve feedback when calling into bpf_redirect_map() as lookup is now performed right away in the helper itself, from Toke. 7) Add support for fq's Earliest Departure Time to the Host Bandwidth Manager (HBM) sample BPF program, from Lawrence. 8) Various cleanups and minor fixes all over the place from many others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03net/mlx5: mlx5_core_create_cq() enhancementsYishai Hadas
Enhance mlx5_core_create_cq() to get the command out buffer from the callers to let them use the output. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-07-01net/mlx5: E-Switch, Refactor eswitch SR-IOV interfaceBodong Wang
Devlink eswitch mode is not necessarily related to SR-IOV, e.g, ECPF can be at offload mode when SR-IOV is not enabled. Rename the interface and eswitch mode names to decouple from SR-IOV, and cleanup eswitch messages accordingly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-07-01net/mlx5: Handle host PF vport mac/guid for ECPFBodong Wang
When ECPF is eswitch manager, it has the privilege to query and configure the mac and node guid of host PF. While vport number of host PF is 0, the vport command should be issued with other_vport set in this case as the cmd is issued by ECPF vport(0xfffe). Add a specific function to query own vport mac. Low level functions are used by vport manager to query/modify any vport mac and node guid. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-28net/mlx5e: Don't refresh TIRs when updating representor SQsGavi Teitz
Refreshing TIRs is done in order to update the TIRs with the current state of SQs in the transport domain, so that the TIRs can filter out undesired self-loopback packets based on the source SQ of the packet. Representor TIRs will only receive packets that originate from their associated vport, due to dedicated steering, and therefore will never receive self-loopback packets, whose source vport will be the vport of the E-Switch manager, and therefore not the vport associated with the representor. As such, it is not necessary to refresh the representors' TIRs, since self-loopback packets can't reach them. Since representors only exist in switchdev mode, and there is no scenario in which a representor will exist in the transport domain alongside a non-representor, it is not necessary to refresh the transport domain's TIRs upon changing the state of a representor's queues. Therefore, do not refresh TIRs upon such a change. Achieve this by adding an update_rx callback to the mlx5e_profile, which refreshes TIRs for non-representors and does nothing for representors, and replace instances of mlx5e_refresh_tirs() upon changing the state of the queues with update_rx(). Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>