Age | Commit message (Collapse) | Author |
|
Set hairpin table size to the corret size, based on the groups that
would be created in it. Groups are laid out on the table such that a
group occupies a range of entries in the table. This implies that the
group ranges should have correspondence to the table they are laid upon.
The patch cited below made group 1's size to grow hence causing
overflow of group range laid on the table.
Fixes: a795d8db2a6d ("net/mlx5e: Support RSS for IP-in-IP and IPv6 tunneled packets")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
In case a reporter exists, error message is logged only to the devlink
tracer. The devlink tracer is a visibility utility only, which user can
choose not to monitor.
After cited patch, 3rd party monitoring tools that tracks these error
message will no longer find them in dmesg, causing a regression.
With this patch, error messages are also logged into the dmesg.
Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
After disabling resources necessary for XSK (the XDP program, channels,
XSK queues), use synchronize_rcu to wait until the XSK wakeup function
finishes, before freeing the resources.
Suspend XSK wakeups during switching channels. If the XDP program is
being removed, synchronize_rcu before closing the old channels to allow
XSK wakeup to complete.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191217162023.16011-3-maximmi@mellanox.com
|
|
Pull networking fixes from David Miller:
1) More jumbo frame fixes in r8169, from Heiner Kallweit.
2) Fix bpf build in minimal configuration, from Alexei Starovoitov.
3) Use after free in slcan driver, from Jouni Hogander.
4) Flower classifier port ranges don't work properly in the HW offload
case, from Yoshiki Komachi.
5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin.
6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk.
7) Fix flow dissection in dsa TX path, from Alexander Lobakin.
8) Stale syncookie timestampe fixes from Guillaume Nault.
[ Did an evil merge to silence a warning introduced by this pull - Linus ]
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
r8169: fix rtl_hw_jumbo_disable for RTL8168evl
net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()
r8169: add missing RX enabling for WoL on RTL8125
vhost/vsock: accept only packets with the right dst_cid
net: phy: dp83867: fix hfs boot in rgmii mode
net: ethernet: ti: cpsw: fix extra rx interrupt
inet: protect against too small mtu values.
gre: refetch erspan header from skb->data after pskb_may_pull()
pppoe: remove redundant BUG_ON() check in pppoe_pernet
tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
tcp: tighten acceptance of ACKs not matching a child socket
tcp: fix rejected syncookies due to stale timestamps
lpc_eth: kernel BUG on remove
tcp: md5: fix potential overestimation of TCP option space
net: sched: allow indirect blocks to bind to clsact in TC
net: core: rename indirect block ingress cb function
net-sysfs: Call dev_hold always in netdev_queue_add_kobject
net: dsa: fix flow dissection on Tx path
net/tls: Fix return values to avoid ENOTSUPP
net: avoid an indirect call in ____sys_recvmsg()
...
|
|
Add a missing value in translation of PTYS ext_eth_proto_oper to its
corresponding speed. When ext_eth_proto_oper bit 10 is set, ethtool
shows unknown speed. With this fix, ethtool shows speed is 100G as
expected.
Fixes: a08b4ed1373d ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
When the user changes prio2buffer mapping while global pause is
enabled, mlx5 driver incorrectly sets all active buffers
(buffer that has at least one priority mapped) to lossy.
Solution:
If global pause is enabled, set all the active buffers to lossless
in prio2buffer command.
Also, add error message when buffer size is not enough to meet
xoff threshold.
Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
ipv6_stub uses the ip6_dst_lookup function to allow other modules to
perform IPv6 lookups. However, this function skips the XFRM layer
entirely.
All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the
ip_route_output_key and ip_route_output helpers) for their IPv4 lookups,
which calls xfrm_lookup_route(). This patch fixes this inconsistent
behavior by switching the stub to ip6_dst_lookup_flow, which also calls
xfrm_lookup_route().
This requires some changes in all the callers, as these two functions
take different arguments and have different return types.
Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If IPV6 is not set and CONFIG_MLX5_ESWITCH is y,
building fails:
drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:322:5: error: redefinition of mlx5e_tc_tun_create_header_ipv6
int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:7:0:
drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:67:1: note: previous definition of mlx5e_tc_tun_create_header_ipv6 was here
mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use #ifdef to guard this, also move mlx5e_route_lookup_ipv6
to cleanup unused warning.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: e689e998e102 ("net/mlx5e: TC, Stub out ipv6 tun create header function")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When code reaches the "out" label, n is guaranteed to be valid so we can
unconditionally call neigh_release.
Also change the label to release_neigh to better reflect the fact that
we unconditionally free the neighbour and also match other labels
convention.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Improve mlx5e_route_lookup_ipv6 function structure by avoiding #ifdef then
return -EOPNOTSUPP in the middle of the function code.
To do so, we stub out mlx5e_tc_tun_create_header_ipv6 which is the only
caller of this helper function to avoid calling it altogether
when ipv6 is compiled out, which should also cleanup some compiler
warnings of unused variables.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Be sure to release the neighbour in case of failures after successful
route lookup.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Neighbour initializations to NULL are not necessary as the pointers are
not used if an error is returned, and if success returned, pointers are
initialized.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.
The rest were (relatively) trivial in nature.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the rt entry gateway family is not AF_INET for multipath device,
rtable reference is leaked.
Hence, fix it by releasing the reference.
Fixes: 5fb091e8130b ("net/mlx5e: Use hint to resolve route when in HW multipath mode")
Fixes: e32ee6c78efa ("net/mlx5e: Support tunnel encap over tagged Ethernet")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Memory allocated by kvzalloc should be freed by kvfree.
Fixes: cef35af34d6d ("net/mlx5e: Add mlx5e HV VHCA stats agent")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Fix various misspellings of "configuration" and "configure".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
HW expects the data size in DUMP WQEs to be up to MTU.
Make sure they are in range.
We elevate the frag page refcount by 'n-1', in addition to the
one obtained in tx_sync_info_get(), having an overall of 'n'
references. We bulk increments by using a single page_ref_add()
command, to optimize perfermance.
The refcounts are released one by one, by the corresponding completions.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
No Eth segment, so no dynamic inline headers.
The size of a Dump WQE is fixed, use constants and remove
unnecessary checks.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Not all fields of WQE info are being written in the function,
having some leftovers from previous rounds.
Zero-memset it upon update.
Particularly, not nullifying the wi->resync_dump_frag field
will cause double free of the kTLS DUMPed frags.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
During health reporter operations, driver might want to fill-up
the extack message, so propagate extack down to the health reporter ops.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- add modpost warn exported symbols marked as 'static' because 'static'
and EXPORT_SYMBOL is an odd combination
- break the build early if gold linker is used
- optimize the Bison rule to produce .c and .h files by a single
pattern rule
- handle PREEMPT_RT in the module vermagic and UTS_VERSION
- warn CONFIG options leaked to the user-space except existing ones
- make single targets work properly
- rebuild modules when module linker scripts are updated
- split the module final link stage into scripts/Makefile.modfinal
- fix the missed error code in merge_config.sh
- improve the error message displayed on the attempt of the O= build in
unclean source tree
- remove 'clean-dirs' syntax
- disable -Wimplicit-fallthrough warning for Clang
- add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC
- remove ARCH_{CPP,A,C}FLAGS variables
- add $(BASH) to run bash scripts
- change *CFLAGS_<basetarget>.o to take the relative path to $(obj)
instead of the basename
- stop suppressing Clang's -Wunused-function warnings when W=1
- fix linux/export.h to avoid genksyms calculating CRC of trimmed
exported symbols
- misc cleanups
* tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (63 commits)
genksyms: convert to SPDX License Identifier for lex.l and parse.y
modpost: use __section in the output to *.mod.c
modpost: use MODULE_INFO() for __module_depends
export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols
export.h: remove defined(__KERNEL__), which is no longer needed
kbuild: allow Clang to find unused static inline functions for W=1 build
kbuild: rename KBUILD_ENABLE_EXTRA_GCC_CHECKS to KBUILD_EXTRA_WARN
kbuild: refactor scripts/Makefile.extrawarn
merge_config.sh: ignore unwanted grep errors
kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)
modpost: add NOFAIL to strndup
modpost: add guid_t type definition
kbuild: add $(BASH) to run scripts with bash-extension
kbuild: remove ARCH_{CPP,A,C}FLAGS
kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC
kbuild: Do not enable -Wimplicit-fallthrough for clang for now
kbuild: clean up subdir-ymn calculation in Makefile.clean
kbuild: remove unneeded '+' marker from cmd_clean
kbuild: remove clean-dirs syntax
kbuild: check clean srctree even earlier
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-09-05
1) Allover mlx5 cleanups
2) Added port congestion counters to ethtool stats:
Add 3 counters per priority to ethtool using PPCNT:
2.1) rx_prio[p]_buf_discard - the number of packets discarded by device
due to lack of per host receive buffers
2.2) rx_prio[p]_cong_discard - the number of packets discarded by device
due to per host congestion
2.3) rx_prio[p]_marked - the number of packets ECN marked by device due
to per host congestion
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add the ability to use unaligned chunks in the AF_XDP umem. By
relaxing where the chunks can be placed, it allows to use an
arbitrary buffer size and place whenever there is a free
address in the umem. Helps more seamless DPDK AF_XDP driver
integration. Support for i40e, ixgbe and mlx5e, from Kevin and
Maxim.
2) Addition of a wakeup flag for AF_XDP tx and fill rings so the
application can wake up the kernel for rx/tx processing which
avoids busy-spinning of the latter, useful when app and driver
is located on the same core. Support for i40e, ixgbe and mlx5e,
from Magnus and Maxim.
3) bpftool fixes for printf()-like functions so compiler can actually
enforce checks, bpftool build system improvements for custom output
directories, and addition of 'bpftool map freeze' command, from Quentin.
4) Support attaching/detaching XDP programs from 'bpftool net' command,
from Daniel.
5) Automatic xskmap cleanup when AF_XDP socket is released, and several
barrier/{read,write}_once fixes in AF_XDP code, from Björn.
6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf
inclusion as well as libbpf versioning improvements, from Andrii.
7) Several new BPF kselftests for verifier precision tracking, from Alexei.
8) Several BPF kselftest fixes wrt endianess to run on s390x, from Ilya.
9) And more BPF kselftest improvements all over the place, from Stanislav.
10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub.
11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan.
12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni.
13) Small optimization in arm64 JIT to spare 1 insns for BPF_MOD, from Jerin.
14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar.
15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari,
Peter, Wei, Yue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cited patch have an issue in WARN_ON_ONCE check, with wrong address ranges
are compared. Fix that by changing pointer types from u64* to void*. This
will also make code simpler to read.
In addition mlx5e_hv_vhca_fill_ring_stats can get void pointer, so remove
the unnecessary casting when calling it.
Found by static checker:
drivers/net/ethernet/mellanox/mlx5/core/en/hv_vhca_stats.c:41 mlx5e_hv_vhca_fill_stats()
warn: potential pointer math issue ('buf' is a u64 pointer)
Fixes: cef35af34d6d ("net/mlx5e: Add mlx5e HV VHCA stats agent")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add flow steering actions: modify header and packet reformat
to the fs_cmd shim layer. This allows each namespace to define
possibly different functionality for alloc/dealloc action commands.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Relax the requirements to the XSK frame size to allow it to be smaller
than a page and even not a power of two. The current implementation can
work in this mode, both with Striding RQ and without it.
The code that checks `mtu + headroom <= XSK frame size` is modified
accordingly. Any frame size between 2048 and PAGE_SIZE is accepted.
Functions that worked with pages only now work with XSK frames, even if
their size is different from PAGE_SIZE.
With XSK queues, regardless of the frame size, Striding RQ uses the
stride size of PAGE_SIZE, and UMR MTTs are posted using starting
addresses of frames, but PAGE_SIZE as page size. MTU guarantees that no
packet data will overlap with other frames. UMR MTT size is made equal
to the stride size of the RQ, because UMEM frames may come in random
order, and we need to handle them one by one. PAGE_SIZE is just a power
of two that is bigger than any allowed XSK frame size, and also it
doesn't require making additional changes to the code.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
With the addition of the unaligned chunks option, we need to make sure we
handle the offsets accordingly based on the mode we are currently running
in. This patch modifies the driver to appropriately mask the address for
each case.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Use generic function for checking tunnel stateless offload capability
instead of separate macros.
Signed-off-by: Marina Varshaver <marinav@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add support for inner header RSS on IP-in-IP and IPv6 tunneled packets.
Add rules to the steering table regarding outer IP header, with
IPv4/6->IP-in-IP. Tunneled packets with protocol numbers: 0x4 (IP-in-IP)
and 0x29 (IPv6) are RSS-ed on the inner IP header.
Separate FW dependencies between flow table inner IP capabilities and
GRE offload support. Allowing this feature even if GRE offload is not
supported. Tested with multi stream TCP traffic tunneled with IPnIP.
Verified that:
Without this patch, only a single RX ring was processing the traffic.
With this patch, multiple RX rings were processing the traffic.
Verified with and without GRE offload support.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Move function which indicates whether tunnel inner flow table is
supported from en.h to en_fs.c. It fits better right after tunnel
protocol rules definitions.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
HV VHCA stats agent is responsible on running a preiodic rx/tx
packets/bytes stats update. Currently the supported format is version
MLX5_HV_VHCA_STATS_VERSION. Block ID 1 is dedicated for statistics data
transfer from the VF to the PF.
The reporter fetch the statistics data from all opened channels, fill it
in a buffer and send it to mlx5_hv_vhca_write_agent.
As the stats layer should include some metadata per block (sequence and
offset), the HV VHCA layer shall modify the buffer before actually send it
over block 1.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that the single target build descends into sub-directories in the
same way as the normal build, these dummy Makefiles are not needed
any more.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Add support for report and recovery from error on completion on RQ by
setting the queue back to ready state. Handle only errors with a
syndrome indicating the RQ might enter error state and could be
recovered.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Just to be aligned with the MPWQE handlers, handle RX WQE with error
for legacy RQs in the top RX handlers, just before calling skb_from_cqe().
CQE error handling will now be called at the same stage regardless of
the RQ type or netdev mode NIC, Representor, IPoIB, etc ..
This will be useful for down stream patch to improve error CQE
handling.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add support for report and recovery from rx timeout. On driver open we
post NOP work request on the rx channels to trigger napi in order to
fillup the rx rings. In case napi wasn't scheduled due to a lost
interrupt, perform EQ recovery.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add support for report and recovery from error on completion on ICOSQ.
Deactivate RQ and flush, then deactivate ICOSQ. Set the queue back to
ready state (firmware) and reset the ICOSQ and the RQ (software
resources). Finally, activate the ICOSQ and the RQ.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Align ICOSQ open/close behaviour with RQ and SQ. Split open flow into
open and activate where open handles creation and activate enables the
queue. Do a symmetric thing in close flow: split into close and
deactivate.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add rx reporter, which supports diagnose call-back. Diagnostics output
include: information common to all RQs: RQ type, RQ size, RQ stride
size, CQ size and CQ stride size. In addition advertise information per
RQ and its related icosq and attached CQ.
$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
RQ:
type: 2 stride size: 2048 size: 8
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4308 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1032 HW status: 0
channel ix: 1 rqn: 4313 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1036 HW status: 0
channel ix: 2 rqn: 4318 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1040 HW status: 0
channel ix: 3 rqn: 4323 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1044 HW status: 0
$ devlink health diagnose pci/0000:00:0b.0 reporter rx -jp
{
"Common config": {
"RQ": {
"type": 2,
"stride size": 2048,
"size": 8
},
"CQ": {
"stride size": 64,
"size": 1024
}
},
"RQs": [ {
"channel ix": 0,
"rqn": 4308,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1032,
"HW status": 0
}
},{
"channel ix": 1,
"rqn": 4313,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1036,
"HW status": 0
}
},{
"channel ix": 2,
"rqn": 4318,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1040,
"HW status": 0
}
},{
"channel ix": 3,
"rqn": 4323,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1044,
"HW status": 0
}
} ]
}
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Introduce helper functions for create and destroy reporters and update
channels. In the following patch, rx reporter is added and it will use
these helpers too.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Add cq information to general diagnose output: CQ size and stride size.
Per SQ add information about the related CQ: cqn and CQ's HW status.
$ devlink health diagnose pci/0000:00:0b.0 reporter tx
Common Config:
SQ:
stride size: 64 size: 1024
CQ:
stride size: 64 size: 1024
SQs:
channel ix: 0 tc: 0 txq ix: 0 sqn: 4307 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1030 HW status: 0
channel ix: 1 tc: 0 txq ix: 1 sqn: 4312 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1034 HW status: 0
channel ix: 2 tc: 0 txq ix: 2 sqn: 4317 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1038 HW status: 0
channel ix: 3 tc: 0 txq ix: 3 sqn: 4322 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1042 HW status: 0
$ devlink health diagnose pci/0000:00:0b.0 reporter tx -jp
{
"Common Config": {
"SQ": {
"stride size": 64,
"size": 1024
},
"CQ": {
"stride size": 64,
"size": 1024
}
},
"SQs": [ {
"channel ix": 0,
"tc": 0,
"txq ix": 0,
"sqn": 4307,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1030,
"HW status": 0
}
},{
"channel ix": 1,
"tc": 0,
"txq ix": 1,
"sqn": 4312,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1034,
"HW status": 0
}
},{
"channel ix": 2,
"tc": 0,
"txq ix": 2,
"sqn": 4317,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1038,
"HW status": 0
}
},{
"channel ix": 3,
"tc": 0,
"txq ix": 3,
"sqn": 4322,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1042,
"HW status": 0
} ]
}
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Enhance tx reporter's diagnostics output to include: information common
to all SQs: SQ size, SQ stride size.
In addition add channel ix, tc, txq ix, cc and pc.
$ devlink health diagnose pci/0000:00:0b.0 reporter tx
Common config:
SQ:
stride size: 64 size: 1024
SQs:
channel ix: 0 tc: 0 txq ix: 0 sqn: 4307 HW state: 1 stopped: false cc: 0 pc: 0
channel ix: 1 tc: 0 txq ix: 1 sqn: 4312 HW state: 1 stopped: false cc: 0 pc: 0
channel ix: 2 tc: 0 txq ix: 2 sqn: 4317 HW state: 1 stopped: false cc: 0 pc: 0
channel ix: 3 tc: 0 txq ix: 3 sqn: 4322 HW state: 1 stopped: false cc: 0 pc: 0
$ devlink health diagnose pci/0000:00:0b.0 reporter tx -jp
{
"Common config": {
"SQ": {
"stride size": 64,
"size": 1024
}
},
"SQs": [ {
"channel ix": 0,
"tc": 0,
"txq ix": 0,
"sqn": 4307,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
},{
"channel ix": 1,
"tc": 0,
"txq ix": 1,
"sqn": 4312,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
},{
"channel ix": 2,
"tc": 0,
"txq ix": 2,
"sqn": 4317,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
},{
"channel ix": 3,
"tc": 0,
"txq ix": 3,
"sqn": 4322,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
} ]
}
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The following patches in the set enhance the diagnostics info of tx
reporter. Therefore, it is better to pass a pointer to the SQ for
further data extraction.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Prepare for code sharing with rx reporter, which is added in the
following patches in the set. Introduce a generic error_ctx for
agnostic recovery despatch.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Change from mlx5e_tx_reporter_* to mlx5e_reporter_tx_*. In the following
patches in the set rx reporter is added, the new naming convention is
more uniformed.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Rename reporter.h -> health.h so patches in the set can use it for
health related functionality.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Merge conflict of mlx5 resolved using instructions in merge
commit 9566e650bf7fdf58384bb06df634f7531ca3a97e.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit adds support for the new need_wakeup feature of AF_XDP. The
applications can opt-in by using the XDP_USE_NEED_WAKEUP bind() flag.
When this feature is enabled, some behavior changes:
RX side: If the Fill Ring is empty, instead of busy-polling, set the
flag to tell the application to kick the driver when it refills the Fill
Ring.
TX side: If there are pending completions or packets queued for
transmission, set the flag to tell the application that it can skip the
sendto() syscall and save time.
The performance testing was performed on a machine with the following
configuration:
- 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz
- Mellanox ConnectX-5 Ex with 100 Gbit/s link
The results with retpoline disabled:
| without need_wakeup | with need_wakeup |
|----------------------|----------------------|
| one core | two cores | one core | two cores |
-------|----------|-----------|----------|-----------|
txonly | 20.1 | 33.5 | 29.0 | 34.2 |
rxdrop | 0.065 | 14.1 | 12.0 | 14.1 |
l2fwd | 0.032 | 7.3 | 6.6 | 7.2 |
"One core" means the application and NAPI run on the same core. "Two
cores" means they are pinned to different cores.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new
ndo provides the same functionality as before but with the addition of
a new flags field that is used to specifiy if Rx, Tx or both should be
woken up. The previous ndo only woke up Tx, as implied by the
name. The i40e and ixgbe drivers (which are all the supported ones)
are updated with this new interface.
This new ndo will be used by the new need_wakeup functionality of XDP
sockets that need to be able to wake up both Rx and Tx driver
processing.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add a missing spinlock around XSKICOSQ usage at the activation stage,
because there is a race between a configuration change and the
application calling sendto().
Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
List of flows attached to mod header entry is used as implicit reference
counter (mod header entry is deallocated when list becomes free) and as a
mechanism to obtain mod header entry that flow is attached to (through list
head). This is not safe when concurrent modification of list of flows
attached to mod header entry is possible. Proper atomic reference counter
is required to support concurrent access.
As a preparation for extending mod header with reference counting, extract
code that lookups and deletes mod header entry into standalone put/get
helpers. In order to remove this dependency on external locking, extend mod
header entry with reference counter to manage its lifetime and extend flow
structure with direct pointer to mod header entry that flow is attached to.
To remove code duplication between legacy and switchdev mode
implementations that both support mod_hdr functionality, store mod_hdr
table in dedicated structure used by both fdb and kernel namespaces. New
table structure is extended with table lock by one of the following patches
in this series. Implement helper function to get correct mod_hdr table
depending on flow namespace.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|