Age | Commit message (Collapse) | Author |
|
When running workloads heavy unbalanced towards TX (high TX, low RX
traffic), sfc driver can retain the CPU during too long times. Although
in many cases this is not enough to be visible, it can affect
performance and system responsiveness.
A way to reproduce it is to use a debug kernel and run some parallel
netperf TX tests. In some systems, this will lead to this message being
logged:
kernel:watchdog: BUG: soft lockup - CPU#12 stuck for 22s!
The reason is that sfc driver doesn't account any NAPI budget for the TX
completion events work. With high-TX/low-RX traffic, this makes that the
CPU is held for long time for NAPI poll.
Documentations says "drivers can process completions for any number of Tx
packets but should only process up to budget number of Rx packets".
However, many drivers do limit the amount of TX completions that they
process in a single NAPI poll.
In the same way, this patch adds a limit for the TX work in sfc. With
the patch applied, the watchdog warning never appears.
Tested with netperf in different combinations: single process / parallel
processes, TCP / UDP and different sizes of UDP messages. Repeated the
tests before and after the patch, without any noticeable difference in
network or CPU performance.
Test hardware:
Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz (4 cores, 2 threads/core)
Solarflare Communications XtremeScale X2522-25G Network Adapter
Fixes: 5227ecccea2d ("sfc: remove tx and MCDI handling from NAPI budget consideration")
Fixes: d19a53721863 ("sfc_ef100: TX path for EF100 NICs")
Reported-by: Fei Liu <feliu@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230615084929.10506-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ef100_enqueue_skb() generates a valid warning with gcc-13:
drivers/net/ethernet/sfc/ef100_tx.c:370:5: error: conflicting types for 'ef100_enqueue_skb' due to enum/integer mismatch; have 'int(struct efx_tx_queue *, struct sk_buff *)'
drivers/net/ethernet/sfc/ef100_tx.h:25:13: note: previous declaration of 'ef100_enqueue_skb' with type 'netdev_tx_t(struct efx_tx_queue *, struct sk_buff *)'
I.e. the type of the ef100_enqueue_skb()'s return value in the declaration is
int, while the definition spells enum netdev_tx_t. Synchronize them to the
latter.
Cc: Martin Liska <mliska@suse.cz>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20221031114440.10461-1-jirislaby@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A non-null efv in __ef100_enqueue_skb() indicates that the packet is
from that representor, should be transmitted with a suitable option
descriptor (to instruct the switch to deliver it to the representee),
and should not be accounted to the parent PF's stats or BQL.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can treat SKB_GSO_GRE almost exactly the same as UDP tunnels, except
that we don't want to edit the outer UDP len (as there isn't one).
For SKB_GSO_GRE_CSUM, we have to use GSO_PARTIAL as the device doesn't
support offload of non-UDP outer L4 checksums.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
|
|
By asking the HW for the correct edits, we can make UDP tunnel TSO
work without needing GSO_PARTIAL. So don't specify it in our
netdev->gso_partial_features.
However, retain GSO_PARTIAL support, as this will be used for other
protocols later.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
|
|
AIUI, the NETIF_F_TSO_MANGLEID flag is a signal to the stack that a
driver may _need_ to mangle IDs in order to do TSO, and conversely
a signal from the stack that the driver is permitted to do so.
Since we support both fixed and incrementing IPIDs, we should rely
on the SKB_GSO_FIXEDID flag on a per-skb basis, rather than using
the MANGLEID feature to make all TSOs fixed-id.
Includes other minor cleanups of ef100_make_tso_desc() coding style.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The NIC only needs to know where the headers it has to edit (TCP and
inner and outer IPv4) are, which fits GSO_PARTIAL nicely.
It also supports non-PARTIAL offload of UDP tunnels, again just
needing to be told the outer transport offset so that it can edit
the UDP length field.
(It's not clear to me whether the stack will ever use the non-PARTIAL
version with the netdev feature flags we're setting here.)
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the tx_queue->handle_tso function pointer, and just use
tx_queue->tso_version to decide which function to call, thus removing
an indirect call from the fast path.
Instead of passing a tso_v2 flag to efx_mcdi_tx_init(), set the desired
tx_queue->tso_version before calling it.
In efx_mcdi_tx_init(), report back failure to obtain a TSOv2 context by
setting tx_queue->tso_version to 0, which will cause the TX path to
use the GSO-based fallback.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The statement above it already returns, so there is no way to get here.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As in the Siena/EF10 case, it minimises cacheline ping-pong between
the TX and completion paths.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This should cause no functional change; merely make there only be one
design of xmit_more handling to understand. As with the EF10/Siena
version, we set tx_queue->xmit_pending when we queue up a TX, and
clear it when we ring the doorbell (in ef100_notify_tx_desc).
While we're at it, make ef100_notify_tx_desc static since nothing
outside of ef100_tx.c uses it.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of using efx_tx_queue_partner(), which relies on the assumption
that tx_queues_per_channel is 2, efx_tx_send_pending() iterates over
txqs with efx_for_each_channel_tx_queue().
We unconditionally set tx_queue->xmit_pending (renamed from
xmit_more_available), then condition on xmit_more for the call to
efx_tx_send_pending(), which will clear xmit_pending. Thus, after an
xmit_more TX, the doorbell is un-rung and xmit_pending is true.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Includes checksum offload and TSO, so declare those in our netdev features.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Channels are probed, but actual event handling is still stubbed out.
Stub implementation of check_caps is needed because ptp.c will call into
it from efx_ptp_use_mac_tx_timestamps() to decide if it wants TXQs.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No TX or RX path, no MCDI, not even an ifup/down handler.
Besides stubs, the bulk of the patch deals with reading the Xilinx
extended PCIe capability, which tells us where to find our BAR.
Though in the same module, EF100 has its own struct pci_driver,
which is named sfc_ef100.
A small number of additional nic_type methods are added; those in the
TX (tx_enqueue) and RX (rx_packet) paths are called through indirect
call wrappers to minimise the performance impact.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|