summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/sfc/ef100_tx.c
AgeCommit message (Collapse)Author
2023-06-17sfc: use budget for TX completionsÍñigo Huguet
When running workloads heavy unbalanced towards TX (high TX, low RX traffic), sfc driver can retain the CPU during too long times. Although in many cases this is not enough to be visible, it can affect performance and system responsiveness. A way to reproduce it is to use a debug kernel and run some parallel netperf TX tests. In some systems, this will lead to this message being logged: kernel:watchdog: BUG: soft lockup - CPU#12 stuck for 22s! The reason is that sfc driver doesn't account any NAPI budget for the TX completion events work. With high-TX/low-RX traffic, this makes that the CPU is held for long time for NAPI poll. Documentations says "drivers can process completions for any number of Tx packets but should only process up to budget number of Rx packets". However, many drivers do limit the amount of TX completions that they process in a single NAPI poll. In the same way, this patch adds a limit for the TX work in sfc. With the patch applied, the watchdog warning never appears. Tested with netperf in different combinations: single process / parallel processes, TCP / UDP and different sizes of UDP messages. Repeated the tests before and after the patch, without any noticeable difference in network or CPU performance. Test hardware: Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz (4 cores, 2 threads/core) Solarflare Communications XtremeScale X2522-25G Network Adapter Fixes: 5227ecccea2d ("sfc: remove tx and MCDI handling from NAPI budget consideration") Fixes: d19a53721863 ("sfc_ef100: TX path for EF100 NICs") Reported-by: Fei Liu <feliu@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20230615084929.10506-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02sfc (gcc13): synchronize ef100_enqueue_skb()'s return typeJiri Slaby (SUSE)
ef100_enqueue_skb() generates a valid warning with gcc-13: drivers/net/ethernet/sfc/ef100_tx.c:370:5: error: conflicting types for 'ef100_enqueue_skb' due to enum/integer mismatch; have 'int(struct efx_tx_queue *, struct sk_buff *)' drivers/net/ethernet/sfc/ef100_tx.h:25:13: note: previous declaration of 'ef100_enqueue_skb' with type 'netdev_tx_t(struct efx_tx_queue *, struct sk_buff *)' I.e. the type of the ef100_enqueue_skb()'s return value in the declaration is int, while the definition spells enum netdev_tx_t. Synchronize them to the latter. Cc: Martin Liska <mliska@suse.cz> Cc: Edward Cree <ecree.xilinx@gmail.com> Cc: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114440.10461-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-22sfc: support passing a representor to the EF100 TX pathEdward Cree
A non-null efv in __ef100_enqueue_skb() indicates that the packet is from that representor, should be transmitted with a suitable option descriptor (to instruct the switch to deliver it to the representee), and should not be accounted to the parent PF's stats or BQL. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-11-13sfc: support GRE TSO on EF100Edward Cree
We can treat SKB_GSO_GRE almost exactly the same as UDP tunnels, except that we don't want to edit the outer UDP len (as there isn't one). For SKB_GSO_GRE_CSUM, we have to use GSO_PARTIAL as the device doesn't support offload of non-UDP outer L4 checksums. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Martin Habets <mhabets@solarflare.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13sfc: correctly support non-partial GSO_UDP_TUNNEL_CSUM on EF100Edward Cree
By asking the HW for the correct edits, we can make UDP tunnel TSO work without needing GSO_PARTIAL. So don't specify it in our netdev->gso_partial_features. However, retain GSO_PARTIAL support, as this will be used for other protocols later. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Martin Habets <mhabets@solarflare.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-10-30sfc: only use fixed-id if the skb asks for itEdward Cree
AIUI, the NETIF_F_TSO_MANGLEID flag is a signal to the stack that a driver may _need_ to mangle IDs in order to do TSO, and conversely a signal from the stack that the driver is permitted to do so. Since we support both fixed and incrementing IPIDs, we should rely on the SKB_GSO_FIXEDID flag on a per-skb basis, rather than using the MANGLEID feature to make all TSOs fixed-id. Includes other minor cleanups of ef100_make_tso_desc() coding style. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-30sfc: implement encap TSO on EF100Edward Cree
The NIC only needs to know where the headers it has to edit (TCP and inner and outer IPv4) are, which fits GSO_PARTIAL nicely. It also supports non-PARTIAL offload of UDP tunnels, again just needing to be told the outer transport offset so that it can edit the UDP length field. (It's not clear to me whether the stack will ever use the non-PARTIAL version with the netdev feature flags we're setting here.) Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-11sfc: de-indirect TSO handlingEdward Cree
Remove the tx_queue->handle_tso function pointer, and just use tx_queue->tso_version to decide which function to call, thus removing an indirect call from the fast path. Instead of passing a tso_v2 flag to efx_mcdi_tx_init(), set the desired tx_queue->tso_version before calling it. In efx_mcdi_tx_init(), report back failure to obtain a TSOv2 context by setting tx_queue->tso_version to 0, which will cause the TX path to use the GSO-based fallback. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11sfc: remove spurious unreachable return statementEdward Cree
The statement above it already returns, so there is no way to get here. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-05sfc: use tx_queue->old_read_count in EF100 TX pathEdward Cree
As in the Siena/EF10 case, it minimises cacheline ping-pong between the TX and completion paths. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05sfc: make ef100 xmit_more handling look more like ef10'sEdward Cree
This should cause no functional change; merely make there only be one design of xmit_more handling to understand. As with the EF10/Siena version, we set tx_queue->xmit_pending when we queue up a TX, and clear it when we ring the doorbell (in ef100_notify_tx_desc). While we're at it, make ef100_notify_tx_desc static since nothing outside of ef100_tx.c uses it. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-05sfc: add and use efx_tx_send_pending in tx.cEdward Cree
Instead of using efx_tx_queue_partner(), which relies on the assumption that tx_queues_per_channel is 2, efx_tx_send_pending() iterates over txqs with efx_for_each_channel_tx_queue(). We unconditionally set tx_queue->xmit_pending (renamed from xmit_more_available), then condition on xmit_more for the call to efx_tx_send_pending(), which will clear xmit_pending. Thus, after an xmit_more TX, the doorbell is un-rung and xmit_pending is true. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-08-03sfc_ef100: TX path for EF100 NICsEdward Cree
Includes checksum offload and TSO, so declare those in our netdev features. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27sfc_ef100: implement ndo_open/close and EVQ probingEdward Cree
Channels are probed, but actual event handling is still stubbed out. Stub implementation of check_caps is needed because ptp.c will call into it from efx_ptp_use_mac_tx_timestamps() to decide if it wants TXQs. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27sfc: skeleton EF100 PF driverEdward Cree
No TX or RX path, no MCDI, not even an ifup/down handler. Besides stubs, the bulk of the patch deals with reading the Xilinx extended PCIe capability, which tells us where to find our BAR. Though in the same module, EF100 has its own struct pci_driver, which is named sfc_ef100. A small number of additional nic_type methods are added; those in the TX (tx_enqueue) and RX (rx_packet) paths are called through indirect call wrappers to minimise the performance impact. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>