linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2024-04-11	net: ena: Fix incorrect descriptor free behavior	David Arinzon
	ENA has two types of TX queues: - queues which only process TX packets arriving from the network stack - queues which only process TX packets forwarded to it by XDP_REDIRECT or XDP_TX instructions The ena_free_tx_bufs() cycles through all descriptors in a TX queue and unmaps + frees every descriptor that hasn't been acknowledged yet by the device (uncompleted TX transactions). The function assumes that the processed TX queue is necessarily from the first category listed above and ends up using napi_consume_skb() for descriptors belonging to an XDP specific queue. This patch solves a bug in which, in case of a VF reset, the descriptors aren't freed correctly, leading to crashes. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11	net: ena: Wrong missing IO completions check order	David Arinzon
	Missing IO completions check is called every second (HZ jiffies). This commit fixes several issues with this check: 1. Duplicate queues check: Max of 4 queues are scanned on each check due to monitor budget. Once reaching the budget, this check exits under the assumption that the next check will continue to scan the remainder of the queues, but in practice, next check will first scan the last already scanned queue which is not necessary and may cause the full queue scan to last a couple of seconds longer. The fix is to start every check with the next queue to scan. For example, on 8 IO queues: Bug: [0,1,2,3], [3,4,5,6], [6,7] Fix: [0,1,2,3], [4,5,6,7] 2. Unbalanced queues check: In case the number of active IO queues is not a multiple of budget, there will be checks which don't utilize the full budget because the full scan exits when reaching the last queue id. The fix is to run every TX completion check with exact queue budget regardless of the queue id. For example, on 7 IO queues: Bug: [0,1,2,3], [4,5,6], [0,1,2,3] Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4] The budget may be lowered in case the number of IO queues is less than the budget (4) to make sure there are no duplicate queues on the same check. For example, on 3 IO queues: Bug: [0,1,2,0], [1,2,0,1] Fix: [0,1,2], [0,1,2] Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Amit Bernstein <amitbern@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-18	net: ena: Remove ena_select_queue	Kamal Heib
	Avoid the following warnings by removing the ena_select_queue() function and rely on the net core to do the queue selection, The issue happen when an skb received from an interface with more queues than ena is forwarded to the ena interface. [ 1176.159959] eth0 selects TX queue 11, but real number of TX queues is 8 [ 1176.863976] eth0 selects TX queue 14, but real number of TX queues is 8 [ 1180.767877] eth0 selects TX queue 14, but real number of TX queues is 8 [ 1188.703742] eth0 selects TX queue 14, but real number of TX queues is 8 Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-15	net: ena: Remove unlikely() from IS_ERR() condition	Kamal Heib
	IS_ERR() is already using unlikely internally. Signed-off-by: Kamal Heib <kheib@redhat.com> Acked-by: Arthur Kiyanovski <akiyano@amazon.com> Link: https://lore.kernel.org/r/20240213161502.2297048-1-kheib@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-14	net: ena: Remove redundant assignment	Kamal Heib
	There is no point in initializing an ndo to NULL, therefore the assignment is redundant and can be removed. Signed-off-by: Kamal Heib <kheib@redhat.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Acked-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-01	net: ena: Reduce lines with longer column width boundary	David Arinzon
	This patch reduces some of the lines by removing newlines where more variables or print strings can be pushed back to the previous line while still adhering to the styling guidelines. Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: handle ena_calc_io_queue_size() possible errors	David Arinzon
	Fail queue size calculation when the device returns maximum TX/RX queue sizes that are smaller than the allowed minimum. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Change default print level for netif_ prints	David Arinzon
	The netif_* functions are used by the driver to log events into the kernel ring (dmesg) similar to the netdev_* ones. Unlike the latter, the netif_* function family allow the user to choose what events get logged using ethtool: sudo ethtool -s [interface] msglvl [msg_type] on By default the events which get logged are slow-path related and aren't printed often (e.g. interface up related prints). This patch removes the NETIF_MSG_TX_DONE type (called every TX completion polling) from the defaults and adds NETIF_MSG_IFDOWN instead as it makes more sensible defaults. This patch also transforms ena_down() print from netif_info into netif_dbg (same as the analogue print in ena_up()) as it suits it better. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Relocate skb_tx_timestamp() to improve time stamping accuracy	David Arinzon
	Move skb_tx_timestamp() closer to the actual time the driver sends the packets to the device. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Add more information on TX timeouts	David Arinzon
	The function responsible for polling TX completions might not receive the CPU resources it needs due to higher priority tasks running on the requested core. The driver might not be able to recognize such cases, but it can use its state to suspect that they happened. If both conditions are met: - napi hasn't been executed more than the TX completion timeout value - napi is scheduled (meaning that we've received an interrupt) Then it's more likely that the napi handler isn't scheduled because of an overloaded CPU. It was decided that for this case, the driver would wait twice as long as the regular timeout before scheduling a reset. The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason to indicate this case to the device. This patch also adds more information to the ena_tx_timeout() callback. This function is called by the kernel when it detects that a specific TX queue has been closed for too long. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Change error print during ena_device_init()	David Arinzon
	The print was re-worded to a more informative one. Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Remove CQ tail pointer update	David Arinzon
	The functionality was added to allow the drivers to create an SQ and CQ of different sizes. When the RX/TX SQ and CQ have the same size, such update isn't necessary as the device can safely assume it doesn't override unprocessed completions. However, if the SQ is larger than the CQ, the device might "have" more completions it wants to update about than there's room in the CQ. There's no support for different SQ and CQ sizes, therefore, removing the API and its usage. '____cacheline_aligned' compiler attribute was added to 'struct ena_com_io_cq' to ensure that the removal of the 'cq_head_db_reg' field doesn't change the cache-line layout of this struct. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Enable DIM by default	David Arinzon
	Dynamic Interrupt Moderation (DIM) is a technique designed to balance the need for timely data processing with the desire to minimize CPU overhead. Instead of generating an interrupt for every received packet, the system can dynamically adjust the rate at which interrupts are generated based on the incoming traffic patterns. Enabling DIM by default to improve the user experience. DIM can be turned on/off through ethtool: `ethtool -C <interface> adaptive-rx <on/off>` Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01	net: ena: Minor cosmetic changes	David Arinzon
	A few changes for better readability and style 1. Adding / Removing newlines 2. Removing an unnecessary and confusing comment 3. Using an existing variable rather than re-checking a field Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-03	net: ena: Take xdp packets stats into account in ena_get_stats64()	David Arinzon
	Queue stats using ifconfig and ip are retrieved via ena_get_stats64(). This function currently does not take the xdp sent or dropped packets stats into account. This commit adds the following xdp stats to ena_get_stats64(): tx bytes sent tx packets sent rx dropped packets Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-12-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03	net: ena: Always register RX queue info	David Arinzon
	The RX queue info contains information about the RX queue which might be relevant to the kernel. To avoid configuring this queue for different scenarios, this patch moves the RX queue configuration to ena_up()/ena_down() function and makes it configured every interface state toggle. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-10-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03	net: ena: Refactor napi functions	David Arinzon
	This patch focuses on changes to the XDP part of the napi polling routine. 1. Update the `napi_comp` stat only when napi is actually complete. 2. Simplify the code by using a function pointer to the right napi routine (XDP vs non-XDP path) 3. Remove unnecessary local variables. 4. Adjust a debug print to show the processed XDP frame index rather than the pointer. Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-8-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03	net: ena: Use tx_ring instead of xdp_ring for XDP channel TX	David Arinzon
	When an XDP program is loaded the existing channels in the driver split into two halves: - The first half of the channels contain RX and TX rings, these queues are used for receiving traffic and sending packets originating from kernel. - The second half of the channels contain only a TX ring. These queues are used for sending packets that were redirected using XDP_TX or XDP_REDIRECT. Referring to the queues in the second half of the channels as "xdp_ring" can be confusing and may give the impression that ENA has the capability to generate an additional special queue. This patch ensures that the xdp_ring field is exclusively used to describe the XDP TX queue that a specific RX queue needs to utilize when forwarding packets with XDP TX and XDP REDIRECT, preserving the integrity of the xdp_ring field in ena_ring. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-6-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03	net: ena: Introduce total_tx_size field in ena_tx_buffer struct	David Arinzon
	To avoid de-referencing skb or xdp_frame when we poll for TX completion (where they might not be in the cache), save the total TX packet size in the ena_tx_buffer object representing the packet. Also the 'print_once' field's type was changed from u32 to u8 to allow adding the 'total_tx_size' without changing the total size of the struct. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-5-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03	net: ena: Pass ena_adapter instead of net_device to ena_xmit_common()	David Arinzon
	This change will enable the ability to use ena_xmit_common() in functions that don't have a net_device pointer. While it can be retrieved by dereferencing ena_adapter (adapter->netdev), there's no reason to do it in fast path code where this pointer is only needed for debug prints. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-3-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-03	net: ena: Move XDP code to its new files	David Arinzon
	XDP system has a very large footprint in the driver's overall code. makes the whole driver's code much harder to read. Moving XDP code to dedicated files. This patch doesn't make any changes to the code itself and only cut-pastes the code into ena_xdp.c and ena_xdp.h files so the change is purely cosmetic. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20240101190855.18739-2-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/iavf/iavf_ethtool.c 3a0b5a2929fd ("iavf: Introduce new state machines for flow director") 95260816b489 ("iavf: use iavf_schedule_aq_request() helper") https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/ drivers/net/ethernet/broadcom/bnxt/bnxt.c c13e268c0768 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic") c2f8063309da ("bnxt_en: Refactor RX VLAN acceleration logic.") a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7") 1c7fd6ee2fe4 ("bnxt_en: Rename some macros for the P5 chips") https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c bd6781c18cb5 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()") 84793a499578 ("bnxt_en: Skip nic close/open when configuring tstamp filters") https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/ drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c 3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled") cecf44ea1a1f ("net/mlx5: Allow sync reset flow when BF MGT interface device is present") https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12	net: ena: Fix DMA syncing in XDP path when SWIOTLB is on	David Arinzon
	This patch fixes two issues: Issue 1 ------- Description ``````````` Current code does not call dma_sync_single_for_cpu() to sync data from the device side memory to the CPU side memory before the XDP code path uses the CPU side data. This causes the XDP code path to read the unset garbage data in the CPU side memory, resulting in incorrect handling of the packet by XDP. Solution ```````` 1. Add a call to dma_sync_single_for_cpu() before the XDP code starts to use the data in the CPU side memory. 2. The XDP code verdict can be XDP_PASS, in which case there is a fallback to the non-XDP code, which also calls dma_sync_single_for_cpu(). To avoid calling dma_sync_single_for_cpu() twice: 2.1. Put the dma_sync_single_for_cpu() in the code in such a place where it happens before XDP and non-XDP code. 2.2. Remove the calls to dma_sync_single_for_cpu() in the non-XDP code for the first buffer only (rx_copybreak and non-rx_copybreak cases), since the new call that was added covers these cases. The call to dma_sync_single_for_cpu() for the second buffer and on stays because only the first buffer is handled by the newly added dma_sync_single_for_cpu(). And there is no need for special handling of the second buffer and on for the XDP path since currently the driver supports only single buffer packets. Issue 2 ------- Description ``````````` In case the XDP code forwarded the packet (ENA_XDP_FORWARDED), ena_unmap_rx_buff_attrs() is called with attrs set to 0. This means that before unmapping the buffer, the internal function dma_unmap_page_attrs() will also call dma_sync_single_for_cpu() on the whole buffer (not only on the data part of it). This sync is both wasteful (since a sync was already explicitly called before) and also causes a bug, which will be explained using the below diagram. The following diagram shows the flow of events causing the bug. The order of events is (1)-(4) as shown in the diagram. CPU side memory area (3)convert_to_xdp_frame() initializes the headroom with xdpf metadata \|\| \/ ___________________________________ \| \| 0 \| V 4K --------------------------------------------------------------------- \| xdpf->data \| other xdpf \| < data > \| tailroom \|\|...\| \| \| fields \| \| GARBAGE \|\| \| --------------------------------------------------------------------- /\ /\ \|\| \|\| (4)ena_unmap_rx_buff_attrs() calls (2)dma_sync_single_for_cpu() dma_sync_single_for_cpu() on the copies data from device whole buffer page, overwriting side to CPU side memory the xdpf->data with GARBAGE. \|\| 0 4K --------------------------------------------------------------------- \| headroom \| < data > \| tailroom \|\|...\| \| GARBAGE \| \| GARBAGE \|\| \| --------------------------------------------------------------------- Device side memory area /\ \|\| (1) device writes RX packet data After the call to ena_unmap_rx_buff_attrs() in (4), the xdpf->data becomes corrupted, and so when it is later accessed in ena_clean_xdp_irq()->xdp_return_frame(), it causes a page fault, crashing the kernel. Solution ```````` Explicitly tell ena_unmap_rx_buff_attrs() not to call dma_sync_single_for_cpu() by passing it the ENA_DMA_ATTR_SKIP_CPU_SYNC flag. Fixes: f7d625adeb7b ("net: ena: Add dynamic recycling mechanism for rx buffers") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-4-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12	net: ena: Fix xdp drops handling due to multibuf packets	David Arinzon
	Current xdp code drops packets larger than ENA_XDP_MAX_MTU. This is an incorrect condition since the problem is not the size of the packet, rather the number of buffers it contains. This commit: 1. Identifies and drops XDP multi-buffer packets at the beginning of the function. 2. Increases the xdp drop statistic when this drop occurs. 3. Adds a one-time print that such drops are happening to give better indication to the user. Fixes: 838c93dc5449 ("net: ena: implement XDP drop support") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-3-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12	net: ena: Destroy correct number of xdp queues upon failure	David Arinzon
	The ena_setup_and_create_all_xdp_queues() function freed all the resources upon failure, after creating only xdp_num_queues queues, instead of freeing just the created ones. In this patch, the only resources that are freed, are the ones allocated right before the failure occurs. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20231211062801.27891-2-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-10	net: ena: replace deprecated strncpy with strscpy	justinstitt@google.com
	`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. A suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. host_info allocation is done in ena_com_allocate_host_info() via dma_alloc_coherent() and is not zero initialized by alloc_etherdev_mq(). However zero initialization of the destination doesn't matter in this case, because strscpy() guarantees a NULL termination. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Acked-by: Arthur Kiyanovski <akiyano@amazon.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-03	net: Tree wide: Replace xdp_do_flush_map() with xdp_do_flush().	Sebastian Andrzej Siewior
	xdp_do_flush_map() is deprecated and new code should use xdp_do_flush() instead. Replace xdp_do_flush_map() with xdp_do_flush(). Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Clark Wang <xiaoning.wang@nxp.com> Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: David Arinzon <darinzon@amazon.com> Cc: Edward Cree <ecree.xilinx@gmail.com> Cc: Felix Fietkau <nbd@nbd.name> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Jassi Brar <jaswinder.singh@linaro.org> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: John Crispin <john@phrozen.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Lorenzo Bianconi <lorenzo@kernel.org> Cc: Louis Peens <louis.peens@corigine.com> Cc: Marcin Wojtas <mw@semihalf.com> Cc: Mark Lee <Mark-MC.Lee@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: Noam Dagan <ndagan@amazon.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Saeed Bishara <saeedb@amazon.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Sean Wang <sean.wang@mediatek.com> Cc: Shay Agroskin <shayagr@amazon.com> Cc: Shenwei Wang <shenwei.wang@nxp.com> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Vladimir Oltean <vladimir.oltean@nxp.com> Cc: Wei Fang <wei.fang@nxp.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Arthur Kiyanovski <akiyano@amazon.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20230908143215.869913-2-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-09-21	net: ena: Flush XDP packets on error.	Sebastian Andrzej Siewior
	xdp_do_flush() should be invoked before leaving the NAPI poll function after a XDP-redirect. This is not the case if the driver leaves via the error path (after having a redirect in one of its previous iterations). Invoke xdp_do_flush() also in the error path. Cc: Arthur Kiyanovski <akiyano@amazon.com> Cc: David Arinzon <darinzon@amazon.com> Cc: Noam Dagan <ndagan@amazon.com> Cc: Saeed Bishara <saeedb@amazon.com> Cc: Shay Agroskin <shayagr@amazon.com> Fixes: a318c70ad152b ("net: ena: introduce XDP redirect implementation") Acked-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-17	net: ena: Use pci_dev_id() to simplify the code	Jialin Zhang
	PCI core API pci_dev_id() can be used to get the BDF number for a pci device. We don't need to compose it manually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20230815024248.3519068-1-zhangjialin11@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-15	net: ena: Add dynamic recycling mechanism for rx buffers	David Arinzon
	The current implementation allocates page-sized rx buffers. As traffic may consist of different types and sizes of packets, in various cases, buffers are not fully used. This change (Dynamic RX Buffers - DRB) uses part of the allocated rx page needed for the incoming packet, and returns the rest of the unused page to be used again as an rx buffer for future packets. A threshold of 2K for unused space has been set in order to declare whether the remainder of the page can be reused again as an rx buffer. As a page may be reused, dma_sync_single_for_cpu() is added in order to sync the memory to the CPU side after it was owned by the HW. In addition, when the rx page can no longer be reused, it is being unmapped using dma_page_unmap(), which implicitly syncs and then unmaps the entire page. In case the kernel still handles the skbs pointing to the previous buffers from that rx page, it may access garbage pointers, caused by the implicit sync overwriting them. The implicit dma sync is removed by replacing dma_page_unmap() with dma_unmap_page_attrs() with DMA_ATTR_SKIP_CPU_SYNC flag. The functionality is disabled for XDP traffic to avoid handling several descriptors per packet. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Link: https://lore.kernel.org/r/20230612121448.28829-1-darinzon@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-29	net: ena: removed unused tx_bytes variable	Simon Horman
	clang 16.0.0 with W=1 reports: drivers/net/ethernet/amazon/ena/ena_netdev.c:1901:6: error: variable 'tx_bytes' set but not used [-Werror,-Wunused-but-set-variable] u32 tx_bytes = 0; The variable is not used so remove it. Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20230328151958.410687-1-horms@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-27	net: ena: Add support to changing tx_push_buf_len	Shay Agroskin
	The ENA driver allows for two distinct values for the number of bytes of the packet's payload that can be written directly to the device. For a value of 224 the driver turns on Large LLQ Header mode in which the first 224 of the packet's payload are written to the LLQ. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-27	net: ena: Recalculate TX state variables every device reset	Shay Agroskin
	With the ability to modify LLQ entry size, the size of packet's payload that can be written directly to the device changes. This patch makes the driver recalculate this information every device negotiation (also called device reset). Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-27	net: ena: Add an option to configure large LLQ headers	David Arinzon
	Allow configuring the device with large LLQ headers. The Low Latency Queue (LLQ) allows the driver to write the first N bytes of the packet, along with the rest of the TX descriptors directly into device (N can be either 96 or 224 for large LLQ headers configuration). Having L4 TCP/UDP headers contained in the first 96 bytes of the packet is required to get maximum performance from the device. Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-27	net: ena: Make few cosmetic preparations to support large LLQ	Shay Agroskin
	Move ena_calc_io_queue_size() implementation closer to the file's beginning so that it can be later called from ena_device_init() function without adding a function declaration. Also add an empty line at some spots to separate logical blocks in funcitons. Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10	net: ena: take into account xdp_features setting tx/rx queues	Lorenzo Bianconi
	ena nic allows xdp just if enough hw queues are available for XDP. Take into account queues configuration setting xdp_features. Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Reviewed-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-02	drivers: net: turn on XDP features	Marek Majtyka
	A summary of the flags being set for various drivers is given below. Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features that can be turned off and on at runtime. This means that these flags may be set and unset under RTNL lock protection by the driver. Hence, READ_ONCE must be used by code loading the flag value. Also, these flags are not used for synchronization against the availability of XDP resources on a device. It is merely a hint, and hence the read may race with the actual teardown of XDP resources on the device. This may change in the future, e.g. operations taking a reference on the XDP resources of the driver, and in turn inhibiting turning off this flag. However, for now, it can only be used as a hint to check whether device supports becoming a redirection target. Turn 'hw-offload' feature flag on for: - netronome (nfp) - netdevsim. Turn 'native' and 'zerocopy' features flags on for: - intel (i40e, ice, ixgbe, igc) - mellanox (mlx5). - stmmac - netronome (nfp) Turn 'native' features flags on for: - amazon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2, enetc) - funeth - intel (igb) - marvell (mvneta, mvpp2, octeontx2) - mellanox (mlx4) - mtk_eth_soc - qlogic (qede) - sfc - socionext (netsec) - ti (cpsw) - tap - tsnep - veth - xen - virtio_net. Turn 'basic' (tx, pass, aborted and drop) features flags on for: - netronome (nfp) - cavium (thunder) - hyperv. Turn 'redirect_target' feature flag on for: - amanzon (ena) - broadcom (bnxt) - freescale (dpaa, dpaa2) - intel (i40e, ice, igb, ixgbe) - ti (cpsw) - marvell (mvneta, mvpp2) - sfc - socionext (netsec) - qlogic (qede) - mellanox (mlx5) - tap - veth - virtio_net - xen Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Marek Majtyka <alardam@gmail.com> Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-30	net: ena: Update NUMA TPH hint register upon NUMA node update	David Arinzon
	The device supports a PCIe optimization hint, which indicates on which NUMA the queue is currently processed. This hint is utilized by PCIe in order to reduce its access time by accessing the correct NUMA resources and maintaining cache coherence. The driver calls the register update for the hint (called TPH - TLP Processing Hint) during the NAPI loop. Though the update is expected upon a NUMA change (when a queue is moved from one NUMA to the other), the current logic performs a register update when the queue is moved to a different CPU, but the CPU is not necessarily in a different NUMA. The changes include: 1. Performing the TPH update only when the queue has switched a NUMA node. 2. Moving the TPH update call to be triggered only when NAPI was scheduled from interrupt context, as opposed to a busy-polling loop. This is due to the fact that during busy-polling, the frequency of CPU switches for a particular queue is significantly higher, thus, the likelihood to switch NUMA is much higher. Therefore, providing the frequent updates to the device upon a NUMA update are unlikely to be beneficial. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-30	net: ena: Set default value for RX interrupt moderation	David Arinzon
	RX ring can be NULL in XDP use cases where only TX queues are configured. In this scenario, the RX interrupt moderation value sent to the device remains in its default value of 0. In this change, setting the default value of the RX interrupt moderation to be the same as of the TX. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-30	net: ena: Fix rx_copybreak value update	David Arinzon
	Make the upper bound on rx_copybreak tighter, by making sure it is smaller than the minimum of mtu and ENA_PAGE_SIZE. With the current upper bound of mtu, rx_copybreak can be larger than a page. Such large rx_copybreak will not bring any performance benefit to the user and therefore makes no sense. In addition, the value update was only reflected in the adapter structure, but not applied for each ring, causing it to not take effect. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-30	net: ena: Use bitmask to indicate packet redirection	David Arinzon
	Redirecting packets with XDP Redirect is done in two phases: 1. A packet is passed by the driver to the kernel using xdp_do_redirect(). 2. After finishing polling for new packets the driver lets the kernel know that it can now process the redirected packet using xdp_do_flush_map(). The packets' redirection is handled in the napi context of the queue that called xdp_do_redirect() To avoid calling xdp_do_flush_map() each time the driver first checks whether any packets were redirected, using xdp_flags \|= xdp_verdict; and if (xdp_flags & XDP_REDIRECT) xdp_do_flush_map() essentially treating XDP instructions as a bitmask, which isn't the case: enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT, }; Given the current possible values of xdp_action, the current design doesn't have a bug (since XDP_REDIRECT = 100b), but it is still flawed. This patch makes the driver use a bitmask instead, to avoid future issues. Fixes: a318c70ad152 ("net: ena: introduce XDP redirect implementation") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-30	net: ena: Account for the number of processed bytes in XDP	David Arinzon
	The size of packets that were forwarded or dropped by XDP wasn't added to the total processed bytes statistic. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-30	net: ena: Don't register memory info on XDP exchange	David Arinzon
	Since the queues aren't destroyed when we only exchange XDP programs, there's no need to re-register them again. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	include/linux/bpf.h 1f6e04a1c7b8 ("bpf: Fix offset calculation error in __copy_map_value and zero_map_value") aa3496accc41 ("bpf: Refactor kptr_off_tab into btf_record") f71b2f64177a ("bpf: Refactor map->off_arr handling") https://lore.kernel.org/all/20221114095000.67a73239@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-15	net: ena: Fix error handling in ena_init()	Yuan Can
	The ena_init() won't destroy workqueue created by create_singlethread_workqueue() when pci_register_driver() failed. Call destroy_workqueue() when pci_register_driver() failed to prevent the resource leak. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Yuan Can <yuancan@huawei.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20221114025659.124726-1-yuancan@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-28	net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).	Thomas Gleixner
	Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net: drop the weight argument from netif_napi_add	Jakub Kicinski
	We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31	net: ethernet: move from strlcpy with unused retval to strscpy	Wolfram Sang
	Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4\|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-30	eth: remove remaining copies of the NAPI_POLL_WEIGHT define	Jakub Kicinski
	Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. This patch covers three more drivers which I missed in commit 5f012b40ef63 ("eth: remove copies of the NAPI_POLL_WEIGHT define"). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31	net: ena: Do not waste napi skb cache	Hyeonggon Yoo
	By profiling, discovered that ena device driver allocates skb by build_skb() and frees by napi_skb_cache_put(). Because the driver does not use napi skb cache in allocation path, napi skb cache is periodically filled and flushed. This is waste of napi skb cache. As ena_alloc_skb() is called only in napi, Use napi_build_skb() and napi_alloc_skb() when allocating skb. This patch was tested on aws a1.metal instance. [ jwiedmann.dev@gmail.com: Use napi_alloc_skb() instead of netdev_alloc_skb_ip_align() to keep things consistent. ] Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/YfUAkA9BhyOJRT4B@ip-172-31-19-208.ap-northeast-1.compute.internal Signed-off-by: Jakub Kicinski <kuba@kernel.org>