Age | Commit message (Collapse) | Author |
|
Move this API to the canonical timer_*() namespace.
[ tglx: Redone against pre rc1 ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc2).
Conflict:
Documentation/networking/netdevices.rst
net/core/lock_debug.c
04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE")
03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock")
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let's pass the queue index to netif_napi_add_config() to preserve
per-NAPI config.
Test:
Set 100 to defer-hard-irqs (default is 0) and check the value after
link down & up.
$ cat /sys/class/net/enp39s0/napi_defer_hard_irqs
0
$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 2}'
[{'defer-hard-irqs': 0,
'gro-flush-timeout': 0,
'id': 65,
'ifindex': 2,
'irq': 29,
'irq-suspend-timeout': 0}]
$ sudo ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--do napi-set --json='{"id": 65, "defer-hard-irqs": 100}'
$ sudo ip link set enp39s0 down && sudo ip link set enp39s0 up
Without patch:
$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 2}'
[{'defer-hard-irqs': 0, <------------------- Reset to 0
'gro-flush-timeout': 0,
'id': 66, <------------------------------- New ID
'ifindex': 2,
'irq': 29,
'irq-suspend-timeout': 0}]
With patch:
$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 2}'
[{'defer-hard-irqs': 100, <--------------+-- Preserved
'gro-flush-timeout': 0, |
'id': 65, <----------------------------'
'ifindex': 2,
'irq': 29,
'irq-suspend-timeout': 0}]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Link: https://patch.msgid.link/20250407164802.25184-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When IRQs are freed, a WARN_ON is triggered as the
affinity notifier is not released.
This results in the below stack trace:
[ 484.544586] ? __warn+0x84/0x130
[ 484.544843] ? free_irq+0x5c/0x70
[ 484.545105] ? report_bug+0x18a/0x1a0
[ 484.545390] ? handle_bug+0x53/0x90
[ 484.545664] ? exc_invalid_op+0x14/0x70
[ 484.545959] ? asm_exc_invalid_op+0x16/0x20
[ 484.546279] ? free_irq+0x5c/0x70
[ 484.546545] ? free_irq+0x10/0x70
[ 484.546807] ena_free_io_irq+0x5f/0x70 [ena]
[ 484.547138] ena_down+0x250/0x3e0 [ena]
[ 484.547435] ena_destroy_device+0x118/0x150 [ena]
[ 484.547796] __ena_shutoff+0x5a/0xe0 [ena]
[ 484.548110] pci_device_remove+0x3b/0xb0
[ 484.548412] device_release_driver_internal+0x193/0x200
[ 484.548804] driver_detach+0x44/0x90
[ 484.549084] bus_remove_driver+0x69/0xf0
[ 484.549386] pci_unregister_driver+0x2a/0xb0
[ 484.549717] ena_cleanup+0xc/0x130 [ena]
[ 484.550021] __do_sys_delete_module.constprop.0+0x176/0x310
[ 484.550438] ? syscall_trace_enter+0xfb/0x1c0
[ 484.550782] do_syscall_64+0x5b/0x170
[ 484.551067] entry_SYSCALL_64_after_hwframe+0x76/0x7e
Adding a call to `netif_napi_set_irq` with -1 as the IRQ index,
which frees the notifier.
Fixes: de340d8206bf ("net: ena: use napi's aRFS rmap notifers")
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250317071147.1105-1-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the core's rmap notifiers and delete our own.
Acked-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20250224232228.990783-3-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The assignment was accidentally aligned to the string one line before.
This was raised by the kernel bot.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202412101739.umNl7yYu-lkp@intel.com/
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241212115910.2485851-1-shayagr@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There's no need for devm bloat here. In addition, these are freed right
before the function exits.
Also swapped kcalloc order for consistency.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Link: https://patch.msgid.link/20241101214828.289752-2-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ena_com_get_dev_basic_stats() has been unused since 2017's
commit d81db2405613 ("net/ena: refactor ena_get_stats64 to be atomic
context safe")
ena_com_get_offload_settings() has been unused since the original
commit of ENA back in 2016 in
commit 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic
Network Adapters (ENA)")
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20241102220142.80285-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This manually reverts
commit a4e262cde3cd ("net: ena: allow automatic fallback to polling mode")
which is unused.
(I did it manually because there are other minor comment
and function changes surrounding it).
Build tested only.
Suggested-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20241103194149.293456-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
net_dim() is currently passed a struct dim_sample argument by value.
struct dim_sample is 24 bytes. Since this is greater 16 bytes, x86-64
passes it on the stack. All callers have already initialized dim_sample
on the stack, so passing it by value requires pushing a duplicated copy
to the stack. Either witing to the stack and immediately reading it, or
perhaps dereferencing addresses relative to the stack pointer in a chain
of push instructions, seems to perform quite poorly.
In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
94% of which is attributed to the first push instruction to copy
dim_sample on the stack for the call to net_dim():
// Call ktime_get()
0.26 |4ead2: call 4ead7 <mlx5e_handle_rx_dim+0x47>
// Pass the address of struct dim in %rdi
|4ead7: lea 0x3d0(%rbx),%rdi
// Set dim_sample.pkt_ctr
|4eade: mov %r13d,0x8(%rsp)
// Set dim_sample.byte_ctr
|4eae3: mov %r12d,0xc(%rsp)
// Set dim_sample.event_ctr
0.15 |4eae8: mov %bp,0x10(%rsp)
// Duplicate dim_sample on the stack
94.16 |4eaed: push 0x10(%rsp)
2.79 |4eaf1: push 0x10(%rsp)
0.07 |4eaf5: push %rax
// Call net_dim()
0.21 |4eaf6: call 4eafb <mlx5e_handle_rx_dim+0x6b>
To allow the caller to reuse the struct dim_sample already on the stack,
pass the struct dim_sample by reference to net_dim().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://patch.msgid.link/20241031002326.3426181-2-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Link queues to NAPIs using the netdev-genl API so this information is
queryable.
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8201, 'type': 'rx'},
{'id': 1, 'ifindex': 2, 'napi-id': 8202, 'type': 'rx'},
{'id': 2, 'ifindex': 2, 'napi-id': 8203, 'type': 'rx'},
{'id': 3, 'ifindex': 2, 'napi-id': 8204, 'type': 'rx'},
{'id': 4, 'ifindex': 2, 'napi-id': 8205, 'type': 'rx'},
{'id': 5, 'ifindex': 2, 'napi-id': 8206, 'type': 'rx'},
{'id': 6, 'ifindex': 2, 'napi-id': 8207, 'type': 'rx'},
{'id': 7, 'ifindex': 2, 'napi-id': 8208, 'type': 'rx'},
{'id': 0, 'ifindex': 2, 'napi-id': 8201, 'type': 'tx'},
{'id': 1, 'ifindex': 2, 'napi-id': 8202, 'type': 'tx'},
{'id': 2, 'ifindex': 2, 'napi-id': 8203, 'type': 'tx'},
{'id': 3, 'ifindex': 2, 'napi-id': 8204, 'type': 'tx'},
{'id': 4, 'ifindex': 2, 'napi-id': 8205, 'type': 'tx'},
{'id': 5, 'ifindex': 2, 'napi-id': 8206, 'type': 'tx'},
{'id': 6, 'ifindex': 2, 'napi-id': 8207, 'type': 'tx'},
{'id': 7, 'ifindex': 2, 'napi-id': 8208, 'type': 'tx'}]
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20241002001331.65444-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Link IRQs to NAPI instances with netif_napi_set_irq. This information
can be queried with the netdev-genl API. Note that the ENA device
appears to allocate an IRQ for management purposes which does not have a
NAPI associated with it; this commit takes this into consideration to
accurately construct a map between IRQs and NAPI instances.
Compare the output of /proc/interrupts for my ena device with the output of
netdev-genl after applying this patch:
$ cat /proc/interrupts | grep enp55s0 | cut -f1 --delimiter=':'
94
95
96
97
98
99
100
101
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump napi-get --json='{"ifindex": 2}'
[{'id': 8208, 'ifindex': 2, 'irq': 101},
{'id': 8207, 'ifindex': 2, 'irq': 100},
{'id': 8206, 'ifindex': 2, 'irq': 99},
{'id': 8205, 'ifindex': 2, 'irq': 98},
{'id': 8204, 'ifindex': 2, 'irq': 97},
{'id': 8203, 'ifindex': 2, 'irq': 96},
{'id': 8202, 'ifindex': 2, 'irq': 95},
{'id': 8201, 'ifindex': 2, 'irq': 94}]
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20241002001331.65444-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ENA currently supports the following customer metrics:
- `bw_in_allowance_exceeded`
- `bw_out_allowance_exceeded`
- `conntrack_allowance_exceeded`
- `linklocal_allowance_exceeded`
- `pps_allowance_exceeded`
This patch adds a new metric named:
`conntrack_allowance_available`.
Information about these metrics is available in [1].
In addition, the interface between the driver and the
device has been upgraded to allow more flexibility and
expendability to additional metrics in the future.
[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-network-performance-ena.html#network-performance-metrics
Signed-off-by: Ron Beider <rbeider@amazon.com>
Signed-off-by: Shahar Itzko <itzko@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20240909084704.13856-3-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ENA Express metrics, called `ena_srd` are exposed to
customers via `ethtool`.
The metrics allow customers to check the configuration
(mode), tx/rx counters as well as resource utilization.
The documentation is also updated to provide a general
explanation about ENA Express as well as links for further
information about metrics and configurations.
Signed-off-by: Igor Chauskin <igorch@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20240909084704.13856-2-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver overrides the NUMA node id of the device regardless of
whether it knows its correct value (often setting it to -1 even though
the node id is advertised in 'struct device'). This can lead to
suboptimal configurations.
This patch fixes this behavior and makes the shared memory allocation
functions use the NUMA node id advertised by the underlying device.
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/r/20240528170912.1204417-1-shayagr@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For the purpose of obtaining better CPU utilization,
minimum rx moderation interval is set to 20 usec.
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240512134637.25299-6-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
strscpy copies as much of the string as possible,
meaning that the destination string will be truncated
in case of no space. As this is a non-critical error in
our case, adding a debug level print for indication.
This patch also removes a -1 which was added to ensure
enough space for NUL, but strscpy destination string is
guaranteed to be NUL-terminted, therefore, the -1 is
not needed.
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240512134637.25299-5-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Validate that `first` flag is set only for the first
descriptor in multi-buffer packets.
In case of an invalid descriptor, a reset will occur.
A new reset reason for RX data corruption has been added.
Signed-off-by: Shahar Itzko <itzko@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240512134637.25299-4-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch makes two changes in order to fill holes and
reduce ther overall size of the structures ena_com_dev
and ena_com_rx_ctx.
Signed-off-by: Shahar Itzko <itzko@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240512134637.25299-3-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds a counter to the ena_adapter struct in
order to keep track of reset failures.
The counter is incremented every time either ena_restore_device()
or ena_destroy_device() fail.
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240512134637.25299-2-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Simon reported that ndo_change_mtu() methods were never
updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted
in commit 501a90c94510 ("inet: protect against too small
mtu values.")
We read dev->mtu without holding RTNL in many places,
with READ_ONCE() annotations.
It is time to take care of ndo_change_mtu() methods
to use corresponding WRITE_ONCE()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Simon Horman <horms@kernel.org>
Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The patch mentioned in the `Fixes` tag removed the explicit assignment
of tx_info->xdpf to NULL with the justification that there's no need
to set tx_info->xdpf to NULL and tx_info->num_of_bufs to 0 in case
of a mapping error. Both values won't be used once the mapping function
returns an error, and their values would be overridden by the next
transmitted packet.
While both values do indeed get overridden in the next transmission
call, the value of tx_info->xdpf is also used to check whether a TX
descriptor's transmission has been completed (i.e. a completion for it
was polled).
An example scenario:
1. Mapping failed, tx_info->xdpf wasn't set to NULL
2. A VF reset occurred leading to IO resource destruction and
a call to ena_free_tx_bufs() function
3. Although the descriptor whose mapping failed was freed by the
transmission function, it still passes the check
if (!tx_info->skb)
(skb and xdp_frame are in a union)
4. The xdp_frame associated with the descriptor is freed twice
This patch returns the assignment of NULL to tx_info->xdpf to make the
cleaning function knows that the descriptor is already freed.
Fixes: 504fd6a5390c ("net: ena: fix DMA mapping function issues in XDP")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ENA has two types of TX queues:
- queues which only process TX packets arriving from the network stack
- queues which only process TX packets forwarded to it by XDP_REDIRECT
or XDP_TX instructions
The ena_free_tx_bufs() cycles through all descriptors in a TX queue
and unmaps + frees every descriptor that hasn't been acknowledged yet
by the device (uncompleted TX transactions).
The function assumes that the processed TX queue is necessarily from
the first category listed above and ends up using napi_consume_skb()
for descriptors belonging to an XDP specific queue.
This patch solves a bug in which, in case of a VF reset, the
descriptors aren't freed correctly, leading to crashes.
Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Missing IO completions check is called every second (HZ jiffies).
This commit fixes several issues with this check:
1. Duplicate queues check:
Max of 4 queues are scanned on each check due to monitor budget.
Once reaching the budget, this check exits under the assumption that
the next check will continue to scan the remainder of the queues,
but in practice, next check will first scan the last already scanned
queue which is not necessary and may cause the full queue scan to
last a couple of seconds longer.
The fix is to start every check with the next queue to scan.
For example, on 8 IO queues:
Bug: [0,1,2,3], [3,4,5,6], [6,7]
Fix: [0,1,2,3], [4,5,6,7]
2. Unbalanced queues check:
In case the number of active IO queues is not a multiple of budget,
there will be checks which don't utilize the full budget
because the full scan exits when reaching the last queue id.
The fix is to run every TX completion check with exact queue budget
regardless of the queue id.
For example, on 7 IO queues:
Bug: [0,1,2,3], [4,5,6], [0,1,2,3]
Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4]
The budget may be lowered in case the number of IO queues is less
than the budget (4) to make sure there are no duplicate queues on
the same check.
For example, on 3 IO queues:
Bug: [0,1,2,0], [1,2,0,1]
Fix: [0,1,2], [0,1,2]
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Amit Bernstein <amitbern@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Small unsigned types are promoted to larger signed types in
the case of multiplication, the result of which may overflow.
In case the result of such a multiplication has its MSB
turned on, it will be sign extended with '1's.
This changes the multiplication result.
Code example of the phenomenon:
-------------------------------
u16 x, y;
size_t z1, z2;
x = y = 0xffff;
printk("x=%x y=%x\n",x,y);
z1 = x*y;
z2 = (size_t)x*y;
printk("z1=%lx z2=%lx\n", z1, z2);
Output:
-------
x=ffff y=ffff
z1=fffffffffffe0001 z2=fffe0001
The expected result of ffff*ffff is fffe0001, and without the
explicit casting to avoid the unwanted sign extension we got
fffffffffffe0001.
This commit adds an explicit casting to avoid the sign extension
issue.
Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Avoid the following warnings by removing the ena_select_queue() function
and rely on the net core to do the queue selection, The issue happen
when an skb received from an interface with more queues than ena is
forwarded to the ena interface.
[ 1176.159959] eth0 selects TX queue 11, but real number of TX queues is 8
[ 1176.863976] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1180.767877] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1188.703742] eth0 selects TX queue 14, but real number of TX queues is 8
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IS_ERR() is already using unlikely internally.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Acked-by: Arthur Kiyanovski <akiyano@amazon.com>
Link: https://lore.kernel.org/r/20240213161502.2297048-1-kheib@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There is no point in initializing an ndo to NULL, therefore the
assignment is redundant and can be removed.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Acked-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch reduces some of the lines by removing newlines
where more variables or print strings can be pushed back
to the previous line while still adhering to the styling
guidelines.
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fail queue size calculation when the device returns maximum
TX/RX queue sizes that are smaller than the allowed minimum.
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The netif_* functions are used by the driver to log events into the
kernel ring (dmesg) similar to the netdev_* ones. Unlike the latter,
the netif_* function family allow the user to choose what events get
logged using ethtool:
sudo ethtool -s [interface] msglvl [msg_type] on
By default the events which get logged are slow-path related and aren't
printed often (e.g. interface up related prints). This patch removes the
NETIF_MSG_TX_DONE type (called every TX completion polling) from the
defaults and adds NETIF_MSG_IFDOWN instead as it makes more sensible
defaults.
This patch also transforms ena_down() print from netif_info into
netif_dbg (same as the analogue print in ena_up()) as it suits it
better.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Move skb_tx_timestamp() closer to the actual time the driver sends the
packets to the device.
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The function responsible for polling TX completions might not receive
the CPU resources it needs due to higher priority tasks running on the
requested core.
The driver might not be able to recognize such cases, but it can use its
state to suspect that they happened. If both conditions are met:
- napi hasn't been executed more than the TX completion timeout value
- napi is scheduled (meaning that we've received an interrupt)
Then it's more likely that the napi handler isn't scheduled because of
an overloaded CPU.
It was decided that for this case, the driver would wait twice as long
as the regular timeout before scheduling a reset.
The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason to
indicate this case to the device.
This patch also adds more information to the ena_tx_timeout() callback.
This function is called by the kernel when it detects that a specific TX
queue has been closed for too long.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The print was re-worded to a more informative one.
Signed-off-by: Shahar Itzko <itzko@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The functionality was added to allow the drivers to create an
SQ and CQ of different sizes.
When the RX/TX SQ and CQ have the same size, such update isn't
necessary as the device can safely assume it doesn't override
unprocessed completions. However, if the SQ is larger than the CQ,
the device might "have" more completions it wants to update about
than there's room in the CQ.
There's no support for different SQ and CQ sizes, therefore,
removing the API and its usage.
'____cacheline_aligned' compiler attribute was added to
'struct ena_com_io_cq' to ensure that the removal of the
'cq_head_db_reg' field doesn't change the cache-line layout
of this struct.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Dynamic Interrupt Moderation (DIM) is a technique
designed to balance the need for timely data processing
with the desire to minimize CPU overhead.
Instead of generating an interrupt for every received
packet, the system can dynamically adjust the rate at
which interrupts are generated based on the incoming
traffic patterns.
Enabling DIM by default to improve the user experience.
DIM can be turned on/off through ethtool:
`ethtool -C <interface> adaptive-rx <on/off>`
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Osama Abboud <osamaabb@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
A few changes for better readability and style
1. Adding / Removing newlines
2. Removing an unnecessary and confusing comment
3. Using an existing variable rather than re-checking a field
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove io_sq->header_addr field because it is no longer
in use.
LLQ was updated to support a bounce buffer so there is
no need in saving the header address of the sq.
Signed-off-by: Nati Koler <nkoler@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Queue stats using ifconfig and ip are retrieved
via ena_get_stats64(). This function currently does not take
the xdp sent or dropped packets stats into account.
This commit adds the following xdp stats to ena_get_stats64():
tx bytes sent
tx packets sent
rx dropped packets
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-12-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Also shorten comment related to it.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-11-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The RX queue info contains information about the RX queue which might
be relevant to the kernel.
To avoid configuring this queue for different scenarios, this patch
moves the RX queue configuration to ena_up()/ena_down() function and
makes it configured every interface state toggle.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-10-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Used for better readability and debugging of XDP
flow.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-9-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch focuses on changes to the XDP part of the napi
polling routine.
1. Update the `napi_comp` stat only when napi is actually
complete.
2. Simplify the code by using a function pointer to the right
napi routine (XDP vs non-XDP path)
3. Remove unnecessary local variables.
4. Adjust a debug print to show the processed XDP frame index
rather than the pointer.
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-8-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This check is already done in ena_clean_rx_irq() which indirectly
calls it.
This function is called in napi context and the driver doesn't
allow to change the XDP program without performing destruction and
reinitialization of napi context (part of ena_down/ena_up sequence).
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-7-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When an XDP program is loaded the existing channels in the driver split
into two halves:
- The first half of the channels contain RX and TX rings, these queues
are used for receiving traffic and sending packets originating from
kernel.
- The second half of the channels contain only a TX ring. These queues
are used for sending packets that were redirected using XDP_TX
or XDP_REDIRECT.
Referring to the queues in the second half of the channels as "xdp_ring"
can be confusing and may give the impression that ENA has the capability
to generate an additional special queue.
This patch ensures that the xdp_ring field is exclusively used to
describe the XDP TX queue that a specific RX queue needs to utilize when
forwarding packets with XDP TX and XDP REDIRECT, preserving the
integrity of the xdp_ring field in ena_ring.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-6-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To avoid de-referencing skb or xdp_frame when we poll for TX completion
(where they might not be in the cache), save the total TX packet size in
the ena_tx_buffer object representing the packet.
Also the 'print_once' field's type was changed from u32 to u8 to allow
adding the 'total_tx_size' without changing the total size of the
struct.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-5-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The skb and xdpf pointers cannot be set together in the driver
(each TX descriptor can send either an SKB or an XDP frame), and so it
makes more sense to put them both in a union.
This decreases the overall size of the ena_tx_buffer struct which
improves cache locality.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-4-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This change will enable the ability to use ena_xmit_common()
in functions that don't have a net_device pointer.
While it can be retrieved by dereferencing
ena_adapter (adapter->netdev), there's no reason to do it in
fast path code where this pointer is only needed for
debug prints.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-3-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
XDP system has a very large footprint in the driver's overall code.
makes the whole driver's code much harder to read.
Moving XDP code to dedicated files.
This patch doesn't make any changes to the code itself and only
cut-pastes the code into ena_xdp.c and ena_xdp.h files so the change
is purely cosmetic.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-2-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|