summaryrefslogtreecommitdiff
path: root/Documentation/networking/napi.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking/napi.rst')
-rw-r--r--Documentation/networking/napi.rst103
1 files changed, 94 insertions, 9 deletions
diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
index 02720dd71a76..4e008efebb35 100644
--- a/Documentation/networking/napi.rst
+++ b/Documentation/networking/napi.rst
@@ -171,12 +171,43 @@ a channel as an IRQ/NAPI which services queues of a given type. For example,
a configuration of 1 ``rx``, 1 ``tx`` and 1 ``combined`` channel is expected
to utilize 3 interrupts, 2 Rx and 2 Tx queues.
+Persistent NAPI config
+----------------------
+
+Drivers often allocate and free NAPI instances dynamically. This leads to loss
+of NAPI-related user configuration each time NAPI instances are reallocated.
+The netif_napi_add_config() API prevents this loss of configuration by
+associating each NAPI instance with a persistent NAPI configuration based on
+a driver defined index value, like a queue number.
+
+Using this API allows for persistent NAPI IDs (among other settings), which can
+be beneficial to userspace programs using ``SO_INCOMING_NAPI_ID``. See the
+sections below for other NAPI configuration settings.
+
+Drivers should try to use netif_napi_add_config() whenever possible.
+
User API
========
User interactions with NAPI depend on NAPI instance ID. The instance IDs
are only visible to the user thru the ``SO_INCOMING_NAPI_ID`` socket option.
-It's not currently possible to query IDs used by a given device.
+
+Users can query NAPI IDs for a device or device queue using netlink. This can
+be done programmatically in a user application or by using a script included in
+the kernel source tree: ``tools/net/ynl/pyynl/cli.py``.
+
+For example, using the script to dump all of the queues for a device (which
+will reveal each queue's NAPI ID):
+
+.. code-block:: bash
+
+ $ kernel-source/tools/net/ynl/pyynl/cli.py \
+ --spec Documentation/netlink/specs/netdev.yaml \
+ --dump queue-get \
+ --json='{"ifindex": 2}'
+
+See ``Documentation/netlink/specs/netdev.yaml`` for more details on
+available operations and attributes.
Software IRQ coalescing
-----------------------
@@ -199,13 +230,13 @@ parameters mentioned above use hyphens instead of underscores:
Per-NAPI configuration can be done programmatically in a user application
or by using a script included in the kernel source tree:
-``tools/net/ynl/cli.py``.
+``tools/net/ynl/pyynl/cli.py``.
For example, using the script:
.. code-block:: bash
- $ kernel-source/tools/net/ynl/cli.py \
+ $ kernel-source/tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--do napi-set \
--json='{"id": 345,
@@ -232,7 +263,9 @@ are not well known).
Busy polling is enabled by either setting ``SO_BUSY_POLL`` on
selected sockets or using the global ``net.core.busy_poll`` and
``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
-also exists.
+also exists. Threaded polling of NAPI also has a mode to busy poll for
+packets (:ref:`threaded busy polling<threaded_busy_poll>`) using the NAPI
+processing kthread.
epoll-based busy polling
------------------------
@@ -362,7 +395,7 @@ It is expected that ``irq-suspend-timeout`` will be set to a value much larger
than ``gro_flush_timeout`` as ``irq-suspend-timeout`` should suspend IRQs for
the duration of one userland processing cycle.
-While it is not stricly necessary to use ``napi_defer_hard_irqs`` and
+While it is not strictly necessary to use ``napi_defer_hard_irqs`` and
``gro_flush_timeout`` to use IRQ suspension, their use is strongly
recommended.
@@ -395,6 +428,52 @@ Therefore, setting ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` is
the recommended usage, because otherwise setting ``irq-suspend-timeout``
might not have any discernible effect.
+.. _threaded_busy_poll:
+
+Threaded NAPI busy polling
+--------------------------
+
+Threaded NAPI busy polling extends threaded NAPI and adds support to do
+continuous busy polling of the NAPI. This can be useful for forwarding or
+AF_XDP applications.
+
+Threaded NAPI busy polling can be enabled on per NIC queue basis using Netlink.
+
+For example, using the following script:
+
+.. code-block:: bash
+
+ $ ynl --family netdev --do napi-set \
+ --json='{"id": 66, "threaded": "busy-poll"}'
+
+The kernel will create a kthread that busy polls on this NAPI.
+
+The user may elect to set the CPU affinity of this kthread to an unused CPU
+core to improve how often the NAPI is polled at the expense of wasted CPU
+cycles. Note that this will keep the CPU core busy with 100% usage.
+
+Once threaded busy polling is enabled for a NAPI, PID of the kthread can be
+retrieved using Netlink so the affinity of the kthread can be set up.
+
+For example, the following script can be used to fetch the PID:
+
+.. code-block:: bash
+
+ $ ynl --family netdev --do napi-get --json='{"id": 66}'
+
+This will output something like following, the pid `258` is the PID of the
+kthread that is polling this NAPI.
+
+.. code-block:: bash
+
+ $ {'defer-hard-irqs': 0,
+ 'gro-flush-timeout': 0,
+ 'id': 66,
+ 'ifindex': 2,
+ 'irq-suspend-timeout': 0,
+ 'pid': 258,
+ 'threaded': 'busy-poll'}
+
.. _threaded:
Threaded NAPI
@@ -402,9 +481,8 @@ Threaded NAPI
Threaded NAPI is an operating mode that uses dedicated kernel
threads rather than software IRQ context for NAPI processing.
-The configuration is per netdevice and will affect all
-NAPI instances of that device. Each NAPI instance will spawn a separate
-thread (called ``napi/${ifc-name}-${napi-id}``).
+Each threaded NAPI instance will spawn a separate thread
+(called ``napi/${ifc-name}-${napi-id}``).
It is recommended to pin each kernel thread to a single CPU, the same
CPU as the CPU which services the interrupt. Note that the mapping
@@ -413,7 +491,14 @@ dependent). The NAPI instance IDs will be assigned in the opposite
order than the process IDs of the kernel threads.
Threaded NAPI is controlled by writing 0/1 to the ``threaded`` file in
-netdev's sysfs directory.
+netdev's sysfs directory. It can also be enabled for a specific NAPI using
+netlink interface.
+
+For example, using the script:
+
+.. code-block:: bash
+
+ $ ynl --family netdev --do napi-set --json='{"id": 66, "threaded": 1}'
.. rubric:: Footnotes