summaryrefslogtreecommitdiff
path: root/Documentation/networking/devlink
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking/devlink')
-rw-r--r--Documentation/networking/devlink/bnxt.rst2
-rw-r--r--Documentation/networking/devlink/devlink-info.rst9
-rw-r--r--Documentation/networking/devlink/devlink-port.rst33
-rw-r--r--Documentation/networking/devlink/devlink-region.rst2
-rw-r--r--Documentation/networking/devlink/devlink-trap.rst2
-rw-r--r--Documentation/networking/devlink/hns3.rst5
-rw-r--r--Documentation/networking/devlink/ice.rst83
-rw-r--r--Documentation/networking/devlink/index.rst1
-rw-r--r--Documentation/networking/devlink/ixgbe.rst171
-rw-r--r--Documentation/networking/devlink/mlx5.rst7
-rw-r--r--Documentation/networking/devlink/nfp.rst5
-rw-r--r--Documentation/networking/devlink/octeontx2.rst37
-rw-r--r--Documentation/networking/devlink/sfc.rst16
13 files changed, 369 insertions, 4 deletions
diff --git a/Documentation/networking/devlink/bnxt.rst b/Documentation/networking/devlink/bnxt.rst
index a4fb27663cd6..9a8b3d76d11f 100644
--- a/Documentation/networking/devlink/bnxt.rst
+++ b/Documentation/networking/devlink/bnxt.rst
@@ -24,6 +24,8 @@ Parameters
- Permanent
* - ``enable_remote_dev_reset``
- Runtime
+ * - ``enable_roce``
+ - Permanent
The ``bnxt`` driver also implements the following driver-specific
parameters.
diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst
index 1242b0e6826b..dd6adc4d0559 100644
--- a/Documentation/networking/devlink/devlink-info.rst
+++ b/Documentation/networking/devlink/devlink-info.rst
@@ -86,6 +86,10 @@ In case software/firmware components are loaded from the disk (e.g.
``/lib/firmware``) only the running version should be reported via
the kernel API.
+Please note that any security versions reported via devlink are purely
+informational. Devlink does not use a secure channel to communicate with
+the device.
+
Generic Versions
================
@@ -146,6 +150,11 @@ board.manufacture
An identifier of the company or the facility which produced the part.
+board.part_number
+-----------------
+
+Part number of the board and its components.
+
fw
--
diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst
index 562f46b41274..9d22d41a7cd1 100644
--- a/Documentation/networking/devlink/devlink-port.rst
+++ b/Documentation/networking/devlink/devlink-port.rst
@@ -134,6 +134,9 @@ Users may also set the IPsec crypto capability of the function using
Users may also set the IPsec packet capability of the function using
`devlink port function set ipsec_packet` command.
+Users may also set the maximum IO event queues of the function
+using `devlink port function set max_io_eqs` command.
+
Function attributes
===================
@@ -295,6 +298,36 @@ policy is processed in software by the kernel.
function:
hw_addr 00:00:00:00:00:00 ipsec_packet enabled
+Maximum IO events queues setup
+------------------------------
+When user sets maximum number of IO event queues for a SF or
+a VF, such function driver is limited to consume only enforced
+number of IO event queues.
+
+IO event queues deliver events related to IO queues, including network
+device transmit and receive queues (txq and rxq) and RDMA Queue Pairs (QPs).
+For example, the number of netdevice channels and RDMA device completion
+vectors are derived from the function's IO event queues. Usually, the number
+of interrupt vectors consumed by the driver is limited by the number of IO
+event queues per device, as each of the IO event queues is connected to an
+interrupt vector.
+
+- Get maximum IO event queues of the VF device::
+
+ $ devlink port show pci/0000:06:00.0/2
+ pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+ function:
+ hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10
+
+- Set maximum IO event queues of the VF device::
+
+ $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32
+
+ $ devlink port show pci/0000:06:00.0/2
+ pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+ function:
+ hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32
+
Subfunction
============
diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst
index 9232cd7da301..5d0b68f752c0 100644
--- a/Documentation/networking/devlink/devlink-region.rst
+++ b/Documentation/networking/devlink/devlink-region.rst
@@ -49,7 +49,7 @@ example usage
$ devlink region show [ DEV/REGION ]
$ devlink region del DEV/REGION snapshot SNAPSHOT_ID
$ devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ]
- $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length length
+ $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length LENGTH
# Show all of the exposed regions with region sizes:
$ devlink region show
diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst
index 2c14dfe69b3a..5885e21e2212 100644
--- a/Documentation/networking/devlink/devlink-trap.rst
+++ b/Documentation/networking/devlink/devlink-trap.rst
@@ -451,7 +451,7 @@ be added to the following table:
* - ``udp_parsing``
- ``drop``
- Traps packets dropped due to an error in the UDP header parsing.
- This packet trap could include checksum errorrs, an improper UDP
+ This packet trap could include checksum errors, an improper UDP
length detected (smaller than 8 bytes) or detection of header
truncation.
* - ``tcp_parsing``
diff --git a/Documentation/networking/devlink/hns3.rst b/Documentation/networking/devlink/hns3.rst
index 4562a6e4782f..72bc1b9f3785 100644
--- a/Documentation/networking/devlink/hns3.rst
+++ b/Documentation/networking/devlink/hns3.rst
@@ -23,3 +23,8 @@ The ``hns3`` driver reports the following versions
* - ``fw``
- running
- Used to represent the firmware version.
+ * - ``fw.scc``
+ - running
+ - Used to represent the Soft Congestion Control (SSC) firmware version.
+ SCC is a firmware component which provides multiple RDMA congestion
+ control algorithms, including DCQCN.
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index 7f30ebd5debb..792e9f8c846a 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -11,6 +11,7 @@ Parameters
==========
.. list-table:: Generic parameters implemented
+ :widths: 5 5 90
* - Name
- Mode
@@ -21,6 +22,88 @@ Parameters
* - ``enable_iwarp``
- runtime
- mutually exclusive with ``enable_roce``
+ * - ``tx_scheduling_layers``
+ - permanent
+ - The ice hardware uses hierarchical scheduling for Tx with a fixed
+ number of layers in the scheduling tree. Each of them are decision
+ points. Root node represents a port, while all the leaves represent
+ the queues. This way of configuring the Tx scheduler allows features
+ like DCB or devlink-rate (documented below) to configure how much
+ bandwidth is given to any given queue or group of queues, enabling
+ fine-grained control because scheduling parameters can be configured
+ at any given layer of the tree.
+
+ The default 9-layer tree topology was deemed best for most workloads,
+ as it gives an optimal ratio of performance to configurability. However,
+ for some specific cases, this 9-layer topology might not be desired.
+ One example would be sending traffic to queues that are not a multiple
+ of 8. Because the maximum radix is limited to 8 in 9-layer topology,
+ the 9th queue has a different parent than the rest, and it's given
+ more bandwidth credits. This causes a problem when the system is
+ sending traffic to 9 queues:
+
+ | tx_queue_0_packets: 24163396
+ | tx_queue_1_packets: 24164623
+ | tx_queue_2_packets: 24163188
+ | tx_queue_3_packets: 24163701
+ | tx_queue_4_packets: 24163683
+ | tx_queue_5_packets: 24164668
+ | tx_queue_6_packets: 23327200
+ | tx_queue_7_packets: 24163853
+ | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th
+
+ To address this need, you can switch to a 5-layer topology, which
+ changes the maximum topology radix to 512. With this enhancement,
+ the performance characteristic is equal as all queues can be assigned
+ to the same parent in the tree. The obvious drawback of this solution
+ is a lower configuration depth of the tree.
+
+ Use the ``tx_scheduling_layer`` parameter with the devlink command
+ to change the transmit scheduler topology. To use 5-layer topology,
+ use a value of 5. For example:
+ $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers
+ value 5 cmode permanent
+ Use a value of 9 to set it back to the default value.
+
+ You must do PCI slot powercycle for the selected topology to take effect.
+
+ To verify that value has been set:
+ $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
+ * - ``msix_vec_per_pf_max``
+ - driverinit
+ - Set the max MSI-X that can be used by the PF, rest can be utilized for
+ SRIOV. The range is from min value set in msix_vec_per_pf_min to
+ 2k/number of ports.
+ * - ``msix_vec_per_pf_min``
+ - driverinit
+ - Set the min MSI-X that will be used by the PF. This value inform how many
+ MSI-X will be allocated statically. The range is from 2 to value set
+ in msix_vec_per_pf_max.
+
+.. list-table:: Driver specific parameters implemented
+ :widths: 5 5 90
+
+ * - Name
+ - Mode
+ - Description
+ * - ``local_forwarding``
+ - runtime
+ - Controls loopback behavior by tuning scheduler bandwidth.
+ It impacts all kinds of functions: physical, virtual and
+ subfunctions.
+ Supported values are:
+
+ ``enabled`` - loopback traffic is allowed on port
+
+ ``disabled`` - loopback traffic is not allowed on this port
+
+ ``prioritized`` - loopback traffic is prioritized on this port
+
+ Default value of ``local_forwarding`` parameter is ``enabled``.
+ ``prioritized`` provides ability to adjust loopback traffic rate to increase
+ one port capacity at cost of the another. User needs to disable
+ local forwarding on one of the ports in order have increased capacity
+ on the ``prioritized`` port.
Info versions
=============
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 948c8c44e233..8319f43b5933 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -84,6 +84,7 @@ parameters, info versions, and other features it supports.
i40e
ionic
ice
+ ixgbe
mlx4
mlx5
mlxsw
diff --git a/Documentation/networking/devlink/ixgbe.rst b/Documentation/networking/devlink/ixgbe.rst
new file mode 100644
index 000000000000..c27d1436c70e
--- /dev/null
+++ b/Documentation/networking/devlink/ixgbe.rst
@@ -0,0 +1,171 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+ixgbe devlink support
+=====================
+
+This document describes the devlink features implemented by the ``ixgbe``
+device driver.
+
+Info versions
+=============
+
+Any of the versions dealing with the security presented by ``devlink-info``
+is purely informational. Devlink does not use a secure channel to communicate
+with the device.
+
+The ``ixgbe`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 5 90
+
+ * - Name
+ - Type
+ - Example
+ - Description
+ * - ``board.id``
+ - fixed
+ - H49289-000
+ - The Product Board Assembly (PBA) identifier of the board.
+ * - ``fw.undi``
+ - running
+ - 1.1937.0
+ - Version of the Option ROM containing the UEFI driver. The version is
+ reported in ``major.minor.patch`` format. The major version is
+ incremented whenever a major breaking change occurs, or when the
+ minor version would overflow. The minor version is incremented for
+ non-breaking changes and reset to 1 when the major version is
+ incremented. The patch version is normally 0 but is incremented when
+ a fix is delivered as a patch against an older base Option ROM.
+ * - ``fw.undi.srev``
+ - running
+ - 4
+ - Number indicating the security revision of the Option ROM.
+ * - ``fw.bundle_id``
+ - running
+ - 0x80000d0d
+ - Unique identifier of the firmware image file that was loaded onto
+ the device. Also referred to as the EETRACK identifier of the NVM.
+ * - ``fw.mgmt.api``
+ - running
+ - 1.5.1
+ - 3-digit version number (major.minor.patch) of the API exported over
+ the AdminQ by the management firmware. Used by the driver to
+ identify what commands are supported. Historical versions of the
+ kernel only displayed a 2-digit version number (major.minor).
+ * - ``fw.mgmt.build``
+ - running
+ - 0x305d955f
+ - Unique identifier of the source for the management firmware.
+ * - ``fw.mgmt.srev``
+ - running
+ - 3
+ - Number indicating the security revision of the firmware.
+ * - ``fw.psid.api``
+ - running
+ - 0.80
+ - Version defining the format of the flash contents.
+ * - ``fw.netlist``
+ - running
+ - 1.1.2000-6.7.0
+ - The version of the netlist module. This module defines the device's
+ Ethernet capabilities and default settings, and is used by the
+ management firmware as part of managing link and device
+ connectivity.
+ * - ``fw.netlist.build``
+ - running
+ - 0xee16ced7
+ - The first 4 bytes of the hash of the netlist module contents.
+
+Flash Update
+============
+
+The ``ixgbe`` driver implements support for flash update using the
+``devlink-flash`` interface. It supports updating the device flash using a
+combined flash image that contains the ``fw.mgmt``, ``fw.undi``, and
+``fw.netlist`` components.
+
+.. list-table:: List of supported overwrite modes
+ :widths: 5 95
+
+ * - Bits
+ - Behavior
+ * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS``
+ - Do not preserve settings stored in the flash components being
+ updated. This includes overwriting the port configuration that
+ determines the number of physical functions the device will
+ initialize with.
+ * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS`` and ``DEVLINK_FLASH_OVERWRITE_IDENTIFIERS``
+ - Do not preserve either settings or identifiers. Overwrite everything
+ in the flash with the contents from the provided image, without
+ performing any preservation. This includes overwriting device
+ identifying fields such as the MAC address, Vital product Data (VPD) area,
+ and device serial number. It is expected that this combination be used with an
+ image customized for the specific device.
+
+Reload
+======
+
+The ``ixgbe`` driver supports activating new firmware after a flash update
+using ``DEVLINK_CMD_RELOAD`` with the ``DEVLINK_RELOAD_ACTION_FW_ACTIVATE``
+action.
+
+.. code:: shell
+
+ $ devlink dev reload pci/0000:01:00.0 reload action fw_activate
+
+The new firmware is activated by issuing a device specific Embedded
+Management Processor reset which requests the device to reset and reload the
+EMP firmware image.
+
+The driver does not currently support reloading the driver via
+``DEVLINK_RELOAD_ACTION_DRIVER_REINIT``.
+
+Regions
+=======
+
+The ``ixgbe`` driver implements the following regions for accessing internal
+device data.
+
+.. list-table:: regions implemented
+ :widths: 15 85
+
+ * - Name
+ - Description
+ * - ``nvm-flash``
+ - The contents of the entire flash chip, sometimes referred to as
+ the device's Non Volatile Memory.
+ * - ``shadow-ram``
+ - The contents of the Shadow RAM, which is loaded from the beginning
+ of the flash. Although the contents are primarily from the flash,
+ this area also contains data generated during device boot which is
+ not stored in flash.
+ * - ``device-caps``
+ - The contents of the device firmware's capabilities buffer. Useful to
+ determine the current state and configuration of the device.
+
+Both the ``nvm-flash`` and ``shadow-ram`` regions can be accessed without a
+snapshot. The ``device-caps`` region requires a snapshot as the contents are
+sent by firmware and can't be split into separate reads.
+
+Users can request an immediate capture of a snapshot for all three regions
+via the ``DEVLINK_CMD_REGION_NEW`` command.
+
+.. code:: shell
+
+ $ devlink region show
+ pci/0000:01:00.0/nvm-flash: size 10485760 snapshot [] max 1
+ pci/0000:01:00.0/device-caps: size 4096 snapshot [] max 10
+
+ $ devlink region new pci/0000:01:00.0/nvm-flash snapshot 1
+
+ $ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1
+ 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+ 0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
+ 0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc
+ 0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5
+
+ $ devlink region read pci/0000:01:00.0/nvm-flash snapshot 1 address 0 length 16
+ 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+
+ $ devlink region delete pci/0000:01:00.0/device-caps snapshot 1
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 456985407475..7febe0aecd53 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -53,6 +53,9 @@ parameters.
* ``smfs`` Software managed flow steering. In SMFS mode, the HW
steering entities are created and manage through the driver without
firmware intervention.
+ * ``hmfs`` Hardware managed flow steering. In HMFS mode, the driver
+ is configuring steering rules directly to the HW using Work Queues with
+ a special new type of WQE (Work Queue Element).
SMFS mode is faster and provides better rule insertion rate compared to
default DMFS mode.
@@ -277,6 +280,10 @@ Description of the vnic counters:
number of packets handled by the VNIC experiencing unexpected steering
failure (at any point in steering flow owned by the VNIC, including the FDB
for the eswitch owner).
+- icm_consumption
+ amount of Interconnect Host Memory (ICM) consumed by the vnic in
+ granularity of 4KB. ICM is host memory allocated by SW upon HCA request
+ and is used for storing data structures that control HCA operation.
User commands examples:
diff --git a/Documentation/networking/devlink/nfp.rst b/Documentation/networking/devlink/nfp.rst
index a1717db0dfcc..3093642bdae4 100644
--- a/Documentation/networking/devlink/nfp.rst
+++ b/Documentation/networking/devlink/nfp.rst
@@ -32,7 +32,7 @@ The ``nfp`` driver reports the following versions
- Description
* - ``board.id``
- fixed
- - Part number identifying the board design
+ - Identifier of the board design
* - ``board.rev``
- fixed
- Revision of the board design
@@ -42,6 +42,9 @@ The ``nfp`` driver reports the following versions
* - ``board.model``
- fixed
- Model name of the board design
+ * - ``board.part_number``
+ - fixed
+ - Part number of the board and its components
* - ``fw.bundle_id``
- stored, running
- Firmware bundle id
diff --git a/Documentation/networking/devlink/octeontx2.rst b/Documentation/networking/devlink/octeontx2.rst
index 610de99b728a..84206537aedb 100644
--- a/Documentation/networking/devlink/octeontx2.rst
+++ b/Documentation/networking/devlink/octeontx2.rst
@@ -40,3 +40,40 @@ The ``octeontx2 AF`` driver implements the following driver-specific parameters.
- runtime
- Use to set the quantum which hardware uses for scheduling among transmit queues.
Hardware uses weighted DWRR algorithm to schedule among all transmit queues.
+ * - ``npc_mcam_high_zone_percent``
+ - u8
+ - runtime
+ - Use to set the number of high priority zone entries in NPC MCAM that can be allocated
+ by a user, out of the three priority zone categories high, mid and low.
+ * - ``npc_def_rule_cntr``
+ - bool
+ - runtime
+ - Use to enable or disable hit counters for the default rules in NPC MCAM.
+ Its not guaranteed that counters gets enabled and mapped to all the default rules,
+ since the counters are scarce and driver follows a best effort approach.
+ The default rule serves as the primary packet steering rule for a specific PF or VF,
+ based on its DMAC address which is installed by AF driver as part of its initialization.
+ Sample command to read hit counters for default rule from debugfs is as follows,
+ cat /sys/kernel/debug/cn10k/npc/mcam_rules
+ * - ``nix_maxlf``
+ - u16
+ - runtime
+ - Use to set the maximum number of LFs in NIX hardware block. This would be useful
+ to increase the availability of default resources allocated to enabled LFs like
+ MCAM entries for example.
+
+The ``octeontx2 PF`` driver implements the following driver-specific parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``unicast_filter_count``
+ - u8
+ - runtime
+ - Set the maximum number of unicast filters that can be programmed for
+ the device. This can be used to achieve better device resource
+ utilization, avoiding over consumption of unused MCAM table entries.
diff --git a/Documentation/networking/devlink/sfc.rst b/Documentation/networking/devlink/sfc.rst
index db64a1bd9733..0398d59ea184 100644
--- a/Documentation/networking/devlink/sfc.rst
+++ b/Documentation/networking/devlink/sfc.rst
@@ -5,7 +5,7 @@ sfc devlink support
===================
This document describes the devlink features implemented by the ``sfc``
-device driver for the ef100 device.
+device driver for the ef10 and ef100 devices.
Info versions
=============
@@ -18,6 +18,10 @@ The ``sfc`` driver reports the following versions
* - Name
- Type
- Description
+ * - ``fw.bundle_id``
+ - stored
+ - Version of the firmware "bundle" image that was last used to update
+ multiple components.
* - ``fw.mgmt.suc``
- running
- For boards where the management function is split between multiple
@@ -55,3 +59,13 @@ The ``sfc`` driver reports the following versions
* - ``fw.uefi``
- running
- UEFI driver version (No UNDI support).
+
+Flash Update
+============
+
+The ``sfc`` driver implements support for flash update using the
+``devlink-flash`` interface. It supports updating the device flash using a
+combined flash image ("bundle") that contains multiple components (on ef10,
+typically ``fw.mgmt``, ``fw.app``, ``fw.exprom`` and ``fw.uefi``).
+
+The driver does not support any overwrite mask flags.