diff options
Diffstat (limited to 'Documentation/networking/devlink')
-rw-r--r-- | Documentation/networking/devlink/bnxt.rst | 2 | ||||
-rw-r--r-- | Documentation/networking/devlink/devlink-info.rst | 9 | ||||
-rw-r--r-- | Documentation/networking/devlink/devlink-port.rst | 33 | ||||
-rw-r--r-- | Documentation/networking/devlink/devlink-region.rst | 2 | ||||
-rw-r--r-- | Documentation/networking/devlink/devlink-trap.rst | 2 | ||||
-rw-r--r-- | Documentation/networking/devlink/hns3.rst | 5 | ||||
-rw-r--r-- | Documentation/networking/devlink/ice.rst | 83 | ||||
-rw-r--r-- | Documentation/networking/devlink/index.rst | 1 | ||||
-rw-r--r-- | Documentation/networking/devlink/ixgbe.rst | 171 | ||||
-rw-r--r-- | Documentation/networking/devlink/mlx5.rst | 7 | ||||
-rw-r--r-- | Documentation/networking/devlink/nfp.rst | 5 | ||||
-rw-r--r-- | Documentation/networking/devlink/octeontx2.rst | 37 | ||||
-rw-r--r-- | Documentation/networking/devlink/sfc.rst | 16 |
13 files changed, 369 insertions, 4 deletions
diff --git a/Documentation/networking/devlink/bnxt.rst b/Documentation/networking/devlink/bnxt.rst index a4fb27663cd6..9a8b3d76d11f 100644 --- a/Documentation/networking/devlink/bnxt.rst +++ b/Documentation/networking/devlink/bnxt.rst @@ -24,6 +24,8 @@ Parameters - Permanent * - ``enable_remote_dev_reset`` - Runtime + * - ``enable_roce`` + - Permanent The ``bnxt`` driver also implements the following driver-specific parameters. diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst index 1242b0e6826b..dd6adc4d0559 100644 --- a/Documentation/networking/devlink/devlink-info.rst +++ b/Documentation/networking/devlink/devlink-info.rst @@ -86,6 +86,10 @@ In case software/firmware components are loaded from the disk (e.g. ``/lib/firmware``) only the running version should be reported via the kernel API. +Please note that any security versions reported via devlink are purely +informational. Devlink does not use a secure channel to communicate with +the device. + Generic Versions ================ @@ -146,6 +150,11 @@ board.manufacture An identifier of the company or the facility which produced the part. +board.part_number +----------------- + +Part number of the board and its components. + fw -- diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst index 562f46b41274..9d22d41a7cd1 100644 --- a/Documentation/networking/devlink/devlink-port.rst +++ b/Documentation/networking/devlink/devlink-port.rst @@ -134,6 +134,9 @@ Users may also set the IPsec crypto capability of the function using Users may also set the IPsec packet capability of the function using `devlink port function set ipsec_packet` command. +Users may also set the maximum IO event queues of the function +using `devlink port function set max_io_eqs` command. + Function attributes =================== @@ -295,6 +298,36 @@ policy is processed in software by the kernel. function: hw_addr 00:00:00:00:00:00 ipsec_packet enabled +Maximum IO events queues setup +------------------------------ +When user sets maximum number of IO event queues for a SF or +a VF, such function driver is limited to consume only enforced +number of IO event queues. + +IO event queues deliver events related to IO queues, including network +device transmit and receive queues (txq and rxq) and RDMA Queue Pairs (QPs). +For example, the number of netdevice channels and RDMA device completion +vectors are derived from the function's IO event queues. Usually, the number +of interrupt vectors consumed by the driver is limited by the number of IO +event queues per device, as each of the IO event queues is connected to an +interrupt vector. + +- Get maximum IO event queues of the VF device:: + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10 + +- Set maximum IO event queues of the VF device:: + + $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32 + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32 + Subfunction ============ diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst index 9232cd7da301..5d0b68f752c0 100644 --- a/Documentation/networking/devlink/devlink-region.rst +++ b/Documentation/networking/devlink/devlink-region.rst @@ -49,7 +49,7 @@ example usage $ devlink region show [ DEV/REGION ] $ devlink region del DEV/REGION snapshot SNAPSHOT_ID $ devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ] - $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length length + $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length LENGTH # Show all of the exposed regions with region sizes: $ devlink region show diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst index 2c14dfe69b3a..5885e21e2212 100644 --- a/Documentation/networking/devlink/devlink-trap.rst +++ b/Documentation/networking/devlink/devlink-trap.rst @@ -451,7 +451,7 @@ be added to the following table: * - ``udp_parsing`` - ``drop`` - Traps packets dropped due to an error in the UDP header parsing. - This packet trap could include checksum errorrs, an improper UDP + This packet trap could include checksum errors, an improper UDP length detected (smaller than 8 bytes) or detection of header truncation. * - ``tcp_parsing`` diff --git a/Documentation/networking/devlink/hns3.rst b/Documentation/networking/devlink/hns3.rst index 4562a6e4782f..72bc1b9f3785 100644 --- a/Documentation/networking/devlink/hns3.rst +++ b/Documentation/networking/devlink/hns3.rst @@ -23,3 +23,8 @@ The ``hns3`` driver reports the following versions * - ``fw`` - running - Used to represent the firmware version. + * - ``fw.scc`` + - running + - Used to represent the Soft Congestion Control (SSC) firmware version. + SCC is a firmware component which provides multiple RDMA congestion + control algorithms, including DCQCN. diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst index 7f30ebd5debb..792e9f8c846a 100644 --- a/Documentation/networking/devlink/ice.rst +++ b/Documentation/networking/devlink/ice.rst @@ -11,6 +11,7 @@ Parameters ========== .. list-table:: Generic parameters implemented + :widths: 5 5 90 * - Name - Mode @@ -21,6 +22,88 @@ Parameters * - ``enable_iwarp`` - runtime - mutually exclusive with ``enable_roce`` + * - ``tx_scheduling_layers`` + - permanent + - The ice hardware uses hierarchical scheduling for Tx with a fixed + number of layers in the scheduling tree. Each of them are decision + points. Root node represents a port, while all the leaves represent + the queues. This way of configuring the Tx scheduler allows features + like DCB or devlink-rate (documented below) to configure how much + bandwidth is given to any given queue or group of queues, enabling + fine-grained control because scheduling parameters can be configured + at any given layer of the tree. + + The default 9-layer tree topology was deemed best for most workloads, + as it gives an optimal ratio of performance to configurability. However, + for some specific cases, this 9-layer topology might not be desired. + One example would be sending traffic to queues that are not a multiple + of 8. Because the maximum radix is limited to 8 in 9-layer topology, + the 9th queue has a different parent than the rest, and it's given + more bandwidth credits. This causes a problem when the system is + sending traffic to 9 queues: + + | tx_queue_0_packets: 24163396 + | tx_queue_1_packets: 24164623 + | tx_queue_2_packets: 24163188 + | tx_queue_3_packets: 24163701 + | tx_queue_4_packets: 24163683 + | tx_queue_5_packets: 24164668 + | tx_queue_6_packets: 23327200 + | tx_queue_7_packets: 24163853 + | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th + + To address this need, you can switch to a 5-layer topology, which + changes the maximum topology radix to 512. With this enhancement, + the performance characteristic is equal as all queues can be assigned + to the same parent in the tree. The obvious drawback of this solution + is a lower configuration depth of the tree. + + Use the ``tx_scheduling_layer`` parameter with the devlink command + to change the transmit scheduler topology. To use 5-layer topology, + use a value of 5. For example: + $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers + value 5 cmode permanent + Use a value of 9 to set it back to the default value. + + You must do PCI slot powercycle for the selected topology to take effect. + + To verify that value has been set: + $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers + * - ``msix_vec_per_pf_max`` + - driverinit + - Set the max MSI-X that can be used by the PF, rest can be utilized for + SRIOV. The range is from min value set in msix_vec_per_pf_min to + 2k/number of ports. + * - ``msix_vec_per_pf_min`` + - driverinit + - Set the min MSI-X that will be used by the PF. This value inform how many + MSI-X will be allocated statically. The range is from 2 to value set + in msix_vec_per_pf_max. + +.. list-table:: Driver specific parameters implemented + :widths: 5 5 90 + + * - Name + - Mode + - Description + * - ``local_forwarding`` + - runtime + - Controls loopback behavior by tuning scheduler bandwidth. + It impacts all kinds of functions: physical, virtual and + subfunctions. + Supported values are: + + ``enabled`` - loopback traffic is allowed on port + + ``disabled`` - loopback traffic is not allowed on this port + + ``prioritized`` - loopback traffic is prioritized on this port + + Default value of ``local_forwarding`` parameter is ``enabled``. + ``prioritized`` provides ability to adjust loopback traffic rate to increase + one port capacity at cost of the another. User needs to disable + local forwarding on one of the ports in order have increased capacity + on the ``prioritized`` port. Info versions ============= diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index 948c8c44e233..8319f43b5933 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -84,6 +84,7 @@ parameters, info versions, and other features it supports. i40e ionic ice + ixgbe mlx4 mlx5 mlxsw diff --git a/Documentation/networking/devlink/ixgbe.rst b/Documentation/networking/devlink/ixgbe.rst new file mode 100644 index 000000000000..c27d1436c70e --- /dev/null +++ b/Documentation/networking/devlink/ixgbe.rst @@ -0,0 +1,171 @@ +.. SPDX-License-Identifier: GPL-2.0 + +===================== +ixgbe devlink support +===================== + +This document describes the devlink features implemented by the ``ixgbe`` +device driver. + +Info versions +============= + +Any of the versions dealing with the security presented by ``devlink-info`` +is purely informational. Devlink does not use a secure channel to communicate +with the device. + +The ``ixgbe`` driver reports the following versions + +.. list-table:: devlink info versions implemented + :widths: 5 5 5 90 + + * - Name + - Type + - Example + - Description + * - ``board.id`` + - fixed + - H49289-000 + - The Product Board Assembly (PBA) identifier of the board. + * - ``fw.undi`` + - running + - 1.1937.0 + - Version of the Option ROM containing the UEFI driver. The version is + reported in ``major.minor.patch`` format. The major version is + incremented whenever a major breaking change occurs, or when the + minor version would overflow. The minor version is incremented for + non-breaking changes and reset to 1 when the major version is + incremented. The patch version is normally 0 but is incremented when + a fix is delivered as a patch against an older base Option ROM. + * - ``fw.undi.srev`` + - running + - 4 + - Number indicating the security revision of the Option ROM. + * - ``fw.bundle_id`` + - running + - 0x80000d0d + - Unique identifier of the firmware image file that was loaded onto + the device. Also referred to as the EETRACK identifier of the NVM. + * - ``fw.mgmt.api`` + - running + - 1.5.1 + - 3-digit version number (major.minor.patch) of the API exported over + the AdminQ by the management firmware. Used by the driver to + identify what commands are supported. Historical versions of the + kernel only displayed a 2-digit version number (major.minor). + * - ``fw.mgmt.build`` + - running + - 0x305d955f + - Unique identifier of the source for the management firmware. + * - ``fw.mgmt.srev`` + - running + - 3 + - Number indicating the security revision of the firmware. + * - ``fw.psid.api`` + - running + - 0.80 + - Version defining the format of the flash contents. + * - ``fw.netlist`` + - running + - 1.1.2000-6.7.0 + - The version of the netlist module. This module defines the device's + Ethernet capabilities and default settings, and is used by the + management firmware as part of managing link and device + connectivity. + * - ``fw.netlist.build`` + - running + - 0xee16ced7 + - The first 4 bytes of the hash of the netlist module contents. + +Flash Update +============ + +The ``ixgbe`` driver implements support for flash update using the +``devlink-flash`` interface. It supports updating the device flash using a +combined flash image that contains the ``fw.mgmt``, ``fw.undi``, and +``fw.netlist`` components. + +.. list-table:: List of supported overwrite modes + :widths: 5 95 + + * - Bits + - Behavior + * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS`` + - Do not preserve settings stored in the flash components being + updated. This includes overwriting the port configuration that + determines the number of physical functions the device will + initialize with. + * - ``DEVLINK_FLASH_OVERWRITE_SETTINGS`` and ``DEVLINK_FLASH_OVERWRITE_IDENTIFIERS`` + - Do not preserve either settings or identifiers. Overwrite everything + in the flash with the contents from the provided image, without + performing any preservation. This includes overwriting device + identifying fields such as the MAC address, Vital product Data (VPD) area, + and device serial number. It is expected that this combination be used with an + image customized for the specific device. + +Reload +====== + +The ``ixgbe`` driver supports activating new firmware after a flash update +using ``DEVLINK_CMD_RELOAD`` with the ``DEVLINK_RELOAD_ACTION_FW_ACTIVATE`` +action. + +.. code:: shell + + $ devlink dev reload pci/0000:01:00.0 reload action fw_activate + +The new firmware is activated by issuing a device specific Embedded +Management Processor reset which requests the device to reset and reload the +EMP firmware image. + +The driver does not currently support reloading the driver via +``DEVLINK_RELOAD_ACTION_DRIVER_REINIT``. + +Regions +======= + +The ``ixgbe`` driver implements the following regions for accessing internal +device data. + +.. list-table:: regions implemented + :widths: 15 85 + + * - Name + - Description + * - ``nvm-flash`` + - The contents of the entire flash chip, sometimes referred to as + the device's Non Volatile Memory. + * - ``shadow-ram`` + - The contents of the Shadow RAM, which is loaded from the beginning + of the flash. Although the contents are primarily from the flash, + this area also contains data generated during device boot which is + not stored in flash. + * - ``device-caps`` + - The contents of the device firmware's capabilities buffer. Useful to + determine the current state and configuration of the device. + +Both the ``nvm-flash`` and ``shadow-ram`` regions can be accessed without a +snapshot. The ``device-caps`` region requires a snapshot as the contents are +sent by firmware and can't be split into separate reads. + +Users can request an immediate capture of a snapshot for all three regions +via the ``DEVLINK_CMD_REGION_NEW`` command. + +.. code:: shell + + $ devlink region show + pci/0000:01:00.0/nvm-flash: size 10485760 snapshot [] max 1 + pci/0000:01:00.0/device-caps: size 4096 snapshot [] max 10 + + $ devlink region new pci/0000:01:00.0/nvm-flash snapshot 1 + + $ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1 + 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30 + 0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8 + 0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc + 0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5 + + $ devlink region read pci/0000:01:00.0/nvm-flash snapshot 1 address 0 length 16 + 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30 + + $ devlink region delete pci/0000:01:00.0/device-caps snapshot 1 diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst index 456985407475..7febe0aecd53 100644 --- a/Documentation/networking/devlink/mlx5.rst +++ b/Documentation/networking/devlink/mlx5.rst @@ -53,6 +53,9 @@ parameters. * ``smfs`` Software managed flow steering. In SMFS mode, the HW steering entities are created and manage through the driver without firmware intervention. + * ``hmfs`` Hardware managed flow steering. In HMFS mode, the driver + is configuring steering rules directly to the HW using Work Queues with + a special new type of WQE (Work Queue Element). SMFS mode is faster and provides better rule insertion rate compared to default DMFS mode. @@ -277,6 +280,10 @@ Description of the vnic counters: number of packets handled by the VNIC experiencing unexpected steering failure (at any point in steering flow owned by the VNIC, including the FDB for the eswitch owner). +- icm_consumption + amount of Interconnect Host Memory (ICM) consumed by the vnic in + granularity of 4KB. ICM is host memory allocated by SW upon HCA request + and is used for storing data structures that control HCA operation. User commands examples: diff --git a/Documentation/networking/devlink/nfp.rst b/Documentation/networking/devlink/nfp.rst index a1717db0dfcc..3093642bdae4 100644 --- a/Documentation/networking/devlink/nfp.rst +++ b/Documentation/networking/devlink/nfp.rst @@ -32,7 +32,7 @@ The ``nfp`` driver reports the following versions - Description * - ``board.id`` - fixed - - Part number identifying the board design + - Identifier of the board design * - ``board.rev`` - fixed - Revision of the board design @@ -42,6 +42,9 @@ The ``nfp`` driver reports the following versions * - ``board.model`` - fixed - Model name of the board design + * - ``board.part_number`` + - fixed + - Part number of the board and its components * - ``fw.bundle_id`` - stored, running - Firmware bundle id diff --git a/Documentation/networking/devlink/octeontx2.rst b/Documentation/networking/devlink/octeontx2.rst index 610de99b728a..84206537aedb 100644 --- a/Documentation/networking/devlink/octeontx2.rst +++ b/Documentation/networking/devlink/octeontx2.rst @@ -40,3 +40,40 @@ The ``octeontx2 AF`` driver implements the following driver-specific parameters. - runtime - Use to set the quantum which hardware uses for scheduling among transmit queues. Hardware uses weighted DWRR algorithm to schedule among all transmit queues. + * - ``npc_mcam_high_zone_percent`` + - u8 + - runtime + - Use to set the number of high priority zone entries in NPC MCAM that can be allocated + by a user, out of the three priority zone categories high, mid and low. + * - ``npc_def_rule_cntr`` + - bool + - runtime + - Use to enable or disable hit counters for the default rules in NPC MCAM. + Its not guaranteed that counters gets enabled and mapped to all the default rules, + since the counters are scarce and driver follows a best effort approach. + The default rule serves as the primary packet steering rule for a specific PF or VF, + based on its DMAC address which is installed by AF driver as part of its initialization. + Sample command to read hit counters for default rule from debugfs is as follows, + cat /sys/kernel/debug/cn10k/npc/mcam_rules + * - ``nix_maxlf`` + - u16 + - runtime + - Use to set the maximum number of LFs in NIX hardware block. This would be useful + to increase the availability of default resources allocated to enabled LFs like + MCAM entries for example. + +The ``octeontx2 PF`` driver implements the following driver-specific parameters. + +.. list-table:: Driver-specific parameters implemented + :widths: 5 5 5 85 + + * - Name + - Type + - Mode + - Description + * - ``unicast_filter_count`` + - u8 + - runtime + - Set the maximum number of unicast filters that can be programmed for + the device. This can be used to achieve better device resource + utilization, avoiding over consumption of unused MCAM table entries. diff --git a/Documentation/networking/devlink/sfc.rst b/Documentation/networking/devlink/sfc.rst index db64a1bd9733..0398d59ea184 100644 --- a/Documentation/networking/devlink/sfc.rst +++ b/Documentation/networking/devlink/sfc.rst @@ -5,7 +5,7 @@ sfc devlink support =================== This document describes the devlink features implemented by the ``sfc`` -device driver for the ef100 device. +device driver for the ef10 and ef100 devices. Info versions ============= @@ -18,6 +18,10 @@ The ``sfc`` driver reports the following versions * - Name - Type - Description + * - ``fw.bundle_id`` + - stored + - Version of the firmware "bundle" image that was last used to update + multiple components. * - ``fw.mgmt.suc`` - running - For boards where the management function is split between multiple @@ -55,3 +59,13 @@ The ``sfc`` driver reports the following versions * - ``fw.uefi`` - running - UEFI driver version (No UNDI support). + +Flash Update +============ + +The ``sfc`` driver implements support for flash update using the +``devlink-flash`` interface. It supports updating the device flash using a +combined flash image ("bundle") that contains multiple components (on ef10, +typically ``fw.mgmt``, ``fw.app``, ``fw.exprom`` and ``fw.uefi``). + +The driver does not support any overwrite mask flags. |