summaryrefslogtreecommitdiff
path: root/Documentation/networking/devlink
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking/devlink')
-rw-r--r--Documentation/networking/devlink/bnxt.rst14
-rw-r--r--Documentation/networking/devlink/devlink-flash.rst93
-rw-r--r--Documentation/networking/devlink/devlink-info.rst144
-rw-r--r--Documentation/networking/devlink/devlink-params.rst2
-rw-r--r--Documentation/networking/devlink/devlink-region.rst14
-rw-r--r--Documentation/networking/devlink/devlink-trap.rst35
-rw-r--r--Documentation/networking/devlink/ice.rst96
-rw-r--r--Documentation/networking/devlink/index.rst2
-rw-r--r--Documentation/networking/devlink/mlx5.rst6
9 files changed, 382 insertions, 24 deletions
diff --git a/Documentation/networking/devlink/bnxt.rst b/Documentation/networking/devlink/bnxt.rst
index 82ef9ec46707..3dfd84ccb1c7 100644
--- a/Documentation/networking/devlink/bnxt.rst
+++ b/Documentation/networking/devlink/bnxt.rst
@@ -51,6 +51,9 @@ The ``bnxt_en`` driver reports the following versions
* - Name
- Type
- Description
+ * - ``board.id``
+ - fixed
+ - Part number identifying the board design
* - ``asic.id``
- fixed
- ASIC design identifier
@@ -63,12 +66,15 @@ The ``bnxt_en`` driver reports the following versions
* - ``fw``
- stored, running
- Overall board firmware version
- * - ``fw.app``
- - stored, running
- - Data path firmware version
* - ``fw.mgmt``
- stored, running
- - Management firmware version
+ - NIC hardware resource management firmware version
+ * - ``fw.mgmt.api``
+ - running
+ - Minimum firmware interface spec version supported between driver and firmware
+ * - ``fw.nsci``
+ - stored, running
+ - General platform management firmware version
* - ``fw.roce``
- stored, running
- RoCE management firmware version
diff --git a/Documentation/networking/devlink/devlink-flash.rst b/Documentation/networking/devlink/devlink-flash.rst
new file mode 100644
index 000000000000..40a87c0222cb
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-flash.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+.. _devlink_flash:
+
+=============
+Devlink Flash
+=============
+
+The ``devlink-flash`` API allows updating device firmware. It replaces the
+older ``ethtool-flash`` mechanism, and doesn't require taking any
+networking locks in the kernel to perform the flash update. Example use::
+
+ $ devlink dev flash pci/0000:05:00.0 file flash-boot.bin
+
+Note that the file name is a path relative to the firmware loading path
+(usually ``/lib/firmware/``). Drivers may send status updates to inform
+user space about the progress of the update operation.
+
+Firmware Loading
+================
+
+Devices which require firmware to operate usually store it in non-volatile
+memory on the board, e.g. flash. Some devices store only basic firmware on
+the board, and the driver loads the rest from disk during probing.
+``devlink-info`` allows users to query firmware information (loaded
+components and versions).
+
+In other cases the device can both store the image on the board, load from
+disk, or automatically flash a new image from disk. The ``fw_load_policy``
+devlink parameter can be used to control this behavior
+(:ref:`Documentation/networking/devlink/devlink-params.rst <devlink_params_generic>`).
+
+On-disk firmware files are usually stored in ``/lib/firmware/``.
+
+Firmware Version Management
+===========================
+
+Drivers are expected to implement ``devlink-flash`` and ``devlink-info``
+functionality, which together allow for implementing vendor-independent
+automated firmware update facilities.
+
+``devlink-info`` exposes the ``driver`` name and three version groups
+(``fixed``, ``running``, ``stored``).
+
+The ``driver`` attribute and ``fixed`` group identify the specific device
+design, e.g. for looking up applicable firmware updates. This is why
+``serial_number`` is not part of the ``fixed`` versions (even though it
+is fixed) - ``fixed`` versions should identify the design, not a single
+device.
+
+``running`` and ``stored`` firmware versions identify the firmware running
+on the device, and firmware which will be activated after reboot or device
+reset.
+
+The firmware update agent is supposed to be able to follow this simple
+algorithm to update firmware contents, regardless of the device vendor:
+
+.. code-block:: sh
+
+ # Get unique HW design identifier
+ $hw_id = devlink-dev-info['fixed']
+
+ # Find out which FW flash we want to use for this NIC
+ $want_flash_vers = some-db-backed.lookup($hw_id, 'flash')
+
+ # Update flash if necessary
+ if $want_flash_vers != devlink-dev-info['stored']:
+ $file = some-db-backed.download($hw_id, 'flash')
+ devlink-dev-flash($file)
+
+ # Find out the expected overall firmware versions
+ $want_fw_vers = some-db-backed.lookup($hw_id, 'all')
+
+ # Update on-disk file if necessary
+ if $want_fw_vers != devlink-dev-info['running']:
+ $file = some-db-backed.download($hw_id, 'disk')
+ write($file, '/lib/firmware/')
+
+ # Try device reset, if available
+ if $want_fw_vers != devlink-dev-info['running']:
+ devlink-reset()
+
+ # Reboot, if reset wasn't enough
+ if $want_fw_vers != devlink-dev-info['running']:
+ reboot()
+
+Note that each reference to ``devlink-dev-info`` in this pseudo-code
+is expected to fetch up-to-date information from the kernel.
+
+For the convenience of identifying firmware files some vendors add
+``bundle_id`` information to the firmware versions. This meta-version covers
+multiple per-component versions and can be used e.g. in firmware file names
+(all component versions could get rather long.)
diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst
index 70981dd1b981..3fe11401b838 100644
--- a/Documentation/networking/devlink/devlink-info.rst
+++ b/Documentation/networking/devlink/devlink-info.rst
@@ -5,34 +5,119 @@ Devlink Info
============
The ``devlink-info`` mechanism enables device drivers to report device
-information in a generic fashion. It is extensible, and enables exporting
-even device or driver specific information.
+(hardware and firmware) information in a standard, extensible fashion.
-devlink supports representing the following types of versions
+The original motivation for the ``devlink-info`` API was twofold:
-.. list-table:: List of version types
+ - making it possible to automate device and firmware management in a fleet
+ of machines in a vendor-independent fashion (see also
+ :ref:`Documentation/networking/devlink/devlink-flash.rst <devlink_flash>`);
+ - name the per component FW versions (as opposed to the crowded ethtool
+ version string).
+
+``devlink-info`` supports reporting multiple types of objects. Reporting driver
+versions is generally discouraged - here, and via any other Linux API.
+
+.. list-table:: List of top level info objects
:widths: 5 95
- * - Type
+ * - Name
- Description
+ * - ``driver``
+ - Name of the currently used device driver, also available through sysfs.
+
+ * - ``serial_number``
+ - Serial number of the device.
+
+ This is usually the serial number of the ASIC, also often available
+ in PCI config space of the device in the *Device Serial Number*
+ capability.
+
+ The serial number should be unique per physical device.
+ Sometimes the serial number of the device is only 48 bits long (the
+ length of the Ethernet MAC address), and since PCI DSN is 64 bits long
+ devices pad or encode additional information into the serial number.
+ One example is adding port ID or PCI interface ID in the extra two bytes.
+ Drivers should make sure to strip or normalize any such padding
+ or interface ID, and report only the part of the serial number
+ which uniquely identifies the hardware. In other words serial number
+ reported for two ports of the same device or on two hosts of
+ a multi-host device should be identical.
+
+ .. note:: ``devlink-info`` API should be extended with a new field
+ if devices want to report board/product serial number (often
+ reported in PCI *Vital Product Data* capability).
+
* - ``fixed``
- - Represents fixed versions, which cannot change. For example,
+ - Group for hardware identifiers, and versions of components
+ which are not field-updatable.
+
+ Versions in this section identify the device design. For example,
component identifiers or the board version reported in the PCI VPD.
+ Data in ``devlink-info`` should be broken into the smallest logical
+ components, e.g. PCI VPD may concatenate various information
+ to form the Part Number string, while in ``devlink-info`` all parts
+ should be reported as separate items.
+
+ This group must not contain any frequently changing identifiers,
+ such as serial numbers. See
+ :ref:`Documentation/networking/devlink/devlink-flash.rst <devlink_flash>`
+ to understand why.
+
* - ``running``
- - Represents the version of the currently running component. For
- example the running version of firmware. These versions generally
- only update after a reboot.
+ - Group for information about currently running software/firmware.
+ These versions often only update after a reboot, sometimes device reset.
+
* - ``stored``
- - Represents the version of a component as stored, such as after a
- flash update. Stored values should update to reflect changes in the
- flash even if a reboot has not yet occurred.
+ - Group for software/firmware versions in device flash.
+
+ Stored values must update to reflect changes in the flash even
+ if reboot has not yet occurred. If device is not capable of updating
+ ``stored`` versions when new software is flashed, it must not report
+ them.
+
+Each version can be reported at most once in each version group. Firmware
+components stored on the flash should feature in both the ``running`` and
+``stored`` sections, if device is capable of reporting ``stored`` versions
+(see :ref:`Documentation/networking/devlink/devlink-flash.rst <devlink_flash>`).
+In case software/firmware components are loaded from the disk (e.g.
+``/lib/firmware``) only the running version should be reported via
+the kernel API.
Generic Versions
================
It is expected that drivers use the following generic names for exporting
-version information. Other information may be exposed using driver-specific
-names, but these should be documented in the driver-specific file.
+version information. If a generic name for a given component doesn't exist yet,
+driver authors should consult existing driver-specific versions and attempt
+reuse. As last resort, if a component is truly unique, using driver-specific
+names is allowed, but these should be documented in the driver-specific file.
+
+All versions should try to use the following terminology:
+
+.. list-table:: List of common version suffixes
+ :widths: 10 90
+
+ * - Name
+ - Description
+ * - ``id``, ``revision``
+ - Identifiers of designs and revision, mostly used for hardware versions.
+
+ * - ``api``
+ - Version of API between components. API items are usually of limited
+ value to the user, and can be inferred from other versions by the vendor,
+ so adding API versions is generally discouraged as noise.
+
+ * - ``bundle_id``
+ - Identifier of a distribution package which was flashed onto the device.
+ This is an attribute of a firmware package which covers multiple versions
+ for ease of managing firmware images (see
+ :ref:`Documentation/networking/devlink/devlink-flash.rst <devlink_flash>`).
+
+ ``bundle_id`` can appear in both ``running`` and ``stored`` versions,
+ but it must not be reported if any of the components covered by the
+ ``bundle_id`` was changed and no longer matches the version from
+ the bundle.
board.id
--------
@@ -52,7 +137,7 @@ ASIC design identifier.
asic.rev
--------
-ASIC design revision.
+ASIC design revision/stepping.
board.manufacture
-----------------
@@ -72,6 +157,12 @@ Control unit firmware version. This firmware is responsible for house
keeping tasks, PHY control etc. but not the packet-by-packet data path
operation.
+fw.mgmt.api
+-----------
+
+Firmware interface specification version of the software interfaces between
+driver and firmware.
+
fw.app
------
@@ -91,10 +182,31 @@ Network Controller Sideband Interface.
fw.psid
-------
-Unique identifier of the firmware parameter set.
+Unique identifier of the firmware parameter set. These are usually
+parameters of a particular board, defined at manufacturing time.
fw.roce
-------
RoCE firmware version which is responsible for handling roce
management.
+
+fw.bundle_id
+------------
+
+Unique identifier of the entire firmware bundle.
+
+Future work
+===========
+
+The following extensions could be useful:
+
+ - product serial number - NIC boards often get labeled with a board serial
+ number rather than ASIC serial number; it'd be useful to add board serial
+ numbers to the API if they can be retrieved from the device;
+
+ - on-disk firmware file names - drivers list the file names of firmware they
+ may need to load onto devices via the ``MODULE_FIRMWARE()`` macro. These,
+ however, are per module, rather than per device. It'd be useful to list
+ the names of firmware files the driver will try to load for a given device,
+ in order of priority.
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index da2f85c0fa21..d075fd090b3d 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -41,6 +41,8 @@ In order for ``driverinit`` parameters to take effect, the driver must
support reloading via the ``devlink-reload`` command. This command will
request a reload of the device driver.
+.. _devlink_params_generic:
+
Generic configuration parameters
================================
The following is a list of generic configuration parameters that drivers may
diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst
index 8b46e8591fe0..04e04d1ff627 100644
--- a/Documentation/networking/devlink/devlink-region.rst
+++ b/Documentation/networking/devlink/devlink-region.rst
@@ -20,6 +20,11 @@ address regions that are otherwise inaccessible to the user.
Regions may also be used to provide an additional way to debug complex error
states, but see also :doc:`devlink-health`
+Regions may optionally support capturing a snapshot on demand via the
+``DEVLINK_CMD_REGION_NEW`` netlink message. A driver wishing to allow
+requested snapshots must implement the ``.snapshot`` callback for the region
+in its ``devlink_region_ops`` structure.
+
example usage
-------------
@@ -29,8 +34,7 @@ example usage
$ devlink region show [ DEV/REGION ]
$ devlink region del DEV/REGION snapshot SNAPSHOT_ID
$ devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ]
- $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ]
- address ADDRESS length length
+ $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length length
# Show all of the exposed regions with region sizes:
$ devlink region show
@@ -40,6 +44,9 @@ example usage
# Delete a snapshot using:
$ devlink region del pci/0000:00:05.0/cr-space snapshot 1
+ # Request an immediate snapshot, if supported by the region
+ $ devlink region new pci/0000:00:05.0/cr-space snapshot 5
+
# Dump a snapshot:
$ devlink region dump pci/0000:00:05.0/fw-health snapshot 1
0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
@@ -48,8 +55,7 @@ example usage
0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5
# Read a specific part of a snapshot:
- $ devlink region read pci/0000:00:05.0/fw-health snapshot 1 address 0
- length 16
+ $ devlink region read pci/0000:00:05.0/fw-health snapshot 1 address 0 length 16
0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
As regions are likely very device or driver specific, no generic regions are
diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst
index 47a429bb8658..a09971c2115c 100644
--- a/Documentation/networking/devlink/devlink-trap.rst
+++ b/Documentation/networking/devlink/devlink-trap.rst
@@ -238,6 +238,12 @@ be added to the following table:
- ``drop``
- Traps NVE packets that the device decided to drop because their overlay
source MAC is multicast
+ * - ``ingress_flow_action_drop``
+ - ``drop``
+ - Traps packets dropped during processing of ingress flow action drop
+ * - ``egress_flow_action_drop``
+ - ``drop``
+ - Traps packets dropped during processing of egress flow action drop
Driver-specific Packet Traps
============================
@@ -277,6 +283,35 @@ narrow. The description of these groups must be added to the following table:
* - ``tunnel_drops``
- Contains packet traps for packets that were dropped by the device during
tunnel encapsulation / decapsulation
+ * - ``acl_drops``
+ - Contains packet traps for packets that were dropped by the device during
+ ACL processing
+
+Packet Trap Policers
+====================
+
+As previously explained, the underlying device can trap certain packets to the
+CPU for processing. In most cases, the underlying device is capable of handling
+packet rates that are several orders of magnitude higher compared to those that
+can be handled by the CPU.
+
+Therefore, in order to prevent the underlying device from overwhelming the CPU,
+devices usually include packet trap policers that are able to police the
+trapped packets to rates that can be handled by the CPU.
+
+The ``devlink-trap`` mechanism allows capable device drivers to register their
+supported packet trap policers with ``devlink``. The device driver can choose
+to associate these policers with supported packet trap groups (see
+:ref:`Generic-Packet-Trap-Groups`) during its initialization, thereby exposing
+its default control plane policy to user space.
+
+Device drivers should allow user space to change the parameters of the policers
+(e.g., rate, burst size) as well as the association between the policers and
+trap groups by implementing the relevant callbacks.
+
+If possible, device drivers should implement a callback that allows user space
+to retrieve the number of packets that were dropped by the policer because its
+configured policy was violated.
Testing
=======
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
new file mode 100644
index 000000000000..5b58fc4e1268
--- /dev/null
+++ b/Documentation/networking/devlink/ice.rst
@@ -0,0 +1,96 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+ice devlink support
+===================
+
+This document describes the devlink features implemented by the ``ice``
+device driver.
+
+Info versions
+=============
+
+The ``ice`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 5 90
+
+ * - Name
+ - Type
+ - Example
+ - Description
+ * - ``board.id``
+ - fixed
+ - K65390-000
+ - The Product Board Assembly (PBA) identifier of the board.
+ * - ``fw.mgmt``
+ - running
+ - 2.1.7
+ - 3-digit version number of the management firmware that controls the
+ PHY, link, etc.
+ * - ``fw.mgmt.api``
+ - running
+ - 1.5
+ - 2-digit version number of the API exported over the AdminQ by the
+ management firmware. Used by the driver to identify what commands
+ are supported.
+ * - ``fw.mgmt.build``
+ - running
+ - 0x305d955f
+ - Unique identifier of the source for the management firmware.
+ * - ``fw.undi``
+ - running
+ - 1.2581.0
+ - Version of the Option ROM containing the UEFI driver. The version is
+ reported in ``major.minor.patch`` format. The major version is
+ incremented whenever a major breaking change occurs, or when the
+ minor version would overflow. The minor version is incremented for
+ non-breaking changes and reset to 1 when the major version is
+ incremented. The patch version is normally 0 but is incremented when
+ a fix is delivered as a patch against an older base Option ROM.
+ * - ``fw.psid.api``
+ - running
+ - 0.80
+ - Version defining the format of the flash contents.
+ * - ``fw.bundle_id``
+ - running
+ - 0x80002ec0
+ - Unique identifier of the firmware image file that was loaded onto
+ the device. Also referred to as the EETRACK identifier of the NVM.
+ * - ``fw.app.name``
+ - running
+ - ICE OS Default Package
+ - The name of the DDP package that is active in the device. The DDP
+ package is loaded by the driver during initialization. Each varation
+ of DDP package shall have a unique name.
+ * - ``fw.app``
+ - running
+ - 1.3.1.0
+ - The version of the DDP package that is active in the device. Note
+ that both the name (as reported by ``fw.app.name``) and version are
+ required to uniquely identify the package.
+
+Regions
+=======
+
+The ``ice`` driver enables access to the contents of the Non Volatile Memory
+flash chip via the ``nvm-flash`` region.
+
+Users can request an immediate capture of a snapshot via the
+``DEVLINK_CMD_REGION_NEW``
+
+.. code:: shell
+
+ $ devlink region new pci/0000:01:00.0/nvm-flash snapshot 1
+ $ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1
+
+ $ devlink region dump pci/0000:01:00.0/nvm-flash snapshot 1
+ 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+ 0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
+ 0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc
+ 0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5
+
+ $ devlink region read pci/0000:01:00.0/nvm-flash snapshot 1 address 0 length 16
+ 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+
+ $ devlink region delete pci/0000:01:00.0/nvm-flash snapshot 1
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 087ff54d53fc..c536db2cc0f9 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -16,6 +16,7 @@ general.
devlink-dpipe
devlink-health
devlink-info
+ devlink-flash
devlink-params
devlink-region
devlink-resource
@@ -32,6 +33,7 @@ parameters, info versions, and other features it supports.
bnxt
ionic
+ ice
mlx4
mlx5
mlxsw
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 629a6e69c036..4e4b97f7971a 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -37,6 +37,12 @@ parameters.
* ``smfs`` Software managed flow steering. In SMFS mode, the HW
steering entities are created and manage through the driver without
firmware intervention.
+ * - ``fdb_large_groups``
+ - u32
+ - driverinit
+ - Control the number of large groups (size > 1) in the FDB table.
+
+ * The default value is 15, and the range is between 1 and 1024.
The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD``