summaryrefslogtreecommitdiff
path: root/Documentation/ABI/testing/sysfs-bus-pci-devices-aer
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/ABI/testing/sysfs-bus-pci-devices-aer')
-rw-r--r--Documentation/ABI/testing/sysfs-bus-pci-devices-aer163
1 files changed, 163 insertions, 0 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer
new file mode 100644
index 000000000000..5ed284523956
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer
@@ -0,0 +1,163 @@
+PCIe Device AER statistics
+--------------------------
+
+These attributes show up under all the devices that are AER capable. These
+statistical counters indicate the errors "as seen/reported by the device".
+Note that this may mean that if an endpoint is causing problems, the AER
+counters may increment at its link partner (e.g. root port) because the
+errors may be "seen" / reported by the link partner and not the
+problematic endpoint itself (which may report all counters as 0 as it never
+saw any problems).
+
+What: /sys/bus/pci/devices/<dev>/aer_dev_correctable
+Date: July 2018
+KernelVersion: 4.19.0
+Contact: linux-pci@vger.kernel.org, rajatja@google.com
+Description: List of correctable errors seen and reported by this
+ PCI device using ERR_COR. Note that since multiple errors may
+ be reported using a single ERR_COR message, thus
+ TOTAL_ERR_COR at the end of the file may not match the actual
+ total of all the errors in the file. Sample output::
+
+ localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable
+ Receiver Error 2
+ Bad TLP 0
+ Bad DLLP 0
+ RELAY_NUM Rollover 0
+ Replay Timer Timeout 0
+ Advisory Non-Fatal 0
+ Corrected Internal Error 0
+ Header Log Overflow 0
+ TOTAL_ERR_COR 2
+
+What: /sys/bus/pci/devices/<dev>/aer_dev_fatal
+Date: July 2018
+KernelVersion: 4.19.0
+Contact: linux-pci@vger.kernel.org, rajatja@google.com
+Description: List of uncorrectable fatal errors seen and reported by this
+ PCI device using ERR_FATAL. Note that since multiple errors may
+ be reported using a single ERR_FATAL message, thus
+ TOTAL_ERR_FATAL at the end of the file may not match the actual
+ total of all the errors in the file. Sample output::
+
+ localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal
+ Undefined 0
+ Data Link Protocol 0
+ Surprise Down Error 0
+ Poisoned TLP 0
+ Flow Control Protocol 0
+ Completion Timeout 0
+ Completer Abort 0
+ Unexpected Completion 0
+ Receiver Overflow 0
+ Malformed TLP 0
+ ECRC 0
+ Unsupported Request 0
+ ACS Violation 0
+ Uncorrectable Internal Error 0
+ MC Blocked TLP 0
+ AtomicOp Egress Blocked 0
+ TLP Prefix Blocked Error 0
+ TOTAL_ERR_FATAL 0
+
+What: /sys/bus/pci/devices/<dev>/aer_dev_nonfatal
+Date: July 2018
+KernelVersion: 4.19.0
+Contact: linux-pci@vger.kernel.org, rajatja@google.com
+Description: List of uncorrectable nonfatal errors seen and reported by this
+ PCI device using ERR_NONFATAL. Note that since multiple errors
+ may be reported using a single ERR_FATAL message, thus
+ TOTAL_ERR_NONFATAL at the end of the file may not match the
+ actual total of all the errors in the file. Sample output::
+
+ localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal
+ Undefined 0
+ Data Link Protocol 0
+ Surprise Down Error 0
+ Poisoned TLP 0
+ Flow Control Protocol 0
+ Completion Timeout 0
+ Completer Abort 0
+ Unexpected Completion 0
+ Receiver Overflow 0
+ Malformed TLP 0
+ ECRC 0
+ Unsupported Request 0
+ ACS Violation 0
+ Uncorrectable Internal Error 0
+ MC Blocked TLP 0
+ AtomicOp Egress Blocked 0
+ TLP Prefix Blocked Error 0
+ TOTAL_ERR_NONFATAL 0
+
+PCIe Rootport AER statistics
+----------------------------
+
+These attributes show up under only the rootports (or root complex event
+collectors) that are AER capable. These indicate the number of error messages as
+"reported to" the rootport. Please note that the rootports also transmit
+(internally) the ERR_* messages for errors seen by the internal rootport PCI
+device, so these counters include them and are thus cumulative of all the error
+messages on the PCI hierarchy originating at that root port.
+
+What: /sys/bus/pci/devices/<dev>/aer_rootport_total_err_cor
+Date: July 2018
+KernelVersion: 4.19.0
+Contact: linux-pci@vger.kernel.org, rajatja@google.com
+Description: Total number of ERR_COR messages reported to rootport.
+
+What: /sys/bus/pci/devices/<dev>/aer_rootport_total_err_fatal
+Date: July 2018
+KernelVersion: 4.19.0
+Contact: linux-pci@vger.kernel.org, rajatja@google.com
+Description: Total number of ERR_FATAL messages reported to rootport.
+
+What: /sys/bus/pci/devices/<dev>/aer_rootport_total_err_nonfatal
+Date: July 2018
+KernelVersion: 4.19.0
+Contact: linux-pci@vger.kernel.org, rajatja@google.com
+Description: Total number of ERR_NONFATAL messages reported to rootport.
+
+PCIe AER ratelimits
+-------------------
+
+These attributes show up under all the devices that are AER capable.
+They represent configurable ratelimits of logs per error type.
+
+See Documentation/PCI/pcieaer-howto.rst for more info on ratelimits.
+
+What: /sys/bus/pci/devices/<dev>/aer/correctable_ratelimit_interval_ms
+Date: May 2025
+KernelVersion: 6.16.0
+Contact: linux-pci@vger.kernel.org
+Description: Writing 0 disables AER correctable error log ratelimiting.
+ Writing a positive value sets the ratelimit interval in ms.
+ Default is DEFAULT_RATELIMIT_INTERVAL (5000 ms).
+
+What: /sys/bus/pci/devices/<dev>/aer/correctable_ratelimit_burst
+Date: May 2025
+KernelVersion: 6.16.0
+Contact: linux-pci@vger.kernel.org
+Description: Ratelimit burst for correctable error logs. Writing a value
+ changes the number of errors (burst) allowed per interval
+ before ratelimiting. Reading gets the current ratelimit
+ burst. Default is DEFAULT_RATELIMIT_BURST (10).
+
+What: /sys/bus/pci/devices/<dev>/aer/nonfatal_ratelimit_interval_ms
+Date: May 2025
+KernelVersion: 6.16.0
+Contact: linux-pci@vger.kernel.org
+Description: Writing 0 disables AER non-fatal uncorrectable error log
+ ratelimiting. Writing a positive value sets the ratelimit
+ interval in ms. Default is DEFAULT_RATELIMIT_INTERVAL
+ (5000 ms).
+
+What: /sys/bus/pci/devices/<dev>/aer/nonfatal_ratelimit_burst
+Date: May 2025
+KernelVersion: 6.16.0
+Contact: linux-pci@vger.kernel.org
+Description: Ratelimit burst for non-fatal uncorrectable error logs.
+ Writing a value changes the number of errors (burst)
+ allowed per interval before ratelimiting. Reading gets the
+ current ratelimit burst. Default is DEFAULT_RATELIMIT_BURST
+ (10).