From e2b67859ab6efd4458bda1baaee20331a367d995 Mon Sep 17 00:00:00 2001 From: Damian Muszynski Date: Fri, 2 Feb 2024 18:53:16 +0800 Subject: crypto: qat - add heartbeat error simulator Add a mechanism that allows to inject a heartbeat error for testing purposes. A new attribute `inject_error` is added to debugfs for each QAT device. Upon a write on this attribute, the driver will inject an error on the device which can then be detected by the heartbeat feature. Errors are breaking the device functionality thus they require a device reset in order to be recovered. This functionality is not compiled by default, to enable it CRYPTO_DEV_QAT_ERROR_INJECTION must be set. Signed-off-by: Damian Muszynski Reviewed-by: Giovanni Cabiddu Reviewed-by: Lucas Segarra Fernandez Reviewed-by: Ahsan Atta Reviewed-by: Markas Rapoportas Signed-off-by: Mun Chun Yep Signed-off-by: Herbert Xu --- Documentation/ABI/testing/debugfs-driver-qat | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) (limited to 'Documentation/ABI/testing/debugfs-driver-qat') diff --git a/Documentation/ABI/testing/debugfs-driver-qat b/Documentation/ABI/testing/debugfs-driver-qat index b2db010d851e..bd6793760f29 100644 --- a/Documentation/ABI/testing/debugfs-driver-qat +++ b/Documentation/ABI/testing/debugfs-driver-qat @@ -81,3 +81,29 @@ Description: (RO) Read returns, for each Acceleration Engine (AE), the number : Number of Compress and Verify (CnV) errors and type of the last CnV error detected by Acceleration Engine N. + +What: /sys/kernel/debug/qat__/heartbeat/inject_error +Date: March 2024 +KernelVersion: 6.8 +Contact: qat-linux@intel.com +Description: (WO) Write to inject an error that simulates an heartbeat + failure. This is to be used for testing purposes. + + After writing this file, the driver stops arbitration on a + random engine and disables the fetching of heartbeat counters. + If a workload is running on the device, a job submitted to the + accelerator might not get a response and a read of the + `heartbeat/status` attribute might report -1, i.e. device + unresponsive. + The error is unrecoverable thus the device must be restarted to + restore its functionality. + + This attribute is available only when the kernel is built with + CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION=y. + + A write of 1 enables error injection. + + The following example shows how to enable error injection:: + + # cd /sys/kernel/debug/qat__ + # echo 1 > heartbeat/inject_error -- cgit