author Oded Gabbay <oded.gabbay@gmail.com> 2020-07-05 15:48:34 +0300
committer Oded Gabbay <oded.gabbay@gmail.com> 2020-07-24 20:31:36 +0300
commit c83c4171933bc4ebd147efb6bbdb787b25d1907d
tree 1372f754c868e0b2aa828421bfabfdee6ff66fd9 /drivers/misc/habanalabs/device.c
parent 9158c47e2059967038b19d051a1afd87954fbba4
habanalabs: halt device CPU only upon certain reset

Currently the driver halts the device CPU in the halt engines function, which halts all the engines of the ASIC. The problem is that if later on we stop the reset process (due to inability to clean memory mappings in time), the CPU will remain in halt mode. This creates many issues, such as thermal/power control and FLR handling.

Therefore, move the halting of the device CPU to the very end of the reset process, just before writing to the registers to initiate the reset.

In addition, the driver now needs to send a message to the device F/W to disable it from sending interrupts to the host machine because during halt engines function the driver disables the MSI/MSI-X interrupts.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>

Diffstat (limited to 'drivers/misc/habanalabs/device.c')
-rw-r--r-- drivers/misc/habanalabs/device.c | 16
1 files changed, 16 insertions, 0 deletions
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index 65a5a5c52a48..df709767c7ea 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -838,6 +838,22 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
if (rc)
return 0;
+ if (hard_reset) {
+ /* Disable PCI access from device F/W so it won't send
+ * us additional interrupts. We disable MSI/MSI-X at
+ * the halt_engines function and we can't have the F/W
+ * sending us interrupts after that. We need to disable
+ * the access here because if the device is marked
+ * disabled, the message won't be sent. Also, in case
+ * of heartbeat, the device CPU is marked as disabled
+ * so this message won't be sent
+ */
+ if (hl_fw_send_pci_access_msg(hdev,
+ ARMCP_PACKET_DISABLE_PCI_ACCESS))
+ dev_warn(hdev->dev,
+ "Failed to disable PCI access by F/W\n");
+ }
+
/* This also blocks future CS/VM/JOB completion operations */
hdev->disabled = true;
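
The ordering argument in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct mock_dev` and every `mock_*` function are hypothetical names invented for illustration, not part of the habanalabs driver, and the real reset path contains many more steps than are modelled here. It shows two points from the message: the F/W notification has to go out before the device is marked disabled, and the device CPU is halted only if the reset actually reaches its final step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the device state; NOT the driver's struct hl_device. */
struct mock_dev {
	bool disabled;      /* once set, messages to the F/W are no longer sent */
	bool fw_pci_access; /* F/W may still send interrupts to the host */
	bool msi_enabled;   /* host-side MSI/MSI-X state */
	bool cpu_halted;    /* device CPU state */
};

/* Models the "disable PCI access" message: dropped if the device is disabled. */
static int mock_send_disable_pci_access(struct mock_dev *dev)
{
	assert(dev != NULL);
	if (dev->disabled)
		return -1;
	dev->fw_pci_access = false;
	return 0;
}

/* Models halt_engines: per the commit message it disables MSI/MSI-X. */
static void mock_halt_engines(struct mock_dev *dev)
{
	dev->msi_enabled = false;
}

/*
 * Models a hard reset. mappings_cleaned simulates whether memory mappings
 * were released in time. Returns 0 if the reset completed, -1 if abandoned.
 */
static int mock_hard_reset(struct mock_dev *dev, bool mappings_cleaned)
{
	/* 1. Tell the F/W to stop sending interrupts while we still can.
	 *    A failure is ignored here; the patch only logs a warning.
	 */
	(void)mock_send_disable_pci_access(dev);

	/* 2. Block further operations on the device. */
	dev->disabled = true;

	/* 3. Halt the engines; interrupts are off on the host side from here. */
	mock_halt_engines(dev);

	/* 4. The reset may be abandoned; the CPU must still be running then. */
	if (!mappings_cleaned)
		return -1;

	/* 5. Halt the device CPU at the very end, just before the reset. */
	dev->cpu_halted = true;
	return 0;
}
```

In this model an abandoned reset leaves `cpu_halted` false, so the device CPU can keep handling thermal/power control, while the F/W has already been told not to raise interrupts that the host can no longer receive.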