summaryrefslogtreecommitdiff
path: root/drivers/misc/habanalabs/hwmon.c
AgeCommit message (Collapse)Author
2019-09-05habanalabs: display card name as sensors headerOded Gabbay
To allow the user to use a custom file for the HWMON lm-sensors library per card type, the driver needs to register the HWMON sensors with the specific card type name. The card name is supplied by the F/W running on the device. If the F/W is old and doesn't supply a card name, a default card name is displayed as the sensors group name. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
2019-09-05habanalabs: replace __le32_to_cpu with le32_to_cpuOded Gabbay
In some files the driver uses __le32_to_cpu while in other it uses le32_to_cpu. Replace all __le32_to_cpu instances with le32_to_cpu for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-09-05habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64Oded Gabbay
In some files the code use __cpu_to_le32/64 while in other it use cpu_to_le32/64. Replace all __cpu_to_le32/64 instances with cpu_to_le32/64 for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-02-28habanalabs: fix little-endian<->cpu conversion warningsOded Gabbay
Add __cpu_to_le16/32/64 and __le16/32/64_to_cpu where needed according to sparse. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-28habanalabs: use NULL to initialize array of pointersOded Gabbay
This patch fixes the following sparse warnings: drivers/misc/habanalabs/hwmon.c:20:56: warning: Using plain integer as NULL pointer Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-28habanalabs: disable CPU access on timeoutsOded Gabbay
This patch provides a workaround for a bug in the F/W where the response time for a request from KMD may take more then 100ms. This could cause the queue between KMD and the F/W to get out of sync. The WA is to: 1. Increase the timeout of ALL requests to 1s. 2. In case a request isn't answered in time, mark the state as "cpu_disabled" and prevent sending further requests from KMD to the F/W. This will eventually lead to a heartbeat failure and hard reset of the device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add device reset supportOded Gabbay
This patch adds support for doing various on-the-fly reset of Goya. The driver supports two types of resets: 1. soft-reset 2. hard-reset Soft-reset is done when the device detects a timeout of a command submission that was given to the device. The soft-reset process only resets the engines that are relevant for the submission of compute jobs, i.e. the DMA channels, the TPCs and the MME. The purpose is to bring the device as fast as possible to a working state. Hard-reset is done in several cases: 1. After soft-reset is done but the device is not responding 2. When fatal errors occur inside the device, e.g. ECC error 3. When the driver is removed Hard-reset performs a reset of the entire chip except for the PCI controller and the PLLs. It is a much longer process then soft-reset but it helps to recover the device without the need to reboot the Host. After hard-reset, the driver will restore the max power attribute and in case of manual power management, the frequencies that were set. This patch also adds two entries to the sysfs, which allows the root user to initiate a soft or hard reset. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-18habanalabs: add sysfs and hwmon supportOded Gabbay
This patch add the sysfs and hwmon entries that are exposed by the driver. Goya has several sensors, from various categories such as temperature, voltage, current, etc. The driver exposes those sensors in the standard hwmon mechanism. In addition, the driver exposes a couple of interfaces in sysfs, both for configuration and for providing status of the device or driver. The configuration attributes is for Power Management: - Automatic or manual - Frequency value when moving to high frequency mode - Maximum power the device is allowed to consume The rest of the attributes are read-only and provide the following information: - Versions of the various firmwares running on the device - Contents of the device's EEPROM - The device type (currently only Goya is supported) - PCI address of the device (to allow user-space to connect between /dev/hlX to PCI address) - Status of the device (operational, malfunction, in_reset) - How many processes are open on the device's file Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>