field | value | date |
---|---|---|
author | Riana Tauro <riana.tauro@intel.com> | 2025-08-26 12:04:15 +0530 |
committer | Rodrigo Vivi <rodrigo.vivi@intel.com> | 2025-08-26 10:11:34 -0400 |
commit | 0a2a873d615a39e8a87d3f15285ed888341ddce8 | |
tree | 66cddf5ebdb2d67323b44bf0614033695b745f9f /rust/helpers | |
parent | f646c9f9371b28b8f93e619fe003415f6aaeb416 | |
drm/xe: Add support to handle hardware errors
The Gfx device reports two classes of errors: uncorrectable and
correctable. Depending on the severity, uncorrectable errors are further
classified as Non-Fatal and Fatal.
Correctable and Non-Fatal errors: These errors are reported as MSI. Bits in
the Master Interrupt Register indicate the class of the error.
The source of the error is then read from the Device Error Source
Register.
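The MSI path described above can be sketched in plain C as follows. This is only an illustrative model, not the driver's code: `fake_dev`, `MASTER_ERR_BIT()`, `handle_error_irq()` and the bit positions are hypothetical names and values chosen for the example, and the registers are simulated with an in-memory struct instead of MMIO. It also assumes that a source is acknowledged by clearing the bits that were read.

```c
#include <stdint.h>

/* Error classes delivered through MSI in this sketch. */
enum hw_err_class {
    HW_ERR_CORRECTABLE = 0,
    HW_ERR_NONFATAL    = 1,
    HW_ERR_CLASS_COUNT
};

/* Hypothetical layout: one Master Interrupt Register bit per class. */
#define MASTER_ERR_BIT(cls) (1u << (cls))

/* Stand-in for MMIO: one Device Error Source Register per class. */
struct fake_dev {
    uint32_t err_src[HW_ERR_CLASS_COUNT];
};

static uint32_t read_err_src(struct fake_dev *dev, enum hw_err_class cls)
{
    return dev->err_src[cls];
}

/* Assumption: acknowledging a source clears the bits that were read. */
static void clear_err_src(struct fake_dev *dev, enum hw_err_class cls,
                          uint32_t bits)
{
    dev->err_src[cls] &= ~bits;
}

/*
 * For every class flagged in master_ctl, read the matching error source
 * register, record it in seen[], and acknowledge it.
 * Returns the number of classes that were handled.
 */
static int handle_error_irq(struct fake_dev *dev, uint32_t master_ctl,
                            uint32_t seen[HW_ERR_CLASS_COUNT])
{
    int handled = 0;

    for (int cls = 0; cls < HW_ERR_CLASS_COUNT; cls++) {
        seen[cls] = 0;
        if (!(master_ctl & MASTER_ERR_BIT(cls)))
            continue;

        seen[cls] = read_err_src(dev, (enum hw_err_class)cls);
        clear_err_src(dev, (enum hw_err_class)cls, seen[cls]);
        handled++;
    }
    return handled;
}
```

Only classes whose bit is set in the master value are read, so a Non-Fatal interrupt leaves a pending correctable source untouched until its own bit is seen.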
Fatal errors: These are reported as PCIe errors.
When a PCIe error is asserted, the OS will perform an SBR (Secondary
Bus reset) which causes the driver to reload. The error registers are
sticky and the values are maintained through SBR.
Add basic support to handle these errors.
Bspec: 50875, 53073, 53074, 53075, 53076
v2: Format commit message (Umesh)
v3: fix documentation (Stuart)
Cc: Stuart Summers <stuart.summers@intel.com>
Co-developed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://lore.kernel.org/r/20250826063419.3022216-9-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Diffstat (limited to 'rust/helpers')
0 files changed, 0 insertions, 0 deletions