author | Hawking Zhang <Hawking.Zhang@amd.com> | 2021-04-16 17:34:13 +0800 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2021-04-20 21:35:45 -0400 |
commit | 53ee6609b42e09f89bf2cdd15a340c236694ecd3 (patch) | |
tree | c7cf82ceedef56a55b750b0cc1f5b9b6bcf58601 /drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | |
parent | 9406d39bb6ef11e8525d7bd9acfcba5708db485b (diff) |
drm/amdgpu: only harvest gcea/mmea error status in arcturus
SDP RdRspStatus/WrRspStatus or the first parity error on
RdRsp data can cause a system fatal error in arcturus.
The GPU freezes in that case.
The driver needs to harvest this error information before
resetting the GPU. Check the error type to avoid harvesting normal
gcea/mmea information.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
index 1a92177c522f..47c8dd9d1c78 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v9_4.c
@@ -1645,9 +1645,15 @@ static void mmhub_v9_4_query_ras_error_status(struct amdgpu_device *adev)
 	for (i = 0; i < ARRAY_SIZE(mmhub_v9_4_err_status_regs); i++) {
 		reg_value =
 			RREG32(SOC15_REG_ENTRY_OFFSET(mmhub_v9_4_err_status_regs[i]));
-		if (reg_value)
+		if (REG_GET_FIELD(reg_value, MMEA0_ERR_STATUS, SDP_RDRSP_STATUS) ||
+		    REG_GET_FIELD(reg_value, MMEA0_ERR_STATUS, SDP_WRRSP_STATUS) ||
+		    REG_GET_FIELD(reg_value, MMEA0_ERR_STATUS, SDP_RDRSP_DATAPARITY_ERROR)) {
+			/* SDP read/write error/parity error in FUE_IS_FATAL mode
+			 * can cause system fatal error in arcturas. Harvest the error
+			 * status before GPU reset */
 			dev_warn(adev->dev, "MMHUB EA err detected at instance: %d, status: 0x%x!\n",
 					i, reg_value);
+		}
 	}
 }
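The patch narrows the warning from "any nonzero status" to "one of three specific fields is set". A minimal standalone sketch of that kind of mask-and-shift field check follows. The `ERR_STATUS__*` masks and shifts, the `GET_FIELD` macro, and `ea_err_status_is_fatal()` are hypothetical names and values chosen for illustration; the real `MMEA0_ERR_STATUS` layout comes from the driver's register headers and may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. The real
 * MMEA0_ERR_STATUS masks and shifts live in the driver's register headers. */
#define ERR_STATUS__SDP_RDRSP_STATUS__SHIFT            0
#define ERR_STATUS__SDP_RDRSP_STATUS_MASK              0x0000000FU
#define ERR_STATUS__SDP_WRRSP_STATUS__SHIFT            4
#define ERR_STATUS__SDP_WRRSP_STATUS_MASK              0x000000F0U
#define ERR_STATUS__SDP_RDRSP_DATAPARITY_ERROR__SHIFT  10
#define ERR_STATUS__SDP_RDRSP_DATAPARITY_ERROR_MASK    0x00000400U

/* Mask-and-shift field extraction, in the spirit of REG_GET_FIELD. */
#define GET_FIELD(value, field) \
	(((value) & ERR_STATUS__##field##_MASK) >> ERR_STATUS__##field##__SHIFT)

/* Returns true only when one of the three checked error fields is nonzero;
 * any other bits in the status word are ignored. */
static bool ea_err_status_is_fatal(uint32_t reg_value)
{
	return GET_FIELD(reg_value, SDP_RDRSP_STATUS) ||
	       GET_FIELD(reg_value, SDP_WRRSP_STATUS) ||
	       GET_FIELD(reg_value, SDP_RDRSP_DATAPARITY_ERROR);
}
```

Under this assumed layout, a status word with only bits outside the three fields set (for example `0x00002000`) no longer triggers the warning, whereas the old `if (reg_value)` test would have reported it.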