summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-10drm/amdgpu: Simplify cleanup check for FRU sysfsLijo Lazar
FRU info is expected to be non-NULL if FRU sys files are created. Simplify the check. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Add secure display v2 commandJinzhou Su
Add secure display v2 command to support multiple ROI instances per display. v2: fix typo and coding style issue Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd: Invert APU check for amdgpu_device_evict_resources()Mario Limonciello
Resource eviction isn't needed for s3 or s2idle on APUs, but should be run for S4. As amdgpu_device_evict_resources() will be called by prepare notifier adjust logic so that APUs only cover S4. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Link: https://lore.kernel.org/r/20241128032656.2090059-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add "restore" missing variable commentSunil Khatri
add "restore" missing variable in the fucntions sdma_v4_4_2_page_resume and sdma_v4_4_2_inst_start. This fixes the warning: warning: Function parameter or struct member 'restore' not described in 'sdma_v4_4_2_page_resume' warning: Function parameter or struct member 'restore' not described in 'sdma_v4_4_2_inst_start' Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Update the variable name to dma_bufSunil Khatri
Instead of fixing the warning for missing variable its better to update the variable name to match with the style followed in the code. This will fix the below mentioned warning: warning: Function parameter or struct member 'dbuf' not described in 'amdgpu_bo_create_isp_user' warning: Excess function parameter 'dma_buf' description in 'amdgpu_bo_create_isp_user' Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: set UMC PA per NPS mode when PA is 0Tao Zhou
The shift bit of PA varys according to NPS mode due to different address format. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: remove is_mca_add for ras_add_bad_pagesTao Zhou
Remove unnecessary variable and simplify the logic. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modesTao Zhou
All legacy RAS bad pages are generated in NPS1 mode, but new bad page can be generated in any NPS mode, so we can't use retired_page stored on eeprom directly in non-nps1 mode even for legacy data. We need to take different actions for different data, new data can be identified from old data by UMC_CHANNEL_IDX_V2 flag. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Check fence emitted count to identify bad jobsShikang Fan
In SRIOV, when host driver performs MODE 1 reset and notifies FLR to guest driver, there is a small chance that there is no job running on hw but the driver has not updated the pending list yet, causing the driver not respond the FLR request. Modify the has_job_running function to make sure if there is still running job. v2: Use amdgpu_fence_count_emitted to determine job running status. v3: Remove the timeout wait in has_job_running Signed-off-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Shikang Fan <shikang.fan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdkfd: Differentiate logging message for driver oversubscriptionXiaogang Chen
To have user better understand the causes triggering runlist oversubscription. No function change. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: 3.2.311Aric Cyr
This version brings along following fixes: - Add hblank borrowing support - Limit VTotal range to max hw cap minus fp - Correct prefetch calculation - Add option to retrieve detile buffer size - Add support for custom recout_width in SPL - Add disable_ips_in_dpms_off flag for IPS - Enable EASF based on luma taps only - Add a left edge pixel if in YCbCr422 or YCbCr420 and odm Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Add support for custom recout_width in SPLSamson Tam
[WHY] Add support for custom recout_width for mpc combine in SPL [HOW] 1. Rename mpc_combine_h and mpc_combine_v 2. Add flag use_recout_width_aligned to use custom recout_width 3. Create union to use either mpc_num_h_slices or mpc_recout_width_align Reviewed-by: Navid Assadian <navid.assadian@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Add disable_ips_in_dpms_off flag for IPSNicholas Kazlauskas
[WHY] It's possible we still allow IPS2 when all streams are DPMS off but this is unexpected. [HOW] Pass the DM config value into DC so it can use the pure stream count to decide. We will be in 0 streams for S0i3 so this will still allow it for D3. Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Enable EASF based on luma taps onlySamson Tam
[WHY] EASF only applies to luma. Previously both luma and chroma taps were checked to determine whether to enable EASF. [HOW] Only check if luma taps are supported before determining whether to enable EASF or not. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/amdgpu/vcn: Fix kdoc entries for VCN clock/power gating functionsSrinivasan Shanmugam
This commit corrects the descriptors for the vcn_v4_0/v4_0_3/v4_0_5/v5_0_0 _set_clockgating_state and vcn_v4_0/v4_0_3/v4_0_5/v5_0_0 _set_powergating_state functions in the amdgpu driver. The parameter descriptors in the comments were mismatched with the actual function parameters. The non-existent 'handle' parameter has been replaced with the correct 'ip_block' parameter in the comments to accurately reflect the function signatures and to resolving the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1232: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_set_clockgating_state' drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1232: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_set_clockgating_state' drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1263: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1263: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2012: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_set_clockgating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2012: warning: Excess function parameter 'handle' description in 'vcn_v4_0_set_clockgating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2043: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:2043: warning: Excess function parameter 'handle' description in 'vcn_v4_0_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1505: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_5_set_clockgating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1505: warning: Excess function parameter 'handle' description in 'vcn_v4_0_5_set_clockgating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1536: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_5_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:1536: warning: Excess function parameter 'handle' description in 'vcn_v4_0_5_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1629: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v4_0_3_set_powergating_state' drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c:1629: warning: Excess function parameter 'handle' description in 'vcn_v4_0_3_set_powergating_state' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/amdgpu: Add missing kdoc 'inst' parameter in ↵Srinivasan Shanmugam
'smu_dpm_set_power_gate' function This commit adds the missing kdoc parameter descriptor for 'inst' in the smu_dpm_set_power_gate function. The 'inst' parameter, which specifies the instance of the IP block to power gate/ungate. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:359: warning: Function parameter or struct member 'inst' not described in 'smu_dpm_set_power_gate' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: move per inst variables to amdgpu_vcn_instBoyuan Zhang
Move all per instance variables from amdgpu_vcn to amdgpu_vcn_inst. Move adev->vcn.fw[i] from amdgpu_vcn to amdgpu_vcn_inst. Move adev->vcn.vcn_config[i] from amdgpu_vcn to amdgpu_vcn_inst. Move adev->vcn.vcn_codec_disable_mask[i] from amdgpu_vcn to amdgpu_vcn_inst. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: pass ip_block in set_clockgating_stateBoyuan Zhang
Pass ip_block instead of adev in set_clockgating_state() callback functions. Modify set_clockgating_state()for all correspoding ip blocks. v2: remove all changes for is_idle(), remove type casting Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: pass ip_block in set_powergating_stateBoyuan Zhang
Pass ip_block instead of adev in set_powergating_state callback function. Modify set_powergating_state ip functions for all correspoding ip blocks. v2: fix a ip block index error. v3: remove type casting Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add inst to amdgpu_dpm_enable_vcnBoyuan Zhang
Add an instance parameter to amdgpu_dpm_enable_vcn() function, and change all calls from vcn ip functions to add instance argument. vcn generations with only one instance (v1.0, v2.0) always use 0 as instance number. vcn generations with multiple instances (v2.5, v3.0, v4.0, v4.0.3, v4.0.5, v5.0.0) use the actual instance number. v2: remove for-loop in amdgpu_dpm_enable_vcn(), and temporarily move it to vcn ip with multiple instances, in order to keep the exact same logic as before, until further separation in next patch. v3: fix missing prefix Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: add inst to dpm_set_powergating_by_smuBoyuan Zhang
Add an instance parameter to amdgpu_dpm_set_powergating_by_smu() function, and use the instance to call set_powergating_by_smu(). v2: remove duplicated functions. remove for-loop in amdgpu_dpm_set_powergating_by_smu(), and temporarily move it to amdgpu_dpm_enable_vcn(), in order to keep the exact same logic as before, until further separation in next patch. v3: drop SI logic in amdgpu_dpm_enable_vcn(). Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: add inst to set_powergating_by_smuBoyuan Zhang
Add an instance parameter to set_powergating_by_smu() function, and re-write all amd_pm functions accordingly. Then use the instance to call smu_dpm_set_vcn_enable(). v2: remove duplicated functions. remove for-loop in smu_dpm_set_power_gate(), and temporarily move it to to amdgpu_dpm_set_powergating_by_smu(), in order to keep the exact same logic as before, until further separation in next patch. v3: add instance number in error message. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: add inst to smu_dpm_set_vcn_enableBoyuan Zhang
First, add an instance parameter to smu_dpm_set_vcn_enable() function, and calling dpm_set_vcn_enable() with this given instance. Second, modify vcn_gated to be an array, to track the gating status for each vcn instance separately. With these 2 changes, smu_dpm_set_vcn_enable() will check and set the gating status for the given vcn instance ONLY. v2: remove duplicated functions. remove for-loop in dpm_set_vcn_enable(), and temporarily move it to to smu_dpm_set_power_gate(), in order to keep the exact same logic as before, until further separation in next patch. v3: add instance number in error message. v4: declaring i at the top of the function. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: power up or down vcn by instanceBoyuan Zhang
For smu ip with multiple vcn instances (smu 11/13/14), remove all the for loop in dpm_set_vcn_enable() functions. And use the instance argument to power up/down vcn for the given instance only, instead of powering up/down for all vcn instances. v2: remove all duplicated functions in v1. remove for-loop from each ip, and temporarily move to dpm_set_vcn_enable, in order to keep the exact same logic as before, until further separation in the next patch. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: Fix an error handling path in ↵Christophe JAILLET
vega10_enable_se_edc_force_stall_config() In case of error after a amdgpu_gfx_rlc_enter_safe_mode() call, it is not balanced by a corresponding amdgpu_gfx_rlc_exit_safe_mode() call. Add the missing call. Fixes: 9b7b8154cdb8 ("drm/amd/powerplay: added didt support for vega10") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add interface to get die id from memory addressTao Zhou
And implement it for UMC v12_0. The die id is calculated from IPID register in bad page retirement flow, but we don't store it on eeprom and it can be also gotten from physical address. v2: get PA_C4 and PA_R13 from MCA address since they may be cleared in retired page. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add a flag to indicate UMC channel index versionTao Zhou
v1 (legacy way): store channel index within a UMC instance in eeprom v2: store global channel index in eeprom V2: only save the flag on eeprom, clear it after saving. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: save UMC global channel index to eepromTao Zhou
Save the global channel index returned by RAS TA to eeprom. We can get memory physical address by MCA address and channel index. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: support to find RAS bad pages via old TATao Zhou
Old version of RAS TA doesn't support to convert MCA address stored on eeprom to physical address (PA), support to find all bad pages in one memory row by PA with old RAS TA. This approach is only suitable for nps1 mode. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add function to find all memory pages in one physical rowTao Zhou
And the function can be reused across amdgpu driver. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: retire RAS bad pages in different NPS modesTao Zhou
There are some changes in format of memory normalized address per NPS mode, need to adjust bit mapping according to NPS mode. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: store only one RAS bad page record for all pages in one rowTao Zhou
So eeprom space can be saved, compatible with legacy way. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Prefer RAS recovery for scheduler hangLijo Lazar
Before scheduling a recovery due to scheduler/job hang, check if a RAS error is detected. If so, choose RAS recovery to handle the situation. A scheduler/job hang could be the side effect of a RAS error. In such cases, it is required to go through the RAS error recovery process. A RAS error recovery process in certains cases also could avoid a full device device reset. An error state is maintained in RAS context to detect the block affected. Fatal Error state uses unused block id. Set the block id when error is detected. If the interrupt handler detected a poison error, it's not required to look for a fatal error. Skip fatal error checking in such cases. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: do RAS MCA2PA conversion in device init phaseTao Zhou
NPS mode is introduced, the value of memory physical address (PA) related to a MCA address varies per nps mode. We need to rely on MCA address and convert it into PA accroding to nps mode. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add flag to indicate the type of RAS eeprom recordTao Zhou
One UMC MCA address could map to multiply physical address (PA): AMDGPU_RAS_EEPROM_REC_PA: one record store one PA AMDGPU_RAS_EEPROM_REC_MCA: one record store one MCA address, PA is not cared about Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add TA_RAS_INV_NODE valueTao Zhou
We can set UMC node instance to invalid state if we use global channel index, and RAS TA can choose UMC address conversion approach by checking node_inst value. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add return value for convert_ras_err_addrTao Zhou
So upper layer can return failure directly if address conversion fails. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: reduce memory usage for umc_lookup_bad_pages_in_a_rowTao Zhou
The function handles one page in one time, allocating umc.retire_unit bad page records is enough. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: make convert_ras_err_addr visible outside UMC blockTao Zhou
And change some UMC v12 specific functions to generic version, so the code can be shared. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: store PA with column bits cleared for RAS bad pageTao Zhou
So the code can be simplified, and no need to expose the detail of PA format outside address conversion. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: reduce the mmio writes in kiq settingPrike Liang
There's no need to perform the two MMIO writes in the KIQ Setting registers programmed period, and reducing the MMIO writes will save the driver loading time. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu/sdma4.4.2: implement ring reset callback for sdma4.4.2Jiadong Zhu
Implement sdma queue reset callback via SMU interface. v2: Leverage inst_stop/start functions in reset sequence. Use GET_INST for physical SDMA instance. Disable apu for sdma reset. v3: Rephrase error prints. v4: Remove redundant prints. Remove setting PREEMPT registers as soft reset handles it. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: implement dpm sdma reset functionJiadong Zhu
Implement sdma soft reset by sending MSG_ResetSDMA on smu 13.0.6. v2: Add firmware version for the reset message. v3: Add ip version check. Print inst_mask on failure. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: update smu_v13_0_6 smu headerJiadong Zhu
update smu header for sdma soft reset. Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: remove redundant RAS error address coversion codeTao Zhou
Only one interface is responsible for the conversion. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/amdgpu: Add support for isp buffersPratap Nirujogi
Add support to create user BOs with MC address for isp using the dma-buf handle exported for the buffers allocated from system memory in isp driver. Export amdgpu_bo_create_kernel() and amdgpu_bo_free_kernel() as well for isp to allocate GTT internal buffers required for fw to run. Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: simplify RAS page retirement in one memory rowTao Zhou
Take R13 and column bits as a whole for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdkfd: pause autosuspend when creating pddJesse.zhang@amd.com
When using MES creating a pdd will require talking to the GPU to setup the relevant context. The code here forgot to wake up the GPU in case it was in suspend, this causes KVM to EFAULT for passthrough GPU for example. This issue can be masked if the GPU was woken up by other things (e.g. opening the KMS node) first and have not yet gone to sleep. v4: do the allocation of proc_ctx_bo in a lazy fashion when the first queue is created in a process (Felix) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Yunxiang Li <Yunxiang.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-12-10drm/amdgpu: fix when the cleaner shader is emittedChristian König
Emitting the cleaner shader must come after the check if a VM switch is necessary or not. Otherwise we will emit the cleaner shader every time and not just when it is necessary because we switched between applications. This can otherwise crash on gang submit and probably decreases performance quite a bit. v2: squash in fix from Srini (Alex) Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: ee7a846ea27b ("drm/amdgpu: Emit cleaner shader at end of IB submission") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-12-10drm/amdgpu: Fix ISP HW init issuePratap Nirujogi
ISP hw_init is not called with the recent changes related to hw init levels. AMDGPU_INIT_LEVEL_DEFAULT is ignoring the ISP IP block as AMDGPU_IP_BLK_MASK_ALL is derived using incorrect max number of IP blocks. Update AMDGPU_IP_BLK_MASK_ALL to use AMD_IP_BLOCK_TYPE_NUM instead of AMDGPU_MAX_IP_NUM to fix the issue. Fixes: 14f2fe34f5c6 ("drm/amdgpu: Add init levels") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>