summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
AgeCommit message (Collapse)Author
2020-08-14drm/amd/pm: optimize the power related source code layoutEvan Quan
The target is to provide a clear entry point(for power routines). Also this can help to maintain a clear view about the frameworks used on different ASICs. Hopefully all these can make power part more friendly to play with. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14drm/amdgpu: revert "fix system hang issue during GPU reset"Christian König
The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14drm/amd/powerplay: enable swSMU mgpu fan boost supportEvan Quan
Enable mgpu fan boost feature on swSMU routines. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-07drm/amd/powerplay: put VCN/JPEG into PG ungate state before dpm table setup(V3)Evan Quan
As VCN related dpm table setup needs VCN be in PG ungate state. Same logics applies to JPEG. V2: fix paste typo V3: code cosmetic Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Matt Coffin <mcoffin13@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-07drm/amd/powerplay: update swSMU VCN/JPEG PG logicsEvan Quan
Add lock protections and avoid unnecessary actions if the PG state is already the same as required. Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Matt Coffin <mcoffin13@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-06drm/amd/powerplay: add new sysfs interface for retrieving gpu metrics(V2)Evan Quan
A new interface for UMD to retrieve gpu metrics data. V2: rich the documentation Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-04drm/amdgpu/smu: rework i2c adpater registrationAlex Deucher
The i2c init/fini functions just register the i2c adapter. There is no need to call them during hw init/fini. They only need to be called once per driver init/fini. The previous behavior broke runtime pm because we unregistered the i2c adapter during suspend. Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27drm/amd/swsmu: allow asic to handle sensor type by itselfKevin Wang
1. allow asic to handle sensor type by itself. 2. if not, use smu common sensor to handle it. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27drm/amd/powerplay: drop unnecessary message support check(v2)Changfeng
Take back patch:drop unnecessary message support check Because the gpu reset fail problem on renoir can be fixed by: drm/amd/powerplay: skip invalid msg when smu set mp1 state It needs to remove SWSMU_CODE_LAYER_L1 in smu_cmn.h to guard a clear code layer. Signed-off-by: changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27drm/amdkfd: Add thermal throttling SMI eventMukul Joshi
Add support for reporting thermal throttling events through SMI. Also, add a counter to count the number of throttling interrupts observed and report the count in the SMI event message. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27drm/amdgpu: fix system hang issue during GPU resetDennis Li
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-23drm/amd/powerplay: skip invalid msg when smu set mp1 stateLikun Gao
Some asic may not support for some message of set mp1 state. If the return value of smu_send_smc_msg is -EINVAL, that means it failed to send msg to smc as it can not map an valid message for the ASIC. And with that case, smu_set_mp1_state should be skipped as those ASIC was in fact do not support for that. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-22drm/amd/powerplay: remove the dpm checking in the boot sequenceKenneth Feng
It's not necessary to retrieve the power features status when the asic is booted up the first time. This patch can have the features enablement status still checked in suspend/resume case and removed from the first boot up sequence. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-22Revert "drm/amd/powerplay: drop unnecessary message support check"Changfeng
The below 3 messages are not supported on Renoir SMU_MSG_PrepareMp1ForShutdown SMU_MSG_PrepareMp1ForUnload SMU_MSG_PrepareMp1ForReset It needs to revert patch: drm/amd/powerplay: drop unnecessary message support check to avoid set mp1 state fail during gpu reset on renoir. Signed-off-by: changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu/swSMU: remove eeprom from the smu i2c handlers (v2)Alex Deucher
The driver uses it for EEPROM access, but it's just an i2c bus. v2: change the callback name as well. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: tag swSMU code layersEvan Quan
Per designs, the swSMU code is separated into four layers. And the typical calling flow should be like: amdgpu_smu.c -> ${asic}_ppt.c -> smu_v11/12_0.c -> smu_cmn.c. Compile errors will come out for any violations. This can help to prevent cross callings(e.g. amdgpu_smu.c -> ${asic}_ppt.c -> amdgpu_smu.c -> ${asic}_ppt.c) which were common in our code. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: revise the calling flow on OD table updateEvan Quan
This can eliminate the cross callings and maintain clear code layer. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: drop unnecessary message support checkEvan Quan
These messages are known to be supported by all ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move SMC message issuing APIs to smu_cmn.cEvan Quan
Considering they can be shared by all ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move table setting common code to smu_cmn.cEvan Quan
As they are shared by all ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: maximum code sharing around watermarks settingEvan Quan
Maximum code sharing. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move more APIs to smu_cmn.cEvan Quan
Considering they are shared by all ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: common API for disabling all features with exceptionEvan Quan
We are moving to centralize all feature enablement/support checking and setting APIs in smu_cmn.c. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move ppfeature mask setting to smu_cmn.cEvan Quan
Considering they are shared by all ASICs. And we are moving to centralize all feature enablement/support checking and setting APIs in smu_cmn.c. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move dpm feature enablement checking to smu_cmn.cEvan Quan
Considering it is shared by all ASICs and smu_cmn.c should be the right place. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move dpm feature support checking to smu_cmn.cEvan Quan
Considering it is shared by all ASICs and smu_cmn.c should be the right place. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: move clock dpm enablement check to smu_v11/v12Evan Quan
As those APIs of smu_v11/v12 are more widely called. And they need this check also. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: unify swSMU index to asic specific index mappingEvan Quan
By this we can drop redundant code. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: widely share the API for data table retrievingEvan Quan
Considering the data table retrieving can be more widely shared, amdgpu_atombios.c is the right place. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu: add read amdgpu_gfxoff status in debugfsJinzhou.Su
Add interface for SMU12 device, used by UMR. v2: fix code style Signed-off-by: Jinzhou.Su <Jinzhou.Su@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amdgpu/powerplay: add smu support for navy_flounderJiansong Chen
Now navy_flounder will reuse the smu11 driver_if header and ppt functions for sienna_cichlid. Later navy_flounder can maintain its own version if the compatibility is broken. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amd/powerplay: sort the call flow on temperature ranges retrievingEvan Quan
This can help to maintain clear code layer. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amd/powerplay: drop unnecessary wrapper around pcie parameters settingEvan Quan
This can also help to maintain clear code layer. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amd/powerplay: drop unused APIs and parametersEvan Quan
Leftover of previous performance level setting cleanups. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amd/powerplay: update UMD pstate clock settingsEvan Quan
Preparing for coming code sharing around performance level setting. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amd/powerplay: add SMU mode1 resetWenhui Sheng
From PM FW 58.26.0 for sienna cichlid, SMU mode1 reset is support, driver sends PPSMC_MSG_Mode1Reset message to PM FW could trigger this reset. v2: add mode1 reset dpm interface v3: change maro name Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-10drm/amd/powerplay: put dpm frequency setting common code in smu_v11_0.cEvan Quan
As designed the common code shared among all smu v11 ASCIs go to smu_v11_0.c. This helps to maintain clear code layers. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-10drm/amd/powerplay: revise calling chain on retrieving frequency rangeEvan Quan
This helps to maintain clear code layers and drop unnecessary parameter. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-10drm/amd/powerplay: revise calling chain on setting soft limitEvan Quan
This helps to maintain clear code layers and drop unnecessary parameter. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-10drm/amd/powerplay: put setting hard limit common code in smu_v11_0.cEvan Quan
As designed the common code shared among all smu v11 ASCIs go to smu_v11_0.c. This helps to maintain clear code layers. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/powerplay: label internally used symbols as staticNirmoy Das
Used sparse(make C=1) to find these loose ends. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdgpu/sriov: Disable pm for multiple vf sriovEmily Deng
Signed-off-by: Emily Deng <Emily.Deng@amd.com> Ack-by: Monk.liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: maximum code sharing on sensor readingEvan Quan
Move the common code to amdgpu_smu.c instead of having one copy in both smu_v11_0.c and smu_v12_0.c. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: revise the calling chain on sensor readingEvan Quan
Update the calling chain from "amdgpu_smu.c -> ${asic}_ppt.c -> smu_v11/12_0.c -> amdgpu_smu.c (smu_common_read_sensor())" to " "amdgpu_smu.c -> ${asic}_ppt.c -> smu_v11/12_0.c". This can help to maintain clear code layers. More similar changes will be coming. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: drop unnecessary wrapper .populate_smc_tablesEvan Quan
Since .populate_smc_tables is just a wrapper of .set_default_dpm_table. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: drop redundant .set_min_dcefclk_deep_sleep API (v2)Evan Quan
It has exactly the same functionality as .set_deep_sleep_dcefclk. V2: correct the macro name for better trace Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: move maximum sustainable clock retrieving to .hw_initEvan Quan
Since DAL settings come between .hw_init and .late_init of SMU. And DAL needs to know the maximum sustainable clocks. Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-and-Tested-by: Flora Cui <flora.cui@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: drop unused code around power limitEvan Quan
Drop unused APIs, variables and argument. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: simplify the code around setting power limitEvan Quan
Use the cached max/current power limit and move the input check to the top layer. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amd/powerplay: simplify the code around retrieving power limitEvan Quan
Use the cached max/current power limit for other cases except .late_init. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>