Age | Commit message (Collapse) | Author |
|
Whilst we have support for setting fan curves there is no support for
disabling the zero RPM feature. Since the relevant bits are already
present in the OverDriveTable, hook them up to a sysctl setting so users
can influence this behaviour.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3489
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Correct the workload setting in order not to mix the setting
with the end user. Update the workload mask accordingly.
v2: changes as below:
1. the end user can not erase the workload from driver except default workload.
2. always shows the real highest priority workoad to the end user.
3. the real workload mask is combined with driver workload mask and end user workload mask.
v3: apply this to the other ASICs as well.
v4: simplify the code
v5: refine the code based on the review comments.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
always pick the pptable from IFWI on smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add an instance parameter to the existing function dpm_set_vcn_enable()
for future implementation. Re-write all pptable functions accordingly.
v2: Remove duplicated dpm_set_vcn_enable() functions in v1. Instead,
adding instance parameter to existing functions.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The following 3 commits landed in parallel:
commit d7d2688bf4ea ("drm/amd/pm: update workload mask after the setting")
commit 7a1613e47e65 ("drm/amdgpu/smu13: always apply the powersave optimization")
commit 7c210ca5a2d7 ("drm/amdgpu: handle default profile on on devices without fullscreen 3D")
While everything is set correctly, this caused the profile to be
reported incorrectly because both the powersave and fullscreen3d bits
were set in the mask and when the driver prints the profile, it looks
for the first bit set.
Fixes: d7d2688bf4ea ("drm/amd/pm: update workload mask after the setting")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
KASAN reports that the GPU metrics table allocated in
vangogh_tables_init() is not large enough for the memset done in
smu_cmn_init_soft_gpu_metrics(). Condensed report follows:
[ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
[ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
...
[ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
[ 33.861816] Tainted: [W]=WARN
[ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
[ 33.861822] Call Trace:
[ 33.861826] <TASK>
[ 33.861829] dump_stack_lvl+0x66/0x90
[ 33.861838] print_report+0xce/0x620
[ 33.861853] kasan_report+0xda/0x110
[ 33.862794] kasan_check_range+0xfd/0x1a0
[ 33.862799] __asan_memset+0x23/0x40
[ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.867135] dev_attr_show+0x43/0xc0
[ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0
[ 33.867155] seq_read_iter+0x3f8/0x1140
[ 33.867173] vfs_read+0x76c/0xc50
[ 33.867198] ksys_read+0xfb/0x1d0
[ 33.867214] do_syscall_64+0x90/0x160
...
[ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
[ 33.867358] kasan_save_stack+0x33/0x50
[ 33.867364] kasan_save_track+0x17/0x60
[ 33.867367] __kasan_kmalloc+0x87/0x90
[ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
[ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu]
[ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu]
[ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
[ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
[ 33.869608] local_pci_probe+0xda/0x180
[ 33.869614] pci_device_probe+0x43f/0x6b0
Empirically we can confirm that the former allocates 152 bytes for the
table, while the latter memsets the 168 large block.
Root cause appears that when GPU metrics tables for v2_4 parts were added
it was not considered to enlarge the table to fit.
The fix in this patch is rather "brute force" and perhaps later should be
done in a smarter way, by extracting and consolidating the part version to
size logic to a common helper, instead of brute forcing the largest
possible allocation. Nevertheless, for now this works and fixes the out of
bounds write.
v2:
* Drop impossible v3_0 case. (Mario)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 41cec40bc9ba ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Wenyou Yang <WenYou.Yang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Some devices do not support fullscreen 3D.
v2: Make the check generic.
Fixes: 336568de918e ("drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
|
|
We dont need to set the functions to NULL which arent
needed as global structure members are by default
set to zero or NULL for pointers.
Cc: Leo Liu <leo.liu@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove the dummy soft_reset functions for all
ip blocks.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove the dummy wait_for_idle functions for all
ip blocks.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
disable deep sleep during the compute workload for the
potential performance loss on smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update overdrive function on smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update the driver-fw interface file for smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On chips that support it, you can specificy 0 and 0xffff for
min and max and the PMFW will use that to determine the optimal
min and max. This enables optimal performance when the
user manually switches between performance levels using sysfs.
Previously we'd set soft min/max which could limit performance.
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This uses more aggressive hueristics than the the bootup default
profile. On windows the OS has a special fullscreen 3D mode
where this is used. Since we don't have the equivalent on Linux
default to this profile for dGPUs.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1500
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Needed to set the workload type at init time so that
we can apply the navi3x margin optimization.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It can avoid margin issues in some very demanding applications.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fill pcie other end recovery counter to metrics 1.6
v2: Add separate function to check recovery counter support
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update pmfw headers for smuv13.0.6 to version 0xE
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Partially revert
commit 0ca9f757a0e2 ("drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arrays")
The count attribute for these arrays does not get set until
after the arrays are allocated and populated leading to false
UBSAN warnings.
Fixes: 0ca9f757a0e2 ("drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arrays")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3662
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Don't wake up the GPU for reading pm values. Instead, take a runtime
powermanagement ref when trying to read it if and only if the GPU is
already awake.
This avoids spurious wake ups (eg: from applets).
We use pm_runtime_get_if_in_active because we care about "is the GPU awake?"
not about "is the GPU awake and something else prevents suspend?".
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Reading pm values from the GPU shouldn't prevent it to be suspended
by resetting the last active timestamp (eg: if an background app
monitors GPU sensors every second, it would prevent the autosuspend
sequence to trigger).
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of hw_fini.
Also update the ip_block ptr where ever needed as
there were cyclic dependency of hw_fini on suspend
and some followed clean up.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of hw_init.
Also update the ip_block ptr where ever needed as
there were cyclic dependency of hw_init on resume.
v2: squash in isp fix
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of resume.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of suspend.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of wait_for_idle.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Using pm_runtime_resume_and_get over pm_runtime_get_sync is recommended.
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
pm_runtime_put_autosuspend must be called from all exit paths.
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of soft_reset.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the *handle to amdgpu_ip_block ptr for all
functions pointers of late_fini.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We start with the function 'atomctrl_calculate_voltage_evv_on_sclk'
which has been unused since 2016's commit
e805ed83ba1c ("drm/amd/powerplay: delete useless files.")
Remove it.
It was also the last user of the entire fixed point maths library in
ppevvmath.h.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
They're not used.
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update the *handle to amdgpu_ip_block ptr for all
functions pointers of sw_fini.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update the *handle to amdgpu_ip_block ptr for all
functions pointers of sw_init.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update the ptr handle to amdgpu_ip_block ptr in all
the functions of late_init function ptr.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update the handle ptr to amdgpu_ip_block ptr
for all functions pointers on early_init.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This resolves the unchecded return value warning reported by Coverity.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Drop pending_reset flag in gmc block. Instead use init level to
determine which type of init is preferred - in this case MINIMAL.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When SMU IP is disabled by ip_block_mask, driver
should not refer to any dpm/swSMU callback. Instead,
any driver call into swSMU/dpm callback needs to
return error code EOPNOTSUPP.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use metrics 1_6 to report activities per partition
v2: Use separate per instance for different platforms, shared
vcn handled by other fix
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add new gpu_metrics_v1_6 with activities per partition
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use same metric table for APU and Non APU systems
for smu_v_13_0_6 to get metric data based on newer pmfw
versions
v2: Use inline func to check for unified metrics support
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update workload mask after the setting.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3625
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Update PMFW interface headers for updated metrics
table with gfx activity per xcd
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
During init, there can be some collisions on the i2c bus that result in
the EEPROM read failing. This has been mitigated in the PMFW to a
degree, but there is still a small chance that the bus will be busy.
When the read fails during RAS init, that disables page retirement
altogether, which is obviously not ideal. To try to avoid that
situation, set the eeprom_read function to retry once if the first read
fails, specifically for smu_v13_0_6.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
fix the pp_dpm_pcie issue on smu v14.0.2/3 as below:
0: 2.5GT/s, x4 250Mhz
1: 8.0GT/s, x4 616Mhz *
2: 8.0GT/s, x4 1143Mhz *
the middle level can be removed since it is always skipped on
smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
update the features set on smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit bbb05f8a9cd87f5046d05a0c596fddfb714ee457.
This breaks some manual setting of the profile mode in
certain cases.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3600
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Always reprogram the hardware state on init. This ensures
the PMFW state is explicitly programmed and we are not relying
on the default PMFW state.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|