summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/pm/swsmu/smu11
AgeCommit message (Collapse)Author
2025-05-13drm/amd/pm: Remove remainder of mode2_reset_is_supportDr. David Alan Gilbert
The previous patch removed smu_mode2_reset_is_support() which was the only function to call through the mode2_reset_is_support() method pointer. Remove the remaining functions that were assigned to it and the pointer itself. See discussion at: https://lore.kernel.org/all/DM4PR12MB5165D85BD85BC8FC8BF7A3B48E88A@DM4PR12MB5165.namprd12.prod.outlook.com/ Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-07drm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_rangeDr. David Alan Gilbert
The last use of smu_v11_0_get_dpm_level_range() was removed in 2020 by commit 46a301e14e8a ("drm/amd/powerplay: drop unnecessary Navi1x specific APIs") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-30drm/amd/pm: Fix comment styleLijo Lazar
Fix code comment style Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504271422.D6cqMlZ0-lkp@intel.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amd/pm/smu11: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1e866f1fe528 ("drm/amd/pm: Prevent divide by zero") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amd/pm: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b64625a303de ("drm/amd/pm: correct the address of Arcturus fan related registers") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-02-12drm/amd/pm: add support for IP version 11.5.2Ying Li
This initializes drm/amd/pm version 11.5.2 Signed-off-by: YING LI <yingli12@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd: Add the capability to mark certain firmware as "required"Mario Limonciello
Some of the firmware that is loaded by amdgpu is not actually required. For example the ISP firmware on some SoCs is optional, and if it's not present the ISP IP block just won't be initialized. The firmware loader core however will show a warning when this happens like this: ``` Direct firmware load for amdgpu/isp_4_1_0.bin failed with error -2 ``` To avoid confusion for non-required firmware, adjust the amd-ucode helper to take an extra argument indicating if the firmware is required or optional. On optional firmware use firmware_request_nowarn() instead of request_firmware() to avoid the warnings. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/amd-gfx/df71d375-7abd-4b32-97ce-15e57846eed8@amd.com/T/#t Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: power up or down vcn by instanceBoyuan Zhang
For smu ip with multiple vcn instances (smu 11/13/14), remove all the for loop in dpm_set_vcn_enable() functions. And use the instance argument to power up/down vcn for the given instance only, instead of powering up/down for all vcn instances. v2: remove all duplicated functions in v1. remove for-loop from each ip, and temporarily move to dpm_set_vcn_enable, in order to keep the exact same logic as before, until further separation in the next patch. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02drm/amd/pm: fix and simplify workload handlingAlex Deucher
smu->workload_mask is IP specific and should not be messed with in the common code. The mask bits vary across SMU versions. Move all handling of smu->workload_mask in to the backends and simplify the code. Store the user's preference in smu->power_profile_mode which will be reflected in sysfs. For internal driver profile switches for KFD or VCN, just update the workload mask so that the user's preference is retained. Remove all of the extra now unused workload related elements in the smu structure. v2: use refcounts for workload profiles v3: rework based on feedback from Lijo v4: fix the refcount on failure, drop backend mask v5: rework custom handling v6: handle failure cleanup with custom profile v7: Update documentation Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: stable@vger.kernel.org # 6.11.x
2024-12-02Revert "drm/amd/pm: correct the workload setting"Alex Deucher
This reverts commit 74e1006430a5377228e49310f6d915628609929e. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. This came back after a merge in 6.13-rc1, so revert again. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: 74e1006430a5 ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 44f392fbf628a7ff2d8bb8e83ca1851261f81a6f)
2024-11-21drm/amd/pm: Remove arcturus min power limitLijo Lazar
As per power team, there is no need to impose a lower bound on arcturus power limit. Any unreasonable limit set will result in frequent throttling. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-11-04drm/amd/pm: correct the workload settingKenneth Feng
Correct the workload setting in order not to mix the setting with the end user. Update the workload mask accordingly. v2: changes as below: 1. the end user can not erase the workload from driver except default workload. 2. always shows the real highest priority workoad to the end user. 3. the real workload mask is combined with driver workload mask and end user workload mask. v3: apply this to the other ASICs as well. v4: simplify the code v5: refine the code based on the review comments. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-04drm/amd/pm: add inst to dpm_set_vcn_enableBoyuan Zhang
Add an instance parameter to the existing function dpm_set_vcn_enable() for future implementation. Re-write all pptable functions accordingly. v2: Remove duplicated dpm_set_vcn_enable() functions in v1. Instead, adding instance parameter to existing functions. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-28drm/amd/pm: Vangogh: Fix kernel memory out of bounds writeTvrtko Ursulin
KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows: [ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu] [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067 ... [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544 [ 33.861816] Tainted: [W]=WARN [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023 [ 33.861822] Call Trace: [ 33.861826] <TASK> [ 33.861829] dump_stack_lvl+0x66/0x90 [ 33.861838] print_report+0xce/0x620 [ 33.861853] kasan_report+0xda/0x110 [ 33.862794] kasan_check_range+0xfd/0x1a0 [ 33.862799] __asan_memset+0x23/0x40 [ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.867135] dev_attr_show+0x43/0xc0 [ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0 [ 33.867155] seq_read_iter+0x3f8/0x1140 [ 33.867173] vfs_read+0x76c/0xc50 [ 33.867198] ksys_read+0xfb/0x1d0 [ 33.867214] do_syscall_64+0x90/0x160 ... [ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s: [ 33.867358] kasan_save_stack+0x33/0x50 [ 33.867364] kasan_save_track+0x17/0x60 [ 33.867367] __kasan_kmalloc+0x87/0x90 [ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu] [ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu] [ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu] [ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu] [ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu] [ 33.869608] local_pci_probe+0xda/0x180 [ 33.869614] pci_device_probe+0x43f/0x6b0 Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets the 168 large block. Root cause appears that when GPU metrics tables for v2_4 parts were added it was not considered to enlarge the table to fit. The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write. v2: * Drop impossible v3_0 case. (Mario) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: 41cec40bc9ba ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Wenyou Yang <WenYou.Yang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-15drm/amdgpu/swsmu: add automatic parameter to set_soft_freq_rangeAlex Deucher
On chips that support it, you can specificy 0 and 0xffff for min and max and the PMFW will use that to determine the optimal min and max. This enables optimal performance when the user manually switches between performance levels using sysfs. Previously we'd set soft min/max which could limit performance. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amd/pm: remove dump_pptable functionsPierre-Eric Pelloux-Prayer
They're not used. Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-26drm/amdgpu: Use init level for pending_reset flagLijo Lazar
Drop pending_reset flag in gmc block. Instead use init level to determine which type of init is preferred - in this case MINIMAL. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-13drm/amdgpu/swsmu: fix SMU11 typos (memlk -> memclk)Tobias Jakobi
No functional changes. Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14drm/amdgpu: refine pmfw/smu firmware loadingYang Wang
refine pmfw/smu firmware loading Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-14drm/amd/pm: remove dead code in navi10_emit_clk_levels and ↵Jesse Zhang
navi10_print_clk_levels Since the range of the varibable i is 0 - 3. So execution cannot reach this statement: default. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-20drm/amd/pm: Remove unused interface to set plpdLijo Lazar
Remove unused callback to set PLPD policy and its implementation from arcturus, aldebaran and SMUv13.0.6 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17drm/amd/pm: Add xgmi plpd to arcturus pm_policyLijo Lazar
On arcturus, allow changing xgmi plpd policy through 'pm_policy/xgmi_plpd' sysfs interface. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-13drm/amd/pm: check the return of send smc msg for navi10Jesse Zhang
Set smu work laod mask may fail, so check return. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-13drm/amd/pm: check the return of send smc msg for sienna_cichildJesse Zhang
Set smu work laod mask may fail, so check return. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-13drm/amdgpu/pm: Check input value for power profile setting on smu11, smu13 ↵Ma Jun
and smu14 Check the input value for CUSTOM profile mode setting on smu 11, smu13 and smu14. Otherwise we use uninitialized value of input[] Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amd/pm: fix uninitialized variable warnings for vangogh_pptTim Huang
1. Fix a issue that using uninitialized mask to get the ultimate frequency. 2. Check return of smu_cmn_send_smc_msg_with_param to avoid using uninitialized variable residency. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-30drm/amd/pm: Fix negative array index readJesse Zhang
Avoid using the negative values for clk_idex as an index into an array pptable->DpmDescriptor. V2: fix clk_index return check (Tim Huang) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-17Merge tag 'amd-drm-next-6.10-2024-04-13' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.10-2024-04-13: amdgpu: - HDCP fixes - ODM fixes - RAS fixes - Devcoredump improvements - Misc code cleanups - Expose VCN activity via sysfs - SMY 13.0.x updates - Enable fast updates on DCN 3.1.4 - Add dclk and vclk reporting on additional devices - Add ACA RAS infrastructure - Implement TLB flush fence - EEPROM handling fixes - SMUIO 14.0.2 support - SMU 14.0.1 Updates - Sync page table freeing with TLB flushes - DML2 refactor - DC debug improvements - SR-IOV fixes - Suspend and Resume fixes - DCN 3.5.x Updates - Z8 fixes - UMSCH fixes - GPU reset fixes - HDP fix for second GFX pipe on GC 10.x - Enable secondary GFX pipe on GC 10.3 - Refactor and clean up BACO/BOCO/BAMACO handling - VCN partitioning fix - DC DWB fixes - VSC SDP fixes - DCN 3.1.6 fix - GC 11.5 fixes - Remove invalid TTM resource start check - DCN 1.0 fixes amdkfd: - MQD handling cleanup - Preemption handling fixes for XCDs - TLB flush fix for GC 9.4.2 - Properly clean up workqueue during module unload - Fix memory leak process create failure - Range check CP bad op exception targets to avoid reporting invalid exceptions to userspace radeon: - Misc code cleanups From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240413213708.3427038-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-09drm/amdgpu/pm: Check AMDGPU_RUNPM_BAMACO when setting baco stateMa Jun
Check AMDGPU_RUNPM_BAMACO intead of amdgpu_runtime_pm when setting baco state. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-09drm/amdgpu/pm: Add support for MACO flag checkingMa Jun
Add support for MACO flag checking. MACO mode only works if BACO is supported. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-09drm/amdgpu/pm: Change the member function name in pp_hwmgr_func and ↵Ma Jun
pptable_funcs Use a unified and more explicit name get_bamaco_support to replace is_baco_support and get_asic_baco_capability Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-22drm/amdgpu: Fix truncation in smu_v11_0_init_microcodeSrinivasan Shanmugam
Reducing the size of ucode_prefix to 25 in the smu_v11_0_init_microcode function. we ensure that fw_name can accommodate the maximum possible string size Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.c: In function ‘smu_v11_0_init_microcode’: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.c:110:54: warning: ‘.bin’ directive output may be truncated writing 4 bytes into a region of size between 0 and 29 [-Wformat-truncation=] 110 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); | ^~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu11/smu_v11_0.c:110:9: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 36 110 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-21Merge tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Fixes from the last week (or 3 weeks in amdgpu case), after amdgpu, it's xe and nouveau then a few scattered core fixes. core: - fix rounding in drm_fixp2int_round() bridge: - fix documentation for DRM_BRIDGE_OP_EDID sun4i: - fix 64-bit division on 32-bit architectures tests: - fix dependency on DRM_KMS_HELPER probe-helper: - never return negative values from .get_modes() plus driver fixes xe: - invalidate userptr vma on page pin fault - fail early on sysfs file creation error - skip VMA pinning on xe_exec if no batches nouveau: - clear bo resource bus after eviction - documentation fixes - don't check devinit disable on GSP amdgpu: - Freesync fixes - UAF IOCTL fixes - Fix mmhub client ID mapping - IH 7.0 fix - DML2 fixes - VCN 4.0.6 fix - GART bind fix - GPU reset fix - SR-IOV fix - OD table handling fixes - Fix TA handling on boards without display hardware - DML1 fix - ABM fix - eDP panel fix - DPPCLK fix - HDCP fix - Revert incorrect error case handling in ioremap - VPE fix - HDMI fixes - SDMA 4.4.2 fix - Other misc fixes amdkfd: - Fix duplicate BO handling in process restore" * tag 'drm-next-2024-03-22' of https://gitlab.freedesktop.org/drm/kernel: (50 commits) drm/amdgpu/pm: Don't use OD table on Arcturus drm/amdgpu: drop setting buffer funcs in sdma442 drm/amd/display: Fix noise issue on HDMI AV mute drm/amd/display: Revert Remove pixle rate limit for subvp Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode" Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in amdgpu_device_init()" drm/amd/display: Add a dc_state NULL check in dc_state_release drm/amd/display: Return the correct HDCP error code drm/amd/display: Implement wait_for_odm_update_pending_complete drm/amd/display: Lock all enabled otg pipes even with no planes drm/amd/display: Amend coasting vtotal for replay low hz drm/amd/display: Fix idle check for shared firmware state drm/amd/display: Update odm when ODM combine is changed on an otg master pipe with no plane drm/amd/display: Init DPPCLK from SMU on dcn32 drm/amd/display: Add monitor patch for specific eDP drm/amd/display: Allow dirty rects to be sent to dmub when abm is active drm/amd/display: Override min required DCFCLK in dml1_validate drm/amdgpu: Bypass display ta if display hw is not available drm/amdgpu: correct the KGQ fallback message drm/amdgpu/pm: Check the validity of overdiver power limit ...
2024-03-20drm/amdgpu: add VCN sensor value for VangoghXiaojian Du
This will drm/amdgpu: add VCN sensor value for Vangogh. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu/pm: Don't use OD table on ArcturusMa Jun
OD is not supported on Arcturus, so the OD table should not be used. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20drm/amdgpu/pm: Check the validity of overdiver power limitMa Jun
Check the validity of overdriver power limit before using it. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Lazar Lijo <lijo.lazar@amd.com> Suggested-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-20drm/amdgpu/pm: Fix NULL pointer dereference when get power limitMa Jun
Because powerplay_table initialization is skipped under sriov case, We check and set default lower and upper OD value if powerplay_table is NULL. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reported-by: Yin Zhenguo <zhenguo.yin@amd.com> Suggested-by: Lazar Lijo <lijo.lazar@amd.com> Suggested-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-03-13Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm updates from Dave Airlie: "Highlights are usual, more AMD IP blocks for future hw, i915/xe changes, Displayport tunnelling support for i915, msm YUV over DP changes, new tests for ttm, but its mostly a lot of stuff all over the place from lots of people. core: - EDID cleanups - scheduler error handling fixes - managed: add drmm_release_action() with tests - add ratelimited drm debug print - DPCD PSR early transport macro - DP tunneling and bandwidth allocation helpers - remove built-in edids - dp: Avoid AUX transfers on powered-down displays - dp: Add VSC SDP helpers cross drivers: - use new drm print helpers - switch to ->read_edid callback - gem: add stats for shared buffers plus updates to amdgpu, i915, xe syncobj: - fixes to waiting and sleeping ttm: - add tests - fix errno codes - simply busy-placement handling - fix page decryption media: - tc358743: fix v4l device registration video: - move all kernel parameters for video behind CONFIG_VIDEO sound: - remove <drm/drm_edid.h> include from header ci: - add tests for msm - fix apq8016 runner efifb: - use copy of global screen_info state vesafb: - use copy of global screen_info state simplefb: - fix logging bridge: - ite-6505: fix DP link-training bug - samsung-dsim: fix error checking in probe - samsung-dsim: add bsh-smm-s2/pro boards - tc358767: fix regmap usage - imx: add i.MX8MP HDMI PVI plus DT bindings - imx: add i.MX8MP HDMI TX plus DT bindings - sii902x: fix probing and unregistration - tc358767: limit pixel PLL input range - switch to new drm_bridge_read_edid() interface panel: - ltk050h3146w: error-handling fixes - panel-edp: support delay between power-on and enable; use put_sync in unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49 V8.0, BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings - panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings - panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings - add BOE TH101MB31IG002-28A plus DT bindings - add EDT ETML1010G3DRA plus DT bindings - add Novatek NT36672E LCD DSI plus DT bindings - nt36523: support 120Hz timings, fix includes - simple: fix display timings on RK32FN48H - visionox-vtdr6130: fix initialization - add Powkiddy RGB10MAX3 plus DT bindings - st7703: support panel rotation plus DT bindings - add Himax HX83112A plus DT bindings - ltk500hd1829: add support for ltk101b4029w and admatec 9904370 - simple: add BOE BP082WX1-100 8.2" panel plus DT bindungs panel-orientation-quirks: - GPD Win Mini amdgpu: - Validate DMABuf imports in compute VMs - Add RAS ACA framework - PSP 13 fixes - Misc code cleanups - Replay fixes - Atom interpretor PS, WS bounds checking - DML2 fixes - Audio fixes - DCN 3.5 Z state fixes - Remove deprecated ida_simple usage - UBSAN fixes - RAS fixes - Enable seq64 infrastructure - DC color block enablement - Documentation updates - DC documentation updates - DMCUB updates - ATHUB 4.1 support - LSDMA 7.0 support - JPEG DPG support - IH 7.0 support - HDP 7.0 support - VCN 5.0 support - SMU 13.0.6 updates - NBIO 7.11 updates - SDMA 6.1 updates - MMHUB 3.3 updates - DCN 3.5.1 support - NBIF 6.3.1 support - VPE 6.1.1 support amdkfd: - Validate DMABuf imports in compute VMs - SVM fixes - Trap handler updates and enhancements - Fix cache size reporting - Relocate the trap handler radeon: - Atom interpretor PS, WS bounds checking - Misc code cleanups xe: - new query for GuC submission version - Remove unused persistent exec_queues - Add vram frequency sysfs attributes - Add the flag XE_VM_BIND_FLAG_DUMPABLE - Drop pre-production workarounds - Drop kunit tests for unsupported platforms - Start pumbling SR-IOV support with memory based interrupts for VF - Allow to map BO in GGTT with PAT index corresponding to XE_CACHE_UC to work with memory based interrupts - Add GuC Doorbells Manager as prep work SR-IOV - Implement additional workarounds for xe2 and MTL - Program a few registers according to perfomance guide spec for Xe2 - Fix remaining 32b build issues and enable it back - Fix build with CONFIG_DEBUG_FS=n - Fix warnings from GuC ABI headers - Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF - Release mmap mappings on rpm suspend - Disable mid-thread preemption when not properly supported by hardware - Fix xe_exec by reserving extra fence slot for CPU bind - Fix xe_exec with full long running exec queue - Canonicalize addresses where needed for Xe2 and add to devcoredum - Toggle USM support for Xe2 - Only allow 1 ufence per exec / bind IOCTL - Add GuC firmware loading for Lunar Lake - Add XE_VMA_PTE_64K VMA flag i915: - Add more ADL-N PCI IDs - Enable fastboot also on older platforms - Early transport for panel replay and PSR - New ARL PCI IDs - DP TPS4 PHY test pattern support - Unify and improve VSC SDP for PSR and non-PSR cases - Refactor memory regions and improve debug logging - Rework global state serialization - Remove unused CDCLK divider fields - Unify HDCP connector logging format - Use display instead of graphics version in display code - Move VBT and opregion debugfs next to the implementation - Abstract opregion interface, use opaque type - MTL fixes - HPD handling fixes - Add GuC submission interface version query - Atomically invalidate userptr on mmu-notifier - Update handling of MMIO triggered reports - Don't make assumptions about intel_wakeref_t type - Extend driver code of Xe_LPG to Xe_LPG+ - Add flex arrays to struct i915_syncmap - Allow for very slow HuC loading - DP tunneling and bandwidth allocation support msm: - Correct bindings for MSM8976 and SM8650 platforms - Start migration of MDP5 platforms to DPU driver - X1E80100 MDSS support - DPU: - Improve DSC allocation, fixing several important corner cases - Add support for SDM630/SDM660 platforms - Simplify dpu_encoder_phys_ops - Apply fixes targeting DSC support with a single DSC encoder - Apply fixes for HCTL_EN timing configuration - X1E80100 support - Add support for YUV420 over DP - GPU: - fix sc7180 UBWC config - fix a7xx LLC config - new gpu support: a305B, a750, a702 - machine support: SM7150 (different power levels than other a618) - a7xx devcoredump support habanalabs: - configure IRQ affinity according to NUMA node - move HBM MMU page tables inside the HBM - improve device reset - check extended PCIe errors ivpu: - updates to firmware API - refactor BO allocation imx: - use devm_ functions during init hisilicon: - fix EDID includes mgag200: - improve ioremap usage - convert to struct drm_edid - Work around PCI write bursts nouveau: - disp: use kmemdup() - fix EDID includes - documentation fixes qaic: - fixes to BO handling - make use of DRM managed release - fix order of remove operations rockchip: - analogix_dp: get encoder port from DT - inno_hdmi: support HDMI for RK3128 - lvds: error-handling fixes ssd130x: - support SSD133x plus DT bindings tegra: - fix error handling tilcdc: - make use of DRM managed release v3d: - show memory stats in debugfs - Support display MMU page size vc4: - fix error handling in plane prepare_fb - fix framebuffer test in plane helpers virtio: - add venus capset defines vkms: - fix OOB access when programming the LUT - Kconfig improvements vmwgfx: - unmap surface before changing plane state - fix memory leak in error handling - documentation fixes - list command SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 as invalid - fix null-pointer deref in execbuf - refactor display-mode probing - fix fencing for creating cursor MOBs - fix cursor-memory lifetime xlnx: - fix live video input for ZynqMP DPSUB lima: - fix memory leak loongson: - fail if no VRAM present meson: - switch to new drm_bridge_read_edid() interface renesas: - add RZ/G2L DU support plus DT bindings mxsfb: - Use managed mode config sun4i: - HDMI: updates to atomic mode setting mediatek: - Add display driver for MT8188 VDOSYS1 - DSI driver cleanups - Filter modes according to hardware capability - Fix a null pointer crash in mtk_drm_crtc_finish_page_flip etnaviv: - enhancements for NPU and MRT support" * tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel: (1420 commits) drm/amd/display: Removed redundant @ symbol to fix kernel-doc warnings in -next repo drm/amd/pm: wait for completion of the EnableGfxImu message drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.1 drm/amdgpu: add smu 14.0.1 support drm/amdgpu: add VPE 6.1.1 discovery support drm/amdgpu/vpe: add VPE 6.1.1 support drm/amdgpu/vpe: don't emit cond exec command under collaborate mode drm/amdgpu/vpe: add collaborate mode support for VPE drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE drm/amdgpu/vpe: add multi instance VPE support drm/amdgpu/discovery: add nbif v6_3_1 ip block drm/amdgpu: Add nbif v6_3_1 ip block support drm/amdgpu: Add pcie v6_1_0 ip headers (v5) drm/amdgpu: Add nbif v6_3_1 ip headers (v5) arch/powerpc: Remove <linux/fb.h> from backlight code macintosh/via-pmu-backlight: Include <linux/backlight.h> fbdev/chipsfb: Include <linux/backlight.h> drm/etnaviv: Restore some id values drm/amdkfd: make kfd_class constant drm/amdgpu: add ring timeout information in devcoredump ...
2024-03-11Merge tag 'x86-apic-2024-03-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC updates from Thomas Gleixner: "Rework of APIC enumeration and topology evaluation. The current implementation has a couple of shortcomings: - It fails to handle hybrid systems correctly. - The APIC registration code which handles CPU number assignents is in the middle of the APIC code and detached from the topology evaluation. - The various mechanisms which enumerate APICs, ACPI, MPPARSE and guest specific ones, tweak global variables as they see fit or in case of XENPV just hack around the generic mechanisms completely. - The CPUID topology evaluation code is sprinkled all over the vendor code and reevaluates global variables on every hotplug operation. - There is no way to analyze topology on the boot CPU before bringing up the APs. This causes problems for infrastructure like PERF which needs to size certain aspects upfront or could be simplified if that would be possible. - The APIC admission and CPU number association logic is incomprehensible and overly complex and needs to be kept around after boot instead of completing this right after the APIC enumeration. This update addresses these shortcomings with the following changes: - Rework the CPUID evaluation code so it is common for all vendors and provides information about the APIC ID segments in a uniform way independent of the number of segments (Thread, Core, Module, ..., Die, Package) so that this information can be computed instead of rewriting global variables of dubious value over and over. - A few cleanups and simplifcations of the APIC, IO/APIC and related interfaces to prepare for the topology evaluation changes. - Seperation of the parser stages so the early evaluation which tries to find the APIC address can be seperately overridden from the late evaluation which enumerates and registers the local APIC as further preparation for sanitizing the topology evaluation. - A new registration and admission logic which - encapsulates the inner workings so that parsers and guest logic cannot longer fiddle in it - uses the APIC ID segments to build topology bitmaps at registration time - provides a sane admission logic - allows to detect the crash kernel case, where CPU0 does not run on the real BSP, automatically. This is required to prevent sending INIT/SIPI sequences to the real BSP which would reset the whole machine. This was so far handled by a tedious command line parameter, which does not even work in nested crash scenarios. - Associates CPU number after the enumeration completed and prevents the late registration of APICs, which was somehow tolerated before. - Converting all parsers and guest enumeration mechanisms over to the new interfaces. This allows to get rid of all global variable tweaking from the parsers and enumeration mechanisms and sanitizes the XEN[PV] handling so it can use CPUID evaluation for the first time. - Mopping up existing sins by taking the information from the APIC ID segment bitmaps. This evaluates hybrid systems correctly on the boot CPU and allows for cleanups and fixes in the related drivers, e.g. PERF. The series has been extensively tested and the minimal late fallout due to a broken ACPI/MADT table has been addressed by tightening the admission logic further" * tag 'x86-apic-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits) x86/topology: Ignore non-present APIC IDs in a present package x86/apic: Build the x86 topology enumeration functions on UP APIC builds too smp: Provide 'setup_max_cpus' definition on UP too smp: Avoid 'setup_max_cpus' namespace collision/shadowing x86/bugs: Use fixed addressing for VERW operand x86/cpu/topology: Get rid of cpuinfo::x86_max_cores x86/cpu/topology: Provide __num_[cores|threads]_per_package x86/cpu/topology: Rename topology_max_die_per_package() x86/cpu/topology: Rename smp_num_siblings x86/cpu/topology: Retrieve cores per package from topology bitmaps x86/cpu/topology: Use topology logical mapping mechanism x86/cpu/topology: Provide logical pkg/die mapping x86/cpu/topology: Simplify cpu_mark_primary_thread() x86/cpu/topology: Mop up primary thread mask handling x86/cpu/topology: Use topology bitmaps for sizing x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT x86/xen/smp_pv: Count number of vCPUs early x86/cpu/topology: Assign hotpluggable CPUIDs during init x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug x86/topology: Add a mechanism to track topology via APIC IDs ...
2024-02-28drm/amd/pm: Fix esm reg mask use to get pcie speedAsad Kamal
Fix mask used for esm ctrl register to get pcie link speed on smu_v11_0_3, smu_v13_0_2 & smu_v13_0_6 Fixes: 511a95552ec8 ("drm/amd/pm: Add SMU 13.0.6 support") Fixes: c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)") Fixes: f1c378593153 ("drm/amd/powerplay: add Arcturus support for gpu metrics export") Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-27drm/amdgpu/pm: Fix the power1_min_cap valueMa Jun
It's unreasonable to use 0 as the power1_min_cap when OD is disabled. So, use the same lower limit as the value used when OD is enabled. Fixes: 1958946858a6 ("drm/amd/pm: Support for getting power1_cap_min value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-02-26drm/amdgpu/pm: Fix the power1_min_cap valueMa Jun
It's unreasonable to use 0 as the power1_min_cap when OD is disabled. So, use the same lower limit as the value used when OD is enabled. Fixes: 1958946858a6 ("drm/amd/pm: Support for getting power1_cap_min value") Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-16x86/cpu/topology: Get rid of cpuinfo::x86_max_coresThomas Gleixner
Now that __num_cores_per_package and __num_threads_per_package are available, cpuinfo::x86_max_cores and the related math all over the place can be replaced with the ready to consume data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240213210253.176147806@linutronix.de
2024-01-29drm/amdgpu/pm: Use macro definitions in the smu IH process functionMa Jun
Replace the hard-coded numbers with macro definition Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-25drm/amdgpu/pm: Fix the power source flag errorMa Jun
The power source flag should be updated when [1] System receives an interrupt indicating that the power source has changed. [2] System resumes from suspend or runtime suspend Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-01-25drm/amdgpu/pm: Add default case for smu IH process funcMa Jun
Add default case for smu IH process func. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-22drm/amdgpu/pm: Fix the power source flag errorMa Jun
The power source flag should be updated when [1] System receives an interrupt indicating that the power source has changed. [2] System resumes from suspend or runtime suspend Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15drm/amdgpu: check PS, WS indexAlexander Richards
Theoretically, it would be possible for a buggy or malicious VBIOS to overwrite past the bounds of the passed parameters (or its own workspace); add bounds checking to prevent this from happening. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093 Signed-off-by: Alexander Richards <electrodeyt@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13drm/amd/pm: Remove redundant function members of pptable_funcsMa Jun
Remove redundant functions members of pptable_funcs and change the function type as static because they are not called by other files. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amdgpu: optimize RLC powerdown notification on VangoghPerry Yuan
The smu needs to get the rlc power down message to sync the rlc state with smu, the rlc state updating message need to be sent at while smu begin suspend sequence , otherwise SMU will crash while RLC state is not notified by driver, and rlc state probally changed after that notification, so it needs to notify rlc state to smu at the end of the suspend sequence in amdgpu_device_suspend() that can make sure the rlc state is correctly set to SMU. [ 101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff! [ 110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features. [ 110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features! [ 110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62 [ 110.884394] PM: suspend of devices aborted after 21213.620 msecs [ 110.884402] PM: start suspend of devices aborted after 21213.882 msecs [ 110.884405] PM: Some devices failed to suspend, or early wake event detected Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>