Age | Commit message (Collapse) | Author |
|
It malicious user provides a small pptable through sysfs and then
a bigger pptable, it may cause buffer overflow attack in function
smu_sys_set_pp_table().
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
It is possible for some waves in a workgroup to finish their save
sequence before the group leader has had time to capture the workgroup
barrier state. When this happens, having those waves exit do impact the
barrier state. As a consequence, the state captured by the group leader
is invalid, and is eventually incorrectly restored.
This patch proposes to have all waves in a workgroup wait for each other
at the end of their save sequence (just before calling s_endpgm_saved).
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
|
|
In function psp_init_cap_microcode(), it should bail out when failed to
load firmware, otherwise it may cause invalid memory access.
Fixes: 07dbfc6b102e ("drm/amd: Use `amdgpu_ucode_*` helpers for PSP")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The destructor of a gtt bo is declared as
void amdgpu_amdkfd_free_gtt_mem(struct amdgpu_device *adev, void **mem_obj);
Which takes void** as the second parameter.
GCC allows passing void* to the function because void* can be implicitly
casted to any other types, so it can pass compiling.
However, passing this void* parameter into the function's
execution process(which expects void** and dereferencing void**)
will result in errors.
Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Fixes: fb91065851cd ("drm/amdkfd: Refactor queue wptr_bo GART mapping")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Bump the driver version for RV/PCO compute stability fix
so mesa can use this check to enable compute queues on
RV/PCO.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
|
|
When mesa started using compute queues more often
we started seeing additional hangs with compute queues.
Disabling gfxoff seems to mitigate that. Manually
control gfxoff and gfx pg with command submissions to avoid
any issues related to gfxoff. KFD already does the same
thing for these chips.
v2: limit to compute
v3: limit to APUs
v4: limit to Raven/PCO
v5: only update the compute ring_funcs
v6: Disable GFX PG
v7: adjust order
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Błażej Szczygieł <mumei6102@gmail.com>
Suggested-by: Sergey Kovalenko <seryoga.engineering@gmail.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3861
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-January/119116.html
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
|
|
UVD and VCN were split into separate dpm helpers in commit
ff69bba05f08 ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
as such, there is no need to include UVD in the is_vcn variable since
UVD and VCN are handled by separate dpm helpers now. Fix the check.
Fixes: ff69bba05f08 ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3959
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-February/119827.html
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
|
|
This reverts commit
a2b5a9956269 ("drm/amd/display: Use HW lock mgr for PSR1")
Because it may cause system hang while connect with two edp panel.
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, there are several files in drm/amd/display that aim to have a
higher -Wframe-larger-than value to avoid instances of that warning with
a lower value from the user's configuration. However, with the way that
it is currently implemented, it does not respect the user's request via
CONFIG_FRAME_WARN for a higher stack frame limit, which can cause pain
when new instances of the warning appear and break the build due to
CONFIG_WERROR.
Adjust the logic to switch from a hard coded -Wframe-larger-than value
to only using the value as a minimum clamp and deferring to the
requested value from CONFIG_FRAME_WARN if it is higher.
Suggested-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Closes: https://lore.kernel.org/2025013003-audience-opposing-7f95@gregkh/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY]
When the system powers up eDP with external monitors in seamless boot
sequence, stutter get enabled before TTU and HUBP registers being
programmed, which resulting in underflow.
[HOW]
Enable TTU in hubp_init.
Change the sequence that do not perpare_bandwidth and optimize_bandwidth
while having seamless boot streams.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Paul Hsieh <paul.hsieh@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHAT & HOW]
hpo_stream_to_link_encoder_mapping has size MAX_HPO_DP2_ENCODERS(=4),
but location can have size up to 6. As a result, it is necessary to
check location against MAX_HPO_DP2_ENCODERS.
Similiarly, disp_cfg_stream_location can be used as an array index which
should be 0..5, so the ASSERT's conditions should be less without equal.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3904
Reviewed-by: Austin Zheng <Austin.Zheng@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Vulkan can't support DCC and Z/S compression on GFX12 without
WRITE_COMPRESS_DISABLE in this commit or a completely different DCC
interface.
AMDGPU_TILING_GFX12_SCANOUT is added because it's already used by userspace.
Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This restores the original behavior that gets min/max freq from EDID and
only set DP/eDP connector as freesync capable if "sink device is capable
of rendering incoming video stream without MSA timing parameters", i.e.,
`allow_invalid_MSA_timing_params` is true. The condition was mistakenly
removed by 0159f88a99c9 ("drm/amd/display: remove redundant freesync
parser for DP").
CC: Mario Limonciello <mario.limonciello@amd.com>
CC: Alex Hung <alex.hung@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3915
Fixes: 0159f88a99c9 ("drm/amd/display: remove redundant freesync parser for DP")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
The following page fault was observed duringthe KFD process release.
In this particular error case, the HIP test (./MemcpyPerformance -h)
does not require the queue. As a result, the process_context_addr was
not assigned when the KFD process was released, ultimately leading to
this page fault during the execution of the function
kfd_process_dequeue_from_all_devices().
[345962.294891] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:153 vmid:0 pasid:0)
[345962.295333] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[345962.295775] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000B33
[345962.296097] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[345962.296394] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[345962.296633] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x1
[345962.296876] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[345962.297135] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[345962.297377] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[345962.297682] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:169 vmid:0 pasid:0)
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
[Why]
the offset address of mmCLK5_spll_field_8 was incorrect for dcn35
which causes SSC not to be enabled.
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Lo-An Chen <lo-an.chen@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Aldebaran doesn't support querying MM activity percentage. Keep the
field as 0xFFs to mark it as unsupported.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
change the config of cgcg on gfx12
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
|
|
The purpose of halt_if_hws_hang is to preserve GPU state for driver
debugging when queue preemption fails. Issuing per-queue reset may
kill wavefronts which caused the preemption failure.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
|
|
[why]
Updating the cursor enablement register can be a slow operation and accumulates
when high polling rate cursors cause frequent updates asynchronously to the
cursor position.
[how]
Since the cursor enable bit is cached there is no need to update the
enablement register if there is no change to it. This removes the
read-modify-write from the cursor position programming path in HUBP and
DPP, leaving only the register writes.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sung Lee <sung.lee@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
When HUBP is power gated, the SW state can get out of sync with the
hardware state causing cursor to not be programmed correctly.
[How]
Similar to DPP, add a HUBP reset function which is called wherever
HUBP is initialized or powergated. This function will clear the cursor
position and attribute cache allowing for proper programming when the
HUBP is brought back up.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sung Lee <sung.lee@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
MES internal will check CP_MES_MSCRATCH_LO/HI register to set scratch
data location during ucode start, driver side need to start the MES
one by one with different setting for each pipe
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Effectively amdgpu.gttsize gets set to ~1/2 of RAM, but that's controlled
by what the TTM page limit is set to. Clarify the kdoc.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If the parent is NULL, adev->pdev is used to retrieve the PCIe speed and
width, ensuring that the function can still determine these
capabilities from the device itself.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6193 amdgpu_device_gpu_bandwidth()
error: we previously assumed 'parent' could be null (see line 6180)
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
6170 static void amdgpu_device_gpu_bandwidth(struct amdgpu_device *adev,
6171 enum pci_bus_speed *speed,
6172 enum pcie_link_width *width)
6173 {
6174 struct pci_dev *parent = adev->pdev;
6175
6176 if (!speed || !width)
6177 return;
6178
6179 parent = pci_upstream_bridge(parent);
6180 if (parent && parent->vendor == PCI_VENDOR_ID_ATI) {
^^^^^^
If parent is NULL
6181 /* use the upstream/downstream switches internal to dGPU */
6182 *speed = pcie_get_speed_cap(parent);
6183 *width = pcie_get_width_cap(parent);
6184 while ((parent = pci_upstream_bridge(parent))) {
6185 if (parent->vendor == PCI_VENDOR_ID_ATI) {
6186 /* use the upstream/downstream switches internal to dGPU */
6187 *speed = pcie_get_speed_cap(parent);
6188 *width = pcie_get_width_cap(parent);
6189 }
6190 }
6191 } else {
6192 /* use the device itself */
--> 6193 *speed = pcie_get_speed_cap(parent);
^^^^^^ Then we are toasted here.
6194 *width = pcie_get_width_cap(parent);
6195 }
6196 }
Fixes: 757e8b951ce2 ("drm/amdgpu: cache gpu pcie link width")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The function amdgpu_dm_crtc_mem_type_changed was dereferencing pointers
returned by drm_atomic_get_plane_state without checking for errors. This
could lead to undefined behavior if the function returns an error pointer.
This commit adds checks using IS_ERR to ensure that new_plane_state and
old_plane_state are valid before dereferencing them.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:11486 amdgpu_dm_crtc_mem_type_changed()
error: 'new_plane_state' dereferencing possible ERR_PTR()
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c
11475 static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
11476 struct drm_atomic_state *state,
11477 struct drm_crtc_state *crtc_state)
11478 {
11479 struct drm_plane *plane;
11480 struct drm_plane_state *new_plane_state, *old_plane_state;
11481
11482 drm_for_each_plane_mask(plane, dev, crtc_state->plane_mask) {
11483 new_plane_state = drm_atomic_get_plane_state(state, plane);
11484 old_plane_state = drm_atomic_get_plane_state(state, plane);
^^^^^^^^^^^^^^^^^^^^^^^^^^ These functions can fail.
11485
--> 11486 if (old_plane_state->fb && new_plane_state->fb &&
11487 get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
11488 return true;
11489 }
11490
11491 return false;
11492 }
Fixes: 4caacd1671b7 ("drm/amd/display: Do not elevate mem_type change to full update")
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
commit 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in
amdgpu_job_prepare") set job->vm as NULL if there is no fence. It will
cause emit switch buffer be skippen if job->vm set as NULL.
Check job rather than vm could solve this problem.
Fixes: 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare")
Signed-off-by: Lin.Cao <lincao12@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix the initialization and usage of SMU v13.0.6 capability values. Use
caps_set/clear functions to set/clear capability.
Also, fix SET_UCLK_MAX capability on APUs, it is supported on APUs.
Fixes: e9b86b841baf ("drm/amd/pm: Add capability flags for SMU v13.0.6")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unspported pmfw
Suggested-by: Lazar Lijo <Lijo.Lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
pmfw now unifies PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
pmfw unified PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add capability flags for SMU v13.0.6 variants. Initialize the flags
based on firmware support. As there are multiple IP versions maintained,
it is more manageable with one time initialization of caps flags based
on IP version and firmware feature support.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This needs to be kerneldoc formatted.
Fixes: 5349658fa4a1 ("drm/amd: Add debug option to disable subvp")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
|
|
This needs to be kerneldoc formatted.
Fixes: 7594874227e1 ("drm/amd/display: add CEC notifier to amdgpu driver")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kun Liu <Kun.Liu2@amd.com>
|
|
Combine the platform and GPU caps like we do for PCIe Gen.
This aligns properly with expectations and documentation
for the interface.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3820
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Get the PCIe link with of the device itself (or it's
integrated upstream bridge) and cache that.
v2: fix typo
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3820
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When compiling allmodconfig (CONFIG_WERROR=y) with clang-19, see the
following errors:
.../display/dc/dml2/display_mode_core.c:6268:13: warning: stack frame size (3128) exceeds limit (3072) in 'dml_prefetch_check' [-Wframe-larger-than]
.../display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:7236:13: warning: stack frame size (3256) exceeds limit (3072) in 'dml_core_mode_support' [-Wframe-larger-than]
Mark static functions called by dml_prefetch_check() and
dml_core_mode_support() noinline_for_stack to avoid them become huge
functions and thus exceed the frame size limit.
A way to reproduce:
$ git checkout next-20250107
$ mkdir build_dir
$ export PATH=/tmp/llvm-19.1.6-x86_64/bin:$PATH
$ make LLVM=1 O=build_dir allmodconfig
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
The way how it chose static functions to mark:
[0] Unset CONFIG_WERROR in build_dir/.config.
To get display_mode_core.o without errors.
[1] Get a function list called by dml_prefetch_check().
$ sed -n '6268,6711p' drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c \
| sed -n -r 's/.*\W(\w+)\(.*/\1/p' | sort -u >/tmp/syms
[2] Get the non-inline function list.
Objdump won't show the symbols if they are inline functions.
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
$ objdump -d build_dir/.../display_mode_core.o | \
./scripts/checkstack.pl x86_64 0 | \
grep -f /tmp/syms | cut -d' ' -f2- >/tmp/orig
[3] Get the full function list.
Append "-fno-inline" to `CFLAGS_.../display_mode_core.o` in
drivers/gpu/drm/amd/display/dc/dml2/Makefile.
$ make LLVM=1 O=build_dir drivers/gpu/drm/ -j
$ objdump -d build_dir/.../display_mode_core.o | \
./scripts/checkstack.pl x86_64 0 | \
grep -f /tmp/syms | cut -d' ' -f2- >/tmp/noinline
[4] Get the inline function list.
If a symbol only in /tmp/noinline but not in /tmp/orig, it is a good
candidate to mark noinline.
$ diff /tmp/orig /tmp/noinline
Chosen functions and their stack sizes:
CalculateBandwidthAvailableForImmediateFlip [display_mode_core.o]:144
CalculateExtraLatency [display_mode_core.o]:176
CalculateTWait [display_mode_core.o]:64
CalculateVActiveBandwithSupport [display_mode_core.o]:112
set_calculate_prefetch_schedule_params [display_mode_core.o]:48
CheckGlobalPrefetchAdmissibility [dml2_core_dcn4_calcs.o]:544
calculate_bandwidth_available [dml2_core_dcn4_calcs.o]:320
calculate_vactive_det_fill_latency [dml2_core_dcn4_calcs.o]:272
CalculateDCFCLKDeepSleep [dml2_core_dcn4_calcs.o]:208
CalculateODMMode [dml2_core_dcn4_calcs.o]:208
CalculateOutputLink [dml2_core_dcn4_calcs.o]:176
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If user shader issues S_SETVSKIP then this state will persist when
executing the trap handler, causing vector instructions to be
skipped.
VSKIP state is already saved/restored through the MODE register.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
'add ip block' causes a confusion if the blocks are disabled later with
ip_block_mask. Instead change to 'detected' and also add device context.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Context empty interrupt is enabled for SDMA 4.4.2. Add a handler for
context empty interrupt so that it is disposed of fast, and not
propagated to KFD layer.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Some monitors flicker when subvp is enabled which maybe related to
an uncommon timing they use. To isolate such issues, add a debug
option to help isolate this the issue for debugging.
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Source and binary have become mismatched during branch activity.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For partial migrate from ram to vram, the migrate->cpages is not
equal to migrate->npages, should use migrate->npages to check all needed
migrate pages which could be copied or not.
And only need to set those pages could be migrated to migrate->dst[i], or
the migrate_vma_pages will migrate the wrong pages based on the migrate->dst[i].
v2:
Add mpages to break the loop earlier.
v3:
Uses MIGRATE_PFN_MIGRATE to identify whether page could be migrated.
v4:
Correct the error part.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Philip Yang<Philip.Yang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This commit enables the cleaner shader feature for GFX12.0 and GFX12.0.1
GPUs. The cleaner shader is important for clearing GPU resources such as
Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and
Scalar General Purpose Registers (SGPRs) between workloads.
- This feature ensures that GPU resources are reset between workloads,
preventing data leaks and ensuring accurate computation.
By enabling the cleaner shader, this update enhances the security and
reliability of GPU operations on GFX12.0 hardware.
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Mark options only meant to be used for debugging as unsafe so that the
kernel is tainted when they are used.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
FW attestation was disabled on MP0_14_0_{2/3}.
V2:
Move check into is_fw_attestation_support func. (Frank)
Remove DRM_WARN log info. (Alex)
Fix format. (Christian)
Signed-off-by: Gui Chengming <Jack.Gui@amd.com>
Reviewed-by: Frank.Min <Frank.Min@amd.com>
Reviewed-by: Christian König <Christian.Koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
That is needed to enforce isolation between contexts.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We sometimes have people trying to use debugging options in production
environments.
Mark options only meant to be used for debugging as unsafe so that the
kernel is tainted when they are used.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Lets use the existing helper instead of peeking into the structure
directly.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disable gfxoff with the compute workload on gfx12. This is a
workaround for the opencl test failure.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This commit addresses a circular locking dependency issue within the GFX
isolation mechanism. The problem was identified by a warning indicating
a potential deadlock due to inconsistent lock acquisition order.
- The `amdgpu_gfx_enforce_isolation_ring_begin_use` and
`amdgpu_gfx_enforce_isolation_ring_end_use` functions previously
acquired `enforce_isolation_mutex` and called `amdgpu_gfx_kfd_sch_ctrl`,
leading to potential deadlocks. ie., If `amdgpu_gfx_kfd_sch_ctrl` is
called while `enforce_isolation_mutex` is held, and
`amdgpu_gfx_enforce_isolation_handler` is called while `kfd_sch_mutex` is
held, it can create a circular dependency.
By ensuring consistent lock usage, this fix resolves the issue:
[ 606.297333] ======================================================
[ 606.297343] WARNING: possible circular locking dependency detected
[ 606.297353] 6.10.0-amd-mlkd-610-311224-lof #19 Tainted: G OE
[ 606.297365] ------------------------------------------------------
[ 606.297375] kworker/u96:3/3825 is trying to acquire lock:
[ 606.297385] ffff9aa64e431cb8 ((work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work)){+.+.}-{0:0}, at: __flush_work+0x232/0x610
[ 606.297413]
but task is already holding lock:
[ 606.297423] ffff9aa64e432338 (&adev->gfx.kfd_sch_mutex){+.+.}-{3:3}, at: amdgpu_gfx_kfd_sch_ctrl+0x51/0x4d0 [amdgpu]
[ 606.297725]
which lock already depends on the new lock.
[ 606.297738]
the existing dependency chain (in reverse order) is:
[ 606.297749]
-> #2 (&adev->gfx.kfd_sch_mutex){+.+.}-{3:3}:
[ 606.297765] __mutex_lock+0x85/0x930
[ 606.297776] mutex_lock_nested+0x1b/0x30
[ 606.297786] amdgpu_gfx_kfd_sch_ctrl+0x51/0x4d0 [amdgpu]
[ 606.298007] amdgpu_gfx_enforce_isolation_ring_begin_use+0x2a4/0x5d0 [amdgpu]
[ 606.298225] amdgpu_ring_alloc+0x48/0x70 [amdgpu]
[ 606.298412] amdgpu_ib_schedule+0x176/0x8a0 [amdgpu]
[ 606.298603] amdgpu_job_run+0xac/0x1e0 [amdgpu]
[ 606.298866] drm_sched_run_job_work+0x24f/0x430 [gpu_sched]
[ 606.298880] process_one_work+0x21e/0x680
[ 606.298890] worker_thread+0x190/0x350
[ 606.298899] kthread+0xe7/0x120
[ 606.298908] ret_from_fork+0x3c/0x60
[ 606.298919] ret_from_fork_asm+0x1a/0x30
[ 606.298929]
-> #1 (&adev->enforce_isolation_mutex){+.+.}-{3:3}:
[ 606.298947] __mutex_lock+0x85/0x930
[ 606.298956] mutex_lock_nested+0x1b/0x30
[ 606.298966] amdgpu_gfx_enforce_isolation_handler+0x87/0x370 [amdgpu]
[ 606.299190] process_one_work+0x21e/0x680
[ 606.299199] worker_thread+0x190/0x350
[ 606.299208] kthread+0xe7/0x120
[ 606.299217] ret_from_fork+0x3c/0x60
[ 606.299227] ret_from_fork_asm+0x1a/0x30
[ 606.299236]
-> #0 ((work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work)){+.+.}-{0:0}:
[ 606.299257] __lock_acquire+0x16f9/0x2810
[ 606.299267] lock_acquire+0xd1/0x300
[ 606.299276] __flush_work+0x250/0x610
[ 606.299286] cancel_delayed_work_sync+0x71/0x80
[ 606.299296] amdgpu_gfx_kfd_sch_ctrl+0x287/0x4d0 [amdgpu]
[ 606.299509] amdgpu_gfx_enforce_isolation_ring_begin_use+0x2a4/0x5d0 [amdgpu]
[ 606.299723] amdgpu_ring_alloc+0x48/0x70 [amdgpu]
[ 606.299909] amdgpu_ib_schedule+0x176/0x8a0 [amdgpu]
[ 606.300101] amdgpu_job_run+0xac/0x1e0 [amdgpu]
[ 606.300355] drm_sched_run_job_work+0x24f/0x430 [gpu_sched]
[ 606.300369] process_one_work+0x21e/0x680
[ 606.300378] worker_thread+0x190/0x350
[ 606.300387] kthread+0xe7/0x120
[ 606.300396] ret_from_fork+0x3c/0x60
[ 606.300406] ret_from_fork_asm+0x1a/0x30
[ 606.300416]
other info that might help us debug this:
[ 606.300428] Chain exists of:
(work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work) --> &adev->enforce_isolation_mutex --> &adev->gfx.kfd_sch_mutex
[ 606.300458] Possible unsafe locking scenario:
[ 606.300468] CPU0 CPU1
[ 606.300476] ---- ----
[ 606.300484] lock(&adev->gfx.kfd_sch_mutex);
[ 606.300494] lock(&adev->enforce_isolation_mutex);
[ 606.300508] lock(&adev->gfx.kfd_sch_mutex);
[ 606.300521] lock((work_completion)(&(&adev->gfx.enforce_isolation[i].work)->work));
[ 606.300536]
*** DEADLOCK ***
[ 606.300546] 5 locks held by kworker/u96:3/3825:
[ 606.300555] #0: ffff9aa5aa1f5d58 ((wq_completion)comp_1.1.0){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680
[ 606.300577] #1: ffffaa53c3c97e40 ((work_completion)(&sched->work_run_job)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680
[ 606.300600] #2: ffff9aa64e463c98 (&adev->enforce_isolation_mutex){+.+.}-{3:3}, at: amdgpu_gfx_enforce_isolation_ring_begin_use+0x1c3/0x5d0 [amdgpu]
[ 606.300837] #3: ffff9aa64e432338 (&adev->gfx.kfd_sch_mutex){+.+.}-{3:3}, at: amdgpu_gfx_kfd_sch_ctrl+0x51/0x4d0 [amdgpu]
[ 606.301062] #4: ffffffff8c1a5660 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x70/0x610
[ 606.301083]
stack backtrace:
[ 606.301092] CPU: 14 PID: 3825 Comm: kworker/u96:3 Tainted: G OE 6.10.0-amd-mlkd-610-311224-lof #19
[ 606.301109] Hardware name: Gigabyte Technology Co., Ltd. X570S GAMING X/X570S GAMING X, BIOS F7 03/22/2024
[ 606.301124] Workqueue: comp_1.1.0 drm_sched_run_job_work [gpu_sched]
[ 606.301140] Call Trace:
[ 606.301146] <TASK>
[ 606.301154] dump_stack_lvl+0x9b/0xf0
[ 606.301166] dump_stack+0x10/0x20
[ 606.301175] print_circular_bug+0x26c/0x340
[ 606.301187] check_noncircular+0x157/0x170
[ 606.301197] ? register_lock_class+0x48/0x490
[ 606.301213] __lock_acquire+0x16f9/0x2810
[ 606.301230] lock_acquire+0xd1/0x300
[ 606.301239] ? __flush_work+0x232/0x610
[ 606.301250] ? srso_alias_return_thunk+0x5/0xfbef5
[ 606.301261] ? mark_held_locks+0x54/0x90
[ 606.301274] ? __flush_work+0x232/0x610
[ 606.301284] __flush_work+0x250/0x610
[ 606.301293] ? __flush_work+0x232/0x610
[ 606.301305] ? __pfx_wq_barrier_func+0x10/0x10
[ 606.301318] ? mark_held_locks+0x54/0x90
[ 606.301331] ? srso_alias_return_thunk+0x5/0xfbef5
[ 606.301345] cancel_delayed_work_sync+0x71/0x80
[ 606.301356] amdgpu_gfx_kfd_sch_ctrl+0x287/0x4d0 [amdgpu]
[ 606.301661] amdgpu_gfx_enforce_isolation_ring_begin_use+0x2a4/0x5d0 [amdgpu]
[ 606.302050] ? srso_alias_return_thunk+0x5/0xfbef5
[ 606.302069] amdgpu_ring_alloc+0x48/0x70 [amdgpu]
[ 606.302452] amdgpu_ib_schedule+0x176/0x8a0 [amdgpu]
[ 606.302862] ? drm_sched_entity_error+0x82/0x190 [gpu_sched]
[ 606.302890] amdgpu_job_run+0xac/0x1e0 [amdgpu]
[ 606.303366] drm_sched_run_job_work+0x24f/0x430 [gpu_sched]
[ 606.303388] process_one_work+0x21e/0x680
[ 606.303409] worker_thread+0x190/0x350
[ 606.303424] ? __pfx_worker_thread+0x10/0x10
[ 606.303437] kthread+0xe7/0x120
[ 606.303449] ? __pfx_kthread+0x10/0x10
[ 606.303463] ret_from_fork+0x3c/0x60
[ 606.303476] ? __pfx_kthread+0x10/0x10
[ 606.303489] ret_from_fork_asm+0x1a/0x30
[ 606.303512] </TASK>
v2: Refactor lock handling to resolve circular dependency (Alex)
- Introduced a `sched_work` flag to defer the call to
`amdgpu_gfx_kfd_sch_ctrl` until after releasing
`enforce_isolation_mutex`.
- This change ensures that `amdgpu_gfx_kfd_sch_ctrl` is called outside
the critical section, preventing the circular dependency and deadlock.
- The `sched_work` flag is set within the mutex-protected section if
conditions are met, and the actual function call is made afterward.
- This approach ensures consistent lock acquisition order.
Fixes: afefd6f24502 ("drm/amdgpu: Implement Enforce Isolation Handler for KGD/KFD serialization")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.14-2025-01-10:
amdgpu:
- Fix max surface handling in DC
- clang fixes
- DCN 3.5 fixes
- DCN 4.0.1 fixes
- DC CRC fixes
- DML updates
- DSC fixes
- PSR fixes
- DC add some divide by 0 checks
- SMU13 updates
- SR-IOV fixes
- RAS fixes
- Cleaner shader support for gfx10.3 dGPUs
- fix drm buddy trim handling
- SDMA engine reset updates
_ Fix RB bitmap setup
- Fix doorbell ttm cleanup
- Add CEC notifier support
- DPIA updates
- MST fixes
amdkfd:
- Shader debugger fixes
- Trap handler cleanup
- Cleanup includes
- Eviction fence wq fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250110172731.2960668-1-alexander.deucher@amd.com
|