summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm
AgeCommit message (Collapse)Author
5 daysMerge tag 'drm-xe-fixes-2025-09-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Don't touch survivability_mode on fini (Michal) - Fixes around eviction and suspend (Thomas) - Extend Wa_13011645652 to PTL-H, WCL (Julia) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aMLq7QlaEPHGKXKX@intel.com
5 daysMerge tag 'drm-misc-fixes-2025-09-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A maintainer update, an out-of-bound check for panthor and a revert for nouveau to fix a race. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250911-glistening-uakari-of-serendipity-06ceb1@houat
5 daysMerge tag 'mediatek-drm-fixes-20250910' of ↵Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250910 1. fix potential OF node use-after-free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250910231813.3526-1-chunkuang.hu@kernel.org
5 daysMerge tag 'amd-drm-fixes-6.17-2025-09-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-10: amdgpu: - PSP 11.x fix - DPCD quirk handing fix - DCN 3.5 PG fix - Audio suspend fix - OEM i2c clean up fix - Module unload memory leak fix - DC delay fix - ISP firmware fix - VCN fixes amdkfd: - P2P topology fix - APU mem limit calculation fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250910162855.2507853-1-alexander.deucher@amd.com
6 daysdrm/mediatek: clean up driver data initialisationJohan Hovold
The platform and drm devices are only used to look up the drm device and its driver data respectively when initialising the driver data during bind(). Drop the reference counts as soon as they have been used to make the code more readable. Note that the crtc count is never incremented on lookup failures. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-3-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
6 daysdrm/mediatek: fix potential OF node use-after-freeJohan Hovold
The for_each_child_of_node() helper drops the reference it takes to each node as it iterates over children and an explicit of_node_put() is only needed when exiting the loop early. Drop the recently introduced bogus additional reference count decrement at each iteration that could potentially lead to a use-after-free. Fixes: 1f403699c40f ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv") Cc: Ma Ke <make24@iscas.ac.cn> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-2-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
7 daysdrm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any timeDavid Rosca
There is no reason to require this to happen on first submitted IB only. We need to wait for the queue to be idle, but it can be done at any time (including when there are multiple video sessions active). Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8908fdce0634a623404e9923ed2f536101a39db5) Cc: stable@vger.kernel.org
7 daysdrm/amdgpu/vcn4: Fix IB parsing with multiple engine info packagesDavid Rosca
There can be multiple engine info packages in one IB and the first one may be common engine, not decode/encode. We need to parse the entire IB instead of stopping after finding first engine info. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit dc8f9f0f45166a6b37864e7a031c726981d6e5fc) Cc: stable@vger.kernel.org
7 daysdrm/amd/amdgpu: Declare isp firmware binary filePratap Nirujogi
Declare isp firmware file isp_4_1_1.bin required by isp4.1.1 device. Suggested-by: Alexey Zagorodnikov <xglooom@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d97b74a833eba1f4f69f67198fd98ef036c0e5f9) Cc: stable@vger.kernel.org
7 daysdrm/amd/display: use udelay rather than fsleepAlex Deucher
This function can be called from an atomic context so we can't use fsleep(). Fixes: 01f60348d8fb ("drm/amd/display: Fix 'failed to blank crtc!'") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4549 Cc: Wen Chen <Wen.Chen3@amd.com> Cc: Fangzhi Zuo <jerry.zuo@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 27e4dc2c0543fd1808cc52bd888ee1e0533c4a2e)
7 daysdrm/amdgpu: fix a memory leak in fence cleanup when unloadingAlex Deucher
Commit b61badd20b44 ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called after that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op as the sched entities we never freed because the ring pointers were already set to NULL. Remove the NULL setting. Reported-by: Lin.Cao <lincao12@amd.com> Cc: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Fixes: b61badd20b44 ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a525fa37aac36c4591cc8b07ae8957862415fbd5) Cc: stable@vger.kernel.org
7 daysdrm/xe: Extend Wa_13011645652 to PTL-H, WCLJulia Filipchuk
Expand workaround to additional graphics architectures. Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.17+ Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250903190122.1028373-2-julia.filipchuk@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 6fc957185e1691bb6dfa4193698a229db537c2a2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
7 daysdrm/xe: Block exec and rebind worker while evicting for suspend / hibernateThomas Hellström
When the xe pm_notifier evicts for suspend / hibernate, there might be racing tasks trying to re-validate again. This can lead to suspend taking excessive time or get stuck in a live-lock. This behaviour becomes much worse with the fix that actually makes re-validation bring back bos to VRAM rather than letting them remain in TT. Prevent that by having exec and the rebind worker waiting for a completion that is set to block by the pm_notifier before suspend and is signaled by the pm_notifier after resume / wakeup. It's probably still possible to craft malicious applications that block suspending. More work is pending to fix that. v3: - Avoid wait_for_completion() in the kernel worker since it could potentially cause work item flushes from freezable processes to wait forever. Instead terminate the rebind workers if needed and re-launch at resume. (Matt Auld) v4: - Fix some bad naming and leftover debug printouts. - Fix kerneldoc. - Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld). - Rework the interface of xe_vm_rebind_resume_worker (Matt Auld). Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-4-thomas.hellstrom@linux.intel.com (cherry picked from commit 599334572a5a99111015fbbd5152ce4dedc2f8b7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
7 daysdrm/xe: Allow the pm notifier to continue on failureThomas Hellström
Its actions are opportunistic anyway and will be completed on device suspend. Marking as a fix to simplify backporting of the fix that follows in the series. v2: - Keep the runtime pm reference over suspend / hibernate and document why. (Matt Auld, Rodrigo Vivi): Fixes: c6a4d46ec1d7 ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-3-thomas.hellstrom@linux.intel.com (cherry picked from commit ebd546fdffddfcaeab08afdd68ec93052c8fa740) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
7 daysdrm/xe: Attempt to bring bos back to VRAM after evictionThomas Hellström
VRAM+TT bos that are evicted from VRAM to TT may remain in TT also after a revalidation following eviction or suspend. This manifests itself as applications becoming sluggish after buffer objects get evicted or after a resume from suspend or hibernation. If the bo supports placement in both VRAM and TT, and we are on DGFX, mark the TT placement as fallback. This means that it is tried only after VRAM + eviction. This flaw has probably been present since the xe module was upstreamed but use a Fixes: commit below where backporting is likely to be simple. For earlier versions we need to open- code the fallback algorithm in the driver. v2: - Remove check for dgfx. (Matthew Auld) - Update the xe_dma_buf kunit test for the new strategy (CI) - Allow dma-buf to pin in current placement (CI) - Make xe_bo_validate() for pinned bos a NOP. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995 Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-2-thomas.hellstrom@linux.intel.com (cherry picked from commit cb3d7b3b46b799c96b54f8e8fe36794a55a77f0b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
7 daysdrm/xe/configfs: Don't touch survivability_mode on finiMichal Wajdeczko
This is a user controlled configfs attribute, we should not modify that outside the configfs attr.store() implementation. Fixes: bc417e54e24b ("drm/xe: Enable configfs support for survivability mode") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250904103521.7130-1-michal.wajdeczko@intel.com (cherry picked from commit 079a5c83dbd23db7a6eed8f558cf75e264d8a17b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
7 daysamd/amdkfd: correct mem limit calculation for small APUsYifan Zhang
Current mem limit check leaks some GTT memory (reserved_for_pt reserved_for_ras + adev->vram_pin_size) for small APUs. Since carveout VRAM is tunable on APUs, there are three case regarding the carveout VRAM size relative to GTT: 1. 0 < carveout < gtt apu_prefer_gtt = true, is_app_apu = false 2. carveout > gtt / 2 apu_prefer_gtt = false, is_app_apu = false 3. 0 = carveout apu_prefer_gtt = true, is_app_apu = true It doesn't make sense to check below limitation in case 1 (default case, small carveout) because the values in the below expression are mixed with carveout and gtt. adev->kfd.vram_used[xcp_id] + vram_needed > vram_size - reserved_for_pt - reserved_for_ras - atomic64_read(&adev->vram_pin_size) gtt: kfd.vram_used, vram_needed, vram_size carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size In case 1, vram allocation will go to gtt domain, skip vram check since ttm_mem_limit check already cover this allocation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fa7c99f04f6dd299388e9282812b14e95558ac8e)
7 daysdrm/amdkfd: fix p2p links bug in topologyEric Huang
When creating p2p links, KFD needs to check XGMI link with two conditions, hive_id and is_sharing_enabled, but it is missing to check is_sharing_enabled, so add it to fix the error. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 36cc7d13178d901982da7a122c883861d98da624)
7 daysdrm/amd/display: remove oem i2c adapter on finishGeoffrey McRae
Fixes a bug where unbinding of the GPU would leave the oem i2c adapter registered resulting in a null pointer dereference when applications try to access the invalid device. Fixes: 3d5470c97314 ("drm/amd/display/dm: add support for OEM i2c bus") Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 89923fb7ead4fdd37b78dd49962d9bb5892403e6) Cc: stable@vger.kernel.org
7 daysdrm/amd/display: Drop dm_prepare_suspend() and dm_complete()Mario Limonciello (AMD)
[Why] dm_prepare_suspend() was added in commit 50e0bae34fa6b ("drm/amd/display: Add and use new dm_prepare_suspend() callback") to allow display to turn off earlier in the suspend sequence. This caused a regression that HDMI audio sometimes didn't work properly after resume unless audio was playing during suspend. [How] Drop dm_prepare_suspend() callback. All code in it will still run during dm_suspend(). Also drop unnecessary dm_complete() callback. dm_complete() was used for failed prepare and also for any case of successful resume. The code in it already runs in dm_resume(). This change will introduce more time that the display is turned on during suspend sequence. The compositor can turn it off sooner if desired. Cc: Harry Wentland <harry.wentland@amd.com> Reported-by: Przemysław Kopa <prz.kopa@gmail.com> Closes: https://lore.kernel.org/amd-gfx/1cea0d56-7739-4ad9-bf8e-c9330faea2bb@kernel.org/T/#m383d9c08397043a271b36c32b64bb80e524e4b0f Reported-by: Kalvin <hikaph+oss@gmail.com> Closes: https://github.com/alsa-project/alsa-lib/issues/465 Closes: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4809 Fixes: 50e0bae34fa6b ("drm/amd/display: Add and use new dm_prepare_suspend() callback") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2fd653b9bb5aacec5d4c421ab290905898fe85a2) Cc: stable@vger.kernel.org
7 daysdrm/amd/display: Correct sequences and delays for DCN35 PG & RCGOvidiu Bunea
[why] The current PG & RCG programming in driver has some gaps and incorrect sequences. [how] Added delays after ungating clocks to allow ramp up, increased polling to allow more time for power up, and removed the incorrect sequences. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1bde5584e297921f45911ae874b0175dce5ed4b5) Cc: stable@vger.kernel.org
7 daysdrm/amd/display: Disable DPCD Probe QuirkFangzhi Zuo
Disable dpcd probe quirk to native aux. Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4500 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250904191351.746707-1-Jerry.Zuo@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c5f4fb40584ee591da9fa090c6f265d11cbb1acf) Cc: stable@vger.kernel.org # 6.16.y: 5281cbe0b55a Cc: stable@vger.kernel.org # 6.16.y: 0b4aa85e8981 Cc: stable@vger.kernel.org # 6.16.y: b87ed522b364 Cc: stable@vger.kernel.org # 6.16.y
7 daysdrm/i915/power: fix size for for_each_set_bit() in abox iterationJani Nikula
for_each_set_bit() expects size to be in bits, not bytes. The abox mask iteration uses bytes, but it works by coincidence, because the local variable holding the mask is unsigned long, and the mask only ever has bit 2 as the highest bit. Using a smaller type could lead to subtle and very hard to track bugs. Fixes: 62afef2811e4 ("drm/i915/rkl: RKL uses ABOX0 for pixel transfers") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250905104149.1144751-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 7ea3baa6efe4bb93d11e1c0e6528b1468d7debf6) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
8 daysdrm/amdgpu: Wait for bootloader after PSPv11 resetLijo Lazar
Some PSPv11 SOCs take a longer time for PSP based mode-1 reset. Instead of checking for C2PMSG_33 status, add the callback wait_for_bootloader. Wait for bootloader to be back to steady state is already part of the generic mode-1 reset flow. Increase the retry count for bootloader wait and also fix the mask to prevent fake pass. Fixes: 8345a71fc54b ("drm/amdgpu: Add more checks to PSP mailbox") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4531 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 32f73741d6ee41fd5db8791c1163931e313d0fdc)
12 daysMerge tag 'amd-drm-fixes-6.17-2025-09-03' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-03: amdgpu: - UserQ fixes - MES 11 fix - eDP/LVDS fix - Fix non-DC audio clean up - Fix duplicate cursor issue - Fix error path in PSP init Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250903221656.251254-1-alexander.deucher@amd.com
12 daysdrm/panthor: validate group queue countChia-I Wu
A panthor group can have at most MAX_CS_PER_CSG panthor queues. Fixes: 4bdca11507928 ("drm/panthor: Add the driver frontend block") Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # v1 Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250903192133.288477-1-olvaffe@gmail.com
13 daysMerge tag 'drm-xe-fixes-2025-09-03' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix incorrect migration of backed-up object to VRAM (Thomas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aLiP26TiHkYxtBXL@intel.com
13 daysMerge tag 'drm-misc-fixes-2025-09-03' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Two nouveau interrupt handling fixes, one race fix for ivpu, a race fix for drm_sched, and a clock fix for ti-sn65dsi86. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/qc2rd7bskgufjtyspbjflyjpswcnhyja6s7nm2yb67j7hezyey@yfn2w6n5trff
13 daysRevert "drm/nouveau: Remove waitque for sched teardown"Philipp Stanner
This reverts: commit bead88002227 ("drm/nouveau: Remove waitque for sched teardown") commit 5f46f5c7af8c ("drm/nouveau: Add new callback for scheduler teardown") from the drm/sched teardown leak fix series: https://lore.kernel.org/dri-devel/20250710125412.128476-2-phasta@kernel.org/ The aforementioned series removed a blocking waitqueue from nouveau_sched_fini(). It was mistakenly assumed that this waitqueue only prevents jobs from leaking, which the series fixed. The waitqueue, however, also guarantees that all VM_BIND related jobs are finished in order, cleaning up mappings in the GPU's MMU. These jobs must be executed sequentially. Without the waitqueue, this is no longer guaranteed, because entity and scheduler teardown can race with each other. Revert all patches related to the waitqueue removal. Fixes: bead88002227 ("drm/nouveau: Remove waitque for sched teardown") Suggested-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250901083107.10206-2-phasta@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
13 daysdrm/amd/amdgpu: Fix missing error return on kzalloc failureColin Ian King
Currently the kzalloc failure check just sets reports the failure and sets the variable ret to -ENOMEM, which is not checked later for this specific error. Fix this by just returning -ENOMEM rather than setting ret. Fixes: 4fb930715468 ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1ee9d1a0962c13ba5ab7e47d33a80e3b8dc4b52e)
14 daysdrm/bridge: ti-sn65dsi86: fix REFCLK settingMichael Walle
The bridge has three bootstrap pins which are sampled to determine the frequency of the external reference clock. The driver will also (over)write that setting. But it seems this is racy after the bridge is enabled. It was observed that although the driver write the correct value (by sniffing on the I2C bus), the register has the wrong value. The datasheet states that the GPIO lines have to be stable for at least 5us after asserting the EN signal. Thus, there seems to be some logic which samples the GPIO lines and this logic appears to overwrite the register value which was set by the driver. Waiting 20us after asserting the EN line resolves this issue. Fixes: a095f15c00e2 ("drm/bridge: add support for sn65dsi86 bridge driver") Signed-off-by: Michael Walle <mwalle@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250821122341.1257286-1-mwalle@kernel.org
2025-09-02drm/xe: Fix incorrect migration of backed-up object to VRAMThomas Hellström
If an object is backed up to shmem it is incorrectly identified as not having valid data by the move code. This means moving to VRAM skips the -EMULTIHOP step and the bo is cleared. This causes all sorts of weird behaviour on DGFX if an already evicted object is targeted by the shrinker. Fix this by using ttm_tt_is_swapped() to identify backed-up objects. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996 Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250828134837.5709-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 1047bd82794a1eab64d643f196d09171ce983f44) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-09-02drm/sched: Fix racy access to drm_sched_entity.dependencyPierre-Eric Pelloux-Prayer
The drm_sched_job_unschedulable trace point can access entity->dependency after it was cleared by the callback installed in drm_sched_entity_add_dependency_cb, causing: BUG: kernel NULL pointer dereference, address: 0000000000000020 [...] Workqueue: comp_1.1.0 drm_sched_run_job_work [gpu_sched] RIP: 0010:trace_event_raw_event_drm_sched_job_unschedulable+0x70/0xd0 [gpu_sched] To fix this we either need to keep a reference to the fence before setting up the callbacks, or move the trace_drm_sched_job_unschedulable calls into drm_sched_entity_add_dependency_cb where they can be done earlier. Fixes: 76d97c870f29 ("drm/sched: Trace dependencies for GPU jobs") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250901124032.1955-1-pierre-eric.pelloux-prayer@amd.com (cherry picked from commit b2b8af21fec35be417a3199b5a6c354605dd222a) Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-08-29nouveau: Membar before between semaphore writes and the interruptFaith Ekstrand
This ensures that the memory write and the interrupt are properly ordered and we won't wake up the kernel before the semaphore write has hit memory. Fixes: b1ca384772b6 ("drm/nouveau/gv100-: switch to volta semaphore methods") Cc: stable@vger.kernel.org Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250829021633.1674524-2-airlied@gmail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-29nouveau: fix disabling the nonstall irq due to storm codeDave Airlie
Nouveau has code that when it gets an IRQ with no allowed handler it disables it to avoid storms. However with nonstall interrupts, we often disable them from the drm driver, but still request their emission via the push submission. Just don't disable nonstall irqs ever in normal operation, the event handling code will filter them out, and the driver will just enable/disable them at load time. This fixes timeouts we've been seeing on/off for a long time, but they became a lot more noticeable on Blackwell. This doesn't fix all of them, there is a subsequent fence emission fix to fix the last few. Fixes: 3ebd64aa3c4f ("drm/nouveau/intr: support multiple trees, and explicit interfaces") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250829021633.1674524-1-airlied@gmail.com [ Fix a typo and a minor checkpatch.pl warning; remove "v2" from commit subject. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-08-29drm/amd/display: Clear the CUR_ENABLE register on DCN314 w/out DPP PGIvan Lipski
[Why&How] ON DCN314, clearing DPP SW structure without power gating it can cause a double cursor in full screen with non-native scaling. A W/A that clears CURSOR0_CONTROL cursor_enable flag if dcn10_plane_atomic_power_down is called and DPP power gating is disabled. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4168 Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 645f74f1dc119dad5a2c7bbc05cc315e76883011) Cc: stable@vger.kernel.org
2025-08-29drm/amdgpu: drop hw access in non-DC audio finiAlex Deucher
We already disable the audio pins in hw_fini so there is no need to do it again in sw_fini. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4481 Cc: oushixiong <oushixiong1025@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5eeb16ca727f11278b2917fd4311a7d7efb0bbd6) Cc: stable@vger.kernel.org
2025-08-29drm/amd: Re-enable common modes for eDP and LVDSMario Limonciello
[Why] Although compositors will add their own modes, Xorg won't use it's own modes and will only stick to modes advertised by the driver. This mean a user that used to pick 1024x768 could no longer access it unless the panel's native resolution was 1024x768. [How] Revert commit 6d396e7ac1ce3 ("drm/amd/display: Disable common modes for LVDS") and commit 7948afb46af92 ("drm/amd/display: Disable common modes for eDP"). The panel will still use scaling for any non-native modes due to commit 978fa2f6d0b12 ("drm/amd/display: Use scaling for non-native resolutions on eDP") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4538 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250828140856.2887993-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c2fbf72fe3c2d08856e834ca43328a8829a261d8)
2025-08-29drm/amdgpu/mes11: make MES_MISC_OP_CHANGE_CONFIG failure non-fatalAlex Deucher
If the firmware is too old, just warn and return success. Fixes: 27b791514789 ("drm/amdgpu/mes: keep enforce isolation up to date") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4414 Cc: shaoyun.Liu@amd.com Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9f28af76fab0948b59673f69c10aeec47de11c60) Cc: stable@vger.kernel.org
2025-08-29drm/amdgpu/sdma: bump firmware version checks for user queue supportJesse.Zhang
Using the previous firmware could lead to problems with PROTECTED_FENCE_SIGNAL commands, specifically causing register conflicts between MCU_DBG0 and MCU_DBG1. The updated firmware versions ensure proper alignment and unification of the SDMA_SUBOP_PROTECTED_FENCE_SIGNAL value with SDMA 7.x, resolving these hardware coordination issues Fixes: e8cca30d8b34 ("drm/amdgpu/sdma6: add ucode version checks for userq support") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit aab8b689aded255425db3d80c0030d1ba02fe2ef) Cc: stable@vger.kernel.org
2025-08-29Merge tag 'mediatek-drm-fixes-20250829' of ↵Dave Airlie
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250829 1. Add error handling for old state CRTC in atomic_disable 2. Fix DSI host and panel bridge pre-enable order 3. Fix device/node reference count leaks in mtk_drm_get_all_drm_priv 4. mtk_hdmi: Fix inverted parameters in some regmap_update_bits calls Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250828234116.4960-1-chunkuang.hu@kernel.org
2025-08-28drm/mediatek: mtk_hdmi: Fix inverted parameters in some regmap_update_bits callsLouis-Alexis Eyraud
In mtk_hdmi driver, a recent change replaced custom register access function calls by regmap ones, but two replacements by regmap_update_bits were done incorrectly, because original offset and mask parameters were inverted, so fix them. Fixes: d6e25b3590a0 ("drm/mediatek: hdmi: Use regmap instead of iomem for main registers") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250818-mt8173-fix-hdmi-issue-v1-1-55aff9b0295d@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-08-29Merge tag 'drm-msm-fixes-2025-08-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v6.17-rc4 Core/GPU: - fix comment doc warning in gpuvm - fix build with KMS disabled - fix pgtable setup/teardown race - global fault counter fix - various error path fixes - GPU devcoredump snapshot fixes - handle in-place VM_BIND remaps to solve turnip vm update race - skip re-emitting IBs for unusable VMs - Don't use %pK through printk - moved display snapshot init earlier, fixing a crash DPU: - Fixed crash in virtual plane checking code - Fixed mode comparison in virtual plane checking code DSI: - Adjusted width of resulution-related registers - Fixed locking issue on 14nm PLLs UBWC (per Bjorn's ack) - Added UBWC configuration for several missing platforms (fixing regression) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <rob.clark@oss.qualcomm.com> Link: https://lore.kernel.org/r/CACSVV02+u1VW1dzuz6JWwVEfpgTj6Y-JXMH+vX43KsKTVsW+Yg@mail.gmail.com
2025-08-29Merge tag 'amd-drm-fixes-6.17-2025-08-28' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-08-28: amdgpu: - UserQ fixes - Revert CSA fix - SR-IOV fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250828173904.75850-1-alexander.deucher@amd.com
2025-08-29Merge tag 'drm-misc-fixes-2025-08-28' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Several nouveau fixes to remove unused code, fix an error path and be less restrictive with the formats it accepts. A fix for amdgpu to pin vmapped dma-buf, and a revert for tegra for a regression in the dma-buf / GEM code. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250828-hypersonic-colorful-squirrel-64f04b@houat
2025-08-27drm/amdgpu/userq: fix error handling of invalid doorbellAlex Deucher
If the doorbell is invalid, be sure to set the r to an error state so the function returns an error. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7e2a5b0a9a165a7c51274aa01b18be29491b4345) Cc: stable@vger.kernel.org
2025-08-27drm/amdgpu: update firmware version checks for user queue supportJesse.Zhang
The minimum firmware versions required for user queue functionality have been increased to address an issue where the queue privilege state was lost during queue connect operations. The problem occurred because the privilege state was being restored to its initial value at the beginning of the function, overwriting the state that was properly set during the queue connect case. This commit updates the minimum version requirements: - ME firmware from 2390 to 2420 - PFP firmware from 2530 to 2580 - MEC firmware from 2600 to 2650 - MES firmware remains at 120 These updated firmware versions contain the necessary fixes to properly maintain queue privilege state throughout connect operations. Fixes: 61ca97e9590c ("drm/amdgpu: Add fw minimum version check for usermode queue") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5f976c9939f0d5916d2b8ef3156a6d1799781df1) Cc: stable@vger.kernel.org
2025-08-27drm/amd/amdgpu: disable hwmon power1_cap* for gfx 11.0.3 on vf modeYang Wang
the PPSMC_MSG_GetPptLimit msg is not valid for gfx 11.0.3 on vf mode, so skiped to create power1_cap* hwmon sysfs node. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e82a8d441038d8cb10b63047a9e705c42479d156) Cc: stable@vger.kernel.org
2025-08-27Revert "drm/amdgpu: fix incorrect vm flags to map bo"Alex Deucher
This reverts commit b08425fa77ad2f305fe57a33dceb456be03b653f. Revert this to align with 6.17 because the fixes tag was wrong on this commit. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit be33e8a239aac204d7e9e673c4220ef244eb1ba3)
2025-08-27drm/amdgpu/gfx12: set MQD as appriopriate for queue typesAlex Deucher
Set the MQD as appropriate for the kernel vs user queues. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7b9110f2897957efd9715b52fc01986509729db3) Cc: stable@vger.kernel.org