summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-07-16drm/ttm: fix locking in test ttm_bo_validate_no_placement_signaledChristian König
The test works even without it, but lockdep starts screaming when it is activated. Trivially fix it by acquiring the lock before we try to allocate something. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710144129.1803-1-christian.koenig@amd.com
2025-07-15dt-bindings: display: panel: samsung,atna30dw01: document ATNA30DW01Dale Whinham
The Samsung ATNA30DW01 panel is a 13" AMOLED eDP panel. It is similar to the ATNA33XC20 except that it is smaller and has a higher resolution. Tested-by: Jérôme de Bretagne <jerome.debretagne@gmail.com> Signed-off-by: Dale Whinham <daleyo@gmail.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250714173554.14223-3-daleyo@gmail.com
2025-07-15drm/bridge: megachips-stdpxxxx-ge-b850v3-fw: Fix a compile error due to ↵Andy Yan
bridge->detect parameter changes Fix the compile error due to bridge->detect parameter changes. Reported-by: Dixit Ashutosh <ashutosh.dixit@intel.com> Closes: https://lore.kernel.org/dri-devel/175250667117.3567548.8371527247937906463.b4-ty@oss.qualcomm.com/T/#m8ecd00a05a330bc9c76f11c981daafcb30a7c2e0 Fixes: 5d156a9c3d5e ("drm/bridge: Pass down connector to drm bridge detect hook") Signed-off-by: Andy Yan <andyshrk@163.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250715054754.800765-1-andyshrk@163.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-07-15drm/panfrost: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the resetMaíra Canal
Panfrost can skip the reset if TDR has fired before the free-job worker. Currently, since Panfrost doesn't take any action on these scenarios, the job is being leaked, considering that `free_job()` won't be called. To avoid such leaks, inform the scheduler that the job did not actually timeout and no reset was performed through the new status code DRM_GPU_SCHED_STAT_NO_HANG. Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-8-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/xe: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the resetMaíra Canal
Xe can skip the reset if TDR has fired before the free job worker and can also re-arm the timeout timer in some scenarios. Instead of manipulating scheduler's internals, inform the scheduler that the job did not actually timeout and no reset was performed through the new status code DRM_GPU_SCHED_STAT_NO_HANG. Note that, in the first case, there is no need to restart submission if it hasn't been stopped. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-7-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/etnaviv: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the resetMaíra Canal
Etnaviv can skip a hardware reset in two situations: 1. TDR has fired before the free-job worker and the timeout is spurious. 2. The GPU is still making progress on the front-end and we can give the job a chance to complete. Instead of manipulating scheduler's internals, inform the scheduler that the job did not actually timeout and no reset was performed through the new status code DRM_GPU_SCHED_STAT_NO_HANG. Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-6-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/v3d: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the resetMaíra Canal
When a CL/CSD job times out, we check if the GPU has made any progress since the last timeout. If so, instead of resetting the hardware, we skip the reset and allow the timer to be rearmed. This gives long-running jobs a chance to complete. Instead of manipulating scheduler's internals, inform the scheduler that the job did not actually timeout and no reset was performed through the new status code DRM_GPU_SCHED_STAT_NO_HANG. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-5-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/sched: Add new test for DRM_GPU_SCHED_STAT_NO_HANGMaíra Canal
Add a test to submit a single job against a scheduler with the timeout configured and verify that if the job is still running, the timeout handler will skip the reset and allow the job to complete. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-4-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/sched: Make timeout KUnit tests fasterMaíra Canal
As more KUnit tests are introduced to evaluate the basic capabilities of the `timedout_job()` hook, the test suite will continue to increase in duration. To reduce the overall running time of the test suite, decrease the scheduler's timeout for the timeout tests. Before this commit: [15:42:26] Elapsed time: 15.637s total, 0.002s configuring, 10.387s building, 5.229s running After this commit: [15:45:26] Elapsed time: 9.263s total, 0.002s configuring, 5.168s building, 4.037s running Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Acked-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-3-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/sched: Allow drivers to skip the reset and keep on runningMaíra Canal
When the DRM scheduler times out, it's possible that the GPU isn't hung; instead, a job just took unusually long (longer than the timeout) but is still running, and there is, thus, no reason to reset the hardware. This can occur in two scenarios: 1. The job is taking longer than the timeout, but the driver determined through a GPU-specific mechanism that the hardware is still making progress. Hence, the driver would like the scheduler to skip the timeout and treat the job as still pending from then onward. This happens in v3d, Etnaviv, and Xe. 2. Timeout has fired before the free-job worker. Consequently, the scheduler calls `sched->ops->timedout_job()` for a job that isn't timed out. These two scenarios are problematic because the job was removed from the `sched->pending_list` before calling `sched->ops->timedout_job()`, which means that when the job finishes, it won't be freed by the scheduler though `sched->ops->free_job()` - leading to a memory leak. To solve these problems, create a new `drm_gpu_sched_stat`, called DRM_GPU_SCHED_STAT_NO_HANG, which allows a driver to skip the reset. The new status will indicate that the job must be reinserted into `sched->pending_list`, and the hardware / driver will still complete that job. Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-2-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESETMaíra Canal
Among the scheduler's statuses, the only one that indicates an error is DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV signifies that the operation succeeded and the GPU is in a nominal state. However, to provide more information about the GPU's status, it is needed to convey more information than just "OK". Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has hung, but it has been successfully reset and is now in a nominal state again. Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15dt-bindings: display: rockchip,dw-mipi-dsi: Drop address/size cellsDiederik de Haas
The "rockchip,dw-mipi-dsi" binding has allOf "snps,dw-mipi-dsi.yaml" which has allOf "dsi-controller.yaml", which already has #address-cells and #size-cells defined as '1' and '0' respectively. So drop this re-definition. Signed-off-by: Diederik de Haas <didi.debian@cknow.org> Reviewed-by: "Rob Herring (Arm)" <robh@kernel.org> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20250709132323.128757-4-didi.debian@cknow.org
2025-07-14drm/panel-edp: Add BOE NE14QDM panel for Dell Latitude 7455Val Packett
Cannot confirm which variant exactly it is, as the EDID alphanumeric data contains '0RGNR' <0x80> 'NE14QDM' and ends there; but it's 60 Hz and with touch. I do not have access to datasheets for these panels, so the timing is a guess that was tested to work fine on this laptop. Raw EDID dump: 00 ff ff ff ff ff ff 00 09 e5 1e 0b 00 00 00 00 10 20 01 04 a5 1e 13 78 07 fd 85 a7 53 4c 9b 25 0f 50 54 00 00 00 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 a7 6d 00 a0 a0 40 78 60 30 20 36 00 2e bc 10 00 00 1a b9 57 00 a0 a0 40 78 60 30 20 36 00 2e bc 10 00 00 1a 00 00 00 fe 00 30 52 47 4e 52 80 4e 45 31 34 51 44 4d 00 00 00 00 00 02 41 31 a8 00 01 00 00 1a 41 0a 20 20 00 8f Signed-off-by: Val Packett <val@packett.cool> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250706205723.9790-7-val@packett.cool
2025-07-14drm/panthor: Remove dead VM flushing codeAdrián Larumbe
Commit ec62d37d2c0d("drm/panthor: Fix the fast-reset logic") did away with the only reference to panthor_vm_flush_all(), so let's get rid of the orphaned definition. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250711154557.739326-1-adrian.larumbe@collabora.com
2025-07-14drm/bridge: Pass down connector to drm bridge detect hookAndy Yan
In some application scenarios, we hope to get the corresponding connector when the bridge's detect hook is invoked. In most cases, we can get the connector by drm_atomic_get_connector_for_encoder if the encoder attached to the bridge is enabled, however there will still be some scenarios where the detect hook of the bridge is called but the corresponding encoder has not been enabled yet. For instance, this occurs when the device is hot plug in for the first time. Since the call to bridge's detect is initiated by the connector, passing down the corresponding connector directly will make things simpler. Signed-off-by: Andy Yan <andy.yan@rock-chips.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250703125027.311109-3-andyshrk@163.com [DB: added the chunk to the cdn-dp driver] Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-07-14drm/bridge: Make dp/hdmi_audio_* callback keep the same paramter order with ↵Andy Yan
get_modes Make the dp/hdmi_audio_* callback maintain the same parameter order as get_modes and edid_read: first the bridge, then the connector. Signed-off-by: Andy Yan <andy.yan@rock-chips.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250703125027.311109-2-andyshrk@163.com [DB: added the chunk to the cdn-dp driver] Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-07-13PM: hibernate: Add stub for pm_hibernate_is_recovering()Mario Limonciello
Randy reports that amdgpu fails to compile with the following error: ERROR: modpost: "pm_hibernate_is_recovering" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! This happens because pm_hibernate_is_recovering() is only compiled when CONFIG_PM_SLEEP is set. Add a stub for it so that drivers don't need to depend upon CONFIG_PM. Cc: Samuel Zhang <guoqing.zhang@amd.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/dri-devel/CAJZ5v0h1CX+aTu7dFy6vB-9LM6t5J4rt7Su3qVnq1xx-BFAm=Q@mail.gmail.com/T/#m2b9fe212b35fde11d58fcbc4e0727bc02ebba7b0 Fixes: c2aaddbd2dede ("PM: hibernate: add new api pm_hibernate_is_recovering()") Acked-by: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250712233715.821424-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-07-11drm: rust: rename as_ref() to from_raw() for drm constructorsAlice Ryhl
The prefix as_* should not be used for a constructor. Constructors usually use the prefix from_* instead. Some prior art in the stdlib: Box::from_raw, CString::from_raw, Rc::from_raw, Arc::from_raw, Waker::from_raw, File::from_raw_fd. There is also prior art in the kernel crate: cpufreq::Policy::from_raw, fs::File::from_raw_file, Kuid::from_raw, ARef::from_raw, SeqFile::from_raw, VmaNew::from_raw, Io::from_raw. Link: https://lore.kernel.org/r/aCd8D5IA0RXZvtcv@pollux Signed-off-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250711-device-as-ref-v2-2-1b16ab6402d7@google.com
2025-07-10drm/amdgpu: Fix lifetime of struct amdgpu_task_info after ring resetAndré Almeida
When a ring reset happens, amdgpu calls drm_dev_wedged_event() using struct amdgpu_task_info *ti as one of the arguments. After using *ti, a call to amdgpu_vm_put_task_info(ti) is required to correctly track its lifetime. However, it's called from a place that the ring reset path never reaches due to a goto after drm_dev_wedged_event() is called. Move amdgpu_vm_put_task_info() bellow the exit label to make sure that it's called regardless of the code path. amdgpu_vm_put_task_info() can only accept a valid address or NULL as argument, so initialise *ti to make sure we can call this function if *ti isn't used. Fixes: a72002cb181f ("drm/amdgpu: Make use of drm_wedge_task_info") Reported-by: Dave Airlie <airlied@gmail.com> Closes: https://lore.kernel.org/dri-devel/CAPM=9tz0rQP8VZWKWyuF8kUMqRScxqoa6aVdwWw9=5yYxyYQ2Q@mail.gmail.com/ Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250704030629.1064397-1-andrealmeid@igalia.com Signed-off-by: André Almeida <andrealmeid@igalia.com>
2025-07-10drm/doc: Fix grammar for "Task information"André Almeida
Remove the repetitive wording at the end of "Task information" section. Reviewed-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20250704190724.1159416-3-andrealmeid@igalia.com Signed-off-by: André Almeida <andrealmeid@igalia.com>
2025-07-10drm: Add missing struct drm_wedge_task_info kernel docAndré Almeida
Fix the following kernel doc warning: include/drm/drm_device.h:40: warning: Function parameter or struct member 'pid' not described in 'drm_wedge_task_info' include/drm/drm_device.h:40: warning: Function parameter or struct member 'comm' not described in 'drm_wedge_task_info' Fixes: 183bccafa176 ("drm: Create a task info option for wedge events") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/lkml/20250618151307.4a1a5e17@canb.auug.org.au/ Reviewed-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250704190724.1159416-2-andrealmeid@igalia.com Signed-off-by: André Almeida <andrealmeid@igalia.com>
2025-07-10drm/doc: Fix title underline for "Task information"André Almeida
Fix the following warning: Documentation/gpu/drm-uapi.rst:450: WARNING: Title underline too short. Task information --------------- [docutils] Fixes: cd37124b4093 ("drm/doc: Add a section about "Task information" for the wedge API") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/lkml/20250618150333.5ded99a0@canb.auug.org.au/ Reviewed-by: Raag Jadav <raag.jadav@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20250704190724.1159416-1-andrealmeid@igalia.com Signed-off-by: André Almeida <andrealmeid@igalia.com>
2025-07-10drm/amdgpu: do not resume device in thaw for normal hibernationSamuel Zhang
For normal hibernation, GPU do not need to be resumed in thaw since it is not involved in writing the hibernation image. Skip resume in this case can reduce the hibernation time. On VM with 8 * 192GB VRAM dGPUs, 98% VRAM usage and 1.7TB system memory, this can save 50 minutes. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Tested-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Link: https://lore.kernel.org/r/20250710062313.3226149-6-guoqing.zhang@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-07-10PM: hibernate: add new api pm_hibernate_is_recovering()Samuel Zhang
dev_pm_ops.thaw() is called in following cases: * normal case: after hibernation image has been created. * error case 1: creation of a hibernation image has failed. * error case 2: restoration from a hibernation image has failed. For normal case, it is called mainly for resume storage devices for saving the hibernation image. Other devices that are not involved in the image saving do not need to resume the device. But since there's no api to know which case thaw() is called, device drivers can't conditionally resume device in thaw(). The new pm_hibernate_is_recovering() is such a api to query if thaw() is called in normal case. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20250710062313.3226149-5-guoqing.zhang@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-07-10PM: hibernate: shrink shmem pages after dev_pm_ops.prepare()Samuel Zhang
When hibernate with data center dGPUs, huge number of VRAM data will be moved to shmem during dev_pm_ops.prepare(). These shmem pages take a lot of system memory so that there's no enough free memory for creating the hibernation image. This will cause hibernation fail and abort. After dev_pm_ops.prepare(), call shrink_all_memory() to force move shmem pages to swap disk and reclaim the pages, so that there's enough system memory for hibernation image and less pages needed to copy to the image. This patch can only flush and free about half shmem pages. It will be better to flush and free more pages, even all of shmem pages, so that there're less pages to be copied to the hibernation image and the overall hibernation time can be reduced. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20250710062313.3226149-4-guoqing.zhang@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-07-10drm/amdgpu: move GTT to shmem after eviction for hibernationSamuel Zhang
When hibernate with data center dGPUs, huge number of VRAM BOs evicted to GTT and takes too much system memory. This will cause hibernation fail due to insufficient memory for creating the hibernation image. Move GTT BOs to shmem in KMD, then shmem to swap disk in kernel hibernation code to make room for hibernation image. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250710062313.3226149-3-guoqing.zhang@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-07-10drm/ttm: add new api ttm_device_prepare_hibernation()Samuel Zhang
This new api is used for hibernation to move GTT BOs to shmem after VRAM eviction. shmem will be flushed to swap disk later to reduce the system memory usage for hibernation. Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250710062313.3226149-2-guoqing.zhang@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-07-10drm/nouveau: Remove waitque for sched teardownPhilipp Stanner
struct nouveau_sched contains a waitque needed to prevent drm_sched_fini() from being called while there are still jobs pending. Doing so so far would have caused memory leaks. With the new memleak-free mode of operation switched on in drm_sched_fini() by providing the callback nouveau_sched_cancel_job() the waitque is not necessary anymore. Remove the waitque. Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-10-phasta@kernel.org
2025-07-10drm/nouveau: Add new callback for scheduler teardownPhilipp Stanner
There is a new callback for always tearing the scheduler down in a leak-free, deadlock-free manner. Port Nouveau as its first user by providing the scheduler with a callback that ensures the fence context gets killed in drm_sched_fini(). Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-9-phasta@kernel.org
2025-07-10drm/nouveau: Make fence container helper usable driver-widePhilipp Stanner
In order to implement a new DRM GPU scheduler callback in Nouveau, a helper for obtaining a nouveau_fence from a dma_fence is necessary. Such a helper exists already inside nouveau_fence.c, called from_fence(). Make that helper available to other C files with a more precise name. Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-8-phasta@kernel.org
2025-07-10drm/sched: Warn if pending_list is not emptyPhilipp Stanner
drm_sched_fini() can leak jobs under certain circumstances. Warn if that happens. Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-7-phasta@kernel.org
2025-07-10drm/sched/tests: Add unit test for cancel_job()Philipp Stanner
The scheduler unit tests now provide a new callback, cancel_job(). This callback gets used by drm_sched_fini() for all still pending jobs to cancel them. Implement a new unit test to test this. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-6-phasta@kernel.org
2025-07-10drm/sched/tests: Implement cancel_job() callbackPhilipp Stanner
The GPU Scheduler now supports a new callback, cancel_job(), which lets the scheduler cancel all jobs which might not yet be freed when drm_sched_fini() runs. Using this callback allows for significantly simplifying the mock scheduler teardown code. Implement the cancel_job() callback and adjust the code where necessary. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-5-phasta@kernel.org
2025-07-10drm/sched: Avoid memory leaks with cancel_job() callbackPhilipp Stanner
Since its inception, the GPU scheduler can leak memory if the driver calls drm_sched_fini() while there are still jobs in flight. The simplest way to solve this in a backwards compatible manner is by adding a new callback, drm_sched_backend_ops.cancel_job(), which instructs the driver to signal the hardware fence associated with the job. Afterwards, the scheduler can safely use the established free_job() callback for freeing the job. Implement the new backend_ops callback cancel_job(). Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/ Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250710125412.128476-4-phasta@kernel.org
2025-07-10drm/panthor: Fix UAF in panthor_gem_create_with_handle() debugfs codeSimona Vetter
The object is potentially already gone after the drm_gem_object_put(). In general the object should be fully constructed before calling drm_gem_handle_create(), except the debugfs tracking uses a separate lock and list and separate flag to denotate whether the object is actually initialized. Since I'm touching this all anyway simplify this by only adding the object to the debugfs when it's ready for that, which allows us to delete that separate flag. panthor_gem_debugfs_bo_rm() already checks whether we've actually been added to the list or this is some error path cleanup. v2: Fix build issues for !CONFIG_DEBUGFS (Adrián) v3: Add linebreak and remove outdated comment (Liviu) Fixes: a3707f53eb3f ("drm/panthor: show device-wide list of DRM GEM objects over DebugFS") Cc: Adrián Larumbe <adrian.larumbe@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Simona Vetter <simona.vetter@intel.com> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250709135220.1428931-1-simona.vetter@ffwll.ch
2025-07-09fbcon: Fix outdated registered_fb reference in commentShixiong Ou
The variable was renamed to fbcon_registered_fb, but this comment was not updated along with the change. Correct it to avoid confusion. Signed-off-by: Shixiong Ou <oushixiong@kylinos.cn> Fixes: efc3acbc105a ("fbcon: Maintain a private array of fb_info") [sima: Add Fixes: line.] Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250709103438.572309-1-oushixiong1025@163.com
2025-07-09drm/ast: Gen7: Switch default registers to gen4+ stateThomas Zimmermann
Change the default register settings for Gen7 to mach Gen4 and later. Gen7 currently uses the settings for Gen1, which is most likely incorrect. Using Gen4+ settings enables E2M linear-access modes in VGACRA2. It appears to be related to the chip's PCIE2MBOX feature, which is unused. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-11-tzimmermann@suse.de
2025-07-09drm/ast: Gen7: Disable VGASR0[1] as on Gen4+Thomas Zimmermann
Set VGACRB6[5], which disables asynchronous sequencer resets via VGASR0[1]. This was most likely an oversight when adding support for Gen7. Aligns Gen7 with the earlier Gen4+. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-10-tzimmermann@suse.de
2025-07-09drm/ast: Split ast_set_def_ext_reg() by chip generationThomas Zimmermann
Duplicate ast_set_def_ext_reg() for individual chip generations and move call it into per-chip source files. Remove the original code. AST2100 and AST2500 reuse the function from earlier chips. AST2600 appears to be incorrect as it uses an older function. Keep this behavior for now. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-9-tzimmermann@suse.de
2025-07-09drm/ast: Handle known struct ast_dramstruct with helpersThomas Zimmermann
Most of struct ast_dramstruct stores hardware state. Some index values have known or special meaning. The known values are - 0xffff - Terminal entry in the array - 0xff00 - Delays the programming for usecs - 0x0004 - Sets the type of DRAM Add constants and helper macros for these cases. Also add a helper macro for testing. Update Gen1 and Gen2+ accordingly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-8-tzimmermann@suse.de
2025-07-09drm/ast: Move struct ast_dramstruct to ast_post.hThomas Zimmermann
Declare struct ast_dramstruct in ast_post.h and remove its original header file. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-7-tzimmermann@suse.de
2025-07-09drm/ast: Move Gen2+ and Gen1 POST code to separate source filesThomas Zimmermann
Move POST code for Gen2+ and Gen1 to separate source files and hide it in ast_2100_post() ans ast_2000_post(). With P2A configuration, the POST logic for these chip generations has been mingled in ast_init_dram_reg(). Hence, handle all generations in a single change. The split simplifies both cases. Also move the DRAM init tables for each Gen into the respective source file. No changes to the overall logic. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-6-tzimmermann@suse.de
2025-07-09drm/ast: Move Gen4+ POST code to separate source fileThomas Zimmermann
Move POST code for Gen4+ to separate source file and hide it in ast_2300_post(). With P2A configuration, it performs a full board POST and enables the transmitter chip; otherwise it only enables the transmitter chip. Also fix coding style in several places. No changes to the overall logic. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-5-tzimmermann@suse.de
2025-07-09drm/ast: Move Gen6+ POST code to separate source fileThomas Zimmermann
Move POST code for Gen6+ to separate source file and hide it in ast_2500_post(). With P2A configuration, it performs a full board POST; otherwise it enables the transmitter chip. No changes to the overall logic. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-4-tzimmermann@suse.de
2025-07-09drm/ast: Move Gen7+ POST code to separate source fileThomas Zimmermann
Move POST code for Gen7+ to separate source file and hide it in ast_2600_post(). There's not much going on here except for enabling the DP transmitter chip. v2: - simplify logic (Jocelyn) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-3-tzimmermann@suse.de
2025-07-09drm/ast: Declare helpers for POST in headerThomas Zimmermann
Provide POST helpers in header file before splitting up the AST POST code. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250706162816.211552-2-tzimmermann@suse.de
2025-07-09dma-buf: heaps: Give default CMA heap a fixed nameJared Kangas
The CMA heap's name in devtmpfs can vary depending on how the heap is defined. Its name defaults to "reserved", but if a CMA area is defined in the devicetree, the heap takes on the devicetree node's name, such as "default-pool" or "linux,cma". To simplify naming, unconditionally name it "default_cma_region", but keep a legacy node in place backed by the same underlying allocator for backwards compatibility. Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jared Kangas <jkangas@redhat.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://lore.kernel.org/r/20250610131231.1724627-4-jkangas@redhat.com
2025-07-09dma-buf: heaps: Parameterize heap name in __add_cma_heap()Jared Kangas
Prepare for the introduction of a fixed-name CMA heap by replacing the unused void pointer parameter in __add_cma_heap() with the heap name. Reviewed-by: Maxime Ripard <mripard@kernel.org> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Jared Kangas <jkangas@redhat.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://lore.kernel.org/r/20250610131231.1724627-3-jkangas@redhat.com
2025-07-09Documentation: dma-buf: heaps: Fix code markupJared Kangas
Code snippets should be wrapped in double backticks to follow reStructuredText semantics; the use of single backticks uses the :title-reference: role by default, which isn't quite what we want. Add double backticks to code snippets to fix this. Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jared Kangas <jkangas@redhat.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://lore.kernel.org/r/20250610131231.1724627-2-jkangas@redhat.com
2025-07-09dma-buf: system_heap: No separate allocation for attachment sg_tablesT.J. Mercier
struct dma_heap_attachment is a separate allocation from the struct sg_table it contains, but there is no reason for this. Let's use the slab allocator just once instead of twice for dma_heap_attachment. Signed-off-by: T.J. Mercier <tjmercier@google.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://lore.kernel.org/r/20250417180943.1559755-1-tjmercier@google.com