linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2025-04-08	cifs: Fix support for WSL-style symlinks	Pali Rohár
	MS-FSCC in section 2.1.2.7 LX SYMLINK REPARSE_DATA_BUFFER now contains documentation about WSL symlink reparse point buffers. https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/68337353-9153-4ee1-ac6b-419839c3b7ad Fix the struct reparse_wsl_symlink_data_buffer to reflect buffer fields according to the MS-FSCC documentation. Fix the Linux SMB client to correctly fill the WSL symlink reparse point buffer when creaing new WSL-style symlink. There was a mistake during filling the data part of the reparse point buffer. It should starts with bytes "\x02\x00\x00\x00" (which represents version 2) but this constant was written as number 0x02000000 encoded in little endian, which resulted bytes "\x00\x00\x00\x02". This change is fixing this mistake. Fixes: 4e2043be5c14 ("cifs: Add support for creating WSL-style symlinks") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-08	net: libwx: handle page_pool_dev_alloc_pages error	Chenyuan Yang
	page_pool_dev_alloc_pages could return NULL. There was a WARN_ON(!page) but it would still proceed to use the NULL pointer and then crash. This is similar to commit 001ba0902046 ("net: fec: handle page_pool_dev_alloc_pages error"). This is found by our static analysis tool KNighter. Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250407184952.2111299-1-chenyuan0y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-08	Merge branch 'mptcp-only-inc-mpjoinackhmacfailure-for-hmac-failures'	Jakub Kicinski
	Matthieu Baerts says: ==================== mptcp: only inc MPJoinAckHMacFailure for HMAC failures Recently, during a debugging session using local MPTCP connections, I noticed MPJoinAckHMacFailure was strangely not zero on the server side. The first patch fixes this issue -- present since v5.9 -- and the second one validates it in the selftests. ==================== Link: https://patch.msgid.link/20250407-net-mptcp-hmac-failure-mib-v1-0-3c9ecd0a3a50@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-08	selftests: mptcp: validate MPJoin HMacFailure counters	Matthieu Baerts (NGI0)
	The parent commit fixes an issue around these counters where one of them -- MPJoinAckHMacFailure -- was wrongly incremented in some cases. This makes sure the counter is always 0. It should be incremented only in case of corruption, or a wrong implementation, which should not be the case in these selftests. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250407-net-mptcp-hmac-failure-mib-v1-2-3c9ecd0a3a50@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-08	mptcp: only inc MPJoinAckHMacFailure for HMAC failures	Matthieu Baerts (NGI0)
	Recently, during a debugging session using local MPTCP connections, I noticed MPJoinAckHMacFailure was not zero on the server side. The counter was in fact incremented when the PM rejected new subflows, because the 'subflow' limit was reached. The fix is easy, simply dissociating the two cases: only the HMAC validation check should increase MPTCP_MIB_JOINACKMAC counter. Fixes: 4cf8b7e48a09 ("subflow: introduce and use mptcp_can_accept_new_subflow()") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250407-net-mptcp-hmac-failure-mib-v1-1-3c9ecd0a3a50@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-08	selftests/mincore: Allow read-ahead pages to reach the end of the file	Qiuxu Zhuo
	When running the mincore_selftest on a system with an XFS file system, it failed the "check_file_mmap" test case due to the read-ahead pages reaching the end of the file. The failure log is as below: RUN global.check_file_mmap ... mincore_selftest.c:264:check_file_mmap:Expected i (1024) < vec_size (1024) mincore_selftest.c:265:check_file_mmap:Read-ahead pages reached the end of the file check_file_mmap: Test failed FAIL global.check_file_mmap This is because the read-ahead window size of the XFS file system on this machine is 4 MB, which is larger than the size from the #PF address to the end of the file. As a result, all the pages for this file are populated. blockdev --getra /dev/nvme0n1p5 8192 blockdev --getbsz /dev/nvme0n1p5 512 This issue can be fixed by extending the current FILE_SIZE 4MB to a larger number, but it will still fail if the read-ahead window size of the file system is larger enough. Additionally, in the real world, read-ahead pages reaching the end of the file can happen and is an expected behavior. Therefore, allowing read-ahead pages to reach the end of the file is a better choice for the "check_file_mmap" test case. Link: https://lore.kernel.org/r/20250311080940.21413-1-qiuxu.zhuo@intel.com Reported-by: Yi Lai <yi1.lai@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-04-08	selftests/futex: futex_waitv wouldblock test should fail	Edward Liaw
	Testcase should fail if -EWOULDBLOCK is not returned when expected value differs from actual value from the waiter. Link: https://lore.kernel.org/r/20250404221225.1596324-1-edliaw@google.com Fixes: 9d57f7c79748920636f8293d2f01192d702fe390 ("selftests: futex: Test sys_futex_waitv() wouldblock") Signed-off-by: Edward Liaw <edliaw@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-04-08	drm: Add UAPI for the Asahi driver	Alyssa Rosenzweig
	This adds the UAPI for the Asahi driver targeting the GPU in the Apple M1 and M2 series systems on chip. The UAPI design is based on other modern Vulkan-capable drivers, including Xe and Panthor. Memory management is based on explicit VM management. Synchronization is exclusively explicit sync. This UAPI is validated against our open source Mesa stack, which is fully conformant to the OpenGL 4.6, OpenGL ES 3.2, OpenCL 3.0, and Vulkan 1.4 standards. The Vulkan driver supports sparse, exercising the VM_BIND mechanism. This patch adds the standalone UAPI header. It is implemented by an open source DRM driver written in Rust. We fully intend to upstream this driver when possible. However, as a production graphics driver, it depends on a significant number of Rust abstractions that will take a long time to upstream. In the mean time, our userspace is upstream in Mesa but is not allowed to probe with upstream Mesa as the UAPI is not yet reviewed and merged in the upstream kernel. Although we ship a patched Mesa in Fedora Asahi Remix, any containers shipping upstream Mesa builds are broken for our users, including upstream Flatpak and Waydroid runtimes. Additionally, it forces us to maintain forks of Mesa and virglrenderer, which complicates bisects. The intention in sending out this patch is for this UAPI to be thoroughly reviewed. Once we as the DRM community are satisfied with the UAPI, this header lands signifying that the UAPI is stable and must only be evolved in backwards-compatible ways; it will be the UAPI implemented in the DRM driver that eventually lands upstream. That promise lets us enable upstream Mesa, solving all these issues while the upstream Rust abstractions are developed. https://github.com/alyssarosenzweig/linux/commits/agx-uapi-v7 contains the DRM driver implementing this proposed UAPI. https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33984 contains the Mesa patches to implement this proposed UAPI. That Linux and Mesa branch together give a complete graphics/compute stack on top of this UAPI. Co-developed-by: Asahi Lina <lina@asahilina.net> Signed-off-by: Asahi Lina <lina@asahilina.net> Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Janne Grunau <j@jannau.net> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Link: https://lore.kernel.org/r/20250408-agx-uapi-v7-1-ad122d4f7324@rosenzweig.io Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2025-04-08	kunit: Spelling s/slowm/slow/	Geert Uytterhoeven
	Fix a misspelling of "slow". Link: https://lore.kernel.org/r/1f7ebf98598418914ec9f5b6d5cb8583d24a4bf0.1743089563.git.geert@linux-m68k.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <shuah@kernel.org>
2025-04-08	kunit: tool: fix count of tests if late test plan	Rae Moar
	Fix test count with late test plan. For example, TAP version 13 ok 1 test1 1..4 Returns a count of 1 passed, 1 crashed (because it expects tests after the test plan): returning the total count of 2 tests Change this to be 1 passed, 1 error: total count of 1 test Link: https://lore.kernel.org/r/20250319223351.1517262-1-rmoar@google.com Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <shuah@kernel.org>
2025-04-08	selftests: tpm2: test_smoke: use POSIX-conformant expression operator	Ahmed Salem
	Use POSIX-conformant expression operator symbol '='. The use of the non POSIX-conformant symbol '==' would work in bash, but not in sh where the unexpected operator error would result in test_smoke.sh being skipped. Instead of changing the shebang to use bash, which may not be available on all systems, use the POSIX-conformant expression symbol '=' to test for equality. Without this patch: =================== # make -j8 TARGETS=tpm2 kselftest # selftests: tpm2: test_smoke.sh # ./test_smoke.sh: 9: [: 2: unexpected operator ok 1 selftests: tpm2: test_smoke.sh # SKIP With this patch: ================ # make -j8 TARGETS=tpm2 kselftest # selftests: tpm2: test_smoke.sh # Ran 9 tests in 9.236s ok 1 selftests: tpm2: test_smoke.sh Link: https://lore.kernel.org/r/37ztyakgrrtgvec344mg7mspchwjpxxtsprtjidso3pwkmm4f4@awsa5mzgqmtb Signed-off-by: Ahmed Salem <x0rw3ll@gmail.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-04-08	selftests: tpm2: create a dedicated .gitignore	Khaled Elnaggar
	The tpm2 selftests produce two logs: SpaceTest.log and AsyncTest.log. Only SpaceTest.log was listed in selftests/.gitignore, while AsyncTest.log remained untracked. This change creates a dedicated .gitignore in the tpm2/ directory to manage these entries, keeping tpm2-specific patterns isolated from parent .gitignore. Fixed white-space errors during commit Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20250126195147.902608-1-khaledelnaggarlinux@gmail.com Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-04-08	drm/amdgpu/sdma7: add support for disable_kq	Alex Deucher
	When the parameter is set, disable user submissions to kernel queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/sdma6: add support for disable_kq	Alex Deucher
	When the parameter is set, disable user submissions to kernel queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/sdma: add flag for tracking disable_kq	Alex Deucher
	For SDMA, we still need kernel queues for paging so they need to be initialized, but we no not want to accept submissions from userspace when disable_kq is set. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx12: add support for disable_kq	Alex Deucher
	Plumb in support for disabling kernel queues. v2: use ring counts per Felix' suggestion v3: fix stream fault handler, enable EOP interrupts v4: fix MEC interrupt offset (Sunil) v5: clean up after removing extra sched.ready settings Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx11: add support for disable_kq	Alex Deucher
	Plumb in support for disabling kernel queues in GFX11. We have to bring up a GFX queue briefly in order to initialize the clear state. After that we can disable it. v2: use ring counts per Felix' suggestion v3: fix stream fault handler, enable EOP interrupts v4: fix MEC interrupt offset (Sunil) Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/mes: make more vmids available when disable_kq=1	Alex Deucher
	If we don't have kernel queues, the vmids can be used by the MES for user queues. Acked-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/mes: update hqd masks when disable_kq is set	Alex Deucher
	Make all resources available to user queues. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx: add generic handling for disable_kq	Alex Deucher
	Add proper checks for disable_kq functionality in gfx helper functions. Add special logic for families that require the clear state setup. v2: use ring count as per Felix suggestion v3: fix num_gfx_rings handling in amdgpu_gfx_graphics_queue_acquire() v4: fix error code (Alex) Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add ring flag for no user submissions	Alex Deucher
	This would be set by IPs which only accept submissions from the kernel, not userspace, such as when kernel queues are disabled. Don't expose the rings to userspace and reject any submissions in the CS IOCTL. v2: fix error code (Alex) Reviewed-by: Sunil Khatri<sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add parameter to disable kernel queues	Alex Deucher
	On chips that support user queues, setting this option will disable kernel queues to be used to validate user queues without kernel queues. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/userq: prevent runtime pm when userqs are active	Alex Deucher
	Similar to KFD, prevent runtime pm while user queues are active. Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: store userq_managers in a list in adev	Alex Deucher
	So we can iterate across them when we need to manage all user queues. v2: add uq_mgr to adev list in amdgpu_userq_mgr_init Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: bump version for user queue IP support query	Alex Deucher
	Add the user queue IP support query to the drm_amdgpu_info_device query. Cc: marek.olsak@amd.com Cc: prike.liang@amd.com Cc: sunil.khatri@amd.com Cc: yogesh.mohanmarimuthu@amd.com Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add UAPI to query if user queues are supported	Alex Deucher
	Add an INFO query to check if user queues are supported. v2: switch to a mask of IPs (Marek) v3: move to drm_amdgpu_info_device (Marek) Cc: marek.olsak@amd.com Cc: prike.liang@amd.com Cc: sunil.khatri@amd.com Cc: yogesh.mohanmarimuthu@amd.com Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx12: split userq setup to a separate switch	Alex Deucher
	Add a separate switch statement for the userq callback assignment so that we can assign the callbacks for each asic as the firmware becomes available. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx11: clean up and consolidate sw_init	Alex Deucher
	With the ME details fixed, we can now consolidate this state. Also split out the userq setup into a separate switch statement so that we can set them per IP version when the firmwares are ready. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Fix display freezing issue when resizing apps	Arvind Yadav
	The display is freezing because the amdgpu_userq_wait_ioctl() is waiting for a non-user queue fence(specifically, the PT update fence). RootCause: The resume_work is initiated by both amdgpu_userq_suspend and amdgpu_userqueue_ensure_ev_fence at same time. The amdgpu_userq_suspend signals a dma-fence and subsequently triggers the resume_work, which is intended to replace the existing fence by creating new dma-fence. However, following this, the amdgpu_userqueue_ensure_ev_fence schedules another resume_work that generates a new dma-fence, thereby replacing the one created by amdgpu_userq_suspend. Consequently, the original fence will never be signaled. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/mes: warn on unexpected pipe numbers	Alex Deucher
	Warn if the number of pipes exceeds what the MES supports. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/mes: centralize gfx_hqd mask management	Alex Deucher
	Move it to amdgpu_mes to align with the compute and sdma hqd masks. No functional change. v2: rebase on new changes v3: misc optimizations Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri<sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: remove is_mes_queue flag	Alex Deucher
	This was leftover from MES bring up when we had MES user queues in the kernel. It's no longer used so remove it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/mes: remove unused functions	Alex Deucher
	Leftover from the MES self tests that were removed previously. Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: validate user queue parameters	Alex Deucher
	Make sure these are set properly to ensure compatibility if we ever update the IOCTL interface. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: fix the memleak caused by fence not released	Arvind Yadav
	Encountering a taint issue during the unloading of gpu_sched due to the fence not being released/put. In this context, amdgpu_vm_clear_freed is responsible for creating a job to update the page table (PT). It allocates kmem_cache for drm_sched_fence and returns the finished fence associated with job->base.s_fence. In case of Usermode queue this finished fence is added to the timeline sync object through amdgpu_gem_update_bo_mapping, which is utilized by user space to ensure the completion of the PT update. [ 508.900587] ============================================================================= [ 508.900605] BUG drm_sched_fence (Tainted: G N): Objects remaining in drm_sched_fence on __kmem_cache_shutdown() [ 508.900617] ----------------------------------------------------------------------------- [ 508.900627] Slab 0xffffe0cc04548780 objects=32 used=2 fp=0xffff8ea81521f000 flags=0x17ffffc0000240(workingset\|head\|node=0\|zone=2\|lastcpupid=0x1fffff) [ 508.900645] CPU: 3 UID: 0 PID: 2337 Comm: rmmod Tainted: G N 6.12.0+ #1 [ 508.900651] Tainted: [N]=TEST [ 508.900653] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021 [ 508.900656] Call Trace: [ 508.900659] <TASK> [ 508.900665] dump_stack_lvl+0x70/0x90 [ 508.900674] dump_stack+0x14/0x20 [ 508.900678] slab_err+0xcb/0x110 [ 508.900687] ? srso_return_thunk+0x5/0x5f [ 508.900692] ? try_to_grab_pending+0xd3/0x1d0 [ 508.900697] ? srso_return_thunk+0x5/0x5f [ 508.900701] ? mutex_lock+0x17/0x50 [ 508.900708] __kmem_cache_shutdown+0x144/0x2d0 [ 508.900713] ? flush_rcu_work+0x50/0x60 [ 508.900719] kmem_cache_destroy+0x46/0x1f0 [ 508.900728] drm_sched_fence_slab_fini+0x19/0x970 [gpu_sched] [ 508.900736] __do_sys_delete_module.constprop.0+0x184/0x320 [ 508.900744] ? srso_return_thunk+0x5/0x5f [ 508.900747] ? debug_smp_processor_id+0x1b/0x30 [ 508.900754] __x64_sys_delete_module+0x16/0x20 [ 508.900758] x64_sys_call+0xdf/0x20d0 [ 508.900763] do_syscall_64+0x51/0x120 [ 508.900769] entry_SYSCALL_64_after_hwframe+0x76/0x7e v2: call dma_fence_put in amdgpu_gem_va_update_vm v3: Addressed review comments from Christian. - calling amdgpu_gem_update_timeline_node before switch. - puting a dma_fence in case of error or !timeline_syncobj. v4: Addressed review comments from Christian. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/userq: move the header to amdgpu directory	Alex Deucher
	To align with other headers. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/userq: remove BROKEN from config	Alex Deucher
	This can be enabled now. We have the firmware checks in place. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add userq firmware version checks	Alex Deucher
	Currently disabled until the firmwares are officially released. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/gfx11: fix config guard	Alex Deucher
	s/CONFIG_DRM_AMD_USERQ_GFX/CONFIG_DRM_AMDGPU_NAVI3X_USERQ/ Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/Kconfig: fix wording of DRM_AMDGPU_NAVI3X_USERQ	Alex Deucher
	The feature is not navi3x specific at this point. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: return an error in the userq IOCTL when DRM_AMDGPU_NAVI3X_USERQ=n	Alex Deucher
	I'd swear this was already fixed, but I guess the patch never landed. Add it now. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/userq: handle runtime pm	Alex Deucher
	Take a reference when we create a queue and drop it when we destroy the queue. We need to keep the device active while user queues are active. v2: squash in fix from Sunil v3: squash in fix from Prike Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu/userq: fix hardcoded uq functions	Alex Deucher
	Use the IP type to look up the userq functions rather than hardcoding it. Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Fix display freeze lockup error	Arvind Yadav
	A deadlock situation has arised between the userq signal ioctl and the eviction fence. In this scenario, the function amdgpu_userq_signal_ioctl() has acquired a reservation lock on the read/write buffer object (BO) through drm_exec. Subsequently, it calls amdgpu_userqueue_ensure_ev_fence(), which is in a waiting for the userq resume work. Meanwhile, the userq suspend worker has initiated the userq resume work(amdgpu_userqueue_resume_worker). This userq resume work attempts to validate the vm->done BO, leading to amdgpu_userqueue_validate_bos also attempting to reservation lock the same write BO that is already locked by amdgpu_userq_signal_ioctl. As a result, the resume work becomes stalled, causing amdgpu_userqueue_ensure_ev_fence to remain in a waiting state. Call Trace: [ 242.836469] INFO: task gnome-shel:cs0:1288 blocked for more than 120 seconds. [ 242.836486] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4 [ 242.836491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.836494] task:gnome-shel:cs0 state:D stack:0 pid:1288 tgid:1282 ppid:1180 flags:0x00000002 [ 242.836503] Call Trace: [ 242.836508] <TASK> [ 242.836517] __schedule+0x3e0/0xb10 [ 242.836530] ? srso_return_thunk+0x5/0x5f [ 242.836541] schedule+0x31/0x120 [ 242.836546] schedule_timeout+0x150/0x160 [ 242.836551] ? srso_return_thunk+0x5/0x5f [ 242.836555] ? sysvec_call_function+0x69/0xd0 [ 242.836562] ? srso_return_thunk+0x5/0x5f [ 242.836567] ? preempt_count_add+0x7f/0xd0 [ 242.836577] __wait_for_common+0x91/0x180 [ 242.836582] ? __pfx_schedule_timeout+0x10/0x10 [ 242.836590] wait_for_completion+0x28/0x30 [ 242.836595] __flush_work+0x16c/0x290 [ 242.836602] ? __pfx_wq_barrier_func+0x10/0x10 [ 242.836611] flush_delayed_work+0x3a/0x60 [ 242.836621] amdgpu_userqueue_ensure_ev_fence+0x2d/0xb0 [amdgpu] [ 242.836966] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu] [ 242.837171] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.837365] drm_ioctl_kernel+0xae/0x100 [drm] [ 242.837398] drm_ioctl+0x2a1/0x500 [drm] [ 242.837420] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.837622] ? srso_return_thunk+0x5/0x5f [ 242.837627] ? srso_return_thunk+0x5/0x5f [ 242.837630] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 242.837635] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu] [ 242.837811] __x64_sys_ioctl+0x99/0xd0 [ 242.837820] x64_sys_call+0x1209/0x20d0 [ 242.837825] do_syscall_64+0x51/0x120 [ 242.837830] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 242.837835] RIP: 0033:0x7f2f33f1a94f [ 242.837838] RSP: 002b:00007f2f24ffea30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 242.837842] RAX: ffffffffffffffda RBX: 00007f2f24ffebd0 RCX: 00007f2f33f1a94f [ 242.837845] RDX: 00007f2f24ffebd0 RSI: 00000000c0306457 RDI: 000000000000000d [ 242.837847] RBP: 00007f2f24ffeab0 R08: 0000000000000000 R09: 0000000000000000 [ 242.837849] R10: 00007f2f24ffecd0 R11: 0000000000000246 R12: 00007f2f25000640 [ 242.837851] R13: 00000000c0306457 R14: 000000000000000d R15: 00007fff3b39c1e0 [ 242.837858] </TASK> [ 242.837865] INFO: task Xwayland:cs0:1517 blocked for more than 120 seconds. [ 242.837869] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4 [ 242.837872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.837874] task:Xwayland:cs0 state:D stack:0 pid:1517 tgid:1338 ppid:1282 flags:0x00004002 [ 242.837878] Call Trace: [ 242.837880] <TASK> [ 242.837883] __schedule+0x3e0/0xb10 [ 242.837890] schedule+0x31/0x120 [ 242.837894] schedule_preempt_disabled+0x1c/0x30 [ 242.837897] __mutex_lock.constprop.0+0x386/0x6e0 [ 242.837902] ? srso_return_thunk+0x5/0x5f [ 242.837905] ? __timer_delete_sync+0x81/0xe0 [ 242.837911] __mutex_lock_slowpath+0x13/0x20 [ 242.837915] mutex_lock+0x3b/0x50 [ 242.837919] amdgpu_userqueue_ensure_ev_fence+0x35/0xb0 [amdgpu] [ 242.838138] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu] [ 242.838340] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.838531] drm_ioctl_kernel+0xae/0x100 [drm] [ 242.838559] drm_ioctl+0x2a1/0x500 [drm] [ 242.838580] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ 242.838778] ? srso_return_thunk+0x5/0x5f [ 242.838783] ? srso_return_thunk+0x5/0x5f [ 242.838786] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 242.838791] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu] [ 242.838967] __x64_sys_ioctl+0x99/0xd0 [ 242.838972] x64_sys_call+0x1209/0x20d0 [ 242.838975] do_syscall_64+0x51/0x120 [ 242.838979] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 242.838982] RIP: 0033:0x7f9118b1a94f [ 242.838985] RSP: 002b:00007f910cdff760 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 242.838989] RAX: ffffffffffffffda RBX: 00007f910cdff910 RCX: 00007f9118b1a94f [ 242.838991] RDX: 00007f910cdff910 RSI: 00000000c0306457 RDI: 000000000000000c [ 242.838993] RBP: 00007f910cdff7e0 R08: 0000000000000000 R09: 0000000000000001 [ 242.838995] R10: 00007f910cdff9d4 R11: 0000000000000246 R12: 00007f910ce00640 [ 242.838997] R13: 00000000c0306457 R14: 000000000000000c R15: 00007fff9dd11d10 [ 242.839004] </TASK> v2: Addressed review comemnts from Christian. v3/v4: Addressed review comemnts from Christian. - Move drm_exec drm_exec loop after userq fence create. - cleanup the newly created userq fence in case of error. v5 - Addressed review comemnts from Christian. - Create a new amdgpu_userq_fence_alloc() function for allocation. - Calling dma_fence_put for cleanup procedure. - make amdgpu_userq_fence_create() function static. - drm_exec_init is called after mutex_unlock. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Modify the seq64 VM cache policy	Arunpravin Paneer Selvam
	The seq64 VM cache policy should be set to UC (Uncached) to match with userqueue fence address kernel mapped memory's cache settings. Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Fix out-of-bounds issue in user fence	Arunpravin Paneer Selvam
	Fix out-of-bounds issue in userq fence create when accessing the userq xa structure. Added a lock to protect the race condition. v2:(Christian) - Allocate memory with GFP_ATOMIC. v3: - Moved to 2 xa approach. v4:(Christian) - Lock the xa_for_each blocks and memory allocation part as well to make sure that xa is not modified in between the 2 xa_for_each blocks. BUG: KASAN: slab-out-of-bounds in amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000006] Call Trace: [ +0.000005] <TASK> [ +0.000005] dump_stack_lvl+0x6c/0x90 [ +0.000011] print_report+0xc4/0x5e0 [ +0.000009] ? srso_return_thunk+0x5/0x5f [ +0.000008] ? kasan_complete_mode_report_info+0x26/0x1d0 [ +0.000007] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000405] kasan_report+0xdf/0x120 [ +0.000009] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000405] __asan_report_store8_noabort+0x17/0x20 [ +0.000007] amdgpu_userq_fence_create+0x726/0x880 [amdgpu] [ +0.000406] ? __pfx_amdgpu_userq_fence_create+0x10/0x10 [amdgpu] [ +0.000408] ? srso_return_thunk+0x5/0x5f [ +0.000008] ? ttm_resource_move_to_lru_tail+0x235/0x4f0 [ttm] [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000008] amdgpu_userq_signal_ioctl+0xd29/0x1c70 [amdgpu] [ +0.000412] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ +0.000404] ? try_to_wake_up+0x165/0x1840 [ +0.000010] ? __pfx_futex_wake_mark+0x10/0x10 [ +0.000011] drm_ioctl_kernel+0x178/0x2f0 [drm] [ +0.000050] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ +0.000404] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ +0.000043] ? __kasan_check_read+0x11/0x20 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000007] ? __kasan_check_write+0x14/0x20 [ +0.000008] drm_ioctl+0x513/0xd20 [drm] [ +0.000040] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu] [ +0.000407] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ +0.000044] ? srso_return_thunk+0x5/0x5f [ +0.000007] ? _raw_spin_lock_irqsave+0x99/0x100 [ +0.000007] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ +0.000006] ? __rseq_handle_notify_resume+0x188/0xc30 [ +0.000008] ? srso_return_thunk+0x5/0x5f [ +0.000008] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? _raw_spin_unlock_irqrestore+0x27/0x50 [ +0.000010] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu] [ +0.000388] __x64_sys_ioctl+0x135/0x1b0 [ +0.000009] x64_sys_call+0x1205/0x20d0 [ +0.000007] do_syscall_64+0x4d/0x120 [ +0.000008] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000007] RIP: 0033:0x7f7c3d31a94f Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: add db size and offset range for VCN and VPE	Saleemkhan Jamadar
	VCN and VPE have different offset range, update the doorbell offset range repsectively. Doorbell size for VCN and VPE is 32bit. v1 : add gfx switch case and fix checkpatch warnings (Shashank) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: map doorbell for the requested userq	Saleemkhan Jamadar
	Introduce db_info structure to the populate the doorbell information that is required to be mapped. Made changes to the doorbell mapping func more generic, by taking parameters that vary based on IPs and/or usecase into db_info structure. v2 - Fix space alignment and checkpatch warnings(Shashank) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: fix call to amdgpu_eviction_fence_detach	Christian König
	That needs to be done after grabbing the lock, not before. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08	drm/amdgpu: Fix Illegal opcode in command stream Error	Arvind Yadav
	When applications closes, it triggers the drm_file_free function which subsequently releases all allocated buffer objects. Concurrently, the resume_worker thread will attempt to map the usermode queue. However, since the wptr buffer object has already been deallocated, this will result in an Illegal opcode error being raised in the command stream. Now replacing drm_release() with a new function amdgpu_drm_release(). This function will set the flag to prevent the scheduling of any new queue resume/map, stop all queues and then call drm_release(). V2: - Replace drm_release with amdgpu_drm_release(Christian). Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>