summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd
AgeCommit message (Collapse)Author
2022-07-13drm/amdkfd: correct the MEC atomic support firmware checking for GC 10.3.7Prike Liang
On the GC 10.3.7 platform the initial MEC release version #3 can support atomic operation,so need correct and set its MEC atomic support version to #3. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-07drm/amdgpu: Fix one list corruption when create queue failsxinhui pan
Queue would be freed when create_queue_cpsch fails So lets do queue cleanup otherwise various list and memory issues happen. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-07drm/amdkfd: optimize svm range evictEric Huang
It is to avoid unnecessary queue eviction when range is not mapped to gpu. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-07drm/amdkfd: change svm range evictEric Huang
Adding always evict queues when flag is set to KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED as if XNACK off. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: Asynchronously free smi_clientPhilip Yang
The synchronize_rcu may take several ms, which noticeably slows down applications close SMI event handle. Use call_rcu to free client->fifo and client asynchronously and eliminate the synchronize_rcu call in the user thread. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: Add unmap from GPU SMI eventPhilip Yang
SVM range unmapped from GPUs when range is unmapped from CPU, or with xnack on from MMU notifier when range is evicted or migrated. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: Add user queue eviction restore SMI eventPhilip Yang
Output user queue eviction and restore event. User queue eviction may be triggered by svm or userptr MMU notifier, TTM eviction, device suspend and CRIU checkpoint and restore. User queue restore may be rescheduled if eviction happens again while restore. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: Add migration SMI eventPhilip Yang
For migration start and end event, output timestamp when migration starts, ends, svm range address and size, GPU id of migration source and destination and svm range attributes, Migration trigger could be prefetch, CPU or GPU page fault and TTM eviction. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: Add GPU recoverable fault SMI eventPhilip Yang
Use ktime_get_boottime_ns() as timestamp to correlate with other APIs. Output timestamp when GPU recoverable fault starts and ends to recover the fault, if migration happened or only GPU page table is updated to recover, fault address, if read or write fault. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: Enable per process SMI eventPhilip Yang
Process receive event from same process by default. Add a flag to be able to receive event from all processes, this requires super user permission. Event using pid 0 to send the event to all processes, to keep the default behavior of existing SMI events. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-30drm/amdkfd: fix cu mask for asics with wgpsJonathan Kim
GFX10 and up have work group processors (WGP) and WGP mode is the native compile mode. KFD and ROCr have no visibility into whether a dispatch is operating in CU or WGP mode. Enforce CU masking to be pairwise continguous in enablement and round robin distribute CUs across the SEs in a pairwise manner to assume WGP mode at all times. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-28Revert "drm/amdkfd: Free queue after unmap queue success"Philip Yang
This reverts commit ab8529b0cdb271d9b222cbbddb2641f3fca5df8f. This causes KFDTest KFDMemoryTest.MemoryRegister test failed on gfx9. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-28drm/amdgpu: add mc wptr addr support for mesJack Xiao
MES requires mc wptr address for usermode queues. Export bo gart address for mc wptr address. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-23drm/amdgpu: Update mes_v11_api_def.hGraham Sider
Update MES API to support oversubscription without aggregated doorbell for usermode queues. v2: Change oversubscription_no_aggregated_en to is_kfd_process (align with MES) Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-23drm/amdkfd: Enable GFX11 usermode queue oversubscriptionGraham Sider
Starting with GFX11, MES requires wptr BOs to be GTT allocated/mapped to GART for usermode queues in order to support oversubscription. In the case that work is submitted to an unmapped queue, MES must have a GART wptr address to determine whether the queue should be mapped. This change is accompanied with changes in MES and is applicable for MES_API_VERSION >= 2. v3: - Use amdgpu_vm_bo_lookup_mapping for wptr_bo mapping lookup - Move wptr_bo refcount increment to amdgpu_amdkfd_map_gtt_bo_to_gart - Remove list_del_init from amdgpu_amdkfd_map_gtt_bo_to_gart - Cleanup/fix create_queue wptr_bo error handling v4: - Add MES version shift/mask defines to amdgpu_mes.h - Change version check from MES_VERSION to MES_API_VERSION - Add check in kfd_ioctl_create_queue before wptr bo pin/GART map to ensure bo is a single page. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22drm/amdkfd: Free queue after unmap queue successPhilip Yang
After queue unmap or remove from MES successfully, free queue sysfs entries, doorbell and remove from queue list. Otherwise, application may destroy queue again, cause below kernel warning or crash backtrace. For outstanding queues, either application forget to destroy or failed to destroy, kfd_process_notifier_release will remove queue sysfs entries, kfd_process_wq_release will free queue doorbell. v2: decrement_queue_count for MES queue refcount_t: underflow; use-after-free. WARNING: CPU: 7 PID: 3053 at lib/refcount.c:28 Call Trace: kobject_put+0xd6/0x1a0 kfd_procfs_del_queue+0x27/0x30 [amdgpu] pqm_destroy_queue+0xeb/0x240 [amdgpu] kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu] kfd_ioctl+0x27d/0x500 [amdgpu] do_syscall_64+0x35/0x80 WARNING: CPU: 2 PID: 3053 at drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:400 Call Trace: deallocate_doorbell.isra.0+0x39/0x40 [amdgpu] destroy_queue_cpsch+0xb3/0x270 [amdgpu] pqm_destroy_queue+0x108/0x240 [amdgpu] kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu] kfd_ioctl+0x27d/0x500 [amdgpu] general protection fault, probably for non-canonical address 0xdead000000000108: Call Trace: pqm_destroy_queue+0xf0/0x200 [amdgpu] kfd_ioctl_destroy_queue+0x2f/0x60 [amdgpu] kfd_ioctl+0x19b/0x600 [amdgpu] Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-22drm/amdkfd: Add queue to MES if it becomes activePhilip Yang
We remove the user queue from MES scheduler to update queue properties. If the queue becomes active after updating, add the user queue to MES scheduler, to be able to handle command packet submission. v2: don't break pqm_set_gws Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-21drm/amdkfd: correct sdma queue number of sdma 6.0.1Yifan Zhang
sdma 6.0.1 has 8 queues instead of 2. Fixes: 26776a7031c423 ("drm/amdkfd: add GC 11.0.1 KFD support") Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-14drm/amdkfd: fix warning when CONFIG_HSA_AMD_P2P is not setAlex Deucher
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1542:11: warning: variable 'i' set but not used [-Wunused-but-set-variable] Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-14drm/amdkfd: Add available memory ioctlDaniel Phillips
Add a new KFD ioctl to return the largest possible memory size that can be allocated as a buffer object using kfd_ioctl_alloc_memory_of_gpu. It attempts to use exactly the same accept/reject criteria as that function so that allocating a new buffer object of the size returned by this new ioctl is guaranteed to succeed, barring races with other allocating tasks. This IOCTL will be used by libhsakmt: https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-10drm/amdkfd: Remove field io_link_count from struct kfd_topology_deviceRamesh Errabolu
The field is redundant and does not serve any functional role Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdkfd: Document and fix GTT BO kmap APIFelix Kuehling
Removed an unused parameter from two functions and added kernel-doc comments. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdkfd: Extend KFD device topology to surface peer-to-peer linksRamesh Errabolu
Extend KFD device topology to surface peer-to-peer links among GPU devices connected over PCIe or xGMI. Enabling HSA_AMD_P2P is REQUIRED to surface peer-to-peer links. Prior to this KFD did not expose to user mode any P2P links or indirect links that go over two or more direct hops. Old versions of the Thunk used to make up their own P2P and indirect links without the information about peer-accessibility and chipset support available to the kernel mode driver. In this patch we expose P2P links in a new sysfs directory to provide more reliable P2P link information to user mode. Old versions of the Thunk will continue to work as before and ignore the new directory. This avoids conflicts between P2P links exposed by KFD and P2P links created by the Thunk itself. New versions of the Thunk will use only the P2P links provided in the new p2p_links directory, if it exists, or fall back to the old code path on older KFDs that don't expose p2p_links. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdkfd: Define config HSA_AMD_P2P to support peer-to-peerRamesh Errabolu
Extend current kernel config requirements of amdgpu by adding config HSA_AMD_P2P. Enabling HSA_AMD_P2P is REQUIRED to support peer-to-peer communication between AMD GPU devices over PCIe bus Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-08drm/amdkfd:Fix fw version for 10.3.6Jesse Zhang
fix fw error when loading fw for 10.3.6 Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-06-03drm/amdkfd: Fix partial migration bugsPhilip Yang
Migration range from system memory to VRAM, if system page can not be locked or unmapped, we do partial migration and leave some pages in system memory. Several bugs found to copy pages and update GPU mapping for this situation: 1. copy to vram should use migrate->npage which is total pages of range as npages, not migrate->cpages which is number of pages can be migrated. 2. After partial copy, set VRAM res cursor as j + 1, j is number of system pages copied plus 1 page to skip copy. 3. copy to ram, should collect all continuous VRAM pages and copy together. 4. Call amdgpu_vm_update_range, should pass in offset as bytes, not as number of pages. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-06-03drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitionsMario Limonciello
Loading amdgpu on GC 10.3.7 shows an ERR level message: `kfd kfd: amdgpu: GC IP 0a0307 not supported in kfd` Add these targets to match yellow carp structures. Reported-by: David Chang <david.chang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Jesse(Jie) Zhang <Jesse.Zhang@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.18.x
2022-06-01drm/amdkfd: Use mmget_not_zero in MMU notifierPhilip Yang
MMU notifier callback may pass in mm with mm->mm_users==0 when process is exiting, use mmget_no_zero to avoid accessing invalid mm in deferred list work after mm is gone. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLEChristian König
Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO doesn't needs to be preserved during eviction. KFD was already using a similar functionality for SVM BOs so replace the internal flag with the new UAPI. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26drm/amdkfd: Add gfx11 trap handlerJay Cornwall
Based on gfx10 with following changes: - GPR_ALLOC.VGPR_SIZE field moved (and size corrected in gfx10) - s_sendmsg_rtn_b64 replaces some s_sendmsg/s_getreg - Buffer instructions no longer have direct-to-LDS modifier Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Laurent Morichetti <laurent.morichetti@amd.com> Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26drm/amdkfd: port cwsr trap handler from dkms branchEric Huang
Most of changes are for debugger feature, and it is to simplify trap handler support for new asics in the future. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-26drm/amdkfd: simplify cpu hive assignmentJonathan Kim
CPU hive assignment currently assumes when a GPU hive is connected_to_cpu, there is only one hive in the system. Only assign CPUs to the hive if they are explicitly directly connected to the GPU hive to get rid of the need for this assumption. It's more efficient to do this when querying IO links since other non-CRAT info has to be filled in anyways. Also, stop re-assigning the same CPU to the same GPU hive if it has already been done before. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-19Merge tag 'amd-drm-next-5.19-2022-05-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-05-18: amdgpu: - Misc code cleanups - Additional SMU 13.x enablement - Smartshift fixes - GFX11 fixes - Support for SMU 13.0.4 - SMU mutex fix - Suspend/resume fix amdkfd: - static checker fix - Doorbell/MMIO resource handling fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220518205621.5741-1-alexander.deucher@amd.com
2022-05-16drm/amdkfd: Fix static checker warning on MES queue typeGraham Sider
convert_to_mes_queue_type return can be negative, but queue_input.queue_type is uint32_t. Put return in integer var and cast to unsigned after negative check. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-10drm/amdkfd: Update event_interrupt_isr_v11 returnGraham Sider
Add amdgpu_no_queue_eviction_on_vm_fault condition to event_interrupt_isr_v11 return. If no queue eviction on vm fault specified, function should return false for client/source ids specifying vm fault. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-06drm/amdkfd: Return true/false (not 1/0) from bool functionsYang Li
Return boolean values ("true" or "false") instead of 1 or 0 from bool functions. This fixes the following warnings from coccicheck: ./drivers/gpu/drm/amd/amdkfd/kfd_int_process_v11.c:244:9-10: WARNING: return of 0/1 in function 'event_interrupt_isr_v11' with return type bool Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-06drm/amdkfd: add GC 11.0.1 KFD supportHuang Rui
Add initial support for GC 11.0.1 in KFD compute driver. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-06Merge tag 'amd-drm-next-5.19-2022-04-29' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-04-29: amdgpu - RAS updates - SI dpm deadlock fix - Misc code cleanups - HDCP fixes - PSR fixes - DSC fixes - SDMA doorbell cleanups - S0ix fix - DC FP fix - Zen dom0 regression fix for APUs - IP discovery updates - Initial SoC21 support - Support for new vbios tables - Runtime PM fixes - Add PSP TA debugfs interface amdkfd: - Misc code cleanups - Ignore bogus MEC signals more efficiently - SVM fixes - Use bitmap helpers radeon: - Misc code cleanups - Spelling/grammer fixes From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220429144853.5742-1-alexander.deucher@amd.com
2022-05-05drm/amdkfd: add asic support for GC 11.0.2Eric Huang
Changes are inherited from GC 11.0.0. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-05drm/amdkfd: add asic support for SDMA 6.0.2Eric Huang
It is inherited from SDMA 6.0.0. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04drm/amdkfd: Add KFD support for soc21 v3Mukul Joshi
Add initial support for soc21 in KFD compute driver (Mukul) - Add new definition for soc21 device. - Add new file for amdgpu-kfd interface for GFX11 family. - Add new file for queue management, interrupt handling, mqd management for GFX11 family in KFD driver. - Related changes/updates for soc21 device in KFD driver. - Repurpose last 2 entries of SDMA MQD for driver use. v2: Add an optional argument into update queue operation (Mukul) v3: Switch to ip version check, replace kgd_dev with amdgpu_device (Hawking) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-04drm/amdkfd: add helper to generate cache info from gfx configAlex Deucher
Rather than using hardcoded tables, we can use the gfx and gmc config pulled from the IP discovery table to generate the cache configuration. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28drm/amdkfd: Fix circular lock dependency warningMukul Joshi
[ 168.544078] ====================================================== [ 168.550309] WARNING: possible circular locking dependency detected [ 168.556523] 5.16.0-kfd-fkuehlin #148 Tainted: G E [ 168.562558] ------------------------------------------------------ [ 168.568764] kfdtest/3479 is trying to acquire lock: [ 168.573672] ffffffffc0927a70 (&topology_lock){++++}-{3:3}, at: kfd_topology_device_by_id+0x16/0x60 [amdgpu] [ 168.583663] but task is already holding lock: [ 168.589529] ffff97d303dee668 (&mm->mmap_lock#2){++++}-{3:3}, at: vm_mmap_pgoff+0xa9/0x180 [ 168.597755] which lock already depends on the new lock. [ 168.605970] the existing dependency chain (in reverse order) is: [ 168.613487] -> #3 (&mm->mmap_lock#2){++++}-{3:3}: [ 168.619700] lock_acquire+0xca/0x2e0 [ 168.623814] down_read+0x3e/0x140 [ 168.627676] do_user_addr_fault+0x40d/0x690 [ 168.632399] exc_page_fault+0x6f/0x270 [ 168.636692] asm_exc_page_fault+0x1e/0x30 [ 168.641249] filldir64+0xc8/0x1e0 [ 168.645115] call_filldir+0x7c/0x110 [ 168.649238] ext4_readdir+0x58e/0x940 [ 168.653442] iterate_dir+0x16a/0x1b0 [ 168.657558] __x64_sys_getdents64+0x83/0x140 [ 168.662375] do_syscall_64+0x35/0x80 [ 168.666492] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 168.672095] -> #2 (&type->i_mutex_dir_key#6){++++}-{3:3}: [ 168.679008] lock_acquire+0xca/0x2e0 [ 168.683122] down_read+0x3e/0x140 [ 168.686982] path_openat+0x5b2/0xa50 [ 168.691095] do_file_open_root+0xfc/0x190 [ 168.695652] file_open_root+0xd8/0x1b0 [ 168.702010] kernel_read_file_from_path_initns+0xc4/0x140 [ 168.709542] _request_firmware+0x2e9/0x5e0 [ 168.715741] request_firmware+0x32/0x50 [ 168.721667] amdgpu_cgs_get_firmware_info+0x370/0xdd0 [amdgpu] [ 168.730060] smu7_upload_smu_firmware_image+0x53/0x190 [amdgpu] [ 168.738414] fiji_start_smu+0xcf/0x4e0 [amdgpu] [ 168.745539] pp_dpm_load_fw+0x21/0x30 [amdgpu] [ 168.752503] amdgpu_pm_load_smu_firmware+0x4b/0x80 [amdgpu] [ 168.760698] amdgpu_device_fw_loading+0xb8/0x140 [amdgpu] [ 168.768412] amdgpu_device_init.cold+0xdf6/0x1716 [amdgpu] [ 168.776285] amdgpu_driver_load_kms+0x15/0x120 [amdgpu] [ 168.784034] amdgpu_pci_probe+0x19b/0x3a0 [amdgpu] [ 168.791161] local_pci_probe+0x40/0x80 [ 168.797027] work_for_cpu_fn+0x10/0x20 [ 168.802839] process_one_work+0x273/0x5b0 [ 168.808903] worker_thread+0x20f/0x3d0 [ 168.814700] kthread+0x176/0x1a0 [ 168.819968] ret_from_fork+0x1f/0x30 [ 168.825563] -> #1 (&adev->pm.mutex){+.+.}-{3:3}: [ 168.834721] lock_acquire+0xca/0x2e0 [ 168.840364] __mutex_lock+0xa2/0x930 [ 168.846020] amdgpu_dpm_get_mclk+0x37/0x60 [amdgpu] [ 168.853257] amdgpu_amdkfd_get_local_mem_info+0xba/0xe0 [amdgpu] [ 168.861547] kfd_create_vcrat_image_gpu+0x1b1/0xbb0 [amdgpu] [ 168.869478] kfd_create_crat_image_virtual+0x447/0x510 [amdgpu] [ 168.877884] kfd_topology_add_device+0x5c8/0x6f0 [amdgpu] [ 168.885556] kgd2kfd_device_init.cold+0x385/0x4c5 [amdgpu] [ 168.893347] amdgpu_amdkfd_device_init+0x138/0x180 [amdgpu] [ 168.901177] amdgpu_device_init.cold+0x141b/0x1716 [amdgpu] [ 168.909025] amdgpu_driver_load_kms+0x15/0x120 [amdgpu] [ 168.916458] amdgpu_pci_probe+0x19b/0x3a0 [amdgpu] [ 168.923442] local_pci_probe+0x40/0x80 [ 168.929249] work_for_cpu_fn+0x10/0x20 [ 168.935008] process_one_work+0x273/0x5b0 [ 168.940944] worker_thread+0x20f/0x3d0 [ 168.946623] kthread+0x176/0x1a0 [ 168.951765] ret_from_fork+0x1f/0x30 [ 168.957277] -> #0 (&topology_lock){++++}-{3:3}: [ 168.965993] check_prev_add+0x8f/0xbf0 [ 168.971613] __lock_acquire+0x1299/0x1ca0 [ 168.977485] lock_acquire+0xca/0x2e0 [ 168.982877] down_read+0x3e/0x140 [ 168.987975] kfd_topology_device_by_id+0x16/0x60 [amdgpu] [ 168.995583] kfd_device_by_id+0xa/0x20 [amdgpu] [ 169.002180] kfd_mmap+0x95/0x200 [amdgpu] [ 169.008293] mmap_region+0x337/0x5a0 [ 169.013679] do_mmap+0x3aa/0x540 [ 169.018678] vm_mmap_pgoff+0xdc/0x180 [ 169.024095] ksys_mmap_pgoff+0x186/0x1f0 [ 169.029734] do_syscall_64+0x35/0x80 [ 169.035005] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 169.041754] other info that might help us debug this: [ 169.053276] Chain exists of: &topology_lock --> &type->i_mutex_dir_key#6 --> &mm->mmap_lock#2 [ 169.068389] Possible unsafe locking scenario: [ 169.076661] CPU0 CPU1 [ 169.082383] ---- ---- [ 169.088087] lock(&mm->mmap_lock#2); [ 169.092922] lock(&type->i_mutex_dir_key#6); [ 169.100975] lock(&mm->mmap_lock#2); [ 169.108320] lock(&topology_lock); [ 169.112957] *** DEADLOCK *** This commit fixes the deadlock warning by ensuring pm.mutex is not held while holding the topology lock. For this, kfd_local_mem_info is moved into the KFD dev struct and filled during device init. This cached value can then be used instead of querying the value again and again. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28drm/amdkfd: Fix updating IO links during device removalMukul Joshi
The logic to update the IO links when a KFD device is removed was not correct as it would miss updating the proximity domain values for some nodes where the node_from and node_to both were greater values than the proximity domain value of the KFD device being removed from topology. Fixes: 46d18d510d7831 ("drm/amdkfd: Cleanup IO links during KFD device removal") Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28drm/amdkfd: Use non-atomic bitmap functions when possibleChristophe JAILLET
All uses of the 'kfd->gtt_sa_bitmap' bitmap are protected with the 'kfd->gtt_sa_lock' mutex. So: - prefer the non-atomic '__set_bit()' function - use the non-atomic 'bitmap_[set|clear]()' functions instead of equivalent 'for' loops. These functions can work on several bits at a time Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28drm/amdkfd: Use bitmap_zalloc() when applicableChristophe JAILLET
'kfd->gtt_sa_bitmap' is a bitmap. So use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-28Merge tag 'amd-drm-next-5.19-2022-04-22' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-04-22: amdgpu: - SMU message documentation update - Misc code cleanups - Documenation updates - PSP TA updates - Runtime PM regression fix - SR-IOV header cleanup - Misc fixes amdkfd: - TLB flush fixes - GWS fixes - CRIU GWS support radeon: - Misc code cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220422150049.5859-1-alexander.deucher@amd.com
2022-04-28Merge tag 'amd-drm-next-5.19-2022-04-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.19-2022-04-15: amdgpu: - USB-C updates - GPUVM updates - TMZ fixes for RV - DCN 3.1 pstate fixes - Display z state fixes - RAS fixes - Misc code cleanups and spelling fixes - More DC FP rework - GPUVM TLB handling rework - Power management sysfs code cleanup - Add RAS support for VCN - Backlight fix - Add unique id support for more asics - Misc display updates - SR-IOV fixes - Extend CG and PG flags to 64 bits - Enable VCN clk sysfs nodes for navi12 amdkfd: - Fix IO link cleanup during device removal - RAS fixes - Retry fault fixes - Asynchronously free events - SVM fixes radeon: - Drop some dead code - Misc code cleanups From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220415135144.5700-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-04-26drm/amdkfd: Update mapping if range attributes changedPhilip Yang
Change SVM range mapping flags or access attributes don't trigger migration, if range is already mapped on GPUs we should update GPU mapping and pass flush_tlb flag true to amdgpu vm. Change SVM range preferred_loc or migration granularity don't need update GPU mapping, skip the validate_and_map. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-26drm/amdkfd: Add SVM range mapped_to_gpu flagPhilip Yang
To avoid unnecessary unmap SVM range from GPUs if range is not mapped on GPUs when migrating the range. This flag will also be used to flush TLB when updating the existing mapping on GPUs. It is protected by prange->migrate_mutex and mmap read lock in MMU notifier callback. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>