diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 12:53:43 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 12:53:43 -0700 |
commit | e058a84bfddc42ba356a2316f2cf1141974625c9 (patch) | |
tree | e6a02dd913e83f44ea9f5a779f9b9bd56d06a9e3 /drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | |
parent | c288d9cd710433e5991d58a0764c4d08a933b871 (diff) | |
parent | 8a02ea42bc1d4c448caf1bab0e05899dad503f74 (diff) |
Merge tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights:
- AMD enables two more GPUs, with resulting header files
- i915 has started to move to TTM for discrete GPU and enable DG1
discrete GPU support (not by default yet)
- new HyperV drm driver
- vmwgfx adds arm64 support
- TTM refactoring ongoing
- 16bpc display support for AMD hw
Otherwise it's just the usual insane amounts of work all over the
place in lots of drivers and the core, as mostly summarised below:
Core:
- mark AGP ioctls as legacy
- disable force probing for non-master clients
- HDR metadata property helpers
- HDMI infoframe signal colorimetry support
- remove drm_device.pdev pointer
- remove DRM_KMS_FB_HELPER config option
- remove drm_pci_alloc/free
- drm_err_*/drm_dbg_* helpers
- use drm driver names for fbdev
- leaked DMA handle fix
- 16bpc fixed point format fourcc
- add prefetching memcpy for WC
- Documentation fixes
aperture:
- add aperture ownership helpers
dp:
- aux fixes
- downstream 0 port handling
- use extended base receiver capability DPCD
- Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec
- mst: use khz as link rate during init
- VCPI fixes for StarTech hub
ttm:
- provide tt_shrink file via debugfs
- warn about freeing pinned BOs
- fix swapping error handling
- move page alignment into BO
- cleanup ttm_agp_backend
- add ttm_sys_manager
- don't override vm_ops
- ttm_bo_mmap removed
- make ttm_resource base of all managers
- remove VM_MIXEDMAP usage
panel:
- sysfs_emit support
- simple: runtime PM support
- simple: power up panel when reading EDID + caching
bridge:
- MHDP8546: HDCP support + DT bindings
- MHDP8546: Register DP AUX channel with userspace
- TI SN65DSI83 + SN65DSI84: add driver
- Sil8620: Fix module dependencies
- dw-hdmi: make CEC driver loading optional
- Ti-sn65dsi86: refclk fixes, subdrivers, runtime pm
- It66121: Add driver + DT bindings
- Adv7511: Support I2S IEC958 encoding
- Anx7625: fix power-on delay
- Nwi-dsi: Modesetting fixes; Cleanups
- lt6911: add missing MODULE_DEVICE_TABLE
- cdns: fix PM reference leak
hyperv:
- add new DRM driver for HyperV graphics
efifb:
- non-PCI device handling fixes
i915:
- refactor IP/device versioning
- XeLPD Display IP preperation work
- ADL-P enablement patches
- DG1 uAPI behind BROKEN
- disable mmap ioctl for discerte GPUs
- start enabling HuC loading for Gen12+
- major GuC backend rework for new platforms
- initial TTM support for Discrete GPUs
- locking rework for TTM prep
- use correct max source link rate for eDP
- %p4cc format printing
- GLK display fixes
- VLV DSI panel power fixes
- PSR2 disabled for RKL and ADL-S
- ACPI _DSM invalid access fixed
- DMC FW path abstraction
- ADL-S PCI ID update
- uAPI headers converted to kerneldoc
- initial LMEM support for DG1
- x86/gpu: add Jasperlake to gen11 early quirks
amdgpu:
- Aldebaran updates + initial SR-IOV
- new GPU: Beige Goby and Yellow Carp support
- more LTTPR display work
- Vangogh updates
- SDMA 5.x GCR fixes
- PCIe ASPM support
- Renoir TMZ enablement
- initial multiple eDP panel support
- use fdinfo to track devices/process info
- pin/unpin TTM fixes
- free resource on fence usage query
- fix fence calculation
- fix hotunplug/suspend issues
- GC/MM register access macro cleanup for SR-IOV
- W=1 fixes
- ACPI ATCS/ATIF handling rework
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes
- new INFO query for additional vbios info
amdkfd:
- SR-IOV aldebaran support
- HMM SVM support
radeon:
- SMU regression fixes
- Oland flickering fix
vmwgfx:
- enable console with fbdev emulation
- fix cpu updates of coherent multisample surfaces
- remove reservation semaphore
- add initial SVGA3 support
- support arm64
msm:
- devcoredump support for display errors
- dpu/dsi: yaml bindings conversion
- mdp5: alpha/blend_mode/zpos support
- a6xx: cached coherent buffer support
- gpu iova fault improvement
- a660 support
rockchip:
- RK3036 win1 scaling support
- RK3066/3188 missing register support
- RK3036/3066/3126/3188 alpha support
mediatek:
- MT8167 HDMI support
- MT8183 DPI dual edge support
tegra:
- fixed YUV support/scaling on Tegra186+
ast:
- use pcim_iomap
- fix DP501 EDID
bochs:
- screen blanking support
etnaviv:
- export more GPU ID values to userspace
- add HWDB entry for GPU on i.MX8MP
- rework linear window calcs
exynos:
- pm runtime changes
imx:
- Annotate dma_fence critical section
- fix PRG modifiers after drmm conversion
- Add 8 pixel alignment fix for 1366x768
- fix YUV advertising
- add color properties
ingenic:
- IPU planes fix
panfrost:
- Mediatek MT8183 support + DT bindings
- export AFBC_FEATURES register to userspace
simpledrm:
- %pr for printing resources
nouveau:
- pin/unpin TTM fixes
qxl:
- unpin shadow BO
virtio:
- create dumb BOs as guest blob
vkms:
- drmm_universal_plane_alloc
- add XRGB plane composition
- overlay support"
* tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm: (1570 commits)
drm/i915: Reinstate the mmap ioctl for some platforms
drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary crtc
Revert "drm/msm/mdp5: provide dynamic bandwidth management"
drm/msm/mdp5: provide dynamic bandwidth management
drm/msm/mdp5: add perf blocks for holding fudge factors
drm/msm/mdp5: switch to standard zpos property
drm/msm/mdp5: add support for alpha/blend_mode properties
drm/msm/mdp5: use drm_plane_state for pixel blend mode
drm/msm/mdp5: use drm_plane_state for storing alpha value
drm/msm/mdp5: use drm atomic helpers to handle base drm plane state
drm/msm/dsi: do not enable PHYs when called for the slave DSI interface
drm/msm: Add debugfs to trigger shrinker
drm/msm/dpu: Avoid ABBA deadlock between IRQ modules
drm/msm: devcoredump iommu fault support
iommu/arm-smmu-qcom: Add stall support
drm/msm: Improve the a6xx page fault handler
iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info
iommu/arm-smmu: Add support for driver IOMMU fault handlers
drm/msm: export hangcheck_period in debugfs
drm/msm/a6xx: add support for Adreno 660 GPU
...
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 545 |
1 files changed, 317 insertions, 228 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 9acee4a5b2ba..79cfa2d68487 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -25,18 +25,22 @@ * Alex Deucher * Jerome Glisse */ + #include <linux/dma-fence-array.h> #include <linux/interval_tree_generic.h> #include <linux/idr.h> #include <linux/dma-buf.h> #include <drm/amdgpu_drm.h> +#include <drm/drm_drv.h> #include "amdgpu.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" #include "amdgpu_gmc.h" #include "amdgpu_xgmi.h" #include "amdgpu_dma_buf.h" +#include "amdgpu_res_cursor.h" +#include "kfd_svm.h" /** * DOC: GPUVM @@ -328,7 +332,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base, base->next = bo->vm_bo; bo->vm_bo = base; - if (bo->tbo.base.resv != vm->root.base.bo->tbo.base.resv) + if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv) return; vm->bulk_moveable = false; @@ -338,7 +342,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base, amdgpu_vm_bo_idle(base); if (bo->preferred_domains & - amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type)) + amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type)) return; /* @@ -357,14 +361,14 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base, * Helper to get the parent entry for the child page table. NULL if we are at * the root page directory. */ -static struct amdgpu_vm_pt *amdgpu_vm_pt_parent(struct amdgpu_vm_pt *pt) +static struct amdgpu_vm_bo_base *amdgpu_vm_pt_parent(struct amdgpu_vm_bo_base *pt) { - struct amdgpu_bo *parent = pt->base.bo->parent; + struct amdgpu_bo *parent = pt->bo->parent; if (!parent) return NULL; - return container_of(parent->vm_bo, struct amdgpu_vm_pt, base); + return parent->vm_bo; } /* @@ -372,8 +376,8 @@ static struct amdgpu_vm_pt *amdgpu_vm_pt_parent(struct amdgpu_vm_pt *pt) */ struct amdgpu_vm_pt_cursor { uint64_t pfn; - struct amdgpu_vm_pt *parent; - struct amdgpu_vm_pt *entry; + struct amdgpu_vm_bo_base *parent; + struct amdgpu_vm_bo_base *entry; unsigned level; }; @@ -412,17 +416,17 @@ static bool amdgpu_vm_pt_descendant(struct amdgpu_device *adev, { unsigned mask, shift, idx; - if (!cursor->entry->entries) + if ((cursor->level == AMDGPU_VM_PTB) || !cursor->entry || + !cursor->entry->bo) return false; - BUG_ON(!cursor->entry->base.bo); mask = amdgpu_vm_entries_mask(adev, cursor->level); shift = amdgpu_vm_level_shift(adev, cursor->level); ++cursor->level; idx = (cursor->pfn >> shift) & mask; cursor->parent = cursor->entry; - cursor->entry = &cursor->entry->entries[idx]; + cursor->entry = &to_amdgpu_bo_vm(cursor->entry->bo)->entries[idx]; return true; } @@ -449,7 +453,7 @@ static bool amdgpu_vm_pt_sibling(struct amdgpu_device *adev, shift = amdgpu_vm_level_shift(adev, cursor->level - 1); num_entries = amdgpu_vm_num_entries(adev, cursor->level - 1); - if (cursor->entry == &cursor->parent->entries[num_entries - 1]) + if (cursor->entry == &to_amdgpu_bo_vm(cursor->parent->bo)->entries[num_entries - 1]) return false; cursor->pfn += 1ULL << shift; @@ -535,7 +539,7 @@ static void amdgpu_vm_pt_first_dfs(struct amdgpu_device *adev, * True when the search should continue, false otherwise. */ static bool amdgpu_vm_pt_continue_dfs(struct amdgpu_vm_pt_cursor *start, - struct amdgpu_vm_pt *entry) + struct amdgpu_vm_bo_base *entry) { return entry && (!start || entry != start->entry); } @@ -586,7 +590,7 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm, struct amdgpu_bo_list_entry *entry) { entry->priority = 0; - entry->tv.bo = &vm->root.base.bo->tbo; + entry->tv.bo = &vm->root.bo->tbo; /* Two for VM updates, one for TTM and one for the CS job */ entry->tv.num_shared = 4; entry->user_pages = NULL; @@ -618,7 +622,7 @@ void amdgpu_vm_del_from_lru_notify(struct ttm_buffer_object *bo) for (bo_base = abo->vm_bo; bo_base; bo_base = bo_base->next) { struct amdgpu_vm *vm = bo_base->vm; - if (abo->tbo.base.resv == vm->root.base.bo->tbo.base.resv) + if (abo->tbo.base.resv == vm->root.bo->tbo.base.resv) vm->bulk_moveable = false; } @@ -649,15 +653,16 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev, spin_lock(&adev->mman.bdev.lru_lock); list_for_each_entry(bo_base, &vm->idle, vm_status) { struct amdgpu_bo *bo = bo_base->bo; + struct amdgpu_bo *shadow = amdgpu_bo_shadowed(bo); if (!bo->parent) continue; - ttm_bo_move_to_lru_tail(&bo->tbo, &bo->tbo.mem, + ttm_bo_move_to_lru_tail(&bo->tbo, bo->tbo.resource, &vm->lru_bulk_move); - if (bo->shadow) - ttm_bo_move_to_lru_tail(&bo->shadow->tbo, - &bo->shadow->tbo.mem, + if (shadow) + ttm_bo_move_to_lru_tail(&shadow->tbo, + shadow->tbo.resource, &vm->lru_bulk_move); } spin_unlock(&adev->mman.bdev.lru_lock); @@ -689,15 +694,21 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm, list_for_each_entry_safe(bo_base, tmp, &vm->evicted, vm_status) { struct amdgpu_bo *bo = bo_base->bo; + struct amdgpu_bo *shadow = amdgpu_bo_shadowed(bo); r = validate(param, bo); if (r) return r; + if (shadow) { + r = validate(param, shadow); + if (r) + return r; + } if (bo->tbo.type != ttm_bo_type_kernel) { amdgpu_vm_bo_moved(bo_base); } else { - vm->update_funcs->map_table(bo); + vm->update_funcs->map_table(to_amdgpu_bo_vm(bo)); amdgpu_vm_bo_relocated(bo_base); } } @@ -729,7 +740,7 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm) * * @adev: amdgpu_device pointer * @vm: VM to clear BO from - * @bo: BO to clear + * @vmbo: BO to clear * @immediate: use an immediate update * * Root PD needs to be reserved when calling this. @@ -739,13 +750,14 @@ bool amdgpu_vm_ready(struct amdgpu_vm *vm) */ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, struct amdgpu_vm *vm, - struct amdgpu_bo *bo, + struct amdgpu_bo_vm *vmbo, bool immediate) { struct ttm_operation_ctx ctx = { true, false }; unsigned level = adev->vm_manager.root_level; struct amdgpu_vm_update_params params; - struct amdgpu_bo *ancestor = bo; + struct amdgpu_bo *ancestor = &vmbo->bo; + struct amdgpu_bo *bo = &vmbo->bo; unsigned entries, ats_entries; uint64_t addr; int r; @@ -769,11 +781,11 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, entries -= ats_entries; } else { - struct amdgpu_vm_pt *pt; + struct amdgpu_vm_bo_base *pt; - pt = container_of(ancestor->vm_bo, struct amdgpu_vm_pt, base); + pt = ancestor->vm_bo; ats_entries = amdgpu_vm_num_ats_entries(adev); - if ((pt - vm->root.entries) >= ats_entries) { + if ((pt - to_amdgpu_bo_vm(vm->root.bo)->entries) >= ats_entries) { ats_entries = 0; } else { ats_entries = entries; @@ -785,14 +797,15 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, if (r) return r; - if (bo->shadow) { - r = ttm_bo_validate(&bo->shadow->tbo, &bo->shadow->placement, - &ctx); + if (vmbo->shadow) { + struct amdgpu_bo *shadow = vmbo->shadow; + + r = ttm_bo_validate(&shadow->tbo, &shadow->placement, &ctx); if (r) return r; } - r = vm->update_funcs->map_table(bo); + r = vm->update_funcs->map_table(vmbo); if (r) return r; @@ -816,7 +829,7 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, amdgpu_gmc_get_vm_pde(adev, level, &value, &flags); } - r = vm->update_funcs->update(¶ms, bo, addr, 0, ats_entries, + r = vm->update_funcs->update(¶ms, vmbo, addr, 0, ats_entries, value, flags); if (r) return r; @@ -839,7 +852,7 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, } } - r = vm->update_funcs->update(¶ms, bo, addr, 0, entries, + r = vm->update_funcs->update(¶ms, vmbo, addr, 0, entries, value, flags); if (r) return r; @@ -849,35 +862,85 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev, } /** - * amdgpu_vm_bo_param - fill in parameters for PD/PT allocation + * amdgpu_vm_pt_create - create bo for PD/PT * * @adev: amdgpu_device pointer * @vm: requesting vm * @level: the page table level * @immediate: use a immediate update - * @bp: resulting BO allocation parameters + * @vmbo: pointer to the buffer object pointer */ -static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm *vm, +static int amdgpu_vm_pt_create(struct amdgpu_device *adev, + struct amdgpu_vm *vm, int level, bool immediate, - struct amdgpu_bo_param *bp) + struct amdgpu_bo_vm **vmbo) { - memset(bp, 0, sizeof(*bp)); + struct amdgpu_bo_param bp; + struct amdgpu_bo *bo; + struct dma_resv *resv; + unsigned int num_entries; + int r; - bp->size = amdgpu_vm_bo_size(adev, level); - bp->byte_align = AMDGPU_GPU_PAGE_SIZE; - bp->domain = AMDGPU_GEM_DOMAIN_VRAM; - bp->domain = amdgpu_bo_get_preferred_pin_domain(adev, bp->domain); - bp->flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | + memset(&bp, 0, sizeof(bp)); + + bp.size = amdgpu_vm_bo_size(adev, level); + bp.byte_align = AMDGPU_GPU_PAGE_SIZE; + bp.domain = AMDGPU_GEM_DOMAIN_VRAM; + bp.domain = amdgpu_bo_get_preferred_pin_domain(adev, bp.domain); + bp.flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS | AMDGPU_GEM_CREATE_CPU_GTT_USWC; - bp->bo_ptr_size = sizeof(struct amdgpu_bo); + + if (level < AMDGPU_VM_PTB) + num_entries = amdgpu_vm_num_entries(adev, level); + else + num_entries = 0; + + bp.bo_ptr_size = struct_size((*vmbo), entries, num_entries); + if (vm->use_cpu_for_update) - bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; - else if (!vm->root.base.bo || vm->root.base.bo->shadow) - bp->flags |= AMDGPU_GEM_CREATE_SHADOW; - bp->type = ttm_bo_type_kernel; - bp->no_wait_gpu = immediate; - if (vm->root.base.bo) - bp->resv = vm->root.base.bo->tbo.base.resv; + bp.flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; + + bp.type = ttm_bo_type_kernel; + bp.no_wait_gpu = immediate; + if (vm->root.bo) + bp.resv = vm->root.bo->tbo.base.resv; + + r = amdgpu_bo_create_vm(adev, &bp, vmbo); + if (r) + return r; + + bo = &(*vmbo)->bo; + if (vm->is_compute_context || (adev->flags & AMD_IS_APU)) { + (*vmbo)->shadow = NULL; + return 0; + } + + if (!bp.resv) + WARN_ON(dma_resv_lock(bo->tbo.base.resv, + NULL)); + resv = bp.resv; + memset(&bp, 0, sizeof(bp)); + bp.size = amdgpu_vm_bo_size(adev, level); + bp.domain = AMDGPU_GEM_DOMAIN_GTT; + bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC; + bp.type = ttm_bo_type_kernel; + bp.resv = bo->tbo.base.resv; + bp.bo_ptr_size = sizeof(struct amdgpu_bo); + + r = amdgpu_bo_create(adev, &bp, &(*vmbo)->shadow); + + if (!resv) + dma_resv_unlock(bo->tbo.base.resv); + + if (r) { + amdgpu_bo_unref(&bo); + return r; + } + + (*vmbo)->shadow->parent = amdgpu_bo_ref(bo); + amdgpu_bo_add_to_shadow_list(*vmbo); + + return 0; } /** @@ -899,37 +962,24 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev, struct amdgpu_vm_pt_cursor *cursor, bool immediate) { - struct amdgpu_vm_pt *entry = cursor->entry; - struct amdgpu_bo_param bp; - struct amdgpu_bo *pt; + struct amdgpu_vm_bo_base *entry = cursor->entry; + struct amdgpu_bo *pt_bo; + struct amdgpu_bo_vm *pt; int r; - if (cursor->level < AMDGPU_VM_PTB && !entry->entries) { - unsigned num_entries; - - num_entries = amdgpu_vm_num_entries(adev, cursor->level); - entry->entries = kvmalloc_array(num_entries, - sizeof(*entry->entries), - GFP_KERNEL | __GFP_ZERO); - if (!entry->entries) - return -ENOMEM; - } - - if (entry->base.bo) + if (entry->bo) return 0; - amdgpu_vm_bo_param(adev, vm, cursor->level, immediate, &bp); - - r = amdgpu_bo_create(adev, &bp, &pt); + r = amdgpu_vm_pt_create(adev, vm, cursor->level, immediate, &pt); if (r) return r; /* Keep a reference to the root directory to avoid * freeing them up in the wrong order. */ - pt->parent = amdgpu_bo_ref(cursor->parent->base.bo); - amdgpu_vm_bo_base_init(&entry->base, vm, pt); - + pt_bo = &pt->bo; + pt_bo->parent = amdgpu_bo_ref(cursor->parent->bo); + amdgpu_vm_bo_base_init(entry, vm, pt_bo); r = amdgpu_vm_clear_bo(adev, vm, pt, immediate); if (r) goto error_free_pt; @@ -938,7 +988,7 @@ static int amdgpu_vm_alloc_pts(struct amdgpu_device *adev, error_free_pt: amdgpu_bo_unref(&pt->shadow); - amdgpu_bo_unref(&pt); + amdgpu_bo_unref(&pt_bo); return r; } @@ -947,16 +997,17 @@ error_free_pt: * * @entry: PDE to free */ -static void amdgpu_vm_free_table(struct amdgpu_vm_pt *entry) +static void amdgpu_vm_free_table(struct amdgpu_vm_bo_base *entry) { - if (entry->base.bo) { - entry->base.bo->vm_bo = NULL; - list_del(&entry->base.vm_status); - amdgpu_bo_unref(&entry->base.bo->shadow); - amdgpu_bo_unref(&entry->base.bo); - } - kvfree(entry->entries); - entry->entries = NULL; + struct amdgpu_bo *shadow; + + if (!entry->bo) + return; + shadow = amdgpu_bo_shadowed(entry->bo); + entry->bo->vm_bo = NULL; + list_del(&entry->vm_status); + amdgpu_bo_unref(&shadow); + amdgpu_bo_unref(&entry->bo); } /** @@ -973,7 +1024,7 @@ static void amdgpu_vm_free_pts(struct amdgpu_device *adev, struct amdgpu_vm_pt_cursor *start) { struct amdgpu_vm_pt_cursor cursor; - struct amdgpu_vm_pt *entry; + struct amdgpu_vm_bo_base *entry; vm->bulk_moveable = false; @@ -1241,10 +1292,10 @@ uint64_t amdgpu_vm_map_gart(const dma_addr_t *pages_addr, uint64_t addr) */ static int amdgpu_vm_update_pde(struct amdgpu_vm_update_params *params, struct amdgpu_vm *vm, - struct amdgpu_vm_pt *entry) + struct amdgpu_vm_bo_base *entry) { - struct amdgpu_vm_pt *parent = amdgpu_vm_pt_parent(entry); - struct amdgpu_bo *bo = parent->base.bo, *pbo; + struct amdgpu_vm_bo_base *parent = amdgpu_vm_pt_parent(entry); + struct amdgpu_bo *bo = parent->bo, *pbo; uint64_t pde, pt, flags; unsigned level; @@ -1252,9 +1303,10 @@ static int amdgpu_vm_update_pde(struct amdgpu_vm_update_params *params, pbo = pbo->parent; level += params->adev->vm_manager.root_level; - amdgpu_gmc_get_pde_for_bo(entry->base.bo, level, &pt, &flags); - pde = (entry - parent->entries) * 8; - return vm->update_funcs->update(params, bo, pde, pt, 1, 0, flags); + amdgpu_gmc_get_pde_for_bo(entry->bo, level, &pt, &flags); + pde = (entry - to_amdgpu_bo_vm(parent->bo)->entries) * 8; + return vm->update_funcs->update(params, to_amdgpu_bo_vm(bo), pde, pt, + 1, 0, flags); } /** @@ -1269,11 +1321,11 @@ static void amdgpu_vm_invalidate_pds(struct amdgpu_device *adev, struct amdgpu_vm *vm) { struct amdgpu_vm_pt_cursor cursor; - struct amdgpu_vm_pt *entry; + struct amdgpu_vm_bo_base *entry; for_each_amdgpu_vm_pt_dfs_safe(adev, vm, NULL, cursor, entry) - if (entry->base.bo && !entry->base.moved) - amdgpu_vm_bo_relocated(&entry->base); + if (entry->bo && !entry->moved) + amdgpu_vm_bo_relocated(entry); } /** @@ -1307,11 +1359,12 @@ int amdgpu_vm_update_pdes(struct amdgpu_device *adev, return r; while (!list_empty(&vm->relocated)) { - struct amdgpu_vm_pt *entry; + struct amdgpu_vm_bo_base *entry; - entry = list_first_entry(&vm->relocated, struct amdgpu_vm_pt, - base.vm_status); - amdgpu_vm_bo_idle(&entry->base); + entry = list_first_entry(&vm->relocated, + struct amdgpu_vm_bo_base, + vm_status); + amdgpu_vm_bo_idle(entry); r = amdgpu_vm_update_pde(¶ms, vm, entry); if (r) @@ -1334,9 +1387,9 @@ error: * Make sure to set the right flags for the PTEs at the desired level. */ static void amdgpu_vm_update_flags(struct amdgpu_vm_update_params *params, - struct amdgpu_bo *bo, unsigned level, + struct amdgpu_bo_vm *pt, unsigned int level, uint64_t pe, uint64_t addr, - unsigned count, uint32_t incr, + unsigned int count, uint32_t incr, uint64_t flags) { @@ -1352,7 +1405,7 @@ static void amdgpu_vm_update_flags(struct amdgpu_vm_update_params *params, flags |= AMDGPU_PTE_EXECUTABLE; } - params->vm->update_funcs->update(params, bo, pe, addr, count, incr, + params->vm->update_funcs->update(params, pt, pe, addr, count, incr, flags); } @@ -1491,7 +1544,7 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params, continue; } - pt = cursor.entry->base.bo; + pt = cursor.entry->bo; if (!pt) { /* We need all PDs and PTs for mapping something, */ if (flags & AMDGPU_PTE_VALID) @@ -1503,7 +1556,7 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params, if (!amdgpu_vm_pt_ancestor(&cursor)) return -EINVAL; - pt = cursor.entry->base.bo; + pt = cursor.entry->bo; shift = parent_shift; frag_end = max(frag_end, ALIGN(frag_start + 1, 1ULL << shift)); @@ -1532,9 +1585,9 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params, nptes, dst, incr, upd_flags, vm->task_info.pid, vm->immediate.fence_context); - amdgpu_vm_update_flags(params, pt, cursor.level, - pe_start, dst, nptes, incr, - upd_flags); + amdgpu_vm_update_flags(params, to_amdgpu_bo_vm(pt), + cursor.level, pe_start, dst, + nptes, incr, upd_flags); pe_start += nptes * 8; dst += nptes * incr; @@ -1557,7 +1610,11 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params, * completely covered by the range and so potentially still in use. */ while (cursor.pfn < frag_start) { - amdgpu_vm_free_pts(adev, params->vm, &cursor); + /* Make sure previous mapping is freed */ + if (cursor.entry->bo) { + params->table_freed = true; + amdgpu_vm_free_pts(adev, params->vm, &cursor); + } amdgpu_vm_pt_next(adev, &cursor); } @@ -1583,29 +1640,34 @@ static int amdgpu_vm_update_ptes(struct amdgpu_vm_update_params *params, * @last: last mapped entry * @flags: flags for the entries * @offset: offset into nodes and pages_addr - * @nodes: array of drm_mm_nodes with the MC addresses + * @res: ttm_resource to map * @pages_addr: DMA addresses to use for mapping * @fence: optional resulting fence + * @table_freed: return true if page table is freed * * Fill in the page table entries between @start and @last. * * Returns: * 0 for success, -EINVAL for failure. */ -static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, - struct amdgpu_device *bo_adev, - struct amdgpu_vm *vm, bool immediate, - bool unlocked, struct dma_resv *resv, - uint64_t start, uint64_t last, - uint64_t flags, uint64_t offset, - struct drm_mm_node *nodes, - dma_addr_t *pages_addr, - struct dma_fence **fence) +int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, + struct amdgpu_device *bo_adev, + struct amdgpu_vm *vm, bool immediate, + bool unlocked, struct dma_resv *resv, + uint64_t start, uint64_t last, + uint64_t flags, uint64_t offset, + struct ttm_resource *res, + dma_addr_t *pages_addr, + struct dma_fence **fence, + bool *table_freed) { struct amdgpu_vm_update_params params; + struct amdgpu_res_cursor cursor; enum amdgpu_sync_mode sync_mode; - uint64_t pfn; - int r; + int r, idx; + + if (!drm_dev_enter(&adev->ddev, &idx)) + return -ENODEV; memset(¶ms, 0, sizeof(params)); params.adev = adev; @@ -1622,14 +1684,6 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, else sync_mode = AMDGPU_SYNC_EXPLICIT; - pfn = offset >> PAGE_SHIFT; - if (nodes) { - while (pfn >= nodes->size) { - pfn -= nodes->size; - ++nodes; - } - } - amdgpu_vm_eviction_lock(vm); if (vm->evicting) { r = -EBUSY; @@ -1639,7 +1693,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, if (!unlocked && !dma_fence_is_signaled(vm->last_unlocked)) { struct dma_fence *tmp = dma_fence_get_stub(); - amdgpu_bo_fence(vm->root.base.bo, vm->last_unlocked, true); + amdgpu_bo_fence(vm->root.bo, vm->last_unlocked, true); swap(vm->last_unlocked, tmp); dma_fence_put(tmp); } @@ -1648,23 +1702,17 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, if (r) goto error_unlock; - do { + amdgpu_res_first(pages_addr ? NULL : res, offset, + (last - start + 1) * AMDGPU_GPU_PAGE_SIZE, &cursor); + while (cursor.remaining) { uint64_t tmp, num_entries, addr; - - num_entries = last - start + 1; - if (nodes) { - addr = nodes->start << PAGE_SHIFT; - num_entries = min((nodes->size - pfn) * - AMDGPU_GPU_PAGES_IN_CPU_PAGE, num_entries); - } else { - addr = 0; - } - + num_entries = cursor.size >> AMDGPU_GPU_PAGE_SHIFT; if (pages_addr) { bool contiguous = true; if (num_entries > AMDGPU_GPU_PAGES_IN_CPU_PAGE) { + uint64_t pfn = cursor.start >> PAGE_SHIFT; uint64_t count; contiguous = pages_addr[pfn + 1] == @@ -1684,16 +1732,18 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, } if (!contiguous) { - addr = pfn << PAGE_SHIFT; + addr = cursor.start; params.pages_addr = pages_addr; } else { - addr = pages_addr[pfn]; + addr = pages_addr[cursor.start >> PAGE_SHIFT]; params.pages_addr = NULL; } } else if (flags & (AMDGPU_PTE_VALID | AMDGPU_PTE_PRT)) { - addr += bo_adev->vm_manager.vram_base_offset; - addr += pfn << PAGE_SHIFT; + addr = bo_adev->vm_manager.vram_base_offset + + cursor.start; + } else { + addr = 0; } tmp = start + num_entries; @@ -1701,28 +1751,72 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, if (r) goto error_unlock; - pfn += num_entries / AMDGPU_GPU_PAGES_IN_CPU_PAGE; - if (nodes && nodes->size == pfn) { - pfn = 0; - ++nodes; - } + amdgpu_res_next(&cursor, num_entries * AMDGPU_GPU_PAGE_SIZE); start = tmp; - - } while (unlikely(start != last + 1)); + } r = vm->update_funcs->commit(¶ms, fence); + if (table_freed) + *table_freed = *table_freed || params.table_freed; + error_unlock: amdgpu_vm_eviction_unlock(vm); + drm_dev_exit(idx); return r; } +void amdgpu_vm_get_memory(struct amdgpu_vm *vm, uint64_t *vram_mem, + uint64_t *gtt_mem, uint64_t *cpu_mem) +{ + struct amdgpu_bo_va *bo_va, *tmp; + + list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) { + if (!bo_va->base.bo) + continue; + amdgpu_bo_get_memory(bo_va->base.bo, vram_mem, + gtt_mem, cpu_mem); + } + list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status) { + if (!bo_va->base.bo) + continue; + amdgpu_bo_get_memory(bo_va->base.bo, vram_mem, + gtt_mem, cpu_mem); + } + list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) { + if (!bo_va->base.bo) + continue; + amdgpu_bo_get_memory(bo_va->base.bo, vram_mem, + gtt_mem, cpu_mem); + } + list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) { + if (!bo_va->base.bo) + continue; + amdgpu_bo_get_memory(bo_va->base.bo, vram_mem, + gtt_mem, cpu_mem); + } + spin_lock(&vm->invalidated_lock); + list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status) { + if (!bo_va->base.bo) + continue; + amdgpu_bo_get_memory(bo_va->base.bo, vram_mem, + gtt_mem, cpu_mem); + } + list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) { + if (!bo_va->base.bo) + continue; + amdgpu_bo_get_memory(bo_va->base.bo, vram_mem, + gtt_mem, cpu_mem); + } + spin_unlock(&vm->invalidated_lock); +} /** * amdgpu_vm_bo_update - update all BO mappings in the vm page table * * @adev: amdgpu_device pointer * @bo_va: requested BO and VM object * @clear: if true clear the entries + * @table_freed: return true if page table is freed * * Fill in the page table entries for @bo_va. * @@ -1730,14 +1824,13 @@ error_unlock: * 0 for success, -EINVAL for failure. */ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, - bool clear) + bool clear, bool *table_freed) { struct amdgpu_bo *bo = bo_va->base.bo; struct amdgpu_vm *vm = bo_va->base.vm; struct amdgpu_bo_va_mapping *mapping; dma_addr_t *pages_addr = NULL; struct ttm_resource *mem; - struct drm_mm_node *nodes; struct dma_fence **last_update; struct dma_resv *resv; uint64_t flags; @@ -1746,8 +1839,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, if (clear || !bo) { mem = NULL; - nodes = NULL; - resv = vm->root.base.bo->tbo.base.resv; + resv = vm->root.bo->tbo.base.resv; } else { struct drm_gem_object *obj = &bo->tbo.base; @@ -1757,12 +1849,12 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, struct drm_gem_object *gobj = dma_buf->priv; struct amdgpu_bo *abo = gem_to_amdgpu_bo(gobj); - if (abo->tbo.mem.mem_type == TTM_PL_VRAM) + if (abo->tbo.resource->mem_type == TTM_PL_VRAM) bo = gem_to_amdgpu_bo(gobj); } - mem = &bo->tbo.mem; - nodes = mem->mm_node; - if (mem->mem_type == TTM_PL_TT) + mem = bo->tbo.resource; + if (mem->mem_type == TTM_PL_TT || + mem->mem_type == AMDGPU_PL_PREEMPT) pages_addr = bo->tbo.ttm->dma_address; } @@ -1778,7 +1870,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, } if (clear || (bo && bo->tbo.base.resv == - vm->root.base.bo->tbo.base.resv)) + vm->root.bo->tbo.base.resv)) last_update = &vm->last_update; else last_update = &bo_va->last_pt_update; @@ -1810,8 +1902,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, r = amdgpu_vm_bo_update_mapping(adev, bo_adev, vm, false, false, resv, mapping->start, mapping->last, update_flags, - mapping->offset, nodes, - pages_addr, last_update); + mapping->offset, mem, + pages_addr, last_update, table_freed); if (r) return r; } @@ -1820,8 +1912,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, * the evicted list so that it gets validated again on the * next command submission. */ - if (bo && bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv) { - uint32_t mem_type = bo->tbo.mem.mem_type; + if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) { + uint32_t mem_type = bo->tbo.resource->mem_type; if (!(bo->preferred_domains & amdgpu_mem_type_to_domain(mem_type))) @@ -1957,18 +2049,17 @@ static void amdgpu_vm_free_mapping(struct amdgpu_device *adev, */ static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm) { - struct dma_resv *resv = vm->root.base.bo->tbo.base.resv; + struct dma_resv *resv = vm->root.bo->tbo.base.resv; struct dma_fence *excl, **shared; unsigned i, shared_count; int r; - r = dma_resv_get_fences_rcu(resv, &excl, - &shared_count, &shared); + r = dma_resv_get_fences(resv, &excl, &shared_count, &shared); if (r) { /* Not enough memory to grab the fence list, as last resort * block for all the fences to complete. */ - dma_resv_wait_timeout_rcu(resv, true, false, + dma_resv_wait_timeout(resv, true, false, MAX_SCHEDULE_TIMEOUT); return; } @@ -2004,7 +2095,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct dma_fence **fence) { - struct dma_resv *resv = vm->root.base.bo->tbo.base.resv; + struct dma_resv *resv = vm->root.bo->tbo.base.resv; struct amdgpu_bo_va_mapping *mapping; uint64_t init_pte_value = 0; struct dma_fence *f = NULL; @@ -2022,7 +2113,7 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev, r = amdgpu_vm_bo_update_mapping(adev, adev, vm, false, false, resv, mapping->start, mapping->last, init_pte_value, - 0, NULL, NULL, &f); + 0, NULL, NULL, &f, NULL); amdgpu_vm_free_mapping(adev, vm, mapping, f); if (r) { dma_fence_put(f); @@ -2064,7 +2155,7 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev, list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) { /* Per VM BOs never need to bo cleared in the page tables */ - r = amdgpu_vm_bo_update(adev, bo_va, false); + r = amdgpu_vm_bo_update(adev, bo_va, false, NULL); if (r) return r; } @@ -2083,7 +2174,7 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev, else clear = true; - r = amdgpu_vm_bo_update(adev, bo_va, clear); + r = amdgpu_vm_bo_update(adev, bo_va, clear, NULL); if (r) return r; @@ -2163,7 +2254,7 @@ static void amdgpu_vm_bo_insert_map(struct amdgpu_device *adev, if (mapping->flags & AMDGPU_PTE_PRT) amdgpu_vm_prt_get(adev); - if (bo && bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv && + if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv && !bo_va->base.moved) { list_move(&bo_va->base.vm_status, &vm->moved); } @@ -2525,7 +2616,7 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev, struct amdgpu_vm_bo_base **base; if (bo) { - if (bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv) + if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv) vm->bulk_moveable = false; for (base = &bo_va->base.bo->vm_bo; *base; @@ -2580,7 +2671,7 @@ bool amdgpu_vm_evictable(struct amdgpu_bo *bo) return true; /* Don't evict VM page tables while they are busy */ - if (!dma_resv_test_signaled_rcu(bo->tbo.base.resv, true)) + if (!dma_resv_test_signaled(bo->tbo.base.resv, true)) return false; /* Try to block ongoing updates */ @@ -2613,13 +2704,13 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev, struct amdgpu_vm_bo_base *bo_base; /* shadow bo doesn't have bo base, its validation needs its parent */ - if (bo->parent && bo->parent->shadow == bo) + if (bo->parent && (amdgpu_bo_shadowed(bo->parent) == bo)) bo = bo->parent; for (bo_base = bo->vm_bo; bo_base; bo_base = bo_base->next) { struct amdgpu_vm *vm = bo_base->vm; - if (evicted && bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv) { + if (evicted && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) { amdgpu_vm_bo_evicted(bo_base); continue; } @@ -2630,7 +2721,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev, if (bo->tbo.type == ttm_bo_type_kernel) amdgpu_vm_bo_relocated(bo_base); - else if (bo->tbo.base.resv == vm->root.base.bo->tbo.base.resv) + else if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv) amdgpu_vm_bo_moved(bo_base); else amdgpu_vm_bo_invalidated(bo_base); @@ -2760,8 +2851,8 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint32_t min_vm_size, */ long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout) { - timeout = dma_resv_wait_timeout_rcu(vm->root.base.bo->tbo.base.resv, - true, true, timeout); + timeout = dma_resv_wait_timeout(vm->root.bo->tbo.base.resv, true, + true, timeout); if (timeout <= 0) return timeout; @@ -2773,7 +2864,6 @@ long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout) * * @adev: amdgpu_device pointer * @vm: requested vm - * @vm_context: Indicates if it GFX or Compute context * @pasid: Process address space identifier * * Init @vm fields. @@ -2781,11 +2871,10 @@ long amdgpu_vm_wait_idle(struct amdgpu_vm *vm, long timeout) * Returns: * 0 for success, error for failure. */ -int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, - int vm_context, u32 pasid) +int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, u32 pasid) { - struct amdgpu_bo_param bp; - struct amdgpu_bo *root; + struct amdgpu_bo *root_bo; + struct amdgpu_bo_vm *root; int r, i; vm->va = RB_ROOT_CACHED; @@ -2816,16 +2905,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, vm->pte_support_ats = false; vm->is_compute_context = false; - if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) { - vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode & - AMDGPU_VM_USE_CPU_FOR_COMPUTE); + vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode & + AMDGPU_VM_USE_CPU_FOR_GFX); - if (adev->asic_type == CHIP_RAVEN) - vm->pte_support_ats = true; - } else { - vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode & - AMDGPU_VM_USE_CPU_FOR_GFX); - } DRM_DEBUG_DRIVER("VM update mode is %s\n", vm->use_cpu_for_update ? "CPU" : "SDMA"); WARN_ONCE((vm->use_cpu_for_update && @@ -2842,28 +2924,26 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, mutex_init(&vm->eviction_lock); vm->evicting = false; - amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, false, &bp); - if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) - bp.flags &= ~AMDGPU_GEM_CREATE_SHADOW; - r = amdgpu_bo_create(adev, &bp, &root); + r = amdgpu_vm_pt_create(adev, vm, adev->vm_manager.root_level, + false, &root); if (r) goto error_free_delayed; - - r = amdgpu_bo_reserve(root, true); + root_bo = &root->bo; + r = amdgpu_bo_reserve(root_bo, true); if (r) goto error_free_root; - r = dma_resv_reserve_shared(root->tbo.base.resv, 1); + r = dma_resv_reserve_shared(root_bo->tbo.base.resv, 1); if (r) goto error_unreserve; - amdgpu_vm_bo_base_init(&vm->root.base, vm, root); + amdgpu_vm_bo_base_init(&vm->root, vm, root_bo); r = amdgpu_vm_clear_bo(adev, vm, root, false); if (r) goto error_unreserve; - amdgpu_bo_unreserve(vm->root.base.bo); + amdgpu_bo_unreserve(vm->root.bo); if (pasid) { unsigned long flags; @@ -2883,12 +2963,12 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm, return 0; error_unreserve: - amdgpu_bo_unreserve(vm->root.base.bo); + amdgpu_bo_unreserve(vm->root.bo); error_free_root: - amdgpu_bo_unref(&vm->root.base.bo->shadow); - amdgpu_bo_unref(&vm->root.base.bo); - vm->root.base.bo = NULL; + amdgpu_bo_unref(&root->shadow); + amdgpu_bo_unref(&root_bo); + vm->root.bo = NULL; error_free_delayed: dma_fence_put(vm->last_unlocked); @@ -2914,17 +2994,14 @@ error_free_immediate: * 0 if this VM is clean */ static int amdgpu_vm_check_clean_reserved(struct amdgpu_device *adev, - struct amdgpu_vm *vm) + struct amdgpu_vm *vm) { enum amdgpu_vm_level root = adev->vm_manager.root_level; unsigned int entries = amdgpu_vm_num_entries(adev, root); unsigned int i = 0; - if (!(vm->root.entries)) - return 0; - for (i = 0; i < entries; i++) { - if (vm->root.entries[i].base.bo) + if (to_amdgpu_bo_vm(vm->root.bo)->entries[i].bo) return -EINVAL; } @@ -2958,7 +3035,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm, bool pte_support_ats = (adev->asic_type == CHIP_RAVEN); int r; - r = amdgpu_bo_reserve(vm->root.base.bo, true); + r = amdgpu_bo_reserve(vm->root.bo, true); if (r) return r; @@ -2985,7 +3062,9 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm, */ if (pte_support_ats != vm->pte_support_ats) { vm->pte_support_ats = pte_support_ats; - r = amdgpu_vm_clear_bo(adev, vm, vm->root.base.bo, false); + r = amdgpu_vm_clear_bo(adev, vm, + to_amdgpu_bo_vm(vm->root.bo), + false); if (r) goto free_idr; } @@ -3001,7 +3080,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm, if (vm->use_cpu_for_update) { /* Sync with last SDMA update/clear before switching to CPU */ - r = amdgpu_bo_sync_wait(vm->root.base.bo, + r = amdgpu_bo_sync_wait(vm->root.bo, AMDGPU_FENCE_OWNER_UNDEFINED, true); if (r) goto free_idr; @@ -3029,7 +3108,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm, } /* Free the shadow bo for compute VM */ - amdgpu_bo_unref(&vm->root.base.bo->shadow); + amdgpu_bo_unref(&to_amdgpu_bo_vm(vm->root.bo)->shadow); if (pasid) vm->pasid = pasid; @@ -3045,7 +3124,7 @@ free_idr: spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags); } unreserve_bo: - amdgpu_bo_unreserve(vm->root.base.bo); + amdgpu_bo_unreserve(vm->root.bo); return r; } @@ -3088,7 +3167,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm) amdgpu_amdkfd_gpuvm_destroy_cb(adev, vm); - root = amdgpu_bo_ref(vm->root.base.bo); + root = amdgpu_bo_ref(vm->root.bo); amdgpu_bo_reserve(root, true); if (vm->pasid) { unsigned long flags; @@ -3115,7 +3194,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm) amdgpu_vm_free_pts(adev, vm, NULL); amdgpu_bo_unreserve(root); amdgpu_bo_unref(&root); - WARN_ON(vm->root.base.bo); + WARN_ON(vm->root.bo); drm_sched_entity_destroy(&vm->immediate); drm_sched_entity_destroy(&vm->delayed); @@ -3232,7 +3311,7 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) /* Wait vm idle to make sure the vmid set in SPM_VMID is * not referenced anymore. */ - r = amdgpu_bo_reserve(fpriv->vm.root.base.bo, true); + r = amdgpu_bo_reserve(fpriv->vm.root.bo, true); if (r) return r; @@ -3240,7 +3319,7 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) if (r < 0) return r; - amdgpu_bo_unreserve(fpriv->vm.root.base.bo); + amdgpu_bo_unreserve(fpriv->vm.root.bo); amdgpu_vmid_free_reserved(adev, &fpriv->vm, AMDGPU_GFXHUB_0); break; default: @@ -3304,47 +3383,57 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm) bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid, uint64_t addr) { + bool is_compute_context = false; struct amdgpu_bo *root; + unsigned long irqflags; uint64_t value, flags; struct amdgpu_vm *vm; int r; - spin_lock(&adev->vm_manager.pasid_lock); + spin_lock_irqsave(&adev->vm_manager.pasid_lock, irqflags); vm = idr_find(&adev->vm_manager.pasid_idr, pasid); - if (vm) - root = amdgpu_bo_ref(vm->root.base.bo); - else + if (vm) { + root = amdgpu_bo_ref(vm->root.bo); + is_compute_context = vm->is_compute_context; + } else { root = NULL; - spin_unlock(&adev->vm_manager.pasid_lock); + } + spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, irqflags); if (!root) return false; + addr /= AMDGPU_GPU_PAGE_SIZE; + + if (is_compute_context && + !svm_range_restore_pages(adev, pasid, addr)) { + amdgpu_bo_unref(&root); + return true; + } + r = amdgpu_bo_reserve(root, true); if (r) goto error_unref; /* Double check that the VM still exists */ - spin_lock(&adev->vm_manager.pasid_lock); + spin_lock_irqsave(&adev->vm_manager.pasid_lock, irqflags); vm = idr_find(&adev->vm_manager.pasid_idr, pasid); - if (vm && vm->root.base.bo != root) + if (vm && vm->root.bo != root) vm = NULL; - spin_unlock(&adev->vm_manager.pasid_lock); + spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, irqflags); if (!vm) goto error_unlock; - addr /= AMDGPU_GPU_PAGE_SIZE; flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED | AMDGPU_PTE_SYSTEM; - if (vm->is_compute_context) { + if (is_compute_context) { /* Intentionally setting invalid PTE flag * combination to force a no-retry-fault */ flags = AMDGPU_PTE_EXECUTABLE | AMDGPU_PDE_PTE | AMDGPU_PTE_TF; value = 0; - } else if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) { /* Redirect the access to the dummy page */ value = adev->dummy_page_addr; @@ -3363,7 +3452,7 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, u32 pasid, } r = amdgpu_vm_bo_update_mapping(adev, adev, vm, true, false, NULL, addr, - addr, flags, value, NULL, NULL, + addr, flags, value, NULL, NULL, NULL, NULL); if (r) goto error_unlock; |