summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/intel_ppgtt.c
AgeCommit message (Collapse)Author
2021-09-24drm/i915: Reduce the number of objects subject to memcpy recoverThomas Hellström
We really only need memcpy restore for objects that affect the operability of the migrate context. That is, primarily the page-table objects of the migrate VM. Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early restores using memcpy and a way to assign LMEM page-table object flags to be used by the vms. Restore objects without this flag with the gpu blitter and only objects carrying the flag using TTM memcpy. Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and defer for a later audit which vms actually need it. Most importantly, user- allocated vms with pinned page-table objects can be restored using the blitter. Performance-wise memcpy restore is probably as fast as gpu restore if not faster, but using gpu restore will help tackling future restrictions in mappable LMEM size. v4: - Don't mark the aliasing ppgtt page table flags for early resume, but rather the ggtt page table flags as intended. (Matthew Auld) - The check for user buffer objects during early resume is pointless, since they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld) v5: - Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored before we fire up the migrate context. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
2021-06-05drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VERLucas De Marchi
This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
2021-06-01drm/i915: Don't free shared locks while sharedThomas Hellström
We are currently sharing the VM reservation locks across a number of gem objects with page-table memory. Since TTM will individiualize the reservation locks when freeing objects, including accessing the shared locks, make sure that the shared locks are not freed until that is done. For PPGTT we add an additional refcount, for GGTT we take additional measures to make sure objects sharing the GGTT reservation lock are freed at GGTT takedown Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210601074654.3103-3-thomas.hellstrom@linux.intel.com
2021-04-27drm/i915/gtt: map the PD up frontMatthew Auld
We need to generalise our accessor for the page directories and tables from using the simple kmap_atomic to support local memory, and this setup must be done on acquisition of the backing storage prior to entering fence execution contexts. Here we replace the kmap with the object mapping code that for simple single page shmemfs object will return a plain kmap, that is then kept for the lifetime of the page directory. Note that keeping the mapping around is a potential concern here, since while the vma is pinned the mapping remains there for the PDs underneath, or at least until the used_count reaches zero, at which point we can safely destroy the mapping. For 32b this will be even worse since the address space is more limited, but since this change mostly impacts full ppGTT platforms, the justification is that for modern platforms we shouldn't care too much about 32b. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com
2021-04-08Merge tag 'drm-intel-gt-next-2021-04-06' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Prepare for local/device memory support on DG1 by starting to use it for kernel internal allocations: context, ring and engine scratch (Matt A, CQ, Abdiel, Imre) - Sandybridge fix to avoid hard hang on ring resume (Chris) - Limit imported dma-buf size to int32 (Matt A) - Double check heartbeat timeout before resetting (Chris) - Use new tasklet API for execution list (Emil) - Fix SPDX checkpats warnings (Chris) - Fixes for various checkpatch warnings (Chris) - Selftest improvements (Chris) - Move the defer_request waiter active assertion to correct spot (Chris) - Make local-memory probing a GT operation (Matt, Tvrtko) - Protect against request freeing during cancellation on wedging (Chris) - Retire unexpected starting state error dumping (Chris) - Distinction of memory regions in debugging (Zbigniew) - Always flush the submission queue on checking for idle (Chris) - Consolidate 2big error check to helper (Matt) - Decrease number of subplatform bits (Tvrtko) - Remove unused internal request priority levels (Chris) - Document the unused internal header bits in buddy allocator (Matt) - Cleanup the region class/instance encoding (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YGxksaZGXHnFxlwg@jlahtine-mobl.ger.corp.intel.com
2021-03-24drm/i915/gtt/dg1: add PTE_LM plumbing for ppGTTMatthew Auld
For the PTEs we get an LM bit, to signal whether the page resides in SMEM or LMEM. BSpec: 45040 v2: just use gen8_pte_encode for dg1 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210203171231.551338-2-matthew.auld@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-03-24drm/i915: Use a single page table lock for each gtt.Maarten Lankhorst
We may create page table objects on the fly, but we may need to wait with the ww lock held. Instead of waiting on a freed obj lock, ensure we have the same lock for each object to keep -EDEADLK working. This ensures that i915_vma_pin_ww can lock the page tables when required. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-41-maarten.lankhorst@linux.intel.com
2021-03-11Merge drm/drm-next into drm-intel-nextJani Nikula
Sync up with upstream. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-02-02drm/i915/gt: Remove references to struct drm_device.pdevThomas Zimmermann
Using struct drm_device.pdev is deprecated. Convert i915 to struct drm_device.dev. No functional changes. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-3-tzimmermann@suse.de
2021-01-14drm/i915/gt: Prune inlinesChris Wilson
Remove all the manual inlines from non-critical sections in gt/ add/remove: 2/0 grow/shrink: 0/3 up/down: 762/-1473 (-711) Function old new delta mi_set_context.isra - 602 +602 write_dma_entry - 160 +160 __set_pd_entry 214 69 -145 clear_pd_entry 190 42 -148 ring_request_alloc 2021 841 -1180 Total: Before=1605086, After=1604375, chg -0.04% Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210113152224.29794-1-chris@chris-wilson.co.uk
2020-09-07drm/i915/gt: Shrink i915_page_directory's slab bucketChris Wilson
kmalloc uses power-of-two slab buckets for small allocations (up to a few pages). Since i915_page_directory is a page of pointers, plus a couple more, this is rounded up to 8K, and we waste nearly 50% of that allocation. Long terms this leads to poor memory utilisation, bloating the kernel footprint, but the problem is exacerbated by our conservative preallocation scheme for binding VMA. As we are required to allocate all levels for each vma just in case we need to insert them upon binding, this leads to a large multiplication factor for a single page vma. By halving the allocation we need for the page directory structure, we halve the impact of that factor, bringing workloads that once fitted into memory, hopefully back to fitting into memory. We maintain the split between i915_page_directory and i915_page_table as we only need half the allocation for the lowest, most populous, level. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915/gt: Switch to object allocations for page directoriesChris Wilson
The GEM object is grossly overweight for the practicality of tracking large numbers of individual pages, yet it is currently our only abstraction for tracking DMA allocations. Since those allocations need to be reserved upfront before an operation, and that we need to break away from simple system memory, we need to ditch using plain struct page wrappers. In the process, we drop the WC mapping as we ended up clflushing everything anyway due to various issues across a wider range of platforms. Though in a future step, we need to drop the kmap_atomic approach which suggests we need to pre-map all the pages and keep them mapped. v2: Verify our large scratch page is suitably DMA aligned; and manually clear the scratch since we are allocating plain struct pages full of prior content. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Preallocate stashes for vma page-directoriesChris Wilson
We need to make the DMA allocations used for page directories to be performed up front so that we can include those allocations in our memory reservation pass. The downside is that we have to assume the worst case, even before we know the final layout, and always allocate enough page directories for this object, even when there will be overlap. This unfortunately can be quite expensive, especially as we have to clear/reset the page directories and DMA pages, but it should only be required during early phases of a workload when new objects are being discovered, or after memory/eviction pressure when we need to rebind. Once we reach steady state, the objects should not be moved and we no longer need to preallocating the pages tables. It should be noted that the lifetime for the page directories DMA is more or less decoupled from individual fences as they will be shared across objects across timelines. v2: Only allocate enough PD space for the PTE we may use, we do not need to allocate PD that will be left as scratch. v3: Store the shift unto the first PD level to encapsulate the different PTE counts for gen6/gen8. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-07-03drm/i915: Export ppgtt_bind_vmaChris Wilson
Reuse the ppgtt_bind_vma() for aliasing_ppgtt_bind_vma() so we can reduce some code near-duplication. The catch is that we need to then pass along the i915_address_space and not rely on vma->vm, as they differ with the aliasing-ppgtt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200703102519.26539-1-chris@chris-wilson.co.uk
2020-01-07drm/i915/gtt: split up i915_gem_gttMatthew Auld
Attempt to split i915_gem_gtt.[ch] into more manageable chunks. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk