summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/i915_vma.h
AgeCommit message (Collapse)Author
2022-05-04drm/i915: use IOMEM_ERR_PTR() directlyKefeng Wang
Use IOMEM_ERR_PTR() instead of self defined IO_ERR_PTR(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220503144937.679424-1-tvrtko.ursulin@linux.intel.com
2022-03-07drm/i915: Remove the vma refcountThomas Hellström
Now that i915_vma_parked() is taking the object lock on vma destruction, and the only user of the vma refcount, i915_gem_object_unbind() also takes the object lock, remove the vma refcount. v3: Documentation update. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-3-thomas.hellstrom@linux.intel.com
2022-02-28drm/i915: Clarify vma lifetimeThomas Hellström
It's unclear what reference the initial vma kref reference refers to. A vma can have multiple weak references, the object vma list, the vm's bound list and the GT's closed_list, and the initial vma reference can be put from lookups of all these lists. With the current implementation this means that any holder of yet another vma refcount (currently only i915_gem_object_unbind()) needs to be holding two of either *) An object refcount, *) A vm open count *) A vma open count in order for us to not risk leaking a reference by having the initial vma reference being put twice. Address this by re-introducing i915_vma_destroy() which removes all weak references of the vma and *then* puts the initial vma refcount. This makes a strong vma reference hold on to the vma unconditionally. Perhaps a better name would be i915_vma_revoke() or i915_vma_zombify(), since other callers may still hold a refcount, but with the prospect of being able to replace the vma refcount with the object lock in the near future, let's stick with i915_vma_destroy(). Finally this commit fixes a race in that previously i915_vma_release() and now i915_vma_destroy() could destroy a vma without taking the vm->mutex after an advisory check that the vma mm_node was not allocated. This would race with the ungrab_vma() function creating a trace similar to the below one. This was fixed in one of the __i915_vma_put() callsites in commit bc1922e5d349 ("drm/i915: Fix a race between vma / object destruction and unbinding") but although not seemingly triggered by CI, that is not sufficient. This patch is needed to fix that properly. [823.012188] Console: switching to colour dummy device 80x25 [823.012422] [IGT] gem_ppgtt: executing [823.016667] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0 [852.436465] stack segment: 0000 [#1] PREEMPT SMP NOPTI [852.436480] CPU: 0 PID: 3200 Comm: gem_ppgtt Not tainted 5.16.0-CI-CI_DRM_11115+ #1 [852.436489] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.2422.A00.2110131104 10/13/2021 [852.436499] RIP: 0010:ungrab_vma+0x9/0x80 [i915] [852.436711] Code: ef e8 4b 85 cf e0 e8 36 a3 d6 e0 8b 83 f8 9c 00 00 85 c0 75 e1 5b 5d 41 5c 41 5d c3 e9 d6 fd 14 00 55 53 48 8b af c0 00 00 00 <8b> 45 00 85 c0 75 03 5b 5d c3 48 8b 85 a0 02 00 00 48 89 fb 48 8b [852.436727] RSP: 0018:ffffc90006db7880 EFLAGS: 00010246 [852.436734] RAX: 0000000000000000 RBX: ffffc90006db7598 RCX: 0000000000000000 [852.436742] RDX: ffff88815349e898 RSI: ffff88815349e858 RDI: ffff88810a284140 [852.436748] RBP: 6b6b6b6b6b6b6b6b R08: ffff88815349e898 R09: ffff88815349e8e8 [852.436754] R10: 0000000000000001 R11: 0000000051ef1141 R12: ffff88810a284140 [852.436762] R13: 0000000000000000 R14: ffff88815349e868 R15: ffff88810a284458 [852.436770] FS: 00007f5c04b04e40(0000) GS:ffff88849f000000(0000) knlGS:0000000000000000 [852.436781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [852.436788] CR2: 00007f5c04b38fe0 CR3: 000000010a6e8001 CR4: 0000000000770ef0 [852.436797] PKRU: 55555554 [852.436801] Call Trace: [852.436806] <TASK> [852.436811] i915_gem_evict_for_node+0x33c/0x3c0 [i915] [852.437014] i915_gem_gtt_reserve+0x106/0x130 [i915] [852.437211] i915_vma_pin_ww+0x8f4/0xb60 [i915] [852.437412] eb_validate_vmas+0x688/0x860 [i915] [852.437596] i915_gem_do_execbuffer+0xc0e/0x25b0 [i915] [852.437770] ? deactivate_slab+0x5f2/0x7d0 [852.437778] ? _raw_spin_unlock_irqrestore+0x50/0x60 [852.437789] ? i915_gem_execbuffer2_ioctl+0xc6/0x2c0 [i915] [852.437944] ? init_object+0x49/0x80 [852.437950] ? __lock_acquire+0x5e6/0x2580 [852.437963] i915_gem_execbuffer2_ioctl+0x116/0x2c0 [i915] [852.438129] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915] [852.438300] drm_ioctl_kernel+0xac/0x140 [852.438310] drm_ioctl+0x201/0x3d0 [852.438316] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915] [852.438490] __x64_sys_ioctl+0x6a/0xa0 [852.438498] do_syscall_64+0x37/0xb0 [852.438507] entry_SYSCALL_64_after_hwframe+0x44/0xae [852.438515] RIP: 0033:0x7f5c0415b317 [852.438523] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 [852.438542] RSP: 002b:00007ffd765039a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [852.438553] RAX: ffffffffffffffda RBX: 000055e4d7829dd0 RCX: 00007f5c0415b317 [852.438562] RDX: 00007ffd76503a00 RSI: 00000000c0406469 RDI: 0000000000000017 [852.438571] RBP: 00007ffd76503a00 R08: 0000000000000000 R09: 0000000000000081 [852.438579] R10: 00000000ffffff7f R11: 0000000000000246 R12: 00000000c0406469 [852.438587] R13: 0000000000000017 R14: 00007ffd76503a00 R15: 0000000000000000 [852.438598] </TASK> [852.438602] Modules linked in: snd_hda_codec_hdmi i915 mei_hdcp x86_pkg_temp_thermal snd_hda_intel snd_intel_dspcfg drm_buddy coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ttm ghash_clmulni_intel snd_hwdep snd_hda_core e1000e drm_dp_helper ptp snd_pcm mei_me drm_kms_helper pps_core mei syscopyarea sysfillrect sysimgblt fb_sys_fops prime_numbers intel_lpss_pci smsc75xx usbnet mii [852.440310] ---[ end trace e52cdd2fe4fd911c ]--- v2: Fix typos in the commit message. Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.") Fixes: bc1922e5d349 ("drm/i915: Fix a race between vma / object destruction and unbinding") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220222133209.587978-1-thomas.hellstrom@linux.intel.com
2022-01-18drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for ↵Maarten Lankhorst
i915_vma_unbind, v2. We want to remove more members of i915_vma, which requires the locking to be held more often. Start requiring gem object lock for i915_vma_unbind, as it's one of the callers that may unpin pages. Some special care is needed when evicting, because the last reference to the object may be held by the VMA, so after __i915_vma_unbind, vma may be garbage, and we need to cache vma->obj before unlocking. Changes since v1: - Make trylock failing a WARN. (Matt) - Remove double i915_vma_wait_for_bind() (Matt) - Move atomic_set to right before mutex_unlock(), to make it more clear they belong together. (Matt) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-5-maarten.lankhorst@linux.intel.com
2022-01-11drm/i915: Use vma resources for async unbindingThomas Hellström
Implement async (non-blocking) unbinding by not syncing the vma before calling unbind on the vma_resource. Add the resulting unbind fence to the object's dma_resv from where it is picked up by the ttm migration code. Ideally these unbind fences should be coalesced with the migration blit fence to avoid stalling the migration blit waiting for unbind, as they can certainly go on in parallel, but since we don't yet have a reasonable data structure to use to coalesce fences and attach the resulting fence to a timeline, we defer that for now. Note that with async unbinding, even while the unbind waits for the preceding bind to complete before unbinding, the vma itself might have been destroyed in the process, clearing the vma pages. Therefore we can only allow async unbinding if we have a refcounted sg-list and keep a refcount on that for the vma resource pages to stay intact until binding occurs. If this condition is not met, a request for an async unbind is diverted to a sync unbind. v2: - Use a separate kmem_cache for vma resources for now to isolate their memory allocation and aid debugging. - Move the check for vm closed to the actual unbinding thread. Regardless of whether the vm is closed, we need the unbind fence to properly wait for capture. - Clear vma_res::vm on unbind and update its documentation. v4: - Take cache coloring into account when searching for vma resources pending unbind. (Matthew Auld) v5: - Fix timeout and error check in i915_vma_resource_bind_dep_await(). - Avoid taking a reference on the object for async binding if async unbind capable. - Fix braces around a single-line if statement. v6: - Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>) - Don't allow async unbinding if the vma_res pages are not the same as the object pages. (Matthew Auld) v7: - s/unsigned long/u64/ in a number of places (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
2022-01-11drm/i915: Use the vma resource as argument for gtt binding / unbindingThomas Hellström
When introducing asynchronous unbinding, the vma itself may no longer be alive when the actual binding or unbinding takes place. Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource instead of a struct i915_vma for the bind_vma() and unbind_vma() ops. Similarly change the insert_entries() op for struct i915_address_space. Replace a couple of i915_vma_snapshot members with their newly introduced i915_vma_resource counterparts, since they have the same lifetime. Also make sure to avoid changing the struct i915_vma_flags (in particular the bind flags) async. That should now only be done sync under the vm mutex. v2: - Update the vma_res::bound_flags when binding to the aliased ggtt v6: - Remove I915_VMA_ALLOC_BIT (Matthew Auld) - Change some members of struct i915_vma_resource from unsigned long to u64 (Matthew Auld) v7: - Fix vma resource size parameters to be u64 rather than unsigned long (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
2022-01-11drm/i915: Initial introduction of vma resourcesThomas Hellström
Introduce vma resources, sort of similar to TTM resources, needed for asynchronous bind management. Initially we will use them to hold completion of unbinding when we capture data from a vma, but they will be used extensively in upcoming patches for asynchronous vma unbinding. v6: - Some documentation updates Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-2-thomas.hellstrom@linux.intel.com
2021-12-20drm/i915: Remove pages_mutex and intel_gtt->vma_ops.set/clear_pages members, v3.Maarten Lankhorst
Big delta, but boils down to moving set_pages to i915_vma.c, and removing the special handling, all callers use the defaults anyway. We only remap in ggtt, so default case will fall through. Because we still don't require locking in i915_vma_unpin(), handle this by using xchg in get_pages(), as it's locked with obj->mutex, and cmpxchg in unpin, which only fails if we race a against a new pin. Changes since v1: - aliasing gtt sets ZERO_SIZE_PTR, not -ENODEV, remove special case from __i915_vma_get_pages(). (Matt) Changes since v2: - Free correct old pages in __i915_vma_get_pages(). (Matt) Remove race of clearing vma->pages accidentally from put, free it but leave it set, as only get has the lock. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-4-maarten.lankhorst@linux.intel.com Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2021-12-20drm/i915: Remove unused bits of i915_vma/active apiMaarten Lankhorst
When reworking the code to move the eviction fence to the object, the best code is removed code. Remove some functions that are unused, and change the function definition if it's only used in 1 place. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> [mlankhorst: Remove new use of i915_active_has_exclusive] Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-2-maarten.lankhorst@linux.intel.com
2021-11-19drm/i915: Remove resv from i915_vmaMaarten Lankhorst
It's just an alias to vma->obj->base.resv, no need to duplicate it. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-5-matthew.auld@intel.com
2021-11-19drm/i915: vma is always backed by an object.Maarten Lankhorst
vma->obj and vma->resv are now never NULL, and some checks can be removed. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211117142024.1043017-4-matthew.auld@intel.com
2021-10-15drm/i915: Multi-BB execbufMatthew Brost
Allow multiple batch buffers to be submitted in a single execbuf IOCTL after a context has been configured with the 'set_parallel' extension. The number batches is implicit based on the contexts configuration. This is implemented with a series of loops. First a loop is used to find all the batches, a loop to pin all the HW contexts, a loop to create all the requests, a loop to submit (emit BB start, etc...) all the requests, a loop to tie the requests to the VMAs they touch, and finally a loop to commit the requests to the backend. A composite fence is also created for the generated requests to return to the user and to stick in dma resv slots. No behavior from the existing IOCTL should be changed aside from when throttling because the ring for a context is full. In this situation, i915 will now wait while holding the object locks. This change was done because the code is much simpler to wait while holding the locks and we believe there isn't a huge benefit of dropping these locks. If this proves false we can restructure the code to drop the locks during the wait. IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1 media UMD: https://github.com/intel/media-driver/pull/1252 v2: (Matthew Brost) - Return proper error value if i915_request_create fails v3: (John Harrison) - Add comment explaining create / add order loops + locking - Update commit message explaining different in IOCTL behavior - Line wrap some comments - eb_add_request returns void - Return -EINVAL rather triggering BUG_ON if cmd parser used (Checkpatch) - Check eb->batch_len[*current_batch] v4: (CI) - Set batch len if passed if via execbuf args - Call __i915_request_skip after __i915_request_commit (Kernel test robot) - Initialize rq to NULL in eb_pin_timeline v5: (John Harrison) - Fix typo in comments near bb order loops Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-21-matthew.brost@intel.com
2021-07-28drm/i915: move vma slab to direct module init/exitDaniel Vetter
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_vmas to just a slab_vmas. We have to keep i915_drv.h include in i915_globals otherwise there's nothing anymore that pulls in GEM_BUG_ON. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-9-daniel.vetter@ffwll.ch
2021-06-11Merge tag 'drm-intel-gt-next-2021-06-10' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - Disable mmap ioctl for gen12+ (excl. TGL-LP) - Start enabling HuC loading by default for upcoming Gen12+ platforms (excludes TGL and RKL) Core Changes: - Backmerge of drm-next Driver Changes: - Revert "i915: use io_mapping_map_user" (Eero, Matt A) - Initialize the TTM device and memory managers (Thomas) - Major rework to the GuC submission backend to prepare for enabling on new platforms (Michal Wa., Daniele, Matt B, Rodrigo) - Fix i915_sg_page_sizes to record dma segments rather than physical pages (Thomas) - Locking rework to prep for TTM conversion (Thomas) - Replace IS_GEN and friends with GRAPHICS_VER (Lucas) - Use DEVICE_ATTR_RO macro (Yue) - Static code checker fixes (Zhihao) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YMHeDxg9VLiFtyn3@jlahtine-mobl.ger.corp.intel.com
2021-06-03drm/i915: Promote ptrdiff() to i915_utils.hMichal Wajdeczko
Generic helpers should be placed in i915_utils.h. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-9-matthew.brost@intel.com
2021-05-25drm/i915/debugfs: Print remap info for DPT VMAs as wellImre Deak
Similarly to GGTT VMAs, DPT VMAs can be also a remapped or rotated view of the mapped object, so make sure we debug print the details for these views as well besides the normal view. While at it also fix the debug print for the VMA type of DPT VMAs. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210524172703.2113058-3-imre.deak@intel.com
2021-05-25drm/i915/adlp: Fix GEM VM asserts for DPT VMsImre Deak
An object mapped via DPT can have remapped and rotated VMA instances besides the normal VMA instance, similarly to GGTT VMA instances. Adjust the corresponding VMA lookup asserts. While at it also check if a DPT VM is passed incorrectly to i915_vm_to_ppgtt(). Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210524172703.2113058-2-imre.deak@intel.com
2021-03-24drm/i915: Take reservation lock around i915_vma_pin.Maarten Lankhorst
We previously complained when ww == NULL. This function is now only used in selftests to pin an object, and ww locking is now fixed. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> [danvet: Resolve conflict because we don't have a set-domain refactor, see https://lore.kernel.org/intel-gfx/20210203090205.25818-8-chris@chris-wilson.co.uk/ The really worrying thing here is that the above patch had a change in arguments for i915_gem_object_set_to_gtt_domain(), without any explanation. I decided to just faithfully apply Maarten's change but not the argument change which was in Maarten's context diff.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-26-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: make lockdep slightly happier about execbuf.Maarten Lankhorst
As soon as we install fences, we should stop allocating memory in order to prevent any potential deadlocks. This is required later on, when we start adding support for dma-fence annotations. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-11-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: Ensure we hold the object mutex in pin correctly.Maarten Lankhorst
Currently we have a lot of places where we hold the gem object lock, but haven't yet been converted to the ww dance. Complain loudly about those places. i915_vma_pin shouldn't have the obj lock held, so we can do a ww dance, while i915_vma_pin_ww should. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #irc Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-6-maarten.lankhorst@linux.intel.com
2021-01-20drm/i915/gem: Protect used framebuffers from casual evictionChris Wilson
In the shrinker, we protect framebuffers from light reclaim as we typically expect framebuffers to be reused in the near future (and with low latency requirements). We can apply the same logic to the GGTT eviction and defer framebuffers to the second pass only used if the caller is desperate enough to wait for space to become available. In most cases, the caller will use a smaller partial vma instead of trying to force the object into the GGTT if doing so will cause other users to be evicted. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-5-chris@chris-wilson.co.uk
2020-09-07drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.Maarten Lankhorst
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-05-28drm/i915/gt: Remove local entries from GGTT on suspendChris Wilson
Across suspend/resume, we clear the entire GGTT and rebuild from scratch. In particular, we want to only preserve the global entries for use by the HW, and delay reinstating the local binds until required by the user. This means that we can evict any local binds in the global GTT, saving any time in preserving their state, as they will be rebound on demand. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1947 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200528082427.21402-2-chris@chris-wilson.co.uk
2020-04-01drm/i915/gt: Make fence revocation unequivocalChris Wilson
If we must revoke the fence because the VMA is no longer present, or because the fence no longer applies, ensure that we do and convert it into an error if we try but cannot. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200401210104.15907-3-chris@chris-wilson.co.uk
2020-03-16drm/i915/gt: Allocate i915_fence_reg arrayChris Wilson
Since the number of fence regs can vary dramactically between platforms, allocate the array on demand so we don't waste as much space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-4-chris@chris-wilson.co.uk
2020-03-16drm/i915: Move GGTT fence registers under gt/Chris Wilson
Since the fence registers control HW detiling through the GGTT aperture, make them a part of the intel_ggtt under gt/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-1-chris@chris-wilson.co.uk
2020-01-30drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutexChris Wilson
On Braswell and Broxton (also known as Valleyview and Apollolake), we need to serialise updates of the GGTT using the big stop_machine() hammer. This has the side effect of appearing to lockdep as a possible reclaim (since it uses the cpuhp mutex and that is tainted by per-cpu allocations). However, we want to use vm->mutex (including ggtt->mutex) from within the shrinker and so must avoid such possible taints. For this purpose, we introduced the asynchronous vma binding and we can apply it to the PIN_GLOBAL so long as take care to add the necessary waits for the worker afterwards. Closes: https://gitlab.freedesktop.org/drm/intel/issues/211 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-3-chris@chris-wilson.co.uk
2020-01-07drm/i915/gtt: split up i915_gem_gttMatthew Auld
Attempt to split i915_gem_gtt.[ch] into more manageable chunks. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk
2019-12-23drm/i915: Introduce a vma.krefChris Wilson
Start introducing a kref on i915_vma in order to protect the vma unbind (i915_gem_object_unbind) from a parallel destruction (i915_vma_parked). Later, we will use the refcount to manage all access and turn i915_vma into a first class container. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Acked-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222210256.2066451-2-chris@chris-wilson.co.uk
2019-12-05drm/i915: Try hard to bind the contextChris Wilson
It is not acceptable for context pinning to fail with -ENOSPC as we should always be able to make space in the GGTT. The only reason we may fail is that other "temporary" context pins are reserving their space and we need to wait for an available slot. Closes: https://gitlab.freedesktop.org/drm/intel/issues/676 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191205113726.413351-2-chris@chris-wilson.co.uk
2019-12-04drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSETAbdiel Janulgue
This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature comes from the value returned by this ioctl which is the offset into the device fd which userpace uses with mmap(2). mmap_gtt was our initial mmap_offset implementation, this extends our CPU mmap support to allow additional fault handlers that depends on the object's backing pages. Note that we multiplex mmap_gtt and mmap_offset through the same ioctl, and use the zero extending behaviour of drm to differentiate between them, when we inspect the flags. To support multiple mmap types on an object we need to support multiple mmap_offsets for an object (each offset in the global device address space corresponding to a unique instance of the object for a file + mmap type). As we drop the simplified drm core idea of a single mmap_offset, we need to provide replacement hooks for the dumb mmap interface as well. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675 Testcase: igt/gem_mmap_offset Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk
2019-10-21drm/i915: Lift i915_vma_parked() onto the gtChris Wilson
Currently even though i915_vma_parked() operates on a per-gt struct, it is called from a global pm notify. This oddity was only because the long term plan is to decouple the vma cache from the pm notification, but right now the oddity stands out like a sore thumb! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191021183236.21790-1-chris@chris-wilson.co.uk
2019-10-04drm/i915: Pull i915_vma_pin under the vm->mutexChris Wilson
Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to do allocate and apply the PTE updates after we have we reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is the asynchronous binding only applies to the vma timeline itself and not to the pages as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple, we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence in reducing the locking around the vma is the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is own by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called! v2: Add some commentary, and some helpers to reduce patch churn. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
2019-10-04drm/i915: Use helpers for drm_mm_node booleansChris Wilson
A subset of 71724f708997 ("drm/mm: Use helpers for drm_mm_node booleans") in order to prepare drm-intel-next-queued for subsequent patches before we can backmerge 71724f708997 itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004142226.13711-1-chris@chris-wilson.co.uk
2019-09-11drm/i915: Make i915_vma.flags atomic_t for mutex reductionChris Wilson
In preparation for reducing struct_mutex stranglehold around the vm, make the vma.flags atomic so that we can acquire a pin on the vma atomically before deciding if we need to take the mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk
2019-09-09drm/i915: cleanup cache-coloringMatthew Auld
Try to tidy up the cache-coloring such that we rid the code of any mm.color_adjust assumptions, this should hopefully make it more obvious in the code when we need to actually use the cache-level as the color, and as a bonus should make adding a different color-scheme simpler. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
2019-09-09drm/i915: export color_differsMatthew Auld
Export color_differs so that we can use it elsewhere. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-1-matthew.auld@intel.com
2019-08-22drm/i915: Replace i915_vma_put_fence()Chris Wilson
Avoid calling i915_vma_put_fence() by using our alternate paths that bind a secondary vma avoiding the original fenced vma. For the few instances where we need to release the fence (i.e. on binding when the GGTT range becomes invalid), replace the put_fence with a revoke_fence. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk
2019-08-22drm/i915: Track ggtt fence reservations under its own mutexChris Wilson
We can reduce the locking for fence registers from the dev->struct_mutex to a local mutex. We could introduce a mutex for the sole purpose of tracking the fence acquisition, except there is a little bit of overlap with the fault tracking, so use the i915_ggtt.mutex as it covers both. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-1-chris@chris-wilson.co.uk
2019-08-22Merge drm/drm-next into drm-intel-next-queuedRodrigo Vivi
We need the rename of reservation_object to dma_resv. The solution on this merge came from linux-next: From: Stephen Rothwell <sfr@canb.auug.org.au> Date: Wed, 14 Aug 2019 12:48:39 +1000 Subject: [PATCH] drm: fix up fallout from "dma-buf: rename reservation_object to dma_resv" Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> --- drivers/gpu/drm/i915/gt/intel_engine_pool.c | 8 ++++---- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c index 03d90b49584a..4cd54c569911 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c @@ -43,12 +43,12 @@ static int pool_active(struct i915_active *ref) { struct intel_engine_pool_node *node = container_of(ref, typeof(*node), active); - struct reservation_object *resv = node->obj->base.resv; + struct dma_resv *resv = node->obj->base.resv; int err; - if (reservation_object_trylock(resv)) { - reservation_object_add_excl_fence(resv, NULL); - reservation_object_unlock(resv); + if (dma_resv_trylock(resv)) { + dma_resv_add_excl_fence(resv, NULL); + dma_resv_unlock(resv); } err = i915_gem_object_pin_pages(node->obj); which is a simplified version from a previous one which had: Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-08-20drm/i915: Be defensive when starting vma activityChris Wilson
Before we acquire the vma for GPU activity, ensure that the underlying object is not already in the process of being freed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190820100531.8430-1-chris@chris-wilson.co.uk
2019-08-13dma-buf: rename reservation_object to dma_resvChristian König
Be more consistent with the naming of the other DMA-buf objects. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/323401/
2019-08-12drm/i915: Forgo last_fence active request trackingChris Wilson
We were using the last_fence to track the last request that used this vma that might be interpreted by a fence register and forced ourselves to wait for this request before modifying any fence register that overlapped our vma. Due to requirement that we need to track any XY_BLT command, linear or tiled, this in effect meant that we have to track the vma for its active lifespan anyway, so we can forgo the explicit last_fence tracking and just use the whole vma->active. Another solution would be to pipeline the register updates, and would help resolve some long running stalls for gen3 (but only gen 2 and 3!) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk
2019-08-02drm/i915: Hide unshrinkable context objects from the shrinkerChris Wilson
The shrinker cannot touch objects used by the contexts (logical state and ring). Currently we mark those as "pin_global" to let the shrinker skip over them, however, if we remove them from the shrinker lists entirely, we don't event have to include them in our shrink accounting. By keeping the unshrinkable objects in our shrinker tracking, we report a large number of objects available to be shrunk, and leave the shrinker deeply unsatisfied when we fail to reclaim those. The shrinker will persist in trying to reclaim the unavailable objects, forcing the system into a livelock (not even hitting the dread oomkiller). v2: Extend unshrinkable protection for perma-pinned scratch and guc allocations (Tvrtko) v3: Notice that we should be pinned when marking unshrinkable and so the link cannot be empty; merge duplicate paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk
2019-06-13drm/i915: Move fence register tracking from i915->mm to ggttChris Wilson
As the fence registers only apply to regions inside the GGTT is makes more sense that we track these as part of the i915_ggtt and not the general mm. In the next patch, we will then pull the register locking underneath the i915_ggtt.mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
2019-06-06drm/i915: Move object close under its own lockChris Wilson
Use i915_gem_object_lock() to guard the LUT and active reference to allow us to break free of struct_mutex for handling GEM_CLOSE. Testcase: igt/gem_close_race Testcase: igt/gem_exec_parallel Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk
2019-06-06drm/i915: fix documentation build warningsJani Nikula
Just a straightforward bag of fixes for a clean htmldocs build. Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190605095657.23601-2-jani.nikula@intel.com
2019-05-28drm/i915: Move GEM object domain management from struct_mutex to localChris Wilson
Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
2019-05-28drm/i915: Move object->pages API to i915_gem_object.[ch]Chris Wilson
Currently the code for manipulating the pages on an object is still residing in i915_gem.c, move it to i915_gem_object.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-3-chris@chris-wilson.co.uk
2019-05-20drm/i915: Add a new "remapped" gtt_viewVille Syrjälä
To overcome display engine stride limits we'll want to remap the pages in the GTT. To that end we need a new gtt_view type which is just like the "rotated" type except not rotated. v2: Use intel_remapped_plane_info base type s/unused/unused_mbz/ (Chris) Separate BUILD_BUG_ON()s (Chris) Use I915_GTT_PAGE_SIZE (Chris) v3: Use i915_gem_object_get_dma_address() (Chris) Trim the sg (Tvrtko) v4: Actually trim this time. Limit the max length to one row of pages to keep things simple Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>