summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/intel_context_types.h
AgeCommit message (Collapse)Author
2021-10-15drm/i915: Multi-BB execbufMatthew Brost
Allow multiple batch buffers to be submitted in a single execbuf IOCTL after a context has been configured with the 'set_parallel' extension. The number batches is implicit based on the contexts configuration. This is implemented with a series of loops. First a loop is used to find all the batches, a loop to pin all the HW contexts, a loop to create all the requests, a loop to submit (emit BB start, etc...) all the requests, a loop to tie the requests to the VMAs they touch, and finally a loop to commit the requests to the backend. A composite fence is also created for the generated requests to return to the user and to stick in dma resv slots. No behavior from the existing IOCTL should be changed aside from when throttling because the ring for a context is full. In this situation, i915 will now wait while holding the object locks. This change was done because the code is much simpler to wait while holding the locks and we believe there isn't a huge benefit of dropping these locks. If this proves false we can restructure the code to drop the locks during the wait. IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1 media UMD: https://github.com/intel/media-driver/pull/1252 v2: (Matthew Brost) - Return proper error value if i915_request_create fails v3: (John Harrison) - Add comment explaining create / add order loops + locking - Update commit message explaining different in IOCTL behavior - Line wrap some comments - eb_add_request returns void - Return -EINVAL rather triggering BUG_ON if cmd parser used (Checkpatch) - Check eb->batch_len[*current_batch] v4: (CI) - Set batch len if passed if via execbuf args - Call __i915_request_skip after __i915_request_commit (Kernel test robot) - Initialize rq to NULL in eb_pin_timeline v5: (John Harrison) - Fix typo in comments near bb order loops Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-21-matthew.brost@intel.com
2021-10-15drm/i915/guc: Implement no mid batch preemption for multi-lrcMatthew Brost
For some users of multi-lrc, e.g. split frame, it isn't safe to preempt mid BB. To safely enable preemption at the BB boundary, a handshake between parent and child is needed, syncing the set of BBs at the beginning and end of each batch. This is implemented via custom emit_bb_start & emit_fini_breadcrumb functions and enabled by default if a context is configured by set parallel extension. Lastly, this patch updates the process descriptor to the correct size as the memory used in the handshake is directly after the process descriptor. v2: (John Harrison) - Fix a few comments wording - Add struture for parent page layout v3: (John Harrison) - A structure for sync semaphore - Use offsetof to calc address - Update commit message v4: (John Harrison) - Fix typos in comment explaining memory map of scratch page Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-20-matthew.brost@intel.com
2021-10-15drm/i915/guc: Connect UAPI to GuC multi-lrc interfaceMatthew Brost
Introduce 'set parallel submit' extension to connect UAPI to GuC multi-lrc interface. Kernel doc in new uAPI should explain it all. IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1 media UMD: https://github.com/intel/media-driver/pull/1252 v2: (Daniel Vetter) - Add IGT link and placeholder for media UMD link v3: (Kernel test robot) - Fix warning in unpin engines call (John Harrison) - Reword a bunch of the kernel doc v4: (John Harrison) - Add comment why perma-pin is done after setting gem context - Update some comments / docs for proto contexts v5: (John Harrison) - Rework perma-pin comment - Add BUG_IN if context is pinned when setting gem context Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-17-matthew.brost@intel.com
2021-10-15drm/i915/guc: Insert submit fences between requests in parent-child relationshipMatthew Brost
The GuC must receive requests in the order submitted for contexts in a parent-child relationship to function correctly. To ensure this, insert a submit fence between the current request and last request submitted for requests / contexts in a parent child relationship. This is conceptually similar to a single timeline. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-14-matthew.brost@intel.com
2021-10-15drm/i915/guc: Add multi-lrc context registrationMatthew Brost
Add multi-lrc context registration H2G. In addition a workqueue and process descriptor are setup during multi-lrc context registration as these data structures are needed for multi-lrc submission. v2: (John Harrison) - Move GuC specific fields into sub-struct - Clean up WQ defines - Add comment explaining math to derive WQ / PD address v3: (John Harrison) - Add PARENT_SCRATCH_SIZE define - Update comment explaining multi-lrc register v4: (John Harrison) - Move PARENT_SCRATCH_SIZE to common file Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-9-matthew.brost@intel.com
2021-10-15drm/i915/guc: Introduce context parent-child relationshipMatthew Brost
Introduce context parent-child relationship. Once this relationship is created all pinning / unpinning operations are directed to the parent context. The parent context is responsible for pinning all of its children and itself. This is a precursor to the full GuC multi-lrc implementation but aligns to how GuC mutli-lrc interface is defined - a single H2G is used register / deregister all of the contexts simultaneously. Subsequent patches in the series will implement the pinning / unpinning operations for parent / child contexts. v2: (Daniel Vetter) - Add kernel doc, add wrapper to access parent to ensure safety v3: (John Harrison) - Fix comment explaing GEM_BUG_ON in to_parent() - Make variable names generic (non-GuC specific) v4: (John Harrison) - s/its'/its/g Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-8-matthew.brost@intel.com
2021-10-15drm/i915/guc: Take GT PM ref when deregistering contextMatthew Brost
Taking a PM reference to prevent intel_gt_wait_for_idle from short circuiting while a deregister context H2G is in flight. To do this must issue the deregister H2G from a worker as context can be destroyed from an atomic context and taking GT PM ref blows up. Previously we took a runtime PM from this atomic context which worked but will stop working once runtime pm autosuspend in enabled. So this patch is two fold, stop intel_gt_wait_for_idle from short circuting and fix runtime pm autosuspend. v2: (John Harrison) - Split structure changes out in different patch (Tvrtko) - Don't drop lock in deregister_destroyed_contexts v3: (John Harrison) - Flush destroyed contexts before destroying context reg pool Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-3-matthew.brost@intel.com
2021-10-15drm/i915/guc: Move GuC guc_id allocation under submission state sub-structMatthew Brost
Move guc_id allocation under submission state sub-struct as a future patch will reuse the spin lock as a global submission state lock. Moving this into sub-struct makes ownership of fields / lock clear. v2: (Docs) - Add comment for submission_state sub-structure v3: (John Harrison) - Fixup a few comments v4: (John Harrison) - Fix typo Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-2-matthew.brost@intel.com
2021-09-24drm/i915/gt: Register the migrate contexts with their enginesThomas Hellström
Pinned contexts, like the migrate contexts need reset after resume since their context image may have been lost. Also the GuC needs to register pinned contexts. Add a list to struct intel_engine_cs where we add all pinned contexts on creation, and traverse that list at resume time to reset the pinned contexts. This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now, but proper LMEM backup / restore is needed for full suspend functionality. However, note that even with full LMEM backup / restore it may be desirable to keep the reset since backing up the migrate context images must happen using memcpy() after the migrate context has become inactive, and for performance- and other reasons we want to avoid memcpy() from LMEM. Also traverse the list at guc_init_lrc_mapping() calling guc_kernel_context_pin() for the pinned contexts, like is already done for the kernel context. v2: - Don't reset the contexts on each __engine_unpark() but rather at resume time (Chris Wilson). v3: - Reset contexts in the engine sanitize callback. (Chris Wilson) Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Brost Matthew <matthew.brost@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com
2021-09-13drm/i915/guc: Add GuC kernel docMatthew Brost
Add GuC kernel doc for all structures added thus far for GuC submission and update the main GuC submission section with the new interface details. v2: - Drop guc_active.lock DOC v3: - Fixup a few kernel doc comments (Daniele) v4 (Daniele): - Implement doc suggestions from John - Add kerneldoc for all members of the GuC structure and pull the file in i915.rst v5 (Daniele): - Implement new doc suggestions from John Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-24-matthew.brost@intel.com
2021-09-13drm/i915/guc: Drop guc_active move everything into guc_stateMatthew Brost
Now that we have locking hierarchy of sched_engine->lock -> ce->guc_state everything from guc_active can be moved into guc_state and protected the guc_state.lock. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-23-matthew.brost@intel.com
2021-09-13drm/i915/guc: Move fields protected by guc->contexts_lock into sub structureMatthew Brost
To make ownership of locking clear move fields (guc_id, guc_id_ref, guc_id_link) to sub structure guc_id in intel_context. Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-22-matthew.brost@intel.com
2021-09-13drm/i915/guc: Move GuC priority fields in context under guc_activeMatthew Brost
Move GuC management fields in context under guc_active struct as this is where the lock that protects theses fields lives. Also only set guc_prio field once during context init. v2: (Daniele) - set CONTEXT_SET_INIT Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-21-matthew.brost@intel.com
2021-09-13drm/i915/guc: Drop pin count check trick between sched_disable and re-pinMatthew Brost
Drop pin count check trick between a sched_disable and re-pin, now rely on the lock and counter of the number of committed requests to determine if scheduling should be disabled on the context. Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-20-matthew.brost@intel.com
2021-09-13drm/i915/guc: Rework and simplify lockingMatthew Brost
Rework and simplify the locking with GuC subission. Drop sched_state_no_lock and move all fields under the guc_state.sched_state and protect all these fields with guc_state.lock . This requires changing the locking hierarchy from guc_state.lock -> sched_engine.lock to sched_engine.lock -> guc_state.lock. v2: (Daniele) - Don't check fields outside of lock during sched disable, check less fields within lock as some of the outside are no longer needed Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-18-matthew.brost@intel.com
2021-09-13drm/i915/guc: Move guc_blocked fence to struct guc_stateMatthew Brost
Move guc_blocked fence to struct guc_state as the lock which protects the fence lives there. s/ce->guc_blocked/ce->guc_state.blocked/g v2: (Daniele) - s/blocked_fence/blocked/g Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-17-matthew.brost@intel.com
2021-09-13drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2HMatthew Brost
While debugging an issue with full GT resets I went down a rabbit hole thinking the scrubbing of lost G2H wasn't working correctly. This proved to be incorrect as this was working just fine but this chase inspired me to write a selftest to prove that this works. This simple selftest injects errors dropping various G2H and then issues a full GT reset proving that the scrubbing of these G2H doesn't blow up. v2: (Daniel Vetter) - Use ifdef instead of macros for selftests v3: (Checkpatch) - A space after 'switch' statement v4: (Daniele) - A comment saying GT won't idle if G2H are lost Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-12-matthew.brost@intel.com
2021-07-27drm/i915/guc: Implement GuC priority managementMatthew Brost
Implement a simple static mapping algorithm of the i915 priority levels (int, -1k to 1k exposed to user) to the 4 GuC levels. Mapping is as follows: i915 level < 0 -> GuC low level (3) i915 level == 0 -> GuC normal level (2) i915 level < INT_MAX -> GuC high level (1) i915 level == INT_MAX -> GuC highest level (0) We believe this mapping should cover the UMD use cases (3 distinct user levels + 1 kernel level). In addition to static mapping, a simple counter system is attached to each context tracking the number of requests inflight on the context at each level. This is needed as the GuC levels are per context while in the i915 levels are per request. v2: (Daniele) - Add BUILD_BUG_ON to enforce ordering of priority levels - Add missing lockdep to guc_prio_fini - Check for return before setting context registered flag - Map DISPLAY priority or higher to highest guc prio - Update comment for guc_prio Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-33-matthew.brost@intel.com
2021-07-27drm/i915/guc: Support request cancellationMatthew Brost
This adds GuC backend support for i915_request_cancel(), which in turn makes CONFIG_DRM_I915_REQUEST_TIMEOUT work. This implementation makes use of fence while there are likely simplier options. A fence was chosen because of another feature coming soon which requires a user to block on a context until scheduling is disabled. In that case we return the fence to the user and the user can wait on that fence. v2: (Daniele) - A comment about locking the blocked incr / decr - A comments about the use of the fence - Update commit message explaining why fence - Delete redundant check blocked count in unblock function - Ring buffer implementation - Comment about blocked in submission path - Shorter rpm path v3: (Checkpatch) - Fix typos in commit message (Daniel) - Rework to simplier locking structure in guc_context_block / unblock Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-26-matthew.brost@intel.com
2021-07-27drm/i915/guc: Implement banned contexts for GuC submissionMatthew Brost
When using GuC submission, if a context gets banned disable scheduling and mark all inflight requests as complete. Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-25-matthew.brost@intel.com
2021-07-27drm/i915: Move active request tracking to a vfuncMatthew Brost
Move active request tracking to a backend vfunc rather than assuming all backends want to do this in the manner. In the of case execlists / ring submission the tracking is on the physical engine while with GuC submission it is on the context. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-8-matthew.brost@intel.com
2021-07-27drm/i915/guc: GuC virtual enginesMatthew Brost
Implement GuC virtual engines. Rather simple implementation, basically just allocate an engine, setup context enter / exit function to virtual engine specific functions, set all other variables / functions to guc versions, and set the engine mask to that of all the siblings. v2: Update to work with proto-ctx v3: (Daniele) - Drop include, add comment to intel_virtual_engine_has_heartbeat Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-2-matthew.brost@intel.com
2021-07-22drm/i915/guc: Defer context unpin until scheduling is disabledMatthew Brost
With GuC scheduling, it isn't safe to unpin a context while scheduling is enabled for that context as the GuC may touch some of the pinned state (e.g. LRC). To ensure scheduling isn't enabled when an unpin is done, a call back is added to intel_context_unpin when pin count == 1 to disable scheduling for that context. When the response CTB is received it is safe to do the final unpin. Future patches may add a heuristic / delay to schedule the disable call back to avoid thrashing on schedule enable / disable. v2: (John H) - s/drm_dbg/drm_err (Daneiel) - Clean up sched state function Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-9-matthew.brost@intel.com
2021-07-22drm/i915/guc: Insert fence on context when deregisteringMatthew Brost
Sometimes during context pinning a context with the same guc_id is registered with the GuC. In this a case deregister must be done before the context can be registered. A fence is inserted on all requests while the deregister is in flight. Once the G2H is received indicating the deregistration is complete the context is registered and the fence is released. v2: (John H) - Fix commit message Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <john.c.harrison@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-8-matthew.brost@intel.com
2021-07-22drm/i915/guc: Implement GuC context operations for new intefaceMatthew Brost
Implement GuC context operations which includes GuC specific operations alloc, pin, unpin, and destroy. v2: (Daniel Vetter) - Use msleep_interruptible rather than cond_resched in busy loop (Michal) - Remove C++ style comment v3: (Matthew Brost) - Drop GUC_ID_START (John Harrison) - Fix a bunch of typos - Use drm_err rather than drm_dbg for G2H errors (Daniele) - Fix ;; typo - Clean up sched state functions - Add lockdep for guc_id functions - Don't call __release_guc_id when guc_id is invalid - Use MISSING_CASE - Add comment in guc_context_pin - Use shorter path to rpm (Daniele / CI) - Don't call release_guc_id on an invalid guc_id in destroy v4: (Daniel Vetter) - Add FIXME comment Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-7-matthew.brost@intel.com
2021-07-22drm/i915/guc: Implement GuC submission taskletMatthew Brost
Implement GuC submission tasklet for new interface. The new GuC interface uses H2G to submit contexts to the GuC. Since H2G use a single channel, a single tasklet is used for the submission path. Also the per engine interrupt handler has been updated to disable the rescheduling of the physical engine tasklet, when using GuC scheduling, as the physical engine tasklet is no longer used. In this patch the field, guc_id, has been added to intel_context and is not assigned. Patches later in the series will assign this value. v2: (John Harrison) - Clean up some comments v3: (John Harrison) - More comment cleanups Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721215101.139794-5-matthew.brost@intel.com
2021-07-08drm/i915: Stop storing the ring size in the ring pointer (v3)Jason Ekstrand
Previously, we were storing the ring size in the ring pointer before it was actually allocated. We would then guard setting the ring size on checking for CONTEXT_ALLOC_BIT. This is error-prone at best and really only saves us a few bytes on something that already burns at least 4K. Instead, this patch adds a new ring_size field and makes everything use that. v2 (Daniel Vetter): - Replace 512 * SZ_4K with SZ_2M v2 (Jason Ekstrand): - Rebase on top of page migration code Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-3-jason@jlekstrand.net
2021-04-08Merge tag 'drm-intel-gt-next-2021-04-06' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Prepare for local/device memory support on DG1 by starting to use it for kernel internal allocations: context, ring and engine scratch (Matt A, CQ, Abdiel, Imre) - Sandybridge fix to avoid hard hang on ring resume (Chris) - Limit imported dma-buf size to int32 (Matt A) - Double check heartbeat timeout before resetting (Chris) - Use new tasklet API for execution list (Emil) - Fix SPDX checkpats warnings (Chris) - Fixes for various checkpatch warnings (Chris) - Selftest improvements (Chris) - Move the defer_request waiter active assertion to correct spot (Chris) - Make local-memory probing a GT operation (Matt, Tvrtko) - Protect against request freeing during cancellation on wedging (Chris) - Retire unexpected starting state error dumping (Chris) - Distinction of memory regions in debugging (Zbigniew) - Always flush the submission queue on checking for idle (Chris) - Consolidate 2big error check to helper (Matt) - Decrease number of subplatform bits (Tvrtko) - Remove unused internal request priority levels (Chris) - Document the unused internal header bits in buddy allocator (Matt) - Cleanup the region class/instance encoding (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YGxksaZGXHnFxlwg@jlahtine-mobl.ger.corp.intel.com
2021-03-26drm/i915: Request watchdog infrastructureTvrtko Ursulin
Prepares the plumbing for setting request/fence expiration time. All code is put in place but is never activated due yet missing ability to actually configure the timer. Outline of the basic operation: A timer is started when request is ready for execution. If the request completes (retires) before the timer fires, timer is cancelled and nothing further happens. If the timer fires request is added to a lockless list and worker queued. Purpose of this is twofold: a) It allows request cancellation from a more friendly context and b) coalesces multiple expirations into a single event of consuming the list. Worker locklessly consumes the list of expired requests and cancels them all using previous added i915_request_cancel(). Associated timeout value is stored in rq->context.watchdog.timeout_us. v2: * Log expiration. v3: * Include more information about user timeline in the log message. v4: * Remove obsolete comment and fix formatting. (Matt) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-6-tvrtko.ursulin@linux.intel.com
2021-03-24drm/i915/gt: SPDX cleanupChris Wilson
Clean up the SPDX licence declarations to comply with checkpatch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-1-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-01-01drm/i915: Populate logical context during first pin.Maarten Lankhorst
This allows us to remove pin_map from state allocation, which saves us a few retry loops. We won't need this until first pin, anyway. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201231170405.22843-1-chris@chris-wilson.co.uk
2020-12-29drm/i915/gem: Peek at the inflight contextChris Wilson
If supported by the backend, we can quickly look at the context's inflight engine rather than search along the active list to confirm. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201229144114.31686-1-chris@chris-wilson.co.uk
2020-12-24drm/i915/gt: Defer schedule_out until after the next dequeueChris Wilson
Inside schedule_out, we do extra work upon idling the context, such as updating the runtime, kicking off retires, kicking virtual engines. However, if we are in a series of processing single requests per contexts, we may find ourselves scheduling out the context, only to immediately schedule it back in during dequeue. This is just extra work that we can avoid if we keep the context marked as inflight across the dequeue. This becomes more significant later on for minimising virtual engine misses. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-4-chris@chris-wilson.co.uk
2020-11-26drm/i915/gt: Split the breadcrumb spinlock between global and contextsChris Wilson
As we funnel more and more contexts into the breadcrumbs on an engine, the hold time of b->irq_lock grows. As we may then contend with the b->irq_lock during request submission, this increases the burden upon the engine->active.lock and so directly impacts both our execution latency and client latency. If we split the b->irq_lock by introducing a per-context spinlock to manage the signalers within a context, we then only need the b->irq_lock for enabling/disabling the interrupt and can avoid taking the lock for walking the list of contexts within the signal worker. Even with the current setup, this greatly reduces the number of times we have to take and fight for b->irq_lock. Furthermore, this closes the race between enabling the signaling context while it is in the process of being signaled and removed: <4>[ 416.208555] list_add corruption. prev->next should be next (ffff8881951d5910), but was dead000000000100. (prev=ffff8882781bb870). <4>[ 416.208573] WARNING: CPU: 7 PID: 0 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70 <4>[ 416.208575] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio mei_hdcp x86_pkg_temp_thermal coretemp ax88179_178a usbnet mii crct10dif_pclmul snd_intel_dspcfg crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core e1000e snd_pcm ptp pps_core mei_me mei prime_numbers intel_lpss_pci [last unloaded: i915] <4>[ 416.208611] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G U 5.8.0-CI-CI_DRM_8852+ #1 <4>[ 416.208614] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3212.A00.1905212112 05/21/2019 <4>[ 416.208627] RIP: 0010:__list_add_valid+0x4d/0x70 <4>[ 416.208631] Code: c3 48 89 d1 48 c7 c7 60 18 33 82 48 89 c2 e8 ea e0 b6 ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 b0 18 33 82 e8 d3 e0 b6 ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 00 19 33 82 e8 <4>[ 416.208633] RSP: 0018:ffffc90000280e18 EFLAGS: 00010086 <4>[ 416.208636] RAX: 0000000000000000 RBX: ffff888250a44880 RCX: 0000000000000105 <4>[ 416.208639] RDX: 0000000000000105 RSI: ffffffff82320c5b RDI: 00000000ffffffff <4>[ 416.208641] RBP: ffff8882781bb870 R08: 0000000000000000 R09: 0000000000000001 <4>[ 416.208643] R10: 00000000054d2957 R11: 000000006abbd991 R12: ffff8881951d58c8 <4>[ 416.208646] R13: ffff888286073880 R14: ffff888286073848 R15: ffff8881951d5910 <4>[ 416.208669] FS: 0000000000000000(0000) GS:ffff88829c180000(0000) knlGS:0000000000000000 <4>[ 416.208671] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 416.208673] CR2: 0000556231326c48 CR3: 0000000005610001 CR4: 0000000000760ee0 <4>[ 416.208675] PKRU: 55555554 <4>[ 416.208677] Call Trace: <4>[ 416.208679] <IRQ> <4>[ 416.208751] i915_request_enable_breadcrumb+0x278/0x400 [i915] <4>[ 416.208839] __i915_request_submit+0xca/0x2a0 [i915] <4>[ 416.208892] __execlists_submission_tasklet+0x480/0x1830 [i915] <4>[ 416.208942] execlists_submission_tasklet+0xc4/0x130 [i915] <4>[ 416.208947] tasklet_action_common.isra.17+0x6c/0x1c0 <4>[ 416.208954] __do_softirq+0xdf/0x498 <4>[ 416.208960] ? handle_fasteoi_irq+0x150/0x150 <4>[ 416.208964] asm_call_on_stack+0xf/0x20 <4>[ 416.208966] </IRQ> <4>[ 416.208969] do_softirq_own_stack+0xa1/0xc0 <4>[ 416.208972] irq_exit_rcu+0xb5/0xc0 <4>[ 416.208976] common_interrupt+0xf7/0x260 <4>[ 416.208980] asm_common_interrupt+0x1e/0x40 <4>[ 416.208985] RIP: 0010:cpuidle_enter_state+0xb6/0x410 <4>[ 416.208987] Code: 00 31 ff e8 9c 3e 89 ff 80 7c 24 0b 00 74 12 9c 58 f6 c4 02 0f 85 31 03 00 00 31 ff e8 e3 6c 90 ff e8 fe a4 94 ff fb 45 85 ed <0f> 88 c7 02 00 00 49 63 c5 4c 2b 24 24 48 8d 14 40 48 8d 14 90 48 <4>[ 416.208989] RSP: 0018:ffffc90000143e70 EFLAGS: 00000206 <4>[ 416.208991] RAX: 0000000000000007 RBX: ffffe8ffffda8070 RCX: 0000000000000000 <4>[ 416.208993] RDX: 0000000000000000 RSI: ffffffff8238b4ee RDI: ffffffff8233184f <4>[ 416.208995] RBP: ffffffff826b4e00 R08: 0000000000000000 R09: 0000000000000000 <4>[ 416.208997] R10: 0000000000000001 R11: 0000000000000000 R12: 00000060e7f24a8f <4>[ 416.208998] R13: 0000000000000003 R14: 0000000000000003 R15: 0000000000000003 <4>[ 416.209012] cpuidle_enter+0x24/0x40 <4>[ 416.209016] do_idle+0x22f/0x2d0 <4>[ 416.209022] cpu_startup_entry+0x14/0x20 <4>[ 416.209025] start_secondary+0x158/0x1a0 <4>[ 416.209030] secondary_startup_64+0xa4/0xb0 <4>[ 416.209039] irq event stamp: 10186977 <4>[ 416.209042] hardirqs last enabled at (10186976): [<ffffffff810b9363>] tasklet_action_common.isra.17+0xe3/0x1c0 <4>[ 416.209044] hardirqs last disabled at (10186977): [<ffffffff81a5e5ed>] _raw_spin_lock_irqsave+0xd/0x50 <4>[ 416.209047] softirqs last enabled at (10186968): [<ffffffff810b9a1a>] irq_enter_rcu+0x6a/0x70 <4>[ 416.209049] softirqs last disabled at (10186969): [<ffffffff81c00f4f>] asm_call_on_stack+0xf/0x20 <4>[ 416.209317] list_del corruption, ffff8882781bb870->next is LIST_POISON1 (dead000000000100) <4>[ 416.209317] WARNING: CPU: 7 PID: 46 at lib/list_debug.c:47 __list_del_entry_valid+0x4e/0x90 <4>[ 416.209317] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio mei_hdcp x86_pkg_temp_thermal coretemp ax88179_178a usbnet mii crct10dif_pclmul snd_intel_dspcfg crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core e1000e snd_pcm ptp pps_core mei_me mei prime_numbers intel_lpss_pci [last unloaded: i915] <4>[ 416.209317] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G U W 5.8.0-CI-CI_DRM_8852+ #1 <4>[ 416.209317] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3212.A00.1905212112 05/21/2019 <4>[ 416.209317] RIP: 0010:__list_del_entry_valid+0x4e/0x90 <4>[ 416.209317] Code: 2e 48 8b 32 48 39 fe 75 3a 48 8b 50 08 48 39 f2 75 48 b8 01 00 00 00 c3 48 89 fe 48 89 c2 48 c7 c7 38 19 33 82 e8 62 e0 b6 ff <0f> 0b 31 c0 c3 48 89 fe 48 c7 c7 70 19 33 82 e8 4e e0 b6 ff 0f 0b <4>[ 416.209317] RSP: 0018:ffffc90000280de8 EFLAGS: 00010086 <4>[ 416.209317] RAX: 0000000000000000 RBX: ffff8882781bb848 RCX: 0000000000010104 <4>[ 416.209317] RDX: 0000000000010104 RSI: ffffffff8238b4ee RDI: 00000000ffffffff <4>[ 416.209317] RBP: ffff8882781bb880 R08: 0000000000000000 R09: 0000000000000001 <4>[ 416.209317] R10: 000000009fb6666e R11: 00000000feca9427 R12: ffffc90000280e18 <4>[ 416.209317] R13: ffff8881951d5930 R14: dead0000000000d8 R15: ffff8882781bb880 <4>[ 416.209317] FS: 0000000000000000(0000) GS:ffff88829c180000(0000) knlGS:0000000000000000 <4>[ 416.209317] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 416.209317] CR2: 0000556231326c48 CR3: 0000000005610001 CR4: 0000000000760ee0 <4>[ 416.209317] PKRU: 55555554 <4>[ 416.209317] Call Trace: <4>[ 416.209317] <IRQ> <4>[ 416.209317] remove_signaling_context.isra.13+0xd/0x70 [i915] <4>[ 416.209513] signal_irq_work+0x1f7/0x4b0 [i915] This is caused by virtual engines where although we take the breadcrumb lock on each of the active engines, they may be different engines on different requests, It turns out that the b->irq_lock was not a sufficient proxy for the engine->active.lock in the case of more than one request, so introduce an explicit lock around ce->signals. v2: ce->signal_lock is acquired with only RCU protection and so must be treated carefully and not cleared during reallocation. We also then need to confirm that the ce we lock is the same as we found in the breadcrumb list. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2276 Fixes: c18636f76344 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs") Fixes: 2854d866327a ("drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-4-chris@chris-wilson.co.uk
2020-11-26drm/i915/gt: Protect context lifetime with RCUChris Wilson
Allow a brief period for continued access to a dead intel_context by deferring the release of the struct until after an RCU grace period. As we are using a dedicated slab cache for the contexts, we can defer the release of the slab pages via RCU, with the caveat that individual structs may be reused from the freelist within an RCU grace period. To handle that, we have to avoid clearing members of the zombie struct. This is required for a later patch to handle locking around virtual requests in the signaler, as those requests may want to move between engines and be destroyed while we are holding b->irq_lock on a physical engine. v2: Drop mutex_reinit(), if we never mark the mutex as destroyed we don't need to reset the debug code, at the loss of having the mutex debug code spot us attempting to destroy a locked mutex. v3: As the intended use will remain strongly referenced counted, with very little inflight access across reuse, drop the ctor. v4: Drop the unrequired change to remove the temporary reference around dropping the active context, and add back some more missing ctor operations. v5: The ctor is back. Tvrtko spotted that ce->signal_lock [introduced later] maybe accessed under RCU and so needs special care not to be reinitialised. v6: Don't mix SLAB_TYPESAFE_BY_RCU and RCU list iteration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-3-chris@chris-wilson.co.uk
2020-09-07drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.Maarten Lankhorst
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Rework intel_context pinning to do everything outside of pin_mutexMaarten Lankhorst
Instead of doing everything inside of pin_mutex, we move all pinning outside. Because i915_active has its own reference counting and pinning is also having the same issues vs mutexes, we make sure everything is pinned first, so the pinning in i915_active only needs to bump refcounts. This allows us to take pin refcounts correctly all the time. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-14-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-04-28drm/i915/execlists: Avoid reusing the same logical CCIDChris Wilson
The bspec is confusing on the nature of the upper 32bits of the LRC descriptor. Once upon a time, it said that it uses the upper 32b to decide if it should perform a lite-restore, and so we must ensure that each unique context submitted to HW is given a unique CCID [for the duration of it being on the HW]. Currently, this is achieved by using a small circular tag, and assigning every context submitted to HW a new id. However, this tag is being cleared on repinning an inflight context such that we end up re-using the 0 tag for multiple contexts. To avoid accidentally clearing the CCID in the upper 32bits of the LRC descriptor, split the descriptor into two dwords so we can update the GGTT address separately from the CCID. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
2020-04-25drm/i915: Add per ctx batchbuffer wa for timestampMika Kuoppala
Restoration of a previous timestamp can collide with updating the timestamp, causing a value corruption. Combat this issue by using indirect ctx bb to modify the context image during restoring process. We can preload value into scratch register. From which we then do the actual write with LRR. LRR is faster and thus less error prone as probability of race drops. v2: tidying (Chris) v3: lrr for all engines v4: grp v5: reg bit v6: wa_bb_offset, virtual engines (Chris) References: HSDES#16010904313 Testcase: igt/i915_selftest/gt_lrc Suggested-by: Joseph Koston <joseph.koston@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com
2020-03-19drm/i915: Use explicit flag to mark unreachable intel_contextChris Wilson
I need to keep the GEM context around a bit longer so adding an explicit flag for syncing execbuf with closed/abandonded contexts. v2: * Use already available context flags. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk
2020-03-10drm/i915/gt: Mark up racy reads for intel_context.inflightChris Wilson
When being used across multiple real engines inside a virtual engine, the intel_context.inflight is updated atomically, and so we must annotate the racy read from outside the owning context. [11142.482846] BUG: KCSAN: data-race in __execlists_submission_tasklet [i915] / __execlists_submission_tasklet [i915] [11142.482867] [11142.482878] write (marked) to 0xffff8881f257b5e0 of 8 bytes by interrupt on cpu 2: [11142.483107] __execlists_submission_tasklet+0x1d33/0x2120 [i915] [11142.483336] execlists_submission_tasklet+0xd3/0x170 [i915] [11142.483355] tasklet_action_common.isra.0+0x42/0xa0 [11142.483371] __do_softirq+0xd7/0x2cd [11142.483384] irq_exit+0xbe/0xe0 [11142.483401] do_IRQ+0x51/0x100 [11142.483424] ret_from_intr+0x0/0x1c [11142.483446] do_idle+0x133/0x1f0 [11142.483465] cpu_startup_entry+0x14/0x16 [11142.483483] start_secondary+0x120/0x180 [11142.483498] secondary_startup_64+0xa4/0xb0 [11142.483512] [11142.483528] read to 0xffff8881f257b5e0 of 8 bytes by interrupt on cpu 1: [11142.483755] __execlists_submission_tasklet+0x14e/0x2120 [i915] [11142.483981] execlists_submission_tasklet+0xd3/0x170 [i915] [11142.483999] tasklet_action_common.isra.0+0x42/0xa0 [11142.484014] __do_softirq+0xd7/0x2cd [11142.484028] do_softirq_own_stack+0x2a/0x40 [11142.484046] do_softirq.part.0+0x26/0x30 [11142.484071] __local_bh_enable_ip+0x46/0x50 [11142.484299] i915_gem_do_execbuffer+0x39c1/0x4e50 [i915] [11142.484528] i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915] [11142.484546] drm_ioctl_kernel+0xe4/0x120 [11142.484559] drm_ioctl+0x297/0x4c7 [11142.484572] ksys_ioctl+0x89/0xb0 [11142.484586] __x64_sys_ioctl+0x42/0x60 [11142.484610] do_syscall_64+0x6e/0x2c0 [11142.484627] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200310141320.24149-1-chris@chris-wilson.co.uk
2020-02-16drm/i915: Track hw reported context runtimeTvrtko Ursulin
GPU saves accumulated context runtime (in CS timestamp units) in PPHWSP which will be useful for us in cases when we are not able to track context busyness ourselves (like with GuC). Keep a copy of this in struct intel_context from where it can be easily read even if the context is not pinned. v2: (Chris) * Do not store pphwsp address in intel_context. * Log CS wrap-around. * Simplify calculation by relying on integer wraparound. v3: * Include total/avg in traces and error state for debugging Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200216133620.394962-1-chris@chris-wilson.co.uk
2020-01-03drm/i915/gt: Always poison the kernel_context image before unparkingChris Wilson
Keep scrubbing the kernel_context image with poison before we reset it in order to demonstrate that we will be resilient in the case where it is accidentally overwritten on idle. Suggested-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200102131707.1463945-5-chris@chris-wilson.co.uk
2019-12-23drm/i915: Mark the GEM context link as RCU protectedChris Wilson
The only protection for intel_context.gem_cotext is granted by RCU, so annotate it as a rcu protected pointer and carefully dereference it in the few occasions we need to use it. Fixes: 9f3ccd40acf4 ("drm/i915: Drop GEM context as a direct link from i915_request") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222233558.2201901-1-chris@chris-wilson.co.uk
2019-12-21drm/i915: Remove i915->kernel_contextChris Wilson
Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
2019-12-20drm/i915: Push the use-semaphore marker onto the intel_contextChris Wilson
Instead of rummaging through the intel_context to peek at the GEM context in the middle of request submission to decide whether to use semaphores, store that information on the intel_context itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-2-chris@chris-wilson.co.uk
2019-12-20drm/i915: Drop GEM context as a direct link from i915_requestChris Wilson
Keep the intel_context as being the primary state for i915_request, with the GEM context a backpointer from the low level state for the rarer cases we need client information. Our goal is to remove such references to clients from the backend, and leave the HW submission agnostic to client interfaces and self-contained. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
2019-12-03drm/i915/gt: Track the context validity explicitlyChris Wilson
Rather than assume if and only if the engine->default_state is not set that the context is invalid, instead track when we know the context has valid state -- either because we have copied the default_state or we have completed a context switch to save the HW state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191203124155.3019926-1-chris@chris-wilson.co.uk
2019-10-04drm/i915: Remove logical HW IDChris Wilson
With the introduction of ctx->engines[] we allow multiple logical contexts to be used on the same engine (e.g. with virtual engines). According to bspec, aach logical context requires a unique tag in order for context-switching to occur correctly between them. [Simple experiments show that it is not so easy to trick the HW into performing a lite-restore with matching logical IDs, though my memory from early Broadwell experiments do suggest that it should be generating lite-restores.] We only need to keep a unique tag for the active lifetime of the context, and for as long as we need to identify that context. The HW uses the tag to determine if it should use a lite-restore (why not the LRCA?) and passes the tag back for various status identifies. The only status we need to track is for OA, so when using perf, we assign the specific context a unique tag. v2: Calculate required number of tags to fill ELSP. Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk
2019-08-16drm/i915/execlists: Lift process_csb() out of the irq-off spinlockChris Wilson
If we only call process_csb() from the tasklet, though we lose the ability to bypass ksoftirqd interrupt processing on direct submission paths, we can push it out of the irq-off spinlock. The penalty is that we then allow schedule_out to be called concurrently with schedule_in requiring us to handle the usage count (baked into the pointer itself) atomically. As we do kick the tasklets (via local_bh_enable()) after our submission, there is a possibility there to see if we can pull the local softirq processing back from the ksoftirqd. v2: Store the 'switch_priority_hint' on submission, so that we can safely check during process_csb(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816171608.11760-1-chris@chris-wilson.co.uk