summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/intel_engine.h
AgeCommit message (Collapse)Author
2020-11-03drm/i915/gt: Expose more parameters for emitting writes into the ringChris Wilson
Add another lower level to emit_ggtt_write so that the GGTT nature of the write is not hardcoded into the emitter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201102221057.29626-1-chris@chris-wilson.co.uk (cherry picked from commit 2739d8cfc50aafff49d599cc0a5bc855445e99a7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Cancel outstanding work after disabling heartbeats on an engineChris Wilson
We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck. Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-stop Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk (cherry picked from commit 7a991cd3e3da9a56d5616b62d425db000a3242f2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-07drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbsChris Wilson
On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling, active requests are passed to a physical engine and associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try and enable interrupts and the powermanagement required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for interrupt handler on a particular breadcrumb. v2: Drop b->irq_armed = true mocking for no interrupt HW Fixes: 4fe6abb8f513 ("drm/i915/gt: Ignore irq enabling on the virtual engines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbsChris Wilson
After staring at the breadcrumb enabling/cancellation and coming to the conclusion that the cause of the mysterious stale breadcrumbs must the act of submitting a completed requests, we can then redirect those completed requests onto a dedicated signaled_list at the time of construction and so eliminate intel_engine_transfer_stale_breadcrumbs(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-2-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-06-18drm/i915/gt: Always report the sample time for busy-statsChris Wilson
Return the monotonic timestamp (ktime_get()) at the time of sampling the busy-time. This is used in preference to taking ktime_get() separately before or after the read seqlock as there can be some large variance in reported timestamps. For selftests trying to ascertain that we are reporting accurate to within a few microseconds, even a small delay leads to the test failing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
2020-06-02drm/i915/gt: Split low level gen2-7 CS emittersChris Wilson
Pull the routines for writing CS packets out of intel_ring_submission into their own files. These are low level operations for building CS instructions, rather than the logic for filling the global ring buffer with requests, and we will want to reuse them outside of this context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200601072446.19548-2-chris@chris-wilson.co.uk
2020-05-14drm/i915/gt: Transfer old virtual breadcrumbs to irq_workerChris Wilson
The second try at staging the transfer of the breadcrumb. In part one, we realised we could not simply move to the second engine as we were only holding the breadcrumb lock on the first. So in commit 6c81e21a4742 ("drm/i915/gt: Stage the transfer of the virtual breadcrumb"), we removed it from the first engine and marked up this request to reattach the signaling on the new engine. However, this failed to take into account that we only attach the breadcrumb if the new request is added at the start of the queue, which if we are transferring, it is because we know there to be a request to be signaled (and hence we would not be attached). In this attempt, we try to transfer the completed requests to the irq_worker on its rq->engine->breadcrumbs. This preserves the coupling between the rq and its breadcrumbs, so that i915_request_cancel_breadcrumb() does not attempt to manipulate the list under the wrong lock. v2: Code sharing is fun. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1862 Fixes: 6c81e21a4742 ("drm/i915/gt: Stage the transfer of the virtual breadcrumb") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513074809.18194-1-chris@chris-wilson.co.uk
2020-05-07drm/i915/gen12: Fix HDC pipeline flushMika Kuoppala
HDC pipeline flush is bit on the first dword of the PIPE_CONTROL, not the second. Make it so. v2: function naming (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-2-mika.kuoppala@linux.intel.com
2020-05-01drm/i915/gt: Make timeslicing an explicit engine propertyChris Wilson
In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk
2020-04-30drm/i915/gt: Always enable busy-stats for execlistsChris Wilson
In the near future, we will utilize the busy-stats on each engine to approximate the C0 cycles of each, and use that as an input to a manual RPS mechanism. That entails having busy-stats always enabled and so we can remove the enable/disable routines and simplify the pmu setup. As a consequence of always having the stats enabled, we can also show the current active time via sysfs/engine/xcs/active_time_ns. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
2020-04-03drm/i915/gt: Free request pool from virtual enginesChris Wilson
While extremely unlikely to be populated, we could capture a request on the virtual engine which we should free along with the virtual engine. Fixes: 43acd6516ca9 ("drm/i915: Keep a per-engine request pool") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200403203303.10903-1-chris@chris-wilson.co.uk
2020-03-09drm/i915/gt: Defend against concurrent updates to execlists->activeChris Wilson
[ 206.875637] BUG: KCSAN: data-race in __i915_schedule+0x7fc/0x930 [i915] [ 206.875654] [ 206.875666] race at unknown origin, with read to 0xffff8881f7644480 of 8 bytes by task 703 on cpu 3: [ 206.875901] __i915_schedule+0x7fc/0x930 [i915] [ 206.876130] __bump_priority+0x63/0x80 [i915] [ 206.876361] __i915_sched_node_add_dependency+0x258/0x300 [i915] [ 206.876593] i915_sched_node_add_dependency+0x50/0xa0 [i915] [ 206.876824] i915_request_await_dma_fence+0x1da/0x530 [i915] [ 206.877057] i915_request_await_object+0x2fe/0x470 [i915] [ 206.877287] i915_gem_do_execbuffer+0x45dc/0x4c20 [i915] [ 206.877517] i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915] [ 206.877535] drm_ioctl_kernel+0xe4/0x120 [ 206.877549] drm_ioctl+0x297/0x4c7 [ 206.877563] ksys_ioctl+0x89/0xb0 [ 206.877577] __x64_sys_ioctl+0x42/0x60 [ 206.877591] do_syscall_64+0x6e/0x2c0 [ 206.877606] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: Be safe and include mb References: https://gitlab.freedesktop.org/drm/intel/issues/1318 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200309170540.10332-1-chris@chris-wilson.co.uk
2020-02-10drm/i915/selftests: Drop live_preempt_hangChris Wilson
live_preempt_hang's use of hang injection has been superseded by live_preempt_reset's use of an non-preemptible spinner. The latter does not require intrusive hacks into the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200209230838.361154-2-chris@chris-wilson.co.uk
2020-01-31drm/i915: extract engine WA programming to common resume functionDaniele Ceraolo Spurio
The workarounds are a common "feature" across gens and submission mechanisms and we already call the other WA related functions from common engine ones (<setup/cleanup>_common), so it makes sense to do the same with WA application. Medium-term, This will help us reduce the duplication once the GuC resume function is added, but short term it will also allow us to use the workaround lists for pre-gen8 engine workarounds. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200131075716.2212299-2-chris@chris-wilson.co.uk
2020-01-10drm/i915: Start chopping up the GPU error captureChris Wilson
In the near future, we will want to start a GPU error capture from a new context, from inside the softirq region of a forced preemption. To do so requires us to break up the monolithic error capture to provide new entry points with finer control; in particular focusing on one engine/gt, and being able to compose an error state from little pieces of HW capture. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110123059.1348712-1-chris@chris-wilson.co.uk
2019-12-24drm/i915/gt: Flush other retirees inside intel_gt_retire_requests()Chris Wilson
Our goal in wait_for_idle (intel_gt_retire_requests) is to the current workload *and* their idle barriers. This requires us to notice the late arrival of those, which is done by inspecting the list of active timelines. However, if a concurrent retirer is running that new timeline may not be added until after we drop the lock -- so flush concurrent retirers before we take the lock and inspect the list. Closes: https://gitlab.freedesktop.org/drm/intel/issues/878 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191223211008.2371613-1-chris@chris-wilson.co.uk
2019-12-22drm/i915/gt: Merge engine init/setup loopsChris Wilson
Now that we don't need to create GEM contexts in the middle of engine construction, we can pull the engine init/setup loops together. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222144046.1674865-2-chris@chris-wilson.co.uk
2019-12-22drm/i915/gt: Pull GT initialisation under intel_gt_init()Chris Wilson
Begin pulling the GT setup underneath a single GT umbrella; let intel_gt take ownership of its engines! As hinted, the complication is the lifetime of the probed engine versus the active lifetime of the GT backends. We need to detect the engine layout early and keep it until the end so that we can sanitize state on takeover and release. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222120752.1368352-1-chris@chris-wilson.co.uk
2019-12-21drm/i915/gt: Repeat wait_for_idle for retirement workersChris Wilson
Since we may retire timelines from secondary workers, intel_gt_retire_requests() is not always a reliable indicator that all pending retirements are complete. If we do detect secondary workers are in progress, recommend intel_gt_wait_for_idle() to repeat the retirement check. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221180204.1201217-1-chris@chris-wilson.co.uk
2019-12-18drm/i915/gt: Remove direct invocation of breadcrumb signalingChris Wilson
Only signal the breadcrumbs from inside the irq_work, simplifying our interface and calling conventions. The micro-optimisation here is that by always using the irq_work interface, we know we are always inside an irq-off critical section for the breadcrumb signaling and can ellide save/restore of the irq flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191217095642.3124521-7-chris@chris-wilson.co.uk
2019-12-13drm/i915: Introduce new macros for tracingVenkata Sandeep Dhanalakota
New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and GT_TRACE() are introduce to tag device name and engine name with contexts and requests tracing in i915. Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com
2019-12-05drm/i915/gt: Replace I915_READ with intel_uncore_readAndi Shyti
Get rid of the last remaining I915_READ in gt/ and make gt-land the first I915_READ-free happy island. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191205164422.727968-1-chris@chris-wilson.co.uk
2019-11-25drm/i915/gt: Mark the execlists->active as the primary volatile accessChris Wilson
Since we want to do a lockless read of the current active request, and that request is written to by process_csb also without serialisation, we need to instruct gcc to take care in reading the pointer itself. Otherwise, we have observed execlists_active() to report 0x40. [ 2400.760381] igt/para-4098 1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4 [ 2400.760826] igt/para-4098 1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000 [ 2400.761271] igt/para-4098 1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 } [ 2400.761726] igt/para-4097 0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940 which is impossible! The answer is that as we keep the existing execlists->active pointing into the array as we copy over that array, the unserialised read may see a partial pointer value. Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk
2019-10-29drm/i915/gt: Make timeslice duration configurableChris Wilson
Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. However, not all workloads necessarily benefit from timeslicing and in the extreme some sysadmin may want to disable or reduce the timeslicing granularity. The timeslicing mechanism can be compiled out^W^W disabled (but should DCE!) with ./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk
2019-10-25drm/i915: Move intel_engine_context_in/out into intel_lrc.cTvrtko Ursulin
Intel_lrc.c is the only caller and so to avoid some header file ordering issues in future patches move these two over there. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191025090952.10135-1-tvrtko.ursulin@linux.intel.com
2019-10-24drm/i915/gt: Split intel_ring_submissionChris Wilson
Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
2019-10-23drm/i915/gt: Replace hangcheck by heartbeatsChris Wilson
Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk
2019-10-23drm/i915/execlists: Force preemptionChris Wilson
If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. The forced preemption mechanism can be compiled out with ./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk
2019-10-22drm/i915: Pass intel_gt to intel_engines_initTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-6-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_engines_setupTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-5-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_engines_cleanupTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-4-tvrtko.ursulin@linux.intel.com
2019-10-22drm/i915: Pass intel_gt to intel_engines_init_mmioTvrtko Ursulin
Engines belong to the GT so make it indicative in the API. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-2-tvrtko.ursulin@linux.intel.com
2019-10-09drm/i915/gt: execlists->active is serialised by the taskletChris Wilson
The active/pending execlists is no longer protected by the engine->active.lock, but is serialised by the tasklet instead. Update the locking around the debug and stats to follow suit. v2: local_bh_disable() to prevent recursing into the tasklet in case we trigger a softirq (Tvrtko) Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191009160906.16195-1-chris@chris-wilson.co.uk
2019-10-08drm/i915/gt: Flush submission tasklet before waiting/retiringChris Wilson
A common bane of ours is arbitrary delays in ksoftirqd processing our submission tasklet. Give the submission tasklet a kick before we wait to avoid those delays eating into a tight timeout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191008105655.13256-1-chris@chris-wilson.co.uk
2019-09-26drm/i915: Don't disable interrupts for intel_engine_breadcrumbs_irq()Sebastian Andrzej Siewior
The function intel_engine_breadcrumbs_irq() is always invoked from an interrupt handler and for that reason it invokes (as an optimisation) only spin_lock() for locking assuming that the interrupts are already disabled. The function intel_engine_signal_breadcrumbs() is provided to disable interrupts while the former function is invoked so that assumption is also true for callers from preemptible context. On PREEMPT_RT local_irq_disable() really disables interrupts and this forbids to invoke spin_lock() which becomes a sleeping spinlock. This is also problematic with `threadirqs' in conjunction with irq_work. With force threading the interrupt handler, the handler is invoked with disabled BH but with interrupts enabled. This is okay and the lock itself is never acquired in IRQ context. This changes with irq_work (signal_irq_work()) which _still_ invokes intel_engine_breadcrumbs_irq() from IRQ context. Lockdep should see this and complain. Acquire the locks in intel_engine_breadcrumbs_irq() with _irqsave() suffix and let all callers invoke intel_engine_breadcrumbs_irq() directly instead using intel_engine_signal_breadcrumbs(). Reported-by: Clark Williams <williams@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190926105644.16703-2-bigeasy@linutronix.de
2019-08-13drm/i915/guc: Use a local cancel_port_requestsChris Wilson
Since execlists and the guc have diverged in their port tracking, we cannot simply reuse the execlists cancellation code as it leads to unbalanced reference counting. Use a local, simpler routine for the guc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190812203626.3948-1-chris@chris-wilson.co.uk
2019-08-13drm/i915: drop engine_pin/unpin_breadcrumbs_irqDaniele Ceraolo Spurio
The last user has been removed, so drop the functions. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190812233152.2172-2-daniele.ceraolospurio@intel.com
2019-08-09drm/i915: Lift timeline into intel_contextChris Wilson
Move the timeline from being inside the intel_ring to intel_context itself. This saves much pointer dancing and makes the relations of the context to its timeline much clearer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk
2019-08-06drm/i915/gt: Move the [class][inst] lookup for engines onto the GTChris Wilson
To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
2019-08-04drm/i915: Replace struct_mutex for batch pool serialisationChris Wilson
Switch to tracking activity via i915_active on individual nodes, only keeping a list of retired objects in the cache, and reaping the cache when the engine itself idles. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk
2019-07-12drm/i915/gt: Use intel_gt as the primary object for handling resetsChris Wilson
Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
2019-07-04drm/i915/gt: Ignore forcewake acquisition for posting_readsChris Wilson
We don't care about the result of the read, so it may be garbage, we only care that the mmio is flushed. As such, we can forgo using an individual forcewake and lock around any posting-read for an engine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190703155225.9501-4-chris@chris-wilson.co.uk
2019-06-21drm/i915: Rename i915_timeline to intel_timeline and move under gtTvrtko Ursulin
Move all timeline code under gt and rename to intel_gt prefix. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
2019-06-20drm/i915/execlists: Preempt-to-busyChris Wilson
When using a global seqno, we required a precise stop-the-workd event to handle preemption and unwind the global seqno counter. To accomplish this, we would preempt to a special out-of-band context and wait for the machine to report that it was idle. Given an idle machine, we could very precisely see which requests had completed and which we needed to feed back into the run queue. However, now that we have scrapped the global seqno, we no longer need to precisely unwind the global counter and only track requests by their per-context seqno. This allows us to loosely unwind inflight requests while scheduling a preemption, with the enormous caveat that the requests we put back on the run queue are still _inflight_ (until the preemption request is complete). This makes request tracking much more messy, as at any point then we can see a completed request that we believe is not currently scheduled for execution. We also have to be careful not to rewind RING_TAIL past RING_HEAD on preempting to the running context, and for this we use a semaphore to prevent completion of the request before continuing. To accomplish this feat, we change how we track requests scheduled to the HW. Instead of appending our requests onto a single list as we submit, we track each submission to ELSP as its own block. Then upon receiving the CS preemption event, we promote the pending block to the inflight block (discarding what was previously being tracked). As normal CS completion events arrive, we then remove stale entries from the inflight tracker. v2: Be a tinge paranoid and ensure we flush the write into the HWS page for the GPU semaphore to pick in a timely fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
2019-06-14drm/i915: Replace engine->timeline with a plain listChris Wilson
To continue the onslaught of removing the assumption of a global execution ordering, another casualty is the engine->timeline. Without an actual timeline to track, it is overkill and we can replace it with a much less grand plain list. We still need a list of requests inflight, for the simple purpose of finding inflight requests (for retiring, resetting, preemption etc). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
2019-06-14drm/i915: Keep contexts pinned until after the next kernel context switchChris Wilson
We need to keep the context image pinned in memory until after the GPU has finished writing into it. Since it continues to write as we signal the final breadcrumb, we need to keep it pinned until the request after it is complete. Currently we know the order in which requests execute on each engine, and so to remove that presumption we need to identify a request/context-switch we know must occur after our completion. Any request queued after the signal must imply a context switch, for simplicity we use a fresh request from the kernel context. The sequence of operations for keeping the context pinned until saved is: - On context activation, we preallocate a node for each physical engine the context may operate on. This is to avoid allocations during unpinning, which may be from inside FS_RECLAIM context (aka the shrinker) - On context deactivation on retirement of the last active request (which is before we know the context has been saved), we add the preallocated node onto a barrier list on each engine - On engine idling, we emit a switch to kernel context. When this switch completes, we know that all previous contexts must have been saved, and so on retiring this request we can finally unpin all the contexts that were marked as deactivated prior to the switch. We can enhance this in future by flushing all the idle contexts on a regular heartbeat pulse of a switch to kernel context, which will also be used to check for hung engines. v2: intel_context_active_acquire/_release Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
2019-06-12drm/i915: Remove POSTING_READ16Tvrtko Ursulin
Only a few call sites remain which have been converted to uncore mmio accessors and so the macro can be removed. ENGINE_POSTING_READ16 is added to replace one engine->mmio_base relative call site. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190611104548.30545-3-tvrtko.ursulin@linux.intel.com
2019-06-07drm/i915: Make Gen6/7 RING_FAULT_REG access engine centricTvrtko Ursulin
Similar to earlier conversions, eliminate the implicit dev_priv by introducing some helpers which take the engine parameter (since the register itself is per engine). v2: * Always use parentheses in macro arguments. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190607101535.767-1-tvrtko.ursulin@linux.intel.com
2019-05-28drm/i915/guc: Updates for GuC 32.0.3 firmwareMichal Wajdeczko
New GuC 32.0.3 firmware made many changes around its ABI that require driver updates: * FW release version numbering schema now includes patch number * FW release version encoding in CSS header * Boot parameters * Suspend/resume protocol * Sample-forcewake command * Additional Data Structures (ADS) This commit is a squash of patches 3-8 from series [1]. [1] https://patchwork.freedesktop.org/series/58760/ Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Jeff Mcgee <jeff.mcgee@intel.com> Cc: John Spotswood <john.a.spotswood@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # numbering schema Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ccs heaser Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # boot params Acked-by: John Spotswood <john.a.spotswood@intel.com> # suspend/resume Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # sample-forcewake Acked-by: John Spotswood <john.a.spotswood@intel.com> # sample-forcewake Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ADS Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190527183613.17076-4-michal.wajdeczko@intel.com
2019-05-08drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEADChris Wilson
After realising we need to sample RING_START to detect context switches from preemption events that do not allow for the seqno to advance, we can also realise that the seqno itself is just a distance along the ring and so can be replaced by sampling RING_HEAD. v2: Bonus comment for the mystery separate CS_STALL before MI_USER_INTERRUPT Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190508080704.24223-1-chris@chris-wilson.co.uk