drm/i915: Do not share hwsp across contexts any more, v8.

Instead of sharing pages with breadcrumbs, give each timeline a single page. This allows unrelated timelines not to share locks any more during command submission. As an additional benefit, seqno wraparound no longer requires i915_vma_pin, which means we no longer need to worry about a potential -EDEADLK at a point where we are ready to submit. Changes since v1: - Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle). - Extra check for completion in intel_read_hwsp(). Changes since v2: - Fix inconsistent indent in hwsp_alloc() (kbuild) - memset entire cacheline to 0. Changes since v3: - Do same in intel_timeline_reset_seqno(), and clflush for good measure. Changes since v4: - Use refcounting on timeline, instead of relying on i915_active. - Fix waiting on kernel requests. Changes since v5: - Bump amount of slots to maximum (256), for best wraparounds. - Add hwsp_offset to i915_request to fix potential wraparound hang. - Ensure timeline wrap test works with the changes. - Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to fix a hang. Changes since v6: - Rename i915_request_active_offset to i915_request_active_seqno(), and elaborate the function. (tvrtko) Changes since v7: - Move hunk to where it belongs. (jekstrand) - Replace CACHELINE_BYTES with TIMELINE_SEQNO_BYTES. (jekstrand) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> #v1 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-2-maarten.lankhorst@linux.intel.com
author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> 2021-03-23 16:49:50 +0100
committer: Daniel Vetter <daniel.vetter@ffwll.ch> 2021-03-24 11:38:56 +0100
commit: 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc (patch)
tree: 848bf3cfa805e5b522ecf136f38e7fd6f92c4452 /drivers/gpu/drm/i915/i915_request.h
parent: 547be6a479fd19fe1071d2cf76ed574fbff13481 (diff)
1 files changed, 21 insertions, 10 deletions
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 1bfe214a47e9..ce773c033642 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -237,16 +237,6 @@ struct i915_request {
 	 */
 	const u32 *hwsp_seqno;
 
-	/*
-	 * If we need to access the timeline's seqno for this request in
-	 * another request, we need to keep a read reference to this associated
-	 * cacheline, so that we do not free and recycle it before the foreign
-	 * observers have completed. Hence, we keep a pointer to the cacheline
-	 * inside the timeline's HWSP vma, but it is only valid while this
-	 * request has not completed and guarded by the timeline mutex.
-	 */
-	struct intel_timeline_cacheline __rcu *hwsp_cacheline;
-
 	/** Position in the ring of the start of the request */
 	u32 head;
 
@@ -616,4 +606,25 @@ i915_request_active_timeline(const struct i915_request *rq)
 					 lockdep_is_held(&rq->engine->active.lock));
 }
 
+static inline u32
+i915_request_active_seqno(const struct i915_request *rq)
+{
+	u32 hwsp_phys_base =
+		page_mask_bits(i915_request_active_timeline(rq)->hwsp_offset);
+	u32 hwsp_relative_offset = offset_in_page(rq->hwsp_seqno);
+
+	/*
+	 * Because of wraparound, we cannot simply take tl->hwsp_offset,
+	 * but instead use the fact that the relative for vaddr is the
+	 * offset as for hwsp_offset. Take the top bits from tl->hwsp_offset
+	 * and combine them with the relative offset in rq->hwsp_seqno.
+	 *
+	 * As rw->hwsp_seqno is rewritten when signaled, this only works
+	 * when the request isn't signaled yet, but at that point you
+	 * no longer need the offset.
+	 */
+
+	return hwsp_phys_base + hwsp_relative_offset;
+}
+
 #endif /* I915_REQUEST_H */
author	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2021-03-23 16:49:50 +0100
committer	Daniel Vetter <daniel.vetter@ffwll.ch>	2021-03-24 11:38:56 +0100
commit	12ca695d2c1ed26b2dcbb528b42813bd0f216cfc (patch)
tree	848bf3cfa805e5b522ecf136f38e7fd6f92c4452 /drivers/gpu/drm/i915/i915_request.h
parent	547be6a479fd19fe1071d2cf76ed574fbff13481 (diff)