path: root/drivers/gpu/drm/i915/i915_vma_resource.h
author     Chris Wilson <chris.p.wilson@intel.com>       2022-07-27 14:29:55 +0200
committer  Rodrigo Vivi <rodrigo.vivi@intel.com>         2022-08-08 14:06:55 -0400
commit     59eda6ce824e95b98c45628fe6c0adb9130c6df2 (patch)
tree       3cc0cce91171a8a38c47d1e8c89c965974d278fd /drivers/gpu/drm/i915/i915_vma_resource.h
parent     e5a95c83ed1492c0f442b448b20c90c8faaf702b (diff)
drm/i915/gt: Batch TLB invalidations
Invalidate TLB in batches, in order to reduce performance regressions.

Currently, every caller performs a full barrier around a TLB
invalidation, ignoring all other invalidations that may have already
removed their PTEs from the cache. As this is a synchronous operation
and can be quite slow, we cause multiple threads to contend on the TLB
invalidate mutex blocking userspace.

We only need to invalidate the TLB once after replacing our PTE to
ensure that there is no possible continued access to the physical
address before releasing our pages. By tracking a seqno for each full
TLB invalidate we can quickly determine if one has been performed since
rewriting the PTE, and only if necessary trigger one for ourselves.

That helps to reduce the performance regression introduced by TLB
invalidate logic.

[mchehab: rebased to not require moving the code to a separate file]

Cc: stable@vger.kernel.org
Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Fei Yang <fei.yang@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org
(cherry picked from commit 5d36acb7198b0e5eb88e6b701f9ad7b9448f8df9)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
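To make the batching idea above concrete, here is a minimal, self-contained C
sketch of the seqno scheme: a counter is bumped after every full TLB
invalidation, each PTE rewrite records the counter value it observed, and the
flush before the pages are released is skipped when the counter has already
advanced. All names below (tlb_seqno, note_pte_rewrite, and so on) are
illustrative stand-ins, not the actual i915 symbols, and locking is omitted
for brevity.

/* Illustrative sketch only -- not the i915 implementation. */
#include <stdint.h>

static uint32_t tlb_seqno;	/* bumped after every full TLB invalidation */

/* Remember which invalidation "era" a PTE rewrite happened in. */
static uint32_t note_pte_rewrite(void)
{
	return tlb_seqno;
}

/* The slow, serialised full invalidation that we want to batch. */
static void tlb_invalidate_full(void)
{
	/* ... issue the hardware invalidation here ... */
	tlb_seqno++;
}

/*
 * Before releasing the backing pages: if any full invalidation has run
 * since our PTE rewrite, it already flushed our stale entries and we
 * can skip the expensive operation; otherwise do it ourselves.
 */
static void tlb_invalidate_if_needed(uint32_t pte_seqno)
{
	if (pte_seqno == tlb_seqno)
		tlb_invalidate_full();
}

The real driver additionally has to serialise concurrent readers and writers
of the seqno (the commit text mentions the TLB invalidate mutex); the sketch
deliberately leaves that out.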
Diffstat (limited to 'drivers/gpu/drm/i915/i915_vma_resource.h')
-rw-r--r--  drivers/gpu/drm/i915/i915_vma_resource.h  6
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index 5d8427caa2ba..06923d1816e7 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -67,6 +67,7 @@ struct i915_page_sizes {
  * taken when the unbind is scheduled.
  * @skip_pte_rewrite: During ggtt suspend and vm takedown pte rewriting
  * needs to be skipped for unbind.
+ * @tlb: pointer for obj->mm.tlb, if async unbind. Otherwise, NULL
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -119,6 +120,8 @@ struct i915_vma_resource {
 	bool immediate_unbind:1;
 	bool needs_wakeref:1;
 	bool skip_pte_rewrite:1;
+
+	u32 *tlb;
 };
 
 bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
@@ -131,7 +134,8 @@ struct i915_vma_resource *i915_vma_resource_alloc(void);
 
 void i915_vma_resource_free(struct i915_vma_resource *vma_res);
 
-struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
+struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res,
+					   u32 *tlb);
 
 void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
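The signature change to i915_vma_resource_unbind() above follows the same
pattern: instead of flushing inside the unbind path, the unbind is handed a
slot (per the kernel-doc added in this patch, a pointer to obj->mm.tlb for an
async unbind, or NULL) in which to record that a TLB flush is still owed, and
the eventual page release consults that value. Continuing the stand-in names
from the earlier sketch (again, not the real i915 symbols), the shape is
roughly:

/* Stand-ins carried over from the sketch after the commit message. */
#include <stdint.h>

extern uint32_t tlb_seqno;
extern void tlb_invalidate_full(void);

/*
 * Unbind with an optional out-parameter: for an async unbind (@tlb
 * non-NULL) just record the current seqno and defer the flush; for a
 * synchronous unbind (@tlb == NULL) flush immediately as before.
 */
static void vma_unbind(uint32_t *tlb)
{
	/* ... PTEs are removed here ... */

	if (tlb)
		*tlb = tlb_seqno;	/* checked again at page release */
	else
		tlb_invalidate_full();
}

/* Later, just before the backing pages are actually released. */
static void release_pages(uint32_t tlb)
{
	if (tlb == tlb_seqno)	/* nobody has flushed since the unbind */
		tlb_invalidate_full();
}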