linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-09-07	drm/i915/gem: Reduce context termination list iteration guard to RCU	Chris Wilson
	As we now protect the timeline list using RCU, we can drop the timeline->mutex for guarding the list iteration during context close, as we are searching for an inflight request. Any new request will see the context is banned and not be submitted. In doing so, pull the checks for a concurrent submission of the request (notably the i915_request_completed()) under the engine spinlock, to fully serialise with __i915_request_submit()). That is in the case of preempt-to-busy where the request may be completed during the __i915_request_submit(), we need to be careful that we sample the request status after serialising so that we don't miss the request the engine is actually submitting. Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()") References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622 References: https://gitlab.freedesktop.org/drm/intel/-/issues/2158 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806105954.7766-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/selftests: Prevent selecting 0 for our random width/align	Chris Wilson
	When igt_random_offset() is a given a range of [0, PAGE_SIZE], it is allowed to return 0. However, attempting to use a size of 0 for the igt_lmem_write_cpu() byte poking, leads to call igt_random_offset() with a range of [offset, offset + 0] and ask it to find a length of 4 within it. This triggers the bug on that the requested length should fit within the range! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806145728.16495-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Hold context/request reference while breadcrumbs are active	Chris Wilson
	Currently we hold no actual reference to the request nor context while they are attached to a breadcrumb. To avoid freeing the request/context too early, we serialise with cancel-breadcrumbs by taking the irq spinlock in i915_request_retire(). The alternative is to take a reference for a new breadcrumb and release it upon signaling; removing the more frequently hit contention point in i915_request_retire(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-2-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Rebased and reordered into drm-intel-gt-next branch] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Move intel_breadcrumbs_arm_irq earlier	Chris Wilson
	Move the __intel_breadcrumbs_arm_irq earlier, next to the disarm_irq, so that we can make use of it in the following patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200801160225.6814-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Shrink i915_page_directory's slab bucket	Chris Wilson
	kmalloc uses power-of-two slab buckets for small allocations (up to a few pages). Since i915_page_directory is a page of pointers, plus a couple more, this is rounded up to 8K, and we waste nearly 50% of that allocation. Long terms this leads to poor memory utilisation, bloating the kernel footprint, but the problem is exacerbated by our conservative preallocation scheme for binding VMA. As we are required to allocate all levels for each vma just in case we need to insert them upon binding, this leads to a large multiplication factor for a single page vma. By halving the allocation we need for the page directory structure, we halve the impact of that factor, bringing workloads that once fitted into memory, hopefully back to fitting into memory. We maintain the split between i915_page_directory and i915_page_table as we only need half the allocation for the lowest, most populous, level. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Switch to object allocations for page directories	Chris Wilson
	The GEM object is grossly overweight for the practicality of tracking large numbers of individual pages, yet it is currently our only abstraction for tracking DMA allocations. Since those allocations need to be reserved upfront before an operation, and that we need to break away from simple system memory, we need to ditch using plain struct page wrappers. In the process, we drop the WC mapping as we ended up clflushing everything anyway due to various issues across a wider range of platforms. Though in a future step, we need to drop the kmap_atomic approach which suggests we need to pre-map all the pages and keep them mapped. v2: Verify our large scratch page is suitably DMA aligned; and manually clear the scratch since we are allocating plain struct pages full of prior content. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Preallocate stashes for vma page-directories	Chris Wilson
	We need to make the DMA allocations used for page directories to be performed up front so that we can include those allocations in our memory reservation pass. The downside is that we have to assume the worst case, even before we know the final layout, and always allocate enough page directories for this object, even when there will be overlap. This unfortunately can be quite expensive, especially as we have to clear/reset the page directories and DMA pages, but it should only be required during early phases of a workload when new objects are being discovered, or after memory/eviction pressure when we need to rebind. Once we reach steady state, the objects should not be moved and we no longer need to preallocating the pages tables. It should be noted that the lifetime for the page directories DMA is more or less decoupled from individual fences as they will be shared across objects across timelines. v2: Only allocate enough PD space for the PTE we may use, we do not need to allocate PD that will be left as scratch. v3: Store the shift unto the first PD level to encapsulate the different PTE counts for gen6/gen8. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs	Chris Wilson
	On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling, active requests are passed to a physical engine and associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try and enable interrupts and the powermanagement required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for interrupt handler on a particular breadcrumb. v2: Drop b->irq_armed = true mocking for no interrupt HW Fixes: 4fe6abb8f513 ("drm/i915/gt: Ignore irq enabling on the virtual engines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Only transfer the virtual context to the new engine if active	Chris Wilson
	One more complication of preempt-to-busy with respect to the virtual engine is that we may have retired the last request along the virtual engine at the same time as preparing to submit the completed request to a new engine. That submit will be shortcircuited, but not before we have updated the context with the new register offsets and marked the virtual engine as bound to the new engine (by calling swap on ve->siblings[]). As we may have just retired the completed request, we may also be in the middle of calling virtual_context_exit() to turn off the power management associated with the virtual engine, and that in turn walks the ve->siblings[]. If we happen to call swap() on the array as we walk, we will call intel_engine_pm_put() twice on the same engine. In this patch, we prevent this by only updating the bound engine after a successful submission which weeds out the already completed requests. Alternatively, we could walk a non-volatile array for the pm, such as using the engine->mask. The small advantage to performing the update after the submit is that we then only have to do a swap for active requests. Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") References: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine" Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Nayana, Venkata Ramana" <venkata.ramana.nayana@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs	Chris Wilson
	After staring at the breadcrumb enabling/cancellation and coming to the conclusion that the cause of the mysterious stale breadcrumbs must the act of submitting a completed requests, we can then redirect those completed requests onto a dedicated signaled_list at the time of construction and so eliminate intel_engine_transfer_stale_breadcrumbs(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-2-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs	Chris Wilson
	Since the breadcrumb enabling/cancelling itself is serialised by the breadcrumbs.irq_lock, with a bit of care we can remove the outer serialisation with i915_request.lock for concurrent dma_fence_enable_signaling(). This has the important side-effect of eliminating the nested i915_request.lock within request submission. The challenge in serialisation is around the unsubmission where we take an active request that wants a breadcrumb on the signaling engine and put it to sleep. We do not want a concurrent dma_fence_enable_signaling() to attach a breadcrumb as we unsubmit, so we must mark the request as no longer active before serialising with the concurrent enable-signaling. On retire, we serialise with the concurrent enable-signaling, but instead of clearing ACTIVE, we mark it as SIGNALED. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Rebased and reordered into drm-intel-gt-next branch] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Provide a fastpath for waiting on vma bindings	Chris Wilson
	Before we can execute a request, we must wait for all of its vma to be bound. This is a frequent operation for which we can optimise away a few atomic operations (notably a cmpxchg) in lieu of the RCU protection. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-7-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier()	Chris Wilson
	As the conversion between idle-barrier and full i915_active_fence is already serialised by explicit memory barriers, we can reduce the spinlock in i915_active_acquire_preallocate_barrier() for finding an idle-barrier to reuse to an RCU read lock to ensure the fence remains valid, only taking the spinlock for the update of the rbtree itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-6-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Make the stale cached active node available for any timeline	Chris Wilson
	Rather than require the next timeline after idling to match the MRU before idling, reset the index on the node and allow it to match the first request. However, this requires cmpxchg(u64) and so is not trivial on 32b, so for compatibility we just fallback to keeping the cached node pointing to the MRU timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-5-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Keep the most recently used active-fence upon discard	Chris Wilson
	Whenever an i915_active idles, we prune its tree of old fence slots to prevent a gradual leak should it be used to track many, many timelines. The downside is that we then have to frequently reallocate the rbtree. A compromise is that we keep the most recently used fence slot, and reuse that for the next active reference as that is the most likely timeline to be reused. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-4-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Export a preallocate variant of i915_active_acquire()	Chris Wilson
	Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lockfree lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Skip taking acquire mutex for no ref->active callback	Chris Wilson
	If no active callback is defined for i915_active, we do not need to serialise its enabling with the mutex. We still do only want to call the debug activate once, and must still serialise with a concurrent retire. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-2-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/selftests: Drop stale timeline constructor assert	Chris Wilson
	Since we pass around encoded parameters to the kernel context constructor using the ce->timeline pointer, we can no longer assert that it should be zero for mock timeline construction. Fixes: d1bf5dd8f6d5 ("drm/i915/gt: Support multiple pinned timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731102206.6793-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Updated Fixes: link after rebasing and reordering into drm-intel-gt-next branch] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Pull release of node->age under the spinlock	Chris Wilson
	We need to ensure that the list is valid prior to marking the node as retrievable, otherwise we may see two threads compete over the same node in intel_gt_get_buffer_pool(). If the first thread acquires and releases the node in the same jiffie, the second thread may then acquire it (as the jiffie now again matches the expected value) and claim the node before it is put back into the list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200730134049.8822-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Support multiple pinned timelines	Chris Wilson
	We may need to allocate more than one pinned context/timeline for each engine which can utilise the per-engine HWSP, so we need to give each a different offset within it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200730183906.25422-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gem: Delay tracking the GEM context until it is registered	Chris Wilson
	Avoid exposing a partially constructed context by deferring the list_add() from the initial construction to the end of registration. Otherwise, if we peek into the list of contexts from inside debugfs, we may see the partially constructed context and chase down some dangling incomplete pointers. Reported-by: CQ Tang <cq.tang@intel.com> Fixes: 3aa9945a528e ("drm/i915: Separate GEM context construction and registration to userspace") References: f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: CQ Tang <cq.tang@intel.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200730092856.23615-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Fix termination condition for freeing all buffer objects	Chris Wilson
	A last minute change, that unfortunately broke CI so badly it declared SUCCESS, was to refactor the debug free all buffer pool code to reuse the normal worker, inverted the termination condition so that it instead of discarding the nodes, they were all declared young enough and eligible for reuse. Fixes: 06b73c2d0b65 ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729110756.2344-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Updating Fixes: link after rebasing and reordering into drm-intel-gt-next] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/selftests: Flush the active barriers before asserting	Chris Wilson
	Before we peek at the barrier status for an assert, first serialise with its callbacks so that we see a stable value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728153325.28351-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool	Chris Wilson
	Some very low hanging fruit, but contention on the pool->lock is noticeable between intel_gt_get_buffer_pool() and pool_retire(), with the majority of the hold time due to the locked list iteration. If we make the node itself RCU protected, we can perform the search for an suitable node just under RCU, reserving taking the lock itself for claiming the node and manipulating the list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200729080245.8070-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gt: Disable preparser around xcs invalidations on tgl	Chris Wilson
	Unlike rcs where we have conclusive evidence from our selftesting that disabling the preparser before performing the TLB invalidate and relocations does impact upon the GPU execution, the evidence for the same requirement on xcs is much more circumstantial. Let's apply the preparser disable between batches as we invalidate the TLB as a dose of healthy paranoia, just in case. References: https://gitlab.freedesktop.org/drm/intel/-/issues/2169 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152110.830-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/gem: Remove disordered per-file request list for throttling	Chris Wilson
	I915_GEM_THROTTLE dates back to the time before contexts where there was just a single engine, and therefore a single timeline and request list globally. That request list was in execution/retirement order, and so walking it to find a particular aged request made sense and could be split per file. That is no more. We now have many timelines with a file, as many as the user wants to construct (essentially per-engine, per-context). Each of those run independently and so make the single list futile. Remove the disordered list, and iterate over all the timelines to find a request to wait on in each to satisfy the criteria that the CPU is no more than 20ms ahead of its oldest request. It should go without saying that the I915_GEM_THROTTLE ioctl is no longer used as the primary means of throttling, so it makes sense to push the complication into the ioctl where it only impacts upon its few irregular users, rather than the execbuf/retire where everybody has to pay the cost. Fortunately, the few users do not create vast amount of contexts, so the loops over contexts/engines should be concise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Soften the tasklet flush frequency before waits	Chris Wilson
	We include a tasklet flush before waiting on a request as a precaution against the HW being lax in event signaling. We now have a precautionary flush in the engine's heartbeat and so do not need to be quite so zealous on every request wait. If we focus on the request, the only tasklet flush that matters is if there is a delay in submitting this request to HW, so if the request is not ready to be executed, no advantage in reducing this wait can be gained by running the tasklet. And there is little point in doing busy work for no result. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200715115147.11866-10-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915/selftests: Mock the status_page.vma for the kernel_context	Chris Wilson
	Since we assert that the kernel_context is using the perma-pinned HWSP, make it so. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2179 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200715155858.16410-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/i915: Reduce i915_request.lock contention for i915_request_wait	Chris Wilson
	Currently, we use i915_request_completed() directly in i915_request_wait() and follow up with a manual invocation of dma_fence_signal(). This appears to cause a large number of contentions on i915_request.lock as when the process is woken up after the fence is signaled by an interrupt, we will then try and call dma_fence_signal() ourselves while the signaler is still holding the lock. dma_fence_is_signaled() has the benefit of checking the DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so avoids most of that contention. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200716100754.5670-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07	drm/bridge: dw-mipi-dsi.c: Add VPG runtime config through debugfs	Angelo Ribeiro
	Add support for the video pattern generator (VPG) BER pattern mode and configuration in runtime. This enables using the debugfs interface to manipulate the VPG after the pipeline is set. Also, enables the usage of the VPG BER pattern. Changes in v2: - Added VID_MODE_VPG_MODE - Solved incompatible return type on __get and __set Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Adrian Pop <pop.adrian61@gmail.com> Signed-off-by: Angelo Ribeiro <angelo.ribeiro@synopsys.com> Tested-by: Yannick Fertre <yannick.fertre@st.com> Tested-by: Adrian Pop <pop.adrian61@gmail.com> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Cc: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Jose Abreu <jose.abreu@synopsys.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/a809feb7d7153a92e323416f744f1565e995da01.1586180592.git.angelo.ribeiro@synopsys.com
2020-09-07	drm/bridge/synopsys: dsi: add support for non-continuous HS clock	Antonio Borneo
	Current code enables the HS clock when video mode is started or to send out a HS command, and disables the HS clock to send out a LP command. This is not what DSI spec specify. Enable HS clock either in command and in video mode. Set automatic HS clock management for panels and devices that support non-continuous HS clock. Signed-off-by: Antonio Borneo <antonio.borneo@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701194234.18123-1-yannick.fertre@st.com
2020-09-07	drm/bridge/synopsys: dsi: allow sending longer LP commands	Antonio Borneo
	Current code does not properly computes the max length of LP commands that can be send during H or V sync, and rely on static values. Limiting the max LP length to 4 byte during the V-sync is overly conservative. Relax the limit and allows longer LP commands (16 bytes) to be sent during V-sync. Signed-off-by: Antonio Borneo <antonio.borneo@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701143131.841-1-yannick.fertre@st.com
2020-09-07	drm/bridge/synopsys: dsi: allow LP commands in video mode	Antonio Borneo
	Current code only sends LP commands in command mode. Allows sending LP commands also in video mode by setting the proper flag in DSI_VID_MODE_CFG. Signed-off-by: Antonio Borneo <antonio.borneo@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200708140836.32418-1-yannick.fertre@st.com
2020-09-06	drm/panel: s6e63m0: Fix up DRM_DEV* regression	Linus Walleij
	Ooops the panel drivers stopped to use DRM_DEV* messages and we predictably create errors by merging code that still use it. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: David Airlie <airlied@linux.ie> Link: https://patchwork.freedesktop.org/patch/msgid/20200906132903.5739-1-linus.walleij@linaro.org
2020-09-06	Merge tag 'for-linus-5.9-rc4-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "A small series for fixing a problem with Xen PVH guests when running as backends (e.g. as dom0). Mapping other guests' memory is now working via ZONE_DEVICE, thus not requiring to abuse the memory hotplug functionality for that purpose" * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: add helpers to allocate unpopulated memory memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC xen/balloon: add header guard
2020-09-06	drm/i915: panel: Use atomic PWM API for devs with an external PWM controller	Hans de Goede
	Now that the PWM drivers which we use have been converted to the atomic PWM API, we can move the i915 panel code over to using the atomic PWM API. The removes a long standing FIXME and this removes a flicker where the backlight brightness would jump to 100% when i915 loads even if using the fastset path. Note that this commit also simplifies pwm_disable_backlight(), by dropping the intel_panel_actually_set_backlight(..., 0) call. This call sets the PWM to 0% duty-cycle. I believe that this call was only present as a workaround for a bug in the pwm-crc.c driver where it failed to clear the PWM_OUTPUT_ENABLE bit. This is fixed by an earlier patch in this series. After the dropping of this workaround, the usleep call, which seems unnecessary to begin with, has no useful effect anymore, so drop that too. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-18-hdegoede@redhat.com
2020-09-06	drm/i915: panel: Honor the VBT PWM min setting for devs with an external PWM ↵	Hans de Goede
	controller So far for devices using an external PWM controller (devices using pwm_setup_backlight()), we have been hardcoding the minimum allowed PWM level to 0. But several of these devices specify a non 0 minimum setting in their VBT. Change pwm_setup_backlight() to use get_backlight_min_vbt() to get the minimum level. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-17-hdegoede@redhat.com
2020-09-06	drm/i915: panel: Honor the VBT PWM frequency for devs with an external PWM ↵	Hans de Goede
	controller So far for devices using an external PWM controller (devices using pwm_setup_backlight()), we have been hardcoding the period-time passed to pwm_config() to 21333 ns. I suspect this was done because many VBTs set the PWM frequency to 200 which corresponds to a period-time of 5000000 ns, which greatly exceeds the PWM_MAX_PERIOD_NS define in the Crystal Cove PMIC PWM driver, which used to be 21333. This PWM_MAX_PERIOD_NS define was actually based on a bug in the PWM driver where its period and duty-cycle times where off by a factor of 256. Due to this bug the hardcoded CRC_PMIC_PWM_PERIOD_NS value of 21333 would result in the PWM driver using its divider of 128, which would result in a PWM output frequency of 6000000 Hz / 256 / 128 = 183 Hz. So actually pretty close to the default VBT value of 200 Hz. Now that this bug in the pwm-crc driver is fixed, we can actually use the VBT defined frequency. This is important because: a) With the pwm-crc driver fixed it will now translate the hardcoded CRC_PMIC_PWM_PERIOD_NS value of 21333 ns / 46 Khz to a PWM output frequency of 23 KHz (the max it can do). b) The pwm-lpss driver used on many models has always honored the 21333 ns / 46 Khz request Some panels do not like such high output frequencies. E.g. on a Terra Pad 1061 tablet, using the LPSS PWM controller, the backlight would go from off to max, when changing the sysfs backlight brightness value from 90-100%, anything under aprox. 90% would turn the backlight fully off. Honoring the VBT specified PWM frequency will also hopefully fix the various bug reports which we have received about users perceiving the backlight to flicker after a suspend/resume cycle. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-16-hdegoede@redhat.com
2020-09-06	drm/i915: panel: Add get_vbt_pwm_freq() helper	Hans de Goede
	Factor the code which checks and drm_dbg_kms-s the VBT PWM frequency out of get_backlight_max_vbt(). This is a preparation patch for honering the VBT PWM frequency for devices which use an external PWM controller (devices using pwm_setup_backlight()). Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-15-hdegoede@redhat.com
2020-09-06	phy: mediatek: Move mtk_hdmi_phy driver into drivers/phy/mediatek folder	CK Hu
	mtk_hdmi_phy is currently placed inside mediatek drm driver, but it's more suitable to place a phy driver into phy driver folder, so move mtk_hdmi_phy driver into phy driver folder. Signed-off-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Acked-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Tested-by: Frank Wunderlich <frank-w@public-files.de>
2020-09-06	drm/mediatek: Separate mtk_hdmi_phy to an independent module	CK Hu
	mtk_hdmi_phy is a part of mtk_hdmi module, but phy driver should be an independent module rather than be part of drm module, so separate the phy driver to an independent module. Signed-off-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Tested-by: Frank Wunderlich <frank-w@public-files.de>
2020-09-06	drm/mediatek: Move tz_disabled from mtk_hdmi_phy to mtk_hdmi driver	CK Hu
	tz_disabled is used to control mtk_hdmi output signal, but this variable is stored in mtk_hdmi_phy and mtk_hdmi_phy does not use it. So move tz_disabled to mtk_hdmi where it's used. Signed-off-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Tested-by: Frank Wunderlich <frank-w@public-files.de>
2020-09-05	drm: xlnx: dpsub: Fix DMADEVICES Kconfig dependency	Laurent Pinchart
	The dpsub driver uses the DMA engine API, and thus selects DMA_ENGINE to provide that API. DMA_ENGINE depends on DMADEVICES, which can be deselected by the user, creating a possibly unmet indirect dependency: WARNING: unmet direct dependencies detected for DMA_ENGINE Depends on [n]: DMADEVICES [=n] Selected by [m]: - DRM_ZYNQMP_DPSUB [=m] && HAS_IOMEM [=y] && (ARCH_ZYNQMP \|\| COMPILE_TEST [=y]) && COMMON_CLK [=y] && DRM [=m] && OF [=y] Add a dependency on DMADEVICES to fix this. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Randy Dunlap <rdunlap@infradead.org>
2020-09-05	drm/panel: s6e63m0: Order enable/disable sequence	Linus Walleij
	The upstream S6E63M0 driver has some peculiarities around the prepare/enable disable/unprepare sequence: the screen is taken out of sleep in prepare() as part of s6e63m0_init() the put to on with MIPI_DCS_SET_DISPLAY_ON in enable(). However it is just put into sleep mode directly in disable(). As disable()/enable() can be called without unprepare()/prepare() being called, this is unbalanced, we should take the display out of sleep in enable() then turn it off(). Further MIPI_DCS_SET_DISPLAY_OFF is never called balanced with MIPI_DCS_SET_DISPLAY_ON. The vendor driver for Samsung GT-I8190 (Golden) does all of these things in strict order. Augment the driver to do exit sleep/set display on in enable() and set display off/enter sleep in disable(). Further send an explicit reset pulse in power_on() so we come up in a known state, and issue the MCS_ERROR_CHECK command after setting display on like the vendor driver does. Also use the timings from the vendor driver in the sequence. Doing all of these things makes the display much more stable on the Samsung GT-I8190 when enabling/disabling the display pipeline. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Cc: Stephan Gerhold <stephan@gerhold.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200817213906.88207-1-linus.walleij@linaro.org
2020-09-05	drm/panel: s6e63m0: Add code to identify panel	Linus Walleij
	We add code to identify a few different panels mounted on the s6e63m0 controller. This is necessary to achieve the proper biasing with DSI versions of the panel. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Stephan Gerhold <stephan@gerhold.net> Cc: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Acked-by: Paul Cercueil <paul@crapouillou.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200809215104.1830206-5-linus.walleij@linaro.org
2020-09-05	drm/panel: s6e63m0: Add reading functionality	Linus Walleij
	This adds code to send read commands to read a single byte from the display, in order to perform MTP ID look-up of the mounted panel on the s6e63m0 controller. This is needed for proper biasing on the DSI variants. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Stephan Gerhold <stephan@gerhold.net> Cc: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Acked-by: Paul Cercueil <paul@crapouillou.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200809215104.1830206-4-linus.walleij@linaro.org
2020-09-05	drm/panel: s6e63m0: Add DSI transport	Linus Walleij
	This makes it possible to use the s6e63m0 panel with a DSI host, such as in the Samsung GT-I8190 (Golden) mobile phone. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Stephan Gerhold <stephan@gerhold.net> Cc: Stephan Gerhold <stephan@gerhold.net> Cc: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Acked-by: Paul Cercueil <paul@crapouillou.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200809215104.1830206-3-linus.walleij@linaro.org
2020-09-05	drm/panel: s6e63m0: Break out SPI transport	Linus Walleij
	This panel can be accessed using both SPI and DSI. To make it possible to probe and use the device also from a DSI bus, first break out the SPI support to its own file. Since all the panel driver does is write DCS commands to the panel, we pass a DCS write function to probe() from each subdriver. We make the Kconfig entry for SPI mode default so all current users will continue to work. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Stephan Gerhold <stephan@gerhold.net> Cc: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Acked-by: Paul Cercueil <paul@crapouillou.net> Link: https://patchwork.freedesktop.org/patch/384873/
2020-09-04	drm/msm/adreno: remove return value of function XX_print	Bernard Zhao
	XX_print like pfp_print/me_print/meq_print/roq_print are just used in file a5xx_debugfs.c. And these function always return 0, this return value is meaningless. This change is to make the code a bit more readable. Signed-off-by: Bernard Zhao <bernard@vivo.com> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-09-04	drm/msm: drop cache sync hack	Rob Clark
	Now that it isn't causing problems to use dma_map/unmap, we can drop the hack of using dma_sync in certain cases. Signed-off-by: Rob Clark <robdclark@chromium.org>