summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/i915_gem_userptr.c
AgeCommit message (Collapse)Author
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/mm.h> We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/mm.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. The APIs that are going to be moved first are: mm_alloc() __mmdrop() mmdrop() mmdrop_async_fn() mmdrop_async() mmget_not_zero() mmput() mmput_async() get_task_mm() mm_access() mm_release() Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-27mm: use mmget_not_zero() helperVegard Nossum
We already have the helper, we can convert the rest of the kernel mechanically using: git grep -l 'atomic_inc_not_zero.*mm_users' | xargs sed -i 's/atomic_inc_not_zero(&\(.*\)->mm_users)/mmget_not_zero\(\1\)/' This is needed for a later patch that hooks into the helper, but might be a worthwhile cleanup on its own. Link: http://lkml.kernel.org/r/20161218123229.22952-3-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27mm: add new mmgrab() helperVegard Nossum
Apart from adding the helper function itself, the rest of the kernel is converted mechanically using: git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_count);/mmgrab\(\1\);/' git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_count);/mmgrab\(\&\1\);/' This is needed for a later patch that hooks into the helper, but might be a worthwhile cleanup on its own. (Michal Hocko provided most of the kerneldoc comment.) Link: http://lkml.kernel.org/r/20161218123229.22952-1-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-04Merge tag 'v4.10-rc2' into drm-intel-next-queuedDaniel Vetter
Backmerge Linux 4.10-rc2 to resync with our -fixes cherry-picks. I've done the backmerge directly because Dave is on vacation. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-12-14mm: add locked parameter to get_user_pages_remote()Lorenzo Stoakes
Patch series "mm: unexport __get_user_pages_unlocked()". This patch series continues the cleanup of get_user_pages*() functions taking advantage of the fact we can now pass gup_flags as we please. It firstly adds an additional 'locked' parameter to get_user_pages_remote() to allow for its callers to utilise VM_FAULT_RETRY functionality. This is necessary as the invocation of __get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of this and no other existing higher level function would allow it to do so. Secondly existing callers of __get_user_pages_unlocked() are replaced with the appropriate higher-level replacement - get_user_pages_unlocked() if the current task and memory descriptor are referenced, or get_user_pages_remote() if other task/memory descriptors are referenced (having acquiring mmap_sem.) This patch (of 2): Add a int *locked parameter to get_user_pages_remote() to allow VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked(). Taking into account the previous adjustments to get_user_pages*() functions allowing for the passing of gup_flags, we are now in a position where __get_user_pages_unlocked() need only be exported for his ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to subsequently unexport __get_user_pages_unlocked() as well as allowing for future flexibility in the use of get_user_pages_remote(). [sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change] Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-01drm/i915: Make GEM object alloc/free and stolen created take dev_privTvrtko Ursulin
Where it is more appropriate and also to be consistent with the direction of the driver. v2: Leave out object alloc/free inlining. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-11drm/i915: Assorted dev_priv cleanupsTvrtko Ursulin
A small selection of macros which can only accept dev_priv from now on and a resulting trickle of fixups. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-01drm/i915: Allow shrinking of userptr objects once againTvrtko Ursulin
Commit 1bec9b0bda3d ("drm/i915/shrinker: Only shmemfs objects are backed by swap") stopped considering the userptr objects in shrinker callbacks. Restore that so idle userptr objects can be discarded in order to free up memory. One change further to what was introduced in 1bec9b0bda3d is to start considering userptr objects in oom but that should also be a correct thing to do. v2: Introduce I915_GEM_OBJECT_IS_SHRINKABLE. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 1bec9b0bda3d ("drm/i915/shrinker: Only shmemfs objects are backed by swap") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: <stable@vger.kernel.org> Link: http://patchwork.freedesktop.org/patch/msgid/1478011450-6634-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-11-01drm/i915: Improve lockdep tracking for obj->mm.lockChris Wilson
The shrinker may appear to recurse into obj->mm.lock as the shrinker may be called from a direct reclaim path whilst handling get_pages. We filter out recursing on the same obj->mm.lock by inspecting obj->mm.pages, but we do want to take the lock on a second object in order to reap their pages. lockdep spots the recursion on the same lockclass and needs annotation to avoid a false positive. To keep the two paths distinct, create an enum to indicate which subclass of obj->mm.lock we are using. This removes the false positive and avoids masking real bugs. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161101121134.27504-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-10-28drm/i915: Use lockless object freeChris Wilson
Having moved the locked phase of freeing an object to a separate worker, we can now declare to the core that we only need the unlocked variant of driver->gem_free_object, and can use the simple unreference internally. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-20-chris@chris-wilson.co.uk
2016-10-28drm/i915: Move object backing storage manipulation to its own lockingChris Wilson
Break the allocation of the backing storage away from struct_mutex into a per-object lock. This allows parallel page allocation, provided we can do so outside of struct_mutex (i.e. set-domain-ioctl, pwrite, GTT fault), i.e. before execbuf! The increased cost of the atomic counters are hidden behind i915_vma_pin() for the typical case of execbuf, i.e. as the object is typically bound between execbufs, the page_pin_count is static. The cost will be felt around set-domain and pwrite, but offset by the improvement from reduced struct_mutex contention. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-14-chris@chris-wilson.co.uk
2016-10-28drm/i915: Pass around sg_table to get_pages/put_pages backendChris Wilson
The plan is to move obj->pages out from under the struct_mutex into its own per-object lock. We need to prune any assumption of the struct_mutex from the get_pages/put_pages backends, and to make it easier we pass around the sg_table to operate on rather than indirectly via the obj. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
2016-10-28drm/i915: Refactor object page APIChris Wilson
The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
2016-10-28drm/i915: Use a radixtree for random access to the object's backing storageChris Wilson
A while ago we switched from a contiguous array of pages into an sglist, for that was both more convenient for mapping to hardware and avoided the requirement for a vmalloc array of pages on every object. However, certain GEM API calls (like pwrite, pread as well as performing relocations) do desire access to individual struct pages. A quick hack was to introduce a cache of the last access such that finding the following page was quick - this works so long as the caller desired sequential access. Walking backwards, or multiple callers, still hits a slow linear search for each page. One solution is to store each successful lookup in a radix tree. v2: Rewrite building the radixtree for clarity, hopefully. v3: Rearrange execbuf to avoid calling i915_gem_object_get_sg() from within an atomic section and so relax the allocation context to a simple GFP_KERNEL and mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-10-chris@chris-wilson.co.uk
2016-10-28drm/i915: Rearrange i915_wait_request() accounting with callersChris Wilson
Our low-level wait routine has evolved from our generic wait interface that handled unlocked, RPS boosting, waits with time tracking. If we push our GEM fence tracking to use reservation_objects (required for handling multiple timelines), we lose the ability to pass the required information down to i915_wait_request(). However, if we push the extra functionality from i915_wait_request() to the individual callsites (i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use of those extras, we can both simplify our low level wait and prepare for extending the GEM interface for use of reservation_objects. v2: Rewrite i915_wait_request() kerneldocs Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-4-chris@chris-wilson.co.uk
2016-10-19mm: replace get_user_pages_remote() write/force parameters with gup_flagsLorenzo Stoakes
This removes the 'write' and 'force' from get_user_pages_remote() and replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers as use of this flag can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-09drm/i915: Expand bool interruptible to pass flags to i915_wait_request()Chris Wilson
We need finer control over wakeup behaviour during i915_wait_request(), so expand the current bool interruptible to a bitmask. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk
2016-08-18drm/i915/userptr: Make gup errors stickierChris Wilson
Keep any error reported by the gup_worker until we are notified that the arena has changed (via the mmu-notifier). This has the importance of making two consecutive calls to i915_gem_object_get_pages() reporting the same error, and curtailing a loop of detecting a fault and requeueing a gup_worker. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-19-chris@chris-wilson.co.uk
2016-08-05drm/i915/userptr: Remove superfluous interruptible=false on waitingChris Wilson
Inside the kthread context, we can't be interrupted by signals so touching the mm.interruptible flag is pointless and wait-request now consumes EIO itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-4-git-send-email-chris@chris-wilson.co.uk
2016-08-05drm/i915: Convert non-blocking userptr waits for requests over to using RCUChris Wilson
We can completely avoid taking the struct_mutex around the non-blocking waits by switching over to the RCU request management (trading the mutex for a RCU read lock and some complex atomic operations). The improvement is that we gain further contention reduction, and overall the code become simpler due to the reduced mutex dancing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-3-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Move obj->active:5 to obj->flagsChris Wilson
We are motivated to avoid using a bitfield for obj->active for a couple of reasons. Firstly, we wish to document our lockless read of obj->active using READ_ONCE inside i915_gem_busy_ioctl() and that requires an integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code when presented with a bitfield and that shows up high on the profiles of request tracking (mainly due to excess memory traffic as it converts the bitfield to a register and back and generates frequent AGI in the process). v2: BIT, break up a long line in compute the other engines, new paint for i915_gem_object_is_active (now i915_gem_object_get_active). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-23-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: s/__i915_wait_request/i915_wait_request/Chris Wilson
There is only one wait on request function now, so drop the "expert" indication of leading __. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-21-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Mark up i915_gem_active for locking annotationChris Wilson
The future annotations will track the locking used for access to ensure that it is always sufficient. We make the preparations now to present the API ahead and to make sure that GCC can eliminate the unused parameter. Before: 6298417 3619610 696320 10614347 a1f64b vmlinux After: 6298417 3619610 696320 10614347 a1f64b vmlinux (with i915 builtin) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-12-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Prepare i915_gem_active for annotationsChris Wilson
In the future, we will want to add annotations to the i915_gem_active struct. The API is thus expanded to hide direct access to the contents of i915_gem_active and mediated instead through a number of helpers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-11-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Introduce i915_gem_active for request trackingChris Wilson
In the next patch, request tracking is made more generic and for that we need a new expanded struct and to separate out the logic changes from the mechanical churn, we split out the structure renaming into this patch. v2: Writer's block. Add some spiel about why we track requests. v3: Now i915_gem_active. v4: Now with i915_gem_active_set() for attaching to the active request. v5: Use i915_gem_active_set() from inside the retirement handlers Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-10-git-send-email-chris@chris-wilson.co.uk
2016-08-04drm/i915: Be more careful when unbinding vmaChris Wilson
When we call i915_vma_unbind(), we will wait upon outstanding rendering. This will also trigger a retirement phase, which may update the object lists. If, we extend request tracking to the VMA itself (rather than keep it at the encompassing object), then there is a potential that the obj->vma_list be modified for other elements upon i915_vma_unbind(). As a result, if we walk over the object list and call i915_vma_unbind(), we need to be prepared for that list to change. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-8-git-send-email-chris@chris-wilson.co.uk
2016-07-20drm/i915: Rename drm_gem_object_unreference_unlocked in preparation for ↵Chris Wilson
lockless free Whilst this ultimately wraps kref_put_mutex(), our goal here is the lockless variant, so keep the _unlocked() suffix until we need it no more. s/drm_gem_object_unreference_unlocked/i915_gem_object_put_unlocked/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-7-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-6-git-send-email-chris@chris-wilson.co.uk
2016-07-20drm/i915: Rename drm_gem_object_unreference in preparation for lockless freeChris Wilson
Ultimately wraps kref_put(), so adopt its nomenclature for consistency with other subsystems. s/drm_gem_object_unreference/i915_gem_object_put/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-6-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-5-git-send-email-chris@chris-wilson.co.uk
2016-07-20drm/i915: Wrap drm_gem_object_reference in i915_gem_object_getChris Wilson
Ultimately wraps kref_get(), so adopt its nomenclature for consistency with other subsystems. s/drm_gem_object_reference/i915_gem_object_get/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-5-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-4-git-send-email-chris@chris-wilson.co.uk
2016-07-20drm/i915: Rename request reference/unreference to get/putChris Wilson
Now that we derive requests from struct fence, swap over to its nomenclature for references. It's shorter and more idiomatic across the kernel. s/i915_gem_request_reference/i915_gem_request_get/ s/i915_gem_request_unreference/i915_gem_request_put/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-2-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-1-git-send-email-chris@chris-wilson.co.uk
2016-05-20drm/i915: Introduce & use new lightweight SGL iteratorsDave Gordon
The existing for_each_sg_page() iterator is somewhat heavyweight, and is limiting i915 driver performance in a few benchmarks. So here we introduce somewhat lighter weight iterators, primarily for use with GEM objects or other case where we need only deal with whole aligned pages. Unlike the old iterator, the new iterators use an internal state structure which is not intended to be accessed by the caller; instead each takes as a parameter an output variable which is set before each iteration. This makes them particularly simple to use :) One of the new iterators provides the caller with the DMA address of each page in turn; the other provides the 'struct page' pointer required by many memory management operations. Various uses of for_each_sg_page() are then converted to the new macros. v2: Force inlining of the sg_iter constructor and make the union anonymous. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-4-git-send-email-chris@chris-wilson.co.uk
2016-05-19drm/i915/userptr: Convert to drm_i915_privateChris Wilson
userptr directly only uses drm_device in a single interface where it meant to use drm_i915_private (everywhere else we have to derive it from the drm_i915_gem_object and so require going from drm_device). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463671036-3235-1-git-send-email-chris@chris-wilson.co.uk
2016-04-14drm/i915: Prevent leaking of -EIO from i915_wait_request()Chris Wilson
Reporting -EIO from i915_wait_request() has proven very troublematic over the years, with numerous hard-to-reproduce bugs cropping up in the corner case of where a reset occurs and the code wasn't expecting such an error. If the we reset the GPU or have detected a hang and wish to reset the GPU, the request is forcibly complete and the wait broken. Currently, we report either -EAGAIN or -EIO in order for the caller to retreat and restart the wait (if appropriate) after dropping and then reacquiring the struct_mutex (essential to allow the GPU reset to proceed). However, if we take the view that the request is complete (no further work will be done on it by the GPU because it is dead and soon to be reset), then we can proceed with the task at hand and then drop the struct_mutex allowing the reset to occur. This transfers the burden of checking whether it is safe to proceed to the caller, which in all but one instance it is safe - completely eliminating the source of all spurious -EIO. Of note, we only have two API entry points where we expect that userspace can observe an EIO. First is when submitting an execbuf, if the GPU is terminally wedged, then the operation cannot succeed and an -EIO is reported. Secondly, existing userspace uses the throttle ioctl to detect an already wedged GPU before starting using HW acceleration (or to confirm that the GPU is wedged after an error condition). So if the GPU is wedged when the user calls throttle, also report -EIO. v2: Split more carefully the change to i915_wait_request() and assorted ABI from the reset handling. v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code so that we don't start to leak EIO there in future (and break our hang resistant modesetting). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
2016-04-14drm/i915: Store the reset counter when constructing a requestChris Wilson
As the request is only valid during the same global reset epoch, we can record the current reset_counter when constructing the request and reuse it when waiting upon that request in future. This removes a very hairy atomic check serialised by the struct_mutex at the time of waiting and allows us to transfer those waits to a central dispatcher for all waiters and all requests. PS: With per-engine resets, we obviously cannot assume a global reset epoch for the requests - a per-engine epoch makes the most sense. The challenge then is how to handle checking in the waiter for when to break the wait, as the fine-grained reset may also want to requeue the request (i.e. the assumption that just because the epoch changes the request is completed may be broken - or we just avoid breaking that assumption with the fine-grained resets). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-7-git-send-email-chris@chris-wilson.co.uk
2016-04-11drm/i915/userptr: Store i915 backpointer for i915_mm_structChris Wilson
Since we only ever use the drm_i915_private from the stored i915_mm_struct->dev, save some electrons by storing the right backpointer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-3-git-send-email-chris@chris-wilson.co.uk
2016-04-11drm/i915/userptr: Hold mmref whilst calling get-user-pagesChris Wilson
Holding a reference to the containing task_struct is not sufficient to prevent the mm_struct from being reaped under memory pressure. If this happens whilst we are calling get_user_pages(), explosions erupt - sometimes an immediate GPF, sometimes page flag corruption. To prevent the target mm from being reaped as we are reading from it, acquire a reference before we begin. Testcase: igt/gem_shrink/*userptr Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-2-git-send-email-chris@chris-wilson.co.uk
2016-04-11drm/i915/userptr: Flush cancellations before mmu-notifier invalidate returnsChris Wilson
In order to ensure that all invalidations are completed before the operation returns to userspace (i.e. before the munmap() syscall returns) we need to wait upon the outstanding operations. We are allowed to block inside the invalidate_range_start callback, and as struct_mutex is the inner lock with mmap_sem we can wait upon the struct_mutex without provoking lockdep into warning about a deadlock. However, we don't actually want to wait upon outstanding rendering whilst holding the struct_mutex if we can help it otherwise we also block other processes from submitting work to the GPU. So first we do a wait without the lock and then when we reacquire the lock, we double check that everything is ready for removing the invalidated pages. Finally to wait upon the outstanding unpinning tasks, we create a private workqueue as a means to conveniently wait upon all at once. The drawback is that this workqueue is per-mm, so any threads concurrently invalidating objects will wait upon each other. The advantage of using the workqueue is that we can wait in parallel for completion of rendering and unpinning of several objects (of particular importance if the process terminates with a whole mm full of objects). v2: Apply a cup of tea to the changelog. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94699 Testcase: igt/gem_userptr_blits/sync-unmap-cycles Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-04-11Merge tag 'v4.6-rc3' into drm-intel-next-queuedDaniel Vetter
Linux 4.6-rc3 Backmerge requested by Chris Wilson to make his patches apply cleanly. Tiny conflict in vmalloc.c with the (properly acked and all) patch in drm-intel-next: commit 4da56b99d99e5a7df2b7f11e87bfea935f909732 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Apr 4 14:46:42 2016 +0100 mm/vmap: Add a notifier for when we run out of vmap address space and Linus' tree. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-04-11drm,i915: Introduce drm_malloc_gfp()Chris Wilson
I have instances where I want to use drm_malloc_ab() but with a custom gfp mask. And with those, where I want a temporary allocation, I want to try a high-order kmalloc() before using a vmalloc(). So refactor my usage into drm_malloc_gfp(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: dri-devel@lists.freedesktop.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Dave Airlie <airlied@redhat.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-21Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "This is the main drm pull request for 4.6 kernel. Overall the coolest thing here for me is the nouveau maxwell signed firmware support from NVidia, it's taken a long while to extract this from them. I also wish the ARM vendors just designed one set of display IP, ARM display block proliferation is definitely increasing. Core: - drm_event cleanups - Internal API cleanup making mode_fixup optional. - Apple GMUX vga switcheroo support. - DP AUX testing interface Panel: - Refactoring of DSI core for use over more transports. New driver: - ARM hdlcd driver i915: - FBC/PSR (framebuffer compression, panel self refresh) enabled by default. - Ongoing atomic display support work - Ongoing runtime PM work - Pixel clock limit checks - VBT DSI description support - GEM fixes - GuC firmware scheduler enhancements amdkfd: - Deferred probing fixes to avoid make file or link ordering. amdgpu/radeon: - ACP support for i2s audio support. - Command Submission/GPU scheduler/GPUVM optimisations - Initial GPU reset support for amdgpu vmwgfx: - Support for DX10 gen mipmaps - Pageflipping and other fixes. exynos: - Exynos5420 SoC support for FIMD - Exynos5422 SoC support for MIPI-DSI nouveau: - GM20x secure boot support - adds acceleration for Maxwell GPUs. - GM200 support - GM20B clock driver support - Power sensors work etnaviv: - Correctness fixes for GPU cache flushing - Better support for i.MX6 systems. imx-drm: - VBlank IRQ support - Fence support - OF endpoint support msm: - HDMI support for 8996 (snapdragon 820) - Adreno 430 support - Timestamp queries support virtio-gpu: - Fixes for Android support. rockchip: - Add support for Innosilicion HDMI rcar-du: - Support for 4 crtcs - R8A7795 support - RCar Gen 3 support omapdrm: - HDMI interlace output support - dma-buf import support - Refactoring to remove a lot of legacy code. tilcdc: - Rewrite of pageflipping code - dma-buf support - pinctrl support vc4: - HDMI modesetting bug fixes - Significant 3D performance improvement. fsl-dcu (FreeScale): - Lots of fixes tegra: - Two small fixes sti: - Atomic support for planes - Improved HDMI support" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1063 commits) drm/amdgpu: release_pages requires linux/pagemap.h drm/sti: restore mode_fixup callback drm/amdgpu/gfx7: add MTYPE definition drm/amdgpu: removing BO_VAs shouldn't be interruptible drm/amd/powerplay: show uvd/vce power gate enablement for tonga. drm/amd/powerplay: show uvd/vce power gate info for fiji drm/amdgpu: use sched fence if possible drm/amdgpu: move ib.fence to job.fence drm/amdgpu: give a fence param to ib_free drm/amdgpu: include the right version of gmc header files for iceland drm/radeon: fix indentation. drm/amd/powerplay: add uvd/vce dpm enabling flag to fix the performance issue for CZ drm/amdgpu: switch back to 32bit hw fences v2 drm/amdgpu: remove amdgpu_fence_is_signaled drm/amdgpu: drop the extra fence range check v2 drm/amdgpu: signal fences directly in amdgpu_fence_process drm/amdgpu: cleanup amdgpu_fence_wait_empty v2 drm/amdgpu: keep all fences in an RCU protected array v2 drm/amdgpu: add number of hardware submissions to amdgpu_fence_driver_init_ring drm/amdgpu: RCU protected amd_sched_fence_release ...
2016-03-02drm/i915: Avoid snooping with userptr where not supportedTvrtko Ursulin
commit e5756c10d841ddb448293c849392f3d6b809561f Author: Imre Deak <imre.deak@intel.com> Date: Fri Aug 14 18:43:30 2015 +0300 drm/i915/bxt: don't allow cached GEM mappings on A stepping Added an exception of disallowing snooping for Broxton A stepping hardware but userptr was still enabling it regardless. Move the check to HAS_SNOOP now that it is used from multiple call sites and use it. v2: Userptr cannot be supported when it cannot be coherent and generalize the code better. (Chris Wilson) v3: Make has_snoop true only when !has_llc. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1456920631-34302-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-02-26drm/i915: Rename vma->*_list to *_link for consistencyChris Wilson
Elsewhere we have adopted the convention of using '_link' to denote elements in the list (and '_list' for the actual list_head itself), and that the name should indicate which list the link belongs to (and preferrably not just where the link is being stored). s/vma_link/obj_link/ (we iterate over obj->vma_list) s/mm_list/vm_link/ (we iterate over vm->[in]active_list) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-02-16mm/gup: Introduce get_user_pages_remote()Dave Hansen
For protection keys, we need to understand whether protections should be enforced in software or not. In general, we enforce protections when working on our own task, but not when on others. We call these "current" and "remote" operations. This patch introduces a new get_user_pages() variant: get_user_pages_remote() Which is a replacement for when get_user_pages() is called on non-current tsk/mm. We also introduce a new gup flag: FOLL_REMOTE which can be used for the "__" gup variants to get this new behavior. The uprobes is_trap_at_addr() location holds mmap_sem and calls get_user_pages(current->mm) on an instruction address. This makes it a pretty unique gup caller. Being an instruction access and also really originating from the kernel (vs. the app), I opted to consider this a 'remote' access where protection keys will not be enforced. Without protection keys, this patch should not change any behavior. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: jack@suse.cz Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210154.3F0E51EA@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-08drm/i915: Allow i915_gem_object_get_page() on userptr as wellChris Wilson
commit 033908aed5a596f6202c848c6bbc8a40fb1a8490 Author: Dave Gordon <david.s.gordon@intel.com> Date: Thu Dec 10 18:51:23 2015 +0000 drm/i915: mark GEM object pages dirty when mapped & written by the CPU introduced a check into i915_gem_object_get_dirty_pages() that returned a NULL pointer when called with a bad object, one that was not backed by shmemfs. This WARN was too strict as we can work on all struct page backed objects, and resulted in a WARN + GPF for existing userspace. In order to differentiate the various types of objects, add a new flags field to the i915_gem_object_ops struct to describe their capabilities, with the first flag being whether the object has struct pages. v2: Drop silly const before an integer in the structure declaration. Testcase: igt/gem_userptr_blits/relocations Reported-and-tested-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Tested-by: Michal Winiarski <michal.winiarski@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 033908aed5a5 ("drm/i915: mark GEM object pages dirty when mapped & written by the CPU") Link: http://patchwork.freedesktop.org/patch/msgid/1453487551-16799-1-git-send-email-chris@chris-wilson.co.uk (cherry picked from commit de4726649b6b1d7f3f02b2031ee99e067cb71e2d) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-02-03drm/i915: Allow i915_gem_object_get_page() on userptr as wellChris Wilson
commit 033908aed5a596f6202c848c6bbc8a40fb1a8490 Author: Dave Gordon <david.s.gordon@intel.com> Date: Thu Dec 10 18:51:23 2015 +0000 drm/i915: mark GEM object pages dirty when mapped & written by the CPU introduced a check into i915_gem_object_get_dirty_pages() that returned a NULL pointer when called with a bad object, one that was not backed by shmemfs. This WARN was too strict as we can work on all struct page backed objects, and resulted in a WARN + GPF for existing userspace. In order to differentiate the various types of objects, add a new flags field to the i915_gem_object_ops struct to describe their capabilities, with the first flag being whether the object has struct pages. v2: Drop silly const before an integer in the structure declaration. Testcase: igt/gem_userptr_blits/relocations Reported-and-tested-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Kristian Høgsberg Kristensen <krh@bitplanet.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Tested-by: Michal Winiarski <michal.winiarski@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453487551-16799-1-git-send-email-chris@chris-wilson.co.uk
2016-01-25drm/i915: Improve handling of overlapping objectsChris Wilson
The generic interval tree we use to speed up range invalidation is an augmented rbtree that can report all overlapping intervals for a given range. Therefore we do not need to degrade to a linear list if we find overlapping objects. Oops. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453397563-2848-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-11-10Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "I Was Almost Tempted To Capitalise Every Word, but then I decided I couldn't read it myself! I've also got one pull request for the sti driver outstanding. It relied on a commit in Greg's tree and I didn't find out in time, that commit is in your tree now so I might send that along once this is merged. I also had the accidental misfortune to have access to a Skylake on my desk for a few days, and I've had to encourage Intel to try harder, which seems to be happening now. Here is the main drm-next pull request for 4.4. Highlights: New driver: vc4 driver for the Rasberry Pi VPU. (From Eric Anholt at Broadcom.) Core: Atomic fbdev support Atomic helpers for runtime pm dp/aux i2c STATUS_UPDATE handling struct_mutex usage cleanups. Generic of probing support. Documentation: Kerneldoc for VGA switcheroo code. Rename to gpu instead of drm to reflect scope. i915: Skylake GuC firmware fixes HPD A support VBT backlight fallbacks Fastboot by default for some systems FBC work BXT/SKL workarounds Skylake deeper sleep state fixes amdgpu: Enable GPU scheduler by default New atombios opcodes GPUVM debugging options Stoney support. Fencing cleanups. radeon: More efficient CS checking nouveau: gk20a instance memory handling improvements. Improved PGOB detection and GK107 support Kepler GDDR5 PLL statbility improvement G8x/GT2xx reclock improvements new userspace API compatiblity fixes. virtio-gpu: Add 3D support - qemu 2.5 has it merged for it's gtk backend. msm: Initial msm88896 (snapdragon 8200) exynos: HDMI cleanups Enable mixer driver byt default Add DECON-TV support vmwgfx: Move to using memremap + fixes. rcar-du: Add support for R8A7793/4 DU armada: Remove support for non-component mode Improved plane handling Power savings while in DPMS off. tda998x: Remove unused slave encoder support Use more HDMI helpers Fix EDID read handling dwhdmi: Interlace video mode support for ipu-v3/dw_hdmi Hotplug state fixes Audio driver integration imx: More color formats support. tegra: Minor fixes/improvements" [ Merge fixup: remove unused variable 'dev' that had all uses removed in commit 4e270f088011: "drm/gem: Drop struct_mutex requirement from drm_gem_mmap_obj" ] * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (764 commits) drm/vmwgfx: Relax irq locking somewhat drm/vmwgfx: Properly flush cursor updates and page-flips drm/i915/skl: disable display side power well support for now drm/i915: Extend DSL readout fix to BDW and SKL. drm/i915: Do graphics device reset under forcewake drm/i915: Skip fence installation for objects with rotated views (v4) vga_switcheroo: Drop client power state VGA_SWITCHEROO_INIT drm/amdgpu: group together common fence implementation drm/amdgpu: remove AMDGPU_FENCE_OWNER_MOVE drm/amdgpu: remove now unused fence functions drm/amdgpu: fix fence fallback check drm/amdgpu: fix stoping the scheduler timeout drm/amdgpu: cleanup on error in amdgpu_cs_ioctl() drm/i915: Fix locking around GuC firmware load drm/amdgpu: update Fiji's Golden setting drm/amdgpu: update Fiji's rev id drm/amdgpu: extract common code in vi_common_early_init drm/amd/scheduler: don't oops on failure to load drm/amdgpu: don't oops on failure to load (v2) drm/amdgpu: don't VT switch on suspend ...
2015-10-13drm/i915: Deny wrapping an userptr into a framebufferChris Wilson
Pinning a userptr onto the hardware raises interesting questions about the lifetime of such a surface as the framebuffer extends that life beyond the client's address space. That is the hardware will need to keep scanning out from the backing storage even after the client wants to remap its address space. As the hardware pins the backing storage, the userptr becomes invalid and this raises a WARN when the clients tries to unmap its address space. The situation can be even more complicated when the buffer is passed between processes, between a client and display server, where the lifetime and hardware access is even more confusing. Deny it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-10-06drm/i915: Use a task to cancel the userptr on invalidate_rangeChris Wilson
Whilst discussing possible ways to trigger an invalidate_range on a userptr with an aliased GGTT mmapping (and so cause a struct_mutex deadlock), the conclusion is that we can, and we must, prevent any possible deadlock by avoiding taking the mutex at all during invalidate_range. This has numerous advantages all of which stem from avoid the sleeping function from inside the unknown context. In particular, it simplifies the invalidate_range because we no longer have to juggle the spinlock/mutex and can just hold the spinlock for the entire walk. To compensate, we have to make get_pages a bit more complicated in order to serialise with a pending cancel_userptr worker. As we hold the struct_mutex, we have no choice but to return EAGAIN and hope that the worker is then flushed before we retry after reacquiring the struct_mutex. The important caveat is that the invalidate_range itself is no longer synchronous. There exists a small but definite period in time in which the old PTE's page remain accessible via the GPU. Note however that the physical pages themselves are not invalidated by the mmu_notifier, just the CPU view of the address space. The impact should be limited to a delay in pages being flushed, rather than a possibility of writing to the wrong pages. The only race condition that this worsens is remapping an userptr active on the GPU where fresh work may still reference the old pages due to struct_mutex contention. Given that userspace is racing with the GPU, it is fair to say that the results are undefined. v2: Only queue (and importantly only take one refcnt) the worker once. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>