The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use RCU_INIT_POINTER() instead of rcu_assign_pointer()
because it has a smaller overhead.
The following Coccinelle semantic patch was used:
@@
@@
- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)
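For illustration, a hedged before/after sketch of the transformation the
semantic patch performs, on a hypothetical structure (not code from any
particular driver):

#include <linux/rcupdate.h>

struct bar_example;

struct foo_example {
        struct bar_example __rcu *ptr;
};

static void foo_clear_before(struct foo_example *f)
{
        /* rcu_assign_pointer() brings publication/ordering machinery that
         * is not needed when the new value is NULL. */
        rcu_assign_pointer(f->ptr, NULL);
}

static void foo_clear_after(struct foo_example *f)
{
        /* RCU_INIT_POINTER() just stores the pointer; cheaper for NULLing. */
        RCU_INIT_POINTER(f->ptr, NULL);
}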
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
In AFS, mountpoints appear as symlinks with mode 0644 and normal symlinks
have mode 0777, so use this to distinguish them rather than reading the
content and parsing it. In the case of a mountpoint, the symlink body is a
formatted string indicating the location of the target volume.
Note that with this, kAFS no longer 'pre-fetches' the contents of symlinks,
so afs_readpage() may fail with an access denial because, when the VFS calls
d_automount(), it wraps the call in a credentials override that sets the
initial creds - thereby preventing access to the caller's keyrings and the
authentication keys held therein.
To this end, a patch reverting that change to the VFS is also required.
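A minimal sketch of the distinction being relied on, using a hypothetical
helper name (the real predicate in the driver may differ):

#include <linux/fs.h>
#include <linux/stat.h>

static bool afs_symlink_is_mountpoint(const struct inode *inode)
{
        /* AFS presents mountpoints as mode 0644 symlinks and ordinary
         * symlinks as mode 0777. */
        return S_ISLNK(inode->i_mode) && (inode->i_mode & 0777) == 0644;
}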
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Flush outstanding writes in afs when an fd is closed. This is what NFS and
CIFS do.
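A sketch of the behaviour being added, assuming a ->flush() file operation
wired up like the hypothetical function below (vfs_fsync() is the real VFS
helper):

#include <linux/fs.h>

static int afs_flush_example(struct file *file, fl_owner_t id)
{
        if (!(file->f_mode & FMODE_WRITE))
                return 0;

        /* Write back any outstanding dirty data before the fd goes away. */
        return vfs_fsync(file, 0);
}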
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Handle the situation where afs_write_begin() is told to expect that a
full-page write will be made, but this doesn't happen (EFAULT, CTRL-C,
etc.), and so afs_write_end() sees that a partial write took place. Currently,
no attempt is made to deal with the discrepancy.
Fix this by loading the gap from the server.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Kill struct afs_read::pg_offset as nothing uses it. It's unnecessary as pos
can be masked off.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
When an AFS server is given an FS.FetchData{,64} request to read data from
a file, it is permitted by the protocol to return more or less than was
requested. kafs currently relies on the latter behaviour in readpage{,s}
to handle a partial page at the end of the file (we just ask for a whole
page and clear space beyond the short read).
However, we don't handle all cases. Add:
(1) Handle excess data by discarding it rather than aborting. Note that
we use a common static buffer to discard into so that the decryption
algorithm advances the PCBC state (see the sketch below).
(2) Handle a short read that affects more than just the last page.
Note that if a read comes up unexpectedly short or long, it's possible that
the server's copy of the file changed - in which case the data version
number will have been incremented and the callback will have been broken -
in which case all the pages currently attached to the inode will be zapped
anyway at some point.
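Point (1) can be illustrated with a generic sketch; the buffer, helper and
extraction callback names below are hypothetical, not the driver's actual
functions:

#include <linux/kernel.h>

struct afs_call;

static u8 afs_discard_buf[4096];        /* shared scratch buffer */

static int afs_discard_excess(struct afs_call *call, size_t excess,
                              int (*extract)(struct afs_call *, void *, size_t))
{
        while (excess > 0) {
                size_t chunk = min_t(size_t, excess, sizeof(afs_discard_buf));
                int ret = extract(call, afs_discard_buf, chunk);

                if (ret < 0)
                        return ret;
                /* The bytes land in the scratch buffer and are thrown away,
                 * but the cipher state still advances over them. */
                excess -= chunk;
        }
        return 0;
}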
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Servers may send a callback array that is the same size as
the FID array, or an empty array. If the callback count is
0, the code would attempt to read (fid_count * 12) bytes of
data, which would fail and result in an unmarshalling error.
This would lead to stale data for remotely modified files
or directories.
Store the callback array size in the internal afs_call
structure and use that to determine the amount of data to
read.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
|
|
Mode bits for an afs file should not be enforced in the usual
way.
For files, the absence of user bits can restrict file access
with respect to what is granted by the server.
These bits apply regardless of the owner or the current uid; the
rest of the mode bits (group, other) are ignored.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The group was hard coded to GLOBAL_ROOT_GID; use the group
ID that was received from the server.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
afs_fill_page() loads the page it wants to fill into the afs_read request
without incrementing its refcount - but then calls afs_put_read() to clean
up afterwards, which then releases a ref on the page.
Fix this by getting a ref on the page before calling
afs_vnode_fetch_data().
Without this, sync after a write hangs in afs_writepages_region() because
find_get_pages_tag() gets confused and doesn't return.
Fixes: 196ee9cd2d04 ("afs: Make afs_fs_fetch_data() take a list of pages")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
|
|
In afs_writepages_region(), inside the loop where we find dirty pages to
deal with, one of the if-statements is missing a put_page().
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
GVTg has introduced a context status notifier to schedule the GVTg
workload. At that time, the notifier was bound to the GVTg context only,
so GVTg was not aware of host workloads.
Now we are going to improve GVTg's guest workload scheduler policy,
and add GuC emulation support for new Gen graphics. Both of these
features need to be aware of all contexts running on the hardware
(but will not alter host workloads), so make the following changes
(see the sketch below):
1. Move the context status notifier head from i915_gem_context to
intel_engine_cs, which means there is one notifier head per engine
instead of per context. The execlists driver still calls the notifier
for each context sched-in/out event on the current engine.
2. On the GVTg side, bind a notifier_block for each physical engine
at GVTg initialization time. GVTg can then hear all context
status events.
In this patch, GVTg does nothing for host context events, but a
function will be added there later. In any case, the notifier callback
is a no-op if there is no active vGPU.
Since intel_gvt_init() is called at an early initialization stage and
requires the status notifier head to already be initialized, initialize
it in intel_engine_setup().
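A sketch of the shape of the change, using the kernel's real notifier-chain
API; the structure layout and function names below are illustrative rather
than the actual i915/GVTg code:

#include <linux/notifier.h>

struct engine_example {
        /* one status notifier head per engine instead of per context */
        struct atomic_notifier_head context_status_notifier;
};

static void engine_setup_example(struct engine_example *engine)
{
        ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
}

/* Called from the execlists sched-in/out path for every context. */
static void context_status_change_example(struct engine_example *engine,
                                          unsigned long status, void *ctx)
{
        atomic_notifier_call_chain(&engine->context_status_notifier,
                                   status, ctx);
}

/* GVTg would register one notifier_block per physical engine at init. */
static int gvt_register_example(struct engine_example *engine,
                                struct notifier_block *nb)
{
        return atomic_notifier_chain_register(&engine->context_status_notifier,
                                              nb);
}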
v2: remove a redundant newline. (chris)
Fixes: 3c7ba6359d70 ("drm/i915: Introduce execlist context status change notification")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232
Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.com
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We need to flush the batch _before_ we check the number of samples,
otherwise we'll miss all of the batched samples.
Fixes: cf43e6b ("block: add scalable completion tracking of requests")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high priority
requests to gazump their way to the head of the queue, however it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).
v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)
v3: Apply lockdep nesting to enabling signaling on the request, using
the pattern we already have in __i915_gem_request_submit();
v4: Replaying requests after a hang also now needs the timeline
spinlock, to disable the interrupts at least
v5: Hold wq lock for completeness, and emit a tracepoint for enabling signal
v6: Reorder interrupt checking for a happier gcc.
v7: Only signal the tasklet after a user-interrupt if using guc scheduling
v8: Restore lost update of rq through the i915_guc_irq_handler (Tvrtko)
v9: Avoid re-initialising the engine->irq_tasklet from inside a reset
v10: Hook up the execlists-style tracepoints
v11: Clear the execlists irq_posted bit after taking over the interrupt/tasklet
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316125619.6856-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
|
|
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
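A hedged sketch of this class of fix; the USB core fields used are real,
but the probe body is illustrative:

#include <linux/errno.h>
#include <linux/usb.h>

static int ushc_probe_example(struct usb_interface *intf,
                              const struct usb_device_id *id)
{
        if (intf->cur_altsetting->desc.bNumEndpoints < 1)
                return -ENODEV;

        /* safe to look at intf->cur_altsetting->endpoint[0] from here on */
        return 0;
}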
Fixes: 53f3a9e26ed5 ("mmc: USB SD Host Controller (USHC) driver")
Cc: stable <stable@vger.kernel.org> # 2.6.37
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When manually overwriting the HWS, rather than assume irq_seqno_barrier
does the right thing, we can explicitly flush the cacheline instead.
This avoids us calling the engine->irq_seqno_barrier() from an illegal
context:
[ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002
[ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci
[ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G U 4.11.0-rc1+ #203
[ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010
[ 1472.651900] Call Trace:
[ 1472.651913] dump_stack+0x63/0x90
[ 1472.651922] __schedule_bug+0x5d/0x6b
[ 1472.651930] __schedule+0x46a/0x5f0
[ 1472.651934] schedule+0x38/0x90
[ 1472.651938] schedule_hrtimeout_range_clock+0x85/0x110
[ 1472.651945] ? hrtimer_init+0x10/0x10
[ 1472.651949] schedule_hrtimeout_range+0xe/0x10
[ 1472.651952] usleep_range+0x4d/0x60
[ 1472.652037] gen5_seqno_barrier+0x13/0x20 [i915]
[ 1472.652101] intel_engine_init_global_seqno+0xd7/0x160 [i915]
[ 1472.652160] __i915_gem_set_wedged_BKL+0xa0/0x180 [i915]
[ 1472.652166] multi_cpu_stop+0xbb/0xe0
[ 1472.652170] ? cpu_stop_queue_work+0x90/0x90
[ 1472.652174] cpu_stopper_thread+0x82/0x110
[ 1472.652179] smpboot_thread_fn+0x137/0x190
[ 1472.652184] kthread+0xf7/0x130
[ 1472.652187] ? sort_range+0x20/0x20
[ 1472.652191] ? kthread_park+0x90/0x90
[ 1472.652195] ret_from_fork+0x2c/0x40
Testcase: igt/gem_eio #ilk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170314111452.9375-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
|
|
The SDHCI controller in the SAMA5D2 chip requires a valid voltage set
in the power control register, otherwise commands will fail with a
timeout error.
When using the regulator framework to specify the regulator used by the
mmc device, the voltage is not configured, and it is not possible to use
the connected device.
Implement a custom 'set_power' function for this specific hardware that
configures the voltage in the register in all cases.
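A sketch of such a set_power hook, assuming the core helpers
mmc_regulator_set_ocr() and sdhci_set_power_noreg(); the exact helper used
to write the register may differ in the real driver:

#include <linux/err.h>
#include <linux/mmc/host.h>
#include "sdhci.h"

static void sdhci_at91_set_power_example(struct sdhci_host *host,
                                         unsigned char mode,
                                         unsigned short vdd)
{
        /* Keep the external regulator in sync when one is wired up... */
        if (!IS_ERR(host->mmc->supply.vmmc))
                mmc_regulator_set_ocr(host->mmc, host->mmc->supply.vmmc, vdd);

        /* ...but always program the controller's power/voltage register,
         * since the SAMA5D2 needs a valid voltage there. */
        sdhci_set_power_noreg(host, mode, vdd);
}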
Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Cc: stable@vger.kernel.org #4.9+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Fix compilation warning:
drivers/mmc/core/block.c:1563:24: warning: variable ‘mq_rq’ set but not used [-Wunused-but-set-variable]
  struct mmc_queue_req *mq_rq;
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The size used when clearing a wb slot should be the same size that was assigned.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Higher sclks seem to be unstable on some boards.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=100222
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Higher sclks seem to be unstable on some boards.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=100222
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
While going through the event inheritance code Oleg got confused.
Add some comments to better explain the silent disappearance of
orphaned events.
So what happens is that at perf_event_release_kernel() time, when an
event loses its connection to userspace (and ceases to exist from the
user's perspective), we can still have an arbitrary number of inherited
copies of the event. We want to synchronously find and remove all
these child events.
Since that requires a bit of lock juggling, there is the possibility
that concurrent clone()s will create new child events. Therefore we
first mark the parent event as DEAD, which marks all the extant child
events as orphaned.
We then avoid copying orphaned events, in order to avoid getting more
of them.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20170316125823.289567442@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
We have ctx->event_list that contains all events; no need to
repeatedly iterate the group lists to find them all.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20170316125823.239678244@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
While hunting for clues to a use-after-free, Oleg spotted that
perf_event_init_context() can lose an error value with the result
that fork() can succeed even though we did not fully inherit the perf
event context.
Spotted-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: oleg@redhat.com
Cc: stable@vger.kernel.org
Fixes: 889ff0150661 ("perf/core: Split context's event group list into pinned and non-pinned lists")
Link: http://lkml.kernel.org/r/20170316125823.190342547@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Dmitry reported syzcaller tripped a use-after-free in perf_release().
After much puzzlement Oleg spotted the below scenario:
  Task1                            Task2

  fork()
    perf_event_init_task()
      /* ... */
      goto bad_fork_$foo;
      /* ... */
      perf_event_free_task()
        mutex_lock(ctx->lock)
        perf_free_event(B)

                                   perf_event_release_kernel(A)
                                     mutex_lock(A->child_mutex)
                                     list_for_each_entry(child, ...) {
                                       /* child == B */
                                       ctx = B->ctx;
                                       get_ctx(ctx);
                                       mutex_unlock(A->child_mutex);

        mutex_lock(A->child_mutex)
        list_del_init(B->child_list)
        mutex_unlock(A->child_mutex)

        /* ... */

        mutex_unlock(ctx->lock);
        put_ctx() /* >0 */
        free_task();

                                     mutex_lock(ctx->lock);
                                     mutex_lock(A->child_mutex);
                                     /* ... */
                                     mutex_unlock(A->child_mutex);
                                     mutex_unlock(ctx->lock)
                                     put_ctx() /* 0 */
                                     ctx->task && !TOMBSTONE
                                     put_task_struct() /* UAF */
This patch closes the hole by making perf_event_free_task() destroy the
task <-> ctx relation such that perf_event_release_kernel() will no longer
observe the now dead task.
Spotted-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: fweisbec@gmail.com
Cc: oleg@redhat.com
Cc: stable@vger.kernel.org
Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events")
Link: http://lkml.kernel.org/r/20170314155949.GE32474@worktop
Link: http://lkml.kernel.org/r/20170316125823.140295131@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This patch fixes two issues:
Issue 1:
In the previous code, div could overflow when setting the clock frequency
to f_min. We can use DIV_ROUND_UP to fix this boundary-related issue
(see the sketch below).
Issue 2:
In the previous code, we could not set the correct clock frequency when
div equals 0xff.
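Illustrative only, not the driver's actual divider formula: DIV_ROUND_UP()
rounds the divider up at the boundary so the resulting rate never exceeds
the requested one:

#include <linux/kernel.h>       /* DIV_ROUND_UP() */

static unsigned int pick_div(unsigned int parent_hz, unsigned int target_hz)
{
        /*
         * With hypothetical numbers parent = 100 MHz, target = 300 kHz:
         *   parent / target              = 333 -> 100 MHz / 333 = 300300 Hz (above target)
         *   DIV_ROUND_UP(parent, target) = 334 -> 100 MHz / 334 = 299401 Hz (within target)
         */
        return DIV_ROUND_UP(parent_hz, target_hz);
}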
Signed-off-by: Yong Mao <yong.mao@mediatek.com>
Signed-off-by: Chaotian Jing <chaotian.jing@mediatek.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Set the byt rc residency counters to high level, as chv does by
default. We lose some accuracy on byt, but we can do the calculation
without an extra hw read on both platforms, as they now behave
identically in this respect.
v2: use ktime
v3: keep comparison u32 (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1489592584-10422-1-git-send-email-mika.kuoppala@intel.com
|
|
We have used the cz timestamp register to get a reference time for
residency calculations. The residency counts are in cz clk ticks
(333 MHz clock), but for some reason the cz timestamp register gives
100us units. Perhaps the base-ten values are easier for some other
usage, but for residency calculations raw units would have been the
easiest.
As there is not much advantage in using the base-ten clock through
a more costly punit access, take our reference times directly from
the kernel clock.
v2: use ktime (Chris, Ville)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Use intel_rc6_residency to benefit from the increased resolution
on byt/chv.
v2: output raw and time (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Vlv and chv residency counters are 40 bits in width.
With a control bit, we can choose between an upper or lower
32-bit window into this counter.
Let's toggle this bit and read both parts (see the sketch below).
As a result, we can push the wrap from 13 seconds to 54
minutes.
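A generic sketch of the read scheme rather than the actual vlv/chv register
layout: assume a hypothetical 40-bit counter exposed through a selectable
32-bit window, where read_window(true) returns bits 39..32 (zero-extended)
and read_window(false) returns bits 31..0:

#include <linux/types.h>

static u64 read_40bit_counter(u32 (*read_window)(bool upper))
{
        u32 upper, lower, tmp;

        upper = read_window(true);
        do {
                tmp = upper;
                lower = read_window(false);
                upper = read_window(true);
        } while (upper != tmp);         /* retry if a carry crossed the windows */

        return ((u64)upper << 32) | lower;
}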
v2: commit msg, loop readability, goto elimination (Chris)
v3: bug ref, divide outside runtime pm lock (Chris)
References: https://bugs.freedesktop.org/show_bug.cgi?id=94852
Reported-by: Len Brown <len.brown@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Change the granularity from milliseconds to microseconds
when returning rc6 residencies. This is in preparation
for increased resolution on some platforms.
v2: use 64bit div macro (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The plan is to make a generic residency calculation utility
function for use outside of sysfs. As a first step,
move the residency calculation into intel_pm.c.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The current suspend/resume implementation assumes register values are
retained when entering suspend, which is no longer the case with
suspend-to-RAM on the sama5d2.
While at it, switch to the generic infrastructure to enter suspend mode
(drm_atomic_helper_suspend/resume()).
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Sylvain Rochet <sylvain.rochet@finsecur.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1488371461-22243-1-git-send-email-boris.brezillon@free-electrons.com
|
|
lockdep doesn't like us taking the mm->mmap_sem inside the get_pages
callback for a couple of reasons. The straightforward deadlock:
[13755.434059] =============================================
[13755.434061] [ INFO: possible recursive locking detected ]
[13755.434064] 4.11.0-rc1-CI-CI_DRM_297+ #1 Tainted: G U
[13755.434066] ---------------------------------------------
[13755.434068] gem_userptr_bli/8398 is trying to acquire lock:
[13755.434070] (&mm->mmap_sem){++++++}, at: [<ffffffffa00c988a>] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434096]
but task is already holding lock:
[13755.434098] (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
[13755.434105]
other info that might help us debug this:
[13755.434108] Possible unsafe locking scenario:
[13755.434110] CPU0
[13755.434111] ----
[13755.434112] lock(&mm->mmap_sem);
[13755.434115] lock(&mm->mmap_sem);
[13755.434117]
*** DEADLOCK ***
[13755.434121] May be due to missing lock nesting notation
[13755.434126] 2 locks held by gem_userptr_bli/8398:
[13755.434128] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
[13755.434135] #1: (&obj->mm.lock){+.+.+.}, at: [<ffffffffa00b887d>] __i915_gem_object_get_pages+0x1d/0x70 [i915]
[13755.434156]
stack backtrace:
[13755.434161] CPU: 3 PID: 8398 Comm: gem_userptr_bli Tainted: G U 4.11.0-rc1-CI-CI_DRM_297+ #1
[13755.434165] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F4 02/20/2017
[13755.434169] Call Trace:
[13755.434174] dump_stack+0x67/0x92
[13755.434178] __lock_acquire+0x133a/0x1b50
[13755.434182] lock_acquire+0xc9/0x220
[13755.434200] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434204] down_read+0x42/0x70
[13755.434221] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434238] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434255] ____i915_gem_object_get_pages+0x25/0x60 [i915]
[13755.434272] __i915_gem_object_get_pages+0x59/0x70 [i915]
[13755.434288] i915_gem_fault+0x397/0x6a0 [i915]
[13755.434304] ? i915_gem_fault+0x1a1/0x6a0 [i915]
[13755.434308] ? __lock_acquire+0x449/0x1b50
[13755.434311] ? __lock_acquire+0x449/0x1b50
[13755.434315] ? vm_mmap_pgoff+0xa9/0xd0
[13755.434318] __do_fault+0x19/0x70
[13755.434321] __handle_mm_fault+0x863/0xe50
[13755.434325] handle_mm_fault+0x17f/0x370
[13755.434329] ? handle_mm_fault+0x40/0x370
[13755.434332] __do_page_fault+0x279/0x560
[13755.434336] do_page_fault+0xc/0x10
[13755.434339] page_fault+0x22/0x30
[13755.434342] RIP: 0033:0x7f5ab91b5880
[13755.434345] RSP: 002b:00007fff62922218 EFLAGS: 00010216
[13755.434348] RAX: 0000000000b74500 RBX: 00007f5ab7f81000 RCX: 0000000000000000
[13755.434352] RDX: 0000000000100000 RSI: 00007f5ab7f81000 RDI: 00007f5aba61c000
[13755.434355] RBP: 00007f5aba61c000 R08: 0000000000000007 R09: 0000000100000000
[13755.434359] R10: 000000000000037d R11: 00007f5ab91b5840 R12: 0000000000000001
[13755.434362] R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000000
and cyclic deadlocks:
[ 2566.458979] ======================================================
[ 2566.459054] [ INFO: possible circular locking dependency detected ]
[ 2566.459127] 4.11.0-rc1+ #26 Not tainted
[ 2566.459194] -------------------------------------------------------
[ 2566.459266] gem_streaming_w/759 is trying to acquire lock:
[ 2566.459334] (&obj->mm.lock){+.+.+.}, at: [<ffffffffa034bc80>] i915_gem_object_pin_pages+0x0/0xc0 [i915]
[ 2566.459605]
[ 2566.459605] but task is already holding lock:
[ 2566.459699] (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
[ 2566.459814]
[ 2566.459814] which lock already depends on the new lock.
[ 2566.459814]
[ 2566.459934]
[ 2566.459934] the existing dependency chain (in reverse order) is:
[ 2566.460030]
[ 2566.460030] -> #1 (&mm->mmap_sem){++++++}:
[ 2566.460139] lock_acquire+0xfe/0x220
[ 2566.460214] down_read+0x4e/0x90
[ 2566.460444] i915_gem_userptr_get_pages+0x6e/0x340 [i915]
[ 2566.460669] ____i915_gem_object_get_pages+0x8b/0xd0 [i915]
[ 2566.460900] __i915_gem_object_get_pages+0x6a/0x80 [i915]
[ 2566.461132] __i915_vma_do_pin+0x7fa/0x930 [i915]
[ 2566.461352] eb_add_vma+0x67b/0x830 [i915]
[ 2566.461572] eb_lookup_vmas+0xafe/0x1010 [i915]
[ 2566.461792] i915_gem_do_execbuffer+0x715/0x2870 [i915]
[ 2566.462012] i915_gem_execbuffer2+0x106/0x2b0 [i915]
[ 2566.462152] drm_ioctl+0x36c/0x670 [drm]
[ 2566.462236] do_vfs_ioctl+0x12c/0xa60
[ 2566.462317] SyS_ioctl+0x41/0x70
[ 2566.462399] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 2566.462477]
[ 2566.462477] -> #0 (&obj->mm.lock){+.+.+.}:
[ 2566.462587] __lock_acquire+0x1602/0x1790
[ 2566.462661] lock_acquire+0xfe/0x220
[ 2566.462893] i915_gem_object_pin_pages+0x4c/0xc0 [i915]
[ 2566.463116] i915_gem_fault+0x2c2/0x8c0 [i915]
[ 2566.463197] __do_fault+0x42/0x130
[ 2566.463276] __handle_mm_fault+0x92c/0x1280
[ 2566.463356] handle_mm_fault+0x1e2/0x440
[ 2566.463443] __do_page_fault+0x1c4/0x500
[ 2566.463529] do_page_fault+0xc/0x10
[ 2566.463613] page_fault+0x1f/0x30
[ 2566.463693]
[ 2566.463693] other info that might help us debug this:
[ 2566.463693]
[ 2566.463820] Possible unsafe locking scenario:
[ 2566.463820]
[ 2566.463918] CPU0 CPU1
[ 2566.463988] ---- ----
[ 2566.464068] lock(&mm->mmap_sem);
[ 2566.464143] lock(&obj->mm.lock);
[ 2566.464226] lock(&mm->mmap_sem);
[ 2566.464304] lock(&obj->mm.lock);
[ 2566.464378]
[ 2566.464378] *** DEADLOCK ***
[ 2566.464378]
[ 2566.464504] 1 lock held by gem_streaming_w/759:
[ 2566.464576] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
[ 2566.464699]
[ 2566.464699] stack backtrace:
[ 2566.464801] CPU: 0 PID: 759 Comm: gem_streaming_w Not tainted 4.11.0-rc1+ #26
[ 2566.464881] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F8 03/02/2016
[ 2566.464983] Call Trace:
[ 2566.465061] dump_stack+0x68/0x9f
[ 2566.465144] print_circular_bug+0x20b/0x260
[ 2566.465234] __lock_acquire+0x1602/0x1790
[ 2566.465323] ? debug_check_no_locks_freed+0x1a0/0x1a0
[ 2566.465564] ? i915_gem_object_wait+0x238/0x650 [i915]
[ 2566.465657] ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
[ 2566.465749] lock_acquire+0xfe/0x220
[ 2566.465985] ? i915_sg_trim+0x1b0/0x1b0 [i915]
[ 2566.466223] i915_gem_object_pin_pages+0x4c/0xc0 [i915]
[ 2566.466461] ? i915_sg_trim+0x1b0/0x1b0 [i915]
[ 2566.466699] i915_gem_fault+0x2c2/0x8c0 [i915]
[ 2566.466939] ? i915_gem_pwrite_ioctl+0xce0/0xce0 [i915]
[ 2566.467030] ? __lock_acquire+0x642/0x1790
[ 2566.467122] ? __lock_acquire+0x642/0x1790
[ 2566.467209] ? debug_lockdep_rcu_enabled+0x35/0x40
[ 2566.467299] ? get_unmapped_area+0x1b4/0x1d0
[ 2566.467387] __do_fault+0x42/0x130
[ 2566.467474] __handle_mm_fault+0x92c/0x1280
[ 2566.467564] ? __pmd_alloc+0x1e0/0x1e0
[ 2566.467651] ? vm_mmap_pgoff+0x160/0x190
[ 2566.467740] ? handle_mm_fault+0x111/0x440
[ 2566.467827] handle_mm_fault+0x1e2/0x440
[ 2566.467914] ? handle_mm_fault+0x5d/0x440
[ 2566.468002] __do_page_fault+0x1c4/0x500
[ 2566.468090] do_page_fault+0xc/0x10
[ 2566.468180] page_fault+0x1f/0x30
[ 2566.468263] RIP: 0033:0x557895ced32a
[ 2566.468337] RSP: 002b:00007fffd6dd8a10 EFLAGS: 00010202
[ 2566.468419] RAX: 00007f659a4db000 RBX: 0000000000000003 RCX: 00007f659ad032da
[ 2566.468501] RDX: 0000000000000000 RSI: 0000000000100000 RDI: 0000000000000000
[ 2566.468586] RBP: 0000000000000007 R08: 0000000000000003 R09: 0000000100000000
[ 2566.468667] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557895ceda60
[ 2566.468749] R13: 0000000000000001 R14: 00007fffd6dd8ac0 R15: 00007f659a4db000
By checking the status of the gup worker (serialized by the
obj->mm.lock) we can determine whether it is still active, has failed or
has succeeded. If the worker is still active (or failed), we know that
it cannot be bound and so we can skip taking struct_mutex (risking
potential recursion). As we check the worker status, we mark it to
discard any partial results, forcing us to restart on the next
get_pages.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Fixes: 1c8782dd313e ("drm/i915/userptr: Disallow wrapping GTT into a userptr")
Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
Testcase: igt/gem_userptr_blits/dmabuf-sync
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315140150.19432-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Test runs on a ppc64 BE guest succeeded. linux/samples/statx/test-statx
program was executed on the following file types,
1. Regular file
2. Directory
3. device file
4. symlink
5. Named pipe
The test run also included invoking test-statx with the runtime options
provided in the main() function of test-statx.c
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
After commit e9afc746299d ("hwrng: geode - Use linux/io.h instead of
asm/io.h") the geode-rng driver uses devres with pci_dev->dev to keep
track of resources, but does not actually register a PCI driver. This
results in the following issues:
1. The driver leaks memory because the driver does not attach to a
device. The driver only uses the PCI device as a reference. devm_*()
functions will release resources on driver detach, which the geode-rng
driver will never do. As a result,
2. The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.
Revert the changes made by e9afc746299d ("hwrng: geode - Use linux/io.h
instead of asm/io.h").
Cc: <stable@vger.kernel.org>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Fixes: 6e9b5e76882c ("hwrng: geode - Migrate to managed API")
Cc: Matt Mackall <mpm@selenic.com>
Cc: Corentin LABBE <clabbe.montjoie@gmail.com>
Cc: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-geode@lists.infradead.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
After commit 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"), the
amd-rng driver uses devres with pci_dev->dev to keep track of resources,
but does not actually register a PCI driver. This results in the
following issues:
1. The message
WARNING: CPU: 2 PID: 621 at drivers/base/dd.c:349 driver_probe_device+0x38c
is output when the i2c_amd756 driver loads and attempts to register a PCI
driver. The PCI & device subsystems assume that no resources have been
registered for the device, and the WARN_ON() triggers since amd-rng has
already done so.
2. The driver leaks memory because the driver does not attach to a
device. The driver only uses the PCI device as a reference. devm_*()
functions will release resources on driver detach, which the amd-rng
driver will never do. As a result,
3. The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.
Revert the changes made by 31b2a73c9c5f ("hwrng: amd - Migrate to managed
API").
Cc: <stable@vger.kernel.org>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Fixes: 31b2a73c9c5f ("hwrng: amd - Migrate to managed API").
Cc: Matt Mackall <mpm@selenic.com>
Cc: Corentin LABBE <clabbe.montjoie@gmail.com>
Cc: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-geode@lists.infradead.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The CCP driver generally uses a round-robin approach when
assigning operations to available CCPs. For the DMA engine,
however, the DMA mappings of the SGs are associated with a
specific CCP. When an IOMMU is enabled, the IOMMU is
programmed based on this specific device.
If the DMA operations are not performed by that specific
CCP then addressing errors and I/O page faults will occur.
Update the CCP driver to allow a specific CCP device to be
requested for an operation and use this in the DMA engine
support.
Cc: <stable@vger.kernel.org> # 4.9.x-
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Sowmini pointed out Dmitry's RTNL deadlock report to me, and it turns out
to be perfectly accurate - there are various error paths that miss unlock
of the RTNL.
To fix those, change the locking a bit to not be conditional in all those
nl80211_prepare_*_dump() functions, but make those require the RTNL to
start with, and fix the buggy error paths. This also let me use sparse
(by appropriately overriding the rtnl_lock/rtnl_unlock functions) to
validate the changes.
Cc: stable@vger.kernel.org
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
No one is using the structure imx_drm_component, so let's remove the
definition to save several lines.
Signed-off-by: Liu Ying <gnuiyl@gmail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Allow the planes to use the PRG/PRE units as linear prefetchers when
possible. This improves DRAM efficiency a bit and reduces the chance
for display underflow when the memory subsystem is under load.
This does not yet support scanning out tiled buffers directly, as this
needs more work, but it already wires up the basic interaction between
imx-drm, the IPUv3 driver and the PRG and PRE drivers.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
On i.MX6 QuadPlus the PRG needs to be clocked in order to pass
through the data access requests from the IDMAC. This call is a
no-op for all other SoCs.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Using non-zero AXI IDs for anything other than the display channels
collides with the PRG AXI snooping, so only do this if there is no
PRG present.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
The i.MX6 QuadPlus IPU needs the PRG unit to gain access to the
data bus. Make sure it is present and available to be used.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Document the valid compatible strings for the IPUv3.
On i.MX6 QuadPlus the IPU needs to know which PRG has to be
used for this IPU instance. Add a "fsl,prg" property containing
a phandle pointing to the correct PRG device.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
This adds support for the i.MX6 QuadPlus PRG unit. It glues together the
IPU and the PRE units.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
---
v4: add missing ipu_soc->prg_priv
|
|
After a failed GuC fw selection we could leave the GuC
submission flag still turned on. Reorder some checks
to cover this case. While here, fix the info message and
return early if there is no GuC.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
[tursulin: fixup bad alignment]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315133741.150420-1-michal.wajdeczko@intel.com
|
|
This field is used to determine which kind of firmware the struct
describes (GuC/HuC), but the name does not reflect that.
The enum used here has "type" in the name, so let's go with that.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315133415.15343-1-arkadiusz.hiler@intel.com
|
|
Avoid adding to the waitqueue and reprobing the current vblank if the
caller is only querying the current vblank sequence and timestamp, where
we know that the wait would return immediately.
v2: Add CRTC identifier to debug messages
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Dave Airlie <airlied@redhat.com>,
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315204027.20160-2-chris@chris-wilson.co.uk
|
|
On vblank instant-off systems, we can get into a situation where the cost
of enabling and disabling the vblank IRQ around a drmWaitVblank query
dominates. And with the advent of even deeper hardware sleep state,
touching registers becomes ever more expensive. However, we know that if
the user wants the current vblank counter, they are also very likely to
immediately queue a vblank wait and so we can keep the interrupt around
and only turn it off if we have no further vblank requests queued within
the interrupt interval.
After vblank event delivery, this patch adds a shadow of one vblank where
the interrupt is kept alive for the user to query and queue another vblank
event. Similarly, if the user is using blocking drmWaitVblanks, the
interrupt will be disabled on the IRQ following the wait completion.
However, if the user is simply querying the current vblank counter and
timestamp, the interrupt will be disabled after every IRQ and the user
will enabled it again on the first query following the IRQ.
v2: Mario Kleiner -
After testing this, one more thing that would make sense is to move
the disable block at the end of drm_handle_vblank() instead of at the
top.
Turns out that if high precision timestamping is disabled or doesn't
work for some reason (as can be simulated by echo 0 >
/sys/module/drm/parameters/timestamp_precision_usec), then with your
delayed disable code at its current place, the vblank counter won't
increment anymore at all for instant queries, ie. with your other
"instant query" patches. Clients which repeatedly query the counter
and wait for it to progress will simply hang, spinning in an endless
query loop. There's that comment in vblank_disable_and_save:
"* Skip this step if there isn't any high precision timestamp
* available. In that case we can't account for this and just
* hope for the best.
*/
With the disable happening after leading edge of vblank (== hw counter
increment already happened) but before the vblank counter/timestamp
handling in drm_handle_vblank, that step is needed to keep the counter
progressing, so skipping it is bad.
Now without high precision timestamping support, a kms driver must not
set dev->vblank_disable_immediate = true, as this would cause problems
for clients, so this shouldn't matter, but it would be good to still
make this robust against a future kms driver which might have
unreliable high precision timestamping, e.g., high precision
timestamping that intermittently doesn't work.
v3: Patch before coffee needs extra coffee.
Testcase: igt/kms_vblank
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Dave Airlie <airlied@redhat.com>,
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315204027.20160-1-chris@chris-wilson.co.uk
|