Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota fix from Jan Kara:
"A fix of a check for quota limit"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: correct space limit check
|
|
Christian Brauner reported that if you use the TIOCGPTPEER ioctl() to
get a slave pty file descriptor, the resulting file descriptor doesn't
look right in /proc/<pid>/fd/<fd>. In particular, he wanted to use
readlink() on /proc/self/fd/<fd> to get the pathname of the slave pty
(basically implementing "ptsname{_r}()").
The reason for that was that we had generated the wrong 'struct path'
when creating the pty in ptmx_open().
In particular, the dentry was correct, but the vfsmount pointed to the
mount of the ptmx node. That _can_ be correct - in case you use
"/dev/pts/ptmx" to open the master - but usually is not. The normal
case is to use /dev/ptmx, which then looks up the pts/ directory, and
then the vfsmount of the ptmx node is obviously the /dev directory, not
the /dev/pts/ directory.
We actually did have the right vfsmount available, but in the wrong
place (it gets looked up in 'devpts_acquire()' when we get a reference
to the pts filesystem), and so ptmx_open() used the wrong mnt pointer.
The end result of this confusion was that the pty worked fine, but if
you did TIOCGPTPEER to get the slave side of the pty, the end result
would also work, but have that dodgy 'struct path'.
And then when doing "d_path()" on it to get the pathname, the vfsmount
would not match the root of the pts directory, and d_path() would return
an empty pathname thinking that the entry had escaped a bind mount into
another mount.
This fixes the problem by making devpts_acquire() return the vfsmount
for the pts filesystem, allowing ptmx_open() to trivially just use the
right mount for the pts dentry, and create the proper 'struct path'.
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
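For illustration, a minimal user-space sketch of the reported scenario
(error handling omitted; this assumes TIOCGPTPEER is exposed by the
system headers, and uses the usual grantpt()/unlockpt() pty setup):

  #include <stdio.h>
  #include <stdlib.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>

  int main(void)
  {
          char path[64], buf[256];
          ssize_t len;

          int master = open("/dev/ptmx", O_RDWR | O_NOCTTY);
          grantpt(master);
          unlockpt(master);

          /* get a file descriptor for the slave side directly */
          int slave = ioctl(master, TIOCGPTPEER, O_RDWR | O_NOCTTY);

          /* before the fix, this readlink() came back empty */
          snprintf(path, sizeof(path), "/proc/self/fd/%d", slave);
          len = readlink(path, buf, sizeof(buf) - 1);
          if (len > 0) {
                  buf[len] = '\0';
                  printf("slave pty: %s\n", buf);
          }
          return 0;
  }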
|
|
Since DRM IOCTLs are lockless, there is a chance that BOs could be
released while a job submission is in progress. To avoid that, keep the
GEM reference until the job has been pinned, part of which will be to
take another reference.
v2: remove redundant check and avoid memory leak
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
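A hedged sketch of the referencing scheme described above (function
names and call sites are illustrative, not the exact tegra submit path):

  static int submit_bo(struct drm_file *file, struct host1x_job *job,
                       struct device *dev, u32 handle)
  {
          struct drm_gem_object *bo;
          int err;

          /* the lookup takes a reference that keeps the BO alive even
           * if user space closes the handle mid-submission */
          bo = drm_gem_object_lookup(file, handle);
          if (!bo)
                  return -ENOENT;

          /* pinning the job takes its own reference on the BO... */
          err = host1x_job_pin(job, dev);

          /* ...so only now is it safe to drop the lookup reference */
          drm_gem_object_put_unlocked(bo);
          return err;
  }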
|
|
The mapping of PRIME buffers can reuse much of the GEM mapping code, so
extract the common bits into a new tegra_gem_mmap() helper.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
None of the driver-specific IOCTLs require special privileges, so mark
them accordingly and advertise that the driver supports render nodes.
Signed-off-by: Thierry Reding <treding@nvidia.com>
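A hedged sketch of what this amounts to in generic DRM terms (the
macros are the standard DRM ones; the exact tegra table entries may
differ):

  static const struct drm_ioctl_desc tegra_drm_ioctls[] = {
          /* allow the IOCTL from render nodes */
          DRM_IOCTL_DEF_DRV(TEGRA_GEM_CREATE, tegra_gem_create,
                            DRM_UNLOCKED | DRM_RENDER_ALLOW),
          /* ...the remaining IOCTLs get DRM_RENDER_ALLOW as well... */
  };

  static struct drm_driver tegra_drm_driver = {
          /* advertise render-node support */
          .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER,
          /* ... */
  };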
|
|
Add tracepoint events for SOR controller register accesses.
Signed-off-by: Thierry Reding <treding@nvidia.com>
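A hedged sketch of the pattern (the accessor and trace-event names are
assumptions based on the driver's style):

  static inline u32 tegra_sor_readl(struct tegra_sor *sor,
                                    unsigned int offset)
  {
          u32 value = readl(sor->regs + (offset << 2));

          trace_sor_readl(sor->dev, offset, value);

          return value;
  }

  static inline void tegra_sor_writel(struct tegra_sor *sor, u32 value,
                                      unsigned int offset)
  {
          trace_sor_writel(sor->dev, offset, value);
          writel(value, sor->regs + (offset << 2));
  }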
|
|
Add tracepoint events for DPAUX controller register accesses.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add tracepoint events for DSI controller register accesses.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add tracepoint events for HDMI controller register accesses.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add tracepoint events for display controller register accesses.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Register offsets are usually fairly small numbers, so an unsigned int is
more than enough to represent them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
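Concretely, each accessor's prototype changes along these lines (a
hedged example; the exact functions vary per controller):

  /* before */
  static inline u32 tegra_sor_readl(struct tegra_sor *sor,
                                    unsigned long offset);

  /* after */
  static inline u32 tegra_sor_readl(struct tegra_sor *sor,
                                    unsigned int offset);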
|
|
Register offsets are usually fairly small numbers, so an unsigned int is
more than enough to represent them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Register offsets are usually fairly small numbers, so an unsigned int is
more than enough to represent them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Register offsets are usually fairly small numbers, so an unsigned int is
more than enough to represent them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Register offsets are usually fairly small numbers, so an unsigned int is
more than enough to represent them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
When IOMMU is off, ->mm_lock is not initialized and ->mm is NULL.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Use drm_*_get() and drm_*_put() helpers instead of drm_*_reference()
and drm_*_unreference() helpers.
drm_*_reference() and drm_*_unreference() functions are just
compatibility aliases for drm_*_get() and drm_*_put() and should not be
used by new code. So convert all users of compatibility functions to
use the new APIs.
Generated by: scripts/coccinelle/api/drm-get-put.cocci
Signed-off-by: Cihangir Akturk <cakturk@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
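The conversion is mechanical; for example, two of the pairings handled
by the script:

  /* before */
  drm_gem_object_unreference_unlocked(gem);
  drm_framebuffer_reference(fb);

  /* after */
  drm_gem_object_put_unlocked(gem);
  drm_framebuffer_get(fb);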
|
|
The defines are set unconditionally to prevent an empty string. The SoC
test is the same as the one Nouveau uses for the Tegra GPU firmware (see
drivers/gpu/drm/nouveau/nouveau_platform.c).
v2:
- Place the defines above each chip's vic_config struct
- MODULE_FIRMWARE() at the end of the file
Fixes: 0ae797a8ba05 ("drm/tegra: Add VIC support")
Signed-off-by: Nicolas Chauvet <kwizart@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
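For illustration, the shape of the change (the firmware path is the one
linux-firmware ships for Tegra124; the macro and struct names are
best-effort assumptions):

  #define NVIDIA_TEGRA_124_VIC_FIRMWARE "nvidia/tegra124/vic03_ucode.bin"

  static const struct vic_config vic_t124_config = {
          .firmware = NVIDIA_TEGRA_124_VIC_FIRMWARE,
  };

  /* at the end of the file, guarded like Nouveau's GPU firmware */
  #if IS_ENABLED(CONFIG_ARCH_TEGRA_124_SOC)
  MODULE_FIRMWARE(NVIDIA_TEGRA_124_VIC_FIRMWARE);
  #endif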
|
|
Without CONFIG_OF, we can run into a build error:
drivers/gpu/drm/tegra/dpaux.c:378:20: error: 'pinconf_generic_dt_node_to_map_group' undeclared here (not in a function); did you mean 'pinconf_generic_params'?
.dt_node_to_map = pinconf_generic_dt_node_to_map_group,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pinconf_generic_params
drivers/gpu/drm/tegra/dpaux.c:379:17: error: 'pinconf_generic_dt_free_map' undeclared here (not in a function); did you mean 'pinconf_generic_params'?
This adds an explicit dependency.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The display architecture in Tegra186 changes slightly compared to
earlier Tegra generations, which requires that we recursively scan
host1x sub-devices from device tree.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
platform_get_irq() returns an error code, but the host1x driver
ignores it and always returns -ENXIO. This is not correct and
prevents -EPROBE_DEFER from being propagated properly.
Notice that platform_get_irq() no longer returns 0 on error:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e330b9a6bb35dc7097a4f02cb1ae7b6f96df92af
Print and propagate the return value of platform_get_irq() on failure.
This issue was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
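A hedged sketch of the fixed probe logic:

  err = platform_get_irq(pdev, 0);
  if (err < 0) {
          dev_err(&pdev->dev, "failed to get IRQ: %d\n", err);
          return err;     /* may be -EPROBE_DEFER; propagate it */
  }
  syncpt_irq = err;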
|
|
Some parts of Host1x use BIT_WORD/BIT_MASK/BITS_PER_LONG to calculate
register or field offsets. This worked fine on ARMv7, but now that
BITS_PER_LONG is 64 while our registers are still 32-bit, things are
broken.
Fix this by replacing:
- BIT_WORD with (x / 32)
- BIT_MASK with BIT(x % 32)
- BITS_PER_LONG with 32
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
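Sketched on one of the interrupt registers (hedged; the register macro
is illustrative):

  /* BIT_WORD()/BIT_MASK() divide by BITS_PER_LONG, which is 64 on
   * 64-bit ARM, so they index past the intended 32-bit register: */
  host1x_sync_writel(host, BIT_MASK(id),
                     HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(BIT_WORD(id)));

  /* open-coded 32-bit math is correct regardless of word size: */
  host1x_sync_writel(host, BIT(id % 32),
                     HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(id / 32));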
|
|
Pinning a Host1x BO currently cannot fail, and zero is a valid address
for a BO when the IOMMU is enabled. To avoid false errors, remove checks
for NULL BO physical addresses.
Fixes: 404bfb78daf3 ("gpu: host1x: Add IOMMU support")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The I2C slave controller must be powered and active the whole time an
I2C slave backend is registered in order to let the master address and
communicate with us.
Now if the controller is runtime PM capable, it will be suspended after
probe and can never respond to the master or generate interrupts.
Fix this by resuming the controller when an I2C slave backend is
registered and letting it suspend after unregistering.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
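A hedged sketch of the fix in the designware slave hooks (structure
simplified; the setup call is hypothetical):

  static int i2c_dw_reg_slave(struct i2c_client *slave)
  {
          struct dw_i2c_dev *dev = i2c_get_adapdata(slave->adapter);

          /* keep the controller powered while a backend is registered */
          pm_runtime_get_sync(dev->dev);

          dev->slave = slave;
          i2c_dw_configure_slave_mode(dev);       /* hypothetical */
          return 0;
  }

  static int i2c_dw_unreg_slave(struct i2c_client *slave)
  {
          struct dw_i2c_dev *dev = i2c_get_adapdata(slave->adapter);

          dev->slave = NULL;

          /* allow runtime suspend again */
          pm_runtime_put(dev->dev);
          return 0;
  }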
|
|
I guess the pm_runtime_put_noidle() call in i2c_dw_probe_slave() was
copied by accident from similar master mode adapter registration code.
It is unbalanced due to a missing pm_runtime_get_noresume(), but
harmless since it doesn't decrease dev->power.usage_count below zero.
In theory we can hit a similar needless runtime suspend/resume cycle
during slave mode adapter registration that was happening when
registering the master mode adapter. See commit cd998ded5c12 ("i2c:
designware: Prevent runtime suspend during adapter registration").
However, since we are a slave, we can consider it a misconfiguration if
other slaves are attached under this adapter, and can omit the
pm_runtime_get_noresume()/pm_runtime_put_noidle() calls for simplicity.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
C-Media devices (at least some models) mute the playback stream when
volumes are set to the minimum value. But this isn't communicated via
TLV, so user-space, typically PulseAudio, gets confused and believes
the stream is still playing at a low volume.
This patch adds a new flag, min_mute, to struct usb_mixer_elem_info
to indicate that the mixer element mutes at the minimum volume.
This flag is set for known C-Media devices in
snd_usb_mixer_fu_apply_quirk() in turn.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196669
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
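A hedged sketch of the quirk side (the USB ID and the check are
placeholders, not taken from the patch):

  /* in snd_usb_mixer_fu_apply_quirk() */
  if (mixer->chip->usb_id == USB_ID(0x0d8c, 0x0103))  /* placeholder */
          cval->min_mute = 1;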
|
|
Merge branches 'doc.2017.08.17a', 'fixes.2017.08.17a', 'hotplug.2017.07.25b', 'misc.2017.08.17a', 'spin_unlock_wait_no.2017.08.17a', 'srcu.2017.07.27c' and 'torture.2017.07.24c' into HEAD
doc.2017.08.17a: Documentation updates.
fixes.2017.08.17a: RCU fixes.
hotplug.2017.07.25b: CPU-hotplug updates.
misc.2017.08.17a: Miscellaneous fixes outside of RCU (give or take conflicts).
spin_unlock_wait_no.2017.08.17a: Remove spin_unlock_wait().
srcu.2017.07.27c: SRCU updates.
torture.2017.07.24c: Torture-test updates.
|
|
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair. This commit therefore removes the underlying arch-specific
arch_spin_unlock_wait() for all architectures providing it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
|
|
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair. This commit therefore removes spin_unlock_wait() and related
definitions from core code.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair. This commit therefore eliminates the spin_unlock_wait() call and
associated else-clause and hoists the then-clause's lock and unlock out of
the "if" statement. This should be safe from a performance perspective
because according to Tejun there should be few if any drivers that don't
set their own error handler.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: <linux-ide@vger.kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair. This commit therefore replaces the spin_unlock_wait() call in
exit_sem() with spin_lock() followed immediately by spin_unlock().
This should be safe from a performance perspective because exit_sem()
is rarely invoked in production.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
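The replacement pattern, concretely (the lock shown is the one
exit_sem() operates on, to the best of my understanding):

  /* before: wait until no CPU holds the lock */
  spin_unlock_wait(&sma->sem_perm.lock);

  /* after: acquire and immediately release, which gives well-defined
   * ordering against the preceding critical section */
  spin_lock(&sma->sem_perm.lock);
  spin_unlock(&sma->sem_perm.lock);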
|
|
There is no agreed-upon definition of spin_unlock_wait()'s semantics, and
it appears that all callers could do just as well with a lock/unlock pair.
This commit therefore replaces the spin_unlock_wait() call in do_exit()
with spin_lock() followed immediately by spin_unlock(). This should be
safe from a performance perspective because the lock is a per-task lock,
and this is happening only at task-exit time.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair. This commit therefore replaces the spin_unlock_wait() call in
completion_done() with spin_lock() followed immediately by spin_unlock().
This should be safe from a performance perspective because the lock
will be held only briefly, while the wakeup happens really quickly.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
This commit documents the situations in which RCU needs the
scheduling-clock interrupt to be enabled, along with the consequences
of failing to meet RCU's needs in this area.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
There are too many ways for the compiler to optimize (that is, break)
dependencies carried via integer values, so it is now permissible to
carry dependencies only via pointers. This commit catches up some of
the documentation on this point.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
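For example (a minimal sketch): a dependency carried via a pointer is
honored, while one laundered through an integer may be broken:

  /* pointer-carried dependency: the dereference is ordered after
   * the rcu_dereference() load */
  struct foo *p = rcu_dereference(gp);
  if (p)
          r1 = p->a;

  /* integer-carried dependency: the compiler may, e.g., prove the
   * index takes a single value and substitute a constant, breaking
   * the ordering */
  int i = READ_ONCE(gidx);
  r2 = arr[i];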
|
|
Suggested-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The memory-barriers.txt document contains an obsolete passage stating that
smp_read_barrier_depends() is required to force ordering for read-to-write
dependencies. We now know that this is not required, even for DEC Alpha.
This commit therefore updates this passage to state that read-to-write
dependencies are respected even without smp_read_barrier_depends().
Reported-by: Lance Roy <ldr709@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Luc Maranget <luc.maranget@inria.fr>
[ paulmck: Reference control-dependencies sections and use WRITE_ONCE()
per Will Deacon. Correctly place split-cache paragraph while there. ]
Acked-by: Will Deacon <will.deacon@arm.com>
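The updated guarantee in code form (a minimal example, using
WRITE_ONCE() as the revised passage recommends):

  /* the store to *q depends on the value loaded from *p; even on
   * DEC Alpha the store cannot be reordered before the load, with
   * no smp_read_barrier_depends() required */
  r1 = READ_ONCE(*p);
  WRITE_ONCE(*q, r1);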
|
|
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED with IPIs, using a cpumask
built from all runqueues whose current thread's mm matches that of the
thread calling sys_membarrier. It executes faster than the non-expedited
variant (no blocking). It also works on NOHZ_FULL configurations.
Scheduler-wise, it requires a memory barrier before and after context
switching between processes (which have different mm). The memory
barrier before context switch is already present. For the barrier after
context switch:
* Our TSO archs can do RELEASE without being a full barrier. Look at
x86 spin_unlock() being a regular STORE for example. But for those
archs, all atomics imply smp_mb and all of them have atomic ops in
switch_mm() for mm_cpumask(), and on x86 the CR3 load acts as a full
barrier.
* Of all the weakly ordered machines, only ARM64 and PPC can do RELEASE;
the rest do indeed use smp_mb(), so there the spin_unlock() is a full
barrier and we're good.
* ARM64 has a very heavy barrier in switch_to(), which suffices.
* PPC just removed its barrier from switch_to(), but appears to be
talking about adding something to switch_mm(). So add a
smp_mb__after_unlock_lock() for now, until this is settled on the PPC
side.
Changes since v3:
- Properly document the memory barriers provided by each architecture.
Changes since v2:
- Address comments from Peter Zijlstra,
- Add smp_mb__after_unlock_lock() after finish_lock_switch() in
finish_task_switch() to add the memory barrier we need after storing
to rq->curr. This is much simpler than the previous approach relying
on atomic_dec_and_test() in mmdrop(), which actually added a memory
barrier in the common case of switching between userspace processes.
- Return -EINVAL when MEMBARRIER_CMD_SHARED is used on a nohz_full
kernel, rather than having the whole membarrier system call returning
-ENOSYS. Indeed, CMD_PRIVATE_EXPEDITED is compatible with nohz_full.
Adapt the CMD_QUERY mask accordingly.
Changes since v1:
- move membarrier code under kernel/sched/ because it uses the
scheduler runqueue,
- only add the barrier when we switch from a kernel thread. The case
where we switch from a user-space thread is already handled by
the atomic_dec_and_test() in mmdrop().
- add a comment to mmdrop() documenting the requirement on the implicit
memory barrier.
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Andrew Hunter <ahh@google.com>
CC: Maged Michael <maged.michael@gmail.com>
CC: gromer@google.com
CC: Avi Kivity <avi@scylladb.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Dave Watson <davejwatson@fb.com>
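A hedged user-space usage sketch (note that later kernels additionally
require a prior MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED registration
before the expedited command may be used):

  #include <linux/membarrier.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int membarrier(int cmd, int flags)
  {
          return syscall(__NR_membarrier, cmd, flags);
  }

  int main(void)
  {
          /* IPIs only the CPUs currently running threads of this
           * process, then returns */
          return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
  }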
|
|
The rcu_idle_exit() and rcu_idle_enter() functions are exported because
they were originally used by RCU_NONIDLE(), which was intended to
be usable from modules. However, RCU_NONIDLE() now instead uses
rcu_irq_enter_irqson() and rcu_irq_exit_irqson(), which are not
exported, and there have been no complaints.
This commit therefore removes the exports from rcu_idle_exit() and
rcu_idle_enter().
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
All current callers of rcu_idle_enter() have irqs disabled, and
rcu_idle_enter() relies on this, but doesn't check. This commit
therefore adds an RCU_LOCKDEP_WARN() to verify that trust.
While we are there, pass "true" rather than "1" to rcu_eqs_enter().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
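A hedged sketch of the result:

  void rcu_idle_enter(void)
  {
          /* verify the callers' promise rather than merely trust it */
          RCU_LOCKDEP_WARN(!irqs_disabled(),
                           "rcu_idle_enter() invoked with irqs enabled!");
          rcu_eqs_enter(true);
  }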
|
|
All callers to rcu_idle_enter() have irqs disabled, so there is no
point in rcu_idle_enter() disabling them again. This commit therefore
replaces the irq disabling with a RCU_LOCKDEP_WARN().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This commit adds assertions verifying the consistency of the rcu_node
structure's ->blkd_tasks list and its ->gp_tasks, ->exp_tasks, and
->boost_tasks pointers. In particular, the ->blkd_tasks lists must be
empty except for leaf rcu_node structures.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Set disable_rcu_irq_enter not only in rcu_eqs_enter_common() but also
in rcu_eqs_exit(), since rcu_eqs_exit() suffers from the same issue as was
fixed for rcu_eqs_enter_common() by commit 03ecd3f48e57 ("rcu/tracing:
Add rcu_disabled to denote when rcu_irq_enter() will not work").
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The _rcu_barrier_trace() function is a wrapper for trace_rcu_barrier(),
which needs TPS() protection for strings passed through the second
argument. However, it has escaped prior TPS()-ification efforts because
_rcu_barrier_trace() does not start with "trace_". This commit
therefore adds the needed TPS() protection.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
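The change is of this shape (the call site is illustrative):

  /* before: bare string; crash-dump trace extraction yields garbage */
  _rcu_barrier_trace(rsp, "Begin", -1, rsp->barrier_sequence);

  /* after: TPS() lets the string be resolved from a dump */
  _rcu_barrier_trace(rsp, TPS("Begin"), -1, rsp->barrier_sequence);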
|
|
These RCU waits were set to use interruptible waits to avoid the kthreads
contributing to the system load average, even though they are not
interruptible, as they are spawned from a kthread. Use the new TASK_IDLE
swaits, which make our goal clear and remove confusion about these paths
possibly being interruptible -- they are not.
When the system is idle the RCU grace-period kthread will spend all its time
blocked inside the swait_event_interruptible(). If the interruptible() was
not used, then this kthread would contribute to the load average. This means
that an idle system would have a load average of 2 (or 3 if PREEMPT=y),
rather than the load average of 0 that almost fifty years of UNIX has
conditioned sysadmins to expect.
The same argument applies to swait_event_interruptible_timeout() use. The
RCU grace-period kthread spends its time blocked inside this call while
waiting for grace periods to complete. In particular, if there was only one
busy CPU, but that CPU was frequently invoking call_rcu(), then the RCU
grace-period kthread would spend almost all its time blocked inside the
swait_event_interruptible_timeout(). This would mean that the load average
would be 2 rather than the expected 1 for the single busy CPU.
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
There are cases where folks are using an interruptible swait when
using kthreads. This is rather confusing given you'd expect
interruptible waits to be -- interruptible, but kthreads are not
interruptible! The reason for such practice, though, is to avoid
having these kthreads contribute to the system load average.
When systems are idle some kthreads may spend a lot of time blocking if
using swait_event_timeout(). This would contribute to the system load
average. On systems without preemption this would mean the load average
of an idle system is bumped to 2 instead of 0. On systems with PREEMPT=y
this would mean the load average of an idle system is bumped to 3
instead of 0.
This adds a proper API using TASK_IDLE to make such goals explicit and
avoid confusion.
Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
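A hedged sketch of the new API in use (the RCU grace-period kthread
call sites are illustrative):

  /* blocks in TASK_IDLE: uninterruptible, yet not counted in the
   * load average */
  swait_event_idle(rsp->gp_wq,
                   READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_INIT);

  /* timeout variant used while waiting for the grace period */
  ret = swait_event_idle_timeout(rsp->gp_wq,
                                 rcu_gp_fqs_check_wake(rsp, &gf), j);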
|
|
There is currently event tracing to track when a task is preempted
within a preemptible RCU read-side critical section, and also when that
task subsequently reaches its outermost rcu_read_unlock(), but none
indicating when a new grace period starts when that grace period must
wait on pre-existing readers that have been preempted at least once
since the beginning of their current RCU read-side critical sections.
This commit therefore adds an event trace at grace-period start in
the case where there are such readers. Note that only the first
reader in the list is traced.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
This commit saves a few lines in kernel/rcu/rcu.h by moving to single-line
definitions for trivial functions, instead of the old style where the
two curly braces each get their own line.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Strings used in event tracing need to be specially handled, for example,
using the TPS() macro. Without the TPS() macro, although output looks
fine from within a running kernel, extracting traces from a crash dump
produces garbage instead of strings. This commit therefore adds the TPS()
macro to some unadorned strings that were passed to event-tracing macros.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|