summaryrefslogtreecommitdiff
path: root/kernel/locking/percpu-rwsem.c
AgeCommit message (Collapse)Author
2017-01-14locking/percpu-rwsem: Replace waitqueue with rcuwaitDavidlohr Bueso
The use of any kind of wait queue is an overkill for pcpu-rwsems. While one option would be to use the less heavy simple (swait) flavor, this is still too much for what pcpu-rwsems needs. For one, we do not care about any sort of queuing in that the only (rare) time writers (and readers, for that matter) are queued is when trying to acquire the regular contended rw_sem. There cannot be any further queuing as writers are serialized by the rw_sem in the first place. Given that percpu_down_write() must not be called after exit_notify(), we can replace the bulky waitqueue with rcuwait such that a writer can wait for its turn to take the lock. As such, we can avoid the queue handling and locking overhead. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1484148146-14210-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10locking/percpu-rwsem: Optimize readers and reduce global impactPeter Zijlstra
Currently the percpu-rwsem switches to (global) atomic ops while a writer is waiting; which could be quite a while and slows down releasing the readers. This patch cures this problem by ordering the reader-state vs reader-count (see the comments in __percpu_down_read() and percpu_down_write()). This changes a global atomic op into a full memory barrier, which doesn't have the global cacheline contention. This also enables using the percpu-rwsem with rcu_sync disabled in order to bias the implementation differently, reducing the writer latency by adding some cost to readers. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org [ Fixed modular build. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-25ext4: fix races between changing inode journal mode and ext4_writepagesDaeho Jeong
In ext4, there is a race condition between changing inode journal mode and ext4_writepages(). While ext4_writepages() is executed on a non-journalled mode inode, the inode's journal mode could be enabled by ioctl() and then, some pages dirtied after switching the journal mode will be still exposed to ext4_writepages() in non-journaled mode. To resolve this problem, we use fs-wide per-cpu rw semaphore by Jan Kara's suggestion because we don't want to waste ext4_inode_info's space for this extra rare case. Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2015-10-06locking/percpu-rwsem: Clean up the lockdep annotations in percpu_down_read()Oleg Nesterov
Based on Peter Zijlstra's earlier patch. Change percpu_down_read() to use __down_read(), this way we can do rwsem_acquire_read() unconditionally at the start to make this code more symmetric and clean. Originally-From: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06locking/percpu-rwsem: Fix the comments outdated by rcu_syncOleg Nesterov
Update the comments broken by the previous change. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06locking/percpu-rwsem: Make use of the rcu_sync infrastructureOleg Nesterov
Currently down_write/up_write calls synchronize_sched_expedited() twice, which is evil. Change this code to rely on rcu-sync primitives. This avoids the _expedited "big hammer", and this can be faster in the contended case or even in the case when a single thread does down_write/up_write in a loop. Of course, a single down_write() will take more time, but otoh it will be much more friendly to the whole system. To simplify the review this patch doesn't update the comments, fixed by the next change. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06locking/percpu-rwsem: Make percpu_free_rwsem() after kzalloc() safeOleg Nesterov
This is the temporary ugly hack which will be reverted later. We only need it to ensure that the next patch will not break "change sb_writers to use percpu_rw_semaphore" patches routed via the VFS tree. The alloc_super()->destroy_super() error path assumes that it is safe to call percpu_free_rwsem() after kzalloc() without percpu_init_rwsem(), so let's not disappoint it. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06locking/percpu-rwsem: Export symbols for locktorturePaul E. McKenney
This commit exports percpu_down_read(), percpu_down_write(), __percpu_init_rwsem(), percpu_up_read(), and percpu_up_write() to allow locktorture to test them when built as a module. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-08-15percpu-rwsem: introduce percpu_down_read_trylock()Oleg Nesterov
Add percpu_down_read_trylock(), it will have the user soon. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06locking: Move the percpu-rwsem code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-52bjmtty46we26hbfd9sc9iy@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>