summaryrefslogtreecommitdiff
path: root/kernel/rcu/rcu.h
AgeCommit message (Collapse)Author
2020-04-27rcu-tasks: Avoid IPIing userspace/idle tasks if kernel is so builtPaul E. McKenney
Systems running CPU-bound real-time task do not want IPIs sent to CPUs executing nohz_full userspace tasks. Battery-powered systems don't want IPIs sent to idle CPUs in low-power mode. Unfortunately, RCU tasks trace can and will send such IPIs in some cases. Both of these situations occur only when the target CPU is in RCU dyntick-idle mode, in other words, when RCU is not watching the target CPU. This suggests that CPUs in dyntick-idle mode should use memory barriers in outermost invocations of rcu_read_lock_trace() and rcu_read_unlock_trace(), which would allow the RCU tasks trace grace period to directly read out the target CPU's read-side state. One challenge is that RCU tasks trace is not targeting a specific CPU, but rather a task. And that task could switch from one CPU to another at any time. This commit therefore uses try_invoke_on_locked_down_task() and checks for task_curr() in trc_inspect_reader_notrunning(). When this condition holds, the target task is running and cannot move. If CONFIG_TASKS_TRACE_RCU_READ_MB=y, the new rcu_dynticks_zero_in_eqs() function can be used to check if the specified integer (in this case, t->trc_reader_nesting) is zero while the target CPU remains in that same dyntick-idle sojourn. If so, the target task is in a quiescent state. If not, trc_read_check_handler() must indicate failure so that the grace-period kthread can take appropriate action or retry after an appropriate delay, as the case may be. With this change, given CONFIG_TASKS_TRACE_RCU_READ_MB=y, if a given CPU remains idle or a given task continues executing in nohz_full mode, the RCU tasks trace grace-period kthread will detect this without the need to send an IPI. Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcu-tasks: Add RCU tasks to rcutorture writer stall outputPaul E. McKenney
This commit adds state for each RCU-tasks flavor to the rcutorture writer stall output. The initial state is minimal, but you have to start somewhere. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ paulmck: Fixes based on feedback from kbuild test robot. ]
2020-04-27rcutorture: Add torture tests for RCU Tasks TracePaul E. McKenney
This commit adds the definitions required to torture the tracing flavor of RCU tasks. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-04-27rcutorture: Add torture tests for RCU Tasks RudePaul E. McKenney
This commit adds the definitions required to torture the rude flavor of RCU tasks. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-03-21Merge branches 'doc.2020.02.27a', 'fixes.2020.03.21a', ↵Paul E. McKenney
'kfree_rcu.2020.02.20a', 'locktorture.2020.02.20a', 'ovld.2020.02.20a', 'rcu-tasks.2020.02.20a', 'srcu.2020.02.20a' and 'torture.2020.02.20a' into HEAD doc.2020.02.27a: Documentation updates. fixes.2020.03.21a: Miscellaneous fixes. kfree_rcu.2020.02.20a: Updates to kfree_rcu(). locktorture.2020.02.20a: Lock torture-test updates. ovld.2020.02.20a: Updates to callback-overload handling. rcu-tasks.2020.02.20a: RCU-tasks updates. srcu.2020.02.20a: SRCU updates. torture.2020.02.20a: Torture-test updates.
2020-02-20rcutorture: Allow boottime stall warnings to be suppressedPaul E. McKenney
In normal production, an RCU CPU stall warning at boottime is often just as bad as at any other time. In fact, given the desire for fast boot, any sort of long-term stall at boot is a bad idea. However, heavy rcutorture testing on large hyperthreaded systems can generate boottime RCU CPU stalls as a matter of course. This commit therefore provides a kernel boot parameter that suppresses reporting of boottime RCU CPU stall warnings and similarly of rcutorture writer stalls. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Warn on for_each_leaf_node_cpu_mask() from non-leafPaul E. McKenney
The for_each_leaf_node_cpu_mask() and for_each_leaf_node_possible_cpu() macros must be invoked only on leaf rcu_node structures. Failing to abide by this restriction can result in infinite loops on systems with more than 64 CPUs (or for more than 32 CPUs on 32-bit systems). This commit therefore adds WARN_ON_ONCE() calls to make misuse of these two macros easier to debug. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24Merge branches 'doc.2019.12.10a', 'exp.2019.12.09a', 'fixes.2020.01.24a', ↵Paul E. McKenney
'kfree_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEAD doc.2019.12.10a: Documentations updates exp.2019.12.09a: Expedited grace-period updates fixes.2020.01.24a: Miscellaneous fixes kfree_rcu.2020.01.24a: Batch kfree_rcu() work list.2020.01.10a: RCU-protected-list updates preempt.2020.01.24a: Preemptible RCU updates torture.2019.12.09a: Torture-test updates
2020-01-24rcu: Fix harmless omission of "CONFIG_" from #if conditionLai Jiangshan
The C preprocessor macros SRCU and TINY_RCU should instead be CONFIG_SRCU and CONFIG_TINY_RCU, respectively in the #f in kernel/rcu/rcu.h. But there is no harm when "TINY_RCU" is wrongly used, which are always non-defined, which makes "!defined(TINY_RCU)" always true, which means the code block is always included, and the included code block doesn't cause any compilation error so far in CONFIG_TINY_RCU builds. It is also the reason this change should not be taken in -stable. This commit adds the needed "CONFIG_" prefix to both macros. Not for -stable. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove kfree_rcu() special casing and lazy-callback handlingJoel Fernandes (Google)
This commit removes kfree_rcu() special-casing and the lazy-callback handling from Tree RCU. It moves some of this special casing to Tiny RCU, the removal of which will be the subject of later commits. This results in a nice negative delta. Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Add slab.h #include, thanks to kbuild test robot <lkp@intel.com>. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2019-12-09rcu: Make PREEMPT_RCU be a modifier to TREE_RCULai Jiangshan
Currently PREEMPT_RCU and TREE_RCU are mutually exclusive Kconfig options. But PREEMPT_RCU actually specifies a kind of TREE_RCU, namely a preemptible TREE_RCU. This commit therefore makes PREEMPT_RCU be a modifer to the TREE_RCU Kconfig option. This has the benefit of simplifying several of the #if expressions that formerly needed to check both, but now need only check one or the other. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2019-10-30Merge branches 'doc.2019.10.29a', 'fixes.2019.10.30a', 'nohz.2019.10.28a', ↵Paul E. McKenney
'replace.2019.10.30a', 'torture.2019.10.05a' and 'lkmm.2019.10.05a' into HEAD doc.2019.10.29a: RCU documentation updates. fixes.2019.10.30a: RCU miscellaneous fixes. nohz.2019.10.28a: RCU NO_HZ and NO_HZ_FULL updates. replace.2019.10.30a: Replace rcu_swap_protected() with rcu_replace(). torture.2019.10.05a: RCU torture-test updates. lkmm.2019.10.05a: Linux kernel memory model updates.
2019-10-30rcu: Suppress levelspread uninitialized messagesPaul E. McKenney
New tools bring new warnings, and with v5.3 comes: kernel/rcu/srcutree.c: warning: 'levelspread[<U aa0>]' may be used uninitialized in this function [-Wuninitialized]: => 121:34 This commit suppresses this warning by initializing the full array to INT_MIN, which will result in failures should any out-of-bounds references appear. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-10-05rcu: Remove unused function rcutorture_record_progress()Ethan Hansen
The function rcutorture_record_progress() is declared in rcu.h, but is never used. This commit therefore removes rcutorture_record_progress() to clean code. Signed-off-by: Ethan Hansen <1ethanhansen@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2019-08-01rcu: Add kernel parameter to dump trace after RCU CPU stall warningPaul E. McKenney
This commit adds a rcu_cpu_stall_ftrace_dump kernel boot parameter, that, when set, causes the trace buffer to be dumped after an RCU CPU stall warning is printed. This kernel boot parameter is disabled by default, maintaining compatibility with previous behavior. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-05-28rcutorture: Add trivial RCU implementationPaul E. McKenney
I have been showing off a trivial RCU implementation for non-preemptive environments for some time now: #define rcu_read_lock() #define rcu_read_unlock() #define rcu_dereference(p) READ_ONCE(p) #define rcu_assign_pointer(p, v) smp_store_release(&(p), (v)) void synchronize_rcu(void) { int cpu; for_each_online_cpu(cpu) sched_setaffinity(current->pid, cpumask_of(cpu)); } Trivial or not, as the old saying goes, "if it ain't tested, it don't work!". This commit therefore adds a "trivial" flavor to rcutorture and a corresponding TRIVIAL test scenario. This variant does not handle CPU hotplug, which is unconditionally enabled on x86 for post-v5.1-rc3 kernels, which is why the TRIVIAL.boot says "rcutorture.onoff_interval=0". This commit actually does handle CONFIG_PREEMPT=y kernels, but only because it turns back the Linux-kernel clock in order to provide these alternative definitions (or the moral equivalent thereof): #define rcu_read_lock() preempt_disable() #define rcu_read_unlock() preempt_enable() In CONFIG_PREEMPT=n kernels without debugging, these are equivalent to empty macros give or take a compiler barrier. However, the have been successfully tested with actual empty macros as well. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> [ paulmck: Fix symbol issue reported by kbuild test robot <lkp@intel.com>. ] [ paulmck: Work around sched_setaffinity() issue noted by Andrea Parri. ] [ paulmck: Add rcutorture.shuffle_interval=0 to TRIVIAL.boot to fix interaction with shuffler task noted by Peter Zijlstra. ] Tested-by: Andrea Parri <andrea.parri@amarulasolutions.com>
2019-05-15Merge tag 'trace-v5.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major changes in this tracing update includes: - Removal of non-DYNAMIC_FTRACE from 32bit x86 - Removal of mcount support from x86 - Emulating a call from int3 on x86_64, fixes live kernel patching - Consolidated Tracing Error logs file Minor updates: - Removal of klp_check_compiler_support() - kdb ftrace dumping output changes - Accessing and creating ftrace instances from inside the kernel - Clean up of #define if macro - Introduction of TRACE_EVENT_NOP() to disable trace events based on config options And other minor fixes and clean ups" * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits) x86: Hide the int3_emulate_call/jmp functions from UML livepatch: Remove klp_check_compiler_support() ftrace/x86: Remove mcount support ftrace/x86_32: Remove support for non DYNAMIC_FTRACE tracing: Simplify "if" macro code tracing: Fix documentation about disabling options using trace_options tracing: Replace kzalloc with kcalloc tracing: Fix partial reading of trace event's id file tracing: Allow RCU to run between postponed startup tests tracing: Fix white space issues in parse_pred() function tracing: Eliminate const char[] auto variables ring-buffer: Fix mispelling of Calculate tracing: probeevent: Fix to make the type of $comm string tracing: probeevent: Do not accumulate on ret variable tracing: uprobes: Re-enable $comm support for uprobe events ftrace/x86_64: Emulate call function while updating in breakpoint handler x86_64: Allow breakpoints to emulate call instructions x86_64: Add gap to int3 to allow for call emulation tracing: kdb: Allow ftdump to skip all but the last few entries tracing: Add trace_total_entries() / trace_total_entries_cpu() ...
2019-04-08rcu: validate arguments for rcu tracepointsYafang Shao
When CONFIG_RCU_TRACE is not set, all these tracepoints are defined as do-nothing macro. We'd better make those inline functions that take proper arguments. As RCU_TRACE() is defined as do-nothing marco as well when CONFIG_RCU_TRACE is not set, so we can clean it up. Link: http://lkml.kernel.org/r/1553602391-11926-4-git-send-email-laoar.shao@gmail.com Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-03-26rcu: Move RCU CPU stall-warning code out of update.cPaul E. McKenney
The RCU CPU stall-warning code for normal grace periods is currently scattered across three files, due to earlier Tiny RCU support for RCU CPU stall warnings and for old Kconfig options that have long since been retired. Given that it is hard for the lead RCU maintainer to find relevant stall-warning code, it would be good to consolidate it. This commit starts this process by moving stall-warning code from kernel/rcu/update.c to a new kernel/rcu/tree_stall.h file. Note that the definitions of rcu_cpu_stall_suppress and rcu_cpu_stall_timeout must remain in kernel/rcu/update.h to provide compatibility for kernel boot parameter lists. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-02-09Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', ↵Paul E. McKenney
'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD doc.2019.01.26a: Documentation updates. fixes.2019.01.26a: Miscellaneous fixes. sil.2019.01.26a: Removal of a few more spin_is_locked() instances. spdx.2019.02.09a: Add SPDX identifiers to RCU files srcu.2019.01.26a: SRCU updates. torture.2019.01.26a: Torture-test updates.
2019-02-09rcu/rcu.h: Convert to SPDX license identifierPaul E. McKenney
Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-01-25rcu: Fix obsolete DYNTICK_IRQ_NONIDLE commentPaul E. McKenney
This commit updates the DYNTICK_IRQ_NONIDLE header comment to remove the obsolete commentary about unmatched rcu_irq_{enter,exit}(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2019-01-25rcu: Eliminate RCU_BH_FLAVOR and RCU_SCHED_FLAVORPaul E. McKenney
Now that the RCU flavors have been consolidated, RCU_BH_FLAVOR and RCU_SCHED_FLAVOR are no longer used. This commit therefore saves a few lines by removing them. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01rcutorture: Dump grace-period diagnostics upon forward-progress OOMPaul E. McKenney
This commit adds an OOM notifier during rcutorture forward-progress testing. If this notifier is invoked, it dumps out some grace-period state to help debug the forward-progress problem. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-12-01rcutorture: Affinity forward-progress test to avoid housekeeping CPUsPaul E. McKenney
This commit affinities the forward-progress tests to avoid hogging a housekeeping CPU on the theory that the offloaded callbacks will be running on those housekeeping CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> [ paulmck: Fix NULL-pointer issue located by kbuild test robot. ] Tested-by: Rong Chen <rong.a.chen@intel.com>
2018-08-30Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' ↵Paul E. McKenney
and 'torture.2018.08.29a' into HEAD doc.2018.08.30a: Documentation updates dynticks.2018.08.30b: RCU flavor consolidation updates and cleanups srcu.2018.08.30b: SRCU updates torture.2018.08.29a: Torture-test updates
2018-08-30srcu: Make call_srcu() available during very early bootPaul E. McKenney
Event tracing is moving to SRCU in order to take advantage of the fact that SRCU may be safely used from idle and even offline CPUs. However, event tracing can invoke call_srcu() very early in the boot process, even before workqueue_init_early() is invoked (let alone rcu_init()). Therefore, call_srcu()'s attempts to queue work fail miserably. This commit therefore detects this situation, and refrains from attempting to queue work before rcu_init() time, but does everything else that it would have done, and in addition, adds the srcu_struct to a global list. The rcu_init() function now invokes a new srcu_init() function, which is empty if CONFIG_SRCU=n. Otherwise, srcu_init() queues work for each srcu_struct on the list. This all happens early enough in boot that there is but a single CPU with interrupts disabled, which allows synchronization to be dispensed with. Of course, the queued work won't actually be invoked until after workqueue_init() is invoked, which happens shortly after the scheduler is up and running. This means that although call_srcu() may be invoked any time after per-CPU variables have been set up, there is still a very narrow window when synchronize_srcu() won't work, and this window extends from the time that the scheduler starts until the time that workqueue_init() returns. This can be fixed in a manner similar to the fix for synchronize_rcu_expedited() and friends, but until someone actually needs to use synchronize_srcu() during this window, this fix is added churn for no benefit. Finally, note that Tree SRCU's new srcu_init() function invokes queue_work() rather than the queue_delayed_work() function that is invoked post-boot. The reason is that queue_delayed_work() will (as you would expect) post a timer, and timers have not yet been initialized. So use of queue_work() avoids the complaints about use of uninitialized spinlocks that would otherwise result. Besides, some delay is already provide by the aforementioned fact that the queued work won't actually be invoked until after the scheduler is up and running. Requested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-30rcu: Provide functions for determining if call_rcu() has been invokedPaul E. McKenney
This commit adds rcu_head_init() and rcu_head_after_call_rcu() functions to help RCU users detect when another CPU has passed the specified rcu_head structure and function to call_rcu(). The rcu_head_init() should be invoked before making the structure visible to RCU readers, and then the rcu_head_after_call_rcu() may be invoked from within an RCU read-side critical section on an rcu_head structure that was obtained during a traversal of the data structure in question. The rcu_head_after_call_rcu() function will return true if the rcu_head structure has already been passed (with the specified function) to call_rcu(), otherwise it will return false. If rcu_head_init() has not been invoked on the rcu_head structure or if the rcu_head (AKA callback) has already been invoked, then rcu_head_after_call_rcu() will do WARN_ON_ONCE(). Reported-by: NeilBrown <neilb@suse.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Apply neilb naming feedback. ]
2018-08-30rcu: Clean up flavor-related definitions and comments in rcu.hPaul E. McKenney
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30rcu: Remove now-unused rcutorture APIsPaul E. McKenney
This commit removes rcu_sched_get_gp_seq(), rcu_bh_get_gp_seq(), rcu_exp_batches_completed_sched(), rcu_sched_force_quiescent_state(), and rcu_bh_force_quiescent_state(), which are no longer used because rcutorture no longer does "rcu_bh" and "rcu_sched" torture types. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30rcu: Remove rsp parameter from rcu_node tree accessor macrosPaul E. McKenney
There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's rcu_node tree's accessor macros. This commit therefore removes the rsp parameter from those macros in kernel/rcu/rcu.h, and removes some now-unused rsp local variables while in the area. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12Merge branches 'fixes1.2018.07.12b' and 'torture1.2018.07.12b' into HEADPaul E. McKenney
fixes1.2018.07.12b: Post-gp_seq miscellaneous fixes torture1.2018.07.12b: Post-gp_seq torture-test updates
2018-07-12rcutorture: Add support to detect if boost kthread prio is too lowJoel Fernandes (Google)
When rcutorture is built in to the kernel, an earlier patch detects that and raises the priority of RCU's kthreads to allow rcutorture's RCU priority boosting tests to succeed. However, if rcutorture is built as a module, those priorities must be raised manually via the rcutree.kthread_prio kernel boot parameter. If this manual step is not taken, rcutorture's RCU priority boosting tests will fail due to kthread starvation. One approach would be to raise the default priority, but that risks breaking existing users. Another approach would be to allow runtime adjustment of RCU's kthread priorities, but that introduces numerous "interesting" race conditions. This patch therefore instead detects too-low priorities, and prints a message and disables the RCU priority boosting tests in that case. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12rcu: Remove rcutorture test version and sequence numberPaul E. McKenney
Back when RCU had a debugfs interface, there was a test version and sequence number that allowed associating debugfs data with a particular test run, where the test run started with modprobe and ended with rmmod, which was how tests were run back on the old ABAT system within IBM. But rcutorture testing no longer runs on ABAT, and there is no longer an RCU debugfs interface, so there is no longer any need for test versions and sequence numbers. This commit therefore removes the rcutorture_record_test_transition() and rcutorture_record_progress() functions, and along with them the rcutorture_testseq and rcutorture_vernum variables that they update. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12rcu: Make rcu_seq_diff() more exactPaul E. McKenney
The current implementatation of rcu_seq_diff() follows tradition in providing a rough-and-ready approximation of the number of elapsed grace periods between the two rcu_seq values. However, this difference is used to flag RCU-failure "near misses", which can be a valuable debugging aid, so more exactitude would be an improvement. This commit therefore improves the accuracy of rcu_seq_diff(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12rcu: Add comment documenting how rcu_seq_snap worksJoel Fernandes (Google)
rcu_seq_snap may be tricky to decipher. Lets document how it works with an example to make it easier. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Shrink comment as suggested by Peter Zijlstra. ]
2018-07-12rcutorture: Correctly handle grace-period sequence wrapPaul E. McKenney
The new ->gq_seq grace-period sequence numbers must be shifted down, which give artifacts when these numbers wrap. This commit therefore enables rcutorture and rcuperf to handle grace-period sequence numbers even if they do wrap. It does this by allowing a special subtraction function to be specified, and this function subtracts before shifting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12rcu: Make rcu_start_this_gp() check for grace period already startedPaul E. McKenney
In the old days of ->gpnum and ->completed, the code requesting a new grace period checked to see if that grace period had already started, bailing early if so. The new-age ->gp_seq approach instead checks whether the grace period has already finished. A compensating change pushed the requested grace period down to the bottom of the tree, thus reducing lock contention and even eliminating it in some cases. But why not further reduce contention, especially on large systems, by doing both, especially given that the cost of doing both is extremely small? This commit therefore adds a new rcu_seq_started() function that checks whether a specified grace period has already started. It then uses this new function in place of rcu_seq_done() in the rcu_start_this_gp() function's funnel locking code. Reported-by: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12rcutorture: Convert rcutorture_get_gp_data() to ->gp_seqPaul E. McKenney
SRCU has long used ->srcu_gp_seq, and now RCU uses ->gp_seq. This commit therefore moves the rcutorture_get_gp_data() function from a ->gpnum / ->completed pair to ->gp_seq. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12rcu: Move RCU's grace-period-change code to ->gp_seqPaul E. McKenney
This commit moves __note_gp_changes(), note_gp_changes(), and __rcu_pending() to ->gp_seq, creating new rcu_seq_completed_gp() and rcu_seq_new_gp() functions for this purpose. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Reinstate "cpuend: trace as suggested by Joel Fernandes. ]
2018-07-12rcu: Make rcutorture's batches-completed API use ->gp_seqPaul E. McKenney
The rcutorture test invokes rcu_batches_started(), rcu_batches_completed(), rcu_batches_started_bh(), rcu_batches_completed_bh(), rcu_batches_started_sched(), and rcu_batches_completed_sched() to do grace-period consistency checks, and rcuperf uses the _completed variants for statistics. These functions use ->gpnum and ->completed. This commit therefore replaces them with rcu_get_gp_seq(), rcu_bh_get_gp_seq(), and rcu_sched_get_gp_seq(), adjusting rcutorture and rcuperf to make use of them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-06-25rcu: Make expedited grace period use direct call on last leafPaul E. McKenney
During expedited grace-period initialization, a work item is scheduled for each leaf rcu_node structure. However, that initialization code is itself (normally) executing from a workqueue, so one of the leaf rcu_node structures could just as well be handled by that pre-existing workqueue, and with less overhead. This commit therefore uses a shiny new rcu_is_leaf_node() macro to execute the last leaf rcu_node structure's initialization directly from the pre-existing workqueue. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-05-15Merge branches 'exp.2018.05.15a', 'fixes.2018.05.15a', 'lock.2018.05.15a' ↵Paul E. McKenney
and 'torture.2018.05.15a' into HEAD exp.2018.05.15a: Parallelize expedited grace-period initialization. fixes.2018.05.15a: Miscellaneous fixes. lock.2018.05.15a: Decrease lock contention on root rcu_node structure, which is a step towards merging RCU flavors. torture.2018.05.15a: Torture-test updates.
2018-05-15rcu: Add leaf-node macrosPaul E. McKenney
This commit adds rcu_first_leaf_node() that returns a pointer to the first leaf rcu_node structure in the specified RCU flavor and an rcu_is_leaf_node() that returns true iff the specified rcu_node structure is a leaf. This commit also uses these macros where appropriate. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Nicholas Piggin <npiggin@gmail.com>
2018-05-15rcu: Parallelize expedited grace-period initializationPaul E. McKenney
The latency of RCU expedited grace periods grows with increasing numbers of CPUs, eventually failing to be all that expedited. Much of the growth in latency is in the initialization phase, so this commit uses workqueues to carry out this initialization concurrently on a rcu_node-by-rcu_node basis. This change makes use of a new rcu_par_gp_wq because flushing a work item from another work item running from the same workqueue can result in deadlock. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Nicholas Piggin <npiggin@gmail.com>
2018-02-23rcu: Create RCU-specific workqueues with rescuersPaul E. McKenney
RCU's expedited grace periods can participate in out-of-memory deadlocks due to all available system_wq kthreads being blocked and there not being memory available to create more. This commit prevents such deadlocks by allocating an RCU-specific workqueue_struct at early boot time, and providing it with a rescuer to ensure forward progress. This uses the shiny new init_rescuer() function provided by Tejun (but indirectly). This commit also causes SRCU to use this new RCU-specific workqueue_struct. Note that SRCU's use of workqueues never blocks them waiting for readers, so this should be safe from a forward-progress viewpoint. Note that this moves SRCU from system_power_efficient_wq to a normal workqueue. In the unlikely event that this results in measurable degradation, a separate power-efficient workqueue will be creates for SRCU. Reported-by: Prateek Sood <prsood@codeaurora.org> Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Tejun Heo <tj@kernel.org>
2018-02-20rcu: Make expedited RCU CPU selection avoid unnecessary storesPaul E. McKenney
This commit reworks the first loop in sync_rcu_exp_select_cpus() to avoid doing unnecssary stores to other CPUs' rcu_data structures. This speeds up that first loop by roughly a factor of two on an old x86 system. In the case where the system is mostly idle, this loop incurs a large fraction of the overhead of the synchronize_rcu_expedited(). There is less benefit on busy systems because the overhead of the smp_call_function_single() in the second loop dominates in that case. However, it is not unusual to do configuration chances involving RCU grace periods (both expedited and normal) while the system is mostly idle, so this optimization is worth doing. While we are in the area, this commit also adds parentheses to arguments used by the for_each_leaf_node_possible_cpu() macro. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20rcu: Add more tracing of expedited grace periodsPaul E. McKenney
This commit adds more tracing of expedited grace periods to enable improved debugging of slowdowns. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20rcu: Use wrapper for lockdep assertsMatthew Wilcox
Commits c0b334c5bfa9 and ea9b0c8a26a2 introduced new sparse warnings by accessing rcu_node->lock directly and ignoring the __private marker. Introduce a new wrapper and use it. Also fix a similar problem in srcutree.c introduced by a3883df3935e. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20rcu: Consolidate rcu.h #ifdefsPaul E. McKenney
The kernel/rcu/rcu.h file has a pair of consecutive #ifdefs on CONFIG_TINY_RCU, so this commit consolidates them, thus saving a few lines of code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>