summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-03-21Merge branches 'doc.2020.02.27a', 'fixes.2020.03.21a', ↵Paul E. McKenney
'kfree_rcu.2020.02.20a', 'locktorture.2020.02.20a', 'ovld.2020.02.20a', 'rcu-tasks.2020.02.20a', 'srcu.2020.02.20a' and 'torture.2020.02.20a' into HEAD doc.2020.02.27a: Documentation updates. fixes.2020.03.21a: Miscellaneous fixes. kfree_rcu.2020.02.20a: Updates to kfree_rcu(). locktorture.2020.02.20a: Lock torture-test updates. ovld.2020.02.20a: Updates to callback-overload handling. rcu-tasks.2020.02.20a: RCU-tasks updates. srcu.2020.02.20a: SRCU updates. torture.2020.02.20a: Torture-test updates.
2020-03-21rcu: Make rcu_barrier() account for offline no-CBs CPUsPaul E. McKenney
Currently, rcu_barrier() ignores offline CPUs, However, it is possible for an offline no-CBs CPU to have callbacks queued, and rcu_barrier() must wait for those callbacks. This commit therefore makes rcu_barrier() directly invoke the rcu_barrier_func() with interrupts disabled for such CPUs. This requires passing the CPU number into this function so that it can entrain the rcu_barrier() callback onto the correct CPU's callback list, given that the code must instead execute on the current CPU. While in the area, this commit fixes a bug where the first CPU's callback might have been invoked before rcu_segcblist_entrain() returned, which would also result in an early wakeup. Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ paulmck: Apply optimization feedback from Boqun Feng. ] Cc: <stable@vger.kernel.org> # 5.5.x
2020-03-21rcu: Mark rcu_state.gp_seq to detect concurrent writesPaul E. McKenney
The rcu_state structure's gp_seq field is only to be modified by the RCU grace-period kthread, which is single-threaded. This commit therefore enlists KCSAN's help in enforcing this restriction. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27Documentation/memory-barriers: Fix typosSeongJae Park
Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc: Add rcutorture scripting to torture.txtPaul E. McKenney
For testing mainline, the kvm.sh rcutorture script is the preferred approach to testing. This commit therefore adds it to the torture.txt documentation. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc/RCU/rcu: Use https instead of http if possibleSeongJae Park
Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc/RCU/rcu: Use absolute paths for non-rst filesSeongJae Park
Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc/RCU/rcu: Use ':ref:' for links to other docsSeongJae Park
Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc/RCU/listRCU: Update example function nameSeongJae Park
listRCU.rst document gives an example with 'ipc_lock()', but the function has dropped off by commit 82061c57ce93 ("ipc: drop ipc_lock()"). Because the main logic of 'ipc_lock()' has melded in 'shm_lock()' by the commit, this commit updates the document to use 'shm_lock()' instead. Reviewed-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc/RCU/listRCU: Fix typos in a example code snippetsSeongJae Park
Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc/RCU/Design: Remove remaining HTML tags in ReST filesSeongJae Park
Commit ccc9971e2147 ("docs: rcu: convert some articles from html to ReST") has converted a few of html RCU docs into ReST files, but a few of html tags which not supported on rst is remaining. This commit converts those to ReST appropriate alternatives. Reviewed-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-27doc: Add some more RCU list patterns in the kernelJoel Fernandes (Google)
- Add more information about RCU list patterns taking examples from audit subsystem in the linux kernel. - Keep the current audit examples, even though the kernel has changed. - Modify inline text for better passage quality. - Fix typo in code-blocks and improve code comments. - Add text formatting (italics, bold and code) for better emphasis. Patch originally submitted at https://lore.kernel.org/patchwork/patch/1082804/ Co-developed-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Set KCSAN Kconfig options to detect more data racesPaul E. McKenney
This commit enables the KCSAN Kconfig options that (1) detect data races between reads and writes even when the writes do not change the variable's value and (2) detect data races involving plain C-language writes. These changes only affect scripted rcutorture runs and can be overridden using the kvm.sh --kconfig argument. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Manually clean up after rcu_barrier() failurePaul E. McKenney
Currently, if rcu_barrier() returns too soon, the test waits 100ms and then does another instance of the test. However, if rcu_barrier() were to have waited for more than 100ms too short a time, this could cause the test's rcu_head structures to be reused while they were still on RCU's callback lists. This can result in knock-on errors that obscure the original rcu_barrier() test failure. This commit therefore adds code that attempts to wait until all of the test's callbacks have been invoked. Of course, if RCU completely lost track of the corresponding rcu_head structures, this wait could be forever. This commit therefore also complains if this attempted recovery takes more than one second, and it also gives up when the test ends. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Make rcu_torture_barrier_cbs() post from corresponding CPUPaul E. McKenney
Currently, rcu_torture_barrier_cbs() posts callbacks from whatever CPU it is running on, which means that all these kthreads might well be posting from the same CPU, which would drastically reduce the effectiveness of this test. This commit therefore uses IPIs to make the callbacks be posted from the corresponding CPU (given by local variable myid). If the IPI fails (which can happen if the target CPU is offline or does not exist at all), the callback is posted on whatever CPU is currently running. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcuperf: Measure memory footprint during kfree_rcu() testJoel Fernandes (Google)
During changes to kfree_rcu() code, we often check the amount of free memory. As an alternative to checking this manually, this commit adds a measurement in the test itself. It measures four times during the test for available memory, digitally filters these measurements to produce a running average with a weight of 0.5, and compares this digitally filtered value with the amount of available memory at the beginning of the test. Something like the following is printed at the end of the run: Total time taken by all kfree'ers: 6369738407 ns, loops: 10000, batches: 764, memory footprint: 216MB Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Annotation lockless accesses to rcu_torture_currentPaul E. McKenney
The rcutorture global variable rcu_torture_current is accessed locklessly, so it must use the RCU pointer load/store primitives. This commit therefore adds several that were missed. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely and due to this being used only by rcutorture. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Add READ_ONCE() to rcu_torture_count and rcu_torture_batchPaul E. McKenney
The rcutorture rcu_torture_count and rcu_torture_batch per-CPU variables are read locklessly, so this commit adds the READ_ONCE() to a load in order to avoid various types of compiler vandalism^Woptimization. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely and due to this being rcutorture. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Fix stray access to rcu_fwd_cb_nodelayPaul E. McKenney
The rcu_fwd_cb_nodelay variable suppresses excessively long read-side delays while carrying out an rcutorture forward-progress test. As such, it is accessed both by readers and updaters, and most of the accesses therefore use *_ONCE(). Except for one in rcu_read_delay(), which this commit fixes. This data race was reported by KCSAN. Not appropriate for backporting due to this being rcutorture. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Fix rcu_torture_one_read()/rcu_torture_writer() data racePaul E. McKenney
The ->rtort_pipe_count field in the rcu_torture structure checks for too-short grace periods, and is therefore read by rcutorture's readers while being updated by rcutorture's writers. This commit therefore adds the needed READ_ONCE() and WRITE_ONCE() invocations. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely and due to this being rcutorture. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Make kvm-find-errors.sh abort on bad directoryPaul E. McKenney
Currently, kvm-find-errors.sh gives a usage prompt when given a bad directory, but then soldiers on, giving a series of confusing error messages. This commit therefore prints an error message and exits when given a bad directory, hopefully reducing confusion. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Summarize summary of build and run resultsPaul E. McKenney
When running the default list of tests, the run summary of a successful (that is, failed to find any errors) run fits easily on a 24-line screen. But a run with something like "--configs '5*CFLIST'" will be 80 lines long, and it is all too easy to miss a failure message when scrolling back. This commit therefore prints out the number of runs with failing builds or runtime failures, but only if there are any such failures. For example, a run with a single build error and a single runtime error would print two lines like this: 1 runs with build errors. 1 runs with runtime errors. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Add 100-CPU configurationPaul E. McKenney
The small-system rcutorture configurations have served us well for a great many years, but it is now time to add a larger one. This commit does just that, but does not add it to the defaults in CFLIST. This allows the kvm.sh argument '--configs "4*CFLIST TREE10" to run four instances of each of the default configurations concurrently with one instance of the large configuration. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20torture: Allow disabling of boottime CPU-hotplug torture operationsPaul E. McKenney
In theory, RCU-hotplug operations are supposed to work as soon as there is more than one CPU online. However, in practice, in normal production there is no way to make them happen until userspace is up and running. Besides which, on smaller systems, rcutorture doesn't start doing hotplug operations until 30 seconds after the start of boot, which on most systems also means the better part of 30 seconds after the end of boot. This commit therefore provides a new torture.disable_onoff_at_boot kernel boot parameter that suppresses CPU-hotplug torture operations until about the time that init is spawned. Of course, if you know of a need for boottime CPU-hotplug operations, then you should avoid passing this argument to any of the torture tests. You might also want to look at the splats linked to below. Link: https://lore.kernel.org/lkml/20191206185208.GA25636@paulmck-ThinkPad-P72/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Suppress boottime bad-sequence warningsPaul E. McKenney
In normal production, an excessively long wait on a grace period (synchronize_rcu(), for example) at boottime is often just as bad as at any other time. In fact, given the desire for fast boot, any sort of long wait at boot is a bad idea. However, heavy rcutorture testing on large hyperthreaded systems can generate such long waits during boot as a matter of course. This commit therefore causes the rcupdate.rcu_cpu_stall_suppress_at_boot kernel boot parameter to suppress reporting of bootime bad-sequence warning due to excessively long grace-period waits. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Allow boottime stall warnings to be suppressedPaul E. McKenney
In normal production, an RCU CPU stall warning at boottime is often just as bad as at any other time. In fact, given the desire for fast boot, any sort of long-term stall at boot is a bad idea. However, heavy rcutorture testing on large hyperthreaded systems can generate boottime RCU CPU stalls as a matter of course. This commit therefore provides a kernel boot parameter that suppresses reporting of boottime RCU CPU stall warnings and similarly of rcutorture writer stalls. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20torture: Forgive -EBUSY from boottime CPU-hotplug operationsPaul E. McKenney
During boot, CPU hotplug is often disabled, for example by PCI probing. On large systems that take substantial time to boot, this can result in spurious RCU_HOTPLUG errors. This commit therefore forgives any boottime -EBUSY CPU-hotplug failures by adjusting counters to pretend that the corresponding attempt never happened. A non-splat record of the failed attempt is emitted to the console with the added string "(-EBUSY forgiven during boot)". Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Refrain from callback flooding during bootPaul E. McKenney
Additional rcutorture aggression can result in, believe it or not, boot times in excess of three minutes on large hyperthreaded systems. This is long enough for rcutorture to decide to do some callback flooding, which seems a bit excessive given that userspace cannot have started until long after boot, and it is userspace that does the real-world callback flooding. Worse yet, because Tiny RCU lacks forward-progress functionality, the looping-in-the-kernel tests can also be problematic during early boot. This commit therefore causes rcutorture to hold off on callback flooding until about the time that init is spawned, and the same for looping-in-the-kernel tests for Tiny RCU. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20torture: Make results-directory date format completion-friendlyPaul E. McKenney
The names of the per-test results directories are of the form 2019.11.29-20:42:19. This works, but the ":" characters make tab-based shell name completion a bit onerous because the user must remember to include a quote character somewhere before the first ":". This commit therefore changes the ":" characters to periods, as in 2019.12.01-20.48.01", which allows tab-based completion to work more naturally. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcutorture: Suppress forward-progress complaints during early bootPaul E. McKenney
Some larger systems can take in excess of 50 seconds to complete their early boot initcalls prior to spawing init. This does not in any way help the forward-progress judgments of built-in rcutorture (when rcutorture is built as a module, the insmod or modprobe command normally cannot happen until some time after boot completes). This commit therefore suppresses such complaints until about the time that init is spawned. This also includes a fix to a stupid error located by kbuild test robot. [ paulmck: Apply kbuild test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ paulmck: Fix to nohz_full slow-expediting recovery logic, per bpetkov. ] [ paulmck: Restrict splat to CONFIG_PREEMPT_RT=y kernels and simplify. ] Tested-by: Borislav Petkov <bp@alien8.de>
2020-02-20srcu: Hold srcu_struct ->lock when updating ->srcu_gp_seqPaul E. McKenney
A read of the srcu_struct structure's ->srcu_gp_seq field should not need READ_ONCE() when that structure's ->lock is held. Except that this lock is not always held when updating this field. This commit therefore acquires the lock around updates and removes a now-unneeded READ_ONCE(). This data race was reported by KCSAN. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ paulmck: Switch from READ_ONCE() to lock per Peter Zilstra question. ] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-02-20srcu: Fix process_srcu()/srcu_batches_completed() dataracePaul E. McKenney
The srcu_struct structure's ->srcu_idx field is accessed locklessly, so reads must use READ_ONCE(). This commit therefore adds the needed READ_ONCE() invocation where it was missed. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20srcu: Fix __call_srcu()/srcu_get_delay() dataracePaul E. McKenney
The srcu_struct structure's ->srcu_gp_seq_needed_exp field is accessed locklessly, so updates must use WRITE_ONCE(). This commit therefore adds the needed WRITE_ONCE() invocations. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20srcu: Fix __call_srcu()/process_srcu() dataracePaul E. McKenney
The srcu_node structure's ->srcu_gp_seq_needed_exp field is accessed locklessly, so updates must use WRITE_ONCE(). This commit therefore adds the needed WRITE_ONCE() invocations. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Add missing annotation for exit_tasks_rcu_finish()Jules Irenge
Sparse reports a warning at exit_tasks_rcu_finish(void) |warning: context imbalance in exit_tasks_rcu_finish() |- wrong count at exit To fix this, this commit adds a __releases(&tasks_rcu_exit_srcu). Given that exit_tasks_rcu_finish() does actually call __srcu_read_lock(), this not only fixes the warning but also improves on the readability of the code. Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2020-02-20rcu: Add missing annotation for exit_tasks_rcu_start()Jules Irenge
Sparse reports a warning at exit_tasks_rcu_start(void) |warning: context imbalance in exit_tasks_rcu_start() - wrong count at exit To fix this, this commit adds an __acquires(&tasks_rcu_exit_srcu). Given that exit_tasks_rcu_start() does actually call __srcu_read_lock(), this not only fixes the warning but also improves on the readability of the code. Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
2020-02-20rcu-tasks: *_ONCE() for rcu_tasks_cbs_headPaul E. McKenney
The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly by rcu_tasks_kthread() when waiting for work to do. This commit therefore applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the single potential store outside of rcu_tasks_kthread. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Update __call_rcu() commentsPaul E. McKenney
The __call_rcu() function's header comment refers to a cpu argument that no longer exists, and the comment of the return path from rcu_nocb_try_bypass() ignores the non-no-CBs CPU case. This commit therefore update both comments. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Fix spelling mistake "leval" -> "level"Colin Ian King
This commit fixes a spelling mistake in a pr_info() message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: React to callback overload by boosting RCU readersPaul E. McKenney
RCU priority boosting currently is not applied until the grace period is at least 250 milliseconds old (or the number of milliseconds specified by the CONFIG_RCU_BOOST_DELAY Kconfig option). Although this has worked well, it can result in OOM under conditions of RCU callback flooding. One can argue that the real-time systems using RCU priority boosting should carefully avoid RCU callback flooding, but one can just as well argue that an OOM is a rather obnoxious error message. This commit therefore disables the RCU priority boosting delay when there are excessive numbers of callbacks queued. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: React to callback overload by aggressively seeking quiescent statesPaul E. McKenney
In default configutions, RCU currently waits at least 100 milliseconds before asking cond_resched() and/or resched_rcu() for help seeking quiescent states to end a grace period. But 100 milliseconds can be one good long time during an RCU callback flood, for example, as can happen when user processes repeatedly open and close files in a tight loop. These 100-millisecond gaps in successive grace periods during a callback flood can result in excessive numbers of callbacks piling up, unnecessarily increasing memory footprint. This commit therefore asks cond_resched() and/or resched_rcu() for help as early as the first FQS scan when at least one of the CPUs has more than 20,000 callbacks queued, a number that can be changed using the new rcutree.qovld kernel boot parameter. An auxiliary qovld_calc variable is used to avoid acquisition of locks that have not yet been initialized. Early tests indicate that this reduces the RCU-callback memory footprint during rcutorture floods by from 50% to 4x, depending on configuration. Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reported-by: Tejun Heo <tj@kernel.org> [ paulmck: Fix bug located by Qian Cai. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Dexuan Cui <decui@microsoft.com> Tested-by: Qian Cai <cai@lca.pw>
2020-02-20rcu: Clear ->core_needs_qs at GP end or self-reported QSPaul E. McKenney
The rcu_data structure's ->core_needs_qs field does not necessarily get cleared in a timely fashion after the corresponding CPUs' quiescent state has been reported. From a functional viewpoint, no harm done, but this can result in excessive invocation of RCU core processing, as witnessed by the kernel test robot, which saw greatly increased softirq overhead. This commit therefore restores the rcu_report_qs_rdp() function's clearing of this field, but only when running on the corresponding CPU. Cases where some other CPU reports the quiescent state (for example, on behalf of an idle CPU) are handled by setting this field appropriately within the __note_gp_changes() function's end-of-grace-period checks. This handling is carried out regardless of whether the end of a grace period actually happened, thus handling the case where a CPU goes non-idle after a quiescent state is reported on its behalf, but before the grace period ends. This fix also avoids cross-CPU updates to ->core_needs_qs, While in the area, this commit changes the __note_gp_changes() need_gp variable's name to need_qs because it is a quiescent state that is needed from the CPU in question. Fixes: ed93dfc6bc00 ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20locktorture: Forgive apparent unfairness if CPU hotplugPaul E. McKenney
If CPU hotplug testing is enabled, a lock might appear to be maximally unfair just because one of the CPUs was offline almost all the time. This commit therefore forgives unfairness if CPU hotplug testing was enabled. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20locktorture: Use private random-number generatorsPaul E. McKenney
Both lock_torture_writer() and lock_torture_reader() use the "static" keyword on their DEFINE_TORTURE_RANDOM(rand) declarations, which means that a single instance of a random-number generator are shared among all the writers and another is shared among all the readers. Unfortunately, this random-number generator was not designed for concurrent access. This commit therefore removes both "static" keywords so that each reader and each writer gets its own random-number generator. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20locktorture: Allow CPU-hotplug to be disabled via --bootargsPaul E. McKenney
The bootparam_hotplug_cpu() bash function was checking for CPU-hotplug kernel-boot parameters from --bootargs, but that check was specific to rcutorture ("rcutorture\.onoff_"). This commit therefore makes this check also work for locktorture ("torture\.onoff_"). Note that rcuperf does not do CPU-hotplug operations, so it is not necessary to make a similar change for rcuperf. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20locktorture: Print ratio of acquisitions, not failuresPaul E. McKenney
The __torture_print_stats() function in locktorture.c carefully initializes local variable "min" to statp[0].n_lock_acquired, but then compares it to statp[i].n_lock_fail. Given that the .n_lock_fail field should normally be zero, and given the initialization, it seems reasonable to display the maximum and minimum number acquisitions instead of miscomputing the maximum and minimum number of failures. This commit therefore switches from failures to acquisitions. And this turns out to be not only a day-zero bug, but entirely my own fault. I hate it when that happens! Fixes: 0af3fe1efa53 ("locktorture: Add a lock-torture kernel module") Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Peter Zijlstra <peterz@infradead.org>
2020-02-20rcu: Add a trace event for kfree_rcu() use of kfree_bulk()Uladzislau Rezki (Sony)
The event is given three parameters, first one is the name of RCU flavour, second one is the number of elements in array for free and last one is an address of the array holding pointers to be freed by the kfree_bulk() function. To enable the trace event your kernel has to be build with CONFIG_RCU_TRACE=y, after that it is possible to track the events using ftrace subsystem. Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Support kfree_bulk() interface in kfree_rcu()Uladzislau Rezki (Sony)
The kfree_rcu() logic can be improved further by using kfree_bulk() interface along with "basic batching support" introduced earlier. The are at least two advantages of using "bulk" interface: - in case of large number of kfree_rcu() requests kfree_bulk() reduces the per-object overhead caused by calling kfree() per-object. - reduces the number of cache-misses due to "pointer chasing" between objects which can be far spread between each other. This approach defines a new kfree_rcu_bulk_data structure that stores pointers in an array with a specific size. Number of entries in that array depends on PAGE_SIZE making kfree_rcu_bulk_data structure to be exactly one page. Since it deals with "block-chain" technique there is an extra need in dynamic allocation when a new block is required. Memory is allocated with GFP_NOWAIT | __GFP_NOWARN flags, i.e. that allows to skip direct reclaim under low memory condition to prevent stalling and fails silently under high memory pressure. The "emergency path" gets maintained when a system is run out of memory. In that case objects are linked into regular list. The "rcuperf" was run to analyze this change in terms of memory consumption and kfree_bulk() throughput. 1) Testing on the Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz, 12xCPUs with following parameters: kfree_loops=200000 kfree_alloc_num=1000 kfree_rcu_test=1 kfree_vary_obj_size=1 dev.2020.01.10a branch Default / CONFIG_SLAB 53607352517 ns, loops: 200000, batches: 1885, memory footprint: 1248MB 53529637912 ns, loops: 200000, batches: 1921, memory footprint: 1193MB 53570175705 ns, loops: 200000, batches: 1929, memory footprint: 1250MB Patch / CONFIG_SLAB 23981587315 ns, loops: 200000, batches: 810, memory footprint: 1219MB 23879375281 ns, loops: 200000, batches: 822, memory footprint: 1190MB 24086841707 ns, loops: 200000, batches: 794, memory footprint: 1380MB Default / CONFIG_SLUB 51291025022 ns, loops: 200000, batches: 1713, memory footprint: 741MB 51278911477 ns, loops: 200000, batches: 1671, memory footprint: 719MB 51256183045 ns, loops: 200000, batches: 1719, memory footprint: 647MB Patch / CONFIG_SLUB 50709919132 ns, loops: 200000, batches: 1618, memory footprint: 456MB 50736297452 ns, loops: 200000, batches: 1633, memory footprint: 507MB 50660403893 ns, loops: 200000, batches: 1628, memory footprint: 429MB in case of CONFIG_SLAB there is double increase in performance and slightly higher memory usage. As for CONFIG_SLUB, the performance figures are better together with lower memory usage. 2) Testing on the HiKey-960, arm64, 8xCPUs with below parameters: CONFIG_SLAB=y kfree_loops=200000 kfree_alloc_num=1000 kfree_rcu_test=1 102898760401 ns, loops: 200000, batches: 5822, memory footprint: 158MB 89947009882 ns, loops: 200000, batches: 6715, memory footprint: 115MB rcuperf shows approximately ~12% better throughput in case of using "bulk" interface. The "drain logic" or its RCU callback does the work faster that leads to better throughput. Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Make nocb_gp_wait() double-check unexpected-callback warningPaul E. McKenney
Currently, nocb_gp_wait() unconditionally complains if there is a callback not already associated with a grace period. This assumes that either there was no such callback initially on the one hand, or that the rcu_advance_cbs() function assigned all such callbacks to a grace period on the other. However, in theory there are some situations that would prevent rcu_advance_cbs() from assigning all of the callbacks. This commit therefore checks for unassociated callbacks immediately after rcu_advance_cbs() returns, while the corresponding rcu_node structure's ->lock is still held. If there are unassociated callbacks at that point, the subsequent WARN_ON_ONCE() is disabled. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-02-20rcu: Tighten rcu_lockdep_assert_cblist_protected() checkPaul E. McKenney
The ->nocb_lock lockdep assertion is currently guarded by cpu_online(), which is incorrect for no-CBs CPUs, whose callback lists must be protected by ->nocb_lock regardless of whether or not the corresponding CPU is online. This situation could result in failure to detect bugs resulting from failing to hold ->nocb_lock for offline CPUs. This commit therefore removes the cpu_online() guard. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>