summaryrefslogtreecommitdiff
path: root/kernel/rcu
AgeCommit message (Collapse)Author
2020-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25Merge tag 'pm-5.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix more fallout of recent RCU-lockdep changes in CPU idle code and two devfreq issues. Specifics: - Export rcu_idle_{enter,exit} to modules to fix build issues introduced by recent RCU-lockdep fixes (Borislav Petkov) - Add missing return statement to a stub function in the ACPI processor driver to fix a build issue introduced by recent RCU-lockdep fixes (Rafael Wysocki) - Fix recently introduced suspicious RCU usage warnings in the PSCI cpuidle driver and drop stale comments regarding RCU_NONIDLE() usage from enter_s2idle_proper() (Ulf Hansson) - Fix error code path in the tegra30 devfreq driver (Dan Carpenter) - Add missing information to devfreq_summary debugfs (Chanwoo Choi)" * tag 'pm-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset PM / devfreq: tegra30: Disable clock on error in probe PM / devfreq: Add timer type to devfreq_summary debugfs cpuidle: Drop misleading comments about RCU usage cpuidle: psci: Fix suspicious RCU usage rcu/tree: Export rcu_idle_{enter,exit} to modules
2020-09-21rcu/tree: Export rcu_idle_{enter,exit} to modulesBorislav Petkov
Fix this link error: ERROR: modpost: "rcu_idle_enter" [drivers/acpi/processor.ko] undefined! ERROR: modpost: "rcu_idle_exit" [drivers/acpi/processor.ko] undefined! when CONFIG_ACPI_PROCESSOR is built as module. PeterZ says that in light of ARM needing those soon too, they should simply be exported. Fixes: 1fecfdbb7acc ("ACPI: processor: Take over RCU-idle for C3-BM idle") Reported-by: Sven Joachim <svenjoac@gmx.de> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Paul E. McKenney <paulmckrcu@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-16rcu-tasks: Enclose task-list scan in rcu_read_lock()Paul E. McKenney
The rcu_tasks_trace_postgp() function uses for_each_process_thread() to scan the task list without the benefit of RCU read-side protection, which can result in use-after-free errors on task_struct structures. This error was missed because the TRACE01 rcutorture scenario enables lockdep, but also builds with CONFIG_PREEMPT_NONE=y. In this situation, preemption is disabled everywhere, so lockdep thinks everywhere can be a legitimate RCU reader. This commit therefore adds the needed rcu_read_lock() and rcu_read_unlock(). Note that this bug can occur only after an RCU Tasks Trace CPU stall warning, which by default only happens after a grace period has extended for ten minutes (yes, not a typo, minutes). Fixes: 4593e772b502 ("rcu-tasks: Add stall warnings for RCU Tasks Trace") Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: <bpf@vger.kernel.org> Cc: <stable@vger.kernel.org> # 5.7.x Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Fix low-probability task_struct leakPaul E. McKenney
When rcu_tasks_trace_postgp() function detects an RCU Tasks Trace CPU stall, it adds all tasks blocking the current grace period to a list, invoking get_task_struct() on each to prevent them from being freed while on the list. It then traverses that list, printing stall-warning messages for each one that is still blocking the current grace period and removing it from the list. The list removal invokes the matching put_task_struct(). This of course means that in the admittedly unlikely event that some task executes its outermost rcu_read_unlock_trace() in the meantime, it won't be removed from the list and put_task_struct() won't be executing, resulting in a task_struct leak. This commit therefore makes the list removal and put_task_struct() unconditional, stopping the leak. Note further that this bug can occur only after an RCU Tasks Trace CPU stall warning, which by default only happens after a grace period has extended for ten minutes (yes, not a typo, minutes). Fixes: 4593e772b502 ("rcu-tasks: Add stall warnings for RCU Tasks Trace") Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: <bpf@vger.kernel.org> Cc: <stable@vger.kernel.org> # 5.7.x Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Fix grace-period/unlock race in RCU Tasks TracePaul E. McKenney
The more intense grace-period processing resulting from the 50x RCU Tasks Trace grace-period speedups exposed the following race condition: o Task A running on CPU 0 executes rcu_read_lock_trace(), entering a read-side critical section. o When Task A eventually invokes rcu_read_unlock_trace() to exit its read-side critical section, this function notes that the ->trc_reader_special.s flag is zero and and therefore invoke wil set ->trc_reader_nesting to zero using WRITE_ONCE(). But before that happens... o The RCU Tasks Trace grace-period kthread running on some other CPU interrogates Task A, but this fails because this task is currently running. This kthread therefore sends an IPI to CPU 0. o CPU 0 receives the IPI, and thus invokes trc_read_check_handler(). Because Task A has not yet cleared its ->trc_reader_nesting counter, this function sees that Task A is still within its read-side critical section. This function therefore sets the ->trc_reader_nesting.b.need_qs flag, AKA the .need_qs flag. Except that Task A has already checked the .need_qs flag, which is part of the ->trc_reader_special.s flag. The .need_qs flag therefore remains set until Task A's next rcu_read_unlock_trace(). o Task A now invokes synchronize_rcu_tasks_trace(), which cannot start a new grace period until the current grace period completes. And thus cannot return until after that time. But Task A's .need_qs flag is still set, which prevents the current grace period from completing. And because Task A is blocked, it will never execute rcu_read_unlock_trace() until its call to synchronize_rcu_tasks_trace() returns. We are therefore deadlocked. This race is improbable, but 80 hours of rcutorture made it happen twice. The race was possible before the grace-period speedup, but roughly 50x less probable. Several thousand hours of rcutorture would have been necessary to have a reasonable chance of making this happen before this 50x speedup. This commit therefore eliminates this deadlock by setting ->trc_reader_nesting to a large negative number before checking the .need_qs and zeroing (or decrementing with respect to its initial value) ->trc_reader_nesting. For its part, the IPI handler's trc_read_check_handler() function adds a check for negative values, deferring evaluation of the task in this case. Taken together, these changes avoid this deadlock scenario. Fixes: 276c410448db ("rcu-tasks: Split ->trc_reader_need_end") Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: <bpf@vger.kernel.org> Cc: <stable@vger.kernel.org> # 5.7.x Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Shorten per-grace-period sleep for RCU Tasks TracePaul E. McKenney
The various RCU tasks flavors currently wait 100 milliseconds between each grace period in order to prevent CPU-bound loops and to favor efficiency over latency. However, RCU Tasks Trace needs to have a grace-period latency of roughly 25 milliseconds, which is completely infeasible given the 100-millisecond per-grace-period sleep. This commit therefore reduces this sleep duration to 5 milliseconds (or one jiffy, whichever is longer) in kernels built with CONFIG_TASKS_TRACE_RCU_READ_MB=y. Link: https://lore.kernel.org/bpf/CAADnVQK_AiX+S_L_A4CQWT11XyveppBbQSQgH_qWGyzu_E8Yeg@mail.gmail.com/ Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: <bpf@vger.kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Selectively enable more RCU Tasks Trace IPIsPaul E. McKenney
Many workloads are quite sensitive to IPIs, and such workloads should build kernels with CONFIG_TASKS_TRACE_RCU_READ_MB=y to prevent RCU Tasks Trace from using them under normal conditions. However, other workloads are quite happy to permit more IPIs if doing so makes BPF program updates go faster. This commit therefore sets the default value for the rcupdate.rcu_task_ipi_delay kernel parameter to zero for kernels that have been built with CONFIG_TASKS_TRACE_RCU_READ_MB=n, while retaining the old default of (HZ / 10) for kernels that have indicated an aversion to IPIs via CONFIG_TASKS_TRACE_RCU_READ_MB=y. Link: https://lore.kernel.org/bpf/CAADnVQK_AiX+S_L_A4CQWT11XyveppBbQSQgH_qWGyzu_E8Yeg@mail.gmail.com/ Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: <bpf@vger.kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()Paul E. McKenney
Commit 8344496e8b49 ("rcu-tasks: Conditionally compile show_rcu_tasks_gp_kthreads()") introduced conditional compilation of several functions, but forgot one occurrence of show_rcu_tasks_classic_gp_kthread() that causes the compiler to warn of an unused static function. This commit uses "static inline" to avoid these complaints and possibly also to avoid emitting an actual definition of this function. Fixes: 8344496e8b49 ("rcu-tasks: Conditionally compile show_rcu_tasks_gp_kthreads()") Cc: <stable@vger.kernel.org> # 5.8.x Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Use more aggressive polling for RCU Tasks TracePaul E. McKenney
The RCU Tasks Trace grace periods are too slow, as in 40x slower than those of RCU Tasks. This is due to my having assumed a one-second grace period was OK, and thus not having optimized any further. This commit provides the first step in this optimization process, namely by allowing the task_list scan backoff interval to be specified on a per-flavor basis, and then speeding up the scans for RCU Tasks Trace. However, kernels built with CONFIG_TASKS_TRACE_RCU_READ_MB=y continue to use the old slower backoff, consistent with that Kconfig option's goal of reducing IPIs. Link: https://lore.kernel.org/bpf/CAADnVQK_AiX+S_L_A4CQWT11XyveppBbQSQgH_qWGyzu_E8Yeg@mail.gmail.com/ Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: <bpf@vger.kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-09-16rcu-tasks: Mark variables staticPaul E. McKenney
The n_heavy_reader_attempts, n_heavy_reader_updates, and n_heavy_reader_ofl_updates variables are not used outside of their translation unit, so this commit marks them static. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-07rcu: kasan: record and print call_rcu() call stackWalter Wu
Patch series "kasan: memorize and print call_rcu stack", v8. This patchset improves KASAN reports by making them to have call_rcu() call stack information. It is useful for programmers to solve use-after-free or double-free memory issue. The KASAN report was as follows(cleaned up slightly): BUG: KASAN: use-after-free in kasan_rcu_reclaim+0x58/0x60 Freed by task 0: kasan_save_stack+0x24/0x50 kasan_set_track+0x24/0x38 kasan_set_free_info+0x18/0x20 __kasan_slab_free+0x10c/0x170 kasan_slab_free+0x10/0x18 kfree+0x98/0x270 kasan_rcu_reclaim+0x1c/0x60 Last call_rcu(): kasan_save_stack+0x24/0x50 kasan_record_aux_stack+0xbc/0xd0 call_rcu+0x8c/0x580 kasan_rcu_uaf+0xf4/0xf8 Generic KASAN will record the last two call_rcu() call stacks and print up to 2 call_rcu() call stacks in KASAN report. it is only suitable for generic KASAN. This feature considers the size of struct kasan_alloc_meta and kasan_free_meta, we try to optimize the structure layout and size, lets it get better memory consumption. [1]https://bugzilla.kernel.org/show_bug.cgi?id=198437 [2]https://groups.google.com/forum/#!searchin/kasan-dev/better$20stack$20traces$20for$20rcu%7Csort:date/kasan-dev/KQsjT_88hDE/7rNUZprRBgAJ This patch (of 4): This feature will record the last two call_rcu() call stacks and prints up to 2 call_rcu() call stacks in KASAN report. When call_rcu() is called, we store the call_rcu() call stack into slub alloc meta-data, so that the KASAN report can print rcu stack. [1]https://bugzilla.kernel.org/show_bug.cgi?id=198437 [2]https://groups.google.com/forum/#!searchin/kasan-dev/better$20stack$20traces$20for$20rcu%7Csort:date/kasan-dev/KQsjT_88hDE/7rNUZprRBgAJ [walter-zh.wu@mediatek.com: build fix] Link: http://lkml.kernel.org/r/20200710162401.23816-1-walter-zh.wu@mediatek.com Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthias Brugger <matthias.bgg@gmail.com> Link: http://lkml.kernel.org/r/20200710162123.23713-1-walter-zh.wu@mediatek.com Link: http://lkml.kernel.org/r/20200601050847.1096-1-walter-zh.wu@mediatek.com Link: http://lkml.kernel.org/r/20200601050927.1153-1-walter-zh.wu@mediatek.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-06Merge tag 'sched-fifo-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sched/fifo updates from Ingo Molnar: "This adds the sched_set_fifo*() encapsulation APIs to remove static priority level knowledge from non-scheduler code. The three APIs for non-scheduler code to set SCHED_FIFO are: - sched_set_fifo() - sched_set_fifo_low() - sched_set_normal() These are two FIFO priority levels: default (high), and a 'low' priority level, plus sched_set_normal() to set the policy back to non-SCHED_FIFO. Since the changes affect a lot of non-scheduler code, we kept this in a separate tree" * tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) sched,tracing: Convert to sched_set_fifo() sched: Remove sched_set_*() return value sched: Remove sched_setscheduler*() EXPORTs sched,psi: Convert to sched_set_fifo_low() sched,rcutorture: Convert to sched_set_fifo_low() sched,rcuperf: Convert to sched_set_fifo_low() sched,locktorture: Convert to sched_set_fifo() sched,irq: Convert to sched_set_fifo() sched,watchdog: Convert to sched_set_fifo() sched,serial: Convert to sched_set_fifo() sched,powerclamp: Convert to sched_set_fifo() sched,ion: Convert to sched_set_normal() sched,powercap: Convert to sched_set_fifo*() sched,spi: Convert to sched_set_fifo*() sched,mmc: Convert to sched_set_fifo*() sched,ivtv: Convert to sched_set_fifo*() sched,drm/scheduler: Convert to sched_set_fifo*() sched,msm: Convert to sched_set_fifo*() sched,psci: Convert to sched_set_fifo*() sched,drbd: Convert to sched_set_fifo*() ...
2020-07-31Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull the v5.9 RCU bits from Paul E. McKenney: - Documentation updates - Miscellaneous fixes - kfree_rcu updates - RCU tasks updates - Read-side scalability tests - SRCU updates - Torture-test updates Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-05Merge tag 'core-urgent-2020-07-05' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull rcu fixlet from Thomas Gleixner: "A single fix for a printk format warning in RCU" * tag 'core-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcuperf: Fix printk format warning
2020-06-29Merge branches 'doc.2020.06.29a', 'fixes.2020.06.29a', ↵Paul E. McKenney
'kfree_rcu.2020.06.29a', 'rcu-tasks.2020.06.29a', 'scale.2020.06.29a', 'srcu.2020.06.29a' and 'torture.2020.06.29a' into HEAD doc.2020.06.29a: Documentation updates. fixes.2020.06.29a: Miscellaneous fixes. kfree_rcu.2020.06.29a: kfree_rcu() updates. rcu-tasks.2020.06.29a: RCU Tasks updates. scale.2020.06.29a: Read-side scalability tests. srcu.2020.06.29a: SRCU updates. torture.2020.06.29a: Torture-test updates.
2020-06-29rcutorture: Check for unwatched readersPaul E. McKenney
RCU is supposed to be watching all non-idle kernel code and also all softirq handlers. This commit adds some teeth to this statement by adding a WARN_ON_ONCE(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcu/rcutorture: Replace 0 with falseJules Irenge
Coccinelle reports a warning WARNING: Assignment of 0/1 to bool variable The root cause is that the variable lastphase is a bool, but is initialised with integer 0. This commit therefore replaces the 0 with a false. Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcutorture: NULL rcu_torture_current earlier in cleanup codePaul E. McKenney
Currently, the rcu_torture_current variable remains non-NULL until after all readers have stopped. During this time, rcu_torture_stats_print() will think that the test is still ongoing, which can result in confusing dmesg output. This commit therefore NULLs rcu_torture_current immediately after the rcu_torture_writer() kthread has decided to stop, thus informing rcu_torture_stats_print() much sooner. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcutorture: Add races with task-exit processingPaul E. McKenney
Several variants of Linux-kernel RCU interact with task-exit processing, including preemptible RCU, Tasks RCU, and Tasks Trace RCU. This commit therefore adds testing of this interaction to rcutorture by adding rcutorture.read_exit_burst and rcutorture.read_exit_delay kernel-boot parameters. These kernel parameters control the frequency and spacing of special read-then-exit kthreads that are spawned. [ paulmck: Apply feedback from Dan Carpenter's static checker. ] [ paulmck: Reduce latency to avoid false-positive shutdown hangs. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29srcu: Avoid local_irq_save() before acquiring spinlock_tSebastian Andrzej Siewior
SRCU disables interrupts to get a stable per-CPU pointer and then acquires the spinlock which is in the per-CPU data structure. The release uses spin_unlock_irqrestore(). While this is correct on a non-RT kernel, this conflicts with the RT semantics because the spinlock is converted to a 'sleeping' spinlock. Sleeping locks can obviously not be acquired with interrupts disabled. Acquire the per-CPU pointer `ssp->sda' without disabling preemption and then acquire the spinlock_t of the per-CPU data structure. The lock will ensure that the data is consistent. The added call to check_init_srcu_struct() is now needed because a statically defined srcu_struct may remain uninitialized until this point and the newly introduced locking operation requires an initialized spinlock_t. This change was tested for four hours with 8*SRCU-N and 8*SRCU-P without causing any warnings. Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: rcu@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29srcu: Fix a typo in comment "amoritized"->"amortized"Ethon Paul
This commit fixes a typo in a comment. Signed-off-by: Ethon Paul <ethp@qq.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Rename refperf.c to refscale.c and change internal namesPaul E. McKenney
This commit further avoids conflation of refperf with the kernel's perf feature by renaming kernel/rcu/refperf.c to kernel/rcu/refscale.c, and also by similarly renaming the functions and variables inside this file. This has the side effect of changing the names of the kernel boot parameters, so kernel-parameters.txt and ver_functions.sh are also updated. The rcutorture --torture type remains refperf, and this will be addressed in a separate commit. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Rename RCU_REF_PERF_TEST to RCU_REF_SCALE_TESTPaul E. McKenney
The old Kconfig option name is all too easy to conflate with the unrelated "perf" feature, so this commit renames RCU_REF_PERF_TEST to RCU_REF_SCALE_TEST. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcu-tasks: Fix synchronize_rcu_tasks_trace() header commentPaul E. McKenney
The synchronize_rcu_tasks_trace() header comment incorrectly claims that any number of things delimit RCU Tasks Trace read-side critical sections, when in fact only rcu_read_lock_trace() and rcu_read_unlock_trace() do so. This commit therefore fixes this comment, and, while in the area, fixes a typo in the rcu_read_lock_trace() header comment. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Add test for RCU Tasks readersPaul E. McKenney
This commit adds testing for RCU Tasks readers to the refperf module. This also applies to RCU Rude readers, as both flavors have empty (as in non-existent) read-side markers. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Add test for RCU Tasks Trace readers.Paul E. McKenney
This commit adds testing for RCU Tasks Trace readers to the refperf module. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Change readdelay module parameter to nanosecondsPaul E. McKenney
The current units of microseconds are too coarse, so this commit changes the units to nanoseconds. However, ndelay is used only for the nanoseconds with udelay being used for whole microseconds. For example, setting refperf.readdelay=1500 results in a udelay(1) followed by an ndelay(500). Suggested-by: Akira Yokosawa <akiyks@gmail.com> [ paulmck: Abstracted delay per Akira feedback and move from 80 to 100 lines. ] [ paulmck: Fix names as suggested by kbuild test robot. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Work around 64-bit divisionArnd Bergmann
A 64-bit division was introduced in refperf, breaking compilation on all 32-bit architectures: kernel/rcu/refperf.o: in function `main_func': refperf.c:(.text+0x57c): undefined reference to `__aeabi_uldivmod' Fix this by using div_u64 to mark the expensive operation. [ paulmck: Update primitive and format per Nathan Chancellor. ] Fixes: bd5b16d6c88d ("refperf: Allow decimal nanoseconds") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Adjust refperf.loop default valuePaul E. McKenney
With the various measurement optimizations, 10,000 loops normally suffices. This commit therefore reduces the refperf.loops default value from 10,000,000 to 10,000. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Add read-side delay module parameterPaul E. McKenney
This commit adds a refperf.readdelay module parameter that controls the duration of each critical section. This parameter allows gathering data showing how the performance differences between the various primitives vary with critical-section length. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Simplify initialization-time wakeup protocolPaul E. McKenney
This commit moves the reader-launch wait loop from ref_perf_init() to main_func(), removing one layer of wakeup and allowing slightly faster system boot. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Label experiment-number column "Runs"Paul E. McKenney
The experiment-number column is currently labeled "Threads", which is misleading at best. This commit therefore relabels it as "Runs", and adjusts the scripts accordingly. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Add warmup and cooldown processing phasesPaul E. McKenney
This commit causes all the readers to start running unmeasured load until all readers have done at least one such run (thus having warmed up), then run the measured load, and then run unmeasured load until all readers have completed their measured load. This approach avoids any thread running measured load while other readers are idle. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: More closely synchronize reader start timesPaul E. McKenney
Currently, readers are awakened individually. On most systems, this results in significant wakeup delay from one reader to the next, which can result in the first and last reader having sole access to the synchronization primitive in question. If that synchronization primitive involves shared memory, those readers will rack up a huge number of operations in a very short time, causing large perturbations in the results. This commit therefore has the readers busy-wait after being awakened, and uses a new n_started variable to synchronize their start times. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Convert reader_task structure's "start" field to intPaul E. McKenney
This commit converts the reader_task structure's "start" field to int in order to demote a full barrier to an smp_load_acquire() and also to simplify the code a bit. While in the area, and to enlist the compiler's help in ensuring that nothing was missed, the field's name was changed to start_reader. Also while in the area, change the main_func() store to use smp_store_release() to further fortify against wait/wake races. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Tune reader measurement intervalPaul E. McKenney
This commit moves a printk() out of the measurement interval, converts a atomic_dec()/atomic_read() pair to atomic_dec_and_test(), and adds a smp_mb__before_atomic() to avoid potential wake/wait hangs. These changes have the added benefit of reducing the number of loops required for amortizing loop overhead for CONFIG_PREEMPT=n RCU measurements from 1,000,000 to 10,000. This reduction in turn shortens the test, reducing the probability of interference. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Make functions staticPaul E. McKenney
Because the reset_readers() and process_durations() functions are used only within kernel/rcu/refperf.c, this commit makes them static. Reported-by: kbuild test robot <lkp@intel.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Dynamically allocate thread-summary output bufferPaul E. McKenney
Currently, the buffer used to accumulate the thread-summary output is fixed size, which will cause problems if someone decides to run on a large number of PCUs. This commit therefore dynamically allocates this buffer. [ paulmck: Fix memory allocation as suggested by KASAN. ] Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Dynamically allocate experiment-summary output bufferPaul E. McKenney
Currently, the buffer used to accumulate the experiment-summary output is fixed size, which will cause problems if someone decides to run one hundred experiments. This commit therefore dynamically allocates this buffer. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Provide module parameter to specify number of experimentsPaul E. McKenney
The current code uses the number of threads both to limit the number of threads and to specify the number of experiments, but also varies the number of threads as the experiments progress. This commit takes a different approach by adding an refperf.nruns module parameter that specifies the number of experiments, and furthermore uses the same number of threads for each experiment. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Convert nreaders to a module parameterPaul E. McKenney
This commit converts nreaders to a module parameter, with the default of -1 specifying the old behavior of using 75% of the readers. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Allow decimal nanosecondsPaul E. McKenney
The CONFIG_PREEMPT=n rcu_read_lock()/rcu_read_unlock() pair's overhead, even including loop overhead, is far less than one nanosecond. Since logscale plots are not all that happy with zero values, provide picoseconds as decimals. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Hoist function-pointer calls out of the loopPaul E. McKenney
Current runs show PREEMPT=n rcu_read_lock()/rcu_read_unlock() pairs consuming between 20 and 30 nanoseconds, when in fact the actual value is zero, give or take the barrier() asm's effect on compiler optimizations. The additional overhead is caused by function calls through pointers (especially in these days of Spectre mitigations) and perhaps also needless argument passing, a non-const loop limit, and an upcounting loop. This commit therefore combines the ->readlock() and ->readunlock() function pointers into a single ->readsection() function pointer that takes the loop count as a const parameter and keeps any data passed from the read-lock to the read-unlock internal to this new function. These changes reduce the measured overhead of the aforementioned PREEMPT=n rcu_read_lock()/rcu_read_unlock() pairs from between 20 and 30 nanoseconds to somewhere south of 500 picoseconds. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Add holdoff parameter to allow CPUs to come onlinePaul E. McKenney
This commit adds an rcuperf module parameter named "holdoff" that defaults to 10 seconds if refperf is built in and to zero otherwise. The assumption is that all the CPUs are online by the time that the modprobe and insmod commands are going to do anything, and that normal systems will have all the CPUs online within ten seconds. Larger systems may take many tens of seconds or even minutes to get to this point, hence this being a module parameter instead of being a hard-coded constant. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcuperf: Add comments explaining the high reader overheadPaul E. McKenney
This commit adds comments explaining why the readers have otherwise insane levels of measurement overhead, namely that they are intended as a test load for update-side performance measurements, not as a straight-up read-side performance test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29refperf: Add a test to measure performance of read-side synchronizationJoel Fernandes (Google)
Add a test for comparing the performance of RCU with various read-side synchronization mechanisms. The test has proved useful for collecting data and performing these comparisons. Currently RCU, SRCU, reader-writer lock, reader-writer semaphore and reference counting can be measured using refperf.perf_type parameter. Each invocation of the test runs measures performance of a specific mechanism. The maximum number of CPUs to concurrently run readers on is chosen by the test itself and is 75% of the total number of CPUs. So if you had 24 CPUs, the test runs with a maximum of 18 parallel readers. A number of experiments are conducted, and in each experiment, the number of readers is increased by 1, upto the 75% of CPUs mark. During each experiment, all readers execute an empty loop with refperf.loops iterations and time the total loop duration. This is then averaged. Example output: Parameters "refperf.perf_type=srcu refperf.loops=2000000" looks like: [ 3.347133] srcu-ref-perf: [ 3.347133] Threads Time(ns) [ 3.347133] 1 36 [ 3.347133] 2 34 [ 3.347133] 3 34 [ 3.347133] 4 34 [ 3.347133] 5 33 [ 3.347133] 6 33 [ 3.347133] 7 33 [ 3.347133] 8 33 [ 3.347133] 9 33 [ 3.347133] 10 33 [ 3.347133] 11 33 [ 3.347133] 12 33 [ 3.347133] 13 33 [ 3.347133] 14 33 [ 3.347133] 15 32 [ 3.347133] 16 33 [ 3.347133] 17 33 [ 3.347133] 18 34 Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcuperf: Remove useless while loops around wait_eventJoel Fernandes (Google)
wait_event() already retries if the condition for the wake up is not satisifed after wake up. Remove them from the rcuperf test. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcu-tasks: Fix code-style issuesPaul E. McKenney
This commit declares trc_n_readers_need_end and trc_wait static and replaced a "&" with "&&". The "&" happened to work because the values are bool, but accidents waiting to happen and all that... Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29rcu-tasks: Conditionally compile show_rcu_tasks_gp_kthreads()Paul E. McKenney
The show_rcu_tasks_gp_kthreads() function is not invoked by Tiny RCU, but is nevertheless defined in Tiny RCU builds that enable Tasks Trace RCU. This commit therefore conditionally compiles this function so that it is defined only in builds that actually use it. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>