path: root/kernel
Age | Commit message | Author
2020-06-30 | dma-mapping: Add a new dma_need_sync API | Christoph Hellwig
Add a new API to check if calls to dma_sync_single_for_{device,cpu} are required for a given DMA streaming mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200629130359.2690853-2-hch@lst.de
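A minimal sketch of how a driver might use this: cache the dma_need_sync() result at map time and skip the sync on the fast path. Only the DMA API calls named above are real; struct my_buf and the my_*() helpers are hypothetical.

    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    /* Hypothetical driver-side bookkeeping. */
    struct my_buf {
        dma_addr_t dma_addr;
        bool need_sync;             /* cached result of dma_need_sync() */
    };

    static int my_map(struct device *dev, void *cpu_addr, size_t len,
                      struct my_buf *b)
    {
        b->dma_addr = dma_map_single(dev, cpu_addr, len, DMA_FROM_DEVICE);
        if (dma_mapping_error(dev, b->dma_addr))
            return -ENOMEM;
        b->need_sync = dma_need_sync(dev, b->dma_addr);
        return 0;
    }

    static void my_complete(struct device *dev, struct my_buf *b, size_t len)
    {
        if (b->need_sync)           /* skip the sync when the mapping allows it */
            dma_sync_single_for_cpu(dev, b->dma_addr, len, DMA_FROM_DEVICE);
    }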
2020-06-29 | audit: add gfp parameter to audit_log_nfcfg | Richard Guy Briggs
Fix an inconsistent use of GFP flags in nft_obj_notify(), which used GFP_KERNEL even when a GFP flag was passed in to that function. Because the allocated memory was then used in audit_log_nfcfg(), this led to an audit of all other GFP allocations in net/netfilter/nf_tables_api.c and a modification of audit_log_nfcfg() to accept a GFP parameter. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
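The underlying rule, as a generic sketch (not the nf_tables or audit code; my_notify() and struct my_event are made up): a function handed GFP flags must use them for its own allocations instead of hard-coding GFP_KERNEL, since the caller may be in atomic context.

    #include <linux/errno.h>
    #include <linux/list.h>
    #include <linux/slab.h>

    struct my_event {
        struct list_head node;
    };

    /* Sketch only: honor the caller's gfp flags rather than GFP_KERNEL. */
    static int my_notify(struct list_head *list, gfp_t gfp)
    {
        struct my_event *ev = kmalloc(sizeof(*ev), gfp);

        if (!ev)
            return -ENOMEM;
        list_add_tail(&ev->node, list);
        return 0;
    }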
2020-06-29 | kcsan: Disable branch tracing in core runtime | Marco Elver
Disable branch tracing in the core KCSAN runtime if branches are being traced (TRACE_BRANCH_PROFILING). This is to avoid its performance impact, but also to avoid recursion in case KCSAN is enabled for the branch tracing runtime. The latter had already been a problem for KASAN: https://lore.kernel.org/lkml/CANpmjNOeXmD5E3O50Z3MjkiuCYaYOPyi+1rq=GZvEKwBvLR0Ug@mail.gmail.com/ Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | kcsan: Simplify compiler flags | Marco Elver
Simplify the set of compiler flags for the runtime by removing cc-option from -fno-stack-protector, because all supported compilers support it. This saves us one compiler invocation during build. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | kcsan: Add jiffies test to test suite | Marco Elver
Add a test that neither KCSAN nor the compiler gets confused about accesses to jiffies on different architectures. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | kcsan: Remove existing special atomic rules | Marco Elver
Remove existing special atomic rules from kcsan_is_atomic_special() because they are no longer needed. Since we rely on the compiler emitting instrumentation distinguishing volatile accesses, the rules have become redundant. Let's keep kcsan_is_atomic_special() around, so that we have an obvious place to add special rules should the need arise in future. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | kcsan: Rename test.c to selftest.c | Marco Elver
Rename 'test.c' to 'selftest.c' to better reflect its purpose (Kconfig variable and code inside already match this). This is to avoid confusion with the test suite module in 'kcsan-test.c'. No functional change. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | kcsan: Silence -Wmissing-prototypes warning with W=1 | Marco Elver
The functions here should not be forward declared for explicit use elsewhere in the kernel, as they should only be emitted by the compiler due to sanitizer instrumentation. Add forward declarations a line above their definition to shut up warnings in W=1 builds. Link: https://lkml.kernel.org/r/202006060103.jSCpnV1g%lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
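The idiom, as a small sketch (the hook name __my_instrument_read4() is made up; the actual KCSAN hooks have compiler-defined names):

    /*
     * Declaring the prototype immediately above the definition satisfies
     * -Wmissing-prototypes without advertising the symbol in a header,
     * documenting that only compiler-emitted instrumentation calls it.
     */
    void __my_instrument_read4(void *ptr);
    void __my_instrument_read4(void *ptr)
    {
        /* ... check the 4-byte read at ptr ... */
    }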
2020-06-29 | kcsan: Prefer '__no_kcsan inline' in test | Marco Elver
Instead of __no_kcsan_or_inline, prefer '__no_kcsan inline' in test -- this is in case we decide to remove __no_kcsan_or_inline. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | locking/osq_lock: Annotate a data race in osq_lock | Qian Cai
The prev->next pointer can be accessed concurrently, as noticed by KCSAN:

 write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
  osq_lock+0x25f/0x350
  osq_wait_next at kernel/locking/osq_lock.c:79
  (inlined by) osq_lock at kernel/locking/osq_lock.c:185
  rwsem_optimistic_spin
  <snip>

 read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
  osq_lock+0x196/0x350
  osq_lock at kernel/locking/osq_lock.c:157
  rwsem_optimistic_spin
  <snip>

The write only stores NULL to prev->next, and the read only tests whether prev->next equals this_cpu_ptr(&osq_node). Even if the value is torn, the code still works correctly. Thus, mark it as an intentional data race using the data_race() macro. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
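A sketch of the resulting annotation, simplified from the osq_lock() pattern (struct my_node and my_try_clear_next() are illustrative names):

    #include <linux/atomic.h>
    #include <linux/compiler.h>     /* data_race() */

    struct my_node {
        struct my_node *next;
    };

    /*
     * The plain read of prev->next races with a remote writer storing NULL,
     * but even a torn value only makes the cheap pre-check pessimistic; the
     * cmpxchg() decides the outcome.  data_race() documents this intent and
     * keeps KCSAN from reporting the read.
     */
    static bool my_try_clear_next(struct my_node *prev, struct my_node *node)
    {
        return data_race(prev->next == node) &&
               cmpxchg(&prev->next, node, NULL) == node;
    }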
2020-06-29 | kcsan: Add test suite | Marco Elver
This adds a KCSAN test focusing on the behaviour of the integrated runtime. It tests various race scenarios and verifies the reports generated to the console. It makes use of KUnit for test organization and of the Torture framework for test-thread control. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | fork: Annotate a data race in vm_area_dup() | Qian Cai
struct vm_area_struct could be accessed concurrently, as noticed by KCSAN:

 write to 0xffff9cf8bba08ad8 of 8 bytes by task 14263 on cpu 35:
  vma_interval_tree_insert+0x101/0x150:
  rb_insert_augmented_cached at include/linux/rbtree_augmented.h:58
  (inlined by) vma_interval_tree_insert at mm/interval_tree.c:23
  __vma_link_file+0x6e/0xe0
  __vma_link_file at mm/mmap.c:629
  vma_link+0xa2/0x120
  mmap_region+0x753/0xb90
  do_mmap+0x45c/0x710
  vm_mmap_pgoff+0xc0/0x130
  ksys_mmap_pgoff+0x1d1/0x300
  __x64_sys_mmap+0x33/0x40
  do_syscall_64+0x91/0xc44
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 read to 0xffff9cf8bba08a80 of 200 bytes by task 14262 on cpu 122:
  vm_area_dup+0x6a/0xe0
  vm_area_dup at kernel/fork.c:362
  __split_vma+0x72/0x2a0
  __split_vma at mm/mmap.c:2661
  split_vma+0x5a/0x80
  mprotect_fixup+0x368/0x3f0
  do_mprotect_pkey+0x263/0x420
  __x64_sys_mprotect+0x51/0x70
  do_syscall_64+0x91/0xc44
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

vm_area_dup() blindly copies all fields of the original VMA to the new one. This includes copying vm_area_struct::shared.rb, which is normally protected by i_mmap_lock. But this is fine because the read value will be overwritten on the following __vma_link_file() under proper protection. Thus, mark it as an intentional data race and insert a few assertions for the fields that should not be modified concurrently. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | Merge branches 'doc.2020.06.29a', 'fixes.2020.06.29a', 'kfree_rcu.2020.06.29a', 'rcu-tasks.2020.06.29a', 'scale.2020.06.29a', 'srcu.2020.06.29a' and 'torture.2020.06.29a' into HEAD | Paul E. McKenney
doc.2020.06.29a: Documentation updates.
fixes.2020.06.29a: Miscellaneous fixes.
kfree_rcu.2020.06.29a: kfree_rcu() updates.
rcu-tasks.2020.06.29a: RCU Tasks updates.
scale.2020.06.29a: Read-side scalability tests.
srcu.2020.06.29a: SRCU updates.
torture.2020.06.29a: Torture-test updates.
2020-06-29 | torture: Dump ftrace at shutdown only if requested | Paul E. McKenney
If there is a large number of torture tests running concurrently, all of which are dumping large ftrace buffers at shutdown time, the resulting dumping can take a very long time, particularly on systems with rotating-rust storage. This commit therefore adds a default-off torture.ftrace_dump_at_shutdown module parameter that enables shutdown-time ftrace-buffer dumping. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
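A sketch of the shape of such a default-off parameter and its shutdown-time check (names are illustrative, not the literal torture-framework code):

    #include <linux/ftrace.h>
    #include <linux/module.h>

    /* Defaults to false, so the potentially slow dump is opt-in. */
    static bool my_ftrace_dump_at_shutdown;
    module_param(my_ftrace_dump_at_shutdown, bool, 0444);

    static void my_shutdown(void)
    {
        if (my_ftrace_dump_at_shutdown)
            ftrace_dump(DUMP_ALL);  /* can take minutes on rotating-rust storage */
    }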
2020-06-29 | rcutorture: Check for unwatched readers | Paul E. McKenney
RCU is supposed to be watching all non-idle kernel code and also all softirq handlers. This commit adds some teeth to this statement by adding a WARN_ON_ONCE(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcu/rcutorture: Replace 0 with false | Jules Irenge
Coccinelle reports a warning: WARNING: Assignment of 0/1 to bool variable. The root cause is that the variable lastphase is a bool but is initialised with the integer 0. This commit therefore replaces the 0 with false. Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcutorture: NULL rcu_torture_current earlier in cleanup code | Paul E. McKenney
Currently, the rcu_torture_current variable remains non-NULL until after all readers have stopped. During this time, rcu_torture_stats_print() will think that the test is still ongoing, which can result in confusing dmesg output. This commit therefore NULLs rcu_torture_current immediately after the rcu_torture_writer() kthread has decided to stop, thus informing rcu_torture_stats_print() much sooner. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcutorture: Add races with task-exit processing | Paul E. McKenney
Several variants of Linux-kernel RCU interact with task-exit processing, including preemptible RCU, Tasks RCU, and Tasks Trace RCU. This commit therefore adds testing of this interaction to rcutorture by adding rcutorture.read_exit_burst and rcutorture.read_exit_delay kernel-boot parameters. These kernel parameters control the frequency and spacing of special read-then-exit kthreads that are spawned. [ paulmck: Apply feedback from Dan Carpenter's static checker. ] [ paulmck: Reduce latency to avoid false-positive shutdown hangs. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | locktorture: Use true and false to assign to bool variables | Zou Wei
This commit fixes the following coccicheck warnings:

 kernel/locking/locktorture.c:689:6-10: WARNING: Assignment of 0/1 to bool variable
 kernel/locking/locktorture.c:907:2-20: WARNING: Assignment of 0/1 to bool variable
 kernel/locking/locktorture.c:938:3-20: WARNING: Assignment of 0/1 to bool variable
 kernel/locking/locktorture.c:668:2-19: WARNING: Assignment of 0/1 to bool variable
 kernel/locking/locktorture.c:674:2-19: WARNING: Assignment of 0/1 to bool variable
 kernel/locking/locktorture.c:634:2-20: WARNING: Assignment of 0/1 to bool variable
 kernel/locking/locktorture.c:640:2-20: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | srcu: Avoid local_irq_save() before acquiring spinlock_t | Sebastian Andrzej Siewior
SRCU disables interrupts to get a stable per-CPU pointer and then acquires the spinlock which is in the per-CPU data structure. The release uses spin_unlock_irqrestore(). While this is correct on a non-RT kernel, this conflicts with the RT semantics because the spinlock is converted to a 'sleeping' spinlock. Sleeping locks can obviously not be acquired with interrupts disabled. Acquire the per-CPU pointer `ssp->sda' without disabling preemption and then acquire the spinlock_t of the per-CPU data structure. The lock will ensure that the data is consistent. The added call to check_init_srcu_struct() is now needed because a statically defined srcu_struct may remain uninitialized until this point and the newly introduced locking operation requires an initialized spinlock_t. This change was tested for four hours with 8*SRCU-N and 8*SRCU-P without causing any warnings. Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: rcu@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
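A sketch of the before/after locking pattern on generic per-CPU data (struct my_pcpu and the my_update_*() functions are illustrative, not the srcu_struct code):

    #include <linux/percpu.h>
    #include <linux/spinlock.h>

    struct my_pcpu {
        spinlock_t lock;
        unsigned long count;
    };
    static DEFINE_PER_CPU(struct my_pcpu, my_pcpu_data);

    /* Old pattern: breaks on RT, where spinlock_t may sleep under IRQs-off. */
    static void my_update_old(void)
    {
        struct my_pcpu *p;
        unsigned long flags;

        local_irq_save(flags);
        p = this_cpu_ptr(&my_pcpu_data);
        spin_lock(&p->lock);
        p->count++;
        spin_unlock(&p->lock);
        local_irq_restore(flags);
    }

    /*
     * New pattern: take the per-CPU pointer without disabling anything and
     * let the lock primitive manage interrupt state.  If the task migrates,
     * it may use another CPU's structure, but the lock keeps it consistent.
     */
    static void my_update_new(void)
    {
        struct my_pcpu *p = raw_cpu_ptr(&my_pcpu_data);
        unsigned long flags;

        spin_lock_irqsave(&p->lock, flags);
        p->count++;
        spin_unlock_irqrestore(&p->lock, flags);
    }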
2020-06-29 | srcu: Fix a typo in comment "amoritized"->"amortized" | Ethon Paul
This commit fixes a typo in a comment. Signed-off-by: Ethon Paul <ethp@qq.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Rename refperf.c to refscale.c and change internal names | Paul E. McKenney
This commit further avoids conflation of refperf with the kernel's perf feature by renaming kernel/rcu/refperf.c to kernel/rcu/refscale.c, and also by similarly renaming the functions and variables inside this file. This has the side effect of changing the names of the kernel boot parameters, so kernel-parameters.txt and ver_functions.sh are also updated. The rcutorture --torture type remains refperf, and this will be addressed in a separate commit. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Rename RCU_REF_PERF_TEST to RCU_REF_SCALE_TEST | Paul E. McKenney
The old Kconfig option name is all too easy to conflate with the unrelated "perf" feature, so this commit renames RCU_REF_PERF_TEST to RCU_REF_SCALE_TEST. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcu-tasks: Fix synchronize_rcu_tasks_trace() header comment | Paul E. McKenney
The synchronize_rcu_tasks_trace() header comment incorrectly claims that any number of things delimit RCU Tasks Trace read-side critical sections, when in fact only rcu_read_lock_trace() and rcu_read_unlock_trace() do so. This commit therefore fixes this comment, and, while in the area, fixes a typo in the rcu_read_lock_trace() header comment. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Add test for RCU Tasks readers | Paul E. McKenney
This commit adds testing for RCU Tasks readers to the refperf module. This also applies to RCU Rude readers, as both flavors have empty (as in non-existent) read-side markers. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Add test for RCU Tasks Trace readers. | Paul E. McKenney
This commit adds testing for RCU Tasks Trace readers to the refperf module. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Change readdelay module parameter to nanoseconds | Paul E. McKenney
The current units of microseconds are too coarse, so this commit changes the units to nanoseconds. However, ndelay is used only for the sub-microsecond remainder, with udelay handling the whole microseconds. For example, setting refperf.readdelay=1500 results in a udelay(1) followed by an ndelay(500). Suggested-by: Akira Yokosawa <akiyks@gmail.com> [ paulmck: Abstracted delay per Akira feedback and move from 80 to 100 lines. ] [ paulmck: Fix names as suggested by kbuild test robot. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
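A sketch of the resulting delay split (my_read_delay() is a made-up helper name):

    #include <linux/delay.h>

    /*
     * udelay() covers the whole microseconds, ndelay() the remainder,
     * so readdelay=1500 becomes udelay(1) followed by ndelay(500).
     */
    static void my_read_delay(unsigned long nsecs)
    {
        if (nsecs >= 1000)
            udelay(nsecs / 1000);
        if (nsecs % 1000)
            ndelay(nsecs % 1000);
    }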
2020-06-29 | refperf: Work around 64-bit division | Arnd Bergmann
A 64-bit division was introduced in refperf, breaking compilation on all 32-bit architectures:

 kernel/rcu/refperf.o: in function `main_func':
 refperf.c:(.text+0x57c): undefined reference to `__aeabi_uldivmod'

Fix this by using div_u64 to mark the expensive operation. [ paulmck: Update primitive and format per Nathan Chancellor. ] Fixes: bd5b16d6c88d ("refperf: Allow decimal nanoseconds") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
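The standard 32-bit-safe pattern, as a sketch (variable and function names are illustrative):

    #include <linux/math64.h>

    /*
     * A bare u64 '/' needs a libgcc helper on 32-bit targets;
     * div_u64() takes a u32 divisor and works everywhere.
     */
    static u64 my_avg_ns(u64 total_ns, u32 nloops)
    {
        return div_u64(total_ns, nloops);
    }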
2020-06-29 | refperf: Adjust refperf.loops default value | Paul E. McKenney
With the various measurement optimizations, 10,000 loops normally suffices. This commit therefore reduces the refperf.loops default value from 10,000,000 to 10,000. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Add read-side delay module parameter | Paul E. McKenney
This commit adds a refperf.readdelay module parameter that controls the duration of each critical section. This parameter allows gathering data showing how the performance differences between the various primitives vary with critical-section length. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Simplify initialization-time wakeup protocol | Paul E. McKenney
This commit moves the reader-launch wait loop from ref_perf_init() to main_func(), removing one layer of wakeup and allowing slightly faster system boot. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Label experiment-number column "Runs" | Paul E. McKenney
The experiment-number column is currently labeled "Threads", which is misleading at best. This commit therefore relabels it as "Runs", and adjusts the scripts accordingly. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Add warmup and cooldown processing phases | Paul E. McKenney
This commit causes all readers to run an unmeasured load until every reader has completed at least one such run (thus having warmed up), then run the measured load, and then run the unmeasured load again until all readers have completed their measured load. This approach avoids any thread running the measured load while other readers are idle. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: More closely synchronize reader start times | Paul E. McKenney
Currently, readers are awakened individually. On most systems, this results in significant wakeup delay from one reader to the next, which can result in the first and last reader having sole access to the synchronization primitive in question. If that synchronization primitive involves shared memory, those readers will rack up a huge number of operations in a very short time, causing large perturbations in the results. This commit therefore has the readers busy-wait after being awakened, and uses a new n_started variable to synchronize their start times. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Convert reader_task structure's "start" field to int | Paul E. McKenney
This commit converts the reader_task structure's "start" field to int in order to demote a full barrier to an smp_load_acquire() and also to simplify the code a bit. While in the area, and to enlist the compiler's help in ensuring that nothing was missed, the field's name was changed to start_reader. Also while in the area, change the main_func() store to use smp_store_release() to further fortify against wait/wake races. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
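A sketch of the wait/wake handshake that an int flag plus release/acquire supports (the my_*() names and variables are illustrative):

    #include <linux/atomic.h>
    #include <linux/wait.h>

    static int my_start_reader;                 /* 0 = wait, 1 = go */
    static DECLARE_WAIT_QUEUE_HEAD(my_reader_wq);

    /*
     * The main thread publishes "go" with release semantics; readers observe
     * it with acquire semantics, so everything written before the release is
     * visible to a reader after its acquire.
     */
    static void my_kick_readers(void)
    {
        smp_store_release(&my_start_reader, 1);
        wake_up_all(&my_reader_wq);
    }

    static void my_wait_to_start(void)
    {
        wait_event(my_reader_wq, smp_load_acquire(&my_start_reader));
    }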
2020-06-29 | refperf: Tune reader measurement interval | Paul E. McKenney
This commit moves a printk() out of the measurement interval, converts an atomic_dec()/atomic_read() pair to atomic_dec_and_test(), and adds an smp_mb__before_atomic() to avoid potential wake/wait hangs. These changes have the added benefit of reducing the number of loops required for amortizing loop overhead for CONFIG_PREEMPT=n RCU measurements from 1,000,000 to 10,000. This reduction in turn shortens the test, reducing the probability of interference. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
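A sketch of the two patterns mentioned above (counter and wait-queue names are made up):

    #include <linux/atomic.h>
    #include <linux/wait.h>

    static atomic_t my_readers_running = ATOMIC_INIT(0);
    static DECLARE_WAIT_QUEUE_HEAD(my_main_wq);

    /* Only the last reader to finish issues the wakeup. */
    static void my_reader_done(void)
    {
        if (atomic_dec_and_test(&my_readers_running))
            wake_up(&my_main_wq);
    }

    /*
     * smp_mb__before_atomic() pairs with non-value-returning atomics such as
     * atomic_inc(), ordering this thread's earlier accesses before the RMW so
     * a waiter polling the counter cannot observe the count without them.
     */
    static void my_reader_started(void)
    {
        smp_mb__before_atomic();
        atomic_inc(&my_readers_running);
    }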
2020-06-29 | refperf: Make functions static | Paul E. McKenney
Because the reset_readers() and process_durations() functions are used only within kernel/rcu/refperf.c, this commit makes them static. Reported-by: kbuild test robot <lkp@intel.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Dynamically allocate thread-summary output buffer | Paul E. McKenney
Currently, the buffer used to accumulate the thread-summary output is fixed size, which will cause problems if someone decides to run on a large number of CPUs. This commit therefore dynamically allocates this buffer. [ paulmck: Fix memory allocation as suggested by KASAN. ] Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Dynamically allocate experiment-summary output buffer | Paul E. McKenney
Currently, the buffer used to accumulate the experiment-summary output is fixed size, which will cause problems if someone decides to run one hundred experiments. This commit therefore dynamically allocates this buffer. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Provide module parameter to specify number of experiments | Paul E. McKenney
The current code uses the number of threads both to limit the number of threads and to specify the number of experiments, but also varies the number of threads as the experiments progress. This commit takes a different approach by adding a refperf.nruns module parameter that specifies the number of experiments, and it furthermore uses the same number of threads for each experiment. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Convert nreaders to a module parameter | Paul E. McKenney
This commit converts nreaders to a module parameter, with the default of -1 specifying the old behavior of using 75% of the CPUs. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Allow decimal nanoseconds | Paul E. McKenney
The CONFIG_PREEMPT=n rcu_read_lock()/rcu_read_unlock() pair's overhead, even including loop overhead, is far less than one nanosecond. Since logscale plots are not all that happy with zero values, provide picoseconds as decimals. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
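A sketch of the integer arithmetic this implies (names are illustrative): compute picoseconds per operation, then print whole nanoseconds plus a three-digit fraction.

    #include <linux/math64.h>
    #include <linux/printk.h>

    static void my_print_avg(u64 total_ns, u32 nloops)
    {
        u64 ps = div_u64(total_ns * 1000, nloops);      /* picoseconds per op */

        pr_info("avg: %llu.%03llu ns\n", ps / 1000, ps % 1000);
    }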
2020-06-29 | refperf: Hoist function-pointer calls out of the loop | Paul E. McKenney
Current runs show PREEMPT=n rcu_read_lock()/rcu_read_unlock() pairs consuming between 20 and 30 nanoseconds, when in fact the actual value is zero, give or take the barrier() asm's effect on compiler optimizations. The additional overhead is caused by function calls through pointers (especially in these days of Spectre mitigations) and perhaps also needless argument passing, a non-const loop limit, and an upcounting loop. This commit therefore combines the ->readlock() and ->readunlock() function pointers into a single ->readsection() function pointer that takes the loop count as a const parameter and keeps any data passed from the read-lock to the read-unlock internal to this new function. These changes reduce the measured overhead of the aforementioned PREEMPT=n rcu_read_lock()/rcu_read_unlock() pairs from between 20 and 30 nanoseconds to somewhere south of 500 picoseconds. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
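A sketch of the consolidated operations structure (simplified; struct my_ref_ops and the my_*() names are illustrative):

    #include <linux/rcupdate.h>

    /*
     * One indirect call per measured section instead of one per lock/unlock
     * pair, with a const loop limit and the down-counting loop kept inside
     * the op.
     */
    struct my_ref_ops {
        void (*readsection)(const int nloops);
        const char *name;
    };

    static void my_rcu_readsection(const int nloops)
    {
        int i;

        for (i = nloops; i >= 0; i--) {
            rcu_read_lock();
            rcu_read_unlock();
        }
    }

    static const struct my_ref_ops my_rcu_ops = {
        .readsection = my_rcu_readsection,
        .name        = "rcu",
    };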
2020-06-29 | refperf: Add holdoff parameter to allow CPUs to come online | Paul E. McKenney
This commit adds a refperf module parameter named "holdoff" that defaults to 10 seconds if refperf is built in and to zero otherwise. The assumption is that all the CPUs are online by the time that the modprobe and insmod commands are going to do anything, and that normal systems will have all the CPUs online within ten seconds. Larger systems may take many tens of seconds or even minutes to get to this point, hence this being a module parameter instead of being a hard-coded constant. Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcuperf: Add comments explaining the high reader overhead | Paul E. McKenney
This commit adds comments explaining why the readers have otherwise insane levels of measurement overhead, namely that they are intended as a test load for update-side performance measurements, not as a straight-up read-side performance test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | refperf: Add a test to measure performance of read-side synchronization | Joel Fernandes (Google)
Add a test for comparing the performance of RCU with various read-side synchronization mechanisms. The test has proved useful for collecting data and performing these comparisons.

Currently RCU, SRCU, reader-writer lock, reader-writer semaphore, and reference counting can be measured using the refperf.perf_type parameter. Each invocation of the test measures the performance of one specific mechanism.

The maximum number of CPUs to concurrently run readers on is chosen by the test itself and is 75% of the total number of CPUs. So if you had 24 CPUs, the test runs with a maximum of 18 parallel readers. A number of experiments are conducted, and in each experiment the number of readers is increased by 1, up to the 75%-of-CPUs mark. During each experiment, all readers execute an empty loop with refperf.loops iterations and time the total loop duration. This is then averaged.

Example output with parameters "refperf.perf_type=srcu refperf.loops=2000000":

 [    3.347133] srcu-ref-perf:
 [    3.347133] Threads  Time(ns)
 [    3.347133] 1        36
 [    3.347133] 2        34
 [    3.347133] 3        34
 [    3.347133] 4        34
 [    3.347133] 5        33
 [    3.347133] 6        33
 [    3.347133] 7        33
 [    3.347133] 8        33
 [    3.347133] 9        33
 [    3.347133] 10       33
 [    3.347133] 11       33
 [    3.347133] 12       33
 [    3.347133] 13       33
 [    3.347133] 14       33
 [    3.347133] 15       32
 [    3.347133] 16       33
 [    3.347133] 17       33
 [    3.347133] 18       34

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcuperf: Remove useless while loops around wait_event | Joel Fernandes (Google)
wait_event() already retries if its condition is not satisfied after a wakeup. Remove the redundant while loops from the rcuperf test. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcu-tasks: Fix code-style issues | Paul E. McKenney
This commit declares trc_n_readers_need_end and trc_wait static and replaces a "&" with "&&". The "&" happened to work because the values are bool, but accidents waiting to happen and all that... Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcu-tasks: Conditionally compile show_rcu_tasks_gp_kthreads() | Paul E. McKenney
The show_rcu_tasks_gp_kthreads() function is not invoked by Tiny RCU, but is nevertheless defined in Tiny RCU builds that enable Tasks Trace RCU. This commit therefore conditionally compiles this function so that it is defined only in builds that actually use it. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-29 | rcu-tasks: Add #include of rcupdate_trace.h to update.c | Paul E. McKenney
Although this is in some strict sense unnecessary, it is good to allow the compiler to compare the function declaration with its definition. This commit therefore adds a #include of linux/rcupdate_trace.h to kernel/rcu/update.c. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>