diff options
Diffstat (limited to 'Documentation/RCU')
-rw-r--r-- | Documentation/RCU/Design/Data-Structures/Data-Structures.rst | 28 | ||||
-rw-r--r-- | Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst | 10 | ||||
-rw-r--r-- | Documentation/RCU/Design/Memory-Ordering/TreeRCU-dyntick.svg | 8 | ||||
-rw-r--r-- | Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg | 8 | ||||
-rw-r--r-- | Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg | 8 | ||||
-rw-r--r-- | Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg | 4 | ||||
-rw-r--r-- | Documentation/RCU/Design/Requirements/Requirements.rst | 19 | ||||
-rw-r--r-- | Documentation/RCU/checklist.rst | 69 | ||||
-rw-r--r-- | Documentation/RCU/rcu_dereference.rst | 5 | ||||
-rw-r--r-- | Documentation/RCU/stallwarn.rst | 2 | ||||
-rw-r--r-- | Documentation/RCU/torture.rst | 2 | ||||
-rw-r--r-- | Documentation/RCU/whatisRCU.rst | 57 |
12 files changed, 130 insertions, 90 deletions
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst index b34990c7c377..04e16775c752 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst @@ -921,10 +921,10 @@ This portion of the ``rcu_data`` structure is declared as follows: :: - 1 int dynticks_snap; + 1 int watching_snap; 2 unsigned long dynticks_fqs; -The ``->dynticks_snap`` field is used to take a snapshot of the +The ``->watching_snap`` field is used to take a snapshot of the corresponding CPU's dyntick-idle state when forcing quiescent states, and is therefore accessed from other CPUs. Finally, the ``->dynticks_fqs`` field is used to count the number of times this CPU @@ -935,8 +935,8 @@ This portion of the rcu_data structure is declared as follows: :: - 1 long dynticks_nesting; - 2 long dynticks_nmi_nesting; + 1 long nesting; + 2 long nmi_nesting; 3 atomic_t dynticks; 4 bool rcu_need_heavy_qs; 5 bool rcu_urgent_qs; @@ -945,14 +945,14 @@ These fields in the rcu_data structure maintain the per-CPU dyntick-idle state for the corresponding CPU. The fields may be accessed only from the corresponding CPU (and from tracing) unless otherwise stated. -The ``->dynticks_nesting`` field counts the nesting depth of process +The ``->nesting`` field counts the nesting depth of process execution, so that in normal circumstances this counter has value zero or one. NMIs, irqs, and tracers are counted by the -``->dynticks_nmi_nesting`` field. Because NMIs cannot be masked, changes +``->nmi_nesting`` field. Because NMIs cannot be masked, changes to this variable have to be undertaken carefully using an algorithm provided by Andy Lutomirski. The initial transition from idle adds one, and nested transitions add two, so that a nesting level of five is -represented by a ``->dynticks_nmi_nesting`` value of nine. This counter +represented by a ``->nmi_nesting`` value of nine. This counter can therefore be thought of as counting the number of reasons why this CPU cannot be permitted to enter dyntick-idle mode, aside from process-level transitions. @@ -960,12 +960,12 @@ process-level transitions. However, it turns out that when running in non-idle kernel context, the Linux kernel is fully capable of entering interrupt handlers that never exit and perhaps also vice versa. Therefore, whenever the -``->dynticks_nesting`` field is incremented up from zero, the -``->dynticks_nmi_nesting`` field is set to a large positive number, and -whenever the ``->dynticks_nesting`` field is decremented down to zero, -the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that +``->nesting`` field is incremented up from zero, the +``->nmi_nesting`` field is set to a large positive number, and +whenever the ``->nesting`` field is decremented down to zero, +the ``->nmi_nesting`` field is set to zero. Assuming that the number of misnested interrupts is not sufficient to overflow the -counter, this approach corrects the ``->dynticks_nmi_nesting`` field +counter, this approach corrects the ``->nmi_nesting`` field every time the corresponding CPU enters the idle loop from process context. @@ -992,8 +992,8 @@ code. +-----------------------------------------------------------------------+ | **Quick Quiz**: | +-----------------------------------------------------------------------+ -| Why not simply combine the ``->dynticks_nesting`` and | -| ``->dynticks_nmi_nesting`` counters into a single counter that just | +| Why not simply combine the ``->nesting`` and | +| ``->nmi_nesting`` counters into a single counter that just | | counts the number of reasons that the corresponding CPU is non-idle? | +-----------------------------------------------------------------------+ | **Answer**: | diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst index 5750f125361b..1a5ff1a9f02e 100644 --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst @@ -147,11 +147,11 @@ RCU read-side critical sections preceding and following the current idle sojourn. This case is handled by calls to the strongly ordered ``atomic_add_return()`` read-modify-write atomic operation that -is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry -time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time. -The grace-period kthread invokes ``rcu_dynticks_snap()`` and -``rcu_dynticks_in_eqs_since()`` (both of which invoke -an ``atomic_add_return()`` of zero) to detect idle CPUs. +is invoked within ``ct_kernel_exit_state()`` at idle-entry +time and within ``ct_kernel_enter_state()`` at idle-exit time. +The grace-period kthread invokes first ``ct_rcu_watching_cpu_acquire()`` +(preceded by a full memory barrier) and ``rcu_watching_snap_stopped_since()`` +(both of which rely on acquire semantics) to detect idle CPUs. +-----------------------------------------------------------------------+ | **Quick Quiz**: | diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-dyntick.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-dyntick.svg index 423df00c4df9..3fbc19c48a58 100644 --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-dyntick.svg +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-dyntick.svg @@ -528,7 +528,7 @@ font-style="normal" y="-8652.5312" x="2466.7822" - xml:space="preserve">dyntick_save_progress_counter()</text> + xml:space="preserve">rcu_watching_snap_save()</text> <text style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier" id="text202-7-2-7-2-0" @@ -537,7 +537,7 @@ font-style="normal" y="-8368.1475" x="2463.3262" - xml:space="preserve">rcu_implicit_dynticks_qs()</text> + xml:space="preserve">rcu_watching_snap_recheck()</text> </g> <g id="g4504" @@ -607,7 +607,7 @@ font-weight="bold" font-size="192" id="text202-7-5-3-27-6" - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text> <text xml:space="preserve" x="3745.7725" @@ -638,7 +638,7 @@ font-weight="bold" font-size="192" id="text202-7-5-3-27-6-1" - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text> <text xml:space="preserve" x="3745.7725" diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg index d82a77d03d8c..25c7acc8a4c2 100644 --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg @@ -844,7 +844,7 @@ font-style="normal" y="1547.8876" x="4417.6396" - xml:space="preserve">dyntick_save_progress_counter()</text> + xml:space="preserve">rcu_watching_snap_save()</text> <g style="fill:none;stroke-width:0.025in" transform="translate(6501.9719,-10685.904)" @@ -899,7 +899,7 @@ font-style="normal" y="1858.8729" x="4414.1836" - xml:space="preserve">rcu_implicit_dynticks_qs()</text> + xml:space="preserve">rcu_watching_snap_recheck()</text> <text xml:space="preserve" x="14659.87" @@ -977,7 +977,7 @@ font-weight="bold" font-size="192" id="text202-7-5-3-27-6" - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text> <text xml:space="preserve" x="3745.7725" @@ -1008,7 +1008,7 @@ font-weight="bold" font-size="192" id="text202-7-5-3-27-6-1" - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text> <text xml:space="preserve" x="3745.7725" diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg index 53e0dc2a2c79..d05bc7b27edb 100644 --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg @@ -2974,7 +2974,7 @@ font-style="normal" y="38114.047" x="-334.33856" - xml:space="preserve">dyntick_save_progress_counter()</text> + xml:space="preserve">rcu_watching_snap_save()</text> <g style="fill:none;stroke-width:0.025in" transform="translate(1749.9916,25880.249)" @@ -3029,7 +3029,7 @@ font-style="normal" y="38425.035" x="-337.79462" - xml:space="preserve">rcu_implicit_dynticks_qs()</text> + xml:space="preserve">rcu_watching_snap_recheck()</text> <text xml:space="preserve" x="9907.8887" @@ -3107,7 +3107,7 @@ font-weight="bold" font-size="192" id="text202-7-5-3-27-6" - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text> <text xml:space="preserve" x="3745.7725" @@ -3138,7 +3138,7 @@ font-weight="bold" font-size="192" id="text202-7-5-3-27-6-1" - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text> + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text> <text xml:space="preserve" x="3745.7725" diff --git a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg index 4fa7506082bf..a92356ce4011 100644 --- a/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg +++ b/Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg @@ -516,7 +516,7 @@ font-style="normal" y="-8652.5312" x="2466.7822" - xml:space="preserve">dyntick_save_progress_counter()</text> + xml:space="preserve">rcu_watching_snap_save()</text> <text style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier" id="text202-7-2-7-2-0" @@ -525,7 +525,7 @@ font-style="normal" y="-8368.1475" x="2463.3262" - xml:space="preserve">rcu_implicit_dynticks_qs()</text> + xml:space="preserve">rcu_watching_snap_recheck()</text> <text sodipodi:linespacing="125%" style="font-size:192px;font-style:normal;font-weight:bold;line-height:125%;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier" diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index cccafdaa1f84..6125e7068d2c 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -2357,6 +2357,7 @@ section. #. `Sched Flavor (Historical)`_ #. `Sleepable RCU`_ #. `Tasks RCU`_ +#. `Tasks Trace RCU`_ Bottom-Half Flavor (Historical) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -2610,6 +2611,16 @@ critical sections that are delimited by voluntary context switches, that is, calls to schedule(), cond_resched(), and synchronize_rcu_tasks(). In addition, transitions to and from userspace execution also delimit tasks-RCU read-side critical sections. +Idle tasks are ignored by Tasks RCU, and Tasks Rude RCU may be used to +interact with them. + +Note well that involuntary context switches are *not* Tasks-RCU quiescent +states. After all, in preemptible kernels, a task executing code in a +trampoline might be preempted. In this case, the Tasks-RCU grace period +clearly cannot end until that task resumes and its execution leaves that +trampoline. This means, among other things, that cond_resched() does +not provide a Tasks RCU quiescent state. (Instead, use rcu_softirq_qs() +from softirq or rcu_tasks_classic_qs() otherwise.) The tasks-RCU API is quite compact, consisting only of call_rcu_tasks(), synchronize_rcu_tasks(), and @@ -2632,9 +2643,13 @@ moniker. And this operation is considered to be quite rude by real-time workloads that don't want their ``nohz_full`` CPUs receiving IPIs and by battery-powered systems that don't want their idle CPUs to be awakened. +Once kernel entry/exit and deep-idle functions have been properly tagged +``noinstr``, Tasks RCU can start paying attention to idle tasks (except +those that are idle from RCU's perspective) and then Tasks Rude RCU can +be removed from the kernel. + The tasks-rude-RCU API is also reader-marking-free and thus quite compact, -consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(), -and rcu_barrier_tasks_rude(). +consisting solely of synchronize_rcu_tasks_rude(). Tasks Trace RCU ~~~~~~~~~~~~~~~ diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst index 2d42998a89a6..7de3e308f330 100644 --- a/Documentation/RCU/checklist.rst +++ b/Documentation/RCU/checklist.rst @@ -68,7 +68,8 @@ over a rather long period of time, but improvements are always welcome! rcu_read_lock_sched(), or by the appropriate update-side lock. Explicit disabling of preemption (preempt_disable(), for example) can serve as rcu_read_lock_sched(), but is less readable and - prevents lockdep from detecting locking issues. + prevents lockdep from detecting locking issues. Acquiring a + spinlock also enters an RCU read-side critical section. Please note that you *cannot* rely on code known to be built only in non-preemptible kernels. Such code can and will break, @@ -193,14 +194,13 @@ over a rather long period of time, but improvements are always welcome! when publicizing a pointer to a structure that can be traversed by an RCU read-side critical section. -5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), - call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used, - the callback function may be invoked from softirq context, - and in any case with bottom halves disabled. In particular, - this callback function cannot block. If you need the callback - to block, run that code in a workqueue handler scheduled from - the callback. The queue_rcu_work() function does this for you - in the case of call_rcu(). +5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), or + call_rcu_tasks_trace() is used, the callback function may be + invoked from softirq context, and in any case with bottom halves + disabled. In particular, this callback function cannot block. + If you need the callback to block, run that code in a workqueue + handler scheduled from the callback. The queue_rcu_work() + function does this for you in the case of call_rcu(). 6. Since synchronize_rcu() can block, it cannot be called from any sort of irq context. The same rule applies @@ -253,10 +253,10 @@ over a rather long period of time, but improvements are always welcome! corresponding readers must use rcu_read_lock_trace() and rcu_read_unlock_trace(). - c. If an updater uses call_rcu_tasks_rude() or - synchronize_rcu_tasks_rude(), then the corresponding - readers must use anything that disables preemption, - for example, preempt_disable() and preempt_enable(). + c. If an updater uses synchronize_rcu_tasks_rude(), + then the corresponding readers must use anything that + disables preemption, for example, preempt_disable() + and preempt_enable(). Mixing things up will result in confusion and broken kernels, and has even resulted in an exploitable security issue. Therefore, @@ -325,11 +325,9 @@ over a rather long period of time, but improvements are always welcome! d. Periodically invoke rcu_barrier(), permitting a limited number of updates per grace period. - The same cautions apply to call_srcu(), call_rcu_tasks(), - call_rcu_tasks_rude(), and call_rcu_tasks_trace(). This is - why there is an srcu_barrier(), rcu_barrier_tasks(), - rcu_barrier_tasks_rude(), and rcu_barrier_tasks_rude(), - respectively. + The same cautions apply to call_srcu(), call_rcu_tasks(), and + call_rcu_tasks_trace(). This is why there is an srcu_barrier(), + rcu_barrier_tasks(), and rcu_barrier_tasks_trace(), respectively. Note that although these primitives do take action to avoid memory exhaustion when any given CPU has too many callbacks, @@ -383,15 +381,16 @@ over a rather long period of time, but improvements are always welcome! to safely access and/or modify that data structure. Do not assume that RCU callbacks will be executed on the same - CPU that executed the corresponding call_rcu() or call_srcu(). - For example, if a given CPU goes offline while having an RCU - callback pending, then that RCU callback will execute on some - surviving CPU. (If this was not the case, a self-spawning RCU - callback would prevent the victim CPU from ever going offline.) - Furthermore, CPUs designated by rcu_nocbs= might well *always* - have their RCU callbacks executed on some other CPUs, in fact, - for some real-time workloads, this is the whole point of using - the rcu_nocbs= kernel boot parameter. + CPU that executed the corresponding call_rcu(), call_srcu(), + call_rcu_tasks(), or call_rcu_tasks_trace(). For example, if + a given CPU goes offline while having an RCU callback pending, + then that RCU callback will execute on some surviving CPU. + (If this was not the case, a self-spawning RCU callback would + prevent the victim CPU from ever going offline.) Furthermore, + CPUs designated by rcu_nocbs= might well *always* have their + RCU callbacks executed on some other CPUs, in fact, for some + real-time workloads, this is the whole point of using the + rcu_nocbs= kernel boot parameter. In addition, do not assume that callbacks queued in a given order will be invoked in that order, even if they all are queued on the @@ -444,7 +443,7 @@ over a rather long period of time, but improvements are always welcome! real-time workloads than is synchronize_rcu_expedited(). It is also permissible to sleep in RCU Tasks Trace read-side - critical, which are delimited by rcu_read_lock_trace() and + critical section, which are delimited by rcu_read_lock_trace() and rcu_read_unlock_trace(). However, this is a specialized flavor of RCU, and you should not use it without first checking with its current users. In most cases, you should instead use SRCU. @@ -490,6 +489,12 @@ over a rather long period of time, but improvements are always welcome! since the last time that you passed that same object to call_rcu() (or friends). + CONFIG_RCU_STRICT_GRACE_PERIOD: + combine with KASAN to check for pointers leaked out + of RCU read-side critical sections. This Kconfig + option is tough on both performance and scalability, + and so is limited to four-CPU systems. + __rcu sparse checks: tag the pointer to the RCU-protected data structure with __rcu, and sparse will warn you if you access that @@ -499,9 +504,9 @@ over a rather long period of time, but improvements are always welcome! These debugging aids can help you find problems that are otherwise extremely difficult to spot. -17. If you pass a callback function defined within a module to one of - call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), - or call_rcu_tasks_trace(), then it is necessary to wait for all +17. If you pass a callback function defined within a module + to one of call_rcu(), call_srcu(), call_rcu_tasks(), or + call_rcu_tasks_trace(), then it is necessary to wait for all pending callbacks to be invoked before unloading that module. Note that it is absolutely *not* sufficient to wait for a grace period! For example, synchronize_rcu() implementation is *not* @@ -514,7 +519,6 @@ over a rather long period of time, but improvements are always welcome! - call_rcu() -> rcu_barrier() - call_srcu() -> srcu_barrier() - call_rcu_tasks() -> rcu_barrier_tasks() - - call_rcu_tasks_rude() -> rcu_barrier_tasks_rude() - call_rcu_tasks_trace() -> rcu_barrier_tasks_trace() However, these barrier functions are absolutely *not* guaranteed @@ -531,7 +535,6 @@ over a rather long period of time, but improvements are always welcome! - Either synchronize_srcu() or synchronize_srcu_expedited(), together with and srcu_barrier() - synchronize_rcu_tasks() and rcu_barrier_tasks() - - synchronize_tasks_rude() and rcu_barrier_tasks_rude() - synchronize_tasks_trace() and rcu_barrier_tasks_trace() If necessary, you can use something like workqueues to execute diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst index 659d5913784d..2524dcdadde2 100644 --- a/Documentation/RCU/rcu_dereference.rst +++ b/Documentation/RCU/rcu_dereference.rst @@ -408,7 +408,10 @@ member of the rcu_dereference() to use in various situations: RCU flavors, an RCU read-side critical section is entered using rcu_read_lock(), anything that disables bottom halves, anything that disables interrupts, or anything that disables - preemption. + preemption. Please note that spinlock critical sections + are also implied RCU read-side critical sections, even when + they are preemptible, as they are in kernels built with + CONFIG_PREEMPT_RT=y. 2. If the access might be within an RCU read-side critical section on the one hand, or protected by (say) my_lock on the other, diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst index ca7b7cd806a1..30080ff6f406 100644 --- a/Documentation/RCU/stallwarn.rst +++ b/Documentation/RCU/stallwarn.rst @@ -249,7 +249,7 @@ ticks this GP)" indicates that this CPU has not taken any scheduling-clock interrupts during the current stalled grace period. The "idle=" portion of the message prints the dyntick-idle state. -The hex number before the first "/" is the low-order 12 bits of the +The hex number before the first "/" is the low-order 16 bits of the dynticks counter, which will have an even-numbered value if the CPU is in dyntick-idle mode and an odd-numbered value otherwise. The hex number between the two "/"s is the value of the nesting, which will be diff --git a/Documentation/RCU/torture.rst b/Documentation/RCU/torture.rst index 49e7beea6ae1..4b1f99c4181f 100644 --- a/Documentation/RCU/torture.rst +++ b/Documentation/RCU/torture.rst @@ -318,7 +318,7 @@ Suppose that a previous kvm.sh run left its output in this directory:: tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 -Then this run can be re-run without rebuilding as follow: +Then this run can be re-run without rebuilding as follow:: kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst index 60ce02475142..1ef5784c1b84 100644 --- a/Documentation/RCU/whatisRCU.rst +++ b/Documentation/RCU/whatisRCU.rst @@ -172,14 +172,25 @@ rcu_read_lock() critical section. Reference counts may be used in conjunction with RCU to maintain longer-term references to data structures. + Note that anything that disables bottom halves, preemption, + or interrupts also enters an RCU read-side critical section. + Acquiring a spinlock also enters an RCU read-side critical + sections, even for spinlocks that do not disable preemption, + as is the case in kernels built with CONFIG_PREEMPT_RT=y. + Sleeplocks do *not* enter RCU read-side critical sections. + rcu_read_unlock() ^^^^^^^^^^^^^^^^^ void rcu_read_unlock(void); This temporal primitives is used by a reader to inform the reclaimer that the reader is exiting an RCU read-side critical - section. Note that RCU read-side critical sections may be nested - and/or overlapping. + section. Anything that enables bottom halves, preemption, + or interrupts also exits an RCU read-side critical section. + Releasing a spinlock also exits an RCU read-side critical section. + + Note that RCU read-side critical sections may be nested and/or + overlapping. synchronize_rcu() ^^^^^^^^^^^^^^^^^ @@ -239,21 +250,25 @@ rcu_assign_pointer() ^^^^^^^^^^^^^^^^^^^^ void rcu_assign_pointer(p, typeof(p) v); - Yes, rcu_assign_pointer() **is** implemented as a macro, though it - would be cool to be able to declare a function in this manner. - (Compiler experts will no doubt disagree.) + Yes, rcu_assign_pointer() **is** implemented as a macro, though + it would be cool to be able to declare a function in this manner. + (And there has been some discussion of adding overloaded functions + to the C language, so who knows?) The updater uses this spatial macro to assign a new value to an RCU-protected pointer, in order to safely communicate the change in value from the updater to the reader. This is a spatial (as opposed to temporal) macro. It does not evaluate to an rvalue, - but it does execute any memory-barrier instructions required - for a given CPU architecture. Its ordering properties are that - of a store-release operation. - - Perhaps just as important, it serves to document (1) which - pointers are protected by RCU and (2) the point at which a - given structure becomes accessible to other CPUs. That said, + but it does provide any compiler directives and memory-barrier + instructions required for a given compile or CPU architecture. + Its ordering properties are that of a store-release operation, + that is, any prior loads and stores required to initialize the + structure are ordered before the store that publishes the pointer + to that structure. + + Perhaps just as important, rcu_assign_pointer() serves to document + (1) which pointers are protected by RCU and (2) the point at which + a given structure becomes accessible to other CPUs. That said, rcu_assign_pointer() is most frequently used indirectly, via the _rcu list-manipulation primitives such as list_add_rcu(). @@ -272,7 +287,11 @@ rcu_dereference() executes any needed memory-barrier instructions for a given CPU architecture. Currently, only Alpha needs memory barriers within rcu_dereference() -- on other CPUs, it compiles to a - volatile load. + volatile load. However, no mainstream C compilers respect + address dependencies, so rcu_dereference() uses volatile casts, + which, in combination with the coding guidelines listed in + rcu_dereference.rst, prevent current compilers from breaking + these dependencies. Common coding practice uses rcu_dereference() to copy an RCU-protected pointer to a local variable, then dereferences @@ -416,7 +435,7 @@ their assorted primitives. This section shows a simple use of the core RCU API to protect a global pointer to a dynamically allocated structure. More-typical -uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst. +uses of RCU may be found in listRCU.rst and NMI-RCU.rst. :: struct foo { @@ -499,8 +518,8 @@ So, to sum up: data item. See checklist.rst for additional rules to follow when using RCU. -And again, more-typical uses of RCU may be found in listRCU.rst, -arrayRCU.rst, and NMI-RCU.rst. +And again, more-typical uses of RCU may be found in listRCU.rst +and NMI-RCU.rst. .. _4_whatisRCU: @@ -952,8 +971,8 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be initialized after each and every call to kmem_cache_alloc(), which renders reference-free spinlock acquisition completely unsafe. Therefore, when using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter. -(Those willing to use a kmem_cache constructor may also use locking, -including cache-friendly sequence locking.) +(Those willing to initialize their locks in a kmem_cache constructor +may also use locking, including cache-friendly sequence locking.) With traditional reference counting -- such as that implemented by the kref library in Linux -- there is typically code that runs when the last @@ -1084,7 +1103,7 @@ RCU-Tasks-Rude:: Critical sections Grace period Barrier - N/A call_rcu_tasks_rude rcu_barrier_tasks_rude + N/A N/A synchronize_rcu_tasks_rude |