Diffstat (limited to 'Documentation/RCU/whatisRCU.rst')
-rw-r--r--  Documentation/RCU/whatisRCU.rst | 225
1 file changed, 173 insertions(+), 52 deletions(-)
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index e488c8e557a9..a1582bd653d1 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -15,6 +15,9 @@ to start learning about RCU:
| 2014 Big API Table https://lwn.net/Articles/609973/
| 6. The RCU API, 2019 Edition https://lwn.net/Articles/777036/
| 2019 Big API Table https://lwn.net/Articles/777165/
+| 7. The RCU API, 2024 Edition https://lwn.net/Articles/988638/
+| 2024 Background Information https://lwn.net/Articles/988641/
+| 2024 Big API Table https://lwn.net/Articles/988666/
For those preferring video:
@@ -59,8 +62,8 @@ experiment with should focus on Section 2. People who prefer to start
with example uses should focus on Sections 3 and 4. People who need to
understand the RCU implementation should focus on Section 5, then dive
into the kernel source code. People who reason best by analogy should
-focus on Section 6. Section 7 serves as an index to the docbook API
-documentation, and Section 8 is the traditional answer key.
+focus on Sections 6 and 7. Section 8 serves as an index to the docbook
+API documentation, and Section 9 is the traditional answer key.
So, start with the section that makes the most sense to you and your
preferred method of learning. If you need to know everything about
@@ -172,14 +175,25 @@ rcu_read_lock()
critical section. Reference counts may be used in conjunction
with RCU to maintain longer-term references to data structures.
+ Note that anything that disables bottom halves, preemption,
+ or interrupts also enters an RCU read-side critical section.
+ Acquiring a spinlock also enters an RCU read-side critical
+ section, even for spinlocks that do not disable preemption,
+ as is the case in kernels built with CONFIG_PREEMPT_RT=y.
+ Sleeplocks do *not* enter RCU read-side critical sections.
+
rcu_read_unlock()
^^^^^^^^^^^^^^^^^
void rcu_read_unlock(void);
This temporal primitive is used by a reader to inform the
reclaimer that the reader is exiting an RCU read-side critical
- section. Note that RCU read-side critical sections may be nested
- and/or overlapping.
+ section. Anything that enables bottom halves, preemption,
+ or interrupts also exits an RCU read-side critical section.
+ Releasing a spinlock also exits an RCU read-side critical section.
+
+ Note that RCU read-side critical sections may be nested and/or
+ overlapping.
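
[Editor's note: as a concrete illustration of the reader-side pairing
described above, a minimal sketch; struct foo and gbl_foo are
hypothetical names echoing the example in Section 3.] ::

	struct foo {
		int a;
	};
	struct foo __rcu *gbl_foo;

	/* Reader: bracket the access, fetch the pointer, then use it. */
	int read_foo_a(void)
	{
		int ret;

		rcu_read_lock();
		ret = rcu_dereference(gbl_foo)->a;  /* assumes gbl_foo non-NULL */
		rcu_read_unlock();
		return ret;
	}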
synchronize_rcu()
^^^^^^^^^^^^^^^^^
@@ -239,21 +253,25 @@ rcu_assign_pointer()
^^^^^^^^^^^^^^^^^^^^
void rcu_assign_pointer(p, typeof(p) v);
- Yes, rcu_assign_pointer() **is** implemented as a macro, though it
- would be cool to be able to declare a function in this manner.
- (Compiler experts will no doubt disagree.)
+ Yes, rcu_assign_pointer() **is** implemented as a macro, though
+ it would be cool to be able to declare a function in this manner.
+ (And there has been some discussion of adding overloaded functions
+ to the C language, so who knows?)
The updater uses this spatial macro to assign a new value to an
RCU-protected pointer, in order to safely communicate the change
in value from the updater to the reader. This is a spatial (as
opposed to temporal) macro. It does not evaluate to an rvalue,
- but it does execute any memory-barrier instructions required
- for a given CPU architecture. Its ordering properties are that
- of a store-release operation.
-
- Perhaps just as important, it serves to document (1) which
- pointers are protected by RCU and (2) the point at which a
- given structure becomes accessible to other CPUs. That said,
+ but it does provide any compiler directives and memory-barrier
+ instructions required for a given compiler or CPU architecture.
+ Its ordering properties are that of a store-release operation,
+ that is, any prior loads and stores required to initialize the
+ structure are ordered before the store that publishes the pointer
+ to that structure.
+
+ Perhaps just as important, rcu_assign_pointer() serves to document
+ (1) which pointers are protected by RCU and (2) the point at which
+ a given structure becomes accessible to other CPUs. That said,
rcu_assign_pointer() is most frequently used indirectly, via
the _rcu list-manipulation primitives such as list_add_rcu().
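
[Editor's note: a sketch of the publish step just described; foo_mutex
and the struct foo field are assumed names in the style of the
Section 3 example.] ::

	DEFINE_SPINLOCK(foo_mutex);

	/* Updater: fully initialize, then publish with rcu_assign_pointer(). */
	void publish_foo(struct foo *new_fp)
	{
		new_fp->a = 42;		/* initialization is ordered ... */
		spin_lock(&foo_mutex);
		rcu_assign_pointer(gbl_foo, new_fp);	/* ... before this store */
		spin_unlock(&foo_mutex);
	}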
@@ -272,7 +290,11 @@ rcu_dereference()
executes any needed memory-barrier instructions for a given
CPU architecture. Currently, only Alpha needs memory barriers
within rcu_dereference() -- on other CPUs, it compiles to a
- volatile load.
+ volatile load. However, no mainstream C compilers respect
+ address dependencies, so rcu_dereference() uses volatile casts,
+ which, in combination with the coding guidelines listed in
+ rcu_dereference.rst, prevent current compilers from breaking
+ these dependencies.
Common coding practice uses rcu_dereference() to copy an
RCU-protected pointer to a local variable, then dereferences
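
[Editor's note: that local-variable idiom might look like the following
sketch, where do_something_with() is a hypothetical consumer; the point
is that the pointer is fetched once and then used, rather than
re-fetched.] ::

	struct foo *p;

	rcu_read_lock();
	p = rcu_dereference(gbl_foo);		/* one fetch ... */
	if (p)
		do_something_with(p->a);	/* ... then ordinary uses */
	rcu_read_unlock();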
@@ -416,7 +438,7 @@ their assorted primitives.
This section shows a simple use of the core RCU API to protect a
global pointer to a dynamically allocated structure. More-typical
-uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst.
+uses of RCU may be found in listRCU.rst and NMI-RCU.rst.
::
struct foo {
@@ -499,8 +521,8 @@ So, to sum up:
data item.
See checklist.rst for additional rules to follow when using RCU.
-And again, more-typical uses of RCU may be found in listRCU.rst,
-arrayRCU.rst, and NMI-RCU.rst.
+And again, more-typical uses of RCU may be found in listRCU.rst
+and NMI-RCU.rst.
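
[Editor's note: for reference, the removal side of such an example
follows the classic update-then-wait-then-free sequence; a sketch using
the hypothetical gbl_foo and foo_mutex names from above.] ::

	/* Updater: unpublish the old structure, wait, then free it. */
	void update_foo(struct foo *new_fp)
	{
		struct foo *old_fp;

		spin_lock(&foo_mutex);
		old_fp = rcu_dereference_protected(gbl_foo,
					lockdep_is_held(&foo_mutex));
		rcu_assign_pointer(gbl_foo, new_fp);
		spin_unlock(&foo_mutex);
		synchronize_rcu();	/* wait for pre-existing readers */
		kfree(old_fp);
	}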
.. _4_whatisRCU:
@@ -952,8 +974,18 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
initialized after each and every call to kmem_cache_alloc(), which renders
reference-free spinlock acquisition completely unsafe. Therefore, when
using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
-(Those willing to use a kmem_cache constructor may also use locking,
-including cache-friendly sequence locking.)
+If using refcount_t, the specialized refcount_{add|inc}_not_zero_acquire()
+and refcount_set_release() APIs should be used to ensure correct operation
+ordering when verifying object identity and when initializing newly
+allocated objects. The acquire fence in refcount_{add|inc}_not_zero_acquire()
+ensures that the identity check happens *after* the reference count is taken.
+refcount_set_release() should be called after a newly allocated object is
+fully initialized, and its release fence ensures that the new values are
+visible *before* the refcount can be successfully taken by other users.
+Once refcount_set_release() is called, the object should be considered
+visible to other tasks.
+(Those willing to initialize their locks in a kmem_cache constructor
+may also use locking, including cache-friendly sequence locking.)
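
[Editor's note: a sketch of the refcount_t ordering described above,
assuming a hypothetical struct obj with a refcount_t ref field and a
key, allocated from a SLAB_TYPESAFE_BY_RCU cache, with lookup() and
put_obj() standing in for the real lookup and release paths.] ::

	/* Allocation: initialize fully, then make the object live. */
	obj = kmem_cache_alloc(obj_cache, GFP_KERNEL);
	obj->key = key;				/* full initialization ... */
	refcount_set_release(&obj->ref, 1);	/* ... before the release store */

	/* Lookup: take a reference first, then verify identity. */
	rcu_read_lock();
	obj = lookup(key);
	if (obj && !refcount_inc_not_zero_acquire(&obj->ref))
		obj = NULL;			/* object is being freed */
	if (obj && obj->key != key) {		/* identity check after acquire */
		put_obj(obj);			/* object was recycled */
		obj = NULL;
	}
	rcu_read_unlock();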
With traditional reference counting -- such as that implemented by the
kref library in Linux -- there is typically code that runs when the last
@@ -989,32 +1021,41 @@ RCU list traversal::
list_entry_rcu
list_entry_lockless
list_first_entry_rcu
+ list_first_or_null_rcu
+ list_tail_rcu
list_next_rcu
+ list_next_or_null_rcu
list_for_each_entry_rcu
list_for_each_entry_continue_rcu
list_for_each_entry_from_rcu
- list_first_or_null_rcu
- list_next_or_null_rcu
+ list_for_each_entry_lockless
hlist_first_rcu
hlist_next_rcu
hlist_pprev_rcu
hlist_for_each_entry_rcu
+ hlist_for_each_entry_rcu_notrace
hlist_for_each_entry_rcu_bh
hlist_for_each_entry_from_rcu
hlist_for_each_entry_continue_rcu
hlist_for_each_entry_continue_rcu_bh
hlist_nulls_first_rcu
+ hlist_nulls_next_rcu
hlist_nulls_for_each_entry_rcu
+ hlist_nulls_for_each_entry_safe
hlist_bl_first_rcu
hlist_bl_for_each_entry_rcu
RCU pointer/list update::
rcu_assign_pointer
+ rcu_replace_pointer
+ INIT_LIST_HEAD_RCU
list_add_rcu
list_add_tail_rcu
list_del_rcu
list_replace_rcu
+ list_splice_init_rcu
+ list_splice_tail_init_rcu
hlist_add_behind_rcu
hlist_add_before_rcu
hlist_add_head_rcu
@@ -1022,34 +1063,53 @@ RCU pointer/list update::
hlist_del_rcu
hlist_del_init_rcu
hlist_replace_rcu
- list_splice_init_rcu
- list_splice_tail_init_rcu
hlist_nulls_del_init_rcu
hlist_nulls_del_rcu
hlist_nulls_add_head_rcu
+ hlist_nulls_add_tail_rcu
+ hlist_nulls_add_fake
+ hlists_swap_heads_rcu
hlist_bl_add_head_rcu
- hlist_bl_del_init_rcu
hlist_bl_del_rcu
hlist_bl_set_first_rcu
RCU::
-   Critical sections            Grace period                  Barrier
-
-   rcu_read_lock                synchronize_net               rcu_barrier
-   rcu_read_unlock              synchronize_rcu
-   rcu_dereference              synchronize_rcu_expedited
-   rcu_read_lock_held           call_rcu
-   rcu_dereference_check        kfree_rcu
-   rcu_dereference_protected
+   Critical sections            Grace period                                Barrier
+
+   rcu_read_lock                synchronize_net                             rcu_barrier
+   rcu_read_unlock              synchronize_rcu
+   guard(rcu)()                 synchronize_rcu_expedited
+   scoped_guard(rcu)            synchronize_rcu_mult
+   rcu_dereference              call_rcu
+   rcu_dereference_check        call_rcu_hurry
+   rcu_dereference_protected    kfree_rcu
+   rcu_read_lock_held           kvfree_rcu
+   rcu_read_lock_any_held       kfree_rcu_mightsleep
+   rcu_pointer_handoff          cond_synchronize_rcu
+   unrcu_pointer                cond_synchronize_rcu_full
+                                cond_synchronize_rcu_expedited
+                                cond_synchronize_rcu_expedited_full
+                                get_completed_synchronize_rcu
+                                get_completed_synchronize_rcu_full
+                                get_state_synchronize_rcu
+                                get_state_synchronize_rcu_full
+                                poll_state_synchronize_rcu
+                                poll_state_synchronize_rcu_full
+                                same_state_synchronize_rcu
+                                same_state_synchronize_rcu_full
+                                start_poll_synchronize_rcu
+                                start_poll_synchronize_rcu_full
+                                start_poll_synchronize_rcu_expedited
+                                start_poll_synchronize_rcu_expedited_full
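
[Editor's note: the guard(rcu)() and scoped_guard(rcu) entries above
are the scope-based equivalents of rcu_read_lock()/rcu_read_unlock()
provided via linux/cleanup.h; a minimal sketch, reusing the
hypothetical gbl_foo and do_something_with() from earlier notes.] ::

	int read_foo_a_guarded(void)
	{
		guard(rcu)();		/* rcu_read_unlock() at scope exit */
		return rcu_dereference(gbl_foo)->a;
	}

	void peek_foo(void)
	{
		scoped_guard(rcu) {	/* reader section is just this block */
			do_something_with(rcu_dereference(gbl_foo)->a);
		}
	}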
bh::
    Critical sections            Grace period                  Barrier
-   rcu_read_lock_bh             call_rcu                      rcu_barrier
-   rcu_read_unlock_bh           synchronize_rcu
-   [local_bh_disable]           synchronize_rcu_expedited
+   rcu_read_lock_bh             [Same as RCU]                 [Same as RCU]
+   rcu_read_unlock_bh
+   [local_bh_disable]
[and friends]
rcu_dereference_bh
rcu_dereference_bh_check
@@ -1060,9 +1120,9 @@ sched::
    Critical sections            Grace period                  Barrier
-   rcu_read_lock_sched          call_rcu                      rcu_barrier
-   rcu_read_unlock_sched        synchronize_rcu
-   [preempt_disable]            synchronize_rcu_expedited
+   rcu_read_lock_sched          [Same as RCU]                 [Same as RCU]
+   rcu_read_unlock_sched
+   [preempt_disable]
[and friends]
rcu_read_lock_sched_notrace
rcu_read_unlock_sched_notrace
@@ -1072,46 +1132,107 @@ sched::
rcu_read_lock_sched_held
+RCU: Initialization/cleanup/ordering::
+
+ RCU_INIT_POINTER
+ RCU_INITIALIZER
+ RCU_POINTER_INITIALIZER
+ init_rcu_head
+ destroy_rcu_head
+ init_rcu_head_on_stack
+ destroy_rcu_head_on_stack
+ SLAB_TYPESAFE_BY_RCU
+
+
+RCU: Quiescent states and control::
+
+ cond_resched_tasks_rcu_qs
+ rcu_all_qs
+ rcu_softirq_qs_periodic
+ rcu_end_inkernel_boot
+ rcu_expedite_gp
+ rcu_gp_is_expedited
+ rcu_unexpedite_gp
+ rcu_cpu_stall_reset
+ rcu_head_after_call_rcu
+ rcu_is_watching
+
+
+RCU-sync primitive::
+
+ rcu_sync_is_idle
+ rcu_sync_init
+ rcu_sync_enter
+ rcu_sync_exit
+ rcu_sync_dtor
+
+
RCU-Tasks::
-   Critical sections            Grace period                  Barrier
+   Critical sections            Grace period                  Barrier
-   N/A                          call_rcu_tasks                rcu_barrier_tasks
+   N/A                          call_rcu_tasks                rcu_barrier_tasks
                                 synchronize_rcu_tasks
RCU-Tasks-Rude::
-   Critical sections            Grace period                  Barrier
+   Critical sections            Grace period                  Barrier
-   N/A                          call_rcu_tasks_rude           rcu_barrier_tasks_rude
-                                synchronize_rcu_tasks_rude
+   N/A                          synchronize_rcu_tasks_rude    rcu_barrier_tasks_rude
+                                call_rcu_tasks_rude
RCU-Tasks-Trace::
-   Critical sections            Grace period                  Barrier
+   Critical sections            Grace period                  Barrier
-   rcu_read_lock_trace          call_rcu_tasks_trace          rcu_barrier_tasks_trace
+   rcu_read_lock_trace          call_rcu_tasks_trace          rcu_barrier_tasks_trace
    rcu_read_unlock_trace        synchronize_rcu_tasks_trace
+ guard(rcu_tasks_trace)()
+ scoped_guard(rcu_tasks_trace)
-SRCU::
+SRCU list traversal::
+ list_for_each_entry_srcu
+ hlist_for_each_entry_srcu
-   Critical sections            Grace period                  Barrier
-
-   srcu_read_lock               call_srcu                     srcu_barrier
-   srcu_read_unlock             synchronize_srcu
-   srcu_dereference             synchronize_srcu_expedited
+SRCU::
+
+   Critical sections            Grace period                                Barrier
+
+   srcu_read_lock               call_srcu                                   srcu_barrier
+   srcu_read_unlock             synchronize_srcu
+   srcu_read_lock_fast          synchronize_srcu_expedited
+   srcu_read_unlock_fast        get_state_synchronize_srcu
+   srcu_read_lock_nmisafe       start_poll_synchronize_srcu
+   srcu_read_unlock_nmisafe     start_poll_synchronize_srcu_expedited
+   srcu_read_lock_notrace       poll_state_synchronize_srcu
+   srcu_read_unlock_notrace
+   srcu_down_read
+   srcu_up_read
+   srcu_down_read_fast
+   srcu_up_read_fast
+   guard(srcu)()
+   scoped_guard(srcu)
+ srcu_read_lock_held
+ srcu_dereference
srcu_dereference_check
+ srcu_dereference_notrace
srcu_read_lock_held
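
[Editor's note: for the srcu_read_lock()/srcu_read_unlock() pairing
listed above, the index returned by srcu_read_lock() must be passed to
the matching srcu_read_unlock(); a minimal sketch with a hypothetical
srcu_struct named my_srcu, reusing the hypothetical gbl_foo.] ::

	DEFINE_SRCU(my_srcu);

	int read_under_srcu(void)
	{
		int idx, ret;

		idx = srcu_read_lock(&my_srcu);
		/* Unlike plain RCU readers, SRCU readers may block here. */
		ret = srcu_dereference(gbl_foo, &my_srcu)->a;
		srcu_read_unlock(&my_srcu, idx);
		return ret;
	}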
-SRCU: Initialization/cleanup::
+
+SRCU: Initialization/cleanup/ordering::
DEFINE_SRCU
DEFINE_STATIC_SRCU
+ DEFINE_SRCU_FAST // for srcu_read_lock_fast() and friends
+ DEFINE_STATIC_SRCU_FAST // for srcu_read_lock_fast() and friends
init_srcu_struct
+ init_srcu_struct_fast
cleanup_srcu_struct
+ smp_mb__after_srcu_read_unlock
All: lockdep-checked RCU utility APIs::