Diffstat (limited to 'Documentation/RCU/whatisRCU.rst')
 -rw-r--r--	Documentation/RCU/whatisRCU.rst	225
 1 file changed, 173 insertions, 52 deletions
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index e488c8e557a9..a1582bd653d1 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -15,6 +15,9 @@ to start learning about RCU:
 |	2014 Big API Table		https://lwn.net/Articles/609973/
 | 6.	The RCU API, 2019 Edition	https://lwn.net/Articles/777036/
 |	2019 Big API Table		https://lwn.net/Articles/777165/
+| 7.	The RCU API, 2024 Edition	https://lwn.net/Articles/988638/
+|	2024 Background Information	https://lwn.net/Articles/988641/
+|	2024 Big API Table		https://lwn.net/Articles/988666/
 
 For those preferring video:
 
@@ -59,8 +62,8 @@ experiment with should focus on Section 2.  People who prefer to start
 with example uses should focus on Sections 3 and 4.  People who need to
 understand the RCU implementation should focus on Section 5, then dive
 into the kernel source code.  People who reason best by analogy should
-focus on Section 6.  Section 7 serves as an index to the docbook API
-documentation, and Section 8 is the traditional answer key.
+focus on Section 6 and 7.  Section 8 serves as an index to the docbook
+API documentation, and Section 9 is the traditional answer key.
 
 So, start with the section that makes the most sense to you and your
 preferred method of learning.  If you need to know everything about
@@ -172,14 +175,25 @@ rcu_read_lock()
 	critical section.  Reference counts may be used in conjunction
 	with RCU to maintain longer-term references to data structures.
 
+	Note that anything that disables bottom halves, preemption,
+	or interrupts also enters an RCU read-side critical section.
+	Acquiring a spinlock also enters an RCU read-side critical
+	sections, even for spinlocks that do not disable preemption,
+	as is the case in kernels built with CONFIG_PREEMPT_RT=y.
+	Sleeplocks do *not* enter RCU read-side critical sections.
+
 rcu_read_unlock()
 ^^^^^^^^^^^^^^^^^
 	void rcu_read_unlock(void);
 
 	This temporal primitives is used by a reader to inform the
 	reclaimer that the reader is exiting an RCU read-side critical
-	section.  Note that RCU read-side critical sections may be nested
-	and/or overlapping.
+	section.  Anything that enables bottom halves, preemption,
+	or interrupts also exits an RCU read-side critical section.
+	Releasing a spinlock also exits an RCU read-side critical section.
+
+	Note that RCU read-side critical sections may be nested and/or
+	overlapping.
 
 synchronize_rcu()
 ^^^^^^^^^^^^^^^^^
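The hunk above adds the observation that disabling preemption, bottom halves,
or interrupts (or acquiring a spinlock) implicitly enters an RCU read-side
critical section.  A minimal sketch of the explicit and implicit reader forms
follows; ``struct foo``, ``gp``, and the two reader functions are hypothetical
names used only for illustration, not part of the patch::

	#include <linux/preempt.h>
	#include <linux/rcupdate.h>

	struct foo { int a; };
	static struct foo __rcu *gp;	/* Hypothetical RCU-protected pointer. */

	/* Explicitly marked RCU reader. */
	static int reader_explicit(void)
	{
		struct foo *p;
		int ret = -1;

		rcu_read_lock();	/* Enter read-side critical section. */
		p = rcu_dereference(gp);
		if (p)
			ret = p->a;
		rcu_read_unlock();	/* Exit read-side critical section. */
		return ret;
	}

	/*
	 * Per the note added above, a preempt_disable() region is also an
	 * RCU read-side critical section.  rcu_dereference_sched() is the
	 * matching accessor; the explicit form remains preferable because
	 * it documents intent.
	 */
	static int reader_implicit(void)
	{
		struct foo *p;
		int ret = -1;

		preempt_disable();
		p = rcu_dereference_sched(gp);
		if (p)
			ret = p->a;
		preempt_enable();
		return ret;
	}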
@@ -239,21 +253,25 @@ rcu_assign_pointer()
 ^^^^^^^^^^^^^^^^^^^^
 	void rcu_assign_pointer(p, typeof(p) v);
 
-	Yes, rcu_assign_pointer() **is** implemented as a macro, though it
-	would be cool to be able to declare a function in this manner.
-	(Compiler experts will no doubt disagree.)
+	Yes, rcu_assign_pointer() **is** implemented as a macro, though
+	it would be cool to be able to declare a function in this manner.
+	(And there has been some discussion of adding overloaded functions
+	to the C language, so who knows?)
 
 	The updater uses this spatial macro to assign a new value to an
 	RCU-protected pointer, in order to safely communicate the change
 	in value from the updater to the reader.  This is a spatial (as
 	opposed to temporal) macro.  It does not evaluate to an rvalue,
-	but it does execute any memory-barrier instructions required
-	for a given CPU architecture.  Its ordering properties are that
-	of a store-release operation.
-
-	Perhaps just as important, it serves to document (1) which
-	pointers are protected by RCU and (2) the point at which a
-	given structure becomes accessible to other CPUs.  That said,
+	but it does provide any compiler directives and memory-barrier
+	instructions required for a given compile or CPU architecture.
+	Its ordering properties are that of a store-release operation,
+	that is, any prior loads and stores required to initialize the
+	structure are ordered before the store that publishes the pointer
+	to that structure.
+
+	Perhaps just as important, rcu_assign_pointer() serves to document
+	(1) which pointers are protected by RCU and (2) the point at which
+	a given structure becomes accessible to other CPUs.  That said,
 	rcu_assign_pointer() is most frequently used indirectly, via
 	the _rcu list-manipulation primitives such as list_add_rcu().
 
@@ -272,7 +290,11 @@ rcu_dereference()
 	executes any needed memory-barrier instructions for a given
 	CPU architecture.  Currently, only Alpha needs memory barriers
 	within rcu_dereference() -- on other CPUs, it compiles to a
-	volatile load.
+	volatile load.  However, no mainstream C compilers respect
+	address dependencies, so rcu_dereference() uses volatile casts,
+	which, in combination with the coding guidelines listed in
+	rcu_dereference.rst, prevent current compilers from breaking
+	these dependencies.
 
 	Common coding practice uses rcu_dereference() to copy an
 	RCU-protected pointer to a local variable, then dereferences
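The two hunks above describe rcu_assign_pointer()'s store-release publication
and rcu_dereference()'s dependency-preserving load.  A minimal publish/subscribe
sketch of that pairing; ``struct foo``, ``gbl_foo``, and both functions are
hypothetical::

	#include <linux/rcupdate.h>
	#include <linux/slab.h>

	struct foo {
		int a;
		int b;
	};
	static struct foo __rcu *gbl_foo;

	/*
	 * Publisher: the store-release semantics of rcu_assign_pointer()
	 * order the initializing stores to ->a and ->b before the store
	 * that makes the structure reachable via gbl_foo.
	 */
	static int publish_foo(int a, int b)
	{
		struct foo *p = kmalloc(sizeof(*p), GFP_KERNEL);

		if (!p)
			return -ENOMEM;
		p->a = a;			/* Initialization is ordered... */
		p->b = b;
		rcu_assign_pointer(gbl_foo, p);	/* ...before publication. */
		return 0;
	}

	/*
	 * Subscriber: rcu_dereference()'s volatile load, together with
	 * the address dependency it preserves, ensures the initialized
	 * values are seen through the fetched pointer.
	 */
	static int read_foo_a(void)
	{
		struct foo *p;
		int ret = -1;

		rcu_read_lock();
		p = rcu_dereference(gbl_foo);
		if (p)
			ret = p->a;
		rcu_read_unlock();
		return ret;
	}

For brevity, publish_foo() does not reclaim any previously published
structure; a real updater would pass the old pointer to kfree() after
synchronize_rcu() returns, or use kfree_rcu().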
@@ -416,7 +438,7 @@ their assorted primitives.
 
 This section shows a simple use of the core RCU API to protect a
 global pointer to a dynamically allocated structure.  More-typical
-uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst.
+uses of RCU may be found in listRCU.rst and NMI-RCU.rst.
 ::
 
 	struct foo {
@@ -499,8 +521,8 @@ So, to sum up:
 	data item.
 
 See checklist.rst for additional rules to follow when using RCU.
-And again, more-typical uses of RCU may be found in listRCU.rst,
-arrayRCU.rst, and NMI-RCU.rst.
+And again, more-typical uses of RCU may be found in listRCU.rst
+and NMI-RCU.rst.
 
 .. _4_whatisRCU:
 
@@ -952,8 +974,18 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
 initialized after each and every call to kmem_cache_alloc(), which renders
 reference-free spinlock acquisition completely unsafe.  Therefore, when
 using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
-(Those willing to use a kmem_cache constructor may also use locking,
-including cache-friendly sequence locking.)
+If using refcount_t, the specialized refcount_{add|inc}_not_zero_acquire()
+and refcount_set_release() APIs should be used to ensure correct operation
+ordering when verifying object identity and when initializing newly
+allocated objects. Acquire fence in refcount_{add|inc}_not_zero_acquire()
+ensures that identity checks happen *after* reference count is taken.
+refcount_set_release() should be called after a newly allocated object is
+fully initialized and release fence ensures that new values are visible
+*before* refcount can be successfully taken by other users. Once
+refcount_set_release() is called, the object should be considered visible
+by other tasks.
+(Those willing to initialize their locks in a kmem_cache constructor
+may also use locking, including cache-friendly sequence locking.)
 
 With traditional reference counting -- such as that implemented by the
 kref library in Linux -- there is typically code that runs when the last
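The ``SLAB_TYPESAFE_BY_RCU`` hunk above prescribes the
refcount_{add|inc}_not_zero_acquire() and refcount_set_release() APIs.
A sketch of that pattern under stated assumptions: ``struct obj``,
``obj_cache``, and the helper functions are hypothetical, and ``obj_cache``
is presumed to have been created with the ``SLAB_TYPESAFE_BY_RCU`` flag::

	#include <linux/rcupdate.h>
	#include <linux/refcount.h>
	#include <linux/slab.h>

	struct obj {
		refcount_t ref;
		int key;		/* Identity re-checked by readers. */
	};

	static struct kmem_cache *obj_cache;	/* SLAB_TYPESAFE_BY_RCU cache. */

	static void obj_put(struct obj *p)
	{
		if (refcount_dec_and_test(&p->ref))
			kmem_cache_free(obj_cache, p);
	}

	static struct obj *obj_new(int key)
	{
		struct obj *p = kmem_cache_alloc(obj_cache, GFP_KERNEL);

		if (!p)
			return NULL;
		p->key = key;
		/*
		 * Release fence: the initialization above is visible
		 * before any reader can successfully take a reference.
		 */
		refcount_set_release(&p->ref, 1);
		return p;
	}

	/*
	 * Called under rcu_read_lock() on a pointer obtained from a
	 * lookup structure.  The object may have been freed and
	 * reallocated in the meantime, so its identity must be
	 * verified after the reference is acquired.
	 */
	static bool obj_tryget(struct obj *p, int key)
	{
		/* Acquire fence: orders the ->key check after the increment. */
		if (!refcount_inc_not_zero_acquire(&p->ref))
			return false;
		if (p->key != key) {
			obj_put(p);	/* Slab reused for another identity. */
			return false;
		}
		return true;
	}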
@@ -989,32 +1021,41 @@ RCU list traversal::
 	list_entry_rcu
 	list_entry_lockless
 	list_first_entry_rcu
+	list_first_or_null_rcu
+	list_tail_rcu
 	list_next_rcu
+	list_next_or_null_rcu
 	list_for_each_entry_rcu
 	list_for_each_entry_continue_rcu
 	list_for_each_entry_from_rcu
-	list_first_or_null_rcu
-	list_next_or_null_rcu
+	list_for_each_entry_lockless
 	hlist_first_rcu
 	hlist_next_rcu
 	hlist_pprev_rcu
 	hlist_for_each_entry_rcu
+	hlist_for_each_entry_rcu_notrace
 	hlist_for_each_entry_rcu_bh
 	hlist_for_each_entry_from_rcu
 	hlist_for_each_entry_continue_rcu
 	hlist_for_each_entry_continue_rcu_bh
 	hlist_nulls_first_rcu
+	hlist_nulls_next_rcu
 	hlist_nulls_for_each_entry_rcu
+	hlist_nulls_for_each_entry_safe
 	hlist_bl_first_rcu
 	hlist_bl_for_each_entry_rcu
 
 RCU pointer/list update::
 
 	rcu_assign_pointer
+	rcu_replace_pointer
+	INIT_LIST_HEAD_RCU
 	list_add_rcu
 	list_add_tail_rcu
 	list_del_rcu
 	list_replace_rcu
+	list_splice_init_rcu
+	list_splice_tail_init_rcu
 	hlist_add_behind_rcu
 	hlist_add_before_rcu
 	hlist_add_head_rcu
@@ -1022,34 +1063,53 @@ RCU pointer/list update::
 	hlist_del_rcu
 	hlist_del_init_rcu
 	hlist_replace_rcu
-	list_splice_init_rcu
-	list_splice_tail_init_rcu
 	hlist_nulls_del_init_rcu
 	hlist_nulls_del_rcu
 	hlist_nulls_add_head_rcu
+	hlist_nulls_add_tail_rcu
+	hlist_nulls_add_fake
+	hlists_swap_heads_rcu
 	hlist_bl_add_head_rcu
-	hlist_bl_del_init_rcu
 	hlist_bl_del_rcu
 	hlist_bl_set_first_rcu
 
 RCU::
 
-	Critical sections	Grace period		Barrier
-
-	rcu_read_lock		synchronize_net		rcu_barrier
-	rcu_read_unlock		synchronize_rcu
-	rcu_dereference		synchronize_rcu_expedited
-	rcu_read_lock_held	call_rcu
-	rcu_dereference_check	kfree_rcu
-	rcu_dereference_protected
+	Critical sections	Grace period		Barrier
+
+	rcu_read_lock		synchronize_net		rcu_barrier
+	rcu_read_unlock		synchronize_rcu
+	guard(rcu)()		synchronize_rcu_expedited
+	scoped_guard(rcu)	synchronize_rcu_mult
+	rcu_dereference		call_rcu
+	rcu_dereference_check	call_rcu_hurry
+	rcu_dereference_protected	kfree_rcu
+	rcu_read_lock_held	kvfree_rcu
+	rcu_read_lock_any_held	kfree_rcu_mightsleep
+	rcu_pointer_handoff	cond_synchronize_rcu
+	unrcu_pointer		cond_synchronize_rcu_full
+				cond_synchronize_rcu_expedited
+				cond_synchronize_rcu_expedited_full
+				get_completed_synchronize_rcu
+				get_completed_synchronize_rcu_full
+				get_state_synchronize_rcu
+				get_state_synchronize_rcu_full
+				poll_state_synchronize_rcu
+				poll_state_synchronize_rcu_full
+				same_state_synchronize_rcu
+				same_state_synchronize_rcu_full
+				start_poll_synchronize_rcu
+				start_poll_synchronize_rcu_full
+				start_poll_synchronize_rcu_expedited
+				start_poll_synchronize_rcu_expedited_full
 
 bh::
 
 	Critical sections	Grace period		Barrier
 
-	rcu_read_lock_bh	call_rcu		rcu_barrier
-	rcu_read_unlock_bh	synchronize_rcu
-	[local_bh_disable]	synchronize_rcu_expedited
+	rcu_read_lock_bh	[Same as RCU]		[Same as RCU]
+	rcu_read_unlock_bh
+	[local_bh_disable]	[and friends]
 
 	rcu_dereference_bh
 	rcu_dereference_bh_check
@@ -1060,9 +1120,9 @@ sched::
 
 	Critical sections	Grace period		Barrier
 
-	rcu_read_lock_sched	call_rcu		rcu_barrier
-	rcu_read_unlock_sched	synchronize_rcu
-	[preempt_disable]	synchronize_rcu_expedited
+	rcu_read_lock_sched	[Same as RCU]		[Same as RCU]
+	rcu_read_unlock_sched
+	[preempt_disable]	[and friends]
 
 	rcu_read_lock_sched_notrace
 	rcu_read_unlock_sched_notrace
@@ -1072,46 +1132,107 @@ sched::
 
 	rcu_read_lock_sched_held
 
+RCU: Initialization/cleanup/ordering::
+
+	RCU_INIT_POINTER
+	RCU_INITIALIZER
+	RCU_POINTER_INITIALIZER
+	init_rcu_head
+	destroy_rcu_head
+	init_rcu_head_on_stack
+	destroy_rcu_head_on_stack
+	SLAB_TYPESAFE_BY_RCU
+
+
+RCU: Quiescents states and control::
+
+	cond_resched_tasks_rcu_qs
+	rcu_all_qs
+	rcu_softirq_qs_periodic
+	rcu_end_inkernel_boot
+	rcu_expedite_gp
+	rcu_gp_is_expedited
+	rcu_unexpedite_gp
+	rcu_cpu_stall_reset
+	rcu_head_after_call_rcu
+	rcu_is_watching
+
+
+RCU-sync primitive::
+
+	rcu_sync_is_idle
+	rcu_sync_init
+	rcu_sync_enter
+	rcu_sync_exit
+	rcu_sync_dtor
+
+
 RCU-Tasks::
 
-	Critical sections	Grace period		Barrier
+	Critical sections	Grace period		Barrier
 
-	N/A			call_rcu_tasks		rcu_barrier_tasks
+	N/A			call_rcu_tasks		rcu_barrier_tasks
 				synchronize_rcu_tasks
 
 
 RCU-Tasks-Rude::
 
-	Critical sections	Grace period		Barrier
+	Critical sections	Grace period		Barrier
 
-	N/A			call_rcu_tasks_rude	rcu_barrier_tasks_rude
-				synchronize_rcu_tasks_rude
+	N/A			synchronize_rcu_tasks_rude	rcu_barrier_tasks_rude
+				call_rcu_tasks_rude
 
 
 RCU-Tasks-Trace::
 
-	Critical sections	Grace period		Barrier
+	Critical sections	Grace period		Barrier
 
-	rcu_read_lock_trace	call_rcu_tasks_trace	rcu_barrier_tasks_trace
+	rcu_read_lock_trace	call_rcu_tasks_trace	rcu_barrier_tasks_trace
 	rcu_read_unlock_trace	synchronize_rcu_tasks_trace
+	guard(rcu_tasks_trace)()
+	scoped_guard(rcu_tasks_trace)
 
 
-SRCU::
+SRCU list traversal::
+
+	list_for_each_entry_srcu
+	hlist_for_each_entry_srcu
 
-	Critical sections	Grace period		Barrier
-
-	srcu_read_lock		call_srcu		srcu_barrier
-	srcu_read_unlock	synchronize_srcu
-	srcu_dereference	synchronize_srcu_expedited
+
+SRCU::
+
+	Critical sections	Grace period		Barrier
+
+	srcu_read_lock		call_srcu		srcu_barrier
+	srcu_read_unlock	synchronize_srcu
+	srcu_read_lock_fast	synchronize_srcu_expedited
+	srcu_read_unlock_fast	get_state_synchronize_srcu
+	srcu_read_lock_nmisafe	start_poll_synchronize_srcu
+	srcu_read_unlock_nmisafe	start_poll_synchronize_srcu_expedited
+	srcu_read_lock_notrace	poll_state_synchronize_srcu
+	srcu_read_unlock_notrace
+	srcu_down_read
+	srcu_up_read
+	srcu_down_read_fast
+	srcu_up_read_fast
+	guard(srcu)()
+	scoped_guard(srcu)
+	srcu_read_lock_held
+	srcu_dereference
 	srcu_dereference_check
+	srcu_dereference_notrace
-	srcu_read_lock_held
 
-SRCU: Initialization/cleanup::
+
+SRCU: Initialization/cleanup/ordering::
 
 	DEFINE_SRCU
 	DEFINE_STATIC_SRCU
+	DEFINE_SRCU_FAST		// for srcu_read_lock_fast() and friends
+	DEFINE_STATIC_SRCU_FAST		// for srcu_read_lock_fast() and friends
 	init_srcu_struct
+	init_srcu_struct_fast
 	cleanup_srcu_struct
+	smp_mb__after_srcu_read_unlock
 
 All: lockdep-checked RCU utility APIs::
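The updated tables above add the scope-based guard(rcu)() and
scoped_guard(rcu) forms.  A brief sketch of their use, again with a
hypothetical ``struct foo`` and ``gbl_foo``::

	#include <linux/cleanup.h>
	#include <linux/rcupdate.h>

	struct foo { int a; };
	static struct foo __rcu *gbl_foo;

	/* guard(rcu)() holds rcu_read_lock() to the end of the scope. */
	static int read_with_guard(void)
	{
		guard(rcu)();

		struct foo *p = rcu_dereference(gbl_foo);

		return p ? p->a : -1;
	}

	/* scoped_guard(rcu) confines the critical section to the block. */
	static int read_with_scoped_guard(void)
	{
		int ret = -1;

		scoped_guard(rcu) {
			struct foo *p = rcu_dereference(gbl_foo);

			if (p)
				ret = p->a;
		}
		return ret;
	}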
