Commit 9855e873 authored by Linus Torvalds
Pull RCU updates from Paul McKenney:

 - Update Tasks RCU and Tasks Rude RCU description in Requirements.rst
   and clarify rcu_assign_pointer() and rcu_dereference() ordering
   properties

 - Add lockdep assertions for RCU readers, limit inline wakeups for
   callback-bypass synchronize_rcu(), add an
   rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter, add
   Uladzislau Rezki as RCU maintainer, and fix a subtle
   callback-migration memory-ordering issue

 - Remove a number of redundant memory barriers

 - Remove unnecessary bypass-list lock-contention mitigation, use
   parking API instead of open-coded ad-hoc equivalent, and upgrade
   obsolete comments

 - Revert avoidance of a deadlock that can no longer occur and properly
   synchronize Tasks Trace RCU checking of runqueues

 - Add tests for handling of double-call_rcu() bug, add missing
   MODULE_DESCRIPTION, and add a script that histograms the number of
   calls to RCU updaters

 - Fill out SRCU polled-grace-period API

* tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation
  rcu: Eliminate lockless accesses to rcu_sync->gp_count
  MAINTAINERS: Add Uladzislau Rezki as RCU maintainer
  rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter
  rcu/exp: Remove redundant full memory barrier at the end of GP
  rcu: Remove full memory barrier on RCU stall printout
  rcu: Remove full memory barrier on boot time eqs sanity check
  rcu/exp: Remove superfluous full memory barrier upon first EQS snapshot
  rcu: Remove superfluous full memory barrier upon first EQS snapshot
  rcu: Remove full ordering on second EQS snapshot
  srcu: Fill out polled grace-period APIs
  srcu: Update cleanup_srcu_struct() comment
  srcu: Add NUM_ACTIVE_SRCU_POLL_OLDSTATE
  srcu: Disable interrupts directly in srcu_gp_end()
  rcu: Disable interrupts directly in rcu_gp_init()
  rcu/tree: Reduce wake up for synchronize_rcu() common case
  rcu/tasks: Fix stale task snaphot for Tasks Trace
  tools/rcu: Add rcu-updaters.sh script
  rcutorture: Add missing MODULE_DESCRIPTION() macros
  rcutorture: Fix rcu_torture_fwd_cb_cr() data race
  ...
parents 253e1e98 02219caa
+3 −3
@@ -149,9 +149,9 @@ This case is handled by calls to the strongly ordered
``atomic_add_return()`` read-modify-write atomic operation that
is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry
time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time.
-The grace-period kthread invokes ``rcu_dynticks_snap()`` and
-``rcu_dynticks_in_eqs_since()`` (both of which invoke
-an ``atomic_add_return()`` of zero) to detect idle CPUs.
+The grace-period kthread invokes first ``ct_dynticks_cpu_acquire()``
+(preceded by a full memory barrier) and ``rcu_dynticks_in_eqs_since()``
+(both of which rely on acquire semantics) to detect idle CPUs.
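
As a rough illustration of the snapshot-then-recheck pattern this hunk describes, here is a userspace C11 sketch. It is not the kernel implementation: `struct cpu_eqs`, `eqs_toggle()`, `eqs_snapshot()`, `eqs_snap_is_idle()`, and `eqs_changed_since()` are hypothetical names, and the "even value means idle" encoding is an assumption made only for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical per-CPU counter, bumped once on idle entry and once on
 * idle exit.  In this sketch (an assumption, not the kernel's layout)
 * an even value means the CPU is idle.
 */
struct cpu_eqs {
	atomic_uint counter;
};

/* Idle entry/exit: a fully ordered read-modify-write, returning the new value. */
static inline unsigned int eqs_toggle(struct cpu_eqs *c)
{
	return atomic_fetch_add_explicit(&c->counter, 1, memory_order_seq_cst) + 1;
}

/* First snapshot: a full fence followed by an acquire load. */
static inline unsigned int eqs_snapshot(struct cpu_eqs *c)
{
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&c->counter, memory_order_acquire);
}

/* Was the CPU idle when the snapshot was taken? */
static inline bool eqs_snap_is_idle(unsigned int snap)
{
	return (snap & 1u) == 0;
}

/* Later recheck: has the CPU entered or left idle since the snapshot? */
static inline bool eqs_changed_since(struct cpu_eqs *c, unsigned int snap)
{
	return atomic_load_explicit(&c->counter, memory_order_acquire) != snap;
}
```

In this sketch, a grace-period thread would treat a CPU as quiescent if either the snapshot was idle or the counter has changed since the snapshot was taken.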

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+16 −0
@@ -2357,6 +2357,7 @@ section.
#. `Sched Flavor (Historical)`_
#. `Sleepable RCU`_
#. `Tasks RCU`_
#. `Tasks Trace RCU`_

Bottom-Half Flavor (Historical)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2610,6 +2611,16 @@ critical sections that are delimited by voluntary context switches, that
is, calls to schedule(), cond_resched(), and
synchronize_rcu_tasks(). In addition, transitions to and from
userspace execution also delimit tasks-RCU read-side critical sections.
Idle tasks are ignored by Tasks RCU, and Tasks Rude RCU may be used to
interact with them.

Note well that involuntary context switches are *not* Tasks-RCU quiescent
states.  After all, in preemptible kernels, a task executing code in a
trampoline might be preempted.  In this case, the Tasks-RCU grace period
clearly cannot end until that task resumes and its execution leaves that
trampoline.  This means, among other things, that cond_resched() does
not provide a Tasks RCU quiescent state.  (Instead, use rcu_softirq_qs()
from softirq or rcu_tasks_classic_qs() otherwise.)

The tasks-RCU API is quite compact, consisting only of
call_rcu_tasks(), synchronize_rcu_tasks(), and
@@ -2632,6 +2643,11 @@ moniker. And this operation is considered to be quite rude by real-time
workloads that don't want their ``nohz_full`` CPUs receiving IPIs and
by battery-powered systems that don't want their idle CPUs to be awakened.

Once kernel entry/exit and deep-idle functions have been properly tagged
``noinstr``, Tasks RCU can start paying attention to idle tasks (except
those that are idle from RCU's perspective) and then Tasks Rude RCU can
be removed from the kernel.

The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(),
and rcu_barrier_tasks_rude().
+19 −11
@@ -250,21 +250,25 @@ rcu_assign_pointer()
^^^^^^^^^^^^^^^^^^^^
	void rcu_assign_pointer(p, typeof(p) v);

-	Yes, rcu_assign_pointer() **is** implemented as a macro, though it
-	would be cool to be able to declare a function in this manner.
-	(Compiler experts will no doubt disagree.)
+	Yes, rcu_assign_pointer() **is** implemented as a macro, though
+	it would be cool to be able to declare a function in this manner.
+	(And there has been some discussion of adding overloaded functions
+	to the C language, so who knows?)

	The updater uses this spatial macro to assign a new value to an
	RCU-protected pointer, in order to safely communicate the change
	in value from the updater to the reader.  This is a spatial (as
	opposed to temporal) macro.  It does not evaluate to an rvalue,
-	but it does execute any memory-barrier instructions required
-	for a given CPU architecture.  Its ordering properties are that
-	of a store-release operation.
-
-	Perhaps just as important, it serves to document (1) which
-	pointers are protected by RCU and (2) the point at which a
-	given structure becomes accessible to other CPUs.  That said,
+	but it does provide any compiler directives and memory-barrier
+	instructions required for a given compile or CPU architecture.
+	Its ordering properties are that of a store-release operation,
+	that is, any prior loads and stores required to initialize the
+	structure are ordered before the store that publishes the pointer
+	to that structure.
+
+	Perhaps just as important, rcu_assign_pointer() serves to document
+	(1) which pointers are protected by RCU and (2) the point at which
+	a given structure becomes accessible to other CPUs.  That said,
	rcu_assign_pointer() is most frequently used indirectly, via
	the _rcu list-manipulation primitives such as list_add_rcu().
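
The store-release ordering described in this hunk can be sketched in userspace with C11 atomics. This is only an analogy, not the kernel macro: `struct config`, `global_cfg`, and `publish_config()` are hypothetical names, and `atomic_store_explicit()` with `memory_order_release` stands in for rcu_assign_pointer().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical structure published to readers through a shared pointer. */
struct config {
	int a;
	int b;
};

/* Hypothetical shared pointer; NULL until something is published. */
static _Atomic(struct config *) global_cfg;

/*
 * Initialize the structure first, then publish it with a store-release.
 * The release ordering guarantees that the initializing stores to
 * new_cfg->a and new_cfg->b are ordered before the pointer store, so a
 * reader that observes the new pointer also observes initialized fields.
 */
static void publish_config(struct config *new_cfg, int a, int b)
{
	new_cfg->a = a;
	new_cfg->b = b;
	atomic_store_explicit(&global_cfg, new_cfg, memory_order_release);
}
```

Publishing with a plain store instead would allow the compiler or CPU to reorder the pointer store ahead of the field initialization, which is the hazard the release ordering is there to rule out.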

@@ -283,7 +287,11 @@ rcu_dereference()
	executes any needed memory-barrier instructions for a given
	CPU architecture.  Currently, only Alpha needs memory barriers
	within rcu_dereference() -- on other CPUs, it compiles to a
-	volatile load.
+	volatile load.	However, no mainstream C compilers respect
+	address dependencies, so rcu_dereference() uses volatile casts,
+	which, in combination with the coding guidelines listed in
+	rcu_dereference.rst, prevent current compilers from breaking
+	these dependencies.
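
For comparison, a userspace reader can be sketched with C11 atomics. This is an analogy under stated assumptions, not the kernel's rcu_dereference(): `struct item`, `shared_item`, and `read_item_value()` are hypothetical names, and the sketch uses an acquire load, which is stronger than the dependency-ordered volatile load described above but is the portable way to get the same guarantee from a C compiler.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical structure reached through a shared, published pointer. */
struct item {
	int value;
};

/* Hypothetical shared pointer; NULL until an updater publishes an item. */
static _Atomic(struct item *) shared_item;

/*
 * Fetch the pointer exactly once into a local variable, then dereference
 * only the local copy.  The acquire load pairs with the updater's
 * store-release, so the fields read through p are the initialized ones.
 */
static int read_item_value(int fallback)
{
	struct item *p = atomic_load_explicit(&shared_item, memory_order_acquire);

	if (p == NULL)
		return fallback;
	return p->value;
}
```

Re-reading `shared_item` for each field access would risk seeing two different structures, which is why the pointer is copied to a local variable before any dereference.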

	Common coding practice uses rcu_dereference() to copy an
	RCU-protected pointer to a local variable, then dereferences
+8 −0
@@ -5015,6 +5015,14 @@
			the ->nocb_bypass queue.  The definition of "too
			many" is supplied by this kernel boot parameter.

	rcutree.nohz_full_patience_delay= [KNL]
			On callback-offloaded (rcu_nocbs) CPUs, avoid
			disturbing RCU unless the grace period has
			reached the specified age in milliseconds.
			Defaults to zero.  Large values will be capped
			at five seconds.  All values will be rounded down
			to the nearest value representable by jiffies.
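
The capping and rounding described for this parameter can be sketched as follows. This follows only the documentation text above and is not the kernel's code: `PATIENCE_CAP_MS` and `patience_delay_ticks()` are hypothetical names, and the tick rate `hz` is passed in explicitly as an assumption of the example.

```c
/* Documented upper bound: five seconds, expressed in milliseconds. */
#define PATIENCE_CAP_MS 5000u

/*
 * Convert a requested patience delay in milliseconds to whole ticks at
 * hz ticks per second: cap at five seconds, then round down.
 */
static unsigned int patience_delay_ticks(unsigned int ms, unsigned int hz)
{
	if (ms > PATIENCE_CAP_MS)
		ms = PATIENCE_CAP_MS;
	/* Integer division truncates, which rounds down to whole ticks. */
	return (unsigned int)(((unsigned long long)ms * hz) / 1000u);
}
```

With an assumed tick rate of 250 per second, a request of 7 milliseconds rounds down to 1 tick, and a request of 10000 milliseconds is capped to 5000 milliseconds, or 1250 ticks.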

	rcutree.qhimark= [KNL]
			Set threshold of queued RCU callbacks beyond which
			batch limiting is disabled.
+1 −0
@@ -18863,6 +18863,7 @@ M: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> (kernel/rcu/tasks.h)
M:	Joel Fernandes <joel@joelfernandes.org>
M:	Josh Triplett <josh@joshtriplett.org>
M:	Boqun Feng <boqun.feng@gmail.com>
M:	Uladzislau Rezki <urezki@gmail.com>
R:	Steven Rostedt <rostedt@goodmis.org>
R:	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
R:	Lai Jiangshan <jiangshanlai@gmail.com>