Commit Graph

421 Commits

Author SHA1 Message Date
Joel Fernandes (Google) 97b59370fa doc: Fix "struction" typo in RCU memory-ordering documentation
This commit replaces "struction" with the correct "structure".

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) a78ad16c7f doc: Correct parameter in stallwarn
The stallwarn document incorrectly mentions 'fps=' instead of 'fqs='.
This commit orrects that.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Paul E. McKenney 97562c0181 doc: RCU scheduler spinlock rcu_read_unlock() restriction remains
Given RCU flavor consolidation, when rcu_read_unlock() is invoked with
interrupts disabled, the reporting of the corresponding quiescent state is
deferred until interrupts are re-enabled.  There was therefore some hope
that this would allow dropping the restriction against holding scheduler
spinlocks across an rcu_read_unlock() without disabling interrupts across
the entire corresponding RCU read-side critical section.  Unfortunately,
the need to quickly provide a quiescent state to expedited grace periods
sometimes requires a call to raise_softirq() during rcu_read_unlock()
execution.  Because raise_softirq() can sometimes acquire the scheduler
spinlocks, the restriction must remain in effect.  This commit therefore
updates the RCU requirements documentation accordingly.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 97949f0176 doc: Make listing in RCU perf/scale requirements use rcu_assign_pointer()
The code listing under this section has a quick quiz that says line
19 uses rcu_access_pointer, but the code listing itself instead uses
rcu_dereference().  This commit therefore makes the code listing match
the quick quiz.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 8b9df28d7f doc: Remove obsolete (non-)requirement about disabling preemption
The Requirements.html document says "Disabling Preemption Does
Not Block Grace Periods".  However this is no longer true with
the RCU consolidation.  This commit therefore removes the obsolete
(non-)requirement entirely.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 93eb14201f doc: Make reader aware of rcu_dereference_protected
The whatisRCU.txt document says rcu_dereference() cannot be used
outside of rcu_read_lock() protected sections.  The commit adds a
mention of rcu_dereference_protected(), so that the new reader knows
that this API can be used to avoid update-side use of rcu_read_lock()
and rcu_read_unlock().

Cc: tytso@mit.edu
Suggested-by: tytso@mit.edu
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Update wording, including further feedback from Joel. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 1c7d6d4411 doc: rcu: Encourage use of rcu_barrier in checklist
The checklist suggests rcu_barrier_bh() for RCU-bh and similarly for
sched, however these APIs are now implemented as rcu_barrier() itself due
to the RCU consolidation. This commit therefore corrects checklist.txt
to encourage use of the underlying rcu_barrier() API.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) e060a03a1c doc: rcu: Remove obsolete checklist item about synchronize_rcu usage
Since the RCU mechanisms have been consolidated, the checklist item
warning that synchronize_rcu() waits only for RCU readers is obsolete.
This commit therefore removes this checklist item.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) bc2072c9ad doc: rcu: Remove obsolete suggestion from checklist
call_rcu_bh is now implemented in terms of call_rcu, so the suggestion
to use a different API for speed benefits is not accurate anymore.
This commit updates the document accordingly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 090c1685fd doc: rcu: Add more rationale for using rcu_read_lock_sched in checklist
This commit explains why rcu_read_lock_sched is better than using
preempt_disable.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 3398496483 doc: rcu: Update core and full API in whatisRCU
RCU consolidation effort causes the update side of the RCU API to
be consistent across all the 3 RCU flavors (normal, sched, bh). This
commit therefore updates the full API in the whatisRCU document, thus
encouraging people to use the consolidated RCU update API instead of
the old RCU-bh and RCU-sched update APIs.

Also rcu_dereference is documented to be the same for all 3 mechanisms
(even before the consolidation), however its actually different - as
using the right rcu_dereference primitive (such as rcu_dereference_bh
for bh) is needed to make lock debugging work correctly. This update
also corrects that.

Also, add local_bh_disable() and local_bh_enable() as softirq
protection primitives and correct a grammar error in a quiz answer.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 70f0508cab doc: rcu: Update description of gp_seq fields in rcu_data
The rcu_state structure doesn't have a gp_seq_needed field. Update the
description under rcu_data accordingly, to reflect this.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) 82eccec851 doc: rcu: Better clarify the rcu_segcblist ->len field
An important note under the rcu_segcblist description could use a more
detailed description. Especially explanation of the scenario where the
->head field may be temporarily NULL making it not wise to rely on it
to determine if callbacks are associated with the rcu_segcblist.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) b54d9db260 doc: rcu: Update Data-Structures for RCU flavor consolidation
This patch updates all Data-Structures document figures and text and
removes some unwanted figures, to reflect the recent work Paul has been
doing with consolidating all flavors of RCU.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:56:25 -08:00
Joel Fernandes (Google) c9b6f899e1 doc: Remove rcu_dynticks from Data-Structures
rcu_dynticks was folded into rcu_data structure. Update the data
structures RCU document accordingly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:49:05 -08:00
Joel Fernandes (Google) 5cc379a42a doc: Update information about resched_cpu
Since commit fced9c8cfe ("rcu: Avoid resched_cpu() when rescheduling
the current CPU"), resched_cpu is not directly called from
sync_sched_exp_handler. Update the documentation about the same.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-12 08:48:31 -08:00
Joel Fernandes (Google) dd944caa81 doc: Remove rcu_preempt_state reference in stallwarn
Consolidation of RCU-bh, RCU-preempt, and RCU-sched into one RCU flavor
to rule them all resulted in the removal of rcu_preempt_state.  However,
stallwarn.txt still mentions rcu_preempt_state.  This commit therefore
Updates stallwarn documentation accordingly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:44:41 -08:00
Joel Fernandes (Google) 2d0350a8f0 doc: Clarify RCU data-structure comment about rcu_tree fanout
RCU Data-Structures document describes a trick to test RCU with small
number of CPUs but with a taller tree. It wasn't immediately clear how
the document arrived at 16 CPUs which also requires setting the
FANOUT_LEAF to 2 instead of the default of 16.  This commit therefore
provides the needed clarification.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:44:41 -08:00
Paul E. McKenney 832aa35a65 doc: Set down forward-progress requirements
This commit adds a section to the requirements documentation setting down
requirements for grace-period and callback-invocation forward progress.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:44:41 -08:00
Linus Torvalds 01aa9d518e This is a fairly typical cycle for documentation. There's some welcome
readability improvements for the formatted output, some LICENSES updates
 including the addition of the ISC license, the removal of the unloved and
 unmaintained 00-INDEX files, the deprecated APIs document from Kees, more
 MM docs from Mike Rapoport, and the usual pile of typo fixes and
 corrections.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbztcuAAoJEI3ONVYwIuV6nTAP/0Be+5dNPGJmSnb/RbkwBuBV
 zAFVUj2sx4lZlRmWRZ0r7AOef2eSw3IvwBix/vnmllYCVahjp+BdRbhXQAijjyeb
 FWWjOH50/J+BaxSthAINiLRLvuoe0D/M08OpmXQfRl5q0S8RufeV3BDtEABx9j2n
 IICPGTl8LpPUgSMA4cw8zPhHdauhZpbmL2mGE9LXZ27SJT4S8lcHMwyPU1n5S+Jd
 ChEz5g9dYr3GNxFp712pkI5GcVL3tP2nfoVwK7EuGf1tvSnEnn2kzac8QgMqorIh
 xB2+Sh4XIUCbHYpGHpxIniD+WI4voNr/E7STQioJK5o2G4HTuxLjktvTezNF8paa
 hgNHWjPQBq0OOCdM/rsffONFF2J/v/r7E3B+kaRg8pE0uZWTFaDMs6MVaL2fL4Ls
 DrFhi90NJI/Fs7uB4sriiviShAhwboiSIRXJi4VlY/5oFJKHFgqes+R7miU+zTX3
 2qv0k4mWZXWDV9w1piPxSCZSdRzaoYSoxEihX+tnYpCyEcYd9ovW/X1Uhl/wCWPl
 Ft+Op6rkHXRXVfZzTLuF6PspZ4Udpw2PUcnA5zj5FRDDBsjSMFR31c19IFbCeiNY
 kbTIcqejJG1WbVrAK4LCcFyVSGxbrr281eth4rE06cYmmsz3kJy1DB6Lhyg/2vI0
 I8K9ZJ99n1RhPJIcburB
 =C0wt
 -----END PGP SIGNATURE-----

Merge tag 'docs-4.20' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "This is a fairly typical cycle for documentation. There's some welcome
  readability improvements for the formatted output, some LICENSES
  updates including the addition of the ISC license, the removal of the
  unloved and unmaintained 00-INDEX files, the deprecated APIs document
  from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
  fixes and corrections"

* tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
  docs: Fix typos in histogram.rst
  docs: Introduce deprecated APIs list
  kernel-doc: fix declaration type determination
  doc: fix a typo in adding-syscalls.rst
  docs/admin-guide: memory-hotplug: remove table of contents
  doc: printk-formats: Remove bogus kobject references for device nodes
  Documentation: preempt-locking: Use better example
  dm flakey: Document "error_writes" feature
  docs/completion.txt: Fix a couple of punctuation nits
  LICENSES: Add ISC license text
  LICENSES: Add note to CDDL-1.0 license that it should not be used
  docs/core-api: memory-hotplug: add some details about locking internals
  docs/core-api: rename memory-hotplug-notifier to memory-hotplug
  docs: improve readability for people with poorer eyesight
  yama: clarify ptrace_scope=2 in Yama documentation
  docs/vm: split memory hotplug notifier description to Documentation/core-api
  docs: move memory hotplug description into admin-guide/mm
  doc: Fix acronym "FEKEK" in ecryptfs
  docs: fix some broken documentation references
  iommu: Fix passthrough option documentation
  ...
2018-10-24 18:01:11 +01:00
Henrik Austad a7ddcea58a Drop all 00-INDEX files from Documentation/
This is a respin with a wider audience (all that get_maintainer returned)
and I know this spams a *lot* of people. Not sure what would be the correct
way, so my apologies for ruining your inbox.

The 00-INDEX files are supposed to give a summary of all files present
in a directory, but these files are horribly out of date and their
usefulness is brought into question. Often a simple "ls" would reveal
the same information as the filenames are generally quite descriptive as
a short introduction to what the file covers (it should not surprise
anyone what Documentation/sched/sched-design-CFS.txt covers)

A few years back it was mentioned that these files were no longer really
needed, and they have since then grown further out of date, so perhaps
it is time to just throw them out.

A short status yields the following _outdated_ 00-INDEX files, first
counter is files listed in 00-INDEX but missing in the directory, last
is files present but not listed in 00-INDEX.

List of outdated 00-INDEX:
Documentation: (4/10)
Documentation/sysctl: (0/1)
Documentation/timers: (1/0)
Documentation/blockdev: (3/1)
Documentation/w1/slaves: (0/1)
Documentation/locking: (0/1)
Documentation/devicetree: (0/5)
Documentation/power: (1/1)
Documentation/powerpc: (0/5)
Documentation/arm: (1/0)
Documentation/x86: (0/9)
Documentation/x86/x86_64: (1/1)
Documentation/scsi: (4/4)
Documentation/filesystems: (2/9)
Documentation/filesystems/nfs: (0/2)
Documentation/cgroup-v1: (0/2)
Documentation/kbuild: (0/4)
Documentation/spi: (1/0)
Documentation/virtual/kvm: (1/0)
Documentation/scheduler: (0/2)
Documentation/fb: (0/1)
Documentation/block: (0/1)
Documentation/networking: (6/37)
Documentation/vm: (1/3)

Then there are 364 subdirectories in Documentation/ with several files that
are missing 00-INDEX alltogether (and another 120 with a single file and no
00-INDEX).

I don't really have an opinion to whether or not we /should/ have 00-INDEX,
but the above 00-INDEX should either be removed or be kept up to date. If
we should keep the files, I can try to keep them updated, but I rather not
if we just want to delete them anyway.

As a starting point, remove all index-files and references to 00-INDEX and
see where the discussion is going.

Signed-off-by: Henrik Austad <henrik@austad.us>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Just-do-it-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: [Almost everybody else]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-09-09 15:08:58 -06:00
Paul E. McKenney b56ada1209 Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' and 'torture.2018.08.29a' into HEAD
doc.2018.08.30a: Documentation updates
dynticks.2018.08.30b: RCU flavor consolidation updates and cleanups
srcu.2018.08.30b: SRCU updates
torture.2018.08.29a: Torture-test updates
2018-08-30 16:12:53 -07:00
Paul E. McKenney 3e31009898 rcu: Defer reporting RCU-preempt quiescent states when disabled
This commit defers reporting of RCU-preempt quiescent states at
rcu_read_unlock_special() time when any of interrupts, softirq, or
preemption are disabled.  These deferred quiescent states are reported
at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug
offline operation.  Of course, if another RCU read-side critical
section has started in the meantime, the reporting of the quiescent
state will be further deferred.

This also means that disabling preemption, interrupts, and/or
softirqs will act as an RCU-preempt read-side critical section.
This is enforced by checking preempt_count() as needed.

Some special cases must be handled on an ad-hoc basis, for example,
context switch is a quiescent state even though both the scheduler and
do_exit() disable preemption.  In these cases, additional calls to
rcu_preempt_deferred_qs() override the preemption disabling.  Similar
logic overrides disabled interrupts in rcu_preempt_check_callbacks()
because in this case the quiescent state happened just before the
corresponding scheduling-clock interrupt.

In theory, this change lifts a long-standing restriction that required
that if interrupts were disabled across a call to rcu_read_unlock()
that the matching rcu_read_lock() also be contained within that
interrupts-disabled region of code.  Because the reporting of the
corresponding RCU-preempt quiescent state is now deferred until
after interrupts have been enabled, it is no longer possible for this
situation to result in deadlocks involving the scheduler's runqueue and
priority-inheritance locks.  This may allow some code simplification that
might reduce interrupt latency a bit.  Unfortunately, in practice this
would also defer deboosting a low-priority task that had been subjected
to RCU priority boosting, so real-time-response considerations might
well force this restriction to remain in place.

Because RCU-preempt grace periods are now blocked not only by RCU
read-side critical sections, but also by disabling of interrupts,
preemption, and softirqs, it will be possible to eliminate RCU-bh and
RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels.  This may
require some additional plumbing to provide the network denial-of-service
guarantees that have been traditionally provided by RCU-bh.  Once these
are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh
into RCU-sched.  This would mean that all kernels would have but
one flavor of RCU, which would open the door to significant code
cleanup.

Moving to a single flavor of RCU would also have the beneficial effect
of reducing the NOCB kthreads by at least a factor of two.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback
  from Joel Fernandes. ]
[ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in
  response to bug reports from kbuild test robot. ]
[ paulmck: Fix bug located by kbuild test robot involving recursion
  via rcu_preempt_deferred_qs(). ]
2018-08-30 16:02:34 -07:00
Paul E. McKenney 5c3f78ec28 doc: Fix broken HTML directive
This commit adds the needed "<".

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 11:01:08 -07:00
Paul E. McKenney 77095901b8 doc: Update removal of RCU-bh/sched update machinery
The RCU-bh update API is now defined in terms of that of RCU-bh and
RCU-sched, so this commit updates the documentation accordingly.

In addition, although RCU-sched persists in !PREEMPT kernels, in
the PREEMPT case its update API is now defined in terms of that of
RCU-preempt, so this commit also updates the documentation accordingly.

While in the area, this commit removes the documentation for the
now-obsolete synchronize_rcu_mult() and clarifies the Tasks RCU
documentation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 10:59:48 -07:00
Joel Fernandes (Google) ea24c125fe doc: Improve rcu_dynticks::dynticks documentation
The very useful RCU Data-Structures describes that the dynticks counter
of the rcu_dynticks data structure is incremented when we transitions to
or from dynticks-idle mode. However it doesn't mention that it is also
incremented due to transitions to and from user mode which for dynticks
purposes is an extended quiescent state.

I found this with tracing calls to rcu_dynticks_eqs_enter which can also
happen from rcu_user_enter. Lets add this information to the
Data-Structures document.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 08:54:30 -07:00
Joel Fernandes (Google) a5a2889544 doc: Fix broken RCU-requirements link to LKML archive
Two of the Requirements.html LKML links are broken. This patch changes
them to use the archive from lore.kernel.org, which works fine.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 08:54:30 -07:00
Paul E. McKenney e77cb32558 doc: Add design documentation on interruption of NMI handlers
Make Requirements.html talk about how NMI handlers can take what appear
to RCU to be normal interrupts.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-29 08:54:14 -07:00
NeilBrown b7b6f94cf6 rculist: Improve documentation for list_for_each_entry_from_rcu()
Unfortunately the patch for adding list_for_each_entry_from_rcu()
wasn't the final patch after all review.  It is functionally
correct but the documentation was incomplete.

This patch adds this missing documentation which includes an update to
the documentation for list_for_each_entry_continue_rcu() to match the
documentation for the new list_for_each_entry_from_rcu(), and adds
list_for_each_entry_from_rcu() and the already existing
hlist_for_each_entry_from_rcu() to section 7 of whatisRCU.txt.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:25 -07:00
Andrea Parri 264d4f88ad doc: Update synchronize_rcu() definition in whatisRCU.txt
The synchronize_rcu() definition based on RW-locks in whatisRCU.txt
does not meet the "Memory-Barrier Guarantees" in Requirements.html;
for example, the following SB-like test:

    P0:                      P1:

    WRITE_ONCE(x, 1);        WRITE_ONCE(y, 1);
    synchronize_rcu();       smp_mb();
    r0 = READ_ONCE(y);       r1 = READ_ONCE(x);

should not be allowed to reach the state "r0 = 0 AND r1 = 0", but
the current write_lock()+write_unlock() definition can not ensure
this.  This commit therefore inserts an smp_mb__after_spinlock()
in order to cause this synchronize_rcu() implementation to provide
this memory-barrier guarantee.

Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:39:23 -07:00
Paul E. McKenney e1333462e3 doc: Update RCU CPU stall-warning documentation
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:51 -07:00
Paul E. McKenney 869cc745bf doc: Update memory-ordering documentation for ->gp-seq
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:51 -07:00
Paul E. McKenney b1e1f21f5b doc: Update data-structure documentation for ->gp_seq
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-12 15:38:49 -07:00
Paul Gortmaker 628c08420b doc: Ensure whatisRCU.txt actually says what RCU is
It came to my attention that the file "whatisRCU.txt" does not
manage to actually ever spell out what is RCU.

This might not be an issue for a lot of people, but we have to
assume the consumers of these documents are starting from ground
zero; otherwise they'd not be reading the docs.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
2018-05-15 10:28:12 -07:00
Paul E. McKenney 1dfa55e019 Merge branches 'cond_resched.2017.12.04a', 'dyntick.2017.11.28a', 'fixes.2017.12.11a', 'srbd.2017.12.05a' and 'torture.2017.12.11a' into HEAD
cond_resched.2017.12.04a: Convert cond_resched_rcu_qs() to cond_resched()
dyntick.2017.11.28a: Make RCU dynticks handle interrupts from NMI
fixes.2017.12.11a: Miscellaneous fixes
srbd.2017.12.05a: Remove now-redundant smp_read_barrier_depends()
torture.2017.12.11a: Torture-testing update
2017-12-11 09:21:58 -08:00
Paul E. McKenney 9ad3c143d7 doc: De-emphasize smp_read_barrier_depends
This commit keeps only the historical and low-level discussion of
smp_read_barrier_depends().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Adjusted to allow for David Howells feedback on prior commit. ]
2017-12-05 11:57:53 -08:00
Paul E. McKenney f2b1760aed doc: Eliminate cond_resched_rcu_qs() in favor of cond_resched()
Now that cond_resched() also provides RCU quiescent states when
needed, it can be used in place of cond_resched_rcu_qs().  This
commit therefore documents this change.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-12-04 10:28:59 -08:00
Paul E. McKenney 3af3999b9a doc: Update dyntick-idle design documentation for NMI/irq consolidation
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-11-28 15:51:22 -08:00
Linus Torvalds 6098850e7e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Documentation updates

   - RCU CPU stall-warning updates

   - Torture-test updates

   - Miscellaneous fixes

  Size wise the biggest updates are to documentation. Excluding
  documentation most of the code increase comes from a single commit
  which expands debugging"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  srcu: Add parameters to SRCU docbook comments
  doc: Rewrite confusing statement about memory barriers
  memory-barriers.txt: Fix typo in pairing example
  rcu/segcblist: Include rcupdate.h
  rcu: Add extended-quiescent-state testing advice
  rcu: Suppress lockdep false-positive ->boost_mtx complaints
  rcu: Do not include rtmutex_common.h unconditionally
  torture: Provide TMPDIR environment variable to specify tmpdir
  rcutorture: Dump writer stack if stalled
  rcutorture: Add interrupt-disable capability to stall-warning tests
  rcu: Suppress RCU CPU stall warnings while dumping trace
  rcu: Turn off tracing before dumping trace
  rcu: Make RCU CPU stall warnings check for irq-disabled CPUs
  sched,rcu: Make cond_resched() provide RCU quiescent state
  sched: Make resched_cpu() unconditional
  irq_work: Map irq_work_on_queue() to irq_work_on() in !SMP
  rcu: Create call_rcu_tasks() kthread at boot time
  rcu: Fix up pending cbs check in rcu_prepare_for_idle
  memory-barriers: Rework multicopy-atomicity section
  memory-barriers: Replace uses of "transitive"
  ...
2017-11-13 12:18:10 -08:00
Tom Saeger d354a6f150 Documentation: fix ref to workqueue content
Signed-off-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-10-19 12:57:26 -06:00
Paul E. McKenney d3cf5176d0 documentation: Update RCU CPU stall warning messages
The RCU CPU stall warnings have morphed significantly since the last
update, so this commit brings the documentation up to date.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-10-09 14:23:36 -07:00
Paul E. McKenney 3d916a443e documentation: Slow systems can stall RCU grace periods
If a fast system has a worst-case grace-period duration of (say) ten
seconds, then running the same workload on a system ten times as slow
will get you an RCU CPU stall warning given default stall-warning
timeout settings.  This commit therefore adds this possibility to
stallwarn.txt.

Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-10-09 14:23:36 -07:00
Paul E. McKenney dfa0ee48ef documentation: Long-running irq handlers can stall RCU grace periods
If a periodic interrupt's handler takes longer to execute than the period
between successive interrupts, RCU's kthreads and softirq handlers can
be prevented from executing, resulting in otherwise inexplicable RCU
CPU stall warnings.  This commit therefore calls out this possibility
in Documentation/RCU/stallwarn.txt.

Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-10-09 14:23:36 -07:00
Paul E. McKenney bb7e5ce7dd documentation: RCU grace-period memory ordering guarantees
This commit provides text and diagrams showing how Tree RCU implements
its grace-period memory ordering guarantees.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-09 14:23:36 -07:00
Paul E. McKenney 850bf6d592 doc: Set down RCU's scheduling-clock-interrupt needs
This commit documents the situations in which RCU needs the
scheduling-clock interrupt to be enabled, along with the consequences
of failing to meet RCU's needs in this area.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-08-17 07:31:14 -07:00
Paul E. McKenney 8a597d636f doc: No longer allowed to use rcu_dereference on non-pointers
There are too many ways for the compiler to optimize (that is, break)
dependencies carried via integer values, so it is now permissible to
carry dependencies only via pointers.  This commit catches up some of
the documentation on this point.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-08-17 07:29:58 -07:00
Paul E. McKenney 4de5f89ef8 doc: Update RCU documentation
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-08-17 07:29:48 -07:00
Paul E. McKenney ae91aa0adb rcu: Remove debugfs tracing
RCU's debugfs tracing used to be the only reasonable low-level debug
information available, but ftrace and event tracing has since surpassed
the RCU debugfs level of usefulness.  This commit therefore removes
RCU's debugfs tracing.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 18:52:43 -07:00
Paul E. McKenney 41a2901e7d rcu: Remove SPARSE_RCU_POINTER Kconfig option
The sparse-based checking for non-RCU accesses to RCU-protected pointers
has been around for a very long time, and it is now the only type of
sparse-based checking that is optional.  This commit therefore makes
it unconditional.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
2017-06-08 18:52:41 -07:00
Paul E. McKenney fe5ac724d8 rcu: Remove nohz_full full-system-idle state machine
The NO_HZ_FULL_SYSIDLE full-system-idle capability was added in 2013
by commit 0edd1b1784 ("nohz_full: Add full-system-idle state machine"),
but has not been used.  This commit therefore removes it.

If it turns out to be needed later, this commit can always be reverted.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-08 18:52:39 -07:00
Paul E. McKenney c75e9caaf8 doc: Take tail recursion into account in RCU requirements
This commit classifies tail recursion as an alternative way to write
a loop, with similar limitations.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 08:25:34 -07:00
Paul E. McKenney 09f501a0f0 srcu: Document auto-expediting requirement
This commit documents the auto-expediting requirement satisfied by
commits 2da4b2a7fd ("srcu: Expedite first synchronize_srcu() when idle")
and 22607d66bb ("srcu: Specify auto-expedite holdoff time").

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08 08:25:33 -07:00
Paul E. McKenney f2094107ac Merge branches 'doc.2017.04.12a', 'fixes.2017.04.19a' and 'srcu.2017.04.21a' into HEAD
doc.2017.04.12a: Documentation updates
fixes.2017.04.19a: Miscellaneous fixes
srcu.2017.04.21a: Parallelize SRCU callback handling
2017-04-21 06:00:13 -07:00
Paul E. McKenney 5f0d5a3ae7 mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU
A group of Linux kernel hackers reported chasing a bug that resulted
from their assumption that SLAB_DESTROY_BY_RCU provided an existence
guarantee, that is, that no block from such a slab would be reallocated
during an RCU read-side critical section.  Of course, that is not the
case.  Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
slab of blocks.

However, there is a phrase for this, namely "type safety".  This commit
therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
to avoid future instances of this sort of confusion.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-mm@kvack.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
[ paulmck: Add comments mentioning the old name, as requested by Eric
  Dumazet, in order to help people familiar with the old name find
  the new one. ]
Acked-by: David Rientjes <rientjes@google.com>
2017-04-18 11:42:36 -07:00
Paul E. McKenney 9226b10d78 rcu: Place guard on rcu_all_qs() and rcu_note_context_switch() actions
The rcu_all_qs() and rcu_note_context_switch() do a series of checks,
taking various actions to supply RCU with quiescent states, depending
on the outcomes of the various checks.  This is a bit much for scheduling
fastpaths, so this commit creates a separate ->rcu_urgent_qs field in
the rcu_dynticks structure that acts as a global guard for these checks.
Thus, in the common case, rcu_all_qs() and rcu_note_context_switch()
check the ->rcu_urgent_qs field, find it false, and simply return.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2017-04-18 11:38:18 -07:00
Paul E. McKenney 0f9be8cabb rcu: Eliminate flavor scan in rcu_momentary_dyntick_idle()
The rcu_momentary_dyntick_idle() function scans the RCU flavors, checking
that one of them still needs a quiescent state before doing an expensive
atomic operation on the ->dynticks counter.  However, this check reduces
overhead only after a rare race condition, and increases complexity.  This
commit therefore removes the scan and the mechanism enabling the scan.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-18 11:38:18 -07:00
Paul E. McKenney 9577df9a31 rcu: Pull rcu_qs_ctr into rcu_dynticks structure
The rcu_qs_ctr variable is yet another isolated per-CPU variable,
so this commit pulls it into the pre-existing rcu_dynticks per-CPU
structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-18 11:38:17 -07:00
Paul E. McKenney abb06b9948 rcu: Pull rcu_sched_qs_mask into rcu_dynticks structure
The rcu_sched_qs_mask variable is yet another isolated per-CPU variable,
so this commit pulls it into the pre-existing rcu_dynticks per-CPU
structure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-18 11:38:17 -07:00
Paul E. McKenney d3d3a3ccc4 doc: Emphasize that "toy" RCU requires recursive rwlock
Reported-by: "yangzc@uit.com.cn" <yangzc@uit.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:43 -07:00
Michalis Kokologiannakis 93728af0a1 doc: Update the comparisons rule in rcu_dereference.txt
When an RCU-protected pointer is fetched but never dereferenced
rcu_access_pointer() should be used in place of rcu_dereference().
This commit explicitly records this very fact in Documentation/
RCU/rcu_dereference.txt, in order to prevent the usage of
rcu_dereference() in comparisons.

Signed-off-by: Michalis Kokologiannakis <mixaskok@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:42 -07:00
Paul E. McKenney 066bb1c84a doc: Update rcu_assign_pointer() definition in whatisRCU.txt
The rcu_assign_pointer() macro has changed over time, and the version
in Documentation/RCU/whatisRCU.txt has not kept up.  This commit brings
it into 2017, albeit in a simplified fashion.

Reported-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:42 -07:00
Paul E. McKenney 6771853bce doc: Update requirements based on recent changes
These changes include lighter-weight expedited grace periods, the fact
that expedited grace periods and rcu_barrier() no longer block CPU
hotplug, some HTML font fixups, noting that rcu_barrier() need not wait
for a grace period (even if callbacks are posted), the fact that SRCU
read-side critical sections can be used from offline CPUs, and the fact
that SRCU now maintains per-CPU callback lists.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:41 -07:00
Paul E. McKenney aa123a748e doc: Update RCU data-structure documentation for rcu_segcblist
The rcu_segcblist data structure, which contains segmented lists
of RCU callbacks, was recently added.  This commit updates the
documentation accordingly.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:41 -07:00
Paul E. McKenney 8e2a439753 doc: Update stallwarn.txt to make causes more prominent
This commit rearranges the Documentation/RCU/stallwarn.txt file to
put the list of issues that can cause RCU CPU stall warnings near
the beginning of the document.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:40 -07:00
Paul E. McKenney b4553f0cfe doc: Add mid-boot operation to expedited grace periods
This commit adds a description of how expedited grace periods operate
during the mid-boot "dead zone", which starts when the scheduler spawns
the first kthread and ends when all of RCU's kthreads have been spawned.
In short, before mid-boot, synchronous grace periods can be a no-op.
After the end of mid-boot, workqueues may be used.  During mid-boot,
the requesting task drivees the expedited grace period.

For more detail, see https://lwn.net/Articles/716148/.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:40 -07:00
Paul E. McKenney f1387d7705 doc: Synchronous RCU grace periods are now legal throughout boot
This commit updates the "Early Boot" section of the RCU requirements
to describe how synchronous RCU grace periods are now legal throughout
the boot process.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-04-12 08:23:39 -07:00
Paul E. McKenney 31945aa9f1 Merge branches 'doc.2017.01.15b', 'dyntick.2017.01.23a', 'fixes.2017.01.23a', 'srcu.2017.01.25a' and 'torture.2017.01.15b' into HEAD
doc.2017.01.15b: Documentation updates
dyntick.2017.01.23a: Dyntick tracking consolidation
fixes.2017.01.23a: Miscellaneous fixes
srcu.2017.01.25a: SRCU rewrite, fixes, and verification
torture.2017.01.15b: Torture-test updates
2017-01-25 12:56:05 -08:00
Paul E. McKenney bb4e2c08bb rcu: Eliminate unused expedited_normal counter
Expedited grace periods no longer fall back to normal grace periods
in response to lock contention, given that expedited grace periods
now use the rcu_node tree so as to avoid contention.  This commit
therfore removes the expedited_normal counter.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2017-01-23 11:37:14 -08:00
Paul E. McKenney 4afac159c3 doc: Quick-Quiz answers are now inline
Now that quick-quiz answers are inline, there is no separate section
containing those answers.  This commit therefore removes the dangling
reference from the RCU data-structures design documentation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2017-01-14 21:30:33 -08:00
Tetsuo Handa 526914a0ae doc: Fix RCU requirements typos
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2017-01-14 21:29:10 -08:00
Paul E. McKenney e73c14ebc3 rcu: Design documentation for expedited grace periods
This commit adds design documentation for expedited grace periods.
This documentation is in HTML rather than the new documentation
format because (1) I have prototype documentation already in HTML,
and (2) Attempting to learn the new documentation format while
creating the design documentation seems likely to result in neither
happening in a timely fashion.

Once the design documentation is complete, we can start a conversion effort.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2017-01-14 21:28:58 -08:00
Pranith Kumar 8cf503d337 Documentation/RCU: Fix minor typo
deference should actually be dereference.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-11-14 10:39:48 -08:00
Paul E. McKenney e2c85cb12c documentation: Present updated RCU guarantee
Recent memory-model work deduces the relationships of RCU read-side
critical sections and grace periods based on the relationships of
accesses within a critical section and accesses preceding and following
the grace period.  This commit therefore adds this viewpoint.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-11-14 10:39:36 -08:00
Petr Mladek 3989144f86 kthread: kthread worker API cleanup
A good practice is to prefix the names of functions by the name
of the subsystem.

The kthread worker API is a mix of classic kthreads and workqueues.  Each
worker has a dedicated kthread.  It runs a generic function that process
queued works.  It is implemented as part of the kthread subsystem.

This patch renames the existing kthread worker API to use
the corresponding name from the workqueues API prefixed by
kthread_:

__init_kthread_worker()		-> __kthread_init_worker()
init_kthread_worker()		-> kthread_init_worker()
init_kthread_work()		-> kthread_init_work()
insert_kthread_work()		-> kthread_insert_work()
queue_kthread_work()		-> kthread_queue_work()
flush_kthread_work()		-> kthread_flush_work()
flush_kthread_worker()		-> kthread_flush_worker()

Note that the names of DEFINE_KTHREAD_WORK*() macros stay
as they are. It is common that the "DEFINE_" prefix has
precedence over the subsystem names.

Note that INIT() macros and init() functions use different
naming scheme. There is no good solution. There are several
reasons for this solution:

  + "init" in the function names stands for the verb "initialize"
    aka "initialize worker". While "INIT" in the macro names
    stands for the noun "INITIALIZER" aka "worker initializer".

  + INIT() macros are used only in DEFINE() macros

  + init() functions are used close to the other kthread()
    functions. It looks much better if all the functions
    use the same scheme.

  + There will be also kthread_destroy_worker() that will
    be used close to kthread_cancel_work(). It is related
    to the init() function. Again it looks better if all
    functions use the same naming scheme.

  + there are several precedents for such init() function
    names, e.g. amd_iommu_init_device(), free_area_init_node(),
    jump_label_init_type(),  regmap_init_mmio_clk(),

  + It is not an argument but it was inconsistent even before.

[arnd@arndb.de: fix linux-next merge conflict]
 Link: http://lkml.kernel.org/r/20160908135724.1311726-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/1470754545-17632-3-git-send-email-pmladek@suse.com
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:33 -07:00
Paul E. McKenney ed2bec07fd documentation: Record reason for rcu_head two-byte alignment
There is an assertion in __call_rcu() that checks only the bottom
bit of the rcu_head pointer, rather than the bottom two (as might be
expected for 32-bit systems) or the bottom three (as might be expected
for 64-bit systems).  This choice might be a bit surprising in these days
of ubiquitous 32-bit and 64-bit systems.  This commit therefore records
the reason for this odd alignment check, namely that m68k guarantees
only two-byte alignment despite being a 32-bit architectures.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-08-22 09:25:33 -07:00
SeongJae Park e1ef69217f rcutorture: Remove outdated config option description
CONFIG_RCU_TORTURE_TEST_RUNNABLE was removed by commit 4e9a073f60
("torture: Remove CONFIG_RCU_TORTURE_TEST_RUNNABLE, simplify code"),
but the documentation was not updated accordingly.  This commit therefore
updates the documentation to reflect CONFIG_RCU_TORTURE_TEST_RUNNABLE's
removal and to add a description for the alternative module parameter.

Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-08-22 09:23:23 -07:00
Eric Engestrom 14ef05754f Documentation: Fix spelling mistake
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-14 16:01:00 -07:00
Paul E. McKenney c79dac758d documentation: Add RCU_NONIDLE() restrictions to requirements
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-14 16:00:59 -07:00
Paul E. McKenney db4855b5a8 documentation: Add references to 2010 and 2014 Big API Tables
Reported-by: Jim Roskind <jar@roskind.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-14 16:00:58 -07:00
Paul E. McKenney 2921b12362 documentation: Add reference to 2014 RCU API LWN article
Reported-by: Jim Roskind <jar@roskind.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-06-14 16:00:58 -07:00
Linus Torvalds e9ad9b9bd3 The most interesting thing (IMO) this time around is some beginning
infrastructural work to allow documents to be written using restructured
 text.  Maybe someday, in a galaxy far far away, we'll be able to eliminate
 the DocBook dependency and have a much better integrated set of kernel
 docs.  Someday.
 
 Beyond that, there's a new document on security hardening from Kees, the
 movement of some sample code over to samples/, a number of improvements to
 the serial docs from Geert, and the usual collection of corrections, typo
 fixes, etc.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPf/VAAoJEI3ONVYwIuV60pkP/3brq+CavbwptWppESoyZaf7
 mpVSH7sOKicMcfHYYIXHmmg0K5gM4e22ATl39+izUCRZRwRnObXvroH++G5mARLs
 MUDxLvkc/QxDDuCZnUBq5E2gPtuyYpgj1q9fMGB+70ucc/EXYp5cxUhDmbNVrpSG
 KBMoZqKaW/Cf8/4fvRQG/glSR0iwyaQuvvoFAWLHgf8uWN/JPM2Cnv9V2zGQCtzP
 4B4Jzayu2BGKowBd65WUYdpGnccc7OAJFSJDY/Z9x7kVxKyD+VTn7VgxGnXxs88v
 uNmUEMENUpswzuoYEnDHoR0Y2o7jUi2doFKv+eacSmPaMLWL5EMDzcooZ+Vi7HWH
 mvp6GtAZ5qs96OGjsi+gFIw4kY8HGdnpzs7qk/uEdAndfAif5v24YLSQRG2rUCJM
 LxomnAWOJEIWGKJtuJnl16aZkgOcn6soecXw3PJmpxzhwd8BnQzwyZIdaZ98kwjA
 7Enq2Mmw5NBQwGIV2ODUxzoQ3Axj7aJJsDra2n6lPGTGXONGdgNFzk/hGmtQSuIp
 Aeatiy66FF0qKomzs2+EACOFP+eH/IId0yvW83Pj0o9nV25YZiPsw0Z1Tae5n3+g
 zgTFycalaowIwE3YzyH6BwvnMrluiPpUTjSLsmEaviJxE7/o+zrjOvMvallUIVUn
 YkJcia/DtSuc7u7LYkWe
 =2O+a
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux

Pull Documentation updates from Jon Corbet:
 "A bit busier this time around.

  The most interesting thing (IMO) this time around is some beginning
  infrastructural work to allow documents to be written using
  restructured text.  Maybe someday, in a galaxy far far away, we'll be
  able to eliminate the DocBook dependency and have a much better
  integrated set of kernel docs.  Someday.

  Beyond that, there's a new document on security hardening from Kees,
  the movement of some sample code over to samples/, a number of
  improvements to the serial docs from Geert, and the usual collection
  of corrections, typo fixes, etc"

* tag 'docs-for-linus' of git://git.lwn.net/linux: (55 commits)
  doc: self-protection: provide initial details
  serial: doc: Use port->state instead of info
  serial: doc: Always refer to tty_port->mutex
  Documentation: vm: Spelling s/paltform/platform/g
  Documentation/memcg: update kmem limit doc as codes behavior
  docproc: print a comment about autogeneration for rst output
  docproc: add support for reStructuredText format via --rst option
  docproc: abstract terminating lines at first space
  docproc: abstract docproc directive detection
  docproc: reduce unnecessary indentation
  docproc: add variables for subcommand and filename
  kernel-doc: use rst C domain directives and references for types
  kernel-doc: produce RestructuredText output
  kernel-doc: rewrite usage description, remove duplicated comments
  Doc: correct the location of sysrq.c
  Documentation: fix common spelling mistakes
  samples: v4l: from Documentation to samples directory
  samples: connector: from Documentation to samples directory
  Documentation: xillybus: fix spelling mistake
  Documentation: x86: fix spelling mistakes
  ...
2016-05-19 18:07:25 -07:00
Kees Cook 08559657b2 Documentation: fix common spelling mistakes
This fixes several spelling mistakes in the Documentation/ tree, which
are caught by checkpatch.pl's spell checking.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-04-28 07:51:59 -06:00
Paul E. McKenney dcd36d01fb Merge branches 'doc.2016.04.19a', 'exp.2016.03.31d', 'fixes.2016.03.31d' and 'torture.2016.04.21a' into HEAD
doc.2016.04.19a: Documentation updates
exp.2016.03.31d: Expedited grace-period updates
fixes.2016.03.31d: Miscellaneous fixes
torture.2016.004.21a Torture-test updates
2016-04-21 13:48:20 -07:00
Paul E. McKenney 5c1458478c documentation: Add documentation for RCU's major data structures
This commit adds documentation for RCU's major data structures,
including rcu_state, rcu_node, rcu_data, rcu_dynticks, and rcu_head.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-04-19 14:48:12 -07:00
Paul E. McKenney e2fd9d3584 rcu: Remove expedited GP funnel-lock bypass
Commit #cdacbe1f91264 ("rcu: Add fastpath bypassing funnel locking")
turns out to be a pessimization at high load because it forces a tree
full of tasks to wait for an expedited grace period that they probably
do not need.  This commit therefore removes this optimization.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:34:07 -07:00
Paul E. McKenney 0c7d10e4b9 documentation: Emphasize the call_rcu() is illegal from idle
Although call_rcu()'s fastpath works just fine on an idle CPU,
some branches of the slowpath invoke the scheduler, which uses
RCU.  Therefore, this commit emphasizes the fact that call_rcu()
must not be invoked from an idle CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:23 -07:00
Paul E. McKenney 5413e24c94 documentation: Sharpen up the no-readers quick quiz
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:23 -07:00
Paul E. McKenney 6146f8df48 documentation: Get rid of duplicate .htmlx file
This commit uses colors to obscure the quick-quiz answers, thus getting
rid of the .htmlx file.  Use your mouse to select the answer in order
to see the text.  Alternatively, use your favorite scripting language
to remove all occurences of "<font color="ffffff">" from the file.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:22 -07:00
Paul E. McKenney 11a65df573 documentation: Remove unnecessary images from requirements
This commit removes a cutesy cartoon and also a diagram that can
just as easily be represented by text.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:21 -07:00
Paul E. McKenney 514f1eb5f4 documentation: Document illegality of call_rcu() from offline CPUs
There is already a blanket statement about no member of RCU's API
being legal from an offline CPU, but add an explicit note where it
states that it is illegal to invoke call_rcu() from an NMI handler.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:21 -07:00
Paul E. McKenney d8936c0b7e documentation: Explain why rcu_read_lock() needs no barrier()
This commit adds a Quick Quiz whose answer explains why the compiler
code reordering enabled by CONFIG_PREEMPT=n's empty rcu_read_lock()
and rcu_read_unlock() functions does not hinder RCU's ability to figure
out which RCU read-side critical sections have completed and not.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:20 -07:00
Paul E. McKenney f43b62542e documentation: Add synchronize_rcu_mult() to the requirements
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:20 -07:00
Paul E. McKenney 41abcf321d documentation: Add real-time requirements from CPU-bound workloads
This commit records RCU's responsibility to avoid degrading latencies
of CPUs running tight loops within properly configured workloads,
both in kernel and in userspace.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:19 -07:00
Yao Dongdong 70946a44de documentation: Make sample code and documentation consistent
In the chapter 'analogy with reader-writer locking', the sample
code uses spinlock_t in reader-writer case. Just correct it so
that we can read the document easily.

Signed-off-by: Yao Dongdong <yaodongdong@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2016-03-31 13:33:19 -07:00
Paul E. McKenney c64c4b0f9a documentation: Update RCU requirements based on expedited changes
Because RCU-sched expedited grace periods now use IPIs and interact
with rcu_read_unlock(), it is no longer sufficient to disable preemption
across RCU read-side critical sections that acquire and hold scheduler
locks.  It is now necessary to instead disable interrupts.  This commit
documents this change.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-05 12:34:32 -08:00
Paul E. McKenney 4b689330b1 documentation: Clarify RCU memory barriers and requirements
The RCU requirements do not make it absolutely clear that the
memory-barrier requirements are not intended to replace the fundamental
requirement that all pre-existing RCU readers complete before a grace
period completes.  This commit therefore pulls the memory-barrier
requirements into a separate section and explicitly calls out the
relationship between the memory-barrier requirements and the fundamental
requirement.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-05 12:33:37 -08:00
Paul E. McKenney a4b575627e documentation: Expand on scheduler/RCU deadlock requirements
This commit adds a second option for avoiding scheduler/RCU deadlocks,
namely that preemption be disabled across the entire RCU read-side
critical section in question.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-05 12:33:26 -08:00
Paul E. McKenney 0825458b1d documentation: Composability analogies
This commit expands on RCU's composability by comparing it to that of
transactional memory and of locking.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-05 12:33:11 -08:00
Paul E. McKenney 01d3ad3834 documentation: Cover requirements controlling stall warnings
This commit adds verbiage on boot and sysfs parameters that can be
used to control RCU CPU stall warnings, both to change the timeout
and to suppress these warnings entirely.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-05 12:32:19 -08:00
Paul E. McKenney 701e80312f Documentation: Record bottom-bit-zero guarantee for ->next
This commit records RCU's guarantee that the bottom bit of the rcu_head
structure's ->next field will remain zero for callbacks posted via
call_rcu(), but not necessarily for <tt>kfree_rcu()</tt> or some
possible future call_rcu_lazy() variant that might one day be created
for energy-efficiency purposese.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Updates URLs as suggested by Josh Triplett. ]
2015-12-05 12:31:47 -08:00
Paul E. McKenney 649e4368ff documentation: Record RCU requirements
This commit adds RCU requirements as published in a 2015 LWN series.
Bringing these requirements in-tree allows them to be updated as changes
are discovered.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Updates to charset and URLs as suggested by Josh Triplett. ]
2015-12-05 12:19:07 -08:00
Paul E. McKenney 39cd2dd39a Merge branches 'doc.2015.10.06a', 'percpu-rwsem.2015.10.06a' and 'torture.2015.10.06a' into HEAD
doc.2015.10.06a:  Documentation updates.
percpu-rwsem.2015.10.06a:  Optimization of per-CPU reader-writer semaphores.
torture.2015.10.06a:  Torture-test updates.
2015-10-07 16:06:25 -07:00
Paul E. McKenney b672adf8cf documentation: Catch up list of torture_type options
This commit rids the documentation of long-obsolete torture_type options
such as rcu_sync and adds new ones such as tasks.  Also add verbiage
noting the fact that rcutorture now concurrently tests the asynchrounous,
synchronous, expedited synchronous, and polling grace-period primitives.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:22:48 -07:00
Jason A. Donenfeld 2c4ac34bc2 documentation: Correct doc to use rcu_dereference_protected
As there is lots of misinformation and outdated information on the
Internet about nearly all topics related to the kernel, I thought it
would be best if I based my RCU code on the guidelines of the examples
in the Documentation/ tree of the latest kernel. One thing that stuck
out when reading the whatisRCU.txt document was, "interesting how we
don't need any function to dereference rcu protected pointers when doing
updates if a lock is held. I wonder how static analyzers will work with
that." Then, a few weeks later, upon discovering sparse's __rcu support,
I ran it over my code, and lo and behold, things weren't done right.
Examining other RCU usages in the kernel reveal consistent usage of
rcu_dereference_protected, passing in lockdep_is_held as the
conditional. So, this patch adds that idiom to the documentation, so
that others ahead of me won't endure the same exercise.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:22:43 -07:00
Paul E. McKenney da873def8d documentation: Call out slow consoles as cause of stall warnings
The Linux kernel outputs copious text during boot, and a slow serial
console can result in stall warnings, particularly when messages are
printed with interrupts disabled.  This commit adds this to the list
of causes of RCU CPU stall warning messages.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:22:38 -07:00
Paul E. McKenney 0d43eb34f9 rcu: Invert passed_quiesce and rename to cpu_no_qs
This commit inverts the sense of the rcu_data structure's ->passed_quiesce
field and renames it to ->cpu_no_qs.  This will allow a later commit to
use an "aggregate OR" operation to test expedited as well as normal grace
periods without added overhead.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-20 21:16:21 -07:00
Paul E. McKenney 3dbe43f6fb Merge branches 'doc.2015.07.15a' and 'torture.2015.07.15a' into HEAD
doc.2015.07.15a: Documentation updates.
torture.2015.07.15a: Torture-test updates.
2015-08-04 08:42:02 -07:00
Paul E. McKenney 8ff4fbfd69 Merge branches 'fixes.2015.07.22a' and 'initexp.2015.08.04a' into HEAD
fixes.2015.07.22a: Miscellaneous fixes.
initexp.2015.08.04a: Initialization and expedited updates.
	(Single branch due to conflicts.)
2015-08-04 08:40:58 -07:00
Paul E. McKenney f78f5b90c4 rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()
This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for
consistency with the WARN() series of macros.  This also requires
inverting the sense of the conditional, which this commit also does.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
2015-07-22 15:27:32 -07:00
Paul E. McKenney cdacbe1f91 rcu: Add fastpath bypassing funnel locking
In the common case, there will be only one expedited grace period in
the system at a given time, in which case it is not helpful to use
funnel locking.  This commit therefore adds a fastpath that bypasses
funnel locking when the root ->exp_funnel_mutex is not held.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-17 14:59:06 -07:00
Paul E. McKenney 99a930b0d2 documentation: Describe new expedited stall warnings
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-17 14:59:03 -07:00
Paul E. McKenney 75c27f119b rcu: Remove CONFIG_RCU_CPU_STALL_INFO
The CONFIG_RCU_CPU_STALL_INFO has been default-y for a couple of
releases with no complaints, so it is time to eliminate this Kconfig
option entirely, so that the long-form RCU CPU stall warnings cannot
be disabled.  This commit does just that.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-17 14:58:44 -07:00
Paul E. McKenney 297f739938 documentation: Fix spelling of "operators"
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-15 14:43:10 -07:00
Linus Torvalds 1e467e68e5 Documentation updates for 4.2
The main thing here is Ingo's big subdirectory documenting feature support
 for each architecture.  Beyond that, it's the usual pile of fixes, tweaks,
 and small additions.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVi0g2AAoJEI3ONVYwIuV6Me4QAIfa79z05ABSjlyWaKw46plH
 lULR9cyHdR59JVPHKjSOfT9/c+GOdoz6kkXQoe/TgVyj5fRB8seUW5GJXCASndkk
 aVd4c6yKFH1NISXsSdVQC0JbpgAURgcSR6x59It++fG3NINvXronFTWGMBHMLKcI
 A2hM2jNP914Dy5r4ipWZKzF1KxIlqK9kmLxlNoE6/LoQfBhh1dMdnyfuM11sguAy
 s5pr9JeCPbWC0RE7st/qEivXF4lpj6hd3XoYfM2Y+oukj5xEPQevLTLHOgtesnx9
 guUAul5Sw27n+Dx8I0Qxf1n+5SkrijoAa72g5vAxTs+ilOey67qba012NaYSy7RK
 s15XOIZ/1JTS9JjkO7GR5NbG6AiIIAH5P+Y501ivCIrsWciTOgKj7cOzakIEV8/P
 NX4120Lh5lbBrWeYkl8WbgMO0Me8cThbALC+rncF/wjvGyREKyxNlZ9qvBqmHYjG
 5Et2DT+rANaDmmblgMK3tX/zI1g3pN51e+CRF+Hzh1jZD3MZ/i+KS4qgfGFDzMIj
 uoniO5VfyD4zRbyv4Grg7XMpXiP8xFxKDypglYiXzzwlkarUgbMGOoFE7AkiPOKB
 t9gLPetbDsDyU/bSpzHlfObZp+q+pCxHPhyLS7hxEi3gBxYajIMbkpHHJugnE0+H
 TfkIhy6QQm1vAPTpRXaE
 =ODt8
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux-2.6

Pull documentation updates from Jonathan Corbet:
 "The main thing here is Ingo's big subdirectory documenting feature
  support for each architecture.  Beyond that, it's the usual pile of
  fixes, tweaks, and small additions"

* tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (79 commits)
  doc:md: fix typo in md.txt.
  Documentation/mic/mpssd: don't build x86 userspace when cross compiling
  Documentation/prctl: don't build tsc tests when cross compiling
  Documentation/vDSO: don't build tests when cross compiling
  Doc:ABI/testing: Fix typo in sysfs-bus-fcoe
  Doc: Docbook: Change wikipedia's URL from http to https in scsi.tmpl
  Doc: Change wikipedia's URL from http to https
  Documentation/kernel-parameters: add missing pciserial to the earlyprintk
  Doc:pps: Fix typo in pps.txt
  kbuild : Fix documentation of INSTALL_HDR_PATH
  Documentation: filesystems: updated struct file_operations documentation in vfs.txt
  kbuild: edit explanation of clean-files variable
  Doc: ja_JP: Fix typo in HOWTO
  Move freefall program from Documentation/ to tools/
  Documentation: ARM: EXYNOS: Describe boot loaders interface
  Doc:nfc: Fix typo in nfc-hci.txt
  vfs: Minor documentation fix
  Doc: networking: txtimestamp: fix printf format warning
  Documentation, intel_pstate: Improve legacy mode internal governors description
  Documentation: extend use case for EXPORT_SYMBOL_GPL()
  ...
2015-06-24 20:01:36 -07:00
Masanari Iida ae13c65bc7 Doc: Change wikipedia's URL from http to https
Recently wikipedia announced to secure access to the servers.
Now all http access re-route to https.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-06-22 10:14:05 -06:00
Paul E. McKenney 0868aa2216 Merge branches 'array.2015.05.27a', 'doc.2015.05.27a', 'fixes.2015.05.27a', 'hotplug.2015.05.27a', 'init.2015.05.27a', 'tiny.2015.05.27a' and 'torture.2015.05.27a' into HEAD
array.2015.05.27a:  Remove all uses of RCU-protected array indexes.
doc.2015.05.27a:  Docuemntation updates.
fixes.2015.05.27a:  Miscellaneous fixes.
hotplug.2015.05.27a:  CPU-hotplug updates.
init.2015.05.27a:  Initialization/Kconfig updates.
tiny.2015.05.27a:  Updates to Tiny RCU.
torture.2015.05.27a:  Torture-testing updates.
2015-05-27 13:00:49 -07:00
Milos Vyletel ed38446424 documentation: State that rcu_dereference() reloads pointer
Make a note stating that repeated calls of rcu_dereference() may not
return the same pointer if update happens while in critical section.

Reported-by: Jeff Haran <jeff.haran@citrix.com>
Signed-off-by: Milos Vyletel <milos@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-05-27 12:57:28 -07:00
Paul E. McKenney ee7c29be36 documentation: Update rcu_dereference.txt based on WG21 discussions
This commit provides another caveat for the care and feeding of pointers
returned by rcu_dereference() that was pointed out in discussions within
the C++ standards committee.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2015-05-27 12:57:28 -07:00
Paul E. McKenney cf9fbf8017 documentation: RCU-protected array indexes no longer supported
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-05-27 12:56:17 -07:00
Paul E. McKenney 78e691f4ae Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', 'preempt.2015.01.06a', 'srcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEAD
doc.2015.01.07a: Documentation updates.
fixes.2015.01.15a: Miscellaneous fixes.
preempt.2015.01.06a: Changes to handling of lists of preempted tasks.
srcu.2015.01.06a: SRCU updates.
stall.2015.01.16a: RCU CPU stall-warning updates and fixes.
torture.2015.01.11a: RCU torture-test updates and fixes.
2015-01-15 23:34:34 -08:00
Paul E. McKenney fb81a44b88 rcu: Add GP-kthread-starvation checks to CPU stall warnings
This commit adds a message that is printed if the relevant grace-period
kthread has not been able to run for the two seconds preceding the
stall warning.  (The two seconds is double the maximum interval between
successive bouts of quiescent-state forcing.)

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-15 23:33:15 -08:00
Paul E. McKenney 5cd37193ce rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors
Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
in places where it would be useful for it to apply to the normal RCU
flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
case for workloads that aggressively overload the system, particularly
those that generate large numbers of RCU updates on systems running
NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
from cond_resched_rcu_qs() to the normal RCU flavors.

Note that it is unfortunately necessary to leave the old ->passed_quiesce
mechanism in place to allow quiescent states that apply to only one
flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
that case, but that is not so good for debugging of RCU internals.)
In addition, if one of the RCU flavor's grace period has stalled, this
will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
quiescent state visible from other CPUs.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
  was used in preemptible code. ]
2015-01-15 23:33:14 -08:00
Paul E. McKenney 6ccd2ecd42 rcu: Improve diagnostics for spurious RCU CPU stall warnings
The current RCU CPU stall warning code will print "Stall ended before
state dump start" any time that the stall-warning code is triggered on
a CPU that has already reported a quiescent state for the current grace
period and if all quiescent states have been reported for the current
grace period.  However, a true stall can result in these symptoms, for
example, by preventing RCU's grace-period kthreads from ever running

This commit therefore checks for this condition, reporting the end of
the stall only if one of the grace-period counters has actually advanced.
Otherwise, it reports the last time that the grace-period kthread made
meaningful progress.  (In normal situations, the grace-period kthread
should make meaningful progress at least every jiffies_till_next_fqs
jiffies.)

Reported-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Miroslav Benes <mbenes@suse.cz>
2015-01-06 11:05:27 -08:00
Xie XiuQi 84596ccbf1 documentation: Update sysfs path for rcu_cpu_stall_timeout
Commit 6bfc09e232 ("rcu: Provide RCU CPU stall warnings for tiny RCU")
moved the rcu_cpu_stall_timeout module parameter from rcutree.c to
rcupdate.c, but failed to update Documentation/RCU/stallwarn.txt. This
commit therefore repairs this omission.

commit 96224daa16 ("documentation: Update sysfs path for rcu_cpu_stall_suppress")
updated the path for rcu_cpu_stall_suppress, but failed to update for
rcu_cpu_stall_timeout.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06 10:57:24 -08:00
Paul E. McKenney 9ea6c58856 Merge branches 'torture.2014.11.03a', 'cpu.2014.11.03a', 'doc.2014.11.13a', 'fixes.2014.11.13a', 'signal.2014.10.29a' and 'rt.2014.10.29a' into HEAD
cpu.2014.11.03a: Changes for per-CPU variables.
doc.2014.11.13a: Documentation updates.
fixes.2014.11.13a: Miscellaneous fixes.
signal.2014.10.29a: Signal changes.
rt.2014.10.29a: Real-time changes.
torture.2014.11.03a: torture-test changes.
2014-11-13 10:39:04 -08:00
Pranith Kumar 28f6569ab7 rcu: Remove redundant TREE_PREEMPT_RCU config option
PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-10-29 10:20:05 -07:00
Paul E. McKenney 0eafa46823 rcu: Remove CONFIG_RCU_CPU_STALL_VERBOSE
The CONFIG_RCU_CPU_STALL_VERBOSE Kconfig parameter causes preemptible
RCU's CPU stall warnings to dump out any preempted tasks that are blocking
the current RCU grace period.  This information is useful, and the default
has been CONFIG_RCU_CPU_STALL_VERBOSE=y for some years.  It is therefore
time for this commit to remove this Kconfig parameter, so that future
kernel builds will always act as if CONFIG_RCU_CPU_STALL_VERBOSE=y.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-10-28 13:48:13 -07:00
Paul E. McKenney 37fe5f0e27 documentation: Add verbiage on RCU-tasks stall warning messages
This commit documents RCU-tasks stall warning messages and also describes
when to use the new cond_resched_rcu_qs() API.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:27:28 -07:00
Ken Helias 1d023284c3 list: fix order of arguments for hlist_add_after(_rcu)
All other add functions for lists have the new item as first argument
and the position where it is added as second argument.  This was changed
for no good reason in this function and makes using it unnecessary
confusing.

The name was changed to hlist_add_behind() to cause unconverted code to
generate a compile error instead of using the wrong parameter order.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Ken Helias <kenhelias@firemail.de>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[intel driver bits]
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:24 -07:00
Paul E. McKenney 9963185c04 documentation: Add pointer to percpu-ref for RCU and refcount
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2014-07-08 08:32:56 -07:00
Pranith Kumar 5e40ad7f6a documentation: Update reference, kerneltrap.org no longer works
The kerneltrap.org site no longer works, so this commit updates it to
a working reference, namely gmane.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-07-08 08:14:09 -07:00
Paul E. McKenney d07e6d080d documentation: Update API documentation
This commit updates the list of RCU API members in whatisRCU.txt.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-04-29 08:38:53 -07:00
Paul E. McKenney b4c5bf3534 documentation: Record rcu_dereference() value mishandling
Recent LKML discussings (see http://lwn.net/Articles/586838/ and
http://lwn.net/Articles/588300/ for the LWN writeups) brought out
some ways of misusing the return value from rcu_dereference() that
are not necessarily completely intuitive.  This commit therefore
documents what can and cannot safely be done with these values.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-04-29 08:38:33 -07:00
Paul E. McKenney 96224daa16 documentation: Update sysfs path for rcu_cpu_stall_suppress
Commit 6bfc09e232 (rcu: Provide RCU CPU stall warnings for tiny RCU)
moved the rcu_cpu_stall_suppress module parameter from rcutree.c to
rcupdate.c, but failed to update Documentation/RCU/stallwarn.txt.  This
commit therefore repairs this omission.

Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-04-29 08:21:50 -07:00
Paul E. McKenney e4696a1d3b documentation: Fix some inconsistencies in RTFP.txt
Some of the early history leaves out some citations and vice versa.  This
commit fixes these up.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17 14:56:10 -08:00
Paul E. McKenney 6e67669678 documentation: Document call_rcu() safety mechanisms and limitations
The call_rcu() family of primitives will take action to accelerate
grace periods when the number of callbacks pending on a given CPU
becomes excessive.  Although this safety mechanism can be useful,
it is no substitute for users of call_rcu() having rate-limit controls
in place.  This commit adds this nuance to the documentation.

Reported-by: "Michael S. Tsirkin" <mst@redhat.com>
Reported-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17 14:55:58 -08:00
Henrik Austad 3cf8ca1c25 Documentation/: update 00-INDEX files
Some of the 00-INDEX files are somewhat outdated and some folders does
not contain 00-INDEX at all.  Only outdated (with the notably exception
of spi) indexes are touched here, the 169 folders without 00-INDEX has
not been touched.

New 00-INDEX
 - spi/* was added in a series of commits dating back to 2006

Added files (missing in (*/)00-INDEX)
 - dmatest.txt was added by commit 851b7e16a0 ("dmatest: run test via
   debugfs")
 - this_cpu_ops.txt was added by commit a1b2a555d6 ("percpu: add
   documentation on this_cpu operations")
 - ww-mutex-design.txt was added by commit 040a0a3710 ("mutex: Add
   support for wound/wait style locks")
 - bcache.txt was added by commit cafe563591 ("bcache: A block layer
   cache")
 - kernel-per-CPU-kthreads.txt was added by commit 49717cb404
   ("kthread: Document ways of reducing OS jitter due to per-CPU
   kthreads")
 - phy.txt was added by commit ff76496347 ("drivers: phy: add generic
   PHY framework")
 - block/null_blk was added by commit 12f8f4fc03 ("null_blk:
   documentation")
 - module-signing.txt was added by commit 3cafea3076 ("Add
   Documentation/module-signing.txt file")
 - assoc_array.txt was added by commit 3cb989501c ("Add a generic
   associative array implementation.")
 - arm/IXP4xx was part of the initial repo
 - arm/cluster-pm-race-avoidance.txt was added by commit 7fe31d28e8
   ("ARM: mcpm: introduce helpers for platform coherency exit/setup")
 - arm/firmware.txt was added by commit 7366b92a77 ("ARM: Add
   interface for registering and calling firmware-specific operations")
 - arm/kernel_mode_neon.txt was added by commit 2afd0a0524 ("ARM:
   7825/1: document the use of NEON in kernel mode")
 - arm/tcm.txt was added by commit bc581770cf ("ARM: 5580/2: ARM TCM
   (Tightly-Coupled Memory) support v3")
 - arm/vlocks.txt was added by commit 9762f12d3e ("ARM: mcpm: Add
   baremetal voting mutexes")
 - blackfin/gptimers-example.c, Makefile was added by commit
   4b60779d5e ("Blackfin: add an example showing how to use the
   gptimers API")
 - devicetree/usage-model.txt was added by commit 31134efc68 ("dt:
   Linux DT usage model documentation")
 - fb/api.txt was added by commit fb21c2f428 ("fbdev: Add FOURCC-based
   format configuration API")
 - fb/sm501.txt was added by commit e6a0498071 ("video, sm501: add
   edid and commandline support")
 - fb/udlfb.txt was added by commit 96f8d864af ("fbdev: move udlfb out
   of staging.")
 - filesystems/Makefile was added by commit 1e0051ae48
   ("Documentation/fs/: split txt and source files")
 - filesystems/nfs/nfsd-admin-interfaces.txt was added by commit
   8a4c6e19cf ("nfsd: document kernel interfaces for nfsd
   configuration")
 - ide/warm-plug-howto.txt was added by commit f74c91413e ("ide: add
   warm-plug support for IDE devices (take 2)")
 - laptops/Makefile was added by commit d49129accc
   ("Documentation/laptop/: split txt and source files")
 - leds/leds-blinkm.txt was added by commit b54cf35a7f ("LEDS: add
   BlinkM RGB LED driver, documentation and update MAINTAINERS")
 - leds/ledtrig-oneshot.txt was added by commit 5e417281cd ("leds: add
   oneshot trigger")
 - leds/ledtrig-transient.txt was added by commit 44e1e9f8e7 ("leds:
   add new transient trigger for one shot timer activation")
 - m68k/README.buddha was part of the initial repo
 - networking/LICENSE.(qla3xxx|qlcnic|qlge) was added by commits
   40839129f7, c4e84bde1d, 5a4faa8737
 - networking/Makefile was added by commit 3794f3e812 ("docsrc: build
   Documentation/ sources")
 - networking/i40evf.txt was added by commit 105bf2fe6b ("i40evf: add
   driver to kernel build system")
 - networking/ipsec.txt was added by commit b3c6efbc36 ("xfrm: Add
   file to document IPsec corner case")
 - networking/mac80211-auth-assoc-deauth.txt was added by commit
   3cd7920a2b ("mac80211: add auth/assoc/deauth flow diagram")
 - networking/netlink_mmap.txt was added by commit 5683264c39
   ("netlink: add documentation for memory mapped I/O")
 - networking/nf_conntrack-sysctl.txt was added by commit c9f9e0e159
   ("netfilter: doc: add nf_conntrack sysctl api documentation") lan)
 - networking/team.txt was added by commit 3d249d4ca7 ("net: introduce
   ethernet teaming device")
 - networking/vxlan.txt was added by commit d342894c5d ("vxlan:
   virtual extensible lan")
 - power/runtime_pm.txt was added by commit 5e928f77a0 ("PM: Introduce
   core framework for run-time PM of I/O devices (rev.  17)")
 - power/charger-manager.txt was added by commit 3bb3dbbd56
   ("power_supply: Add initial Charger-Manager driver")
 - RCU/lockdep-splat.txt was added by commit d7bd2d68aa ("rcu:
   Document interpretation of RCU-lockdep splats")
 - s390/kvm.txt was added by 5ecee4b (KVM: s390: API documentation)
 - s390/qeth.txt was added by commit b4d72c08b3 ("qeth: bridgeport
   support - basic control")
 - scheduler/sched-bwc.txt was added by commit 88ebc08ea9 ("sched: Add
   documentation for bandwidth control")
 - scsi/advansys.txt was added by commit 4bd6d7f356 ("[SCSI] advansys:
   Move documentation to Documentation/scsi")
 - scsi/bfa.txt was added by commit 1ec90174bd ("[SCSI] bfa: add
   readme file")
 - scsi/bnx2fc.txt was added by commit 12b8fc10ea ("[SCSI] bnx2fc: Add
   driver documentation")
 - scsi/cxgb3i.txt was added by commit c3673464eb ("[SCSI] cxgb3i: Add
   cxgb3i iSCSI driver.")
 - scsi/hpsa.txt was added by commit 992ebcf14f ("[SCSI] hpsa: Add
   hpsa.txt to Documentation/scsi")
 - scsi/link_power_management_policy.txt was added by commit
   ca77329fb7 ("[libata] Link power management infrastructure")
 - scsi/osd.txt was added by commit 78e0c621de ("[SCSI] osd:
   Documentation for OSD library")
 - scsi/scsi-parameter.txt was created/moved by commit 163475fb11
   ("Documentation: move SCSI parameters to their own text file")
 - serial/driver was part of the initial repo
 - serial/n_gsm.txt was added by commit 323e84122e ("n_gsm: add a
   documentation")
 - timers/Makefile was added by commit 3794f3e812 ("docsrc: build
   Documentation/ sources")
 - virt/kvm/s390.txt was added by commit d9101fca3d ("KVM: s390:
   diagnose call documentation")
 - vm/split_page_table_lock was added by commit 49076ec2cc ("mm:
   dynamically allocate page->ptl if it cannot be embedded to struct
   page")
 - w1/slaves/w1_ds28e04 was added by commit fbf7f7b4e2 ("w1: Add
   1-wire slave device driver for DS28E04-100")
 - w1/masters/omap-hdq was added by commit e0a29382c6 ("hdq:
   documentation for OMAP HDQ")
 - x86/early-microcode.txt was added by commit 0d91ea86a8 ("x86, doc:
   Documentation for early microcode loading")
 - x86/earlyprintk.txt was added by commit a1aade4788 ("x86/doc:
   mini-howto for using earlyprintk=dbgp")
 - x86/entry_64.txt was added by commit 8b4777a4b5 ("x86-64: Document
   some of entry_64.S")
 - x86/pat.txt was added by commit d27554d874 ("x86: PAT
   documentation")

Moved files
 - arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by
   commit 37b8304642 ("ARM: kuser: move interface documentation out of
   the source code")
 - efi-stub.txt was moved out of x86/ and down into Documentation/ in
   commit 4172fe2f8a ("EFI stub documentation updates")
 - laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit
   efcfed9bad ("Move hp_accel to drivers/platform/x86")
 - commit 5616c23ad9 ("x86: doc: move x86-generic documentation from
   Doc/x86/i386"):
   * x86/usb-legacy-support.txt
   * x86/boot.txt
   * x86/zero_page.txt
 - power/video_extension.txt was moved to acpi in commit 70e66e4df1
   ("ACPI / video: move video_extension.txt to Documentation/acpi")

Removed files (left in 00-INDEX)
 - memory.txt was removed by commit 00ea8990aa ("memory.txt: remove
   stray information")
 - gpio.txt was moved to gpio/ in commit fd8e198cfc ("Documentation:
   gpiolib: document new interface")
 - networking/DLINK.txt was removed by commit 168e06ae26
   ("drivers/net: delete old parallel port de600/de620 drivers")
 - serial/hayes-esp.txt was removed by commit f53a2ade0b ("tty: esp:
   remove broken driver")
 - s390/TAPE was removed by commit 9e280f6693 ("[S390] remove tape
   block docu")
 - vm/locking was removed by commit 57ea8171d2 ("mm: documentation:
   remove hopelessly out-of-date locking doc")
 - laptops/acer-wmi.txt was remvoed by commit 020036678e ("acer-wmi:
   Delete out-of-date documentation")

Typos/misc issues
 - rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit
   030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
   authentication.")
 - commit b88cf73d92 ("net: add missing entries to
   Documentation/networking/00-INDEX")
   * generic-hdlc.txt was added as generic_hdlc.txt
   * spider_net.txt was added as spider-net.txt
 - w1/master/mxc-w1 was added as mxc_w1 by commit a5fd9139f7 ("w1: add
   1-wire master driver for i.MX27 / i.MX31")
 - s390/zfcpdump.txt was added as zfcpdump by commit 6920c12a40
   ("[S390] Add Documentation/s390/00-INDEX.")

Signed-off-by: Henrik Austad <henrik@austad.us>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	[rcu bits]
Acked-by: Rob Landley <rob@landley.net>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Len Brown <len.brown@intel.com>
Cc: James Bottomley <JBottomley@parallels.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:40 -08:00
Paul E. McKenney 0d3c55bc9f Merge branches 'doc.2013.12.03a', 'fixes.2013.12.12a', 'rcutorture.2013.12.03a' and 'sparse.2013.12.12a' into HEAD
doc.2013.12.03a: Topic branch for documentation changes.
fixes.2013.12.12a: Topic branch for miscellaneous fixes.
rcutorture.2013.12.03a: Topic branch for new rcutorture/KVM scripting.
sparse.2013.12.12a: Topic branch for sparse-RCU changes.
2013-12-12 12:35:38 -08:00
Paul E. McKenney 96d3fd0d31 rcu: Break call_rcu() deadlock involving scheduler and perf
Dave Jones got the following lockdep splat:

>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  3.12.0-rc3+ #92 Not tainted
>  -------------------------------------------------------
>  trinity-child2/15191 is trying to acquire lock:
>   (&rdp->nocb_wq){......}, at: [<ffffffff8108ff43>] __wake_up+0x23/0x50
>
> but task is already holding lock:
>   (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&ctx->lock){-.-...}:
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
>         [<ffffffff811500ff>] __perf_event_task_sched_out+0x2df/0x5e0
>         [<ffffffff81091b83>] perf_event_task_sched_out+0x93/0xa0
>         [<ffffffff81732052>] __schedule+0x1d2/0xa20
>         [<ffffffff81732f30>] preempt_schedule_irq+0x50/0xb0
>         [<ffffffff817352b6>] retint_kernel+0x26/0x30
>         [<ffffffff813eed04>] tty_flip_buffer_push+0x34/0x50
>         [<ffffffff813f0504>] pty_write+0x54/0x60
>         [<ffffffff813e900d>] n_tty_write+0x32d/0x4e0
>         [<ffffffff813e5838>] tty_write+0x158/0x2d0
>         [<ffffffff811c4850>] vfs_write+0xc0/0x1f0
>         [<ffffffff811c52cc>] SyS_write+0x4c/0xa0
>         [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
>
> -> #2 (&rq->lock){-.-.-.}:
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
>         [<ffffffff810980b2>] wake_up_new_task+0xc2/0x2e0
>         [<ffffffff81054336>] do_fork+0x126/0x460
>         [<ffffffff81054696>] kernel_thread+0x26/0x30
>         [<ffffffff8171ff93>] rest_init+0x23/0x140
>         [<ffffffff81ee1e4b>] start_kernel+0x3f6/0x403
>         [<ffffffff81ee1571>] x86_64_start_reservations+0x2a/0x2c
>         [<ffffffff81ee1664>] x86_64_start_kernel+0xf1/0xf4
>
> -> #1 (&p->pi_lock){-.-.-.}:
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
>         [<ffffffff810979d1>] try_to_wake_up+0x31/0x350
>         [<ffffffff81097d62>] default_wake_function+0x12/0x20
>         [<ffffffff81084af8>] autoremove_wake_function+0x18/0x40
>         [<ffffffff8108ea38>] __wake_up_common+0x58/0x90
>         [<ffffffff8108ff59>] __wake_up+0x39/0x50
>         [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
>         [<ffffffff81111450>] __call_rcu+0x140/0x820
>         [<ffffffff81111b8d>] call_rcu+0x1d/0x20
>         [<ffffffff81093697>] cpu_attach_domain+0x287/0x360
>         [<ffffffff81099d7e>] build_sched_domains+0xe5e/0x10a0
>         [<ffffffff81efa7fc>] sched_init_smp+0x3b7/0x47a
>         [<ffffffff81ee1f4e>] kernel_init_freeable+0xf6/0x202
>         [<ffffffff817200be>] kernel_init+0xe/0x190
>         [<ffffffff8173d22c>] ret_from_fork+0x7c/0xb0
>
> -> #0 (&rdp->nocb_wq){......}:
>         [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
>         [<ffffffff8108ff43>] __wake_up+0x23/0x50
>         [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
>         [<ffffffff81111450>] __call_rcu+0x140/0x820
>         [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
>         [<ffffffff81149abf>] put_ctx+0x4f/0x70
>         [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
>         [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
>         [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
>         [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
>         [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
>
> other info that might help us debug this:
>
> Chain exists of:
>   &rdp->nocb_wq --> &rq->lock --> &ctx->lock
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(&ctx->lock);
>                                 lock(&rq->lock);
>                                 lock(&ctx->lock);
>    lock(&rdp->nocb_wq);
>
>  *** DEADLOCK ***
>
> 1 lock held by trinity-child2/15191:
>  #0:  (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
>
> stack backtrace:
> CPU: 2 PID: 15191 Comm: trinity-child2 Not tainted 3.12.0-rc3+ #92
>  ffffffff82565b70 ffff880070c2dbf8 ffffffff8172a363 ffffffff824edf40
>  ffff880070c2dc38 ffffffff81726741 ffff880070c2dc90 ffff88022383b1c0
>  ffff88022383aac0 0000000000000000 ffff88022383b188 ffff88022383b1c0
> Call Trace:
>  [<ffffffff8172a363>] dump_stack+0x4e/0x82
>  [<ffffffff81726741>] print_circular_bug+0x200/0x20f
>  [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
>  [<ffffffff810c6439>] ? get_lock_stats+0x19/0x60
>  [<ffffffff8100b2f4>] ? native_sched_clock+0x24/0x80
>  [<ffffffff810cc243>] lock_acquire+0x93/0x200
>  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
>  [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
>  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
>  [<ffffffff8108ff43>] __wake_up+0x23/0x50
>  [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
>  [<ffffffff81111450>] __call_rcu+0x140/0x820
>  [<ffffffff8109bc8f>] ? local_clock+0x3f/0x50
>  [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
>  [<ffffffff81149abf>] put_ctx+0x4f/0x70
>  [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
>  [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
>  [<ffffffff810c9af5>] ? trace_hardirqs_on_caller+0x115/0x1e0
>  [<ffffffff810c9bcd>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
>  [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
>  [<ffffffff8173d4e4>] tracesys+0xdd/0xe2

The underlying problem is that perf is invoking call_rcu() with the
scheduler locks held, but in NOCB mode, call_rcu() will with high
probability invoke the scheduler -- which just might want to use its
locks.  The reason that call_rcu() needs to invoke the scheduler is
to wake up the corresponding rcuo callback-offload kthread, which
does the job of starting up a grace period and invoking the callbacks
afterwards.

One solution (championed on a related problem by Lai Jiangshan) is to
simply defer the wakeup to some point where scheduler locks are no longer
held.  Since we don't want to unnecessarily incur the cost of such
deferral, the task before us is threefold:

1.	Determine when it is likely that a relevant scheduler lock is held.

2.	Defer the wakeup in such cases.

3.	Ensure that all deferred wakeups eventually happen, preferably
	sooner rather than later.

We use irqs_disabled_flags() as a proxy for relevant scheduler locks
being held.  This works because the relevant locks are always acquired
with interrupts disabled.  We may defer more often than needed, but that
is at least safe.

The wakeup deferral is tracked via a new field in the per-CPU and
per-RCU-flavor rcu_data structure, namely ->nocb_defer_wakeup.

This flag is checked by the RCU core processing.  The __rcu_pending()
function now checks this flag, which causes rcu_check_callbacks()
to initiate RCU core processing at each scheduling-clock interrupt
where this flag is set.  Of course this is not sufficient because
scheduling-clock interrupts are often turned off (the things we used to
be able to count on!).  So the flags are also checked on entry to any
state that RCU considers to be idle, which includes both NO_HZ_IDLE idle
state and NO_HZ_FULL user-mode-execution state.

This approach should allow call_rcu() to be invoked regardless of what
locks you might be holding, the key word being "should".

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2013-12-03 10:10:18 -08:00
Paul E. McKenney c9592ecb98 rcu: Fix typo in Documentation/RCU/trace.txt
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-03 10:08:56 -08:00
Michael Opdenacker 4b0d3f0fde rcu: Fix occurrence of "the the" in checklist.txt
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Add "then" as suggested by Josh Triplett. ]
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-09-25 10:07:02 -07:00
Paul E. McKenney 64d3b7a1d5 rcu: Update stall-warning documentation
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-09-25 06:49:37 -07:00
Paul E. McKenney 25f27ce4a6 Merge branches 'doc.2013.08.19a', 'fixes.2013.08.20a', 'sysidle.2013.08.31a' and 'torture.2013.08.20a' into HEAD
doc.2013.08.19a: Documentation updates
fixes.2013.08.20a: Miscellaneous fixes
sysidle.2013.08.31a: Detect system-wide idle state.
torture.2013.08.20a: rcutorture updates.
2013-08-31 14:44:45 -07:00
Paul E. McKenney 2ec1f2d987 rcu: Increase rcutorture test coverage
Currently, rcutorture has separate torture_types to test synchronous,
asynchronous, and expedited grace-period primitives.  This has
two disadvantages: (1) Three times the number of runs to cover the
combinations and (2) Little testing of concurrent combinations of the
three options.  This commit therefore adds a pair of module parameters
that control normal and expedited state, with the default being both
types, randomly selected, by the fakewriter processes, thus reducing
source-code size and increasing test coverage.  In addtion, the writer
task switches between asynchronous-normal and expedited grace-period
primitives driven by the same pair of module parameters.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-20 11:38:41 -07:00
Paul E. McKenney 6ae3771850 rcu: Update RTFP documentation
Note that this commit also updates the formatting of serveral of the
bibtex entries to conform to that of my .bib files.  I started
accumulating entries back in the 1980s, back when bibtex insisted
that comma (",") was a separator, not a terminator.  This rule forced
commas to the fronts of lines.  25 years later, bibtex allows commas
to be terminators, but I am too lazy to rework all my .bib files.
Keeping the same format as my .bib files allows my to simply
incorporate my RCU.bib file into Documentation/RCU/RTFP.txt, which
is much easier than my earlier practice of keeping track of what
had changed and adding individual entries.  (I sometimes find relevant
papers that were published some years back, for example.)

In addition, this change adds entries for papers published in the
last year or so.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-19 21:39:27 -07:00
Paul E. McKenney d84297c99b rcu: Fix rcu_barrier() documentation
There was a time when rcu_barrier() was guaranteed to wait for at least
a grace period, but that time ended due to energy-efficiency concerns.
So now rcu_barrier() is a no-op if there are no RCU callbacks queued in
the system.  This commit updates the documentation to reflect this change.

Now, rcu_barrier() often does wait for a grace period, so, one could
imagine some modification to rcu_barrier() to more efficiently handle
cases where both rcu_barrier() and a grace period are needed.  But this
must wait until someone shows a real-world need for a change.

Reported-by: Bob Copeland <bob@cozybit.com>
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-18 17:05:32 -07:00
Paul E. McKenney be77f87c00 Merge branches 'cbnum.2013.06.10a', 'doc.2013.06.10a', 'fixes.2013.06.10a', 'srcu.2013.06.10a' and 'tiny.2013.06.10a' into HEAD
cbnum.2013.06.10a: Apply simplifications stemming from the new callback
	numbering.

doc.2013.06.10a: Documentation updates.

fixes.2013.06.10a: Miscellaneous fixes.

srcu.2013.06.10a: Updates to SRCU.

tiny.2013.06.10a: Eliminate TINY_PREEMPT_RCU.
2013-06-10 13:46:44 -07:00
Paul E. McKenney 7807acdb6b rcu: Remove TINY_PREEMPT_RCU tracing documentation
Because TINY_PREEMPT_RCU is no more, this commit removes its tracing
formats from the documentation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10 13:45:52 -07:00
Paul E. McKenney 99f88919f8 rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw().
These interfaces never did get used, so this commit removes them,
their rcutorture tests, and documentation referencing them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-06-10 13:45:25 -07:00
Frederic Weisbecker c032862fba Merge commit '8700c95adb03' into timers/nohz
The full dynticks tree needs the latest RCU and sched
upstream updates in order to fix some dependencies.

Merge a common upstream merge point that has these
updates.

Conflicts:
	include/linux/perf_event.h
	kernel/rcutree.h
	kernel/rcutree_plugin.h

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2013-05-02 17:54:19 +02:00
Frederic Weisbecker 3451d0243c nohz: Rename CONFIG_NO_HZ to CONFIG_NO_HZ_COMMON
We are planning to convert the dynticks Kconfig options layout
into a choice menu. The user must be able to easily pick
any of the following implementations: constant periodic tick,
idle dynticks, full dynticks.

As this implies a mutual exclusion, the two dynticks implementions
need to converge on the selection of a common Kconfig option in order
to ease the sharing of a common infrastructure.

It would thus seem pretty natural to reuse CONFIG_NO_HZ to
that end. It already implements all the idle dynticks code
and the full dynticks depends on all that code for now.
So ideally the choice menu would propose CONFIG_NO_HZ_IDLE and
CONFIG_NO_HZ_EXTENDED then both would select CONFIG_NO_HZ.

On the other hand we want to stay backward compatible: if
CONFIG_NO_HZ is set in an older config file, we want to
enable CONFIG_NO_HZ_IDLE by default.

But we can't afford both at the same time or we run into
a circular dependency:

1) CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED both select
   CONFIG_NO_HZ
2) If CONFIG_NO_HZ is set, we default to CONFIG_NO_HZ_IDLE

We might be able to support that from Kconfig/Kbuild but it
may not be wise to introduce such a confusing behaviour.

So to solve this, create a new CONFIG_NO_HZ_COMMON option
which gathers the common code between idle and full dynticks
(that common code for now is simply the idle dynticks code)
and select it from their referring Kconfig.

Then we'll later create CONFIG_NO_HZ_IDLE and map CONFIG_NO_HZ
to it for backward compatibility.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-03 13:56:03 +02:00
Paul E. McKenney 6d87669357 Merge branches 'doc.2013.03.12a', 'fixes.2013.03.13a' and 'idlenocb.2013.03.26b' into HEAD
doc.2013.03.12a: Documentation changes.

fixes.2013.03.13a: Miscellaneous fixes.

idlenocb.2013.03.26b: Remove restrictions on no-CBs CPUs, make
	RCU_FAST_NO_HZ take advantage of numbered callbacks, add
	callback acceleration based on numbered callbacks.
2013-03-26 08:07:38 -07:00
Paul E. McKenney 6231069bda rcu: Add softirq-stall indications to stall-warning messages
If RCU's softirq handler is prevented from executing, an RCU CPU stall
warning can result.  Ways to prevent RCU's softirq handler from executing
include: (1) CPU spinning with interrupts disabled, (2) infinite loop
in some softirq handler, and (3) in -rt kernels, an infinite loop in a
set of real-time threads running at priorities higher than that of RCU's
softirq handler.

Because this situation can be difficult to track down, this commit causes
the count of RCU softirq handler invocations to be printed with RCU
CPU stall warnings.  This information does require some interpretation,
as now documented in Documentation/RCU/stallwarn.txt.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-03-13 14:43:56 -07:00
Paul E. McKenney 3f944adb9d rcu: Documentation update
This commit applies a few updates based on a quick review of the RCU
documentations.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-12 14:09:02 -07:00
Paul E. McKenney 4357fb570b rcu: Make bugginess of code sample more evident
One of the code samples in whatisRCU.txt shows a bug, but someone scanning
the document quickly might mistake it for a valid use of RCU.  Add some
screaming comments to help keep speed-readers on track.

Reported-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-03-12 14:09:01 -07:00
Paul E. McKenney aac1cda34b Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD
urgent.2012.10.27a: Fix for RCU user-mode transition (already in -tip).

doc.2012.11.08a: Documentation updates, most notably codifying the
	memory-barrier guarantees inherent to grace periods.

fixes.2012.11.13a: Miscellaneous fixes.

srcu.2012.10.27a: Allow statically allocated and initialized srcu_struct
	structures (courtesy of Lai Jiangshan).

stall.2012.11.13a: Add more diagnostic information to RCU CPU stall
	warnings, also decrease from 60 seconds to 21 seconds.

hotplug.2012.11.08a: Minor updates to CPU hotplug handling.

tracing.2012.11.08a: Improved debugfs tracing, courtesy of Michael Wang.

idle.2012.10.24a: Updates to RCU idle/adaptive-idle handling, including
	a boot parameter that maps normal grace periods to expedited.

Resolved conflict in kernel/rcutree.c due to side-by-side change.
2012-11-16 09:59:58 -08:00
Paul E. McKenney d484a21513 rcu: Add documentation for the new rcuexp debugfs trace file
This commit adds the documentation of the rcuexp debugfs trace file
that records statistics for expedited grace periods.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-11-16 09:55:31 -08:00
Paul E. McKenney 40e80c469f rcu: Update documentation for TREE_RCU debugfs tracing
This commit updates the tracing documentation to reflect the new
format that has per-RCU-flavor directories.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-11-16 09:54:02 -08:00
Paul E. McKenney bb08f76d84 rcu: Remove list_for_each_continue_rcu()
The list_for_each_continue_rcu() macro is no longer used, so this commit
removes it.  The list_for_each_entry_continue_rcu() macro should be
used instead.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-11-13 14:08:21 -08:00
Paul E. McKenney a4d611fdca rcu: Document alternative RCU/reference-count algorithms
The approach for mixing RCU and reference counting listed in the RCU
documentation only describes one possible approach.  This approach can
result in failure on the read side, which is nice if you want fresh data,
but not so good if you want simple code.  This commit therefore adds
two additional approaches that feature unconditional reference-count
acquisition by RCU readers.  These approaches are very similar to that
used in the security code.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-11-08 11:44:38 -08:00
Kees Cook 57d34a6cee rcu: Update docs to include kfree_rcu()
Mention kfree_rcu() in the call_rcu() section.  Additionally fix the
example code for list replacement that used the wrong structure element.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-11-08 11:44:25 -08:00
Dhaval Giani 0f9574d832 rcu: Correct the name of a reference in list of RCU papers
Trying to go through the history of RCU (not for the weak
minded) led me to search for a non-existent paper.

Correct it to the actual reference

Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:44:47 -07:00
Paul E. McKenney bda4ec9f6a Merge branches 'bigrt.2012.09.23a', 'doctorture.2012.09.23a', 'fixes.2012.09.23a', 'hotplug.2012.09.23a' and 'idlechop.2012.09.23a' into HEAD
bigrt.2012.09.23a contains additional commits to reduce scheduling latency
	from RCU on huge systems (many hundrends or thousands of CPUs).

doctorture.2012.09.23a contains documentation changes and rcutorture fixes.

fixes.2012.09.23a contains miscellaneous fixes.

hotplug.2012.09.23a contains CPU-hotplug-related changes.

idle.2012.09.23a fixes architectures for which RCU no longer considered
	the idle loop to be a quiescent state due to earlier
	adaptive-dynticks changes.  Affected architectures are alpha,
	cris, frv, h8300, m32r, m68k, mn10300, parisc, score, xtensa,
	and ia64.
2012-09-24 20:02:22 -07:00
Paul E. McKenney 86f343b50b rcu: Fix CONFIG_RCU_FAST_NO_HZ stall warning message
The print_cpu_stall_fast_no_hz() function attempts to print -1 when
the ->idle_gp_timer is not pending, but unsigned arithmetic causes it
to instead print ULONG_MAX, which is 4294967295 on 32-bit systems and
18446744073709551615 on 64-bit systems.  Neither of these are the most
reader-friendly values, so this commit instead causes "timer not pending"
to be printed when ->idle_gp_timer is not pending.

Reported-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-09-23 07:42:52 -07:00
Paul E. McKenney 2aef619c75 rcu: Document SRCU dead-CPU capabilities, emphasize read-side limits
The current documentation did not help someone grepping for SRCU to
learn that disabling preemption is not a replacement for srcu_read_lock(),
so upgrade the documentation to bring this out, not just for SRCU,
but also for RCU-bh.  Also document the fact that SRCU readers are
respected on CPUs executing in user mode, idle CPUs, and even on
offline CPUs.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2012-09-23 07:42:23 -07:00
Paul E. McKenney 4605c0143c rcu: Adjust debugfs tracing for kthread-based quiescent-state forcing
Moving quiescent-state forcing into a kthread dispenses with the need
for the ->n_rp_need_fqs field, so this commit removes it.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:41:54 -07:00
Paul E. McKenney 74d874e7bd rcu: Update documentation to cover call_srcu() and srcu_barrier().
The advent of call_srcu() and srcu_barrier() obsoleted some of the
documentation, so this commit brings that up to date.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-07-02 12:34:03 -07:00
Paul E. McKenney fae4b54f28 rcu: Introduce rcutorture testing for rcu_barrier()
Although rcutorture does invoke rcu_barrier() and friends, it cannot
really be called a torture test given that it invokes them only once
at the end of the test.  This commit therefore introduces heavy-duty
rcutorture testing for rcu_barrier(), which may be carried out
concurrently with normal rcutorture testing.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-04-30 10:48:18 -07:00
Paul E. McKenney 236fefafe5 rcu: Call out dangers of expedited RCU primitives
The expedited RCU primitives can be quite useful, but they have some
high costs as well.  This commit updates and creates docbook comments
calling out the costs, and updates the RCU documentation as well.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:06:08 -08:00
Paul E. McKenney 2036d94a7b rcu: Rework detection of use of RCU by offline CPUs
Because newly offlined CPUs continue executing after completing the
CPU_DYING notifiers, they legitimately enter the scheduler and use
RCU while appearing to be offline.  This calls for a more sophisticated
approach as follows:

1.	RCU marks the CPU online during the CPU_UP_PREPARE phase.

2.	RCU marks the CPU offline during the CPU_DEAD phase.

3.	Diagnostics regarding use of read-side RCU by offline CPUs use
	RCU's accounting rather than the cpu_online_map.  (Note that
	__call_rcu() still uses cpu_online_map to detect illegal
	invocations within CPU_DYING notifiers.)

4.	Offline CPUs are prevented from hanging the system by
	force_quiescent_state(), which pays attention to cpu_online_map.
	Some additional work (in a later commit) will be needed to
	guarantee that force_quiescent_state() waits a full jiffy before
	assuming that a CPU is offline, for example, when called from
	idle entry.  (This commit also makes the one-jiffy wait
	explicit, since the old-style implicit wait can now be defeated
	by RCU_FAST_NO_HZ and by rcutorture.)

This approach avoids the false positives encountered when attempting to
use more exact classification of CPU online/offline state.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:06:07 -08:00
Paul E. McKenney 24cd7fd0ea rcu: Update stall-warning documentation
Add documentation of CONFIG_RCU_CPU_STALL_VERBOSE, CONFIG_RCU_CPU_STALL_INFO,
and RCU_STALL_DELAY_DELTA.  Describe multiple stall-warning messages from
a single stall, and the timing of the subsequent messages.  Add headings.
Remove RCU_SECONDS_TILL_STALL_RECHECK because this value is now computed
at runtime from RCU_CPU_STALL_TIMEOUT, so that sysfs changes to the timeout
value now directly affect the RCU_SECONDS_TILL_STALL_RECHECK value.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:03:52 -08:00
Paul E. McKenney c13f3757d0 rcu: Add CPU-stall capability to rcutorture
Add module parameters to rcutorture that induce a CPU stall.
The stall_cpu parameter specifies how long to stall in seconds,
defaulting to zero, which indicates no stalling is to be undertaken.
The stall_cpu_holdoff parameter specifies how many seconds after
insmod (or boot, if rcutorture is built into the kernel) that this
stall is to start.  The default value for stall_cpu_holdoff is ten
seconds.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:03:52 -08:00
Paul E. McKenney 105617da8d rcu: Make documentation give more realistic rcutorture duration
The torture.txt documentation gives an example rcutorture run with a
100-second duration.  This is ridiculously short, unless maybe testing
a fix for a egregious bug.  Use a more-realistic one-hour duration for
the example.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:03:51 -08:00
Paul E. McKenney 9b9ec9b90e rcutorture: Permit holding off CPU-hotplug operations during boot
When rcutorture is started automatically at boot time, it might well
also start CPU-hotplug operations at that time, which might not be
desirable.  This commit therefore adds an rcutorture parameter that
allows CPU-hotplug operations to be held off for the specified number
of seconds after the start of boot.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:03:50 -08:00
Paul E. McKenney 4c62abc90b rcu: Bring RTFP.txt up to date.
Add publications from 2010 and 2011 to RTFP.txt.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-02-21 09:03:21 -08:00
Kees Cook d493011a37 docs: Additional LWN links to RCU API
Tyler Hicks pointed me at an additional article on RCU and I figured
it should probably be mentioned with the others.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:32:23 -08:00
Paul E. McKenney b58bdccaa8 rcu: Add rcutorture CPU-hotplug capability
Running CPU-hotplug operations concurrently with rcutorture has
historically been a good way to find bugs in both RCU and CPU hotplug.
This commit therefore adds an rcutorture module parameter called
"onoff_interval" that causes a randomly selected CPU-hotplug operation to
be executed at the specified interval, in seconds.  The default value of
"onoff_interval" is zero, which disables rcutorture-instigated CPU-hotplug
operations.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:56 -08:00
Paul E. McKenney d5f546d834 rcu: Add rcutorture system-shutdown capability
Although it is easy to run rcutorture tests under KVM, there is currently
no nice way to run such a test for a fixed time period, collect all of
the rcutorture data, and then shut the system down cleanly.  This commit
therefore adds an rcutorture module parameter named "shutdown_secs" that
specified the run duration in seconds, after which rcutorture terminates
the test and powers the system down.  The default value for "shutdown_secs"
is zero, which disables shutdown.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:46 -08:00
Paul E. McKenney 9ceae0e248 rcu: Add documentation for raw SRCU read-side primitives
Update various files in Documentation/RCU to reflect srcu_read_lock_raw()
and srcu_read_unlock_raw().  Credit to Peter Zijlstra for suggesting
use of the existing _raw suffix instead of the earlier bulkref names.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:41 -08:00
Paul E. McKenney 2c01531f08 rcu: Document failing tick as cause of RCU CPU stall warning
One of lclaudio's systems was seeing RCU CPU stall warnings from idle.
These turned out to be caused by a bug that stopped scheduling-clock
tick interrupts from being sent to a given CPU for several hundred seconds.
This commit therefore updates the documentation to call this out as a
possible cause for RCU CPU stall warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:26 -08:00
Paul E. McKenney 9b2e4f1880 rcu: Track idleness independent of idle tasks
Earlier versions of RCU used the scheduling-clock tick to detect idleness
by checking for the idle task, but handled idleness differently for
CONFIG_NO_HZ=y.  But there are now a number of uses of RCU read-side
critical sections in the idle task, for example, for tracing.  A more
fine-grained detection of idleness is therefore required.

This commit presses the old dyntick-idle code into full-time service,
so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
always invoked at the beginning of an idle loop iteration.  Similarly,
rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
at the end of an idle-loop iteration.  This allows the idle task to
use RCU everywhere except between consecutive rcu_idle_enter() and
rcu_idle_exit() calls, in turn allowing architecture maintainers to
specify exactly where in the idle loop that RCU may be used.

Because some of the userspace upcall uses can result in what looks
to RCU like half of an interrupt, it is not possible to expect that
the irq_enter() and irq_exit() hooks will give exact counts.  This
patch therefore expands the ->dynticks_nesting counter to 64 bits
and uses two separate bitfields to count process/idle transitions
and interrupt entry/exit transitions.  It is presumed that userspace
upcalls do not happen in the idle loop or from usermode execution
(though usermode might do a system call that results in an upcall).
The counter is hard-reset on each process/idle transition, which
avoids the interrupt entry/exit error from accumulating.  Overflow
is avoided by the 64-bitness of the ->dyntick_nesting counter.

This commit also adds warnings if a non-idle task asks RCU to enter
idle state (and these checks will need some adjustment before applying
Frederic's OS-jitter patches (http://lkml.org/lkml/2011/10/7/246).
In addition, validation of ->dynticks and ->dynticks_nesting is added.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:24 -08:00
Paul E. McKenney d7bd2d68aa rcu: Document interpretation of RCU-lockdep splats
There has been quite a bit of confusion about what RCU-lockdep splats
mean, so this commit adds some documentation describing how to
interpret them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:28 -07:00
Paul E. McKenney 8cd889cbb6 rcu: Update documentation for additional RCU lockdep functions
Add documentation for rcu_dereference_bh_check(),
rcu_dereference_sched_check(), srcu_dereference_check(), and
rcu_dereference_index_check().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:25 -07:00
Michal Hocko e5177ec77d rcu: Not necessary to pass rcu_read_lock_held() to rcu_dereference_protected()
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check() use rcu_read_lock_held() as a part of condition
automatically.  Therefore, callers of rcu_dereference_check() no longer
need to pass rcu_read_lock_held() to rcu_dereference_check().

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:23 -07:00
Paul E. McKenney e4cc1f22b2 rcu: Simplify quiescent-state accounting
There is often a delay between the time that a CPU passes through a
quiescent state and the time that this quiescent state is reported to the
RCU core.  It is quite possible that the grace period ended before the
quiescent state could be reported, for example, some other CPU might have
deduced that this CPU passed through dyntick-idle mode.  It is critically
important that quiescent state be counted only against the grace period
that was in effect at the time that the quiescent state was detected.

Previously, this was handled by recording the number of the last grace
period to complete when passing through a quiescent state.  The RCU
core then checks this number against the current value, and rejects
the quiescent state if there is a mismatch.  However, one additional
possibility must be accounted for, namely that the quiescent state was
recorded after the prior grace period completed but before the current
grace period started.  In this case, the RCU core must reject the
quiescent state, but the recorded number will match.  This is handled
when the CPU becomes aware of a new grace period -- at that point,
it invalidates any prior quiescent state.

This works, but is a bit indirect.  The new approach records the current
grace period, and the RCU core checks to see (1) that this is still the
current grace period and (2) that this grace period has not yet ended.
This approach simplifies reasoning about correctness, and this commit
changes over to this new approach.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:38:22 -07:00
Paul E. McKenney b15a2e7d16 rcu: Fix RCU's NMI documentation
It has long been the case that the architecture must call nmi_enter()
and nmi_exit() rather than irq_enter() and irq_exit() in order to
permit RCU read-side critical sections in NMIs.  Catch the documentation
up with reality.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2011-09-28 21:36:44 -07:00
Paul E. McKenney bdf2a43649 rcu: Catch rcutorture up to new RCU API additions
Now that the RCU API contains synchronize_rcu_bh(), synchronize_sched(),
call_rcu_sched(), and rcu_bh_expedited()...

Make rcutorture test synchronize_rcu_bh(), getting rid of the old
rcu_bh_torture_synchronize() workaround.  Similarly, make rcutorture test
synchronize_sched(), getting rid of the old sched_torture_synchronize()
workaround.  Make rcutorture test call_rcu_sched() instead of wrappering
synchronize_sched().  Also add testing of rcu_bh_expedited().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:36:43 -07:00
Paul E. McKenney 63cd758e07 rcu: Update rcutorture documentation
Update rcutorture documentation to account for boosting, new types of
RCU torture testing that have been added over the past few years, and
the memory-barrier testing that was added an embarrassingly long time
ago.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:36:39 -07:00
Paul E. McKenney d5988af531 rcu: Update documentation to flag RCU_BOOST trace information
Call out the RCU_TRACE information that is provided only in kernels
built with RCU_BOOST.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28 21:36:35 -07:00
Wanlong Gao 25eb650a69 doc: fix wrong arch/i386 references
Change all "arch/i386" to "arch/x86" in Documentaion/,
since the directory has changed.

Also update the files which have changed their filename
in the meantime accordingly.

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
[jkosina@suse.cz: reword changelog]
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-13 13:43:05 +02:00
Paul E. McKenney 23b5c8fa01 rcu: Decrease memory-barrier usage based on semi-formal proof
(Note: this was reverted, and is now being re-applied in pieces, with
this being the fifth and final piece.  See below for the reason that
it is now felt to be safe to re-apply this.)

Commit d09b62d fixed grace-period synchronization, but left some smp_mb()
invocations in rcu_process_callbacks() that are no longer needed, but
sheer paranoia prevented them from being removed.  This commit removes
them and provides a proof of correctness in their absence.  It also adds
a memory barrier to rcu_report_qs_rsp() immediately before the update to
rsp->completed in order to handle the theoretical possibility that the
compiler or CPU might move massive quantities of code into a lock-based
critical section.  This also proves that the sheer paranoia was not
entirely unjustified, at least from a theoretical point of view.

In addition, the old dyntick-idle synchronization depended on the fact
that grace periods were many milliseconds in duration, so that it could
be assumed that no dyntick-idle CPU could reorder a memory reference
across an entire grace period.  Unfortunately for this design, the
addition of expedited grace periods breaks this assumption, which has
the unfortunate side-effect of requiring atomic operations in the
functions that track dyntick-idle state for RCU.  (There is some hope
that the algorithms used in user-level RCU might be applied here, but
some work is required to handle the NMIs that user-space applications
can happily ignore.  For the short term, better safe than sorry.)

This proof assumes that neither compiler nor CPU will allow a lock
acquisition and release to be reordered, as doing so can result in
deadlock.  The proof is as follows:

1.	A given CPU declares a quiescent state under the protection of
	its leaf rcu_node's lock.

2.	If there is more than one level of rcu_node hierarchy, the
	last CPU to declare a quiescent state will also acquire the
	->lock of the next rcu_node up in the hierarchy,  but only
	after releasing the lower level's lock.  The acquisition of this
	lock clearly cannot occur prior to the acquisition of the leaf
	node's lock.

3.	Step 2 repeats until we reach the root rcu_node structure.
	Please note again that only one lock is held at a time through
	this process.  The acquisition of the root rcu_node's ->lock
	must occur after the release of that of the leaf rcu_node.

4.	At this point, we set the ->completed field in the rcu_state
	structure in rcu_report_qs_rsp().  However, if the rcu_node
	hierarchy contains only one rcu_node, then in theory the code
	preceding the quiescent state could leak into the critical
	section.  We therefore precede the update of ->completed with a
	memory barrier.  All CPUs will therefore agree that any updates
	preceding any report of a quiescent state will have happened
	before the update of ->completed.

5.	Regardless of whether a new grace period is needed, rcu_start_gp()
	will propagate the new value of ->completed to all of the leaf
	rcu_node structures, under the protection of each rcu_node's ->lock.
	If a new grace period is needed immediately, this propagation
	will occur in the same critical section that ->completed was
	set in, but courtesy of the memory barrier in #4 above, is still
	seen to follow any pre-quiescent-state activity.

6.	When a given CPU invokes __rcu_process_gp_end(), it becomes
	aware of the end of the old grace period and therefore makes
	any RCU callbacks that were waiting on that grace period eligible
	for invocation.

	If this CPU is the same one that detected the end of the grace
	period, and if there is but a single rcu_node in the hierarchy,
	we will still be in the single critical section.  In this case,
	the memory barrier in step #4 guarantees that all callbacks will
	be seen to execute after each CPU's quiescent state.

	On the other hand, if this is a different CPU, it will acquire
	the leaf rcu_node's ->lock, and will again be serialized after
	each CPU's quiescent state for the old grace period.

On the strength of this proof, this commit therefore removes the memory
barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp().
The effect is to reduce the number of memory barriers by one and to
reduce the frequency of execution from about once per scheduling tick
per CPU to once per grace period.

This was reverted do to hangs found during testing by Yinghai Lu and
Ingo Molnar.  Frederic Weisbecker supplied Yinghai with tracing that
located the underlying problem, and Frederic also provided the fix.

The underlying problem was that the HARDIRQ_ENTER() macro from
lib/locking-selftest.c invoked irq_enter(), which in turn invokes
rcu_irq_enter(), but HARDIRQ_EXIT() invoked __irq_exit(), which
does not invoke rcu_irq_exit().  This situation resulted in calls
to rcu_irq_enter() that were not balanced by the required calls to
rcu_irq_exit().  Therefore, after these locking selftests completed,
RCU's dyntick-idle nesting count was a large number (for example,
72), which caused RCU to to conclude that the affected CPU was not in
dyntick-idle mode when in fact it was.

RCU would therefore incorrectly wait for this dyntick-idle CPU, resulting
in hangs.

In contrast, with Frederic's patch, which replaces the irq_enter()
in HARDIRQ_ENTER() with an __irq_enter(), these tests don't ever call
either rcu_irq_enter() or rcu_irq_exit(), which works because the CPU
running the test is already marked as not being in dyntick-idle mode.
This means that the rcu_irq_enter() and rcu_irq_exit() calls and RCU
then has no problem working out which CPUs are in dyntick-idle mode and
which are not.

The reason that the imbalance was not noticed before the barrier patch
was applied is that the old implementation of rcu_enter_nohz() ignored
the nesting depth.  This could still result in delays, but much shorter
ones.  Whenever there was a delay, RCU would IPI the CPU with the
unbalanced nesting level, which would eventually result in rcu_enter_nohz()
being called, which in turn would force RCU to see that the CPU was in
dyntick-idle mode.

The reason that very few people noticed the problem is that the mismatched
irq_enter() vs. __irq_exit() occured only when the kernel was built with
CONFIG_DEBUG_LOCKING_API_SELFTESTS.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-26 09:42:23 -07:00
Paul E. McKenney 80d02085d9 Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
This reverts commit e59fb3120b.

This reversion was due to (extreme) boot-time slowdowns on SPARC seen by
Yinghai Lu and on x86 by Ingo
.
This is a non-trivial reversion due to intervening commits.

Conflicts:

	Documentation/RCU/trace.txt
	kernel/rcutree.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-19 23:25:29 +02:00
Paul E. McKenney 5ece5bab3e rcu: Add forward-progress diagnostic for per-CPU kthreads
Increment a per-CPU counter on each pass through rcu_cpu_kthread()'s
service loop, and add it to the rcudata trace output.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:57 -07:00
Paul E. McKenney 15ba0ba860 rcu: add grace-period age and more kthread state to tracing
This commit adds the age in jiffies of the current grace period along
with the duration in jiffies of the longest grace period since boot
to the rcu/rcugp debugfs file.  It also adds an additional "O" state
to kthread tracing to differentiate between the kthread waiting due to
having nothing to do on the one hand and waiting due to being on the
wrong CPU on the other hand.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-05-05 23:16:56 -07:00
Paul E. McKenney 90e6ac3657 rcu: update tracing documentation for new rcutorture and rcuboost
This commit documents the new debugfs rcu/rcutorture and rcu/rcuboost
trace files.  The description has been updated as suggested by Josh
Triplett.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-05-05 23:16:56 -07:00
Paul E. McKenney 0ac3d136b2 rcu: add callback-queue information to rcudata output
This commit adds an indication of the state of the callback queue using
a string of four characters following the "ql=" integer queue length.
The first character is "N" if there are callbacks that have been
queued that are not yet ready to be handled by the next grace period, or
"." otherwise.  The second character is "R" if there are callbacks queued
that are ready to be handled by the next grace period, or "." otherwise.
The third character is "W" if there are callbacks waiting for the current
grace period, or "." otherwise.  Finally, the fourth character is "D"
if there are callbacks that have been handled by a prior grace period
and are waiting to be invoked, or ".".

Note that callbacks that are in the process of being invoked are
not shown.  These callbacks would have been removed from the rcu_data
structure's list by rcu_do_batch() prior to being executed.  (These
callbacks are also not reflected in the "ql=" total, FWIW.)

Also, document the new callback-queue trace information.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:56 -07:00
Paul E. McKenney 2fa218d8bb rcu: Update RCU's trace.txt documentation for new format
The trace.txt file had obsolete output for the debugfs rcu/rcudata
file, so update it.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:56 -07:00
Paul E. McKenney 12f5f524ca rcu: merge TREE_PREEPT_RCU blocked_tasks[] lists
Combine the current TREE_PREEMPT_RCU ->blocked_tasks[] lists in the
rcu_node structure into a single ->blkd_tasks list with ->gp_tasks
and ->exp_tasks tail pointers.  This is in preparation for RCU priority
boosting, which will add a third dimension to the combinatorial explosion
in the ->blocked_tasks[] case, but simply a third pointer in the new
->blkd_tasks case.

Also update documentation to reflect blocked_tasks[] merge

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:54 -07:00
Paul E. McKenney e59fb3120b rcu: Decrease memory-barrier usage based on semi-formal proof
Commit d09b62d fixed grace-period synchronization, but left some smp_mb()
invocations in rcu_process_callbacks() that are no longer needed, but
sheer paranoia prevented them from being removed.  This commit removes
them and provides a proof of correctness in their absence.  It also adds
a memory barrier to rcu_report_qs_rsp() immediately before the update to
rsp->completed in order to handle the theoretical possibility that the
compiler or CPU might move massive quantities of code into a lock-based
critical section.  This also proves that the sheer paranoia was not
entirely unjustified, at least from a theoretical point of view.

In addition, the old dyntick-idle synchronization depended on the fact
that grace periods were many milliseconds in duration, so that it could
be assumed that no dyntick-idle CPU could reorder a memory reference
across an entire grace period.  Unfortunately for this design, the
addition of expedited grace periods breaks this assumption, which has
the unfortunate side-effect of requiring atomic operations in the
functions that track dyntick-idle state for RCU.  (There is some hope
that the algorithms used in user-level RCU might be applied here, but
some work is required to handle the NMIs that user-space applications
can happily ignore.  For the short term, better safe than sorry.)

This proof assumes that neither compiler nor CPU will allow a lock
acquisition and release to be reordered, as doing so can result in
deadlock.  The proof is as follows:

1.	A given CPU declares a quiescent state under the protection of
	its leaf rcu_node's lock.

2.	If there is more than one level of rcu_node hierarchy, the
	last CPU to declare a quiescent state will also acquire the
	->lock of the next rcu_node up in the hierarchy,  but only
	after releasing the lower level's lock.  The acquisition of this
	lock clearly cannot occur prior to the acquisition of the leaf
	node's lock.

3.	Step 2 repeats until we reach the root rcu_node structure.
	Please note again that only one lock is held at a time through
	this process.  The acquisition of the root rcu_node's ->lock
	must occur after the release of that of the leaf rcu_node.

4.	At this point, we set the ->completed field in the rcu_state
	structure in rcu_report_qs_rsp().  However, if the rcu_node
	hierarchy contains only one rcu_node, then in theory the code
	preceding the quiescent state could leak into the critical
	section.  We therefore precede the update of ->completed with a
	memory barrier.  All CPUs will therefore agree that any updates
	preceding any report of a quiescent state will have happened
	before the update of ->completed.

5.	Regardless of whether a new grace period is needed, rcu_start_gp()
	will propagate the new value of ->completed to all of the leaf
	rcu_node structures, under the protection of each rcu_node's ->lock.
	If a new grace period is needed immediately, this propagation
	will occur in the same critical section that ->completed was
	set in, but courtesy of the memory barrier in #4 above, is still
	seen to follow any pre-quiescent-state activity.

6.	When a given CPU invokes __rcu_process_gp_end(), it becomes
	aware of the end of the old grace period and therefore makes
	any RCU callbacks that were waiting on that grace period eligible
	for invocation.

	If this CPU is the same one that detected the end of the grace
	period, and if there is but a single rcu_node in the hierarchy,
	we will still be in the single critical section.  In this case,
	the memory barrier in step #4 guarantees that all callbacks will
	be seen to execute after each CPU's quiescent state.

	On the other hand, if this is a different CPU, it will acquire
	the leaf rcu_node's ->lock, and will again be serialized after
	each CPU's quiescent state for the old grace period.

On the strength of this proof, this commit therefore removes the memory
barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp().
The effect is to reduce the number of memory barriers by one and to
reduce the frequency of execution from about once per scheduling tick
per CPU to once per grace period.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:54 -07:00
Paul E. McKenney a00e0d714f rcu: Remove conditional compilation for RCU CPU stall warnings
The RCU CPU stall warnings can now be controlled using the
rcu_cpu_stall_suppress boot-time parameter or via the same parameter
from sysfs.  There is therefore no longer any reason to have
kernel config parameters for this feature.  This commit therefore
removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE
kernel config parameters.  The RCU_CPU_STALL_TIMEOUT parameter remains
to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter
remains to allow task-stall information to be suppressed if desired.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:54 -07:00