Merge branch 'lkmm.2020.11.06a' into HEAD
lkmm.2020.11.06a: Linux-kernel memory model (LKMM) updates.
@ -1870,7 +1870,7 @@ There are some more advanced barrier functions:
These are for use with atomic RMW functions that do not imply memory
barriers, but where the code needs a memory barrier. Examples for atomic
-RMW functions that do not imply are memory barrier are e.g. add,
+RMW functions that do not imply a memory barrier are e.g. add,
subtract, (failed) conditional operations, _relaxed functions,
but not atomic_read or atomic_set. A common example where a memory
barrier may be required is when atomic ops are used for reference
@ -0,0 +1,76 @@
|
|||
It has been said that successful communication requires first identifying
|
||||
what your audience knows and then building a bridge from their current
|
||||
knowledge to what they need to know. Unfortunately, the expected
|
||||
Linux-kernel memory model (LKMM) audience might be anywhere from novice
|
||||
to expert both in kernel hacking and in understanding LKMM.
|
||||
|
||||
This document therefore points out a number of places to start reading,
|
||||
depending on what you know and what you would like to learn. Please note
|
||||
that the documents later in this list assume that the reader understands
|
||||
the material provided by documents earlier in this list.
|
||||
|
||||
o You are new to Linux-kernel concurrency: simple.txt
|
||||
|
||||
o You have some background in Linux-kernel concurrency, and would
|
||||
like an overview of the types of low-level concurrency primitives
|
||||
that the Linux kernel provides: ordering.txt
|
||||
|
||||
Here, "low level" means atomic operations to single variables.
|
||||
|
||||
o You are familiar with the Linux-kernel concurrency primitives
|
||||
that you need, and just want to get started with LKMM litmus
|
||||
tests: litmus-tests.txt
|
||||
|
||||
o You are familiar with Linux-kernel concurrency, and would
|
||||
like a detailed intuitive understanding of LKMM, including
|
||||
situations involving more than two threads: recipes.txt
|
||||
|
||||
o You would like a detailed understanding of what your compiler can
|
||||
and cannot do to control dependencies: control-dependencies.txt
|
||||
|
||||
o You are familiar with Linux-kernel concurrency and the use of
|
||||
LKMM, and would like a quick reference: cheatsheet.txt
|
||||
|
||||
o You are familiar with Linux-kernel concurrency and the use
|
||||
of LKMM, and would like to learn about LKMM's requirements,
|
||||
rationale, and implementation: explanation.txt
|
||||
|
||||
o You are interested in the publications related to LKMM, including
|
||||
hardware manuals, academic literature, standards-committee
|
||||
working papers, and LWN articles: references.txt
|
||||
|
||||
|
||||
====================
|
||||
DESCRIPTION OF FILES
|
||||
====================
|
||||
|
||||
README
|
||||
This file.
|
||||
|
||||
cheatsheet.txt
|
||||
Quick-reference guide to the Linux-kernel memory model.
|
||||
|
||||
control-dependencies.txt
|
||||
Guide to preventing compiler optimizations from destroying
|
||||
your control dependencies.
|
||||
|
||||
explanation.txt
|
||||
Detailed description of the memory model.
|
||||
|
||||
litmus-tests.txt
|
||||
The format, features, capabilities, and limitations of the litmus
|
||||
tests that LKMM can evaluate.
|
||||
|
||||
ordering.txt
|
||||
Overview of the Linux kernel's low-level memory-ordering
|
||||
primitives by category.
|
||||
|
||||
recipes.txt
|
||||
Common memory-ordering patterns.
|
||||
|
||||
references.txt
|
||||
Background information.
|
||||
|
||||
simple.txt
|
||||
Starting point for someone new to Linux-kernel concurrency.
|
||||
And also a reminder of the simpler approaches to concurrency!
|
|
@ -0,0 +1,258 @@
|
|||
CONTROL DEPENDENCIES
|
||||
====================
|
||||
|
||||
A major difficulty with control dependencies is that current compilers
|
||||
do not support them. One purpose of this document is therefore to
|
||||
help you prevent your compiler from breaking your code. However,
|
||||
control dependencies also pose other challenges, which leads to the
|
||||
second purpose of this document, namely to help you to avoid breaking
|
||||
your own code, even in the absence of help from your compiler.
|
||||
|
||||
One such challenge is that control dependencies order only later stores.
|
||||
Therefore, a load-load control dependency will not preserve ordering
|
||||
unless a read memory barrier is provided. Consider the following code:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q)
|
||||
p = READ_ONCE(b);
|
||||
|
||||
This is not guaranteed to provide any ordering because some types of CPUs
|
||||
are permitted to predict the result of the load from "b". This prediction
|
||||
can cause other CPUs to see this load as having happened before the load
|
||||
from "a". This means that an explicit read barrier is required, for example
|
||||
as follows:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q) {
|
||||
smp_rmb();
|
||||
p = READ_ONCE(b);
|
||||
}
|
||||
|
||||
However, stores are not speculated. This means that ordering is
|
||||
(usually) guaranteed for load-store control dependencies, as in the
|
||||
following example:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q)
|
||||
WRITE_ONCE(b, 1);
|
||||
|
||||
Control dependencies can pair with each other and with other types
|
||||
of ordering. But please note that neither the READ_ONCE() nor the
|
||||
WRITE_ONCE() are optional. Without the READ_ONCE(), the compiler might
|
||||
fuse the load from "a" with other loads. Without the WRITE_ONCE(),
|
||||
the compiler might fuse the store to "b" with other stores. Worse yet,
|
||||
the compiler might convert the store into a load and a check followed
|
||||
by a store, and this compiler-generated load would not be ordered by
|
||||
the control dependency.
|
||||
|
||||
Furthermore, if the compiler is able to prove that the value of variable
|
||||
"a" is always non-zero, it would be well within its rights to optimize
|
||||
the original example by eliminating the "if" statement as follows:
|
||||
|
||||
q = a;
|
||||
b = 1; /* BUG: Compiler and CPU can both reorder!!! */
|
||||
|
||||
So don't leave out either the READ_ONCE() or the WRITE_ONCE().
|
||||
In particular, although READ_ONCE() does force the compiler to emit a
|
||||
load, it does *not* force the compiler to actually use the loaded value.
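As noted above, control dependencies can also pair with each other.  For
example, here is a minimal two-CPU sketch (hypothetical variables "x" and
"y", both initially zero) of the load-buffering pattern in which each CPU's
control dependency orders its load before its store:

        CPU 0                           CPU 1
        r0 = READ_ONCE(x);              r1 = READ_ONCE(y);
        if (r0)                         if (r1)
                WRITE_ONCE(y, 1);               WRITE_ONCE(x, 1);

Because each CPU's store can execute only after its own load, the outcome
in which r0 and r1 both end up equal to 1 is forbidden.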
|
||||
|
||||
It is tempting to try to use control dependencies to enforce ordering on
|
||||
identical stores on both branches of the "if" statement as follows:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q) {
|
||||
barrier();
|
||||
WRITE_ONCE(b, 1);
|
||||
do_something();
|
||||
} else {
|
||||
barrier();
|
||||
WRITE_ONCE(b, 1);
|
||||
do_something_else();
|
||||
}
|
||||
|
||||
Unfortunately, current compilers will transform this as follows at high
|
||||
optimization levels:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
barrier();
|
||||
WRITE_ONCE(b, 1); /* BUG: No ordering vs. load from a!!! */
|
||||
if (q) {
|
||||
/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
|
||||
do_something();
|
||||
} else {
|
||||
/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
|
||||
do_something_else();
|
||||
}
|
||||
|
||||
Now there is no conditional between the load from "a" and the store to
|
||||
"b", which means that the CPU is within its rights to reorder them: The
|
||||
conditional is absolutely required, and must be present in the final
|
||||
assembly code, after all of the compiler and link-time optimizations
|
||||
have been applied. Therefore, if you need ordering in this example,
|
||||
you must use explicit memory ordering, for example, smp_store_release():
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q) {
|
||||
smp_store_release(&b, 1);
|
||||
do_something();
|
||||
} else {
|
||||
smp_store_release(&b, 1);
|
||||
do_something_else();
|
||||
}
|
||||
|
||||
Without explicit memory ordering, control-dependency-based ordering is
|
||||
guaranteed only when the stores differ, for example:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q) {
|
||||
WRITE_ONCE(b, 1);
|
||||
do_something();
|
||||
} else {
|
||||
WRITE_ONCE(b, 2);
|
||||
do_something_else();
|
||||
}
|
||||
|
||||
The initial READ_ONCE() is still required to prevent the compiler from
|
||||
knowing too much about the value of "a".
|
||||
|
||||
But please note that you need to be careful what you do with the local
|
||||
variable "q", otherwise the compiler might be able to guess the value
|
||||
and again remove the conditional branch that is absolutely required to
|
||||
preserve ordering. For example:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q % MAX) {
|
||||
WRITE_ONCE(b, 1);
|
||||
do_something();
|
||||
} else {
|
||||
WRITE_ONCE(b, 2);
|
||||
do_something_else();
|
||||
}
|
||||
|
||||
If MAX is compile-time defined to be 1, then the compiler knows that
|
||||
(q % MAX) must be equal to zero, regardless of the value of "q".
|
||||
The compiler is therefore within its rights to transform the above code
|
||||
into the following:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
WRITE_ONCE(b, 2);
|
||||
do_something_else();
|
||||
|
||||
Given this transformation, the CPU is not required to respect the ordering
|
||||
between the load from variable "a" and the store to variable "b". It is
|
||||
tempting to add a barrier(), but this does not help. The conditional
|
||||
is gone, and the barrier won't bring it back. Therefore, if you need
|
||||
to rely on control dependencies to produce this ordering, you should
|
||||
make sure that MAX is greater than one, perhaps as follows:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
|
||||
if (q % MAX) {
|
||||
WRITE_ONCE(b, 1);
|
||||
do_something();
|
||||
} else {
|
||||
WRITE_ONCE(b, 2);
|
||||
do_something_else();
|
||||
}
|
||||
|
||||
Please note once again that each leg of the "if" statement absolutely
|
||||
must store different values to "b". As in previous examples, if the two
|
||||
values were identical, the compiler could pull this store outside of the
|
||||
"if" statement, destroying the control dependency's ordering properties.
|
||||
|
||||
You must also be careful to avoid relying too much on boolean short-circuit
|
||||
evaluation. Consider this example:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q || 1 > 0)
|
||||
WRITE_ONCE(b, 1);
|
||||
|
||||
Because the first condition cannot fault and the second condition is
|
||||
always true, the compiler can transform this example as follows, again
|
||||
destroying the control dependency's ordering:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
WRITE_ONCE(b, 1);
|
||||
|
||||
This is yet another example showing the importance of preventing the
|
||||
compiler from out-guessing your code. Again, although READ_ONCE() really
|
||||
does force the compiler to emit code for a given load, the compiler is
|
||||
within its rights to discard the loaded value.
|
||||
|
||||
In addition, control dependencies apply only to the then-clause and
|
||||
else-clause of the "if" statement in question. In particular, they do
|
||||
not necessarily order the code following the entire "if" statement:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q) {
|
||||
WRITE_ONCE(b, 1);
|
||||
} else {
|
||||
WRITE_ONCE(b, 2);
|
||||
}
|
||||
WRITE_ONCE(c, 1); /* BUG: No ordering against the read from "a". */
|
||||
|
||||
It is tempting to argue that there in fact is ordering because the
|
||||
compiler cannot reorder volatile accesses and also cannot reorder
|
||||
the writes to "b" with the condition. Unfortunately for this line
|
||||
of reasoning, the compiler might compile the two writes to "b" as
|
||||
conditional-move instructions, as in this fanciful pseudo-assembly
|
||||
language:
|
||||
|
||||
ld r1,a
|
||||
cmp r1,$0
|
||||
cmov,ne r4,$1
|
||||
cmov,eq r4,$2
|
||||
st r4,b
|
||||
st $1,c
|
||||
|
||||
The control dependencies would then extend only to the pair of cmov
|
||||
instructions and the store depending on them. This means that a weakly
|
||||
ordered CPU would have no dependency of any sort between the load from
|
||||
"a" and the store to "c". In short, control dependencies provide ordering
|
||||
only to the stores in the then-clause and else-clause of the "if" statement
|
||||
in question (including functions invoked by those two clauses), and not
|
||||
to code following that "if" statement.
|
||||
|
||||
|
||||
In summary:
|
||||
|
||||
(*) Control dependencies can order prior loads against later stores.
|
||||
However, they do *not* guarantee any other sort of ordering:
|
||||
Not prior loads against later loads, nor prior stores against
|
||||
later anything. If you need these other forms of ordering, use
|
||||
smp_load_acquire(), smp_store_release(), or, in the case of prior
|
||||
stores and later loads, smp_mb().
|
||||
|
||||
(*) If both legs of the "if" statement contain identical stores to
|
||||
the same variable, then you must explicitly order those stores,
|
||||
either by preceding both of them with smp_mb() or by using
|
||||
smp_store_release(). Please note that it is *not* sufficient to use
|
||||
barrier() at beginning and end of each leg of the "if" statement
|
||||
because, as shown by the example above, optimizing compilers can
|
||||
destroy the control dependency while respecting the letter of the
|
||||
barrier() law.
|
||||
|
||||
(*) Control dependencies require at least one run-time conditional
|
||||
between the prior load and the subsequent store, and this
|
||||
conditional must involve the prior load. If the compiler is able
|
||||
to optimize the conditional away, it will have also optimized
|
||||
away the ordering. Careful use of READ_ONCE() and WRITE_ONCE()
|
||||
can help to preserve the needed conditional.
|
||||
|
||||
(*) Control dependencies require that the compiler avoid reordering the
|
||||
dependency into nonexistence. Careful use of READ_ONCE() or
|
||||
atomic{,64}_read() can help to preserve your control dependency.
|
||||
|
||||
(*) Control dependencies apply only to the then-clause and else-clause
|
||||
of the "if" statement containing the control dependency, including
|
||||
any functions that these two clauses call. Control dependencies
|
||||
do *not* apply to code beyond the end of that "if" statement.
|
||||
|
||||
(*) Control dependencies pair normally with other types of barriers.
|
||||
|
||||
(*) Control dependencies do *not* provide multicopy atomicity. If you
|
||||
need all the CPUs to agree on the ordering of a given store against
|
||||
all other accesses, use smp_mb().
|
||||
|
||||
(*) Compilers do not understand control dependencies. It is therefore
|
||||
your job to ensure that they do not break your code.
|
|
@ -0,0 +1,172 @@
|
|||
This document contains brief definitions of LKMM-related terms. Like most
|
||||
glossaries, it is not intended to be read front to back (except perhaps
|
||||
as a way of confirming a diagnosis of OCD), but rather to be searched
|
||||
for specific terms.
|
||||
|
||||
|
||||
Address Dependency: When the address of a later memory access is computed
|
||||
based on the value returned by an earlier load, an "address
|
||||
dependency" extends from that load extending to the later access.
|
||||
Address dependencies are quite common in RCU read-side critical
|
||||
sections:
|
||||
|
||||
1 rcu_read_lock();
|
||||
2 p = rcu_dereference(gp);
|
||||
3 do_something(p->a);
|
||||
4 rcu_read_unlock();
|
||||
|
||||
In this case, because the address of "p->a" on line 3 is computed
|
||||
from the value returned by the rcu_dereference() on line 2, the
|
||||
address dependency extends from that rcu_dereference() to that
|
||||
"p->a". In rare cases, optimizing compilers can destroy address
|
||||
dependencies. Please see Documentation/RCU/rcu_dereference.rst
|
||||
for more information.
|
||||
|
||||
See also "Control Dependency" and "Data Dependency".
|
||||
|
||||
Acquire: With respect to a lock, acquiring that lock, for example,
|
||||
using spin_lock(). With respect to a non-lock shared variable,
|
||||
a special operation that includes a load and which orders that
|
||||
load before later memory references running on that same CPU.
|
||||
An example special acquire operation is smp_load_acquire(),
|
||||
but atomic_read_acquire() and atomic_xchg_acquire() also include
|
||||
acquire loads.
|
||||
|
||||
When an acquire load returns the value stored by a release store
|
||||
to that same variable, then all operations preceding that store
|
||||
happen before any operations following that load acquire.
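For example, given the following sketch (hypothetical variables "x"
and "y", both initially zero):

        CPU 0                           CPU 1
        WRITE_ONCE(x, 1);               r0 = smp_load_acquire(&y);
        smp_store_release(&y, 1);       r1 = READ_ONCE(x);

if CPU 1's smp_load_acquire() returns 1, that is, reads from CPU 0's
release store, then CPU 1's READ_ONCE() is guaranteed to return 1 as well.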
|
||||
|
||||
See also "Relaxed" and "Release".
|
||||
|
||||
Coherence (co): When one CPU's store to a given variable overwrites
|
||||
either the value from another CPU's store or some later value,
|
||||
there is said to be a coherence link from the second CPU to
|
||||
the first.
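For example, given the following minimal sketch (hypothetical variable
"x", initially zero):

        CPU 0                           CPU 1
        WRITE_ONCE(x, 1);               WRITE_ONCE(x, 2);

if the final value of "x" is 2, then CPU 1's store overwrote CPU 0's
store, so there is a coherence link from CPU 0's store to CPU 1's store.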
|
||||
|
||||
It is also possible to have a coherence link within a CPU, which
|
||||
is a "coherence internal" (coi) link. The term "coherence
|
||||
external" (coe) link is used when it is necessary to exclude
|
||||
the coi case.
|
||||
|
||||
See also "From-reads" and "Reads-from".
|
||||
|
||||
Control Dependency: When a later store's execution depends on a test
|
||||
of a value computed from a value returned by an earlier load,
|
||||
a "control dependency" extends from that load to that store.
|
||||
For example:
|
||||
|
||||
1 if (READ_ONCE(x))
|
||||
2 WRITE_ONCE(y, 1);
|
||||
|
||||
Here, the control dependency extends from the READ_ONCE() on
|
||||
line 1 to the WRITE_ONCE() on line 2. Control dependencies are
|
||||
fragile, and can be easily destroyed by optimizing compilers.
|
||||
Please see control-dependencies.txt for more information.
|
||||
|
||||
See also "Address Dependency" and "Data Dependency".
|
||||
|
||||
Cycle: Memory-barrier pairing is restricted to a pair of CPUs, as the
|
||||
name suggests. And in a great many cases, a pair of CPUs is all
|
||||
that is required. In other cases, the notion of pairing must be
|
||||
extended to additional CPUs, and the result is called a "cycle".
|
||||
In a cycle, each CPU's ordering interacts with that of the next:
|
||||
|
||||
CPU 0 CPU 1 CPU 2
|
||||
WRITE_ONCE(x, 1); WRITE_ONCE(y, 1); WRITE_ONCE(z, 1);
|
||||
smp_mb(); smp_mb(); smp_mb();
|
||||
r0 = READ_ONCE(y); r1 = READ_ONCE(z); r2 = READ_ONCE(x);
|
||||
|
||||
CPU 0's smp_mb() interacts with that of CPU 1, which interacts
|
||||
with that of CPU 2, which in turn interacts with that of CPU 0
|
||||
to complete the cycle. Because of the smp_mb() calls between
|
||||
each pair of memory accesses, the outcome where r0, r1, and r2
|
||||
are all equal to zero is forbidden by LKMM.
|
||||
|
||||
See also "Pairing".
|
||||
|
||||
Data Dependency: When the data written by a later store is computed based
|
||||
on the value returned by an earlier load, a "data dependency"
|
||||
extends from that load to that later store. For example:
|
||||
|
||||
1 r1 = READ_ONCE(x);
|
||||
2 WRITE_ONCE(y, r1 + 1);
|
||||
|
||||
In this case, the data dependency extends from the READ_ONCE()
|
||||
on line 1 to the WRITE_ONCE() on line 2. Data dependencies are
|
||||
fragile and can be easily destroyed by optimizing compilers.
|
||||
Because optimizing compilers put a great deal of effort into
|
||||
working out what values integer variables might have, this is
|
||||
especially true in cases where the dependency is carried through
|
||||
an integer.
|
||||
|
||||
See also "Address Dependency" and "Control Dependency".
|
||||
|
||||
From-Reads (fr): When one CPU's store to a given variable happened
|
||||
too late to affect the value returned by another CPU's
|
||||
load from that same variable, there is said to be a from-reads
|
||||
link from the load to the store.
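For example, given the following minimal sketch (hypothetical variable
"x", initially zero):

        CPU 0                           CPU 1
        WRITE_ONCE(x, 1);               r1 = READ_ONCE(x);

if r1 ends up being 0, then CPU 0's store came too late to affect the
value returned by CPU 1's load, so there is a from-reads link from that
load to that store.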
|
||||
|
||||
It is also possible to have a from-reads link within a CPU, which
|
||||
is a "from-reads internal" (fri) link. The term "from-reads
|
||||
external" (fre) link is used when it is necessary to exclude
|
||||
the fri case.
|
||||
|
||||
See also "Coherence" and "Reads-from".
|
||||
|
||||
Fully Ordered: An operation such as smp_mb() that orders all of
|
||||
its CPU's prior accesses with all of that CPU's subsequent
|
||||
accesses, or a marked access such as atomic_add_return()
|
||||
that orders all of its CPU's prior accesses, itself, and
|
||||
all of its CPU's subsequent accesses.
|
||||
|
||||
Marked Access: An access to a variable that uses a special function or
|
||||
macro such as "r1 = READ_ONCE(x)" or "smp_store_release(&a, 1)".
|
||||
|
||||
See also "Unmarked Access".
|
||||
|
||||
Pairing: "Memory-barrier pairing" reflects the fact that synchronizing
|
||||
data between two CPUs requires that both CPUs order their accesses.
|
||||
Memory barriers thus tend to come in pairs, one executed by
|
||||
one of the CPUs and the other by the other CPU. Of course,
|
||||
pairing also occurs with other types of operations, so that a
|
||||
smp_store_release() pairs with an smp_load_acquire() that reads
|
||||
the value stored.
|
||||
|
||||
See also "Cycle".
|
||||
|
||||
Reads-From (rf): When one CPU's load returns the value stored by some other
|
||||
CPU, there is said to be a reads-from link from the second
|
||||
CPU's store to the first CPU's load. Reads-from links have the
|
||||
nice property that time must advance from the store to the load,
|
||||
which means that algorithms using reads-from links can use lighter
|
||||
weight ordering and synchronization compared to algorithms using
|
||||
coherence and from-reads links.
|
||||
|
||||
It is also possible to have a reads-from link within a CPU, which
|
||||
is a "reads-from internal" (rfi) link. The term "reads-from
|
||||
external" (rfe) link is used when it is necessary to exclude
|
||||
the rfi case.
|
||||
|
||||
See also Coherence" and "From-reads".
|
||||
|
||||
Relaxed: A marked access that does not imply ordering, for example, a
|
||||
READ_ONCE(), WRITE_ONCE(), a non-value-returning read-modify-write
|
||||
operation, or a value-returning read-modify-write operation whose
|
||||
name ends in "_relaxed".
|
||||
|
||||
See also "Acquire" and "Release".
|
||||
|
||||
Release: With respect to a lock, releasing that lock, for example,
|
||||
using spin_unlock(). With respect to a non-lock shared variable,
|
||||
a special operation that includes a store and which orders that
|
||||
store after earlier memory references that ran on that same CPU.
|
||||
An example special release store is smp_store_release(), but
|
||||
atomic_set_release() and atomic_cmpxchg_release() also include
|
||||
release stores.
|
||||
|
||||
See also "Acquire" and "Relaxed".
|
||||
|
||||
Unmarked Access: An access to a variable that uses normal C-language
|
||||
syntax, for example, "a = b[2]";
|
||||
|
||||
See also "Marked Access".
|
|
@ -946,6 +946,23 @@ Limitations of the Linux-kernel memory model (LKMM) include:
|
|||
carrying a dependency, then the compiler can break that dependency
|
||||
by substituting a constant of that value.
|
||||
|
||||
Conversely, LKMM sometimes doesn't recognize that a particular
|
||||
optimization is not allowed, and as a result, thinks that a
|
||||
dependency is not present (because the optimization would break it).
|
||||
The memory model misses some pretty obvious control dependencies
|
||||
because of this limitation. A simple example is:
|
||||
|
||||
r1 = READ_ONCE(x);
|
||||
if (r1 == 0)
|
||||
smp_mb();
|
||||
WRITE_ONCE(y, 1);
|
||||
|
||||
There is a control dependency from the READ_ONCE to the WRITE_ONCE,
|
||||
even when r1 is nonzero, but LKMM doesn't realize this and thinks
|
||||
that the write may execute before the read if r1 != 0. (Yes, that
|
||||
doesn't make sense if you think about it, but the memory model's
|
||||
intelligence is limited.)
|
||||
|
||||
2. Multiple access sizes for a single variable are not supported,
|
||||
and neither are misaligned or partially overlapping accesses.
|
||||
|
||||
|
|
|
@ -0,0 +1,556 @@
|
|||
This document gives an overview of the categories of memory-ordering
|
||||
operations provided by the Linux-kernel memory model (LKMM).
|
||||
|
||||
|
||||
Categories of Ordering
|
||||
======================
|
||||
|
||||
This section lists LKMM's three top-level categories of memory-ordering
|
||||
operations in decreasing order of strength:
|
||||
|
||||
1. Barriers (also known as "fences"). A barrier orders some or
|
||||
all of the CPU's prior operations against some or all of its
|
||||
subsequent operations.
|
||||
|
||||
2. Ordered memory accesses. These operations order themselves
|
||||
against some or all of the CPU's prior accesses or some or all
|
||||
of the CPU's subsequent accesses, depending on the subcategory
|
||||
of the operation.
|
||||
|
||||
3. Unordered accesses, as the name indicates, have no ordering
|
||||
properties except to the extent that they interact with an
|
||||
operation in the previous categories. This being the real world,
|
||||
some of these "unordered" operations provide limited ordering
|
||||
in some special situations.
|
||||
|
||||
Each of the above categories is described in more detail by one of the
|
||||
following sections.
|
||||
|
||||
|
||||
Barriers
|
||||
========
|
||||
|
||||
Each of the following categories of barriers is described in its own
|
||||
subsection below:
|
||||
|
||||
a. Full memory barriers.
|
||||
|
||||
b. Read-modify-write (RMW) ordering augmentation barriers.
|
||||
|
||||
c. Write memory barrier.
|
||||
|
||||
d. Read memory barrier.
|
||||
|
||||
e. Compiler barrier.
|
||||
|
||||
Note well that many of these primitives generate absolutely no code
|
||||
in kernels built with CONFIG_SMP=n. Therefore, if you are writing
|
||||
a device driver, which must correctly order accesses to a physical
|
||||
device even in kernels built with CONFIG_SMP=n, please use the
|
||||
ordering primitives provided for that purpose. For example, instead of
|
||||
smp_mb(), use mb(). See the "Linux Kernel Device Drivers" book or the
|
||||
https://lwn.net/Articles/698014/ article for more information.
|
||||
|
||||
|
||||
Full Memory Barriers
|
||||
--------------------
|
||||
|
||||
The Linux-kernel primitives that provide full ordering include:
|
||||
|
||||
o The smp_mb() full memory barrier.
|
||||
|
||||
o Value-returning RMW atomic operations whose names do not end in
|
||||
_acquire, _release, or _relaxed.
|
||||
|
||||
o RCU's grace-period primitives.
|
||||
|
||||
First, the smp_mb() full memory barrier orders all of the CPU's prior
|
||||
accesses against all subsequent accesses from the viewpoint of all CPUs.
|
||||
In other words, all CPUs will agree that any earlier action taken
|
||||
by that CPU happened before any later action taken by that same CPU.
|
||||
For example, consider the following:
|
||||
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_mb(); // Order store to x before load from y.
|
||||
r1 = READ_ONCE(y);
|
||||
|
||||
All CPUs will agree that the store to "x" happened before the load
|
||||
from "y", as indicated by the comment. And yes, please comment your
|
||||
memory-ordering primitives. It is surprisingly hard to remember their
|
||||
purpose after even a few months.
|
||||
|
||||
Second, some RMW atomic operations provide full ordering. These
|
||||
operations include value-returning RMW atomic operations (that is, those
|
||||
with non-void return types) whose names do not end in _acquire, _release,
|
||||
or _relaxed. Examples include atomic_add_return(), atomic_dec_and_test(),
|
||||
cmpxchg(), and xchg(). Note that conditional RMW atomic operations such
|
||||
as cmpxchg() are only guaranteed to provide ordering when they succeed.
|
||||
When RMW atomic operations provide full ordering, they partition the
|
||||
CPU's accesses into three groups:
|
||||
|
||||
1. All code that executed prior to the RMW atomic operation.
|
||||
|
||||
2. The RMW atomic operation itself.
|
||||
|
||||
3. All code that executed after the RMW atomic operation.
|
||||
|
||||
All CPUs will agree that any operation in a given partition happened
|
||||
before any operation in a higher-numbered partition.
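For example, here is a minimal sketch (hypothetical variables and counter)
showing the three partitions:

        WRITE_ONCE(x, 1);                       // Partition 1.
        r0 = atomic_inc_return(&my_counter);    // Partition 2: fully ordered RMW.
        r1 = READ_ONCE(y);                      // Partition 3.

All CPUs will agree that the store to "x" happened before the
atomic_inc_return(), which in turn happened before the load from "y".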
|
||||
|
||||
In contrast, non-value-returning RMW atomic operations (that is, those
|
||||
with void return types) do not guarantee any ordering whatsoever. Nor do
|
||||
value-returning RMW atomic operations whose names end in _relaxed.
|
||||
Examples of the former include atomic_inc() and atomic_dec(),
|
||||
while examples of the latter include atomic_cmpxchg_relaxed() and
|
||||
atomic_xchg_relaxed(). Similarly, value-returning non-RMW atomic
|
||||
operations such as atomic_read() do not guarantee full ordering, and
|
||||
are covered in the later section on unordered operations.
|
||||
|
||||
Value-returning RMW atomic operations whose names end in _acquire or
|
||||
_release provide limited ordering, and will be described later in this
|
||||
document.
|
||||
|
||||
Finally, RCU's grace-period primitives provide full ordering. These
|
||||
primitives include synchronize_rcu(), synchronize_rcu_expedited(),
|
||||
synchronize_srcu() and so on. However, these primitives have orders
|
||||
of magnitude greater overhead than smp_mb(), atomic_xchg(), and so on.
|
||||
Furthermore, RCU's grace-period primitives can only be invoked in
|
||||
sleepable contexts. Therefore, RCU's grace-period primitives are
|
||||
typically instead used to provide ordering against RCU read-side critical
|
||||
sections, as documented in their comment headers. But of course if you
|
||||
need a synchronize_rcu() to interact with readers, it costs you nothing
|
||||
to also rely on its additional full-memory-barrier semantics. Just please
|
||||
carefully comment this, otherwise your future self will hate you.
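For example, assuming that the grace period is needed anyway to interact
with RCU readers elsewhere, such a carefully commented dual-purpose use
might look like this minimal sketch (hypothetical variables):

        WRITE_ONCE(x, 1);
        synchronize_rcu();      // Waits for readers, and also orders the
                                // store to x before the load from y.
        r1 = READ_ONCE(y);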
|
||||
|
||||
|
||||
RMW Ordering Augmentation Barriers
|
||||
----------------------------------
|
||||
|
||||
As noted in the previous section, non-value-returning RMW operations
|
||||
such as atomic_inc() and atomic_dec() guarantee no ordering whatsoever.
|
||||
Nevertheless, a number of popular CPU families, including x86, provide
|
||||
full ordering for these primitives. One way to obtain full ordering on
|
||||
all architectures is to add a call to smp_mb():
|
||||
|
||||
WRITE_ONCE(x, 1);
|
||||
atomic_inc(&my_counter);
|
||||
smp_mb(); // Inefficient on x86!!!
|
||||
r1 = READ_ONCE(y);
|
||||
|
||||
This works, but the added smp_mb() adds needless overhead for
|
||||
x86, on which atomic_inc() provides full ordering all by itself.
|
||||
The smp_mb__after_atomic() primitive can be used instead:
|
||||
|
||||
WRITE_ONCE(x, 1);
|
||||
atomic_inc(&my_counter);
|
||||
smp_mb__after_atomic(); // Order store to x before load from y.
|
||||
r1 = READ_ONCE(y);
|
||||
|
||||
The smp_mb__after_atomic() primitive emits code only on CPUs whose
|
||||
atomic_inc() implementations do not guarantee full ordering, thus
|
||||
incurring no unnecessary overhead on x86. There are a number of
|
||||
variations on the smp_mb__*() theme:
|
||||
|
||||
o smp_mb__before_atomic(), which provides full ordering prior
|
||||
to an unordered RMW atomic operation.
|
||||
|
||||
o smp_mb__after_atomic(), which, as shown above, provides full
|
||||
ordering subsequent to an unordered RMW atomic operation.
|
||||
|
||||
o smp_mb__after_spinlock(), which provides full ordering subsequent
|
||||
to a successful spinlock acquisition. Note that spin_lock() is
|
||||
always successful but spin_trylock() might not be.
|
||||
|
||||
o smp_mb__after_srcu_read_unlock(), which provides full ordering
|
||||
subsequent to an srcu_read_unlock().
|
||||
|
||||
It is bad practice to place code between the smp_mb__*() primitive and the
|
||||
operation whose ordering it is augmenting. The reason is that the
|
||||
ordering of this intervening code will differ from one CPU architecture
|
||||
to another.
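For example, here is a sketch of the corresponding smp_mb__before_atomic()
usage, with the barrier placed immediately adjacent to the atomic operation
whose ordering it augments (using the same variables as the examples above):

        WRITE_ONCE(x, 1);
        smp_mb__before_atomic();        // With the atomic_inc(), orders the
                                        // store to x before the load from y.
        atomic_inc(&my_counter);
        r1 = READ_ONCE(y);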
|
||||
|
||||
|
||||
Write Memory Barrier
|
||||
--------------------
|
||||
|
||||
The Linux kernel's write memory barrier is smp_wmb(). If a CPU executes
|
||||
the following code:
|
||||
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_wmb();
|
||||
WRITE_ONCE(y, 1);
|
||||
|
||||
Then any given CPU will see the write to "x" has having happened before
|
||||
the write to "y". However, you are usually better off using a release
|
||||
store, as described in the "Release Operations" section below.
|
||||
|
||||
Note that smp_wmb() might fail to provide ordering for unmarked C-language
|
||||
stores because profile-driven optimization could determine that the
|
||||
value being overwritten is almost always equal to the new value. Such a
|
||||
compiler might then reasonably decide to transform "x = 1" and "y = 1"
|
||||
as follows:
|
||||
|
||||
if (x != 1)
|
||||
x = 1;
|
||||
smp_wmb(); // BUG: does not order the reads!!!
|
||||
if (y != 1)
|
||||
y = 1;
|
||||
|
||||
Therefore, if you need to use smp_wmb() with unmarked C-language writes,
|
||||
you will need to make sure that none of the compilers used to build
|
||||
the Linux kernel carry out this sort of transformation, both now and in
|
||||
the future.
|
||||
|
||||
|
||||
Read Memory Barrier
|
||||
-------------------
|
||||
|
||||
The Linux kernel's read memory barrier is smp_rmb(). If a CPU executes
|
||||
the following code:
|
||||
|
||||
r0 = READ_ONCE(y);
|
||||
smp_rmb();
|
||||
r1 = READ_ONCE(x);
|
||||
|
||||
Then any given CPU will see the read from "y" as having preceded the read from
|
||||
"x". However, you are usually better off using an acquire load, as described
|
||||
in the "Acquire Operations" section below.
|
||||
|
||||
Compiler Barrier
|
||||
----------------
|
||||
|
||||
The Linux kernel's compiler barrier is barrier(). This primitive
|
||||
prohibits compiler code-motion optimizations that might move memory
|
||||
references across the point in the code containing the barrier(), but
|
||||
does not constrain hardware memory ordering. For example, this can be
|
||||
used to prevent the compiler from moving code across an infinite loop:
|
||||
|
||||
WRITE_ONCE(x, 1);
|
||||
while (dontstop)
|
||||
barrier();
|
||||
r1 = READ_ONCE(y);
|
||||
|
||||
Without the barrier(), the compiler would be within its rights to move the
|
||||
WRITE_ONCE() to follow the loop. This code motion could be problematic
|
||||
in the case where an interrupt handler terminates the loop. Another way
|
||||
to handle this is to use READ_ONCE() for the load of "dontstop".
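A minimal sketch of that alternative:

        WRITE_ONCE(x, 1);
        while (READ_ONCE(dontstop))
                ;
        r1 = READ_ONCE(y);

Because both the WRITE_ONCE() and the READ_ONCE() are volatile accesses,
the compiler may not reorder them with each other, so the store to "x"
again cannot be moved past the loop.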
|
||||
|
||||
Note that the barriers discussed previously use barrier() or its low-level
|
||||
equivalent in their implementations.
|
||||
|
||||
|
||||
Ordered Memory Accesses
|
||||
=======================
|
||||
|
||||
The Linux kernel provides a wide variety of ordered memory accesses:
|
||||
|
||||
a. Release operations.
|
||||
|
||||
b. Acquire operations.
|
||||
|
||||
c. RCU read-side ordering.
|
||||
|
||||
d. Control dependencies.
|
||||
|
||||
Each of the above categories has its own section below.
|
||||
|
||||
|
||||
Release Operations
|
||||
------------------
|
||||
|
||||
Release operations include smp_store_release(), atomic_set_release(),
|
||||
rcu_assign_pointer(), and value-returning RMW operations whose names
|
||||
end in _release. These operations order their own store against all
|
||||
of the CPU's prior memory accesses. Release operations often provide
|
||||
improved readability and performance compared to explicit barriers.
|
||||
For example, use of smp_store_release() saves a line compared to the
|
||||
smp_wmb() example above:
|
||||
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_store_release(&y, 1);
|
||||
|
||||
More important, smp_store_release() makes it easier to connect up the
|
||||
different pieces of the concurrent algorithm. The variable stored to
|
||||
by the smp_store_release(), in this case "y", will normally be used in
|
||||
an acquire operation in other parts of the concurrent algorithm.
|
||||
|
||||
To see the performance advantages, suppose that the above example read
|
||||
from "x" instead of writing to it. Then an smp_wmb() could not guarantee
|
||||
ordering, and an smp_mb() would be needed instead:
|
||||
|
||||
r1 = READ_ONCE(x);
|
||||
smp_mb();
|
||||
WRITE_ONCE(y, 1);
|
||||
|
||||
But smp_mb() often incurs much higher overhead than does
|
||||
smp_store_release(), which still provides the needed ordering of "x"
|
||||
against "y". On x86, the version using smp_store_release() might compile
|
||||
to a simple load instruction followed by a simple store instruction.
|
||||
In contrast, the smp_mb() compiles to an expensive instruction that
|
||||
provides the needed ordering.
|
||||
|
||||
There is a wide variety of release operations:
|
||||
|
||||
o Store operations, including not only the aforementioned
|
||||
smp_store_release(), but also atomic_set_release(), and
|
||||
atomic_long_set_release().
|
||||
|
||||
o RCU's rcu_assign_pointer() operation. This is the same as
|
||||
smp_store_release() except that: (1) It takes the pointer to
|
||||
be assigned to instead of a pointer to that pointer, (2) It
|
||||
is intended to be used in conjunction with rcu_dereference()
|
||||
and similar rather than smp_load_acquire(), and (3) It checks
|
||||
for an RCU-protected pointer in "sparse" runs.
|
||||
|
||||
o Value-returning RMW operations whose names end in _release,
|
||||
such as atomic_fetch_add_release() and cmpxchg_release().
|
||||
Note that release ordering is guaranteed only against the
|
||||
memory-store portion of the RMW operation, and not against the
|
||||
memory-load portion. Note also that conditional operations such
|
||||
as cmpxchg_release() are only guaranteed to provide ordering
|
||||
when they succeed.
|
||||
|
||||
As mentioned earlier, release operations are often paired with acquire
|
||||
operations, which are the subject of the next section.
|
||||
|
||||
|
||||
Acquire Operations
|
||||
------------------
|
||||
|
||||
Acquire operations include smp_load_acquire(), atomic_read_acquire(),
|
||||
and value-returning RMW operations whose names end in _acquire. These
|
||||
operations order their own load against all of the CPU's subsequent
|
||||
memory accesses. Acquire operations often provide improved performance
|
||||
and readability compared to explicit barriers. For example, use of
|
||||
smp_load_acquire() saves a line compared to the smp_rmb() example above:
|
||||
|
||||
r0 = smp_load_acquire(&y);
|
||||
r1 = READ_ONCE(x);
|
||||
|
||||
As with smp_store_release(), this also makes it easier to connect
|
||||
the different pieces of the concurrent algorithm by looking for the
|
||||
smp_store_release() that stores to "y". In addition, smp_load_acquire()
|
||||
improves upon smp_rmb() by ordering against subsequent stores as well
|
||||
as against subsequent loads.
|
||||
|
||||
There are a couple of categories of acquire operations:
|
||||
|
||||
o Load operations, including not only the aforementioned
|
||||
smp_load_acquire(), but also atomic_read_acquire(), and
|
||||
atomic64_read_acquire().
|
||||
|
||||
o Value-returning RMW operations whose names end in _acquire,
|
||||
such as atomic_xchg_acquire() and atomic_cmpxchg_acquire().
|
||||
Note that acquire ordering is guaranteed only against the
|
||||
memory-load portion of the RMW operation, and not against the
|
||||
memory-store portion. Note also that conditional operations
|
||||
such as atomic_cmpxchg_acquire() are only guaranteed to provide
|
||||
ordering when they succeed.
|
||||
|
||||
Symmetry being what it is, acquire operations are often paired with the
|
||||
release operations covered earlier. For example, consider the following
|
||||
example, where task0() and task1() execute concurrently:
|
||||
|
||||
void task0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_store_release(&y, 1);
|
||||
}
|
||||
|
||||
void task1(void)
|
||||
{
|
||||
r0 = smp_load_acquire(&y);
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
If "x" and "y" are both initially zero, then either r0's final value
|
||||
will be zero or r1's final value will be one, thus providing the required
|
||||
ordering.
|
||||
|
||||
|
||||
RCU Read-Side Ordering
|
||||
----------------------
|
||||
|
||||
This category includes read-side markers such as rcu_read_lock()
|
||||
and rcu_read_unlock() as well as pointer-traversal primitives such as
|
||||
rcu_dereference() and srcu_dereference().
|
||||
|
||||
Compared to locking primitives and RMW atomic operations, markers
|
||||
for RCU read-side critical sections incur very low overhead because
|
||||
they interact only with the corresponding grace-period primitives.
|
||||
For example, the rcu_read_lock() and rcu_read_unlock() markers interact
|
||||
with synchronize_rcu(), synchronize_rcu_expedited(), and call_rcu().
|
||||
The way this works is that if a given call to synchronize_rcu() cannot
|
||||
prove that it started before a given call to rcu_read_lock(), then
|
||||
that synchronize_rcu() must block until the matching rcu_read_unlock()
|
||||
is reached. For more information, please see the synchronize_rcu()
|
||||
docbook header comment and the material in Documentation/RCU.
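For example, here is a minimal sketch of this interaction, reusing the
rcu_dereference() example from the glossary for the reader and adding a
hypothetical updater for an RCU-protected pointer "gp":

        // Reader
        rcu_read_lock();
        p = rcu_dereference(gp);
        if (p)
                do_something(p->a);
        rcu_read_unlock();

        // Updater (update-side lock held)
        p = gp;
        rcu_assign_pointer(gp, NULL);
        synchronize_rcu();      // Waits for pre-existing readers.
        kfree(p);

If a given reader's rcu_read_lock() might have executed before the
updater's synchronize_rcu() started, that synchronize_rcu() waits for the
matching rcu_read_unlock(), so the kfree() cannot pull the structure out
from under that reader.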
|
||||
|
||||
RCU's pointer-traversal primitives, including rcu_dereference() and
|
||||
srcu_dereference(), order their load (which must be a pointer) against any
|
||||
of the CPU's subsequent memory accesses whose address has been calculated
|
||||
from the value loaded. There is said to be an *address dependency*
|
||||
from the value returned by the rcu_dereference() or srcu_dereference()
|
||||
to that subsequent memory access.
|
||||
|
||||
A call to rcu_dereference() for a given RCU-protected pointer is
|
||||
usually paired with a call to rcu_assign_pointer() for that
|
||||
same pointer in much the same way that a call to smp_load_acquire() is
|
||||
paired with a call to smp_store_release(). Calls to rcu_dereference()
|
||||
and rcu_assign_pointer() are often buried in other APIs, for example,
|
||||
the RCU list API members defined in include/linux/rculist.h. For more
|
||||
information, please see the docbook headers in that file, the most
|
||||
recent LWN article on the RCU API (https://lwn.net/Articles/777036/),
|
||||
and of course the material in Documentation/RCU.
|
||||
|
||||
If the pointer value is manipulated between the rcu_dereference()
|
||||
that returned it and a later dereference(), please read
|
||||
Documentation/RCU/rcu_dereference.rst. It can also be quite helpful to
|
||||
review uses in the Linux kernel.
|
||||
|
||||
|
||||
Control Dependencies
|
||||
--------------------
|
||||
|
||||
A control dependency extends from a marked load (READ_ONCE() or stronger)
|
||||
through an "if" condition to a marked store (WRITE_ONCE() or stronger)
|
||||
that is executed only by one of the legs of that "if" statement.
|
||||
Control dependencies are so named because they are mediated by
|
||||
control-flow instructions such as comparisons and conditional branches.
|
||||
|
||||
In short, you can use a control dependency to enforce ordering between
|
||||
a READ_ONCE() and a WRITE_ONCE() when there is an "if" condition
|
||||
between them. The canonical example is as follows:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q)
|
||||
WRITE_ONCE(b, 1);
|
||||
|
||||
In this case, all CPUs would see the read from "a" as happening before
|
||||
the write to "b".
|
||||
|
||||
However, control dependencies are easily destroyed by compiler
|
||||
optimizations, so any use of control dependencies must take into account
|
||||
all of the compilers used to build the Linux kernel. Please see the
|
||||
"control-dependencies.txt" file for more information.
|
||||
|
||||
|
||||
Unordered Accesses
|
||||
==================
|
||||
|
||||
Each of these two categories of unordered accesses has a section below:
|
||||
|
||||
a. Unordered marked operations.
|
||||
|
||||
b. Unmarked C-language accesses.
|
||||
|
||||
|
||||
Unordered Marked Operations
|
||||
---------------------------
|
||||
|
||||
Unordered operations to different variables are just that, unordered.
|
||||
However, if a group of CPUs apply these operations to a single variable,
|
||||
all the CPUs will agree on the operation order. Of course, the ordering
|
||||
of unordered marked accesses can also be constrained using the mechanisms
|
||||
described earlier in this document.
|
||||
|
||||
These operations come in three categories:
|
||||
|
||||
o Marked writes, such as WRITE_ONCE() and atomic_set(). These
|
||||
primitives require the compiler to emit the corresponding store
|
||||
instructions in the expected execution order, thus suppressing
|
||||
a number of destructive optimizations. However, they provide no
|
||||
hardware ordering guarantees, and in fact many CPUs will happily
|
||||
reorder marked writes with each other or with other unordered
|
||||
operations, unless these operations are to the same variable.
|
||||
|
||||
o Marked reads, such as READ_ONCE() and atomic_read(). These
|
||||
primitives require the compiler to emit the corresponding load
|
||||
instructions in the expected execution order, thus suppressing
|
||||
a number of destructive optimizations. However, they provide no
|
||||
hardware ordering guarantees, and in fact many CPUs will happily
|
||||
reorder marked reads with each other or with other unordered
|
||||
operations, unless these operations are to the same variable.
|
||||
|
||||
o Unordered RMW atomic operations. These are non-value-returning
|
||||
RMW atomic operations whose names do not end in _acquire or
|
||||
_release, and also value-returning RMW operations whose names
|
||||
end in _relaxed. Examples include atomic_add(), atomic_or(),
|
||||
and atomic64_fetch_xor_relaxed(). These operations do carry
|
||||
out the specified RMW operation atomically, for example, five
|
||||
concurrent atomic_inc() operations applied to a given variable
|
||||
will reliably increase the value of that variable by five.
|
||||
However, many CPUs will happily reorder these operations with
|
||||
each other or with other unordered operations.
|
||||
|
||||
This category of operations can be efficiently ordered using
|
||||
smp_mb__before_atomic() and smp_mb__after_atomic(), as was
|
||||
discussed in the "RMW Ordering Augmentation Barriers" section.
|
||||
|
||||
In short, these operations can be freely reordered unless they are all
|
||||
operating on a single variable or unless they are constrained by one of
|
||||
the operations called out earlier in this document.
|
||||
|
||||
|
||||
Unmarked C-Language Accesses
|
||||
----------------------------
|
||||
|
||||
Unmarked C-language accesses are normal variable accesses to normal
|
||||
variables, that is, to variables that are not "volatile" and are not
|
||||
C11 atomic variables. These operations provide no ordering guarantees,
|
||||
and further do not guarantee "atomic" access. For example, the compiler
|
||||
might (and sometimes does) split a plain C-language store into multiple
|
||||
smaller stores. A load from that same variable running on some other
|
||||
CPU while such a store is executing might see a value that is a mashup
|
||||
of the old value and the new value.
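For example, a compiler would be within its rights to transform a
hypothetical 64-bit plain store "gval = 0x100000002;" into a pair of
32-bit stores, a minimal sketch of which (assuming a little-endian CPU)
looks like this:

        // The compiler might emit the equivalent of the following:
        ((u32 *)&gval)[0] = 0x2;        // Low-order half first ...
        ((u32 *)&gval)[1] = 0x1;        // ... then the high-order half.

A concurrent plain load of "gval" on some other CPU could then return 0x2,
which is neither the old value nor the new value.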
|
||||
|
||||
Unmarked C-language accesses are unordered, and are also subject to
|
||||
any number of compiler optimizations, many of which can break your
|
||||
concurrent code. It is possible to use unmarked C-language accesses for
|
||||
shared variables that are subject to concurrent access, but great care
|
||||
is required on an ongoing basis. The compiler-constraining barrier()
|
||||
primitive can be helpful, as can the various ordering primitives discussed
|
||||
in this document. It nevertheless bears repeating that use of unmarked
|
||||
C-language accesses requires careful attention to not just your code,
|
||||
but to all the compilers that might be used to build it. Such compilers
|
||||
might replace a series of loads with a single load, and might replace
|
||||
a series of stores with a single store. Some compilers will even split
|
||||
a single store into multiple smaller stores.
|
||||
|
||||
But there are some ways of using unmarked C-language accesses for shared
|
||||
variables without such worries:
|
||||
|
||||
o Guard all accesses to a given variable by a particular lock,
|
||||
so that there are never concurrent conflicting accesses to
|
||||
that variable. (There are "conflicting accesses" when
|
||||
(1) at least one of the concurrent accesses to a variable is an
|
||||
unmarked C-language access and (2) at least one of those
|
||||
accesses is a write, whether marked or not.)
|
||||
|
||||
o As above, but using other synchronization primitives such
|
||||
as reader-writer locks or sequence locks.
|
||||
|
||||
o Use locking or other means to ensure that all concurrent accesses
|
||||
to a given variable are reads.
|
||||
|
||||
o Restrict use of a given variable to statistics or heuristics
|
||||
where the occasional bogus value can be tolerated.
|
||||
|
||||
o Declare the accessed variables as C11 atomics.
|
||||
https://lwn.net/Articles/691128/
|
||||
|
||||
o Declare the accessed variables as "volatile".
|
||||
|
||||
If you need to live more dangerously, please do take the time to
|
||||
understand the compilers. One place to start is these two LWN
|
||||
articles:
|
||||
|
||||
Who's afraid of a big bad optimizing compiler?
|
||||
https://lwn.net/Articles/793253
|
||||
Calibrating your fear of big bad optimizing compilers
|
||||
https://lwn.net/Articles/799218
|
||||
|
||||
Used properly, unmarked C-language accesses can reduce overhead on
|
||||
fastpaths. However, the price is great care and continual attention
|
||||
to your compiler as new versions come out and as new optimizations
|
||||
are enabled.
|
|
@ -161,26 +161,8 @@ running LKMM litmus tests.
|
|||
DESCRIPTION OF FILES
|
||||
====================
|
||||
|
||||
Documentation/cheatsheet.txt
|
||||
Quick-reference guide to the Linux-kernel memory model.
|
||||
|
||||
Documentation/explanation.txt
|
||||
Describes the memory model in detail.
|
||||
|
||||
Documentation/litmus-tests.txt
|
||||
Describes the format, features, capabilities, and limitations
|
||||
of the litmus tests that LKMM can evaluate.
|
||||
|
||||
Documentation/recipes.txt
|
||||
Lists common memory-ordering patterns.
|
||||
|
||||
Documentation/references.txt
|
||||
Provides background reading.
|
||||
|
||||
Documentation/simple.txt
|
||||
Starting point for someone new to Linux-kernel concurrency.
|
||||
And also for those needing a reminder of the simpler approaches
|
||||
to concurrency!
|
||||
Documentation/README
|
||||
Guide to the other documents in the Documentation/ directory.
|
||||
|
||||
linux-kernel.bell
|
||||
Categorizes the relevant instructions, including memory
|
||||
|
|
|
@ -7,7 +7,9 @@ C CoRR+poonceonce+Once
|
|||
* reads from the same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -7,7 +7,9 @@ C CoRW+poonceonce+Once
|
|||
* a given variable and a later write to that same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -7,7 +7,9 @@ C CoWR+poonceonce+Once
|
|||
* given variable and a later read from that same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -7,7 +7,9 @@ C CoWW+poonceonce
|
|||
* writes to the same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -10,7 +10,10 @@ C IRIW+fencembonceonces+OnceOnce
|
|||
* process? This litmus test exercises LKMM's "propagation" rule.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -10,7 +10,10 @@ C IRIW+poonceonces+OnceOnce
|
|||
* different process?
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -7,7 +7,12 @@ C ISA2+pooncelock+pooncelock+pombonce
|
|||
* (in P0() and P1()) is visible to external process P2().
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
spinlock_t mylock;
|
||||
int x;
|
||||
int y;
|
||||
int z;
|
||||
}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
|
|
|
@ -9,7 +9,11 @@ C ISA2+poonceonces
|
|||
* of the smp_load_acquire() invocations are replaced by READ_ONCE()?
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
int z;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -11,7 +11,11 @@ C ISA2+pooncerelease+poacquirerelease+poacquireonce
|
|||
* (AKA non-rf) link, so release-acquire is all that is needed.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
int z;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -11,7 +11,10 @@ C LB+fencembonceonce+ctrlonceonce
|
|||
* another control dependency and order would still be maintained.)
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -8,7 +8,10 @@ C LB+poacquireonce+pooncerelease
|
|||
* to the other?
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -7,7 +7,10 @@ C LB+poonceonces
|
|||
* be prevented even with no explicit ordering?
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -8,23 +8,26 @@ C MP+fencewmbonceonce+fencermbonceonce
|
|||
* is usually better to use smp_store_release() and smp_load_acquire().
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_wmb();
|
||||
WRITE_ONCE(*y, 1);
|
||||
int buf;
|
||||
int flag;
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
P0(int *buf, int *flag) // Producer
|
||||
{
|
||||
WRITE_ONCE(*buf, 1);
|
||||
smp_wmb();
|
||||
WRITE_ONCE(*flag, 1);
|
||||
}
|
||||
|
||||
P1(int *buf, int *flag) // Consumer
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
r0 = READ_ONCE(*flag);
|
||||
smp_rmb();
|
||||
r1 = READ_ONCE(*x);
|
||||
r1 = READ_ONCE(*buf);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
||||
exists (1:r0=1 /\ 1:r1=0) (* Bad outcome. *)
|
||||
|
|
|
@ -10,25 +10,26 @@ C MP+onceassign+derefonce
|
|||
*)
|
||||
|
||||
{
|
||||
y=z;
|
||||
z=0;
|
||||
int *p=y;
|
||||
int x;
|
||||
int y=0;
|
||||
}
|
||||
|
||||
P0(int *x, int **y)
|
||||
P0(int *x, int **p) // Producer
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
rcu_assign_pointer(*y, x);
|
||||
rcu_assign_pointer(*p, x);
|
||||
}
|
||||
|
||||
P1(int *x, int **y)
|
||||
P1(int *x, int **p) // Consumer
|
||||
{
|
||||
int *r0;
|
||||
int r1;
|
||||
|
||||
rcu_read_lock();
|
||||
r0 = rcu_dereference(*y);
|
||||
r0 = rcu_dereference(*p);
|
||||
r1 = READ_ONCE(*r0);
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
exists (1:r0=x /\ 1:r1=0)
|
||||
exists (1:r0=x /\ 1:r1=0) (* Bad outcome. *)
|
||||
|
|
|
@ -11,9 +11,11 @@ C MP+polockmbonce+poacquiresilsil
|
|||
*)
|
||||
|
||||
{
|
||||
spinlock_t lo;
|
||||
int x;
|
||||
}
|
||||
|
||||
P0(spinlock_t *lo, int *x)
|
||||
P0(spinlock_t *lo, int *x) // Producer
|
||||
{
|
||||
spin_lock(lo);
|
||||
smp_mb__after_spinlock();
|
||||
|
@ -21,7 +23,7 @@ P0(spinlock_t *lo, int *x)
|
|||
spin_unlock(lo);
|
||||
}
|
||||
|
||||
P1(spinlock_t *lo, int *x)
|
||||
P1(spinlock_t *lo, int *x) // Consumer
|
||||
{
|
||||
int r1;
|
||||
int r2;
|
||||
|
@ -32,4 +34,4 @@ P1(spinlock_t *lo, int *x)
|
|||
r3 = spin_is_locked(lo);
|
||||
}
|
||||
|
||||
exists (1:r1=1 /\ 1:r2=0 /\ 1:r3=1)
|
||||
exists (1:r1=1 /\ 1:r2=0 /\ 1:r3=1) (* Bad outcome. *)
|
||||
|
|
|
@ -11,16 +11,18 @@ C MP+polockonce+poacquiresilsil
|
|||
*)
|
||||
|
||||
{
|
||||
spinlock_t lo;
|
||||
int x;
|
||||
}
|
||||
|
||||
P0(spinlock_t *lo, int *x)
|
||||
P0(spinlock_t *lo, int *x) // Producer
|
||||
{
|
||||
spin_lock(lo);
|
||||
WRITE_ONCE(*x, 1);
|
||||
spin_unlock(lo);
|
||||
}
|
||||
|
||||
P1(spinlock_t *lo, int *x)
|
||||
P1(spinlock_t *lo, int *x) // Consumer
|
||||
{
|
||||
int r1;
|
||||
int r2;
|
||||
|
@ -31,4 +33,4 @@ P1(spinlock_t *lo, int *x)
|
|||
r3 = spin_is_locked(lo);
|
||||
}
|
||||
|
||||
exists (1:r1=1 /\ 1:r2=0 /\ 1:r3=1)
|
||||
exists (1:r1=1 /\ 1:r2=0 /\ 1:r3=1) (* Bad outcome. *)
|
||||
|
|
|
@ -11,25 +11,29 @@ C MP+polocks
|
|||
* to see all prior accesses by those other CPUs.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
spinlock_t mylock;
|
||||
int buf;
|
||||
int flag;
|
||||
}
|
||||
|
||||
P0(int *buf, int *flag, spinlock_t *mylock) // Producer
|
||||
{
|
||||
WRITE_ONCE(*buf, 1);
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*y, 1);
|
||||
WRITE_ONCE(*flag, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *x, int *y, spinlock_t *mylock)
|
||||
P1(int *buf, int *flag, spinlock_t *mylock) // Consumer
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
spin_lock(mylock);
|
||||
r0 = READ_ONCE(*y);
|
||||
r0 = READ_ONCE(*flag);
|
||||
spin_unlock(mylock);
|
||||
r1 = READ_ONCE(*x);
|
||||
r1 = READ_ONCE(*buf);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
||||
exists (1:r0=1 /\ 1:r1=0) (* Bad outcome. *)
|
||||
|
|
|
@ -7,21 +7,24 @@ C MP+poonceonces
|
|||
* no ordering at all?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
int buf;
|
||||
int flag;
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
P0(int *buf, int *flag) // Producer
|
||||
{
|
||||
WRITE_ONCE(*buf, 1);
|
||||
WRITE_ONCE(*flag, 1);
|
||||
}
|
||||
|
||||
P1(int *buf, int *flag) // Consumer
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
r1 = READ_ONCE(*x);
|
||||
r0 = READ_ONCE(*flag);
|
||||
r1 = READ_ONCE(*buf);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
||||
exists (1:r0=1 /\ 1:r1=0) (* Bad outcome. *)
|
||||
|
|
|
@ -8,21 +8,24 @@ C MP+pooncerelease+poacquireonce
|
|||
* pattern.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_store_release(y, 1);
|
||||
int buf;
|
||||
int flag;
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
P0(int *buf, int *flag) // Producer
|
||||
{
|
||||
WRITE_ONCE(*buf, 1);
|
||||
smp_store_release(flag, 1);
|
||||
}
|
||||
|
||||
P1(int *buf, int *flag) // Consumer
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = smp_load_acquire(y);
|
||||
r1 = READ_ONCE(*x);
|
||||
r0 = smp_load_acquire(flag);
|
||||
r1 = READ_ONCE(*buf);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
||||
exists (1:r0=1 /\ 1:r1=0) (* Bad outcome. *)
|
||||
|
|
|
@ -11,25 +11,29 @@ C MP+porevlocks
|
|||
* see all prior accesses by those other CPUs.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
spinlock_t mylock;
|
||||
int buf;
|
||||
int flag;
|
||||
}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
P0(int *buf, int *flag, spinlock_t *mylock) // Consumer
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
r0 = READ_ONCE(*flag);
|
||||
spin_lock(mylock);
|
||||
r1 = READ_ONCE(*x);
|
||||
r1 = READ_ONCE(*buf);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *x, int *y, spinlock_t *mylock)
|
||||
P1(int *buf, int *flag, spinlock_t *mylock) // Producer
|
||||
{
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*buf, 1);
|
||||
spin_unlock(mylock);
|
||||
WRITE_ONCE(*y, 1);
|
||||
WRITE_ONCE(*flag, 1);
|
||||
}
|
||||
|
||||
exists (0:r0=1 /\ 0:r1=0)
|
||||
exists (0:r0=1 /\ 0:r1=0) (* Bad outcome. *)
|
||||
|
|
|
@ -9,7 +9,10 @@ C R+fencembonceonces
|
|||
* cause the resulting test to be allowed.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -8,7 +8,10 @@ C R+poonceonces
|
|||
* store propagation delays.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -7,7 +7,10 @@ C S+fencewmbonceonce+poacquireonce
|
|||
* store against a subsequent store?
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -9,7 +9,10 @@ C S+poonceonces
|
|||
* READ_ONCE(), is ordering preserved?
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -9,7 +9,10 @@ C SB+fencembonceonces
|
|||
* suffice, but not much else.)
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -8,7 +8,10 @@ C SB+poonceonces
|
|||
* variable that the preceding process reads.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -6,7 +6,10 @@ C SB+rfionceonce-poonceonces
|
|||
* This litmus test demonstrates that LKMM is not fully multicopy atomic.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|
|
@ -8,7 +8,10 @@ C WRC+poonceonces+Once
|
|||
* test has no ordering at all.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -10,7 +10,10 @@ C WRC+pooncerelease+fencermbonceonce+Once
|
|||
* is A-cumulative in LKMM.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
|
|
|
@ -9,7 +9,12 @@ C Z6.0+pooncelock+poonceLock+pombonce
|
|||
* by CPUs not holding that lock.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
spinlock_t mylock;
|
||||
int x;
|
||||
int y;
|
||||
int z;
|
||||
}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
|
|
|
@ -8,7 +8,12 @@ C Z6.0+pooncelock+pooncelock+pombonce
|
|||
* seen as ordered by a third process not holding that lock.
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
spinlock_t mylock;
|
||||
int x;
|
||||
int y;
|
||||
int z;
|
||||
}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
|
|
|
@ -14,7 +14,11 @@ C Z6.0+pooncerelease+poacquirerelease+fencembonceonce
|
|||
* involving locking.)
|
||||
*)
|
||||
|
||||
{}
|
||||
{
|
||||
int x;
|
||||
int y;
|
||||
int z;
|
||||
}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
|
|