Merge tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"These are the locking updates for v5.10:
- Add deadlock detection for recursive read-locks.
The rationale is outlined in commit 224ec489d3
("lockdep/Documention: Recursive read lock detection reasoning")

The main deadlock pattern we want to detect is:

    TASK A:                 TASK B:

    read_lock(X);
                            write_lock(X);
    read_lock_2(X);
- Add "latch sequence counters" (seqcount_latch_t):
A sequence counter variant where the counter even/odd value is used
to switch between two copies of protected data. This allows the
read path, typically NMIs, to safely interrupt the write side
critical section.
We utilize this new variant for sched-clock, and to make x86 TSC
handling safer.
- Other seqlock cleanups, fixes and enhancements
- KCSAN updates
- LKMM updates
- Misc updates, cleanups and fixes"
* tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"
lockdep: Fix lockdep recursion
lockdep: Fix usage_traceoverflow
locking/atomics: Check atomic-arch-fallback.h too
locking/seqlock: Tweak DEFINE_SEQLOCK() kernel doc
lockdep: Optimize the memory usage of circular queue
seqlock: Unbreak lockdep
seqlock: PREEMPT_RT: Do not starve seqlock_t writers
seqlock: seqcount_LOCKNAME_t: Introduce PREEMPT_RT support
seqlock: seqcount_t: Implement all read APIs as statement expressions
seqlock: Use unique prefix for seqcount_t property accessors
seqlock: seqcount_LOCKNAME_t: Standardize naming convention
seqlock: seqcount latch APIs: Only allow seqcount_latch_t
rbtree_latch: Use seqcount_latch_t
x86/tsc: Use seqcount_latch_t
timekeeping: Use seqcount_latch_t
time/sched_clock: Use seqcount_latch_t
seqlock: Introduce seqcount_latch_t
mm/swap: Do not abuse the seqcount_t latching API
time/sched_clock: Use raw_read_seqcount_latch() during suspend
...
@@ -392,3 +392,261 @@ Run the command and save the output, then compare against the output from
a later run of this command to identify the leakers. This same output
can also help you find situations where runtime lock initialization has
been omitted.

Recursive read locks:
---------------------
The rest of this document tries to prove that a certain type of cycle is
equivalent to deadlock possibility.

There are three types of lockers: writers (i.e. exclusive lockers, like
spin_lock() or write_lock()), non-recursive readers (i.e. shared lockers, like
down_read()) and recursive readers (recursive shared lockers, like
rcu_read_lock()). And we use the following notations for those lockers in the
rest of the document:

    W or E: stands for writers (exclusive lockers).
    r:      stands for non-recursive readers.
    R:      stands for recursive readers.
    S:      stands for all readers (non-recursive + recursive), as both are shared lockers.
    N:      stands for writers and non-recursive readers, as both are not recursive.

Obviously, N is "r or W" and S is "r or R".

Recursive readers, as their name indicates, are the lockers allowed to acquire
even inside the critical section of another reader of the same lock instance,
in other words, allowing nested read-side critical sections of one lock
instance.

Non-recursive readers, on the other hand, will cause a self deadlock if they
try to acquire inside the critical section of another reader of the same lock
instance.

The difference between recursive readers and non-recursive readers is that
recursive readers get blocked only by a write lock *holder*, while
non-recursive readers could get blocked by a write lock *waiter*. Consider the
following example:

    TASK A:                 TASK B:

    read_lock(X);
                            write_lock(X);
    read_lock_2(X);

Task A gets the reader (no matter whether recursive or non-recursive) on X via
read_lock() first. When task B tries to acquire the writer on X, it will block
and become a waiter for the writer on X. Now if read_lock_2() is a recursive
reader, task A will make progress, because writer waiters don't block recursive
readers, and there is no deadlock. However, if read_lock_2() is a non-recursive
reader, it will get blocked by the writer waiter B, causing a self deadlock.
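
The same distinction can be reproduced in userspace with a writer-preferring
POSIX rwlock, whose readers are non-recursive in exactly the sense used above.
The sketch below is illustrative only and is not part of this patch set; the
sleep()-based handoff merely forces the interleaving shown in the diagram
(build with "cc -pthread"):

	#define _GNU_SOURCE
	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	static pthread_rwlock_t x;

	static void *task_b(void *arg)
	{
		(void)arg;
		/* TASK B: write_lock(X); blocks and becomes a writer *waiter*. */
		pthread_rwlock_wrlock(&x);
		pthread_rwlock_unlock(&x);
		return NULL;
	}

	int main(void)
	{
		pthread_rwlockattr_t attr;
		pthread_t b;

		pthread_rwlockattr_init(&attr);
		/* Writer-preferring rwlocks have non-recursive readers. */
		pthread_rwlockattr_setkind_np(&attr,
				PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
		pthread_rwlock_init(&x, &attr);

		pthread_rwlock_rdlock(&x);		/* TASK A: read_lock(X) */
		pthread_create(&b, NULL, task_b, NULL);
		sleep(1);				/* let TASK B queue up as a waiter */

		/*
		 * TASK A: read_lock_2(X). A recursive reader (e.g. the kernel's
		 * rcu_read_lock()) would make progress here; this non-recursive
		 * reader blocks behind the waiting writer: a self deadlock.
		 */
		pthread_rwlock_rdlock(&x);

		/* Never reached while a writer is waiting. */
		pthread_rwlock_unlock(&x);
		pthread_rwlock_unlock(&x);
		pthread_join(b, NULL);
		return 0;
	}
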
Block conditions on readers/writers of the same lock instance:
--------------------------------------------------------------
There are simply four block conditions:

1.  Writers block other writers.
2.  Readers block writers.
3.  Writers block both recursive readers and non-recursive readers.
4.  Readers (recursive or not) don't block other recursive readers but
    may block non-recursive readers (because of the potentially co-existing
    writer waiters).

Block condition matrix, Y means the row blocks the column, and N means otherwise.

    +---+---+---+---+
    |   | E | r | R |
    +---+---+---+---+
    | E | Y | Y | Y |
    +---+---+---+---+
    | r | Y | Y | N |
    +---+---+---+---+
    | R | Y | Y | N |
    +---+---+---+---+

    (E: writers, r: non-recursive readers, R: recursive readers)
A recursive read lock, as its name indicates, is a lock that can be
acquired recursively. Unlike non-recursive read locks, recursive read locks
only get blocked by current write lock *holders*, not by write lock
*waiters*, for example:

    TASK A:                 TASK B:

    read_lock(X);

                            write_lock(X);

    read_lock(X);

is not a deadlock for recursive read locks, as while task B is waiting for
the lock X, the second read_lock() doesn't need to wait because it's a
recursive read lock. However, if the read_lock() is a non-recursive read lock,
then the above case is a deadlock, because even though the write_lock() in
TASK B cannot get the lock, it can still block the second read_lock() in
TASK A.

Note that a lock can be a write lock (exclusive lock), a non-recursive read
lock (non-recursive shared lock) or a recursive read lock (recursive shared
lock), depending on the lock operations used to acquire it (more specifically,
the value of the 'read' parameter for lock_acquire()). In other words, a single
lock instance has three types of acquisition depending on the acquisition
functions: exclusive, non-recursive read, and recursive read.

To be concise, we call write locks and non-recursive read locks
"non-recursive" locks, and recursive read locks "recursive" locks.

Recursive locks don't block each other, while non-recursive locks do (this is
even true for two non-recursive read locks). A non-recursive lock can block the
corresponding recursive lock, and vice versa.

A deadlock case with recursive locks involved is as follows:

    TASK A:                 TASK B:

    read_lock(X);
                            read_lock(Y);
    write_lock(Y);
                            write_lock(X);

Task A is waiting for task B to read_unlock() Y and task B is waiting for task
A to read_unlock() X.
Dependency types and strong dependency paths:
---------------------------------------------
Lock dependencies record the order of acquisition of a pair of locks, and
because there are 3 types of lockers, there are, in theory, 9 types of lock
dependencies, but we can show that 4 types of lock dependencies are enough for
deadlock detection.

For each lock dependency:

    L1 -> L2

, which means lockdep has seen L1 held before L2 held in the same context at runtime.
In deadlock detection, we care whether we could get blocked on L2 with L1 held,
IOW, whether there is a locker L3 such that L1 blocks L3 and L2 gets blocked by
L3. So we only care about 1) what L1 blocks and 2) what blocks L2. As a result,
we can combine recursive readers and non-recursive readers for L1 (as they
block the same types) and we can combine writers and non-recursive readers for
L2 (as they get blocked by the same types).

With the above combination for simplification, there are 4 types of dependency
edges in the lockdep graph:

1) -(ER)->: exclusive writer to recursive reader dependency, "X -(ER)-> Y" means
   X -> Y and X is a writer and Y is a recursive reader.

2) -(EN)->: exclusive writer to non-recursive locker dependency, "X -(EN)-> Y" means
   X -> Y and X is a writer and Y is either a writer or a non-recursive reader.

3) -(SR)->: shared reader to recursive reader dependency, "X -(SR)-> Y" means
   X -> Y and X is a reader (recursive or not) and Y is a recursive reader.

4) -(SN)->: shared reader to non-recursive locker dependency, "X -(SN)-> Y" means
   X -> Y and X is a reader (recursive or not) and Y is either a writer or a
   non-recursive reader.
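
To make the combination concrete, the edge type of a dependency L1 -> L2 can be
computed from the two acquisition types alone. The helper below is an
illustrative sketch, not the kernel's lockdep code; it only assumes the encoding
of the 'read' parameter of lock_acquire() mentioned above (0: exclusive,
1: non-recursive read, 2: recursive read):

	enum dep_edge { DEP_ER, DEP_EN, DEP_SR, DEP_SN };

	static enum dep_edge classify_dep(int l1_read, int l2_read)
	{
		/*
		 * For L1 only what it blocks matters: writers (E) block every
		 * locker, while readers (S), recursive or not, never block
		 * recursive readers.
		 */
		int l1_exclusive = (l1_read == 0);
		/*
		 * For L2 only what blocks it matters: recursive readers (R) are
		 * blocked solely by writers, while writers and non-recursive
		 * readers (N) are blocked by readers and writers alike.
		 */
		int l2_recursive = (l2_read == 2);

		if (l1_exclusive)
			return l2_recursive ? DEP_ER : DEP_EN;

		return l2_recursive ? DEP_SR : DEP_SN;
	}

In the two-task example that follows, TASK A contributes
classify_dep(1, 0) == DEP_SN for X -> Y and TASK B contributes
classify_dep(0, 0) == DEP_EN.
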
Note that given two locks, they may have multiple dependencies between them,
for example:

    TASK A:

    read_lock(X);
    write_lock(Y);
    ...

    TASK B:

    write_lock(X);
    write_lock(Y);

, so we have both X -(SN)-> Y and X -(EN)-> Y in the dependency graph.

We use -(xN)-> to represent edges that are either -(EN)-> or -(SN)->, and
similarly for -(Ex)->, -(xR)-> and -(Sx)->.

A "path" is a series of conjunct dependency edges in the graph. We define a
"strong" path, which indicates the strong dependency throughout each dependency
in the path, as a path that doesn't have two conjunct edges (dependencies) of
the form -(xR)-> and -(Sx)->. In other words, a "strong" path is a path from
one lock walking to another through the lock dependencies, and if X -> Y -> Z
is in the path (where X, Y, Z are locks), and the walk from X to Y is through a
-(SR)-> or -(ER)-> dependency, then the walk from Y to Z must not be through a
-(SN)-> or -(SR)-> dependency.

We will see why the path is called "strong" in the next section.
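
The next section shows that a closed strong path (a "strong circle") is the
criterion that is actually checked. As a purely illustrative sketch of the rule
just stated (not the kernel's BFS implementation, which encodes the same
information in the per-edge 'dep' and 'only_xr' bits of struct lock_list),
deciding whether a closed path is strong amounts to:

	/* Same edge encoding as the previous sketch. */
	enum dep_edge { DEP_ER, DEP_EN, DEP_SR, DEP_SN };

	static int edge_is_xR(enum dep_edge e)	/* ...-> Y with Y taken as recursive reader */
	{
		return e == DEP_ER || e == DEP_SR;
	}

	static int edge_is_Sx(enum dep_edge e)	/* X -> ... with X taken as a reader */
	{
		return e == DEP_SR || e == DEP_SN;
	}

	/*
	 * A closed path edges[0..n-1] (edges[i] goes from lock i to lock i+1,
	 * wrapping around) is strong iff no -(xR)-> edge is immediately
	 * followed by an -(Sx)-> edge.
	 */
	static int cycle_is_strong(const enum dep_edge *edges, int n)
	{
		int i;

		for (i = 0; i < n; i++)
			if (edge_is_xR(edges[i]) && edge_is_Sx(edges[(i + 1) % n]))
				return 0;
		return 1;
	}
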
Recursive Read Deadlock Detection:
----------------------------------

We now prove two things:

Lemma 1:

If there is a closed strong path (i.e. a strong circle), then there is a
combination of locking sequences that causes deadlock. I.e. a strong circle is
sufficient for deadlock detection.

Lemma 2:

If there is no closed strong path (i.e. no strong circle), then there is no
combination of locking sequences that could cause deadlock. I.e. strong
circles are necessary for deadlock detection.

With these two Lemmas, we can easily say that a closed strong path is both
sufficient and necessary for deadlocks, therefore a closed strong path is
equivalent to deadlock possibility. Since a closed strong path stands for a
dependency chain that could cause deadlocks, we call it "strong", considering
there are dependency circles that won't cause deadlocks.

Proof for sufficiency (Lemma 1):

Let's say we have a strong circle:

    L1 -> L2 ... -> Ln -> L1

, which means we have dependencies:

    L1 -> L2
    L2 -> L3
    ...
    Ln-1 -> Ln
    Ln -> L1

We can now construct a combination of locking sequences that causes deadlock:

Firstly let's make one CPU/task get the L1 in L1 -> L2, and then another get
the L2 in L2 -> L3, and so on. After this, all of the Lx in Lx -> Lx+1 are
held by different CPU/tasks.

And then because we have L1 -> L2, the holder of L1 is going to acquire L2
in L1 -> L2. However, L2 is already held by another CPU/task, and in addition
L1 -> L2 and L2 -> L3 are not an -(xR)-> and -(Sx)-> pair (the definition of
strong), which means either the L2 in L1 -> L2 is a non-recursive locker
(blocked by anyone) or the L2 in L2 -> L3 is a writer (blocking anyone).
Therefore the holder of L1 cannot get L2; it has to wait for L2's holder to
release.

Moreover, we can draw a similar conclusion for L2's holder: it has to wait for
L3's holder to release, and so on. We can now prove that Lx's holder has to
wait for Lx+1's holder to release, and note that Ln+1 is L1, so we have a
circular waiting scenario in which nobody can make progress, therefore a
deadlock.

Proof for necessity (Lemma 2):

Lemma 2 is equivalent to: if there is a deadlock scenario, then there must be a
strong circle in the dependency graph.

According to Wikipedia[1], if there is a deadlock, then there must be a circular
waiting scenario, which means there are N CPU/tasks, where CPU/task P1 is
waiting for a lock held by P2, P2 is waiting for a lock held by P3, ..., and Pn
is waiting for a lock held by P1. Let's name the lock Px is waiting for as Lx.
Since P1 is waiting for L1 and holding Ln, we will have Ln -> L1 in the
dependency graph. Similarly, we have L1 -> L2, L2 -> L3, ..., Ln-1 -> Ln in the
dependency graph, which means we have a circle:

    Ln -> L1 -> L2 -> ... -> Ln

, and now let's prove the circle is strong:

For a lock Lx, Px contributes the dependency Lx-1 -> Lx and Px+1 contributes
the dependency Lx -> Lx+1. Since Px is waiting for Px+1 to release Lx, it's
impossible that Lx on Px+1 is a reader and Lx on Px is a recursive reader,
because readers (no matter whether recursive or not) don't block recursive
readers. Therefore Lx-1 -> Lx and Lx -> Lx+1 cannot be an -(xR)-> -(Sx)-> pair,
and this is true for any lock in the circle; therefore, the circle is strong.

References:
-----------
[1]: https://en.wikipedia.org/wiki/Deadlock
[2]: Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill.

@@ -139,6 +139,24 @@ with the associated LOCKTYPE lock acquired.

Read path: same as in :ref:`seqcount_t`.


.. _seqcount_latch_t:

Latch sequence counters (``seqcount_latch_t``)
----------------------------------------------

Latch sequence counters are a multiversion concurrency control mechanism
where the embedded seqcount_t counter even/odd value is used to switch
between two copies of protected data. This allows the sequence counter
read path to safely interrupt its own write side critical section.

Use seqcount_latch_t when the write side sections cannot be protected
from interruption by readers. This is typically the case when the read
side can be invoked from NMI handlers.

Check `raw_write_seqcount_latch()` for more information.
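
As an illustration of the pattern described above, here is a minimal usage
sketch. It is not taken from the patch: the 'latched' structure and the helper
names are invented, while seqcount_latch_init(), raw_write_seqcount_latch(),
raw_read_seqcount_latch() and read_seqcount_latch_retry() are the real API
introduced by this series:

	#include <linux/seqlock.h>
	#include <linux/types.h>

	static struct {
		seqcount_latch_t	latch;
		u64			copy[2];	/* two versions of the data */
	} latched;

	static void latched_init(void)
	{
		seqcount_latch_init(&latched.latch);
	}

	/* Write side: must be serialized against other writers by the caller. */
	static void latched_write(u64 val)
	{
		/* Switch readers to copy[1], then update copy[0] ... */
		raw_write_seqcount_latch(&latched.latch);
		latched.copy[0] = val;

		/* ... then switch readers back to copy[0] and update copy[1]. */
		raw_write_seqcount_latch(&latched.latch);
		latched.copy[1] = val;
	}

	/* Read side: safe even from an NMI interrupting latched_write(). */
	static u64 latched_read(void)
	{
		unsigned int seq;
		u64 val;

		do {
			seq = raw_read_seqcount_latch(&latched.latch);
			val = latched.copy[seq & 1];
		} while (read_seqcount_latch_retry(&latched.latch, seq));

		return val;
	}
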

.. _seqlock_t:

Sequential locks (``seqlock_t``)

@@ -54,7 +54,7 @@ struct clocksource *art_related_clocksource;

 struct cyc2ns {
 	struct cyc2ns_data data[2];	/*  0 + 2*16 = 32 */
-	seqcount_t	   seq;		/* 32 + 4    = 36 */
+	seqcount_latch_t   seq;		/* 32 + 4    = 36 */

 }; /* fits one cacheline */

@@ -73,14 +73,14 @@ __always_inline void cyc2ns_read_begin(struct cyc2ns_data *data)
 	preempt_disable_notrace();

 	do {
-		seq = this_cpu_read(cyc2ns.seq.sequence);
+		seq = this_cpu_read(cyc2ns.seq.seqcount.sequence);
 		idx = seq & 1;

 		data->cyc2ns_offset = this_cpu_read(cyc2ns.data[idx].cyc2ns_offset);
 		data->cyc2ns_mul    = this_cpu_read(cyc2ns.data[idx].cyc2ns_mul);
 		data->cyc2ns_shift  = this_cpu_read(cyc2ns.data[idx].cyc2ns_shift);

-	} while (unlikely(seq != this_cpu_read(cyc2ns.seq.sequence)));
+	} while (unlikely(seq != this_cpu_read(cyc2ns.seq.seqcount.sequence)));
 }

 __always_inline void cyc2ns_read_end(void)

@@ -186,7 +186,7 @@ static void __init cyc2ns_init_boot_cpu(void)
 {
 	struct cyc2ns *c2n = this_cpu_ptr(&cyc2ns);

-	seqcount_init(&c2n->seq);
+	seqcount_latch_init(&c2n->seq);
 	__set_cyc2ns_scale(tsc_khz, smp_processor_id(), rdtsc());
 }

@@ -203,7 +203,7 @@ static void __init cyc2ns_init_secondary_cpus(void)

 	for_each_possible_cpu(cpu) {
 		if (cpu != this_cpu) {
-			seqcount_init(&c2n->seq);
+			seqcount_latch_init(&c2n->seq);
 			c2n = per_cpu_ptr(&cyc2ns, cpu);
 			c2n->data[0] = data[0];
 			c2n->data[1] = data[1];
@ -67,7 +67,7 @@ static inline void change_bit(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
return arch_test_and_set_bit(nr, addr);
|
||||
}
|
||||
|
||||
|
@ -80,7 +80,7 @@ static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
return arch_test_and_clear_bit(nr, addr);
|
||||
}
|
||||
|
||||
|
@ -93,7 +93,7 @@ static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
return arch_test_and_change_bit(nr, addr);
|
||||
}
|
||||
|
||||
|
|
|
@ -52,7 +52,7 @@ static inline void __clear_bit_unlock(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool test_and_set_bit_lock(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
return arch_test_and_set_bit_lock(nr, addr);
|
||||
}
|
||||
|
||||
|
|
|
@ -58,6 +58,30 @@ static inline void __change_bit(long nr, volatile unsigned long *addr)
|
|||
arch___change_bit(nr, addr);
|
||||
}
|
||||
|
||||
static inline void __instrument_read_write_bitop(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
if (IS_ENABLED(CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC)) {
|
||||
/*
|
||||
* We treat non-atomic read-write bitops a little more special.
|
||||
* Given the operations here only modify a single bit, assuming
|
||||
* non-atomicity of the writer is sufficient may be reasonable
|
||||
* for certain usage (and follows the permissible nature of the
|
||||
* assume-plain-writes-atomic rule):
|
||||
* 1. report read-modify-write races -> check read;
|
||||
* 2. do not report races with marked readers, but do report
|
||||
* races with unmarked readers -> check "atomic" write.
|
||||
*/
|
||||
kcsan_check_read(addr + BIT_WORD(nr), sizeof(long));
|
||||
/*
|
||||
* Use generic write instrumentation, in case other sanitizers
|
||||
* or tools are enabled alongside KCSAN.
|
||||
*/
|
||||
instrument_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
} else {
|
||||
instrument_read_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* __test_and_set_bit - Set a bit and return its old value
|
||||
* @nr: Bit to set
|
||||
|
@ -68,7 +92,7 @@ static inline void __change_bit(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
__instrument_read_write_bitop(nr, addr);
|
||||
return arch___test_and_set_bit(nr, addr);
|
||||
}
|
||||
|
||||
|
@ -82,7 +106,7 @@ static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
__instrument_read_write_bitop(nr, addr);
|
||||
return arch___test_and_clear_bit(nr, addr);
|
||||
}
|
||||
|
||||
|
@ -96,7 +120,7 @@ static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
|
|||
*/
|
||||
static inline bool __test_and_change_bit(long nr, volatile unsigned long *addr)
|
||||
{
|
||||
instrument_write(addr + BIT_WORD(nr), sizeof(long));
|
||||
__instrument_read_write_bitop(nr, addr);
|
||||
return arch___test_and_change_bit(nr, addr);
|
||||
}
|
||||
|
||||
|
|
|
@ -42,6 +42,21 @@ static __always_inline void instrument_write(const volatile void *v, size_t size
|
|||
kcsan_check_write(v, size);
|
||||
}
|
||||
|
||||
/**
|
||||
* instrument_read_write - instrument regular read-write access
|
||||
*
|
||||
* Instrument a regular write access. The instrumentation should be inserted
|
||||
* before the actual write happens.
|
||||
*
|
||||
* @ptr address of access
|
||||
* @size size of access
|
||||
*/
|
||||
static __always_inline void instrument_read_write(const volatile void *v, size_t size)
|
||||
{
|
||||
kasan_check_write(v, size);
|
||||
kcsan_check_read_write(v, size);
|
||||
}
|
||||
|
||||
/**
|
||||
* instrument_atomic_read - instrument atomic read access
|
||||
*
|
||||
|
@ -72,6 +87,21 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
|
|||
kcsan_check_atomic_write(v, size);
|
||||
}
|
||||
|
||||
/**
|
||||
* instrument_atomic_read_write - instrument atomic read-write access
|
||||
*
|
||||
* Instrument an atomic read-write access. The instrumentation should be
|
||||
* inserted before the actual write happens.
|
||||
*
|
||||
* @ptr address of access
|
||||
* @size size of access
|
||||
*/
|
||||
static __always_inline void instrument_atomic_read_write(const volatile void *v, size_t size)
|
||||
{
|
||||
kasan_check_write(v, size);
|
||||
kcsan_check_atomic_read_write(v, size);
|
||||
}
|
||||
|
||||
/**
|
||||
* instrument_copy_to_user - instrument reads of copy_to_user
|
||||
*
|
||||
|
|
|
@ -7,19 +7,13 @@
|
|||
#include <linux/compiler_attributes.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
/*
|
||||
* ACCESS TYPE MODIFIERS
|
||||
*
|
||||
* <none>: normal read access;
|
||||
* WRITE : write access;
|
||||
* ATOMIC: access is atomic;
|
||||
* ASSERT: access is not a regular access, but an assertion;
|
||||
* SCOPED: access is a scoped access;
|
||||
*/
|
||||
#define KCSAN_ACCESS_WRITE 0x1
|
||||
#define KCSAN_ACCESS_ATOMIC 0x2
|
||||
#define KCSAN_ACCESS_ASSERT 0x4
|
||||
#define KCSAN_ACCESS_SCOPED 0x8
|
||||
/* Access types -- if KCSAN_ACCESS_WRITE is not set, the access is a read. */
|
||||
#define KCSAN_ACCESS_WRITE (1 << 0) /* Access is a write. */
|
||||
#define KCSAN_ACCESS_COMPOUND (1 << 1) /* Compounded read-write instrumentation. */
|
||||
#define KCSAN_ACCESS_ATOMIC (1 << 2) /* Access is atomic. */
|
||||
/* The following are special, and never due to compiler instrumentation. */
|
||||
#define KCSAN_ACCESS_ASSERT (1 << 3) /* Access is an assertion. */
|
||||
#define KCSAN_ACCESS_SCOPED (1 << 4) /* Access is a scoped access. */
|
||||
|
||||
/*
|
||||
* __kcsan_*: Always calls into the runtime when KCSAN is enabled. This may be used
|
||||
|
@ -204,6 +198,15 @@ static inline void __kcsan_disable_current(void) { }
|
|||
#define __kcsan_check_write(ptr, size) \
|
||||
__kcsan_check_access(ptr, size, KCSAN_ACCESS_WRITE)
|
||||
|
||||
/**
|
||||
* __kcsan_check_read_write - check regular read-write access for races
|
||||
*
|
||||
* @ptr: address of access
|
||||
* @size: size of access
|
||||
*/
|
||||
#define __kcsan_check_read_write(ptr, size) \
|
||||
__kcsan_check_access(ptr, size, KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE)
|
||||
|
||||
/**
|
||||
* kcsan_check_read - check regular read access for races
|
||||
*
|
||||
|
@ -221,18 +224,30 @@ static inline void __kcsan_disable_current(void) { }
|
|||
#define kcsan_check_write(ptr, size) \
|
||||
kcsan_check_access(ptr, size, KCSAN_ACCESS_WRITE)
|
||||
|
||||
/**
|
||||
* kcsan_check_read_write - check regular read-write access for races
|
||||
*
|
||||
* @ptr: address of access
|
||||
* @size: size of access
|
||||
*/
|
||||
#define kcsan_check_read_write(ptr, size) \
|
||||
kcsan_check_access(ptr, size, KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE)
|
||||
|
||||
/*
|
||||
* Check for atomic accesses: if atomic accesses are not ignored, this simply
|
||||
* aliases to kcsan_check_access(), otherwise becomes a no-op.
|
||||
*/
|
||||
#ifdef CONFIG_KCSAN_IGNORE_ATOMICS
|
||||
#define kcsan_check_atomic_read(...) do { } while (0)
|
||||
#define kcsan_check_atomic_write(...) do { } while (0)
|
||||
#define kcsan_check_atomic_read(...) do { } while (0)
|
||||
#define kcsan_check_atomic_write(...) do { } while (0)
|
||||
#define kcsan_check_atomic_read_write(...) do { } while (0)
|
||||
#else
|
||||
#define kcsan_check_atomic_read(ptr, size) \
|
||||
kcsan_check_access(ptr, size, KCSAN_ACCESS_ATOMIC)
|
||||
#define kcsan_check_atomic_write(ptr, size) \
|
||||
kcsan_check_access(ptr, size, KCSAN_ACCESS_ATOMIC | KCSAN_ACCESS_WRITE)
|
||||
#define kcsan_check_atomic_read_write(ptr, size) \
|
||||
kcsan_check_access(ptr, size, KCSAN_ACCESS_ATOMIC | KCSAN_ACCESS_WRITE | KCSAN_ACCESS_COMPOUND)
|
||||
#endif
|
||||
|
||||
/**
|
||||
|
|
|
@ -54,7 +54,11 @@ struct lock_list {
|
|||
struct lock_class *class;
|
||||
struct lock_class *links_to;
|
||||
const struct lock_trace *trace;
|
||||
int distance;
|
||||
u16 distance;
|
||||
/* bitmap of different dependencies from head to this */
|
||||
u8 dep;
|
||||
/* used by BFS to record whether "prev -> this" only has -(*R)-> */
|
||||
u8 only_xr;
|
||||
|
||||
/*
|
||||
* The parent field is used to implement breadth-first search, and the
|
||||
|
@ -469,6 +473,20 @@ static inline void print_irqtrace_events(struct task_struct *curr)
|
|||
}
|
||||
#endif
|
||||
|
||||
/* Variable used to make lockdep treat read_lock() as recursive in selftests */
|
||||
#ifdef CONFIG_DEBUG_LOCKING_API_SELFTESTS
|
||||
extern unsigned int force_read_lock_recursive;
|
||||
#else /* CONFIG_DEBUG_LOCKING_API_SELFTESTS */
|
||||
#define force_read_lock_recursive 0
|
||||
#endif /* CONFIG_DEBUG_LOCKING_API_SELFTESTS */
|
||||
|
||||
#ifdef CONFIG_LOCKDEP
|
||||
extern bool read_lock_is_recursive(void);
|
||||
#else /* CONFIG_LOCKDEP */
|
||||
/* If !LOCKDEP, the value is meaningless */
|
||||
#define read_lock_is_recursive() 0
|
||||
#endif
|
||||
|
||||
/*
|
||||
* For trivial one-depth nesting of a lock-class, the following
|
||||
* global define can be used. (Subsystems with multiple levels
|
||||
|
@ -490,7 +508,14 @@ static inline void print_irqtrace_events(struct task_struct *curr)
|
|||
#define spin_release(l, i) lock_release(l, i)
|
||||
|
||||
#define rwlock_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
|
||||
#define rwlock_acquire_read(l, s, t, i) lock_acquire_shared_recursive(l, s, t, NULL, i)
|
||||
#define rwlock_acquire_read(l, s, t, i) \
|
||||
do { \
|
||||
if (read_lock_is_recursive()) \
|
||||
lock_acquire_shared_recursive(l, s, t, NULL, i); \
|
||||
else \
|
||||
lock_acquire_shared(l, s, t, NULL, i); \
|
||||
} while (0)
|
||||
|
||||
#define rwlock_release(l, i) lock_release(l, i)
|
||||
|
||||
#define seqcount_acquire(l, s, t, i) lock_acquire_exclusive(l, s, t, NULL, i)
|
||||
|
@ -512,19 +537,19 @@ static inline void print_irqtrace_events(struct task_struct *curr)
|
|||
#define lock_map_release(l) lock_release(l, _THIS_IP_)
|
||||
|
||||
#ifdef CONFIG_PROVE_LOCKING
|
||||
# define might_lock(lock) \
|
||||
# define might_lock(lock) \
|
||||
do { \
|
||||
typecheck(struct lockdep_map *, &(lock)->dep_map); \
|
||||
lock_acquire(&(lock)->dep_map, 0, 0, 0, 1, NULL, _THIS_IP_); \
|
||||
lock_release(&(lock)->dep_map, _THIS_IP_); \
|
||||
} while (0)
|
||||
# define might_lock_read(lock) \
|
||||
# define might_lock_read(lock) \
|
||||
do { \
|
||||
typecheck(struct lockdep_map *, &(lock)->dep_map); \
|
||||
lock_acquire(&(lock)->dep_map, 0, 0, 1, 1, NULL, _THIS_IP_); \
|
||||
lock_release(&(lock)->dep_map, _THIS_IP_); \
|
||||
} while (0)
|
||||
# define might_lock_nested(lock, subclass) \
|
||||
# define might_lock_nested(lock, subclass) \
|
||||
do { \
|
||||
typecheck(struct lockdep_map *, &(lock)->dep_map); \
|
||||
lock_acquire(&(lock)->dep_map, subclass, 0, 1, 1, NULL, \
|
||||
|
@ -534,44 +559,39 @@ do { \
|
|||
|
||||
DECLARE_PER_CPU(int, hardirqs_enabled);
|
||||
DECLARE_PER_CPU(int, hardirq_context);
|
||||
DECLARE_PER_CPU(unsigned int, lockdep_recursion);
|
||||
|
||||
/*
|
||||
* The below lockdep_assert_*() macros use raw_cpu_read() to access the above
|
||||
* per-cpu variables. This is required because this_cpu_read() will potentially
|
||||
* call into preempt/irq-disable and that obviously isn't right. This is also
|
||||
* correct because when IRQs are enabled, it doesn't matter if we accidentally
|
||||
* read the value from our previous CPU.
|
||||
*/
|
||||
#define __lockdep_enabled (debug_locks && !this_cpu_read(lockdep_recursion))
|
||||
|
||||
#define lockdep_assert_irqs_enabled() \
|
||||
do { \
|
||||
WARN_ON_ONCE(debug_locks && !raw_cpu_read(hardirqs_enabled)); \
|
||||
WARN_ON_ONCE(__lockdep_enabled && !this_cpu_read(hardirqs_enabled)); \
|
||||
} while (0)
|
||||
|
||||
#define lockdep_assert_irqs_disabled() \
|
||||
do { \
|
||||
WARN_ON_ONCE(debug_locks && raw_cpu_read(hardirqs_enabled)); \
|
||||
WARN_ON_ONCE(__lockdep_enabled && this_cpu_read(hardirqs_enabled)); \
|
||||
} while (0)
|
||||
|
||||
#define lockdep_assert_in_irq() \
|
||||
do { \
|
||||
WARN_ON_ONCE(debug_locks && !raw_cpu_read(hardirq_context)); \
|
||||
WARN_ON_ONCE(__lockdep_enabled && !this_cpu_read(hardirq_context)); \
|
||||
} while (0)
|
||||
|
||||
#define lockdep_assert_preemption_enabled() \
|
||||
do { \
|
||||
WARN_ON_ONCE(IS_ENABLED(CONFIG_PREEMPT_COUNT) && \
|
||||
debug_locks && \
|
||||
__lockdep_enabled && \
|
||||
(preempt_count() != 0 || \
|
||||
!raw_cpu_read(hardirqs_enabled))); \
|
||||
!this_cpu_read(hardirqs_enabled))); \
|
||||
} while (0)
|
||||
|
||||
#define lockdep_assert_preemption_disabled() \
|
||||
do { \
|
||||
WARN_ON_ONCE(IS_ENABLED(CONFIG_PREEMPT_COUNT) && \
|
||||
debug_locks && \
|
||||
__lockdep_enabled && \
|
||||
(preempt_count() == 0 && \
|
||||
raw_cpu_read(hardirqs_enabled))); \
|
||||
this_cpu_read(hardirqs_enabled))); \
|
||||
} while (0)
|
||||
|
||||
#else
|
||||
|
|
|
@ -35,8 +35,12 @@ enum lockdep_wait_type {
|
|||
/*
|
||||
* We'd rather not expose kernel/lockdep_states.h this wide, but we do need
|
||||
* the total number of states... :-(
|
||||
*
|
||||
* XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
|
||||
* of those we generates 4 states, Additionally we report on USED and USED_READ.
|
||||
*/
|
||||
#define XXX_LOCK_USAGE_STATES (1+2*4)
|
||||
#define XXX_LOCK_USAGE_STATES 2
|
||||
#define LOCK_TRACE_STATES (XXX_LOCK_USAGE_STATES*4 + 2)
|
||||
|
||||
/*
|
||||
* NR_LOCKDEP_CACHING_CLASSES ... Number of classes
|
||||
|
@ -106,7 +110,7 @@ struct lock_class {
|
|||
* IRQ/softirq usage tracking bits:
|
||||
*/
|
||||
unsigned long usage_mask;
|
||||
const struct lock_trace *usage_traces[XXX_LOCK_USAGE_STATES];
|
||||
const struct lock_trace *usage_traces[LOCK_TRACE_STATES];
|
||||
|
||||
/*
|
||||
* Generation counter, when doing certain classes of graph walking,
|
||||
|
|
|
@ -42,8 +42,8 @@ struct latch_tree_node {
|
|||
};
|
||||
|
||||
struct latch_tree_root {
|
||||
seqcount_t seq;
|
||||
struct rb_root tree[2];
|
||||
seqcount_latch_t seq;
|
||||
struct rb_root tree[2];
|
||||
};
|
||||
|
||||
/**
|
||||
|
@ -206,7 +206,7 @@ latch_tree_find(void *key, struct latch_tree_root *root,
|
|||
do {
|
||||
seq = raw_read_seqcount_latch(&root->seq);
|
||||
node = __lt_find(key, root, seq & 1, ops->comp);
|
||||
} while (read_seqcount_retry(&root->seq, seq));
|
||||
} while (read_seqcount_latch_retry(&root->seq, seq));
|
||||
|
||||
return node;
|
||||
}
|
||||
|
|
|
@ -165,7 +165,7 @@ static inline unsigned int refcount_read(const refcount_t *r)
|
|||
*
|
||||
* Return: false if the passed refcount is 0, true otherwise
|
||||
*/
|
||||
static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
|
||||
static inline __must_check bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
|
||||
{
|
||||
int old = refcount_read(r);
|
||||
|
||||
|
@ -174,12 +174,20 @@ static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
|
|||
break;
|
||||
} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
|
||||
|
||||
if (oldp)
|
||||
*oldp = old;
|
||||
|
||||
if (unlikely(old < 0 || old + i < 0))
|
||||
refcount_warn_saturate(r, REFCOUNT_ADD_NOT_ZERO_OVF);
|
||||
|
||||
return old;
|
||||
}
|
||||
|
||||
static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
|
||||
{
|
||||
return __refcount_add_not_zero(i, r, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
* refcount_add - add a value to a refcount
|
||||
* @i: the value to add to the refcount
|
||||
|
@ -196,16 +204,24 @@ static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
|
|||
* cases, refcount_inc(), or one of its variants, should instead be used to
|
||||
* increment a reference count.
|
||||
*/
|
||||
static inline void refcount_add(int i, refcount_t *r)
|
||||
static inline void __refcount_add(int i, refcount_t *r, int *oldp)
|
||||
{
|
||||
int old = atomic_fetch_add_relaxed(i, &r->refs);
|
||||
|
||||
if (oldp)
|
||||
*oldp = old;
|
||||
|
||||
if (unlikely(!old))
|
||||
refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
|
||||
else if (unlikely(old < 0 || old + i < 0))
|
||||
refcount_warn_saturate(r, REFCOUNT_ADD_OVF);
|
||||
}
|
||||
|
||||
static inline void refcount_add(int i, refcount_t *r)
|
||||
{
|
||||
__refcount_add(i, r, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
* refcount_inc_not_zero - increment a refcount unless it is 0
|
||||
* @r: the refcount to increment
|
||||
|
@ -219,9 +235,14 @@ static inline void refcount_add(int i, refcount_t *r)
|
|||
*
|
||||
* Return: true if the increment was successful, false otherwise
|
||||
*/
|
||||
static inline __must_check bool __refcount_inc_not_zero(refcount_t *r, int *oldp)
|
||||
{
|
||||
return __refcount_add_not_zero(1, r, oldp);
|
||||
}
|
||||
|
||||
static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
|
||||
{
|
||||
return refcount_add_not_zero(1, r);
|
||||
return __refcount_inc_not_zero(r, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -236,9 +257,14 @@ static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
|
|||
* Will WARN if the refcount is 0, as this represents a possible use-after-free
|
||||
* condition.
|
||||
*/
|
||||
static inline void __refcount_inc(refcount_t *r, int *oldp)
|
||||
{
|
||||
__refcount_add(1, r, oldp);
|
||||
}
|
||||
|
||||
static inline void refcount_inc(refcount_t *r)
|
||||
{
|
||||
refcount_add(1, r);
|
||||
__refcount_inc(r, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -261,10 +287,13 @@ static inline void refcount_inc(refcount_t *r)
|
|||
*
|
||||
* Return: true if the resulting refcount is 0, false otherwise
|
||||
*/
|
||||
static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
|
||||
static inline __must_check bool __refcount_sub_and_test(int i, refcount_t *r, int *oldp)
|
||||
{
|
||||
int old = atomic_fetch_sub_release(i, &r->refs);
|
||||
|
||||
if (oldp)
|
||||
*oldp = old;
|
||||
|
||||
if (old == i) {
|
||||
smp_acquire__after_ctrl_dep();
|
||||
return true;
|
||||
|
@ -276,6 +305,11 @@ static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
|
|||
return false;
|
||||
}
|
||||
|
||||
static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
|
||||
{
|
||||
return __refcount_sub_and_test(i, r, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
* refcount_dec_and_test - decrement a refcount and test if it is 0
|
||||
* @r: the refcount
|
||||
|
@ -289,9 +323,14 @@ static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
|
|||
*
|
||||
* Return: true if the resulting refcount is 0, false otherwise
|
||||
*/
|
||||
static inline __must_check bool __refcount_dec_and_test(refcount_t *r, int *oldp)
|
||||
{
|
||||
return __refcount_sub_and_test(1, r, oldp);
|
||||
}
|
||||
|
||||
static inline __must_check bool refcount_dec_and_test(refcount_t *r)
|
||||
{
|
||||
return refcount_sub_and_test(1, r);
|
||||
return __refcount_dec_and_test(r, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -304,10 +343,20 @@ static inline __must_check bool refcount_dec_and_test(refcount_t *r)
|
|||
* Provides release memory ordering, such that prior loads and stores are done
|
||||
* before.
|
||||
*/
|
||||
static inline void __refcount_dec(refcount_t *r, int *oldp)
|
||||
{
|
||||
int old = atomic_fetch_sub_release(1, &r->refs);
|
||||
|
||||
if (oldp)
|
||||
*oldp = old;
|
||||
|
||||
if (unlikely(old <= 1))
|
||||
refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
|
||||
}
|
||||
|
||||
static inline void refcount_dec(refcount_t *r)
|
||||
{
|
||||
if (unlikely(atomic_fetch_sub_release(1, &r->refs) <= 1))
|
||||
refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
|
||||
__refcount_dec(r, NULL);
|
||||
}
|
||||
|
||||
extern __must_check bool refcount_dec_if_one(refcount_t *r);
|
||||
|
|
|
@ -17,6 +17,7 @@
|
|||
#include <linux/kcsan-checks.h>
|
||||
#include <linux/lockdep.h>
|
||||
#include <linux/mutex.h>
|
||||
#include <linux/ww_mutex.h>
|
||||
#include <linux/preempt.h>
|
||||
#include <linux/spinlock.h>
|
||||
|
||||
|
@ -53,7 +54,7 @@
|
|||
*
|
||||
* If the write serialization mechanism is one of the common kernel
|
||||
* locking primitives, use a sequence counter with associated lock
|
||||
* (seqcount_LOCKTYPE_t) instead.
|
||||
* (seqcount_LOCKNAME_t) instead.
|
||||
*
|
||||
* If it's desired to automatically handle the sequence counter writer
|
||||
* serialization and non-preemptibility requirements, use a sequential
|
||||
|
@ -117,7 +118,7 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s)
|
|||
#define SEQCNT_ZERO(name) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(name) }
|
||||
|
||||
/*
|
||||
* Sequence counters with associated locks (seqcount_LOCKTYPE_t)
|
||||
* Sequence counters with associated locks (seqcount_LOCKNAME_t)
|
||||
*
|
||||
* A sequence counter which associates the lock used for writer
|
||||
* serialization at initialization time. This enables lockdep to validate
|
||||
|
@ -131,63 +132,117 @@ static inline void seqcount_lockdep_reader_access(const seqcount_t *s)
|
|||
* See Documentation/locking/seqlock.rst
|
||||
*/
|
||||
|
||||
#ifdef CONFIG_LOCKDEP
|
||||
/*
|
||||
* For PREEMPT_RT, seqcount_LOCKNAME_t write side critical sections cannot
|
||||
* disable preemption. It can lead to higher latencies, and the write side
|
||||
* sections will not be able to acquire locks which become sleeping locks
|
||||
* (e.g. spinlock_t).
|
||||
*
|
||||
* To remain preemptible while avoiding a possible livelock caused by the
|
||||
* reader preempting the writer, use a different technique: let the reader
|
||||
* detect if a seqcount_LOCKNAME_t writer is in progress. If that is the
|
||||
* case, acquire then release the associated LOCKNAME writer serialization
|
||||
* lock. This will allow any possibly-preempted writer to make progress
|
||||
* until the end of its writer serialization lock critical section.
|
||||
*
|
||||
* This lock-unlock technique must be implemented for all of PREEMPT_RT
|
||||
* sleeping locks. See Documentation/locking/locktypes.rst
|
||||
*/
|
||||
#if defined(CONFIG_LOCKDEP) || defined(CONFIG_PREEMPT_RT)
|
||||
#define __SEQ_LOCK(expr) expr
|
||||
#else
|
||||
#define __SEQ_LOCK(expr)
|
||||
#endif
|
||||
|
||||
/**
|
||||
* typedef seqcount_LOCKNAME_t - sequence counter with LOCKTYPR associated
|
||||
* typedef seqcount_LOCKNAME_t - sequence counter with LOCKNAME associated
|
||||
* @seqcount: The real sequence counter
|
||||
* @lock: Pointer to the associated spinlock
|
||||
* @lock: Pointer to the associated lock
|
||||
*
|
||||
* A plain sequence counter with external writer synchronization by a
|
||||
* spinlock. The spinlock is associated to the sequence count in the
|
||||
* A plain sequence counter with external writer synchronization by
|
||||
* LOCKNAME @lock. The lock is associated to the sequence counter in the
|
||||
* static initializer or init function. This enables lockdep to validate
|
||||
* that the write side critical section is properly serialized.
|
||||
*/
|
||||
|
||||
/**
|
||||
* seqcount_LOCKNAME_init() - runtime initializer for seqcount_LOCKNAME_t
|
||||
* @s: Pointer to the seqcount_LOCKNAME_t instance
|
||||
* @lock: Pointer to the associated LOCKTYPE
|
||||
*
|
||||
* LOCKNAME: raw_spinlock, spinlock, rwlock, mutex, or ww_mutex.
|
||||
*/
|
||||
|
||||
/*
|
||||
* SEQCOUNT_LOCKTYPE() - Instantiate seqcount_LOCKNAME_t and helpers
|
||||
* @locktype: actual typename
|
||||
* @lockname: name
|
||||
* seqcount_LOCKNAME_init() - runtime initializer for seqcount_LOCKNAME_t
|
||||
* @s: Pointer to the seqcount_LOCKNAME_t instance
|
||||
* @lock: Pointer to the associated lock
|
||||
*/
|
||||
|
||||
#define seqcount_LOCKNAME_init(s, _lock, lockname) \
|
||||
do { \
|
||||
seqcount_##lockname##_t *____s = (s); \
|
||||
seqcount_init(&____s->seqcount); \
|
||||
__SEQ_LOCK(____s->lock = (_lock)); \
|
||||
} while (0)
|
||||
|
||||
#define seqcount_raw_spinlock_init(s, lock) seqcount_LOCKNAME_init(s, lock, raw_spinlock)
|
||||
#define seqcount_spinlock_init(s, lock) seqcount_LOCKNAME_init(s, lock, spinlock)
|
||||
#define seqcount_rwlock_init(s, lock) seqcount_LOCKNAME_init(s, lock, rwlock);
|
||||
#define seqcount_mutex_init(s, lock) seqcount_LOCKNAME_init(s, lock, mutex);
|
||||
#define seqcount_ww_mutex_init(s, lock) seqcount_LOCKNAME_init(s, lock, ww_mutex);
|
||||
|
||||
/*
|
||||
* SEQCOUNT_LOCKNAME() - Instantiate seqcount_LOCKNAME_t and helpers
|
||||
* seqprop_LOCKNAME_*() - Property accessors for seqcount_LOCKNAME_t
|
||||
*
|
||||
* @lockname: "LOCKNAME" part of seqcount_LOCKNAME_t
|
||||
* @locktype: LOCKNAME canonical C data type
|
||||
* @preemptible: preemptibility of above locktype
|
||||
* @lockmember: argument for lockdep_assert_held()
|
||||
* @lockbase: associated lock release function (prefix only)
|
||||
* @lock_acquire: associated lock acquisition function (full call)
|
||||
*/
|
||||
#define SEQCOUNT_LOCKTYPE(locktype, lockname, preemptible, lockmember) \
|
||||
#define SEQCOUNT_LOCKNAME(lockname, locktype, preemptible, lockmember, lockbase, lock_acquire) \
|
||||
typedef struct seqcount_##lockname { \
|
||||
seqcount_t seqcount; \
|
||||
__SEQ_LOCK(locktype *lock); \
|
||||
} seqcount_##lockname##_t; \
|
||||
\
|
||||
static __always_inline void \
|
||||
seqcount_##lockname##_init(seqcount_##lockname##_t *s, locktype *lock) \
|
||||
{ \
|
||||
seqcount_init(&s->seqcount); \
|
||||
__SEQ_LOCK(s->lock = lock); \
|
||||
} \
|
||||
\
|
||||
static __always_inline seqcount_t * \
|
||||
__seqcount_##lockname##_ptr(seqcount_##lockname##_t *s) \
|
||||
__seqprop_##lockname##_ptr(seqcount_##lockname##_t *s) \
|
||||
{ \
|
||||
return &s->seqcount; \
|
||||
} \
|
||||
\
|
||||
static __always_inline bool \
|
||||
__seqcount_##lockname##_preemptible(seqcount_##lockname##_t *s) \
|
||||
static __always_inline unsigned \
|
||||
__seqprop_##lockname##_sequence(const seqcount_##lockname##_t *s) \
|
||||
{ \
|
||||
return preemptible; \
|
||||
unsigned seq = READ_ONCE(s->seqcount.sequence); \
|
||||
\
|
||||
if (!IS_ENABLED(CONFIG_PREEMPT_RT)) \
|
||||
return seq; \
|
||||
\
|
||||
if (preemptible && unlikely(seq & 1)) { \
|
||||
__SEQ_LOCK(lock_acquire); \
|
||||
__SEQ_LOCK(lockbase##_unlock(s->lock)); \
|
||||
\
|
||||
/* \
|
||||
* Re-read the sequence counter since the (possibly \
|
||||
* preempted) writer made progress. \
|
||||
*/ \
|
||||
seq = READ_ONCE(s->seqcount.sequence); \
|
||||
} \
|
||||
\
|
||||
return seq; \
|
||||
} \
|
||||
\
|
||||
static __always_inline bool \
|
||||
__seqprop_##lockname##_preemptible(const seqcount_##lockname##_t *s) \
|
||||
{ \
|
||||
if (!IS_ENABLED(CONFIG_PREEMPT_RT)) \
|
||||
return preemptible; \
|
||||
\
|
||||
/* PREEMPT_RT relies on the above LOCK+UNLOCK */ \
|
||||
return false; \
|
||||
} \
|
||||
\
|
||||
static __always_inline void \
|
||||
__seqcount_##lockname##_assert(seqcount_##lockname##_t *s) \
|
||||
__seqprop_##lockname##_assert(const seqcount_##lockname##_t *s) \
|
||||
{ \
|
||||
__SEQ_LOCK(lockdep_assert_held(lockmember)); \
|
||||
}
|
||||
|
@ -196,50 +251,56 @@ __seqcount_##lockname##_assert(seqcount_##lockname##_t *s) \
|
|||
* __seqprop() for seqcount_t
|
||||
*/
|
||||
|
||||
static inline seqcount_t *__seqcount_ptr(seqcount_t *s)
|
||||
static inline seqcount_t *__seqprop_ptr(seqcount_t *s)
|
||||
{
|
||||
return s;
|
||||
}
|
||||
|
||||
static inline bool __seqcount_preemptible(seqcount_t *s)
|
||||
static inline unsigned __seqprop_sequence(const seqcount_t *s)
|
||||
{
|
||||
return READ_ONCE(s->sequence);
|
||||
}
|
||||
|
||||
static inline bool __seqprop_preemptible(const seqcount_t *s)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void __seqcount_assert(seqcount_t *s)
|
||||
static inline void __seqprop_assert(const seqcount_t *s)
|
||||
{
|
||||
lockdep_assert_preemption_disabled();
|
||||
}
|
||||
|
||||
SEQCOUNT_LOCKTYPE(raw_spinlock_t, raw_spinlock, false, s->lock)
|
||||
SEQCOUNT_LOCKTYPE(spinlock_t, spinlock, false, s->lock)
|
||||
SEQCOUNT_LOCKTYPE(rwlock_t, rwlock, false, s->lock)
|
||||
SEQCOUNT_LOCKTYPE(struct mutex, mutex, true, s->lock)
|
||||
SEQCOUNT_LOCKTYPE(struct ww_mutex, ww_mutex, true, &s->lock->base)
|
||||
#define __SEQ_RT IS_ENABLED(CONFIG_PREEMPT_RT)
|
||||
|
||||
/**
|
||||
SEQCOUNT_LOCKNAME(raw_spinlock, raw_spinlock_t, false, s->lock, raw_spin, raw_spin_lock(s->lock))
|
||||
SEQCOUNT_LOCKNAME(spinlock, spinlock_t, __SEQ_RT, s->lock, spin, spin_lock(s->lock))
|
||||
SEQCOUNT_LOCKNAME(rwlock, rwlock_t, __SEQ_RT, s->lock, read, read_lock(s->lock))
|
||||
SEQCOUNT_LOCKNAME(mutex, struct mutex, true, s->lock, mutex, mutex_lock(s->lock))
|
||||
SEQCOUNT_LOCKNAME(ww_mutex, struct ww_mutex, true, &s->lock->base, ww_mutex, ww_mutex_lock(s->lock, NULL))
|
||||
|
||||
/*
|
||||
* SEQCNT_LOCKNAME_ZERO - static initializer for seqcount_LOCKNAME_t
|
||||
* @name: Name of the seqcount_LOCKNAME_t instance
|
||||
* @lock: Pointer to the associated LOCKTYPE
|
||||
* @lock: Pointer to the associated LOCKNAME
|
||||
*/
|
||||
|
||||
#define SEQCOUNT_LOCKTYPE_ZERO(seq_name, assoc_lock) { \
|
||||
#define SEQCOUNT_LOCKNAME_ZERO(seq_name, assoc_lock) { \
|
||||
.seqcount = SEQCNT_ZERO(seq_name.seqcount), \
|
||||
__SEQ_LOCK(.lock = (assoc_lock)) \
|
||||
}
|
||||
|
||||
#define SEQCNT_SPINLOCK_ZERO(name, lock) SEQCOUNT_LOCKTYPE_ZERO(name, lock)
|
||||
#define SEQCNT_RAW_SPINLOCK_ZERO(name, lock) SEQCOUNT_LOCKTYPE_ZERO(name, lock)
|
||||
#define SEQCNT_RWLOCK_ZERO(name, lock) SEQCOUNT_LOCKTYPE_ZERO(name, lock)
|
||||
#define SEQCNT_MUTEX_ZERO(name, lock) SEQCOUNT_LOCKTYPE_ZERO(name, lock)
|
||||
#define SEQCNT_WW_MUTEX_ZERO(name, lock) SEQCOUNT_LOCKTYPE_ZERO(name, lock)
|
||||
|
||||
#define SEQCNT_RAW_SPINLOCK_ZERO(name, lock) SEQCOUNT_LOCKNAME_ZERO(name, lock)
|
||||
#define SEQCNT_SPINLOCK_ZERO(name, lock) SEQCOUNT_LOCKNAME_ZERO(name, lock)
|
||||
#define SEQCNT_RWLOCK_ZERO(name, lock) SEQCOUNT_LOCKNAME_ZERO(name, lock)
|
||||
#define SEQCNT_MUTEX_ZERO(name, lock) SEQCOUNT_LOCKNAME_ZERO(name, lock)
|
||||
#define SEQCNT_WW_MUTEX_ZERO(name, lock) SEQCOUNT_LOCKNAME_ZERO(name, lock)
|
||||
|
||||
#define __seqprop_case(s, lockname, prop) \
|
||||
seqcount_##lockname##_t: __seqcount_##lockname##_##prop((void *)(s))
|
||||
seqcount_##lockname##_t: __seqprop_##lockname##_##prop((void *)(s))
|
||||
|
||||
#define __seqprop(s, prop) _Generic(*(s), \
|
||||
seqcount_t: __seqcount_##prop((void *)(s)), \
|
||||
seqcount_t: __seqprop_##prop((void *)(s)), \
|
||||
__seqprop_case((s), raw_spinlock, prop), \
|
||||
__seqprop_case((s), spinlock, prop), \
|
||||
__seqprop_case((s), rwlock, prop), \
|
||||
|
@ -247,12 +308,13 @@ SEQCOUNT_LOCKTYPE(struct ww_mutex, ww_mutex, true, &s->lock->base)
|
|||
__seqprop_case((s), ww_mutex, prop))
|
||||
|
||||
#define __seqcount_ptr(s) __seqprop(s, ptr)
|
||||
#define __seqcount_sequence(s) __seqprop(s, sequence)
|
||||
#define __seqcount_lock_preemptible(s) __seqprop(s, preemptible)
|
||||
#define __seqcount_assert_lock_held(s) __seqprop(s, assert)
|
||||
|
||||
/**
|
||||
* __read_seqcount_begin() - begin a seqcount_t read section w/o barrier
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
|
||||
*
|
||||
* __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
|
||||
* barrier. Callers should ensure that smp_rmb() or equivalent ordering is
|
||||
|
@ -265,56 +327,45 @@ SEQCOUNT_LOCKTYPE(struct ww_mutex, ww_mutex, true, &s->lock->base)
|
|||
* Return: count to be passed to read_seqcount_retry()
|
||||
*/
|
||||
#define __read_seqcount_begin(s) \
|
||||
__read_seqcount_t_begin(__seqcount_ptr(s))
|
||||
|
||||
static inline unsigned __read_seqcount_t_begin(const seqcount_t *s)
|
||||
{
|
||||
unsigned ret;
|
||||
|
||||
repeat:
|
||||
ret = READ_ONCE(s->sequence);
|
||||
if (unlikely(ret & 1)) {
|
||||
cpu_relax();
|
||||
goto repeat;
|
||||
}
|
||||
kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);
|
||||
return ret;
|
||||
}
|
||||
({ \
|
||||
unsigned seq; \
|
||||
\
|
||||
while ((seq = __seqcount_sequence(s)) & 1) \
|
||||
cpu_relax(); \
|
||||
\
|
||||
kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX); \
|
||||
seq; \
|
||||
})
|
||||
|
||||
/**
|
||||
* raw_read_seqcount_begin() - begin a seqcount_t read section w/o lockdep
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
|
||||
*
|
||||
* Return: count to be passed to read_seqcount_retry()
|
||||
*/
|
||||
#define raw_read_seqcount_begin(s) \
|
||||
raw_read_seqcount_t_begin(__seqcount_ptr(s))
|
||||
|
||||
static inline unsigned raw_read_seqcount_t_begin(const seqcount_t *s)
|
||||
{
|
||||
unsigned ret = __read_seqcount_t_begin(s);
|
||||
smp_rmb();
|
||||
return ret;
|
||||
}
|
||||
({ \
|
||||
unsigned seq = __read_seqcount_begin(s); \
|
||||
\
|
||||
smp_rmb(); \
|
||||
seq; \
|
||||
})
|
||||
|
||||
/**
|
||||
* read_seqcount_begin() - begin a seqcount_t read critical section
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
|
||||
*
|
||||
* Return: count to be passed to read_seqcount_retry()
|
||||
*/
|
||||
#define read_seqcount_begin(s) \
|
||||
read_seqcount_t_begin(__seqcount_ptr(s))
|
||||
|
||||
static inline unsigned read_seqcount_t_begin(const seqcount_t *s)
|
||||
{
|
||||
seqcount_lockdep_reader_access(s);
|
||||
return raw_read_seqcount_t_begin(s);
|
||||
}
|
||||
({ \
|
||||
seqcount_lockdep_reader_access(__seqcount_ptr(s)); \
|
||||
raw_read_seqcount_begin(s); \
|
||||
})
|
||||
|
||||
/**
|
||||
* raw_read_seqcount() - read the raw seqcount_t counter value
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
|
||||
*
|
||||
* raw_read_seqcount opens a read critical section of the given
|
||||
* seqcount_t, without any lockdep checking, and without checking or
|
||||
|
@ -324,20 +375,18 @@ static inline unsigned read_seqcount_t_begin(const seqcount_t *s)
|
|||
* Return: count to be passed to read_seqcount_retry()
|
||||
*/
|
||||
#define raw_read_seqcount(s) \
|
||||
raw_read_seqcount_t(__seqcount_ptr(s))
|
||||
|
||||
static inline unsigned raw_read_seqcount_t(const seqcount_t *s)
|
||||
{
|
||||
unsigned ret = READ_ONCE(s->sequence);
|
||||
smp_rmb();
|
||||
kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);
|
||||
return ret;
|
||||
}
|
||||
({ \
|
||||
unsigned seq = __seqcount_sequence(s); \
|
||||
\
|
||||
smp_rmb(); \
|
||||
kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX); \
|
||||
seq; \
|
||||
})
|
||||
|
||||
/**
|
||||
* raw_seqcount_begin() - begin a seqcount_t read critical section w/o
|
||||
* lockdep and w/o counter stabilization
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
|
||||
* @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
|
||||
*
|
||||
* raw_seqcount_begin opens a read critical section of the given
|
||||
* seqcount_t. Unlike read_seqcount_begin(), this function will not wait
|
||||
|
@ -352,20 +401,17 @@ static inline unsigned raw_read_seqcount_t(const seqcount_t *s)
|
|||
* Return: count to be passed to read_seqcount_retry()
|
||||
*/
|
||||
#define raw_seqcount_begin(s) \
|
||||
raw_seqcount_t_begin(__seqcount_ptr(s))
|
||||
|
||||
static inline unsigned raw_seqcount_t_begin(const seqcount_t *s)
|
||||
{
|
||||
/*
|
||||
* If the counter is odd, let read_seqcount_retry() fail
|
||||
* by decrementing the counter.
|
||||
*/
|
||||
return raw_read_seqcount_t(s) & ~1;
|
||||
}
|
||||
({ \
|
||||
/* \
|
||||
* If the counter is odd, let read_seqcount_retry() fail \
|
||||
* by decrementing the counter. \
|
||||
*/ \
|
||||
raw_read_seqcount(s) & ~1; \
|
||||
})

/**
 * __read_seqcount_retry() - end a seqcount_t read section w/o barrier
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 * @start: count, from read_seqcount_begin()
 *
 * __read_seqcount_retry is like read_seqcount_retry, but has no smp_rmb()

@@ -389,7 +435,7 @@ static inline int __read_seqcount_t_retry(const seqcount_t *s, unsigned start)

/**
 * read_seqcount_retry() - end a seqcount_t read critical section
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 * @start: count, from read_seqcount_begin()
 *
 * read_seqcount_retry closes the read critical section of given

@@ -409,7 +455,7 @@ static inline int read_seqcount_t_retry(const seqcount_t *s, unsigned start)

/**
 * raw_write_seqcount_begin() - start a seqcount_t write section w/o lockdep
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 */
#define raw_write_seqcount_begin(s) \
do { \

@@ -428,7 +474,7 @@ static inline void raw_write_seqcount_t_begin(seqcount_t *s)

/**
 * raw_write_seqcount_end() - end a seqcount_t write section w/o lockdep
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 */
#define raw_write_seqcount_end(s) \
do { \

@@ -448,7 +494,7 @@ static inline void raw_write_seqcount_t_end(seqcount_t *s)
/**
 * write_seqcount_begin_nested() - start a seqcount_t write section with
 *                                 custom lockdep nesting level
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 * @subclass: lockdep nesting level
 *
 * See Documentation/locking/lockdep-design.rst

@@ -471,7 +517,7 @@ static inline void write_seqcount_t_begin_nested(seqcount_t *s, int subclass)

/**
 * write_seqcount_begin() - start a seqcount_t write side critical section
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 *
 * write_seqcount_begin opens a write side critical section of the given
 * seqcount_t.

@@ -497,7 +543,7 @@ static inline void write_seqcount_t_begin(seqcount_t *s)

/**
 * write_seqcount_end() - end a seqcount_t write side critical section
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 *
 * The write section must've been opened with write_seqcount_begin().
 */

@@ -517,7 +563,7 @@ static inline void write_seqcount_t_end(seqcount_t *s)

/**
 * raw_write_seqcount_barrier() - do a seqcount_t write barrier
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 *
 * This can be used to provide an ordering guarantee instead of the usual
 * consistency guarantee. It is one wmb cheaper, because it can collapse

@@ -571,7 +617,7 @@ static inline void raw_write_seqcount_t_barrier(seqcount_t *s)
/**
 * write_seqcount_invalidate() - invalidate in-progress seqcount_t read
 *                               side operations
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * @s: Pointer to seqcount_t or any of the seqcount_LOCKNAME_t variants
 *
 * After write_seqcount_invalidate, no seqcount_t read side operations
 * will complete successfully and see data older than this.

@@ -587,34 +633,73 @@ static inline void write_seqcount_t_invalidate(seqcount_t *s)
	kcsan_nestable_atomic_end();
}
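
As a concrete, hedged sketch of the write side discipline these kernel-doc blocks describe: a seqcount_spinlock_t writer takes its associated lock first, so write_seqcount_begin() can let lockdep verify serialization. All foo_* names below are hypothetical, not part of the patch.

#include <linux/seqlock.h>
#include <linux/spinlock.h>

struct foo_state { u64 a, b; };			/* hypothetical payload */

static DEFINE_SPINLOCK(foo_lock);
static seqcount_spinlock_t foo_seqcount =
	SEQCNT_SPINLOCK_ZERO(foo_seqcount, &foo_lock);
static struct foo_state foo_state;

static void foo_update(const struct foo_state *new)
{
	spin_lock(&foo_lock);			/* serialize writers */
	write_seqcount_begin(&foo_seqcount);	/* lockdep asserts foo_lock is held */
	foo_state = *new;
	write_seqcount_end(&foo_seqcount);
	spin_unlock(&foo_lock);
}

static void foo_snapshot(struct foo_state *snap)
{
	unsigned int seq;

	do {
		seq = read_seqcount_begin(&foo_seqcount);
		*snap = foo_state;
	} while (read_seqcount_retry(&foo_seqcount, seq));
}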
/**
 * raw_read_seqcount_latch() - pick even/odd seqcount_t latch data copy
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
/*
 * Latch sequence counters (seqcount_latch_t)
 *
 * Use seqcount_t latching to switch between two storage places protected
 * by a sequence counter. Doing so allows having interruptible, preemptible,
 * seqcount_t write side critical sections.
 * A sequence counter variant where the counter even/odd value is used to
 * switch between two copies of protected data. This allows the read path,
 * typically NMIs, to safely interrupt the write side critical section.
 *
 * Check raw_write_seqcount_latch() for more details and a full reader and
 * writer usage example.
 *
 * Return: sequence counter raw value. Use the lowest bit as an index for
 * picking which data copy to read. The full counter value must then be
 * checked with read_seqcount_retry().
 * As the write sections are fully preemptible, no special handling for
 * PREEMPT_RT is needed.
 */
#define raw_read_seqcount_latch(s) \
	raw_read_seqcount_t_latch(__seqcount_ptr(s))
typedef struct {
	seqcount_t seqcount;
} seqcount_latch_t;

static inline int raw_read_seqcount_t_latch(seqcount_t *s)
{
	/* Pairs with the first smp_wmb() in raw_write_seqcount_latch() */
	int seq = READ_ONCE(s->sequence); /* ^^^ */
	return seq;
/**
 * SEQCNT_LATCH_ZERO() - static initializer for seqcount_latch_t
 * @seq_name: Name of the seqcount_latch_t instance
 */
#define SEQCNT_LATCH_ZERO(seq_name) { \
	.seqcount = SEQCNT_ZERO(seq_name.seqcount), \
}

/**
 * raw_write_seqcount_latch() - redirect readers to even/odd copy
 * @s: Pointer to seqcount_t or any of the seqcount_locktype_t variants
 * seqcount_latch_init() - runtime initializer for seqcount_latch_t
 * @s: Pointer to the seqcount_latch_t instance
 */
static inline void seqcount_latch_init(seqcount_latch_t *s)
{
	seqcount_init(&s->seqcount);
}

/**
 * raw_read_seqcount_latch() - pick even/odd latch data copy
 * @s: Pointer to seqcount_latch_t
 *
 * See raw_write_seqcount_latch() for details and a full reader/writer
 * usage example.
 *
 * Return: sequence counter raw value. Use the lowest bit as an index for
 * picking which data copy to read. The full counter must then be checked
 * with read_seqcount_latch_retry().
 */
static inline unsigned raw_read_seqcount_latch(const seqcount_latch_t *s)
{
	/*
	 * Pairs with the first smp_wmb() in raw_write_seqcount_latch().
	 * Due to the dependent load, a full smp_rmb() is not needed.
	 */
	return READ_ONCE(s->seqcount.sequence);
}

/**
 * read_seqcount_latch_retry() - end a seqcount_latch_t read section
 * @s: Pointer to seqcount_latch_t
 * @start: count, from raw_read_seqcount_latch()
 *
 * Return: true if a read section retry is required, else false
 */
static inline int
read_seqcount_latch_retry(const seqcount_latch_t *s, unsigned start)
{
	return read_seqcount_retry(&s->seqcount, start);
}
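
To make the Return: description concrete, a hedged reader sketch follows. It mirrors the latch_struct/data_struct example used in the kernel-doc below, but the names here are placeholders rather than real kernel symbols.

struct data_struct { u64 val; };		/* hypothetical payload */

static seqcount_latch_t latch_seq = SEQCNT_LATCH_ZERO(latch_seq);
static struct data_struct latch_data[2];

static u64 latch_read_val(void)
{
	unsigned int seq;
	u64 val;

	do {
		seq = raw_read_seqcount_latch(&latch_seq);
		val = latch_data[seq & 1].val;	/* lowest bit picks the copy */
	} while (read_seqcount_latch_retry(&latch_seq, seq));

	return val;
}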

/**
 * raw_write_seqcount_latch() - redirect latch readers to even/odd copy
 * @s: Pointer to seqcount_latch_t
 *
 * The latch technique is a multiversion concurrency control method that allows
 * queries during non-atomic modifications. If you can guarantee queries never

@@ -633,7 +718,7 @@ static inline int raw_read_seqcount_t_latch(seqcount_t *s)
 * The basic form is a data structure like::
 *
 *	struct latch_struct {
 *		seqcount_t		seq;
 *		seqcount_latch_t	seq;
 *		struct data_struct	data[2];
 *	};
 *

@@ -643,13 +728,13 @@ static inline int raw_read_seqcount_t_latch(seqcount_t *s)
 *	void latch_modify(struct latch_struct *latch, ...)
 *	{
 *		smp_wmb();	// Ensure that the last data[1] update is visible
 *		latch->seq++;
 *		latch->seq.sequence++;
 *		smp_wmb();	// Ensure that the seqcount update is visible
 *
 *		modify(latch->data[0], ...);
 *
 *		smp_wmb();	// Ensure that the data[0] update is visible
 *		latch->seq++;
 *		latch->seq.sequence++;
 *		smp_wmb();	// Ensure that the seqcount update is visible
 *
 *		modify(latch->data[1], ...);

@@ -668,8 +753,8 @@ static inline int raw_read_seqcount_t_latch(seqcount_t *s)
 *			idx = seq & 0x01;
 *			entry = data_query(latch->data[idx], ...);
 *
 *		// read_seqcount_retry() includes needed smp_rmb()
 *		} while (read_seqcount_retry(&latch->seq, seq));
 *		// This includes needed smp_rmb()
 *		} while (read_seqcount_latch_retry(&latch->seq, seq));
 *
 *		return entry;
 *	}

@@ -688,19 +773,16 @@ static inline int raw_read_seqcount_t_latch(seqcount_t *s)
 * to miss an entire modification sequence, once it resumes it might
 * observe the new entry.
 *
 * NOTE:
 * NOTE2:
 *
 * When data is a dynamic data structure; one should use regular RCU
 * patterns to manage the lifetimes of the objects within.
 */
#define raw_write_seqcount_latch(s) \
	raw_write_seqcount_t_latch(__seqcount_ptr(s))

static inline void raw_write_seqcount_t_latch(seqcount_t *s)
static inline void raw_write_seqcount_latch(seqcount_latch_t *s)
{
	smp_wmb();	/* prior stores before incrementing "sequence" */
	s->sequence++;
	smp_wmb();	/* increment "sequence" before following stores */
	smp_wmb();	/* prior stores before incrementing "sequence" */
	s->seqcount.sequence++;
	smp_wmb();	/* increment "sequence" before following stores */
}
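
The open-coded smp_wmb()/sequence++ pairs in the example above are exactly what raw_write_seqcount_latch() wraps, so the modify side of that (hypothetical) latch_struct example could be written as the following sketch.

static void latch_modify(struct latch_struct *latch, const struct data_struct *new)
{
	/* Counter becomes odd: readers switch to data[1], data[0] is quiescent. */
	raw_write_seqcount_latch(&latch->seq);
	latch->data[0] = *new;

	/* Counter becomes even again: readers switch to data[0], update data[1]. */
	raw_write_seqcount_latch(&latch->seq);
	latch->data[1] = *new;
}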

/*

@@ -714,13 +796,17 @@ static inline void raw_write_seqcount_t_latch(seqcount_t *s)
 * - Documentation/locking/seqlock.rst
 */
typedef struct {
	struct seqcount seqcount;
	/*
	 * Make sure that readers don't starve writers on PREEMPT_RT: use
	 * seqcount_spinlock_t instead of seqcount_t. Check __SEQ_LOCK().
	 */
	seqcount_spinlock_t seqcount;
	spinlock_t lock;
} seqlock_t;

#define __SEQLOCK_UNLOCKED(lockname) \
{ \
	.seqcount = SEQCNT_ZERO(lockname), \
	.seqcount = SEQCNT_SPINLOCK_ZERO(lockname, &(lockname).lock), \
	.lock = __SPIN_LOCK_UNLOCKED(lockname) \
}

@@ -730,12 +816,12 @@ typedef struct {
 */
#define seqlock_init(sl) \
	do { \
		seqcount_init(&(sl)->seqcount); \
		spin_lock_init(&(sl)->lock); \
		seqcount_spinlock_init(&(sl)->seqcount, &(sl)->lock); \
	} while (0)

/**
 * DEFINE_SEQLOCK() - Define a statically allocated seqlock_t
 * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t
 * @sl: Name of the seqlock_t instance
 */
#define DEFINE_SEQLOCK(sl) \

@@ -778,6 +864,12 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
	return read_seqcount_retry(&sl->seqcount, start);
}

/*
 * For all seqlock_t write side functions, use write_seqcount_*t*_begin()
 * instead of the generic write_seqcount_begin(). This way, no redundant
 * lockdep_assert_held() checks are added.
 */

/**
 * write_seqlock() - start a seqlock_t write side critical section
 * @sl: Pointer to seqlock_t

@@ -794,7 +886,7 @@ static inline unsigned read_seqretry(const seqlock_t *sl, unsigned start)
static inline void write_seqlock(seqlock_t *sl)
{
	spin_lock(&sl->lock);
	write_seqcount_t_begin(&sl->seqcount);
	write_seqcount_t_begin(&sl->seqcount.seqcount);
}

/**

@@ -806,7 +898,7 @@ static inline void write_seqlock(seqlock_t *sl)
 */
static inline void write_sequnlock(seqlock_t *sl)
{
	write_seqcount_t_end(&sl->seqcount);
	write_seqcount_t_end(&sl->seqcount.seqcount);
	spin_unlock(&sl->lock);
}

@@ -820,7 +912,7 @@ static inline void write_sequnlock(seqlock_t *sl)
static inline void write_seqlock_bh(seqlock_t *sl)
{
	spin_lock_bh(&sl->lock);
	write_seqcount_t_begin(&sl->seqcount);
	write_seqcount_t_begin(&sl->seqcount.seqcount);
}

/**

@@ -833,7 +925,7 @@ static inline void write_seqlock_bh(seqlock_t *sl)
 */
static inline void write_sequnlock_bh(seqlock_t *sl)
{
	write_seqcount_t_end(&sl->seqcount);
	write_seqcount_t_end(&sl->seqcount.seqcount);
	spin_unlock_bh(&sl->lock);
}

@@ -847,7 +939,7 @@ static inline void write_sequnlock_bh(seqlock_t *sl)
static inline void write_seqlock_irq(seqlock_t *sl)
{
	spin_lock_irq(&sl->lock);
	write_seqcount_t_begin(&sl->seqcount);
	write_seqcount_t_begin(&sl->seqcount.seqcount);
}

/**

@@ -859,7 +951,7 @@ static inline void write_seqlock_irq(seqlock_t *sl)
 */
static inline void write_sequnlock_irq(seqlock_t *sl)
{
	write_seqcount_t_end(&sl->seqcount);
	write_seqcount_t_end(&sl->seqcount.seqcount);
	spin_unlock_irq(&sl->lock);
}

@@ -868,7 +960,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
	unsigned long flags;

	spin_lock_irqsave(&sl->lock, flags);
	write_seqcount_t_begin(&sl->seqcount);
	write_seqcount_t_begin(&sl->seqcount.seqcount);
	return flags;
}

@@ -897,7 +989,7 @@ static inline unsigned long __write_seqlock_irqsave(seqlock_t *sl)
static inline void
write_sequnlock_irqrestore(seqlock_t *sl, unsigned long flags)
{
	write_seqcount_t_end(&sl->seqcount);
	write_seqcount_t_end(&sl->seqcount.seqcount);
	spin_unlock_irqrestore(&sl->lock, flags);
}
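
For completeness, a hedged sketch of how the embedded seqcount_spinlock_t is consumed through the regular seqlock_t API; foo_seqlock and foo_ns are hypothetical names, not taken from the patch.

static DEFINE_SEQLOCK(foo_seqlock);
static u64 foo_ns;				/* hypothetical protected data */

static void foo_set(u64 ns)
{
	write_seqlock(&foo_seqlock);		/* spin_lock + seqcount write section */
	foo_ns = ns;
	write_sequnlock(&foo_seqlock);
}

static u64 foo_get(void)
{
	unsigned int seq;
	u64 ret;

	do {
		seq = read_seqbegin(&foo_seqlock);
		ret = foo_ns;
	} while (read_seqretry(&foo_seqlock, seq));

	return ret;
}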
@ -1,5 +1,7 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
#define pr_fmt(fmt) "kcsan: " fmt
|
||||
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/bug.h>
|
||||
#include <linux/delay.h>
|
||||
|
@ -98,6 +100,9 @@ static atomic_long_t watchpoints[CONFIG_KCSAN_NUM_WATCHPOINTS + NUM_SLOTS-1];
|
|||
*/
|
||||
static DEFINE_PER_CPU(long, kcsan_skip);
|
||||
|
||||
/* For kcsan_prandom_u32_max(). */
|
||||
static DEFINE_PER_CPU(struct rnd_state, kcsan_rand_state);
|
||||
|
||||
static __always_inline atomic_long_t *find_watchpoint(unsigned long addr,
|
||||
size_t size,
|
||||
bool expect_write,
|
||||
|
@ -223,7 +228,7 @@ is_atomic(const volatile void *ptr, size_t size, int type, struct kcsan_ctx *ctx
|
|||
|
||||
if (IS_ENABLED(CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC) &&
|
||||
(type & KCSAN_ACCESS_WRITE) && size <= sizeof(long) &&
|
||||
IS_ALIGNED((unsigned long)ptr, size))
|
||||
!(type & KCSAN_ACCESS_COMPOUND) && IS_ALIGNED((unsigned long)ptr, size))
|
||||
return true; /* Assume aligned writes up to word size are atomic. */
|
||||
|
||||
if (ctx->atomic_next > 0) {
|
||||
|
@ -269,11 +274,28 @@ should_watch(const volatile void *ptr, size_t size, int type, struct kcsan_ctx *
|
|||
return true;
|
||||
}
|
||||
|
||||
/*
|
||||
* Returns a pseudo-random number in interval [0, ep_ro). See prandom_u32_max()
|
||||
* for more details.
|
||||
*
|
||||
* The open-coded version here is using only safe primitives for all contexts
|
||||
* where we can have KCSAN instrumentation. In particular, we cannot use
|
||||
* prandom_u32() directly, as its tracepoint could cause recursion.
|
||||
*/
|
||||
static u32 kcsan_prandom_u32_max(u32 ep_ro)
|
||||
{
|
||||
struct rnd_state *state = &get_cpu_var(kcsan_rand_state);
|
||||
const u32 res = prandom_u32_state(state);
|
||||
|
||||
put_cpu_var(kcsan_rand_state);
|
||||
return (u32)(((u64) res * ep_ro) >> 32);
|
||||
}
|
||||
|
||||
static inline void reset_kcsan_skip(void)
|
||||
{
|
||||
long skip_count = kcsan_skip_watch -
|
||||
(IS_ENABLED(CONFIG_KCSAN_SKIP_WATCH_RANDOMIZE) ?
|
||||
prandom_u32_max(kcsan_skip_watch) :
|
||||
kcsan_prandom_u32_max(kcsan_skip_watch) :
|
||||
0);
|
||||
this_cpu_write(kcsan_skip, skip_count);
|
||||
}
|
||||
|
@ -283,12 +305,18 @@ static __always_inline bool kcsan_is_enabled(void)
|
|||
return READ_ONCE(kcsan_enabled) && get_ctx()->disable_count == 0;
|
||||
}
|
||||
|
||||
static inline unsigned int get_delay(void)
|
||||
/* Introduce delay depending on context and configuration. */
|
||||
static void delay_access(int type)
|
||||
{
|
||||
unsigned int delay = in_task() ? kcsan_udelay_task : kcsan_udelay_interrupt;
|
||||
return delay - (IS_ENABLED(CONFIG_KCSAN_DELAY_RANDOMIZE) ?
|
||||
prandom_u32_max(delay) :
|
||||
0);
|
||||
/* For certain access types, skew the random delay to be longer. */
|
||||
unsigned int skew_delay_order =
|
||||
(type & (KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_ASSERT)) ? 1 : 0;
|
||||
|
||||
delay -= IS_ENABLED(CONFIG_KCSAN_DELAY_RANDOMIZE) ?
|
||||
kcsan_prandom_u32_max(delay >> skew_delay_order) :
|
||||
0;
|
||||
udelay(delay);
|
||||
}
|
||||
|
||||
void kcsan_save_irqtrace(struct task_struct *task)
|
||||
|
@ -361,13 +389,13 @@ static noinline void kcsan_found_watchpoint(const volatile void *ptr,
|
|||
* already removed the watchpoint, or another thread consumed
|
||||
* the watchpoint before this thread.
|
||||
*/
|
||||
kcsan_counter_inc(KCSAN_COUNTER_REPORT_RACES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_REPORT_RACES]);
|
||||
}
|
||||
|
||||
if ((type & KCSAN_ACCESS_ASSERT) != 0)
|
||||
kcsan_counter_inc(KCSAN_COUNTER_ASSERT_FAILURES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_ASSERT_FAILURES]);
|
||||
else
|
||||
kcsan_counter_inc(KCSAN_COUNTER_DATA_RACES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_DATA_RACES]);
|
||||
|
||||
user_access_restore(flags);
|
||||
}
|
||||
|
@ -408,7 +436,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
|
|||
goto out;
|
||||
|
||||
if (!check_encodable((unsigned long)ptr, size)) {
|
||||
kcsan_counter_inc(KCSAN_COUNTER_UNENCODABLE_ACCESSES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_UNENCODABLE_ACCESSES]);
|
||||
goto out;
|
||||
}
|
||||
|
||||
|
@ -428,12 +456,12 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
|
|||
* with which should_watch() returns true should be tweaked so
|
||||
* that this case happens very rarely.
|
||||
*/
|
||||
kcsan_counter_inc(KCSAN_COUNTER_NO_CAPACITY);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_NO_CAPACITY]);
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
kcsan_counter_inc(KCSAN_COUNTER_SETUP_WATCHPOINTS);
|
||||
kcsan_counter_inc(KCSAN_COUNTER_USED_WATCHPOINTS);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_SETUP_WATCHPOINTS]);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_USED_WATCHPOINTS]);
|
||||
|
||||
/*
|
||||
* Read the current value, to later check and infer a race if the data
|
||||
|
@ -459,7 +487,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
|
|||
|
||||
if (IS_ENABLED(CONFIG_KCSAN_DEBUG)) {
|
||||
kcsan_disable_current();
|
||||
pr_err("KCSAN: watching %s, size: %zu, addr: %px [slot: %d, encoded: %lx]\n",
|
||||
pr_err("watching %s, size: %zu, addr: %px [slot: %d, encoded: %lx]\n",
|
||||
is_write ? "write" : "read", size, ptr,
|
||||
watchpoint_slot((unsigned long)ptr),
|
||||
encode_watchpoint((unsigned long)ptr, size, is_write));
|
||||
|
@ -470,7 +498,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
|
|||
* Delay this thread, to increase probability of observing a racy
|
||||
* conflicting access.
|
||||
*/
|
||||
udelay(get_delay());
|
||||
delay_access(type);
|
||||
|
||||
/*
|
||||
* Re-read value, and check if it is as expected; if not, we infer a
|
||||
|
@ -535,16 +563,16 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
|
|||
* increment this counter.
|
||||
*/
|
||||
if (is_assert && value_change == KCSAN_VALUE_CHANGE_TRUE)
|
||||
kcsan_counter_inc(KCSAN_COUNTER_ASSERT_FAILURES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_ASSERT_FAILURES]);
|
||||
|
||||
kcsan_report(ptr, size, type, value_change, KCSAN_REPORT_RACE_SIGNAL,
|
||||
watchpoint - watchpoints);
|
||||
} else if (value_change == KCSAN_VALUE_CHANGE_TRUE) {
|
||||
/* Inferring a race, since the value should not have changed. */
|
||||
|
||||
kcsan_counter_inc(KCSAN_COUNTER_RACES_UNKNOWN_ORIGIN);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_RACES_UNKNOWN_ORIGIN]);
|
||||
if (is_assert)
|
||||
kcsan_counter_inc(KCSAN_COUNTER_ASSERT_FAILURES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_ASSERT_FAILURES]);
|
||||
|
||||
if (IS_ENABLED(CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN) || is_assert)
|
||||
kcsan_report(ptr, size, type, KCSAN_VALUE_CHANGE_TRUE,
|
||||
|
@ -557,7 +585,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type)
|
|||
* reused after this point.
|
||||
*/
|
||||
remove_watchpoint(watchpoint);
|
||||
kcsan_counter_dec(KCSAN_COUNTER_USED_WATCHPOINTS);
|
||||
atomic_long_dec(&kcsan_counters[KCSAN_COUNTER_USED_WATCHPOINTS]);
|
||||
out_unlock:
|
||||
if (!kcsan_interrupt_watcher)
|
||||
local_irq_restore(irq_flags);
|
||||
|
@ -614,13 +642,16 @@ void __init kcsan_init(void)
|
|||
BUG_ON(!in_task());
|
||||
|
||||
kcsan_debugfs_init();
|
||||
prandom_seed_full_state(&kcsan_rand_state);
|
||||
|
||||
/*
|
||||
* We are in the init task, and no other tasks should be running;
|
||||
* WRITE_ONCE without memory barrier is sufficient.
|
||||
*/
|
||||
if (kcsan_early_enable)
|
||||
if (kcsan_early_enable) {
|
||||
pr_info("enabled early\n");
|
||||
WRITE_ONCE(kcsan_enabled, true);
|
||||
}
|
||||
}
|
||||
|
||||
/* === Exported interface =================================================== */
|
||||
|
@ -793,7 +824,17 @@ EXPORT_SYMBOL(__kcsan_check_access);
|
|||
EXPORT_SYMBOL(__tsan_write##size); \
|
||||
void __tsan_unaligned_write##size(void *ptr) \
|
||||
__alias(__tsan_write##size); \
|
||||
EXPORT_SYMBOL(__tsan_unaligned_write##size)
|
||||
EXPORT_SYMBOL(__tsan_unaligned_write##size); \
|
||||
void __tsan_read_write##size(void *ptr); \
|
||||
void __tsan_read_write##size(void *ptr) \
|
||||
{ \
|
||||
check_access(ptr, size, \
|
||||
KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE); \
|
||||
} \
|
||||
EXPORT_SYMBOL(__tsan_read_write##size); \
|
||||
void __tsan_unaligned_read_write##size(void *ptr) \
|
||||
__alias(__tsan_read_write##size); \
|
||||
EXPORT_SYMBOL(__tsan_unaligned_read_write##size)
|
||||
|
||||
DEFINE_TSAN_READ_WRITE(1);
|
||||
DEFINE_TSAN_READ_WRITE(2);
|
||||
|
@ -879,3 +920,130 @@ void __tsan_init(void)
|
|||
{
|
||||
}
|
||||
EXPORT_SYMBOL(__tsan_init);
|
||||
|
||||
/*
|
||||
* Instrumentation for atomic builtins (__atomic_*, __sync_*).
|
||||
*
|
||||
* Normal kernel code _should not_ be using them directly, but some
|
||||
* architectures may implement some or all atomics using the compilers'
|
||||
* builtins.
|
||||
*
|
||||
* Note: If an architecture decides to fully implement atomics using the
|
||||
* builtins, because they are implicitly instrumented by KCSAN (and KASAN,
|
||||
* etc.), implementing the ARCH_ATOMIC interface (to get instrumentation via
|
||||
* atomic-instrumented) is no longer necessary.
|
||||
*
|
||||
* TSAN instrumentation replaces atomic accesses with calls to any of the below
|
||||
* functions, whose job is to also execute the operation itself.
|
||||
*/
|
||||
|
||||
#define DEFINE_TSAN_ATOMIC_LOAD_STORE(bits) \
|
||||
u##bits __tsan_atomic##bits##_load(const u##bits *ptr, int memorder); \
|
||||
u##bits __tsan_atomic##bits##_load(const u##bits *ptr, int memorder) \
|
||||
{ \
|
||||
if (!IS_ENABLED(CONFIG_KCSAN_IGNORE_ATOMICS)) { \
|
||||
check_access(ptr, bits / BITS_PER_BYTE, KCSAN_ACCESS_ATOMIC); \
|
||||
} \
|
||||
return __atomic_load_n(ptr, memorder); \
|
||||
} \
|
||||
EXPORT_SYMBOL(__tsan_atomic##bits##_load); \
|
||||
void __tsan_atomic##bits##_store(u##bits *ptr, u##bits v, int memorder); \
|
||||
void __tsan_atomic##bits##_store(u##bits *ptr, u##bits v, int memorder) \
|
||||
{ \
|
||||
if (!IS_ENABLED(CONFIG_KCSAN_IGNORE_ATOMICS)) { \
|
||||
check_access(ptr, bits / BITS_PER_BYTE, \
|
||||
KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ATOMIC); \
|
||||
} \
|
||||
__atomic_store_n(ptr, v, memorder); \
|
||||
} \
|
||||
EXPORT_SYMBOL(__tsan_atomic##bits##_store)
|
||||
|
||||
#define DEFINE_TSAN_ATOMIC_RMW(op, bits, suffix) \
|
||||
u##bits __tsan_atomic##bits##_##op(u##bits *ptr, u##bits v, int memorder); \
|
||||
u##bits __tsan_atomic##bits##_##op(u##bits *ptr, u##bits v, int memorder) \
|
||||
{ \
|
||||
if (!IS_ENABLED(CONFIG_KCSAN_IGNORE_ATOMICS)) { \
|
||||
check_access(ptr, bits / BITS_PER_BYTE, \
|
||||
KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE | \
|
||||
KCSAN_ACCESS_ATOMIC); \
|
||||
} \
|
||||
return __atomic_##op##suffix(ptr, v, memorder); \
|
||||
} \
|
||||
EXPORT_SYMBOL(__tsan_atomic##bits##_##op)
|
||||
|
||||
/*
|
||||
* Note: CAS operations are always classified as write, even in case they
|
||||
* fail. We cannot perform check_access() after a write, as it might lead to
|
||||
* false positives, in cases such as:
|
||||
*
|
||||
* T0: __atomic_compare_exchange_n(&p->flag, &old, 1, ...)
|
||||
*
|
||||
* T1: if (__atomic_load_n(&p->flag, ...)) {
|
||||
* modify *p;
|
||||
* p->flag = 0;
|
||||
* }
|
||||
*
|
||||
* The only downside is that, if there are 3 threads, with one CAS that
|
||||
* succeeds, another CAS that fails, and an unmarked racing operation, we may
|
||||
* point at the wrong CAS as the source of the race. However, if we assume that
|
||||
* all CAS can succeed in some other execution, the data race is still valid.
|
||||
*/
|
||||
#define DEFINE_TSAN_ATOMIC_CMPXCHG(bits, strength, weak) \
|
||||
int __tsan_atomic##bits##_compare_exchange_##strength(u##bits *ptr, u##bits *exp, \
|
||||
u##bits val, int mo, int fail_mo); \
|
||||
int __tsan_atomic##bits##_compare_exchange_##strength(u##bits *ptr, u##bits *exp, \
|
||||
u##bits val, int mo, int fail_mo) \
|
||||
{ \
|
||||
if (!IS_ENABLED(CONFIG_KCSAN_IGNORE_ATOMICS)) { \
|
||||
check_access(ptr, bits / BITS_PER_BYTE, \
|
||||
KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE | \
|
||||
KCSAN_ACCESS_ATOMIC); \
|
||||
} \
|
||||
return __atomic_compare_exchange_n(ptr, exp, val, weak, mo, fail_mo); \
|
||||
} \
|
||||
EXPORT_SYMBOL(__tsan_atomic##bits##_compare_exchange_##strength)
|
||||
|
||||
#define DEFINE_TSAN_ATOMIC_CMPXCHG_VAL(bits) \
|
||||
u##bits __tsan_atomic##bits##_compare_exchange_val(u##bits *ptr, u##bits exp, u##bits val, \
|
||||
int mo, int fail_mo); \
|
||||
u##bits __tsan_atomic##bits##_compare_exchange_val(u##bits *ptr, u##bits exp, u##bits val, \
|
||||
int mo, int fail_mo) \
|
||||
{ \
|
||||
if (!IS_ENABLED(CONFIG_KCSAN_IGNORE_ATOMICS)) { \
|
||||
check_access(ptr, bits / BITS_PER_BYTE, \
|
||||
KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE | \
|
||||
KCSAN_ACCESS_ATOMIC); \
|
||||
} \
|
||||
__atomic_compare_exchange_n(ptr, &exp, val, 0, mo, fail_mo); \
|
||||
return exp; \
|
||||
} \
|
||||
EXPORT_SYMBOL(__tsan_atomic##bits##_compare_exchange_val)
|
||||
|
||||
#define DEFINE_TSAN_ATOMIC_OPS(bits) \
|
||||
DEFINE_TSAN_ATOMIC_LOAD_STORE(bits); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(exchange, bits, _n); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(fetch_add, bits, ); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(fetch_sub, bits, ); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(fetch_and, bits, ); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(fetch_or, bits, ); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(fetch_xor, bits, ); \
|
||||
DEFINE_TSAN_ATOMIC_RMW(fetch_nand, bits, ); \
|
||||
DEFINE_TSAN_ATOMIC_CMPXCHG(bits, strong, 0); \
|
||||
DEFINE_TSAN_ATOMIC_CMPXCHG(bits, weak, 1); \
|
||||
DEFINE_TSAN_ATOMIC_CMPXCHG_VAL(bits)
|
||||
|
||||
DEFINE_TSAN_ATOMIC_OPS(8);
|
||||
DEFINE_TSAN_ATOMIC_OPS(16);
|
||||
DEFINE_TSAN_ATOMIC_OPS(32);
|
||||
DEFINE_TSAN_ATOMIC_OPS(64);
|
||||
|
||||
void __tsan_atomic_thread_fence(int memorder);
|
||||
void __tsan_atomic_thread_fence(int memorder)
|
||||
{
|
||||
__atomic_thread_fence(memorder);
|
||||
}
|
||||
EXPORT_SYMBOL(__tsan_atomic_thread_fence);
|
||||
|
||||
void __tsan_atomic_signal_fence(int memorder);
|
||||
void __tsan_atomic_signal_fence(int memorder) { }
|
||||
EXPORT_SYMBOL(__tsan_atomic_signal_fence);
|
||||
|
|
|
@ -1,5 +1,7 @@
|
|||
// SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
#define pr_fmt(fmt) "kcsan: " fmt
|
||||
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/bsearch.h>
|
||||
#include <linux/bug.h>
|
||||
|
@ -15,10 +17,19 @@
|
|||
|
||||
#include "kcsan.h"
|
||||
|
||||
/*
|
||||
* Statistics counters.
|
||||
*/
|
||||
static atomic_long_t counters[KCSAN_COUNTER_COUNT];
|
||||
atomic_long_t kcsan_counters[KCSAN_COUNTER_COUNT];
|
||||
static const char *const counter_names[] = {
|
||||
[KCSAN_COUNTER_USED_WATCHPOINTS] = "used_watchpoints",
|
||||
[KCSAN_COUNTER_SETUP_WATCHPOINTS] = "setup_watchpoints",
|
||||
[KCSAN_COUNTER_DATA_RACES] = "data_races",
|
||||
[KCSAN_COUNTER_ASSERT_FAILURES] = "assert_failures",
|
||||
[KCSAN_COUNTER_NO_CAPACITY] = "no_capacity",
|
||||
[KCSAN_COUNTER_REPORT_RACES] = "report_races",
|
||||
[KCSAN_COUNTER_RACES_UNKNOWN_ORIGIN] = "races_unknown_origin",
|
||||
[KCSAN_COUNTER_UNENCODABLE_ACCESSES] = "unencodable_accesses",
|
||||
[KCSAN_COUNTER_ENCODING_FALSE_POSITIVES] = "encoding_false_positives",
|
||||
};
|
||||
static_assert(ARRAY_SIZE(counter_names) == KCSAN_COUNTER_COUNT);
|
||||
|
||||
/*
|
||||
* Addresses for filtering functions from reporting. This list can be used as a
|
||||
|
@ -39,34 +50,6 @@ static struct {
|
|||
};
|
||||
static DEFINE_SPINLOCK(report_filterlist_lock);
|
||||
|
||||
static const char *counter_to_name(enum kcsan_counter_id id)
|
||||
{
|
||||
switch (id) {
|
||||
case KCSAN_COUNTER_USED_WATCHPOINTS: return "used_watchpoints";
|
||||
case KCSAN_COUNTER_SETUP_WATCHPOINTS: return "setup_watchpoints";
|
||||
case KCSAN_COUNTER_DATA_RACES: return "data_races";
|
||||
case KCSAN_COUNTER_ASSERT_FAILURES: return "assert_failures";
|
||||
case KCSAN_COUNTER_NO_CAPACITY: return "no_capacity";
|
||||
case KCSAN_COUNTER_REPORT_RACES: return "report_races";
|
||||
case KCSAN_COUNTER_RACES_UNKNOWN_ORIGIN: return "races_unknown_origin";
|
||||
case KCSAN_COUNTER_UNENCODABLE_ACCESSES: return "unencodable_accesses";
|
||||
case KCSAN_COUNTER_ENCODING_FALSE_POSITIVES: return "encoding_false_positives";
|
||||
case KCSAN_COUNTER_COUNT:
|
||||
BUG();
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
void kcsan_counter_inc(enum kcsan_counter_id id)
|
||||
{
|
||||
atomic_long_inc(&counters[id]);
|
||||
}
|
||||
|
||||
void kcsan_counter_dec(enum kcsan_counter_id id)
|
||||
{
|
||||
atomic_long_dec(&counters[id]);
|
||||
}
|
||||
|
||||
/*
|
||||
* The microbenchmark allows benchmarking KCSAN core runtime only. To run
|
||||
* multiple threads, pipe 'microbench=<iters>' from multiple tasks into the
|
||||
|
@ -86,7 +69,7 @@ static noinline void microbenchmark(unsigned long iters)
|
|||
*/
|
||||
WRITE_ONCE(kcsan_enabled, false);
|
||||
|
||||
pr_info("KCSAN: %s begin | iters: %lu\n", __func__, iters);
|
||||
pr_info("%s begin | iters: %lu\n", __func__, iters);
|
||||
|
||||
cycles = get_cycles();
|
||||
while (iters--) {
|
||||
|
@ -97,73 +80,13 @@ static noinline void microbenchmark(unsigned long iters)
|
|||
}
|
||||
cycles = get_cycles() - cycles;
|
||||
|
||||
pr_info("KCSAN: %s end | cycles: %llu\n", __func__, cycles);
|
||||
pr_info("%s end | cycles: %llu\n", __func__, cycles);
|
||||
|
||||
WRITE_ONCE(kcsan_enabled, was_enabled);
|
||||
/* restore context */
|
||||
current->kcsan_ctx = ctx_save;
|
||||
}
|
||||
|
||||
/*
|
||||
* Simple test to create conflicting accesses. Write 'test=<iters>' to KCSAN's
|
||||
* debugfs file from multiple tasks to generate real conflicts and show reports.
|
||||
*/
|
||||
static long test_dummy;
|
||||
static long test_flags;
|
||||
static long test_scoped;
|
||||
static noinline void test_thread(unsigned long iters)
|
||||
{
|
||||
const long CHANGE_BITS = 0xff00ff00ff00ff00L;
|
||||
const struct kcsan_ctx ctx_save = current->kcsan_ctx;
|
||||
cycles_t cycles;
|
||||
|
||||
/* We may have been called from an atomic region; reset context. */
|
||||
memset(¤t->kcsan_ctx, 0, sizeof(current->kcsan_ctx));
|
||||
|
||||
pr_info("KCSAN: %s begin | iters: %lu\n", __func__, iters);
|
||||
pr_info("test_dummy@%px, test_flags@%px, test_scoped@%px,\n",
|
||||
&test_dummy, &test_flags, &test_scoped);
|
||||
|
||||
cycles = get_cycles();
|
||||
while (iters--) {
|
||||
/* These all should generate reports. */
|
||||
__kcsan_check_read(&test_dummy, sizeof(test_dummy));
|
||||
ASSERT_EXCLUSIVE_WRITER(test_dummy);
|
||||
ASSERT_EXCLUSIVE_ACCESS(test_dummy);
|
||||
|
||||
ASSERT_EXCLUSIVE_BITS(test_flags, ~CHANGE_BITS); /* no report */
|
||||
__kcsan_check_read(&test_flags, sizeof(test_flags)); /* no report */
|
||||
|
||||
ASSERT_EXCLUSIVE_BITS(test_flags, CHANGE_BITS); /* report */
|
||||
__kcsan_check_read(&test_flags, sizeof(test_flags)); /* no report */
|
||||
|
||||
/* not actually instrumented */
|
||||
WRITE_ONCE(test_dummy, iters); /* to observe value-change */
|
||||
__kcsan_check_write(&test_dummy, sizeof(test_dummy));
|
||||
|
||||
test_flags ^= CHANGE_BITS; /* generate value-change */
|
||||
__kcsan_check_write(&test_flags, sizeof(test_flags));
|
||||
|
||||
BUG_ON(current->kcsan_ctx.scoped_accesses.prev);
|
||||
{
|
||||
/* Should generate reports anywhere in this block. */
|
||||
ASSERT_EXCLUSIVE_WRITER_SCOPED(test_scoped);
|
||||
ASSERT_EXCLUSIVE_ACCESS_SCOPED(test_scoped);
|
||||
BUG_ON(!current->kcsan_ctx.scoped_accesses.prev);
|
||||
/* Unrelated accesses. */
|
||||
__kcsan_check_access(&cycles, sizeof(cycles), 0);
|
||||
__kcsan_check_access(&cycles, sizeof(cycles), KCSAN_ACCESS_ATOMIC);
|
||||
}
|
||||
BUG_ON(current->kcsan_ctx.scoped_accesses.prev);
|
||||
}
|
||||
cycles = get_cycles() - cycles;
|
||||
|
||||
pr_info("KCSAN: %s end | cycles: %llu\n", __func__, cycles);
|
||||
|
||||
/* restore context */
|
||||
current->kcsan_ctx = ctx_save;
|
||||
}
|
||||
|
||||
static int cmp_filterlist_addrs(const void *rhs, const void *lhs)
|
||||
{
|
||||
const unsigned long a = *(const unsigned long *)rhs;
|
||||
|
@ -220,7 +143,7 @@ static ssize_t insert_report_filterlist(const char *func)
|
|||
ssize_t ret = 0;
|
||||
|
||||
if (!addr) {
|
||||
pr_err("KCSAN: could not find function: '%s'\n", func);
|
||||
pr_err("could not find function: '%s'\n", func);
|
||||
return -ENOENT;
|
||||
}
|
||||
|
||||
|
@ -270,9 +193,10 @@ static int show_info(struct seq_file *file, void *v)
|
|||
|
||||
/* show stats */
|
||||
seq_printf(file, "enabled: %i\n", READ_ONCE(kcsan_enabled));
|
||||
for (i = 0; i < KCSAN_COUNTER_COUNT; ++i)
|
||||
seq_printf(file, "%s: %ld\n", counter_to_name(i),
|
||||
atomic_long_read(&counters[i]));
|
||||
for (i = 0; i < KCSAN_COUNTER_COUNT; ++i) {
|
||||
seq_printf(file, "%s: %ld\n", counter_names[i],
|
||||
atomic_long_read(&kcsan_counters[i]));
|
||||
}
|
||||
|
||||
/* show filter functions, and filter type */
|
||||
spin_lock_irqsave(&report_filterlist_lock, flags);
|
||||
|
@ -307,18 +231,12 @@ debugfs_write(struct file *file, const char __user *buf, size_t count, loff_t *o
|
|||
WRITE_ONCE(kcsan_enabled, true);
|
||||
} else if (!strcmp(arg, "off")) {
|
||||
WRITE_ONCE(kcsan_enabled, false);
|
||||
} else if (!strncmp(arg, "microbench=", sizeof("microbench=") - 1)) {
|
||||
} else if (str_has_prefix(arg, "microbench=")) {
|
||||
unsigned long iters;
|
||||
|
||||
if (kstrtoul(&arg[sizeof("microbench=") - 1], 0, &iters))
|
||||
if (kstrtoul(&arg[strlen("microbench=")], 0, &iters))
|
||||
return -EINVAL;
|
||||
microbenchmark(iters);
|
||||
} else if (!strncmp(arg, "test=", sizeof("test=") - 1)) {
|
||||
unsigned long iters;
|
||||
|
||||
if (kstrtoul(&arg[sizeof("test=") - 1], 0, &iters))
|
||||
return -EINVAL;
|
||||
test_thread(iters);
|
||||
} else if (!strcmp(arg, "whitelist")) {
|
||||
set_report_filterlist_whitelist(true);
|
||||
} else if (!strcmp(arg, "blacklist")) {
|
||||
|
|
|
@ -27,6 +27,12 @@
|
|||
#include <linux/types.h>
|
||||
#include <trace/events/printk.h>
|
||||
|
||||
#ifdef CONFIG_CC_HAS_TSAN_COMPOUND_READ_BEFORE_WRITE
|
||||
#define __KCSAN_ACCESS_RW(alt) (KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE)
|
||||
#else
|
||||
#define __KCSAN_ACCESS_RW(alt) (alt)
|
||||
#endif
|
||||
|
||||
/* Points to current test-case memory access "kernels". */
|
||||
static void (*access_kernels[2])(void);
|
||||
|
||||
|
@ -186,20 +192,21 @@ static bool report_matches(const struct expect_report *r)
|
|||
|
||||
/* Access 1 & 2 */
|
||||
for (i = 0; i < 2; ++i) {
|
||||
const int ty = r->access[i].type;
|
||||
const char *const access_type =
|
||||
(r->access[i].type & KCSAN_ACCESS_ASSERT) ?
|
||||
((r->access[i].type & KCSAN_ACCESS_WRITE) ?
|
||||
"assert no accesses" :
|
||||
"assert no writes") :
|
||||
((r->access[i].type & KCSAN_ACCESS_WRITE) ?
|
||||
"write" :
|
||||
"read");
|
||||
(ty & KCSAN_ACCESS_ASSERT) ?
|
||||
((ty & KCSAN_ACCESS_WRITE) ?
|
||||
"assert no accesses" :
|
||||
"assert no writes") :
|
||||
((ty & KCSAN_ACCESS_WRITE) ?
|
||||
((ty & KCSAN_ACCESS_COMPOUND) ?
|
||||
"read-write" :
|
||||
"write") :
|
||||
"read");
|
||||
const char *const access_type_aux =
|
||||
(r->access[i].type & KCSAN_ACCESS_ATOMIC) ?
|
||||
" (marked)" :
|
||||
((r->access[i].type & KCSAN_ACCESS_SCOPED) ?
|
||||
" (scoped)" :
|
||||
"");
|
||||
(ty & KCSAN_ACCESS_ATOMIC) ?
|
||||
" (marked)" :
|
||||
((ty & KCSAN_ACCESS_SCOPED) ? " (scoped)" : "");
|
||||
|
||||
if (i == 1) {
|
||||
/* Access 2 */
|
||||
|
@ -277,6 +284,12 @@ static noinline void test_kernel_write_atomic(void)
|
|||
WRITE_ONCE(test_var, READ_ONCE_NOCHECK(test_sink) + 1);
|
||||
}
|
||||
|
||||
static noinline void test_kernel_atomic_rmw(void)
|
||||
{
|
||||
/* Use builtin, so we can set up the "bad" atomic/non-atomic scenario. */
|
||||
__atomic_fetch_add(&test_var, 1, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
__no_kcsan
|
||||
static noinline void test_kernel_write_uninstrumented(void) { test_var++; }
|
||||
|
||||
|
@ -390,6 +403,15 @@ static noinline void test_kernel_seqlock_writer(void)
|
|||
write_sequnlock_irqrestore(&test_seqlock, flags);
|
||||
}
|
||||
|
||||
static noinline void test_kernel_atomic_builtins(void)
|
||||
{
|
||||
/*
|
||||
* Generate concurrent accesses, expecting no reports, ensuring KCSAN
|
||||
* treats builtin atomics as actually atomic.
|
||||
*/
|
||||
__atomic_load_n(&test_var, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
/* ===== Test cases ===== */
|
||||
|
||||
/* Simple test with normal data race. */
|
||||
|
@ -430,8 +452,8 @@ static void test_concurrent_races(struct kunit *test)
|
|||
const struct expect_report expect = {
|
||||
.access = {
|
||||
/* NULL will match any address. */
|
||||
{ test_kernel_rmw_array, NULL, 0, KCSAN_ACCESS_WRITE },
|
||||
{ test_kernel_rmw_array, NULL, 0, 0 },
|
||||
{ test_kernel_rmw_array, NULL, 0, __KCSAN_ACCESS_RW(KCSAN_ACCESS_WRITE) },
|
||||
{ test_kernel_rmw_array, NULL, 0, __KCSAN_ACCESS_RW(0) },
|
||||
},
|
||||
};
|
||||
static const struct expect_report never = {
|
||||
|
@ -620,6 +642,29 @@ static void test_read_plain_atomic_write(struct kunit *test)
|
|||
KUNIT_EXPECT_TRUE(test, match_expect);
|
||||
}
|
||||
|
||||
/* Test that atomic RMWs generate correct report. */
|
||||
__no_kcsan
|
||||
static void test_read_plain_atomic_rmw(struct kunit *test)
|
||||
{
|
||||
const struct expect_report expect = {
|
||||
.access = {
|
||||
{ test_kernel_read, &test_var, sizeof(test_var), 0 },
|
||||
{ test_kernel_atomic_rmw, &test_var, sizeof(test_var),
|
||||
KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ATOMIC },
|
||||
},
|
||||
};
|
||||
bool match_expect = false;
|
||||
|
||||
if (IS_ENABLED(CONFIG_KCSAN_IGNORE_ATOMICS))
|
||||
return;
|
||||
|
||||
begin_test_checks(test_kernel_read, test_kernel_atomic_rmw);
|
||||
do {
|
||||
match_expect = report_matches(&expect);
|
||||
} while (!end_test_checks(match_expect));
|
||||
KUNIT_EXPECT_TRUE(test, match_expect);
|
||||
}
|
||||
|
||||
/* Zero-sized accesses should never cause data race reports. */
|
||||
__no_kcsan
|
||||
static void test_zero_size_access(struct kunit *test)
|
||||
|
@ -852,6 +897,59 @@ static void test_seqlock_noreport(struct kunit *test)
|
|||
KUNIT_EXPECT_FALSE(test, match_never);
|
||||
}
|
||||
|
||||
/*
|
||||
* Test atomic builtins work and required instrumentation functions exist. We
|
||||
* also test that KCSAN understands they're atomic by racing with them via
|
||||
* test_kernel_atomic_builtins(), and expect no reports.
|
||||
*
|
||||
* The atomic builtins _SHOULD NOT_ be used in normal kernel code!
|
||||
*/
|
||||
static void test_atomic_builtins(struct kunit *test)
|
||||
{
|
||||
bool match_never = false;
|
||||
|
||||
begin_test_checks(test_kernel_atomic_builtins, test_kernel_atomic_builtins);
|
||||
do {
|
||||
long tmp;
|
||||
|
||||
kcsan_enable_current();
|
||||
|
||||
__atomic_store_n(&test_var, 42L, __ATOMIC_RELAXED);
|
||||
KUNIT_EXPECT_EQ(test, 42L, __atomic_load_n(&test_var, __ATOMIC_RELAXED));
|
||||
|
||||
KUNIT_EXPECT_EQ(test, 42L, __atomic_exchange_n(&test_var, 20, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, 20L, test_var);
|
||||
|
||||
tmp = 20L;
|
||||
KUNIT_EXPECT_TRUE(test, __atomic_compare_exchange_n(&test_var, &tmp, 30L,
|
||||
0, __ATOMIC_RELAXED,
|
||||
__ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, tmp, 20L);
|
||||
KUNIT_EXPECT_EQ(test, test_var, 30L);
|
||||
KUNIT_EXPECT_FALSE(test, __atomic_compare_exchange_n(&test_var, &tmp, 40L,
|
||||
1, __ATOMIC_RELAXED,
|
||||
__ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, tmp, 30L);
|
||||
KUNIT_EXPECT_EQ(test, test_var, 30L);
|
||||
|
||||
KUNIT_EXPECT_EQ(test, 30L, __atomic_fetch_add(&test_var, 1, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, 31L, __atomic_fetch_sub(&test_var, 1, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, 30L, __atomic_fetch_and(&test_var, 0xf, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, 14L, __atomic_fetch_xor(&test_var, 0xf, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, 1L, __atomic_fetch_or(&test_var, 0xf0, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, 241L, __atomic_fetch_nand(&test_var, 0xf, __ATOMIC_RELAXED));
|
||||
KUNIT_EXPECT_EQ(test, -2L, test_var);
|
||||
|
||||
__atomic_thread_fence(__ATOMIC_SEQ_CST);
|
||||
__atomic_signal_fence(__ATOMIC_SEQ_CST);
|
||||
|
||||
kcsan_disable_current();
|
||||
|
||||
match_never = report_available();
|
||||
} while (!end_test_checks(match_never));
|
||||
KUNIT_EXPECT_FALSE(test, match_never);
|
||||
}
|
||||
|
||||
/*
|
||||
* Each test case is run with different numbers of threads. Until KUnit supports
|
||||
* passing arguments for each test case, we encode #threads in the test case
|
||||
|
@ -880,6 +978,7 @@ static struct kunit_case kcsan_test_cases[] = {
|
|||
KCSAN_KUNIT_CASE(test_write_write_struct_part),
|
||||
KCSAN_KUNIT_CASE(test_read_atomic_write_atomic),
|
||||
KCSAN_KUNIT_CASE(test_read_plain_atomic_write),
|
||||
KCSAN_KUNIT_CASE(test_read_plain_atomic_rmw),
|
||||
KCSAN_KUNIT_CASE(test_zero_size_access),
|
||||
KCSAN_KUNIT_CASE(test_data_race),
|
||||
KCSAN_KUNIT_CASE(test_assert_exclusive_writer),
|
||||
|
@ -891,6 +990,7 @@ static struct kunit_case kcsan_test_cases[] = {
|
|||
KCSAN_KUNIT_CASE(test_assert_exclusive_access_scoped),
|
||||
KCSAN_KUNIT_CASE(test_jiffies_noreport),
|
||||
KCSAN_KUNIT_CASE(test_seqlock_noreport),
|
||||
KCSAN_KUNIT_CASE(test_atomic_builtins),
|
||||
{},
|
||||
};
|
||||
|
||||
|
|
|
@ -8,6 +8,7 @@
|
|||
#ifndef _KERNEL_KCSAN_KCSAN_H
|
||||
#define _KERNEL_KCSAN_KCSAN_H
|
||||
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/kcsan.h>
|
||||
#include <linux/sched.h>
|
||||
|
||||
|
@ -34,6 +35,10 @@ void kcsan_restore_irqtrace(struct task_struct *task);
|
|||
*/
|
||||
void kcsan_debugfs_init(void);
|
||||
|
||||
/*
|
||||
* Statistics counters displayed via debugfs; should only be modified in
|
||||
* slow-paths.
|
||||
*/
|
||||
enum kcsan_counter_id {
|
||||
/*
|
||||
* Number of watchpoints currently in use.
|
||||
|
@ -86,12 +91,7 @@ enum kcsan_counter_id {
|
|||
|
||||
KCSAN_COUNTER_COUNT, /* number of counters */
|
||||
};
|
||||
|
||||
/*
|
||||
* Increment/decrement counter with given id; avoid calling these in fast-path.
|
||||
*/
|
||||
extern void kcsan_counter_inc(enum kcsan_counter_id id);
|
||||
extern void kcsan_counter_dec(enum kcsan_counter_id id);
|
||||
extern atomic_long_t kcsan_counters[KCSAN_COUNTER_COUNT];
|
||||
|
||||
/*
|
||||
* Returns true if data races in the function symbol that maps to func_addr
|
||||
|
|
|
@ -228,6 +228,10 @@ static const char *get_access_type(int type)
|
|||
return "write";
|
||||
case KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ATOMIC:
|
||||
return "write (marked)";
|
||||
case KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE:
|
||||
return "read-write";
|
||||
case KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ATOMIC:
|
||||
return "read-write (marked)";
|
||||
case KCSAN_ACCESS_SCOPED:
|
||||
return "read (scoped)";
|
||||
case KCSAN_ACCESS_SCOPED | KCSAN_ACCESS_ATOMIC:
|
||||
|
@ -275,8 +279,8 @@ static int get_stack_skipnr(const unsigned long stack_entries[], int num_entries
|
|||
|
||||
cur = strnstr(buf, "kcsan_", len);
|
||||
if (cur) {
|
||||
cur += sizeof("kcsan_") - 1;
|
||||
if (strncmp(cur, "test", sizeof("test") - 1))
|
||||
cur += strlen("kcsan_");
|
||||
if (!str_has_prefix(cur, "test"))
|
||||
continue; /* KCSAN runtime function. */
|
||||
/* KCSAN related test. */
|
||||
}
|
||||
|
@ -555,7 +559,7 @@ static bool prepare_report_consumer(unsigned long *flags,
|
|||
* If the actual accesses to not match, this was a false
|
||||
* positive due to watchpoint encoding.
|
||||
*/
|
||||
kcsan_counter_inc(KCSAN_COUNTER_ENCODING_FALSE_POSITIVES);
|
||||
atomic_long_inc(&kcsan_counters[KCSAN_COUNTER_ENCODING_FALSE_POSITIVES]);
|
||||
goto discard;
|
||||
}
|
||||
|
||||
|
|
|
@@ -1,5 +1,7 @@
// SPDX-License-Identifier: GPL-2.0

#define pr_fmt(fmt) "kcsan: " fmt

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/printk.h>

@@ -116,16 +118,16 @@ static int __init kcsan_selftest(void)
		if (do_test()) \
			++passed; \
		else \
			pr_err("KCSAN selftest: " #do_test " failed"); \
			pr_err("selftest: " #do_test " failed"); \
	} while (0)

	RUN_TEST(test_requires);
	RUN_TEST(test_encode_decode);
	RUN_TEST(test_matching_access);

	pr_info("KCSAN selftest: %d/%d tests passed\n", passed, total);
	pr_info("selftest: %d/%d tests passed\n", passed, total);
	if (passed != total)
		panic("KCSAN selftests failed");
		panic("selftests failed");
	return 0;
}
postcore_initcall(kcsan_selftest);
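
The pr_fmt() definition added at the top of the file is what allows the literal "KCSAN: " prefixes to be dropped from the messages above; a minimal, illustrative sketch of the effect (not taken from the patch):

#define pr_fmt(fmt) "kcsan: " fmt		/* must precede the printk include */

#include <linux/printk.h>

static void example(void)
{
	/* Prints: "kcsan: selftest: 3/3 tests passed" */
	pr_info("selftest: %d/%d tests passed\n", 3, 3);
}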
File diff suppressed because it is too large

@@ -20,9 +20,12 @@ enum lock_usage_bit {
#undef LOCKDEP_STATE
	LOCK_USED,
	LOCK_USED_READ,
	LOCK_USAGE_STATES
	LOCK_USAGE_STATES,
};

/* states after LOCK_USED_READ are not traced and printed */
static_assert(LOCK_TRACE_STATES == LOCK_USAGE_STATES);

#define LOCK_USAGE_READ_MASK 1
#define LOCK_USAGE_DIR_MASK 2
#define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))

@@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
extern struct list_head all_lock_classes;
extern struct lock_chain lock_chains[];

#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)

extern void get_usage_chars(struct lock_class *class,
			    char usage[LOCK_USAGE_CHARS]);
@@ -35,7 +35,7 @@
 * into a single 64-byte cache line.
 */
struct clock_data {
	seqcount_t		seq;
	seqcount_latch_t	seq;
	struct clock_read_data read_data[2];
	ktime_t			wrap_kt;
	unsigned long		rate;

@@ -76,7 +76,7 @@ struct clock_read_data *sched_clock_read_begin(unsigned int *seq)

int sched_clock_read_retry(unsigned int seq)
{
	return read_seqcount_retry(&cd.seq, seq);
	return read_seqcount_latch_retry(&cd.seq, seq);
}

unsigned long long notrace sched_clock(void)

@@ -258,7 +258,7 @@ void __init generic_sched_clock_init(void)
 */
static u64 notrace suspended_sched_clock_read(void)
{
	unsigned int seq = raw_read_seqcount(&cd.seq);
	unsigned int seq = raw_read_seqcount_latch(&cd.seq);

	return cd.read_data[seq & 1].epoch_cyc;
}
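
As a hedged illustration of how a caller consumes the latch-protected clock data through this API, a reader loop might look as follows; only the epoch_cyc field shown in the hunk above is touched, everything else is omitted and foo_read_epoch_cyc() is a made-up name.

static u64 foo_read_epoch_cyc(void)
{
	struct clock_read_data *rd;
	unsigned int seq;
	u64 cyc;

	do {
		rd = sched_clock_read_begin(&seq);
		cyc = rd->epoch_cyc;
	} while (sched_clock_read_retry(seq));

	return cyc;
}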
@@ -67,7 +67,7 @@ int __read_mostly timekeeping_suspended;
 * See @update_fast_timekeeper() below.
 */
struct tk_fast {
	seqcount_raw_spinlock_t	seq;
	seqcount_latch_t	seq;
	struct tk_read_base	base[2];
};

@@ -101,13 +101,13 @@ static struct clocksource dummy_clock = {
}

static struct tk_fast tk_fast_mono ____cacheline_aligned = {
	.seq     = SEQCNT_RAW_SPINLOCK_ZERO(tk_fast_mono.seq, &timekeeper_lock),
	.seq     = SEQCNT_LATCH_ZERO(tk_fast_mono.seq),
	.base[0] = FAST_TK_INIT,
	.base[1] = FAST_TK_INIT,
};

static struct tk_fast tk_fast_raw  ____cacheline_aligned = {
	.seq     = SEQCNT_RAW_SPINLOCK_ZERO(tk_fast_raw.seq, &timekeeper_lock),
	.seq     = SEQCNT_LATCH_ZERO(tk_fast_raw.seq),
	.base[0] = FAST_TK_INIT,
	.base[1] = FAST_TK_INIT,
};

@@ -484,7 +484,7 @@ static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf)
					tk_clock_read(tkr),
					tkr->cycle_last,
					tkr->mask));
	} while (read_seqcount_retry(&tkf->seq, seq));
	} while (read_seqcount_latch_retry(&tkf->seq, seq));

	return now;
}

@@ -548,7 +548,7 @@ static __always_inline u64 __ktime_get_real_fast(struct tk_fast *tkf, u64 *mono)
		delta = timekeeping_delta_to_ns(tkr,
				clocksource_delta(tk_clock_read(tkr),
						  tkr->cycle_last, tkr->mask));
	} while (read_seqcount_retry(&tkf->seq, seq));
	} while (read_seqcount_latch_retry(&tkf->seq, seq));

	if (mono)
		*mono = basem + delta;
@@ -40,6 +40,11 @@ menuconfig KCSAN

if KCSAN

# Compiler capabilities that should not fail the test if they are unavailable.
config CC_HAS_TSAN_COMPOUND_READ_BEFORE_WRITE
	def_bool (CC_IS_CLANG && $(cc-option,-fsanitize=thread -mllvm -tsan-compound-read-before-write=1)) || \
		 (CC_IS_GCC && $(cc-option,-fsanitize=thread --param tsan-compound-read-before-write=1))

config KCSAN_VERBOSE
	bool "Show verbose reports with more information about system state"
	depends on PROVE_LOCKING

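As shown by the kcsan-test hunk earlier in this series, the new CC_HAS_TSAN_COMPOUND_READ_BEFORE_WRITE symbol is consumed roughly like this, falling back to the plain access type when the compiler lacks the compound-read-before-write instrumentation:

#ifdef CONFIG_CC_HAS_TSAN_COMPOUND_READ_BEFORE_WRITE
#define __KCSAN_ACCESS_RW(alt)	(KCSAN_ACCESS_COMPOUND | KCSAN_ACCESS_WRITE)
#else
#define __KCSAN_ACCESS_RW(alt)	(alt)
#endif
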
@ -28,6 +28,7 @@
|
|||
* Change this to 1 if you want to see the failure printouts:
|
||||
*/
|
||||
static unsigned int debug_locks_verbose;
|
||||
unsigned int force_read_lock_recursive;
|
||||
|
||||
static DEFINE_WD_CLASS(ww_lockdep);
|
||||
|
||||
|
@ -395,6 +396,49 @@ static void rwsem_ABBA1(void)
|
|||
MU(Y1); // should fail
|
||||
}
|
||||
|
||||
/*
|
||||
* read_lock(A)
|
||||
* spin_lock(B)
|
||||
* spin_lock(B)
|
||||
* write_lock(A)
|
||||
*
|
||||
* This test case is aimed at poking whether the chain cache prevents us from
|
||||
* detecting a read-lock/lock-write deadlock: if the chain cache doesn't differ
|
||||
* read/write locks, the following case may happen
|
||||
*
|
||||
* { read_lock(A)->lock(B) dependency exists }
|
||||
*
|
||||
* P0:
|
||||
* lock(B);
|
||||
* read_lock(A);
|
||||
*
|
||||
* { Not a deadlock, B -> A is added in the chain cache }
|
||||
*
|
||||
* P1:
|
||||
* lock(B);
|
||||
* write_lock(A);
|
||||
*
|
||||
* { B->A found in chain cache, not reported as a deadlock }
|
||||
*
|
||||
*/
|
||||
static void rlock_chaincache_ABBA1(void)
|
||||
{
|
||||
RL(X1);
|
||||
L(Y1);
|
||||
U(Y1);
|
||||
RU(X1);
|
||||
|
||||
L(Y1);
|
||||
RL(X1);
|
||||
RU(X1);
|
||||
U(Y1);
|
||||
|
||||
L(Y1);
|
||||
WL(X1);
|
||||
WU(X1);
|
||||
U(Y1); // should fail
|
||||
}
|
||||
|
||||
/*
|
||||
* read_lock(A)
|
||||
* spin_lock(B)
|
||||
|
@ -990,6 +1034,133 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_wlock)
|
|||
#undef E2
|
||||
#undef E3
|
||||
|
||||
/*
|
||||
* write-read / write-read / write-read deadlock even if read is recursive
|
||||
*/
|
||||
|
||||
#define E1() \
|
||||
\
|
||||
WL(X1); \
|
||||
RL(Y1); \
|
||||
RU(Y1); \
|
||||
WU(X1);
|
||||
|
||||
#define E2() \
|
||||
\
|
||||
WL(Y1); \
|
||||
RL(Z1); \
|
||||
RU(Z1); \
|
||||
WU(Y1);
|
||||
|
||||
#define E3() \
|
||||
\
|
||||
WL(Z1); \
|
||||
RL(X1); \
|
||||
RU(X1); \
|
||||
WU(Z1);
|
||||
|
||||
#include "locking-selftest-rlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(W1R2_W2R3_W3R1)
|
||||
|
||||
#undef E1
|
||||
#undef E2
|
||||
#undef E3
|
||||
|
||||
/*
|
||||
* write-write / read-read / write-read deadlock even if read is recursive
|
||||
*/
|
||||
|
||||
#define E1() \
|
||||
\
|
||||
WL(X1); \
|
||||
WL(Y1); \
|
||||
WU(Y1); \
|
||||
WU(X1);
|
||||
|
||||
#define E2() \
|
||||
\
|
||||
RL(Y1); \
|
||||
RL(Z1); \
|
||||
RU(Z1); \
|
||||
RU(Y1);
|
||||
|
||||
#define E3() \
|
||||
\
|
||||
WL(Z1); \
|
||||
RL(X1); \
|
||||
RU(X1); \
|
||||
WU(Z1);
|
||||
|
||||
#include "locking-selftest-rlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(W1W2_R2R3_W3R1)
|
||||
|
||||
#undef E1
|
||||
#undef E2
|
||||
#undef E3
|
||||
|
||||
/*
|
||||
* write-write / read-read / read-write is not deadlock when read is recursive
|
||||
*/
|
||||
|
||||
#define E1() \
|
||||
\
|
||||
WL(X1); \
|
||||
WL(Y1); \
|
||||
WU(Y1); \
|
||||
WU(X1);
|
||||
|
||||
#define E2() \
|
||||
\
|
||||
RL(Y1); \
|
||||
RL(Z1); \
|
||||
RU(Z1); \
|
||||
RU(Y1);
|
||||
|
||||
#define E3() \
|
||||
\
|
||||
RL(Z1); \
|
||||
WL(X1); \
|
||||
WU(X1); \
|
||||
RU(Z1);
|
||||
|
||||
#include "locking-selftest-rlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(W1R2_R2R3_W3W1)
|
||||
|
||||
#undef E1
|
||||
#undef E2
|
||||
#undef E3
|
||||
|
||||
/*
|
||||
* write-read / read-read / write-write is not deadlock when read is recursive
|
||||
*/
|
||||
|
||||
#define E1() \
|
||||
\
|
||||
WL(X1); \
|
||||
RL(Y1); \
|
||||
RU(Y1); \
|
||||
WU(X1);
|
||||
|
||||
#define E2() \
|
||||
\
|
||||
RL(Y1); \
|
||||
RL(Z1); \
|
||||
RU(Z1); \
|
||||
RU(Y1);
|
||||
|
||||
#define E3() \
|
||||
\
|
||||
WL(Z1); \
|
||||
WL(X1); \
|
||||
WU(X1); \
|
||||
WU(Z1);
|
||||
|
||||
#include "locking-selftest-rlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(W1W2_R2R3_R3W1)
|
||||
|
||||
#undef E1
|
||||
#undef E2
|
||||
#undef E3
|
||||
/*
|
||||
* read-lock / write-lock recursion that is actually safe.
|
||||
*/
|
||||
|
@ -1009,20 +1180,28 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_wlock)
|
|||
#define E3() \
|
||||
\
|
||||
IRQ_ENTER(); \
|
||||
RL(A); \
|
||||
LOCK(A); \
|
||||
L(B); \
|
||||
U(B); \
|
||||
RU(A); \
|
||||
UNLOCK(A); \
|
||||
IRQ_EXIT();
|
||||
|
||||
/*
|
||||
* Generate 12 testcases:
|
||||
* Generate 24 testcases:
|
||||
*/
|
||||
#include "locking-selftest-hardirq.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard)
|
||||
#include "locking-selftest-rlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard_rlock)
|
||||
|
||||
#include "locking-selftest-wlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard_wlock)
|
||||
|
||||
#include "locking-selftest-softirq.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
|
||||
#include "locking-selftest-rlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_rlock)
|
||||
|
||||
#include "locking-selftest-wlock.h"
|
||||
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_wlock)
|
||||
|
||||
#undef E1
|
||||
#undef E2
|
||||
|
@ -1036,8 +1215,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
|
|||
\
|
||||
IRQ_DISABLE(); \
|
||||
L(B); \
|
||||
WL(A); \
|
||||
WU(A); \
|
||||
LOCK(A); \
|
||||
UNLOCK(A); \
|
||||
U(B); \
|
||||
IRQ_ENABLE();
|
||||
|
||||
|
@@ -1054,13 +1233,75 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
	IRQ_EXIT();

/*
 * Generate 12 testcases:
 * Generate 24 testcases:
 */
#include "locking-selftest-hardirq.h"
// GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard)
#include "locking-selftest-rlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard_rlock)

#include "locking-selftest-wlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard_wlock)

#include "locking-selftest-softirq.h"
// GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft)
#include "locking-selftest-rlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft_rlock)

#include "locking-selftest-wlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft_wlock)

#undef E1
#undef E2
#undef E3
/*
 * read-lock / write-lock recursion that is unsafe.
 *
 * A is a ENABLED_*_READ lock
 * B is a USED_IN_*_READ lock
 *
 * read_lock(A);
 *			write_lock(B);
 * <interrupt>
 * read_lock(B);
 *			write_lock(A); // if this one is read_lock(), no deadlock
 */

#define E1()		\
			\
	IRQ_DISABLE();	\
	WL(B);		\
	LOCK(A);	\
	UNLOCK(A);	\
	WU(B);		\
	IRQ_ENABLE();

#define E2()		\
			\
	RL(A);		\
	RU(A);		\

#define E3()		\
			\
	IRQ_ENTER();	\
	RL(B);		\
	RU(B);		\
	IRQ_EXIT();

/*
 * Generate 24 testcases:
 */
#include "locking-selftest-hardirq.h"
#include "locking-selftest-rlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_hard_rlock)

#include "locking-selftest-wlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_hard_wlock)

#include "locking-selftest-softirq.h"
#include "locking-selftest-rlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_soft_rlock)

#include "locking-selftest-wlock.h"
GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_soft_wlock)

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# define I_SPINLOCK(x)	lockdep_reset_lock(&lock_##x.dep_map)
@@ -1199,6 +1440,19 @@ static inline void print_testname(const char *testname)
	dotest(name##_##nr, FAILURE, LOCKTYPE_RWLOCK);	\
	pr_cont("\n");

#define DO_TESTCASE_1RR(desc, name, nr)			\
	print_testname(desc"/"#nr);			\
	pr_cont(" |");					\
	dotest(name##_##nr, SUCCESS, LOCKTYPE_RWLOCK);	\
	pr_cont("\n");

#define DO_TESTCASE_1RRB(desc, name, nr)		\
	print_testname(desc"/"#nr);			\
	pr_cont(" |");					\
	dotest(name##_##nr, FAILURE, LOCKTYPE_RWLOCK);	\
	pr_cont("\n");


#define DO_TESTCASE_3(desc, name, nr)			\
	print_testname(desc"/"#nr);			\
	dotest(name##_spin_##nr, FAILURE, LOCKTYPE_SPIN);	\

@@ -1213,6 +1467,25 @@ static inline void print_testname(const char *testname)
	dotest(name##_rlock_##nr, SUCCESS, LOCKTYPE_RWLOCK);	\
	pr_cont("\n");

#define DO_TESTCASE_2RW(desc, name, nr)				\
	print_testname(desc"/"#nr);				\
	pr_cont(" |");						\
	dotest(name##_wlock_##nr, FAILURE, LOCKTYPE_RWLOCK);	\
	dotest(name##_rlock_##nr, SUCCESS, LOCKTYPE_RWLOCK);	\
	pr_cont("\n");

#define DO_TESTCASE_2x2RW(desc, name, nr)		\
	DO_TESTCASE_2RW("hard-"desc, name##_hard, nr)	\
	DO_TESTCASE_2RW("soft-"desc, name##_soft, nr)	\

#define DO_TESTCASE_6x2x2RW(desc, name)			\
	DO_TESTCASE_2x2RW(desc, name, 123);		\
	DO_TESTCASE_2x2RW(desc, name, 132);		\
	DO_TESTCASE_2x2RW(desc, name, 213);		\
	DO_TESTCASE_2x2RW(desc, name, 231);		\
	DO_TESTCASE_2x2RW(desc, name, 312);		\
	DO_TESTCASE_2x2RW(desc, name, 321);

#define DO_TESTCASE_6(desc, name)			\
	print_testname(desc);				\
	dotest(name##_spin, FAILURE, LOCKTYPE_SPIN);	\

@@ -1289,6 +1562,22 @@ static inline void print_testname(const char *testname)
	DO_TESTCASE_2IB(desc, name, 312);	\
	DO_TESTCASE_2IB(desc, name, 321);

#define DO_TESTCASE_6x1RR(desc, name)		\
	DO_TESTCASE_1RR(desc, name, 123);	\
	DO_TESTCASE_1RR(desc, name, 132);	\
	DO_TESTCASE_1RR(desc, name, 213);	\
	DO_TESTCASE_1RR(desc, name, 231);	\
	DO_TESTCASE_1RR(desc, name, 312);	\
	DO_TESTCASE_1RR(desc, name, 321);

#define DO_TESTCASE_6x1RRB(desc, name)		\
	DO_TESTCASE_1RRB(desc, name, 123);	\
	DO_TESTCASE_1RRB(desc, name, 132);	\
	DO_TESTCASE_1RRB(desc, name, 213);	\
	DO_TESTCASE_1RRB(desc, name, 231);	\
	DO_TESTCASE_1RRB(desc, name, 312);	\
	DO_TESTCASE_1RRB(desc, name, 321);

#define DO_TESTCASE_6x6(desc, name)		\
	DO_TESTCASE_6I(desc, name, 123);	\
	DO_TESTCASE_6I(desc, name, 132);	\
@@ -1966,6 +2255,108 @@ static void ww_tests(void)
	pr_cont("\n");
}


/*
 * <in hardirq handler>
 * read_lock(&A);
 *			<hardirq disable>
 *			spin_lock(&B);
 * spin_lock(&B);
 *			read_lock(&A);
 *
 * is a deadlock.
 */
static void queued_read_lock_hardirq_RE_Er(void)
{
	HARDIRQ_ENTER();
	read_lock(&rwlock_A);
	LOCK(B);
	UNLOCK(B);
	read_unlock(&rwlock_A);
	HARDIRQ_EXIT();

	HARDIRQ_DISABLE();
	LOCK(B);
	read_lock(&rwlock_A);
	read_unlock(&rwlock_A);
	UNLOCK(B);
	HARDIRQ_ENABLE();
}

/*
 * <in hardirq handler>
 * spin_lock(&B);
 *			<hardirq disable>
 *			read_lock(&A);
 * read_lock(&A);
 *			spin_lock(&B);
 *
 * is not a deadlock.
 */
static void queued_read_lock_hardirq_ER_rE(void)
{
	HARDIRQ_ENTER();
	LOCK(B);
	read_lock(&rwlock_A);
	read_unlock(&rwlock_A);
	UNLOCK(B);
	HARDIRQ_EXIT();

	HARDIRQ_DISABLE();
	read_lock(&rwlock_A);
	LOCK(B);
	UNLOCK(B);
	read_unlock(&rwlock_A);
	HARDIRQ_ENABLE();
}

/*
 * <hardirq disable>
 * spin_lock(&B);
 *			read_lock(&A);
 *			<in hardirq handler>
 *			spin_lock(&B);
 * read_lock(&A);
 *
 * is a deadlock, because the two read_lock()s are both non-recursive readers.
 */
static void queued_read_lock_hardirq_inversion(void)
{

	HARDIRQ_ENTER();
	LOCK(B);
	UNLOCK(B);
	HARDIRQ_EXIT();

	HARDIRQ_DISABLE();
	LOCK(B);
	read_lock(&rwlock_A);
	read_unlock(&rwlock_A);
	UNLOCK(B);
	HARDIRQ_ENABLE();

	read_lock(&rwlock_A);
	read_unlock(&rwlock_A);
}

static void queued_read_lock_tests(void)
{
	printk("  --------------------------------------------------------------------------\n");
	printk("  | queued read lock tests |\n");
	printk("  ---------------------------\n");
	print_testname("hardirq read-lock/lock-read");
	dotest(queued_read_lock_hardirq_RE_Er, FAILURE, LOCKTYPE_RWLOCK);
	pr_cont("\n");

	print_testname("hardirq lock-read/read-lock");
	dotest(queued_read_lock_hardirq_ER_rE, SUCCESS, LOCKTYPE_RWLOCK);
	pr_cont("\n");

	print_testname("hardirq inversion");
	dotest(queued_read_lock_hardirq_inversion, FAILURE, LOCKTYPE_RWLOCK);
	pr_cont("\n");
}

void locking_selftest(void)
{
	/*
@@ -1978,6 +2369,11 @@ void locking_selftest(void)
		return;
	}

	/*
	 * treats read_lock() as recursive read locks for testing purposes
	 */
	force_read_lock_recursive = 1;

	/*
	 * Run the testsuite:
	 */
@@ -2033,14 +2429,6 @@ void locking_selftest(void)
	print_testname("mixed read-lock/lock-write ABBA");
	pr_cont(" |");
	dotest(rlock_ABBA1, FAILURE, LOCKTYPE_RWLOCK);
#ifdef CONFIG_PROVE_LOCKING
	/*
	 * Lockdep does indeed fail here, but there's nothing we can do about
	 * that now.  Don't kill lockdep for it.
	 */
	unexpected_testcase_failures--;
#endif

	pr_cont(" |");
	dotest(rwsem_ABBA1, FAILURE, LOCKTYPE_RWSEM);

@@ -2056,6 +2444,15 @@ void locking_selftest(void)
	pr_cont(" |");
	dotest(rwsem_ABBA3, FAILURE, LOCKTYPE_RWSEM);

	print_testname("chain cached mixed R-L/L-W ABBA");
	pr_cont(" |");
	dotest(rlock_chaincache_ABBA1, FAILURE, LOCKTYPE_RWLOCK);

	DO_TESTCASE_6x1RRB("rlock W1R2/W2R3/W3R1", W1R2_W2R3_W3R1);
	DO_TESTCASE_6x1RRB("rlock W1W2/R2R3/W3R1", W1W2_R2R3_W3R1);
	DO_TESTCASE_6x1RR("rlock W1W2/R2R3/R3W1", W1W2_R2R3_R3W1);
	DO_TESTCASE_6x1RR("rlock W1R2/R2R3/W3W1", W1R2_R2R3_W3W1);

	printk("  --------------------------------------------------------------------------\n");

	/*
@@ -2068,11 +2465,19 @@ void locking_selftest(void)
	DO_TESTCASE_6x6("safe-A + unsafe-B #2", irqsafe4);
	DO_TESTCASE_6x6RW("irq lock-inversion", irq_inversion);

	DO_TESTCASE_6x2("irq read-recursion", irq_read_recursion);
//	DO_TESTCASE_6x2B("irq read-recursion #2", irq_read_recursion2);
	DO_TESTCASE_6x2x2RW("irq read-recursion", irq_read_recursion);
	DO_TESTCASE_6x2x2RW("irq read-recursion #2", irq_read_recursion2);
	DO_TESTCASE_6x2x2RW("irq read-recursion #3", irq_read_recursion3);

	ww_tests();

	force_read_lock_recursive = 0;
	/*
	 * queued_read_lock() specific test cases can be put here
	 */
	if (IS_ENABLED(CONFIG_QUEUED_RWLOCKS))
		queued_read_lock_tests();

	if (unexpected_testcase_failures) {
		printk("-----------------------------------------------------------------\n");
		debug_locks = 0;
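The basic pattern exercised by the new rlock recursion testcases above can
also be sketched with a plain rwlock_t. This is only an illustration and is
not part of the patch; foo_lock, foo_reader() and foo_writer() are made-up
names:

    #include <linux/spinlock.h>

    static DEFINE_RWLOCK(foo_lock);

    /* CPU 0: takes the read lock twice, with a window in between. */
    void foo_reader(void)
    {
            read_lock(&foo_lock);
            /*
             * If foo_writer() queues here, the second read_lock() below
             * waits behind the pending writer (process-context readers on
             * queued rwlocks are not recursive), while the writer waits
             * for the first read_unlock(): a deadlock that the recursive
             * read-lock detection can now report.
             */
            read_lock(&foo_lock);
            read_unlock(&foo_lock);
            read_unlock(&foo_lock);
    }

    /* CPU 1: a writer that arrives inside CPU 0's window. */
    void foo_writer(void)
    {
            write_lock(&foo_lock);
            write_unlock(&foo_lock);
    }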
mm/swap.c

@@ -763,10 +763,20 @@ static void lru_add_drain_per_cpu(struct work_struct *dummy)
 */
void lru_add_drain_all(void)
{
	static seqcount_t seqcount = SEQCNT_ZERO(seqcount);
	static DEFINE_MUTEX(lock);
	/*
	 * lru_drain_gen - Global pages generation number
	 *
	 * (A) Definition: global lru_drain_gen = x implies that all generations
	 *     0 < n <= x are already *scheduled* for draining.
	 *
	 * This is an optimization for the highly-contended use case where a
	 * user space workload keeps constantly generating a flow of pages for
	 * each CPU.
	 */
	static unsigned int lru_drain_gen;
	static struct cpumask has_work;
	int cpu, seq;
	static DEFINE_MUTEX(lock);
	unsigned cpu, this_gen;

	/*
	 * Make sure nobody triggers this path before mm_percpu_wq is fully

@@ -775,21 +785,54 @@ void lru_add_drain_all(void)
	if (WARN_ON(!mm_percpu_wq))
		return;

	seq = raw_read_seqcount_latch(&seqcount);
	/*
	 * Guarantee pagevec counter stores visible by this CPU are visible to
	 * other CPUs before loading the current drain generation.
	 */
	smp_mb();

	/*
	 * (B) Locally cache global LRU draining generation number
	 *
	 * The read barrier ensures that the counter is loaded before the mutex
	 * is taken. It pairs with smp_mb() inside the mutex critical section
	 * at (D).
	 */
	this_gen = smp_load_acquire(&lru_drain_gen);

	mutex_lock(&lock);

	/*
	 * Piggyback on drain started and finished while we waited for lock:
	 * all pages pended at the time of our enter were drained from vectors.
	 * (C) Exit the draining operation if a newer generation, from another
	 * lru_add_drain_all(), was already scheduled for draining. Check (A).
	 */
	if (__read_seqcount_retry(&seqcount, seq))
	if (unlikely(this_gen != lru_drain_gen))
		goto done;

	raw_write_seqcount_latch(&seqcount);
	/*
	 * (D) Increment global generation number
	 *
	 * Pairs with smp_load_acquire() at (B), outside of the critical
	 * section. Use a full memory barrier to guarantee that the new global
	 * drain generation number is stored before loading pagevec counters.
	 *
	 * This pairing must be done here, before the for_each_online_cpu loop
	 * below which drains the page vectors.
	 *
	 * Let x, y, and z represent some system CPU numbers, where x < y < z.
	 * Assume CPU #z is in the middle of the for_each_online_cpu loop
	 * below and has already reached CPU #y's per-cpu data. CPU #x comes
	 * along, adds some pages to its per-cpu vectors, then calls
	 * lru_add_drain_all().
	 *
	 * If the paired barrier is done at any later step, e.g. after the
	 * loop, CPU #x will just exit at (C) and miss flushing out all of its
	 * added pages.
	 */
	WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
	smp_mb();

	cpumask_clear(&has_work);

	for_each_online_cpu(cpu) {
		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);

@@ -801,7 +844,7 @@ void lru_add_drain_all(void)
		    need_activate_page_drain(cpu)) {
			INIT_WORK(work, lru_add_drain_per_cpu);
			queue_work_on(cpu, mm_percpu_wq, work);
			cpumask_set_cpu(cpu, &has_work);
			__cpumask_set_cpu(cpu, &has_work);
		}
	}

@@ -816,7 +859,7 @@ void lru_add_drain_all(void)
{
	lru_add_drain();
}
#endif
#endif /* CONFIG_SMP */

/**
 * release_pages - batched put_page()
@@ -11,5 +11,5 @@ endif
# of some options does not break KCSAN nor causes false positive reports.
CFLAGS_KCSAN := -fsanitize=thread \
	$(call cc-option,$(call cc-param,tsan-instrument-func-entry-exit=0) -fno-optimize-sibling-calls) \
	$(call cc-option,$(call cc-param,tsan-instrument-read-before-write=1)) \
	$(call cc-option,$(call cc-param,tsan-compound-read-before-write=1),$(call cc-option,$(call cc-param,tsan-instrument-read-before-write=1))) \
	$(call cc-param,tsan-distinguish-volatile=1)

@@ -16,6 +16,7 @@ fi
cat <<EOF |
asm-generic/atomic-instrumented.h
asm-generic/atomic-long.h
linux/atomic-arch-fallback.h
linux/atomic-fallback.h
EOF
while read header; do

@@ -5,9 +5,10 @@ ATOMICDIR=$(dirname $0)

. ${ATOMICDIR}/atomic-tbl.sh

#gen_param_check(arg)
#gen_param_check(meta, arg)
gen_param_check()
{
	local meta="$1"; shift
	local arg="$1"; shift
	local type="${arg%%:*}"
	local name="$(gen_param_name "${arg}")"

@@ -17,17 +18,25 @@ gen_param_check()
	i) return;;
	esac

	# We don't write to constant parameters
	[ ${type#c} != ${type} ] && rw="read"
	if [ ${type#c} != ${type} ]; then
		# We don't write to constant parameters.
		rw="read"
	elif [ "${meta}" != "s" ]; then
		# An atomic RMW: if this parameter is not a constant, and this atomic is
		# not just a 's'tore, this parameter is both read from and written to.
		rw="read_write"
	fi

	printf "\tinstrument_atomic_${rw}(${name}, sizeof(*${name}));\n"
}

#gen_param_check(arg...)
#gen_params_checks(meta, arg...)
gen_params_checks()
{
	local meta="$1"; shift

	while [ "$#" -gt 0 ]; do
		gen_param_check "$1"
		gen_param_check "$meta" "$1"
		shift;
	done
}

@@ -77,7 +86,7 @@ gen_proto_order_variant()

	local ret="$(gen_ret_type "${meta}" "${int}")"
	local params="$(gen_params "${int}" "${atomic}" "$@")"
	local checks="$(gen_params_checks "$@")"
	local checks="$(gen_params_checks "${meta}" "$@")"
	local args="$(gen_args "$@")"
	local retstmt="$(gen_ret_stmt "${meta}")"

@@ -205,6 +205,8 @@ regex_c=(
	'/\<DEVICE_ATTR_\(RW\|RO\|WO\)(\([[:alnum:]_]\+\)/dev_attr_\2/'
	'/\<DRIVER_ATTR_\(RW\|RO\|WO\)(\([[:alnum:]_]\+\)/driver_attr_\2/'
	'/\<\(DEFINE\|DECLARE\)_STATIC_KEY_\(TRUE\|FALSE\)\(\|_RO\)(\([[:alnum:]_]\+\)/\4/'
	'/^SEQCOUNT_LOCKTYPE(\([^,]*\),[[:space:]]*\([^,]*\),[^)]*)/seqcount_\2_t/'
	'/^SEQCOUNT_LOCKTYPE(\([^,]*\),[[:space:]]*\([^,]*\),[^)]*)/seqcount_\2_init/'
)
regex_kconfig=(
	'/^[[:blank:]]*\(menu\|\)config[[:blank:]]\+\([[:alnum:]_]\+\)/\2/'
@@ -3,9 +3,9 @@
	C  Self  R  W  RMW  Self  R  W  DR  DW  RMW  SV
	--  ----  -  -  ---  ----  -  -  --  --  ---  --

Store, e.g., WRITE_ONCE()	Y	Y
Load, e.g., READ_ONCE()		Y	Y  Y	Y
Unsuccessful RMW operation	Y	Y  Y	Y
Relaxed store			Y	Y
Relaxed load			Y	Y  Y	Y
Relaxed RMW operation		Y	Y  Y	Y
rcu_dereference()		Y	Y  Y	Y
Successful *_acquire()		R	Y  Y  Y  Y  Y	Y
Successful *_release()		C  Y  Y	Y  W	Y

@@ -17,14 +17,19 @@ smp_mb__before_atomic()       CP        Y  Y    Y        a   a   a   a    Y
smp_mb__after_atomic()	CP  a  a  Y  Y  Y  Y  Y	Y


Key:	C:	Ordering is cumulative
	P:	Ordering propagates
	R:	Read, for example, READ_ONCE(), or read portion of RMW
	W:	Write, for example, WRITE_ONCE(), or write portion of RMW
	Y:	Provides ordering
	a:	Provides ordering given intervening RMW atomic operation
	DR:	Dependent read (address dependency)
	DW:	Dependent write (address, data, or control dependency)
	RMW:	Atomic read-modify-write operation
	SELF:	Orders self, as opposed to accesses before and/or after
	SV:	Orders later accesses to the same variable
Key:	Relaxed:  A relaxed operation is either READ_ONCE(), WRITE_ONCE(),
		  a *_relaxed() RMW operation, an unsuccessful RMW
		  operation, a non-value-returning RMW operation such
		  as atomic_inc(), or one of the atomic*_read() and
		  atomic*_set() family of operations.
	C:	Ordering is cumulative
	P:	Ordering propagates
	R:	Read, for example, READ_ONCE(), or read portion of RMW
	W:	Write, for example, WRITE_ONCE(), or write portion of RMW
	Y:	Provides ordering
	a:	Provides ordering given intervening RMW atomic operation
	DR:	Dependent read (address dependency)
	DW:	Dependent write (address, data, or control dependency)
	RMW:	Atomic read-modify-write operation
	SELF:	Orders self, as opposed to accesses before and/or after
	SV:	Orders later accesses to the same variable
(File diff suppressed because it is too large.)
@@ -1,7 +1,7 @@
This document provides "recipes", that is, litmus tests for commonly
occurring situations, as well as a few that illustrate subtly broken but
attractive nuisances.  Many of these recipes include example code from
v4.13 of the Linux kernel.
v5.7 of the Linux kernel.

The first section covers simple special cases, the second section
takes off the training wheels to cover more involved examples,

@@ -278,7 +278,7 @@ is present if the value loaded determines the address of a later access
first place (control dependency).  Note that the term "data dependency"
is sometimes casually used to cover both address and data dependencies.

In lib/prime_numbers.c, the expand_to_next_prime() function invokes
In lib/math/prime_numbers.c, the expand_to_next_prime() function invokes
rcu_assign_pointer(), and the next_prime_number() function invokes
rcu_dereference().  This combination mediates access to a bit vector
that is expanded as additional primes are needed.

@@ -120,7 +120,7 @@ o Jade Alglave, Luc Maranget, and Michael Tautschnig. 2014. "Herding

o	Jade Alglave, Patrick Cousot, and Luc Maranget. 2016. "Syntax and
	semantics of the weak consistency model specification language
	cat". CoRR abs/1608.07531 (2016). http://arxiv.org/abs/1608.07531
	cat". CoRR abs/1608.07531 (2016). https://arxiv.org/abs/1608.07531


Memory-model comparisons
@@ -0,0 +1,271 @@
This document provides options for those wishing to keep their
memory-ordering lives simple, as is necessary for those whose domain
is complex.  After all, there are bugs other than memory-ordering bugs,
and the time spent gaining memory-ordering knowledge is not available
for gaining domain knowledge.  Furthermore, the Linux-kernel memory model
(LKMM) is quite complex, with subtle differences in code often having
dramatic effects on correctness.

The options near the beginning of this list are quite simple.  The idea
is not that kernel hackers don't already know about them, but rather
that they might need the occasional reminder.

Please note that this is a generic guide, and that specific subsystems
will often have special requirements or idioms.  For example, developers
of MMIO-based device drivers will often need to use mb(), rmb(), and
wmb(), and therefore might find smp_mb(), smp_rmb(), and smp_wmb()
to be more natural than smp_load_acquire() and smp_store_release().
On the other hand, those coming in from other environments will likely
be more familiar with these last two.


Single-threaded code
====================

In single-threaded code, there is no reordering, at least assuming
that your toolchain and hardware are working correctly.  In addition,
it is generally a mistake to assume your code will only run in a
single-threaded context as the kernel can enter the same code path on
multiple CPUs at the same time.  One important exception is a function
that makes no external data references.

In the general case, you will need to take explicit steps to ensure that
your code really is executed within a single thread that does not access
shared variables.  A simple way to achieve this is to define a global lock
that you acquire at the beginning of your code and release at the end,
taking care to ensure that all references to your code's shared data are
also carried out under that same lock.  Because only one thread can hold
this lock at a given time, your code will be executed single-threaded.
This approach is called "code locking".
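As a minimal sketch of code locking, assuming a made-up "foo" subsystem
(the lock, counter, and function names below are illustrative only):

    #include <linux/mutex.h>

    static DEFINE_MUTEX(foo_lock);      /* one global lock for all foo state */
    static int foo_count;               /* only ever touched under foo_lock */

    void foo_record_event(void)
    {
            mutex_lock(&foo_lock);
            /* Single-threaded by construction: only one holder at a time. */
            foo_count++;
            mutex_unlock(&foo_lock);
    }

Because every access to foo_count happens under foo_lock, no memory-ordering
reasoning beyond "take the lock" is required.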

Code locking can severely limit both performance and scalability, so it
should be used with caution, and only on code paths that execute rarely.
After all, a huge amount of effort was required to remove the Linux
kernel's old "Big Kernel Lock", so let's please be very careful about
adding new "little kernel locks".

One of the advantages of locking is that, in happy contrast with the
year 1981, almost all kernel developers are very familiar with locking.
The Linux kernel's lockdep (CONFIG_PROVE_LOCKING=y) is very helpful with
the formerly feared deadlock scenarios.

Please use the standard locking primitives provided by the kernel rather
than rolling your own.  For one thing, the standard primitives interact
properly with lockdep.  For another thing, these primitives have been
tuned to deal better with high contention.  And for one final thing, it is
surprisingly hard to correctly code production-quality lock acquisition
and release functions.  After all, even simple non-production-quality
locking functions must carefully prevent both the CPU and the compiler
from moving code in either direction across the locking function.

Despite the scalability limitations of single-threaded code, RCU
takes this approach for much of its grace-period processing and also
for early-boot operation.  The reason RCU is able to scale despite
single-threaded grace-period processing is use of batching, where all
updates that accumulated during one grace period are handled by the
next one.  In other words, slowing down grace-period processing makes
it more efficient.  Nor is RCU unique:  Similar batching optimizations
are used in many I/O operations.


Packaged code
=============

Even if performance and scalability concerns prevent your code from
being completely single-threaded, it is often possible to use library
functions that handle the concurrency nearly or entirely on their own.
This approach delegates any LKMM worries to the library maintainer.

In the kernel, what is the "library"?  Quite a bit.  It includes the
contents of the lib/ directory, much of the include/linux/ directory along
with a lot of other heavily used APIs.  But heavily used examples include
the list macros (for example, include/linux/{,rcu}list.h), workqueues,
smp_call_function(), and the various hash tables and search trees.


Data locking
============

With code locking, we use single-threaded code execution to guarantee
serialized access to the data that the code is accessing.  However,
we can also achieve this by instead associating the lock with specific
instances of the data structures.  This creates a "critical section"
in the code execution that will execute as though it is single-threaded.
By placing all the accesses and modifications to a shared data structure
inside a critical section, we ensure that the execution context that
holds the lock has exclusive access to the shared data.

The poster boy for this approach is the hash table, where placing a lock
in each hash bucket allows operations on different buckets to proceed
concurrently.  This works because the buckets do not overlap with each
other, so that an operation on one bucket does not interfere with any
other bucket.

As the number of buckets increases, data locking scales naturally.
In particular, if the amount of data increases with the number of CPUs,
increasing the number of buckets as the number of CPUs increases results
in a naturally scalable data structure.
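A minimal sketch of per-bucket data locking (the structure, table size, and
function names are illustrative, not an existing kernel API):

    #include <linux/spinlock.h>
    #include <linux/list.h>
    #include <linux/hash.h>
    #include <linux/types.h>
    #include <linux/kernel.h>

    #define FOO_HASH_BITS   7

    struct foo_bucket {
            spinlock_t lock;
            struct hlist_head chain;
    };

    static struct foo_bucket foo_table[1 << FOO_HASH_BITS];

    struct foo {
            struct hlist_node node;
            u32 key;
    };

    static void foo_table_init(void)
    {
            int i;

            for (i = 0; i < ARRAY_SIZE(foo_table); i++) {
                    spin_lock_init(&foo_table[i].lock);
                    INIT_HLIST_HEAD(&foo_table[i].chain);
            }
    }

    /* Insert under the lock of the one bucket that the key hashes to. */
    void foo_insert(struct foo *f)
    {
            struct foo_bucket *b = &foo_table[hash_32(f->key, FOO_HASH_BITS)];

            spin_lock(&b->lock);
            hlist_add_head(&f->node, &b->chain);
            spin_unlock(&b->lock);
    }

Insertions into different buckets contend only on their own bucket lock, so
throughput grows with the number of buckets.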

Per-CPU processing
==================

Partitioning processing and data over CPUs allows each CPU to take
a single-threaded approach while providing excellent performance and
scalability.  Of course, there is no free lunch:  The dark side of this
excellence is substantially increased memory footprint.

In addition, it is sometimes necessary to occasionally update some global
view of this processing and data, in which case something like locking
must be used to protect this global view.  This is the approach taken
by the percpu_counter infrastructure.  In many cases, there are already
generic/library variants of commonly used per-cpu constructs available.
Please use them rather than rolling your own.

RCU uses DEFINE_PER_CPU*() declarations to create a number of per-CPU
data sets.  For example, each CPU does private quiescent-state processing
within its instance of the per-CPU rcu_data structure, and then uses data
locking to report quiescent states up the grace-period combining tree.
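A minimal sketch, with made-up names, of per-CPU counting plus an occasional
locked update of the global view:

    #include <linux/percpu.h>
    #include <linux/cpumask.h>
    #include <linux/spinlock.h>

    static DEFINE_PER_CPU(unsigned long, foo_events); /* CPU-local, no lock */
    static DEFINE_SPINLOCK(foo_total_lock);
    static unsigned long foo_total;                   /* global view, locked */

    void foo_count_event(void)
    {
            this_cpu_inc(foo_events);       /* cheap, CPU-local update */
    }

    void foo_fold_into_total(void)
    {
            unsigned long sum = 0;
            int cpu;

            for_each_possible_cpu(cpu)
                    sum += per_cpu(foo_events, cpu);

            spin_lock(&foo_total_lock);
            foo_total = sum;                /* rare global update under a lock */
            spin_unlock(&foo_total_lock);
    }

The fast path never takes a lock; only the infrequent fold into the global
total does, which is the trade-off the percpu_counter infrastructure also
packages up for you.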

Packaged primitives: Sequence locking
=====================================

Lockless programming is considered by many to be more difficult than
lock-based programming, but there are a few lockless design patterns that
have been built out into an API.  One of these APIs is sequence locking.
Although this API can be used in extremely complex ways, there are simple
and effective ways of using it that avoid the need to pay attention to
memory ordering.

The basic keep-things-simple rule for sequence locking is "do not write
in read-side code".  Yes, you can do writes from within sequence-locking
readers, but it won't be so simple.  For example, such writes will be
lockless and should be idempotent.

For more sophisticated use cases, LKMM can guide you, including use
cases involving combining sequence locking with other synchronization
primitives.  (LKMM does not yet know about sequence locking, so it is
currently necessary to open-code it in your litmus tests.)

Additional information may be found in include/linux/seqlock.h.
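A minimal reader/writer sketch of the seqlock_t API that follows the
"do not write in read-side code" rule (foo_* names are illustrative):

    #include <linux/seqlock.h>
    #include <linux/types.h>

    static DEFINE_SEQLOCK(foo_seqlock);
    static u64 foo_a, foo_b;        /* pair of values updated together */

    /* Writer: serialized by the seqlock's internal spinlock. */
    void foo_update(u64 a, u64 b)
    {
            write_seqlock(&foo_seqlock);
            foo_a = a;
            foo_b = b;
            write_sequnlock(&foo_seqlock);
    }

    /* Reader: lockless; retries if it raced with a writer. */
    u64 foo_read_sum(void)
    {
            unsigned int seq;
            u64 sum;

            do {
                    seq = read_seqbegin(&foo_seqlock);
                    sum = foo_a + foo_b;    /* reads only, no writes */
            } while (read_seqretry(&foo_seqlock, seq));

            return sum;
    }

The retry loop hides all of the memory-ordering details inside
read_seqbegin() and read_seqretry().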

Packaged primitives: RCU
========================

Another lockless design pattern that has been baked into an API
is RCU.  The Linux kernel makes sophisticated use of RCU, but the
keep-things-simple rules for RCU are "do not write in read-side code",
"do not update anything that is visible to and accessed by readers",
and "protect updates with locking".

These rules are illustrated by the functions foo_update_a() and
foo_get_a() shown in Documentation/RCU/whatisRCU.rst.  Additional
RCU usage patterns may be found in Documentation/RCU and in the
source code.
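A hedged sketch along the lines of the whatisRCU.rst example cited above
(illustrative names, error handling trimmed to the minimum):

    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    struct foo {
            int a;
    };

    static struct foo __rcu *gbl_foo;
    static DEFINE_SPINLOCK(gbl_foo_lock);

    /* Reader: no writes, no updates, just a protected dereference. */
    int foo_get_a(void)
    {
            int a;

            rcu_read_lock();
            a = rcu_dereference(gbl_foo)->a;
            rcu_read_unlock();
            return a;
    }

    /* Updater: publish a new copy under a lock, then free the old one. */
    void foo_update_a(int new_a)
    {
            struct foo *new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL);
            struct foo *old_fp;

            if (!new_fp)
                    return;
            new_fp->a = new_a;

            spin_lock(&gbl_foo_lock);
            old_fp = rcu_dereference_protected(gbl_foo,
                                               lockdep_is_held(&gbl_foo_lock));
            rcu_assign_pointer(gbl_foo, new_fp);
            spin_unlock(&gbl_foo_lock);

            synchronize_rcu();      /* wait for pre-existing readers */
            kfree(old_fp);
    }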

Packaged primitives: Atomic operations
======================================

Back in the day, the Linux kernel had three types of atomic operations:

1.	Initialization and read-out, such as atomic_set() and atomic_read().

2.	Operations that did not return a value and provided no ordering,
	such as atomic_inc() and atomic_dec().

3.	Operations that returned a value and provided full ordering, such as
	atomic_add_return() and atomic_dec_and_test().  Note that some
	value-returning operations provide full ordering only conditionally.
	For example, cmpxchg() provides ordering only upon success.

More recent kernels have operations that return a value but do not
provide full ordering.  These are flagged with either a _relaxed()
suffix (providing no ordering), or an _acquire() or _release() suffix
(providing limited ordering).
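A small sketch contrasting these categories (the foo_* names are made up):

    #include <linux/atomic.h>

    static atomic_t foo_count = ATOMIC_INIT(0);

    /* 1. Initialization and read-out: atomic, but no ordering. */
    int foo_read_count(void)
    {
            return atomic_read(&foo_count);
    }

    /* 2. Non-value-returning RMW: atomic, but no ordering. */
    void foo_bump(void)
    {
            atomic_inc(&foo_count);
    }

    /* 3. Value-returning RMW: atomic and fully ordered. */
    int foo_bump_and_get(void)
    {
            return atomic_add_return(1, &foo_count);
    }

    /* Newer variant: returns a value but is explicitly unordered. */
    int foo_bump_and_get_relaxed(void)
    {
            return atomic_add_return_relaxed(1, &foo_count);
    }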

Additional information may be found in these files:

	Documentation/atomic_t.txt
	Documentation/atomic_bitops.txt
	Documentation/core-api/atomic_ops.rst
	Documentation/core-api/refcount-vs-atomic.rst

Reading code using these primitives is often also quite helpful.


Lockless, fully ordered
=======================

When using locking, there often comes a time when it is necessary
to access some variable or another without holding the data lock
that serializes access to that variable.

If you want to keep things simple, use the initialization and read-out
operations from the previous section only when there are no racing
accesses.  Otherwise, use only fully ordered operations when accessing
or modifying the variable.  This approach guarantees that code prior
to a given access to that variable will be seen by all CPUs as having
happened before any code following any later access to that same variable.
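One way to follow this advice is to funnel both sides through fully ordered,
value-returning RMW operations. A hedged sketch with made-up names:

    #include <linux/atomic.h>
    #include <linux/types.h>

    static int foo_payload;
    static atomic_t foo_ready = ATOMIC_INIT(0);

    /* Producer: the fully ordered xchg() orders the payload store before it. */
    void foo_publish(int value)
    {
            foo_payload = value;
            atomic_xchg(&foo_ready, 1);
    }

    /* Consumer: the fully ordered xchg() orders the payload load after it. */
    bool foo_try_consume(int *value)
    {
            if (!atomic_xchg(&foo_ready, 0))
                    return false;
            *value = foo_payload;
            return true;
    }

Because both sides use fully ordered operations on foo_ready, a consumer
that sees the flag set is guaranteed to also see the payload written before
the producer set it.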

Please note that per-CPU functions are not atomic operations and
hence they do not provide any ordering guarantees at all.

If the lockless accesses are frequently executed reads that are used
only for heuristics, or if they are frequently executed writes that
are used only for statistics, please see the next section.


Lockless statistics and heuristics
==================================

Unordered primitives such as atomic_read(), atomic_set(), READ_ONCE(), and
WRITE_ONCE() can safely be used in some cases.  These primitives provide
no ordering, but they do prevent the compiler from carrying out a number
of destructive optimizations (for which please see the next section).
One example use for these primitives is statistics, such as per-CPU
counters exemplified by the rt_cache_stat structure's routing-cache
statistics counters.  Another example use case is heuristics, such as
the jiffies_till_first_fqs and jiffies_till_next_fqs kernel parameters
controlling how often RCU scans for idle CPUs.
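A minimal sketch of a statistics counter built on these primitives (the names
are illustrative; lost updates under contention are tolerated by design):

    #include <linux/compiler.h>

    static unsigned long foo_drops;     /* approximate statistic */

    void foo_note_drop(void)
    {
            /*
             * Marked accesses: no ordering, but no compiler mischief either.
             * Concurrent callers may occasionally lose an increment, which
             * is acceptable for a statistic or a heuristic.
             */
            WRITE_ONCE(foo_drops, READ_ONCE(foo_drops) + 1);
    }

    unsigned long foo_read_drops(void)
    {
            return READ_ONCE(foo_drops);    /* racy but tear-free snapshot */
    }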

But be careful.  "Unordered" really does mean "unordered".  It is all
too easy to assume ordering, and this assumption must be avoided when
using these primitives.


Don't let the compiler trip you up
==================================

It can be quite tempting to use plain C-language accesses for lockless
loads from and stores to shared variables.  Although this is both
possible and quite common in the Linux kernel, it does require a
surprising amount of analysis, care, and knowledge about the compiler.
Yes, some decades ago it was not unfair to consider a C compiler to be
an assembler with added syntax and better portability, but the advent of
sophisticated optimizing compilers means that those days are long gone.
Today's optimizing compilers can profoundly rewrite your code during the
translation process, and have long been ready, willing, and able to do so.
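A classic illustration, with made-up names, of why the plain access is risky:
nothing stops the compiler from loading the flag once and reusing that value
forever.

    #include <linux/compiler.h>

    static int foo_stop;

    /* BROKEN: the compiler may hoist the plain load out of the loop. */
    void foo_wait_plain(void)
    {
            while (!foo_stop)
                    ;       /* may be compiled into an infinite loop */
    }

    /* Better: READ_ONCE() forces a fresh load on every iteration. */
    void foo_wait_once(void)
    {
            while (!READ_ONCE(foo_stop))
                    ;
    }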

Therefore, if you really need to use C-language assignments instead of
READ_ONCE(), WRITE_ONCE(), and so on, you will need to have a very good
understanding of both the C standard and your compiler.  Here are some
introductory references and some tooling to start you on this noble quest:

Who's afraid of a big bad optimizing compiler?
	https://lwn.net/Articles/793253/
Calibrating your fear of big bad optimizing compilers
	https://lwn.net/Articles/799218/
Concurrency bugs should fear the big bad data-race detector (part 1)
	https://lwn.net/Articles/816850/
Concurrency bugs should fear the big bad data-race detector (part 2)
	https://lwn.net/Articles/816854/


More complex use cases
======================

If the alternatives above do not do what you need, please look at the
recipes-pairs.txt file to peel off the next layer of the memory-ordering
onion.
@@ -63,10 +63,32 @@ BASIC USAGE: HERD7
==================

The memory model is used, in conjunction with "herd7", to exhaustively
explore the state space of small litmus tests.
explore the state space of small litmus tests.  Documentation describing
the format, features, capabilities and limitations of these litmus
tests is available in tools/memory-model/Documentation/litmus-tests.txt.

For example, to run SB+fencembonceonces.litmus against the memory model:
Example litmus tests may be found in the Linux-kernel source tree:

	tools/memory-model/litmus-tests/
	Documentation/litmus-tests/

Several thousand more example litmus tests are available here:

	https://github.com/paulmckrcu/litmus
	https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/herd
	https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/litmus

Documentation describing litmus tests and how to use them may be found
here:

	tools/memory-model/Documentation/litmus-tests.txt

The remainder of this section uses the SB+fencembonceonces.litmus test
located in the tools/memory-model directory.

To run SB+fencembonceonces.litmus against the memory model:

	$ cd $LINUX_SOURCE_TREE/tools/memory-model
	$ herd7 -conf linux-kernel.cfg litmus-tests/SB+fencembonceonces.litmus

Here is the corresponding output:

@@ -87,7 +109,11 @@ Here is the corresponding output:
The "Positive: 0 Negative: 3" and the "Never 0 3" each indicate that
this litmus test's "exists" clause can not be satisfied.

See "herd7 -help" or "herdtools7/doc/" for more information.
See "herd7 -help" or "herdtools7/doc/" for more information on running the
tool itself, but please be aware that this documentation is intended for
people who work on the memory model itself, that is, people making changes
to the tools/memory-model/linux-kernel.* files.  It is not intended for
people focusing on writing, understanding, and running LKMM litmus tests.


=====================

@@ -124,7 +150,11 @@ that during two million trials, the state specified in this litmus
test's "exists" clause was not reached.

And, as with "herd7", please see "klitmus7 -help" or "herdtools7/doc/"
for more information.
for more information.  And again, please be aware that this documentation
is intended for people who work on the memory model itself, that is,
people making changes to the tools/memory-model/linux-kernel.* files.
It is not intended for people focusing on writing, understanding, and
running LKMM litmus tests.


====================

@@ -137,12 +167,21 @@ Documentation/cheatsheet.txt

Documentation/explanation.txt
	Describes the memory model in detail.

Documentation/litmus-tests.txt
	Describes the format, features, capabilities, and limitations
	of the litmus tests that LKMM can evaluate.

Documentation/recipes.txt
	Lists common memory-ordering patterns.

Documentation/references.txt
	Provides background reading.

Documentation/simple.txt
	Starting point for someone new to Linux-kernel concurrency.
	And also for those needing a reminder of the simpler approaches
	to concurrency!

linux-kernel.bell
	Categorizes the relevant instructions, including memory
	references, memory barriers, atomic read-modify-write operations,
@@ -187,116 +226,3 @@ README
	This file.

scripts		Various scripts, see scripts/README.


===========
LIMITATIONS
===========

The Linux-kernel memory model (LKMM) has the following limitations:

1.	Compiler optimizations are not accurately modeled.  Of course,
	the use of READ_ONCE() and WRITE_ONCE() limits the compiler's
	ability to optimize, but under some circumstances it is possible
	for the compiler to undermine the memory model.  For more
	information, see Documentation/explanation.txt (in particular,
	the "THE PROGRAM ORDER RELATION: po AND po-loc" and "A WARNING"
	sections).

	Note that this limitation in turn limits LKMM's ability to
	accurately model address, control, and data dependencies.
	For example, if the compiler can deduce the value of some variable
	carrying a dependency, then the compiler can break that dependency
	by substituting a constant of that value.

2.	Multiple access sizes for a single variable are not supported,
	and neither are misaligned or partially overlapping accesses.

3.	Exceptions and interrupts are not modeled.  In some cases,
	this limitation can be overcome by modeling the interrupt or
	exception with an additional process.

4.	I/O such as MMIO or DMA is not supported.

5.	Self-modifying code (such as that found in the kernel's
	alternatives mechanism, function tracer, Berkeley Packet Filter
	JIT compiler, and module loader) is not supported.

6.	Complete modeling of all variants of atomic read-modify-write
	operations, locking primitives, and RCU is not provided.
	For example, call_rcu() and rcu_barrier() are not supported.
	However, a substantial amount of support is provided for these
	operations, as shown in the linux-kernel.def file.

	a.	When rcu_assign_pointer() is passed NULL, the Linux
		kernel provides no ordering, but LKMM models this
		case as a store release.

	b.	The "unless" RMW operations are not currently modeled:
		atomic_long_add_unless(), atomic_inc_unless_negative(),
		and atomic_dec_unless_positive().  These can be emulated
		in litmus tests, for example, by using atomic_cmpxchg().

		One exception to this limitation is atomic_add_unless(),
		which is provided directly by herd7 (so there is no
		corresponding definition in linux-kernel.def).  Because
		atomic_add_unless() is modeled by herd7, it can be used
		in litmus tests.

	c.	The call_rcu() function is not modeled.  It can be
		emulated in litmus tests by adding another process that
		invokes synchronize_rcu() and the body of the callback
		function, with (for example) a release-acquire from
		the site of the emulated call_rcu() to the beginning
		of the additional process.

	d.	The rcu_barrier() function is not modeled.  It can be
		emulated in litmus tests that emulate call_rcu() via
		(for example) a release-acquire from the end of each
		additional call_rcu() process to the site of the
		emulated rcu_barrier().

	e.	Although sleepable RCU (SRCU) is now modeled, there
		are some subtle differences between its semantics and
		those in the Linux kernel.  For example, the kernel
		might interpret the following sequence as two partially
		overlapping SRCU read-side critical sections:

			 1  r1 = srcu_read_lock(&my_srcu);
			 2  do_something_1();
			 3  r2 = srcu_read_lock(&my_srcu);
			 4  do_something_2();
			 5  srcu_read_unlock(&my_srcu, r1);
			 6  do_something_3();
			 7  srcu_read_unlock(&my_srcu, r2);

		In contrast, LKMM will interpret this as a nested pair of
		SRCU read-side critical sections, with the outer critical
		section spanning lines 1-7 and the inner critical section
		spanning lines 3-5.

		This difference would be more of a concern had anyone
		identified a reasonable use case for partially overlapping
		SRCU read-side critical sections.  For more information,
		please see: https://paulmck.livejournal.com/40593.html

	f.	Reader-writer locking is not modeled.  It can be
		emulated in litmus tests using atomic read-modify-write
		operations.

The "herd7" tool has some additional limitations of its own, apart from
the memory model:

1.	Non-trivial data structures such as arrays or structures are
	not supported.  However, pointers are supported, allowing trivial
	linked lists to be constructed.

2.	Dynamic memory allocation is not supported, although this can
	be worked around in some cases by supplying multiple statically
	allocated variables.

Some of these limitations may be overcome in the future, but others are
more likely to be addressed by incorporating the Linux-kernel memory model
into other tools.

Finally, please note that LKMM is subject to change as hardware, use cases,
and compilers evolve.
@@ -528,6 +528,61 @@ static const char *uaccess_safe_builtin[] = {
	"__tsan_write4",
	"__tsan_write8",
	"__tsan_write16",
	"__tsan_read_write1",
	"__tsan_read_write2",
	"__tsan_read_write4",
	"__tsan_read_write8",
	"__tsan_read_write16",
	"__tsan_atomic8_load",
	"__tsan_atomic16_load",
	"__tsan_atomic32_load",
	"__tsan_atomic64_load",
	"__tsan_atomic8_store",
	"__tsan_atomic16_store",
	"__tsan_atomic32_store",
	"__tsan_atomic64_store",
	"__tsan_atomic8_exchange",
	"__tsan_atomic16_exchange",
	"__tsan_atomic32_exchange",
	"__tsan_atomic64_exchange",
	"__tsan_atomic8_fetch_add",
	"__tsan_atomic16_fetch_add",
	"__tsan_atomic32_fetch_add",
	"__tsan_atomic64_fetch_add",
	"__tsan_atomic8_fetch_sub",
	"__tsan_atomic16_fetch_sub",
	"__tsan_atomic32_fetch_sub",
	"__tsan_atomic64_fetch_sub",
	"__tsan_atomic8_fetch_and",
	"__tsan_atomic16_fetch_and",
	"__tsan_atomic32_fetch_and",
	"__tsan_atomic64_fetch_and",
	"__tsan_atomic8_fetch_or",
	"__tsan_atomic16_fetch_or",
	"__tsan_atomic32_fetch_or",
	"__tsan_atomic64_fetch_or",
	"__tsan_atomic8_fetch_xor",
	"__tsan_atomic16_fetch_xor",
	"__tsan_atomic32_fetch_xor",
	"__tsan_atomic64_fetch_xor",
	"__tsan_atomic8_fetch_nand",
	"__tsan_atomic16_fetch_nand",
	"__tsan_atomic32_fetch_nand",
	"__tsan_atomic64_fetch_nand",
	"__tsan_atomic8_compare_exchange_strong",
	"__tsan_atomic16_compare_exchange_strong",
	"__tsan_atomic32_compare_exchange_strong",
	"__tsan_atomic64_compare_exchange_strong",
	"__tsan_atomic8_compare_exchange_weak",
	"__tsan_atomic16_compare_exchange_weak",
	"__tsan_atomic32_compare_exchange_weak",
	"__tsan_atomic64_compare_exchange_weak",
	"__tsan_atomic8_compare_exchange_val",
	"__tsan_atomic16_compare_exchange_val",
	"__tsan_atomic32_compare_exchange_val",
	"__tsan_atomic64_compare_exchange_val",
	"__tsan_atomic_thread_fence",
	"__tsan_atomic_signal_fence",
	/* KCOV */
	"write_comp_data",
	"check_kcov_mode",