Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in the locking subsystem in this cycle were:

   - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem,
     which is an array of tools in tools/memory-model/ that formally
     describe the Linux memory coherency model (a.k.a.
     Documentation/memory-barriers.txt) and also produce 'litmus tests'
     in the form of kernel code which can be directly executed and
     tested.

     Here's a high level background article about an earlier version of
     this work on LWN.net:

        https://lwn.net/Articles/718628/

     The design principles:

      "There is reason to believe that Documentation/memory-barriers.txt
       could use some help, and a major purpose of this patch is to
       provide that help in the form of a design-time tool that can
       produce all valid executions of a small fragment of concurrent
       Linux-kernel code, which is called a "litmus test". This tool's
       functionality is roughly similar to a full state-space search.
       Please note that this is a design-time tool, not useful for
       regression testing. However, we hope that the underlying
       Linux-kernel memory model will be incorporated into other tools
       capable of analyzing large bodies of code for regression-testing
       purposes."

     [...]

      "A second tool is klitmus7, which converts litmus tests to
       loadable kernel modules for direct testing. As with herd7, the
       klitmus7 code is freely available from
       http://diy.inria.fr/sources/index.html (and via "git" at
       https://github.com/herd/herdtools7)"

     [...]

     Credits go to:

      "This patch was the result of a most excellent collaboration
       founded by Jade Alglave and also including Alan Stern, Andrea
       Parri, and Luc Maranget."

     ... and to the gents listed in the MAINTAINERS entry:

        LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
        M:      Alan Stern <stern@rowland.harvard.edu>
        M:      Andrea Parri <parri.andrea@gmail.com>
        M:      Will Deacon <will.deacon@arm.com>
        M:      Peter Zijlstra <peterz@infradead.org>
        M:      Boqun Feng <boqun.feng@gmail.com>
        M:      Nicholas Piggin <npiggin@gmail.com>
        M:      David Howells <dhowells@redhat.com>
        M:      Jade Alglave <j.alglave@ucl.ac.uk>
        M:      Luc Maranget <luc.maranget@inria.fr>
        M:      "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

     The LKMM project already found several bugs in Linux locking
     primitives and improved the understanding and the documentation of
     the Linux memory model all around.

   - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov)

   - Add RWSEM API debugging and reorganize the lock debugging Kconfig
     (Waiman Long)

   - ... misc cleanups and other smaller changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  locking/Kconfig: Restructure the lock debugging menu
  locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable
  locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches
  lockdep: Make the lock debug output more useful
  locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
  locking/atomic, asm-generic, x86: Add comments for atomic instrumentation
  locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations
  locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h
  locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h
  locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants
  tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference()
  tools/memory-model: Add documentation of new litmus test
  tools/memory-model: Remove mention of docker/gentoo image
  locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more
  locking/lockdep: Show unadorned pointers
  mutex: Drop linkage.h from mutex.h
  tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference
  tools/memory-model: Convert underscores to hyphens
  tools/memory-model: Add a S lock-based external-view litmus test
  tools/memory-model: Add required herd7 version to README file
  ...
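To give a feel for what the LKMM tools operate on, below is a minimal sketch of the kind of two-thread code fragment a litmus test encodes. This is illustrative only and is not taken from tools/memory-model/; the variable and function names are invented, and a real litmus file would additionally state an "exists" clause asking whether a given outcome (here, seeing the flag but not the data) is forbidden.

	/* Illustrative message-passing pattern, not an actual litmus file. */
	static int data;
	static int flag;

	static void writer(void)
	{
		WRITE_ONCE(data, 1);
		smp_store_release(&flag, 1);	/* publish data */
	}

	static void reader(void)
	{
		int r1 = smp_load_acquire(&flag);	/* pairs with the release */
		int r2 = READ_ONCE(data);

		/* The memory model guarantees: if r1 == 1 then r2 == 1. */
	}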
commit 701f3b3149
--- a/Documentation/locking/lockdep-design.txt
+++ b/Documentation/locking/lockdep-design.txt
@@ -27,7 +27,8 @@ lock-class.
 State
 -----
 
-The validator tracks lock-class usage history into 4n + 1 separate state bits:
+The validator tracks lock-class usage history into 4 * nSTATEs + 1 separate
+state bits:
 
 - 'ever held in STATE context'
 - 'ever held as readlock in STATE context'
@@ -37,7 +38,6 @@ The validator tracks lock-class usage history into 4n + 1 separate state bits:
 Where STATE can be either one of (kernel/locking/lockdep_states.h)
   - hardirq
   - softirq
-  - reclaim_fs
 
 - 'ever used'                                       [ == !unused        ]
 
@@ -169,6 +169,53 @@ Note: When changing code to use the _nested() primitives, be careful and
 check really thoroughly that the hierarchy is correctly mapped; otherwise
 you can get false positives or false negatives.
 
+Annotations
+-----------
+
+Two constructs can be used to annotate and check where and if certain locks
+must be held: lockdep_assert_held*(&lock) and lockdep_*pin_lock(&lock).
+
+As the name suggests, lockdep_assert_held* family of macros assert that a
+particular lock is held at a certain time (and generate a WARN() otherwise).
+This annotation is largely used all over the kernel, e.g. kernel/sched/
+core.c
+
+  void update_rq_clock(struct rq *rq)
+  {
+	s64 delta;
+
+	lockdep_assert_held(&rq->lock);
+	[...]
+  }
+
+where holding rq->lock is required to safely update a rq's clock.
+
+The other family of macros is lockdep_*pin_lock(), which is admittedly only
+used for rq->lock ATM.  Despite their limited adoption these annotations
+generate a WARN() if the lock of interest is "accidentally" unlocked.  This turns
+out to be especially helpful to debug code with callbacks, where an upper
+layer assumes a lock remains taken, but a lower layer thinks it can maybe drop
+and reacquire the lock ("unwittingly" introducing races).  lockdep_pin_lock()
+returns a 'struct pin_cookie' that is then used by lockdep_unpin_lock() to check
+that nobody tampered with the lock, e.g. kernel/sched/sched.h
+
+  static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
+  {
+	rf->cookie = lockdep_pin_lock(&rq->lock);
+	[...]
+  }
+
+  static inline void rq_unpin_lock(struct rq *rq, struct rq_flags *rf)
+  {
+	[...]
+	lockdep_unpin_lock(&rq->lock, rf->cookie);
+  }
+
+While comments about locking requirements might provide useful information,
+the runtime checks performed by annotations are invaluable when debugging
+locking problems and they carry the same level of details when inspecting
+code.  Always prefer annotations when in doubt!
+
 Proof of 100% correctness:
 --------------------------
 
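As a quick illustration of the first annotation family described in the hunk above, here is a minimal sketch in the same spirit as the rq->lock examples; the structure and function names are hypothetical and invented for illustration, only lockdep_assert_held() itself is the real API:

	struct my_dev {
		spinlock_t lock;
		u64 counter;
	};

	/* Callers must hold dev->lock; lockdep WARN()s at runtime if not. */
	static void my_dev_bump_counter(struct my_dev *dev)
	{
		lockdep_assert_held(&dev->lock);
		dev->counter++;
	}

The point of the annotation over a "caller must hold dev->lock" comment is that the requirement is actually checked whenever lockdep is enabled.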
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -14,7 +14,11 @@ DISCLAIMER
 This document is not a specification; it is intentionally (for the sake of
 brevity) and unintentionally (due to being human) incomplete. This document is
 meant as a guide to using the various memory barriers provided by Linux, but
-in case of any doubt (and there are many) please ask.
+in case of any doubt (and there are many) please ask.  Some doubts may be
+resolved by referring to the formal memory consistency model and related
+documentation at tools/memory-model/.  Nevertheless, even this memory
+model should be viewed as the collective opinion of its maintainers rather
+than as an infallible oracle.
 
 To repeat, this document is not a specification of what Linux expects from
 hardware.
@@ -48,7 +52,7 @@ CONTENTS
 
      - Varieties of memory barrier.
      - What may not be assumed about memory barriers?
-     - Data dependency barriers.
+     - Data dependency barriers (historical).
      - Control dependencies.
     - SMP barrier pairing.
     - Examples of memory barrier sequences.
@@ -399,7 +403,7 @@ Memory barriers come in four basic varieties:
     where two loads are performed such that the second depends on the result
     of the first (eg: the first load retrieves the address to which the second
     load will be directed), a data dependency barrier would be required to
-    make sure that the target of the second load is updated before the address
+    make sure that the target of the second load is updated after the address
     obtained by the first load is accessed.
 
     A data dependency barrier is a partial ordering on interdependent loads
@@ -550,8 +554,15 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
 	Documentation/DMA-API.txt
 
 
-DATA DEPENDENCY BARRIERS
-------------------------
+DATA DEPENDENCY BARRIERS (HISTORICAL)
+-------------------------------------
 
+As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was
+added to READ_ONCE(), which means that about the only people who
+need to pay attention to this section are those working on DEC Alpha
+architecture-specific code and those working on READ_ONCE() itself.
+For those who need it, and for those who are interested in the history,
+here is the story of data-dependency barriers.
+
 The usage requirements of data dependency barriers are a little subtle, and
 it's not always obvious that they're needed.  To illustrate, consider the
@@ -2839,8 +2850,9 @@ as that committed on CPU 1.
 
 
 To intervene, we need to interpolate a data dependency barrier or a read
-barrier between the loads.  This will force the cache to commit its coherency
-queue before processing any further requests:
+barrier between the loads (which as of v4.15 is supplied unconditionally
+by the READ_ONCE() macro).  This will force the cache to commit its
+coherency queue before processing any further requests:
 
 	CPU 1			CPU 2			COMMENT
 	===============	===============	=======================================
 
@@ -2869,8 +2881,8 @@ Other CPUs may also have split caches, but must coordinate between the various
 cachelets for normal memory accesses.  The semantics of the Alpha removes the
 need for hardware coordination in the absence of memory barriers, which
 permitted Alpha to sport higher CPU clock rates back in the day.  However,
-please note that smp_read_barrier_depends() should not be used except in
-Alpha arch-specific code and within the READ_ONCE() macro.
+please note that (again, as of v4.15) smp_read_barrier_depends() should not
+be used except in Alpha arch-specific code and within the READ_ONCE() macro.
 
 
 CACHE COHERENCY VS DMA
@@ -3035,7 +3047,9 @@ the data dependency barrier really becomes necessary as this synchronises both
 caches with the memory coherence system, thus making it seem like pointer
 changes vs new data occur in the right order.
 
-The Alpha defines the Linux kernel's memory barrier model.
+The Alpha defines the Linux kernel's memory model, although as of v4.15
+the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE()
+greatly reduced Alpha's impact on the memory model.
 
 See the subsection on "Cache Coherency" above.
 
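For readers who want the now-historical data-dependency discussion above in concrete form, here is a minimal sketch of the pointer-publication pattern it is about; the names are illustrative only, and on kernels where READ_ONCE() already includes the dependency barrier (v4.15 and later, as the hunks above explain) no explicit smp_read_barrier_depends() is needed:

	struct foo {
		int a;
	};

	static struct foo *gp;

	static void publisher(struct foo *p)
	{
		p->a = 42;
		smp_store_release(&gp, p);	/* order init before publication */
	}

	static void consumer(void)
	{
		struct foo *q = READ_ONCE(gp);	/* heads the address dependency */

		if (q)
			pr_info("a=%d\n", q->a);	/* dependent load sees q->a == 42 */
	}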
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8162,6 +8162,24 @@ M:	Kees Cook <keescook@chromium.org>
 S:	Maintained
 F:	drivers/misc/lkdtm*
 
+LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
+M:	Alan Stern <stern@rowland.harvard.edu>
+M:	Andrea Parri <parri.andrea@gmail.com>
+M:	Will Deacon <will.deacon@arm.com>
+M:	Peter Zijlstra <peterz@infradead.org>
+M:	Boqun Feng <boqun.feng@gmail.com>
+M:	Nicholas Piggin <npiggin@gmail.com>
+M:	David Howells <dhowells@redhat.com>
+M:	Jade Alglave <j.alglave@ucl.ac.uk>
+M:	Luc Maranget <luc.maranget@inria.fr>
+M:	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
+R:	Akira Yokosawa <akiyks@gmail.com>
+L:	linux-kernel@vger.kernel.org
+S:	Supported
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+F:	tools/memory-model/
+F:	Documentation/memory-barriers.txt
+
 LINUX SECURITY MODULE (LSM) FRAMEWORK
 M:	Chris Wright <chrisw@sous-sol.org>
 L:	linux-security-module@vger.kernel.org
--- a/arch/alpha/include/asm/cmpxchg.h
+++ b/arch/alpha/include/asm/cmpxchg.h
@@ -38,19 +38,31 @@
 #define ____cmpxchg(type, args...)	__cmpxchg ##type(args)
 #include <asm/xchg.h>
 
+/*
+ * The leading and the trailing memory barriers guarantee that these
+ * operations are fully ordered.
+ */
 #define xchg(ptr, x) \
 ({ \
+	__typeof__(*(ptr)) __ret; \
 	__typeof__(*(ptr)) _x_ = (x); \
-	(__typeof__(*(ptr))) __xchg((ptr), (unsigned long)_x_, \
-				    sizeof(*(ptr))); \
+	smp_mb(); \
+	__ret = (__typeof__(*(ptr))) \
+		__xchg((ptr), (unsigned long)_x_, sizeof(*(ptr))); \
+	smp_mb(); \
+	__ret; \
 })
 
 #define cmpxchg(ptr, o, n) \
 ({ \
+	__typeof__(*(ptr)) __ret; \
 	__typeof__(*(ptr)) _o_ = (o); \
 	__typeof__(*(ptr)) _n_ = (n); \
-	(__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \
-				       (unsigned long)_n_, sizeof(*(ptr)));\
+	smp_mb(); \
+	__ret = (__typeof__(*(ptr))) __cmpxchg((ptr), \
+		(unsigned long)_o_, (unsigned long)_n_, sizeof(*(ptr)));\
+	smp_mb(); \
+	__ret; \
 })
 
 #define cmpxchg64(ptr, o, n) \
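The barriers that the hunk above hoists into the xchg()/cmpxchg() wrappers are what give callers the "fully ordered" behaviour the new comment promises. As a reminder of how such a fully ordered cmpxchg() is typically consumed, here is a minimal, generic (not Alpha-specific) sketch; the variable and helper names are invented for illustration:

	static unsigned long shared_val;

	static void add_to_shared(unsigned long delta)
	{
		unsigned long old, new;

		do {
			old = READ_ONCE(shared_val);
			new = old + delta;
			/* cmpxchg() returns the value it found; retry until our update lands. */
		} while (cmpxchg(&shared_val, old, new) != old);
	}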
--- a/arch/alpha/include/asm/xchg.h
+++ b/arch/alpha/include/asm/xchg.h
@@ -12,10 +12,6 @@
  * Atomic exchange.
  * Since it can be used to implement critical sections
  * it must clobber "memory" (also for interrupts in UP).
- *
- * The leading and the trailing memory barriers guarantee that these
- * operations are fully ordered.
- *
  */
 
 static inline unsigned long
@@ -23,7 +19,6 @@ ____xchg(_u8, volatile char *m, unsigned long val)
 {
 	unsigned long ret, tmp, addr64;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"	andnot	%4,7,%3\n"
 	"	insbl	%1,%4,%1\n"
@@ -38,7 +33,6 @@ ____xchg(_u8, volatile char *m, unsigned long val)
 	".previous"
 	: "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
 	: "r" ((long)m), "1" (val) : "memory");
-	smp_mb();
 
 	return ret;
 }
@@ -48,7 +42,6 @@ ____xchg(_u16, volatile short *m, unsigned long val)
 {
 	unsigned long ret, tmp, addr64;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"	andnot	%4,7,%3\n"
 	"	inswl	%1,%4,%1\n"
@@ -63,7 +56,6 @@ ____xchg(_u16, volatile short *m, unsigned long val)
 	".previous"
 	: "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
 	: "r" ((long)m), "1" (val) : "memory");
-	smp_mb();
 
 	return ret;
 }
@@ -73,7 +65,6 @@ ____xchg(_u32, volatile int *m, unsigned long val)
 {
 	unsigned long dummy;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"1:	ldl_l %0,%4\n"
 	"	bis $31,%3,%1\n"
@@ -84,7 +75,6 @@ ____xchg(_u32, volatile int *m, unsigned long val)
 	".previous"
 	: "=&r" (val), "=&r" (dummy), "=m" (*m)
 	: "rI" (val), "m" (*m) : "memory");
-	smp_mb();
 
 	return val;
 }
@@ -94,7 +84,6 @@ ____xchg(_u64, volatile long *m, unsigned long val)
 {
 	unsigned long dummy;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"1:	ldq_l %0,%4\n"
 	"	bis $31,%3,%1\n"
@@ -105,7 +94,6 @@ ____xchg(_u64, volatile long *m, unsigned long val)
 	".previous"
 	: "=&r" (val), "=&r" (dummy), "=m" (*m)
 	: "rI" (val), "m" (*m) : "memory");
-	smp_mb();
 
 	return val;
 }
@@ -135,13 +123,6 @@ ____xchg(, volatile void *ptr, unsigned long x, int size)
  * Atomic compare and exchange.  Compare OLD with MEM, if identical,
  * store NEW in MEM.  Return the initial value in MEM.  Success is
  * indicated by comparing RETURN with OLD.
- *
- * The leading and the trailing memory barriers guarantee that these
- * operations are fully ordered.
- *
- * The trailing memory barrier is placed in SMP unconditionally, in
- * order to guarantee that dependency ordering is preserved when a
- * dependency is headed by an unsuccessful operation.
  */
 
 static inline unsigned long
@@ -149,7 +130,6 @@ ____cmpxchg(_u8, volatile char *m, unsigned char old, unsigned char new)
 {
 	unsigned long prev, tmp, cmp, addr64;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"	andnot	%5,7,%4\n"
 	"	insbl	%1,%5,%1\n"
@@ -167,7 +147,6 @@ ____cmpxchg(_u8, volatile char *m, unsigned char old, unsigned char new)
 	".previous"
 	: "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64)
 	: "r" ((long)m), "Ir" (old), "1" (new) : "memory");
-	smp_mb();
 
 	return prev;
 }
@@ -177,7 +156,6 @@ ____cmpxchg(_u16, volatile short *m, unsigned short old, unsigned short new)
 {
 	unsigned long prev, tmp, cmp, addr64;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"	andnot	%5,7,%4\n"
 	"	inswl	%1,%5,%1\n"
@@ -195,7 +173,6 @@ ____cmpxchg(_u16, volatile short *m, unsigned short old, unsigned short new)
 	".previous"
 	: "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64)
 	: "r" ((long)m), "Ir" (old), "1" (new) : "memory");
-	smp_mb();
 
 	return prev;
 }
@@ -205,7 +182,6 @@ ____cmpxchg(_u32, volatile int *m, int old, int new)
 {
 	unsigned long prev, cmp;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"1:	ldl_l %0,%5\n"
 	"	cmpeq %0,%3,%1\n"
@@ -219,7 +195,6 @@ ____cmpxchg(_u32, volatile int *m, int old, int new)
 	".previous"
 	: "=&r"(prev), "=&r"(cmp), "=m"(*m)
 	: "r"((long) old), "r"(new), "m"(*m) : "memory");
-	smp_mb();
 
 	return prev;
 }
@@ -229,7 +204,6 @@ ____cmpxchg(_u64, volatile long *m, unsigned long old, unsigned long new)
 {
 	unsigned long prev, cmp;
 
-	smp_mb();
 	__asm__ __volatile__(
 	"1:	ldq_l %0,%5\n"
 	"	cmpeq %0,%3,%1\n"
@@ -243,7 +217,6 @@ ____cmpxchg(_u64, volatile long *m, unsigned long old, unsigned long new)
 	".previous"
 	: "=&r"(prev), "=&r"(cmp), "=m"(*m)
 	: "r"((long) old), "r"(new), "m"(*m) : "memory");
-	smp_mb();
 
 	return prev;
 }
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -17,36 +17,40 @@
 #define ATOMIC_INIT(i)	{ (i) }
 
 /**
- * atomic_read - read atomic variable
+ * arch_atomic_read - read atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically reads the value of @v.
  */
-static __always_inline int atomic_read(const atomic_t *v)
+static __always_inline int arch_atomic_read(const atomic_t *v)
 {
+	/*
+	 * Note for KASAN: we deliberately don't use READ_ONCE_NOCHECK() here,
+	 * it's non-inlined function that increases binary size and stack usage.
+	 */
 	return READ_ONCE((v)->counter);
 }
 
 /**
- * atomic_set - set atomic variable
+ * arch_atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static __always_inline void atomic_set(atomic_t *v, int i)
+static __always_inline void arch_atomic_set(atomic_t *v, int i)
 {
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic_add - add integer to atomic variable
+ * arch_atomic_add - add integer to atomic variable
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic_add(int i, atomic_t *v)
+static __always_inline void arch_atomic_add(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		     : "+m" (v->counter)
@@ -54,13 +58,13 @@ static __always_inline void atomic_add(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub - subtract integer from atomic variable
+ * arch_atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
  * Atomically subtracts @i from @v.
  */
-static __always_inline void atomic_sub(int i, atomic_t *v)
+static __always_inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		     : "+m" (v->counter)
@@ -68,7 +72,7 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub_and_test - subtract value from variable and test result
+ * arch_atomic_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
@@ -76,63 +80,63 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
+static __always_inline bool arch_atomic_sub_and_test(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic_inc - increment atomic variable
+ * arch_atomic_inc - increment atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic_inc(atomic_t *v)
+static __always_inline void arch_atomic_inc(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "incl %0"
 		     : "+m" (v->counter));
 }
 
 /**
- * atomic_dec - decrement atomic variable
+ * arch_atomic_dec - decrement atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically decrements @v by 1.
  */
-static __always_inline void atomic_dec(atomic_t *v)
+static __always_inline void arch_atomic_dec(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "decl %0"
 		     : "+m" (v->counter));
 }
 
 /**
- * atomic_dec_and_test - decrement and test
+ * arch_atomic_dec_and_test - decrement and test
  * @v: pointer of type atomic_t
  *
  * Atomically decrements @v by 1 and
 * returns true if the result is 0, or false for all other
 * cases.
  */
-static __always_inline bool atomic_dec_and_test(atomic_t *v)
+static __always_inline bool arch_atomic_dec_and_test(atomic_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decl", v->counter, "%0", e);
 }
 
 /**
- * atomic_inc_and_test - increment and test
+ * arch_atomic_inc_and_test - increment and test
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1
 * and returns true if the result is zero, or false for all
 * other cases.
  */
-static __always_inline bool atomic_inc_and_test(atomic_t *v)
+static __always_inline bool arch_atomic_inc_and_test(atomic_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incl", v->counter, "%0", e);
 }
 
 /**
- * atomic_add_negative - add and test if negative
+ * arch_atomic_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
@@ -140,65 +144,65 @@ static __always_inline bool atomic_inc_and_test(atomic_t *v)
 * if the result is negative, or false when
 * result is greater than or equal to zero.
  */
-static __always_inline bool atomic_add_negative(int i, atomic_t *v)
+static __always_inline bool arch_atomic_add_negative(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addl", v->counter, "er", i, "%0", s);
 }
 
 /**
- * atomic_add_return - add integer and return
+ * arch_atomic_add_return - add integer and return
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v and returns @i + @v
  */
-static __always_inline int atomic_add_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_add_return(int i, atomic_t *v)
 {
 	return i + xadd(&v->counter, i);
 }
 
 /**
- * atomic_sub_return - subtract integer and return
+ * arch_atomic_sub_return - subtract integer and return
  * @v: pointer of type atomic_t
  * @i: integer value to subtract
  *
  * Atomically subtracts @i from @v and returns @v - @i
  */
-static __always_inline int atomic_sub_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_sub_return(int i, atomic_t *v)
 {
-	return atomic_add_return(-i, v);
+	return arch_atomic_add_return(-i, v);
 }
 
-#define atomic_inc_return(v)  (atomic_add_return(1, v))
-#define atomic_dec_return(v)  (atomic_sub_return(1, v))
+#define arch_atomic_inc_return(v)  (arch_atomic_add_return(1, v))
+#define arch_atomic_dec_return(v)  (arch_atomic_sub_return(1, v))
 
-static __always_inline int atomic_fetch_add(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_add(int i, atomic_t *v)
 {
 	return xadd(&v->counter, i);
 }
 
-static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_sub(int i, atomic_t *v)
 {
 	return xadd(&v->counter, -i);
 }
 
-static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+static __always_inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
 {
-	return cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, old, new);
 }
 
-#define atomic_try_cmpxchg atomic_try_cmpxchg
-static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new)
+#define arch_atomic_try_cmpxchg arch_atomic_try_cmpxchg
+static __always_inline bool arch_atomic_try_cmpxchg(atomic_t *v, int *old, int new)
 {
 	return try_cmpxchg(&v->counter, old, new);
 }
 
-static inline int atomic_xchg(atomic_t *v, int new)
+static inline int arch_atomic_xchg(atomic_t *v, int new)
 {
 	return xchg(&v->counter, new);
 }
 
-static inline void atomic_and(int i, atomic_t *v)
+static inline void arch_atomic_and(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "andl %1,%0"
 			: "+m" (v->counter)
@@ -206,16 +210,16 @@ static inline void atomic_and(int i, atomic_t *v)
 			: "memory");
 }
 
-static inline int atomic_fetch_and(int i, atomic_t *v)
+static inline int arch_atomic_fetch_and(int i, atomic_t *v)
 {
-	int val = atomic_read(v);
+	int val = arch_atomic_read(v);
 
-	do { } while (!atomic_try_cmpxchg(v, &val, val & i));
+	do { } while (!arch_atomic_try_cmpxchg(v, &val, val & i));
 
 	return val;
 }
 
-static inline void atomic_or(int i, atomic_t *v)
+static inline void arch_atomic_or(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "orl %1,%0"
 			: "+m" (v->counter)
@@ -223,16 +227,16 @@ static inline void atomic_or(int i, atomic_t *v)
 			: "memory");
 }
 
-static inline int atomic_fetch_or(int i, atomic_t *v)
+static inline int arch_atomic_fetch_or(int i, atomic_t *v)
 {
-	int val = atomic_read(v);
+	int val = arch_atomic_read(v);
 
-	do { } while (!atomic_try_cmpxchg(v, &val, val | i));
+	do { } while (!arch_atomic_try_cmpxchg(v, &val, val | i));
 
 	return val;
 }
 
-static inline void atomic_xor(int i, atomic_t *v)
+static inline void arch_atomic_xor(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "xorl %1,%0"
 			: "+m" (v->counter)
@@ -240,17 +244,17 @@ static inline void atomic_xor(int i, atomic_t *v)
 			: "memory");
 }
 
-static inline int atomic_fetch_xor(int i, atomic_t *v)
+static inline int arch_atomic_fetch_xor(int i, atomic_t *v)
 {
-	int val = atomic_read(v);
+	int val = arch_atomic_read(v);
 
-	do { } while (!atomic_try_cmpxchg(v, &val, val ^ i));
+	do { } while (!arch_atomic_try_cmpxchg(v, &val, val ^ i));
 
 	return val;
 }
 
 /**
- * __atomic_add_unless - add unless the number is already a given value
+ * __arch_atomic_add_unless - add unless the number is already a given value
  * @v: pointer of type atomic_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -258,14 +262,14 @@ static inline int atomic_fetch_xor(int i, atomic_t *v)
  * Atomically adds @a to @v, so long as @v was not already @u.
  * Returns the old value of @v.
  */
-static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
+static __always_inline int __arch_atomic_add_unless(atomic_t *v, int a, int u)
 {
-	int c = atomic_read(v);
+	int c = arch_atomic_read(v);
 
 	do {
 		if (unlikely(c == u))
 			break;
-	} while (!atomic_try_cmpxchg(v, &c, c + a));
+	} while (!arch_atomic_try_cmpxchg(v, &c, c + a));
 
 	return c;
 }
@@ -276,4 +280,6 @@ static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
 # include <asm/atomic64_64.h>
 #endif
 
+#include <asm-generic/atomic-instrumented.h>
+
 #endif /* _ASM_X86_ATOMIC_H */
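The renames in the hunks above leave the kernel-wide atomic_*() names to the instrumented wrappers while the arch_atomic_*() functions keep doing the actual work. Callers are unaffected; as a reminder of the API these kernel-doc comments describe, here is a minimal, hypothetical usage sketch (the counter and function names are invented for illustration):

	static atomic_t nr_active = ATOMIC_INIT(0);

	static void session_open(void)
	{
		atomic_inc(&nr_active);		/* instrumented wrapper -> arch_atomic_inc() */
	}

	static void session_close(void)
	{
		if (atomic_dec_and_test(&nr_active))
			pr_info("last session closed\n");
	}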
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -62,7 +62,7 @@ ATOMIC64_DECL(add_unless);
 #undef ATOMIC64_EXPORT
 
 /**
- * atomic64_cmpxchg - cmpxchg atomic64 variable
+ * arch_atomic64_cmpxchg - cmpxchg atomic64 variable
  * @v: pointer to type atomic64_t
  * @o: expected value
  * @n: new value
@@ -71,20 +71,21 @@ ATOMIC64_DECL(add_unless);
  * the old value.
  */
 
-static inline long long atomic64_cmpxchg(atomic64_t *v, long long o, long long n)
+static inline long long arch_atomic64_cmpxchg(atomic64_t *v, long long o,
+					      long long n)
 {
-	return cmpxchg64(&v->counter, o, n);
+	return arch_cmpxchg64(&v->counter, o, n);
 }
 
 /**
- * atomic64_xchg - xchg atomic64 variable
+ * arch_atomic64_xchg - xchg atomic64 variable
  * @v: pointer to type atomic64_t
  * @n: value to assign
  *
  * Atomically xchgs the value of @v to @n and returns
  * the old value.
  */
-static inline long long atomic64_xchg(atomic64_t *v, long long n)
+static inline long long arch_atomic64_xchg(atomic64_t *v, long long n)
 {
 	long long o;
 	unsigned high = (unsigned)(n >> 32);
@@ -96,13 +97,13 @@ static inline long long atomic64_xchg(atomic64_t *v, long long n)
 }
 
 /**
- * atomic64_set - set atomic64 variable
+ * arch_atomic64_set - set atomic64 variable
  * @v: pointer to type atomic64_t
  * @i: value to assign
  *
  * Atomically sets the value of @v to @n.
  */
-static inline void atomic64_set(atomic64_t *v, long long i)
+static inline void arch_atomic64_set(atomic64_t *v, long long i)
 {
 	unsigned high = (unsigned)(i >> 32);
 	unsigned low = (unsigned)i;
@@ -112,12 +113,12 @@ static inline void atomic64_set(atomic64_t *v, long long i)
 }
 
 /**
- * atomic64_read - read atomic64 variable
+ * arch_atomic64_read - read atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically reads the value of @v and returns it.
  */
-static inline long long atomic64_read(const atomic64_t *v)
+static inline long long arch_atomic64_read(const atomic64_t *v)
 {
 	long long r;
 	alternative_atomic64(read, "=&A" (r), "c" (v) : "memory");
@@ -125,13 +126,13 @@ static inline long long atomic64_read(const atomic64_t *v)
 }
 
 /**
- * atomic64_add_return - add and return
+ * arch_atomic64_add_return - add and return
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v and returns @i + *@v
  */
-static inline long long atomic64_add_return(long long i, atomic64_t *v)
+static inline long long arch_atomic64_add_return(long long i, atomic64_t *v)
 {
 	alternative_atomic64(add_return,
 			     ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -142,7 +143,7 @@ static inline long long atomic64_add_return(long long i, atomic64_t *v)
 /*
  * Other variants with different arithmetic operators:
  */
-static inline long long atomic64_sub_return(long long i, atomic64_t *v)
+static inline long long arch_atomic64_sub_return(long long i, atomic64_t *v)
 {
 	alternative_atomic64(sub_return,
 			     ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -150,7 +151,7 @@ static inline long long atomic64_sub_return(long long i, atomic64_t *v)
 	return i;
 }
 
-static inline long long atomic64_inc_return(atomic64_t *v)
+static inline long long arch_atomic64_inc_return(atomic64_t *v)
 {
 	long long a;
 	alternative_atomic64(inc_return, "=&A" (a),
@@ -158,7 +159,7 @@ static inline long long atomic64_inc_return(atomic64_t *v)
 	return a;
 }
 
-static inline long long atomic64_dec_return(atomic64_t *v)
+static inline long long arch_atomic64_dec_return(atomic64_t *v)
 {
 	long long a;
 	alternative_atomic64(dec_return, "=&A" (a),
@@ -167,13 +168,13 @@ static inline long long atomic64_dec_return(atomic64_t *v)
 }
 
 /**
- * atomic64_add - add integer to atomic64 variable
+ * arch_atomic64_add - add integer to atomic64 variable
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v.
  */
-static inline long long atomic64_add(long long i, atomic64_t *v)
+static inline long long arch_atomic64_add(long long i, atomic64_t *v)
 {
 	__alternative_atomic64(add, add_return,
 			       ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -182,13 +183,13 @@ static inline long long atomic64_add(long long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub - subtract the atomic64 variable
+ * arch_atomic64_sub - subtract the atomic64 variable
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
  * Atomically subtracts @i from @v.
  */
-static inline long long atomic64_sub(long long i, atomic64_t *v)
+static inline long long arch_atomic64_sub(long long i, atomic64_t *v)
 {
 	__alternative_atomic64(sub, sub_return,
 			       ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -197,7 +198,7 @@ static inline long long atomic64_sub(long long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub_and_test - subtract value from variable and test result
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
@@ -205,46 +206,46 @@ static inline long long atomic64_sub(long long i, atomic64_t *v)
 * true if the result is zero, or false for all
 * other cases.
  */
-static inline int atomic64_sub_and_test(long long i, atomic64_t *v)
+static inline int arch_atomic64_sub_and_test(long long i, atomic64_t *v)
 {
-	return atomic64_sub_return(i, v) == 0;
+	return arch_atomic64_sub_return(i, v) == 0;
 }
 
 /**
- * atomic64_inc - increment atomic64 variable
+ * arch_atomic64_inc - increment atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1.
  */
-static inline void atomic64_inc(atomic64_t *v)
+static inline void arch_atomic64_inc(atomic64_t *v)
 {
 	__alternative_atomic64(inc, inc_return, /* no output */,
 			       "S" (v) : "memory", "eax", "ecx", "edx");
 }
 
 /**
- * atomic64_dec - decrement atomic64 variable
+ * arch_atomic64_dec - decrement atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1.
  */
-static inline void atomic64_dec(atomic64_t *v)
+static inline void arch_atomic64_dec(atomic64_t *v)
 {
 	__alternative_atomic64(dec, dec_return, /* no output */,
 			       "S" (v) : "memory", "eax", "ecx", "edx");
 }
 
 /**
- * atomic64_dec_and_test - decrement and test
+ * arch_atomic64_dec_and_test - decrement and test
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1 and
 * returns true if the result is 0, or false for all other
 * cases.
  */
-static inline int atomic64_dec_and_test(atomic64_t *v)
+static inline int arch_atomic64_dec_and_test(atomic64_t *v)
 {
-	return atomic64_dec_return(v) == 0;
+	return arch_atomic64_dec_return(v) == 0;
 }
 
 /**
@@ -255,13 +256,13 @@ static inline int atomic64_dec_and_test(atomic64_t *v)
 * and returns true if the result is zero, or false for all
 * other cases.
  */
-static inline int atomic64_inc_and_test(atomic64_t *v)
+static inline int arch_atomic64_inc_and_test(atomic64_t *v)
 {
-	return atomic64_inc_return(v) == 0;
+	return arch_atomic64_inc_return(v) == 0;
 }
 
 /**
- * atomic64_add_negative - add and test if negative
+ * arch_atomic64_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
@@ -269,13 +270,13 @@ static inline int atomic64_inc_and_test(atomic64_t *v)
 * if the result is negative, or false when
 * result is greater than or equal to zero.
  */
-static inline int atomic64_add_negative(long long i, atomic64_t *v)
+static inline int arch_atomic64_add_negative(long long i, atomic64_t *v)
 {
-	return atomic64_add_return(i, v) < 0;
+	return arch_atomic64_add_return(i, v) < 0;
 }
 
 /**
- * atomic64_add_unless - add unless the number is a given value
+ * arch_atomic64_add_unless - add unless the number is a given value
  * @v: pointer of type atomic64_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -283,7 +284,8 @@ static inline int atomic64_add_negative(long long i, atomic64_t *v)
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns non-zero if the add was done, zero otherwise.
  */
-static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
+static inline int arch_atomic64_add_unless(atomic64_t *v, long long a,
+					   long long u)
 {
 	unsigned low = (unsigned)u;
 	unsigned high = (unsigned)(u >> 32);
@@ -294,7 +296,7 @@ static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
 }
 
 
-static inline int atomic64_inc_not_zero(atomic64_t *v)
+static inline int arch_atomic64_inc_not_zero(atomic64_t *v)
 {
 	int r;
 	alternative_atomic64(inc_not_zero, "=&a" (r),
@@ -302,7 +304,7 @@ static inline int atomic64_inc_not_zero(atomic64_t *v)
 	return r;
 }
 
-static inline long long atomic64_dec_if_positive(atomic64_t *v)
+static inline long long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	long long r;
 	alternative_atomic64(dec_if_positive, "=&A" (r),
@@ -313,70 +315,70 @@ static inline long long atomic64_dec_if_positive(atomic64_t *v)
 #undef alternative_atomic64
 #undef __alternative_atomic64
 
-static inline void atomic64_and(long long i, atomic64_t *v)
+static inline void arch_atomic64_and(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c & i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c & i)) != c)
 		c = old;
 }
 
-static inline long long atomic64_fetch_and(long long i, atomic64_t *v)
+static inline long long arch_atomic64_fetch_and(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c & i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c & i)) != c)
 		c = old;
 
 	return old;
 }
 
-static inline void atomic64_or(long long i, atomic64_t *v)
+static inline void arch_atomic64_or(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c | i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c | i)) != c)
 		c = old;
 }
 
-static inline long long atomic64_fetch_or(long long i, atomic64_t *v)
+static inline long long arch_atomic64_fetch_or(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c | i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c | i)) != c)
 		c = old;
 
 	return old;
 }
 
-static inline void atomic64_xor(long long i, atomic64_t *v)
+static inline void arch_atomic64_xor(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c ^ i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c ^ i)) != c)
 		c = old;
 }
 
-static inline long long atomic64_fetch_xor(long long i, atomic64_t *v)
+static inline long long arch_atomic64_fetch_xor(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c ^ i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c ^ i)) != c)
 		c = old;
 
 	return old;
 }
 
-static inline long long atomic64_fetch_add(long long i, atomic64_t *v)
+static inline long long arch_atomic64_fetch_add(long long i, atomic64_t *v)
 {
 	long long old, c = 0;
 
-	while ((old = atomic64_cmpxchg(v, c, c + i)) != c)
+	while ((old = arch_atomic64_cmpxchg(v, c, c + i)) != c)
 		c = old;
 
 	return old;
 }
 
-#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
+#define arch_atomic64_fetch_sub(i, v)	arch_atomic64_fetch_add(-(i), (v))
 
 #endif /* _ASM_X86_ATOMIC64_32_H */
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -11,37 +11,37 @@
 #define ATOMIC64_INIT(i)	{ (i) }
 
 /**
- * atomic64_read - read atomic64 variable
+ * arch_atomic64_read - read atomic64 variable
  * @v: pointer of type atomic64_t
  *
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-static inline long atomic64_read(const atomic64_t *v)
+static inline long arch_atomic64_read(const atomic64_t *v)
 {
 	return READ_ONCE((v)->counter);
 }
 
 /**
- * atomic64_set - set atomic64 variable
+ * arch_atomic64_set - set atomic64 variable
  * @v: pointer to type atomic64_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static inline void atomic64_set(atomic64_t *v, long i)
+static inline void arch_atomic64_set(atomic64_t *v, long i)
 {
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic64_add - add integer to atomic64 variable
+ * arch_atomic64_add - add integer to atomic64 variable
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic64_add(long i, atomic64_t *v)
+static __always_inline void arch_atomic64_add(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "addq %1,%0"
 		     : "=m" (v->counter)
@@ -49,13 +49,13 @@ static __always_inline void atomic64_add(long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub - subtract the atomic64 variable
+ * arch_atomic64_sub - subtract the atomic64 variable
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
  * Atomically subtracts @i from @v.
  */
-static inline void atomic64_sub(long i, atomic64_t *v)
+static inline void arch_atomic64_sub(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "subq %1,%0"
 		     : "=m" (v->counter)
@@ -63,7 +63,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub_and_test - subtract value from variable and test result
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
@@ -71,18 +71,18 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 * true if the result is zero, or false for all
 * other cases.
  */
-static inline bool atomic64_sub_and_test(long i, atomic64_t *v)
+static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic64_inc - increment atomic64 variable
+ * arch_atomic64_inc - increment atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic64_inc(atomic64_t *v)
+static __always_inline void arch_atomic64_inc(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "incq %0"
 		     : "=m" (v->counter)
@@ -90,12 +90,12 @@ static __always_inline void atomic64_inc(atomic64_t *v)
 }
 
 /**
- * atomic64_dec - decrement atomic64 variable
+ * arch_atomic64_dec - decrement atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1.
  */
-static __always_inline void atomic64_dec(atomic64_t *v)
+static __always_inline void arch_atomic64_dec(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "decq %0"
 		     : "=m" (v->counter)
@@ -103,33 +103,33 @@ static __always_inline void atomic64_dec(atomic64_t *v)
 }
 
 /**
- * atomic64_dec_and_test - decrement and test
+ * arch_atomic64_dec_and_test - decrement and test
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1 and
 * returns true if the result is 0, or false for all other
 * cases.
  */
-static inline bool atomic64_dec_and_test(atomic64_t *v)
+static inline bool arch_atomic64_dec_and_test(atomic64_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e);
 }
 
 /**
- * atomic64_inc_and_test - increment and test
+ * arch_atomic64_inc_and_test - increment and test
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1
 * and returns true if the result is zero, or false for all
 * other cases.
  */
-static inline bool atomic64_inc_and_test(atomic64_t *v)
+static inline bool arch_atomic64_inc_and_test(atomic64_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e);
 }
 
 /**
- * atomic64_add_negative - add and test if negative
+ * arch_atomic64_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
@@ -137,59 +137,59 @@ static inline bool atomic64_inc_and_test(atomic64_t *v)
 * if the result is negative, or false when
 * result is greater than or equal to zero.
  */
-static inline bool atomic64_add_negative(long i, atomic64_t *v)
+static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
 }
 
 /**
- * atomic64_add_return - add and return
+ * arch_atomic64_add_return - add and return
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v and returns @i + @v
  */
-static __always_inline long atomic64_add_return(long i, atomic64_t *v)
+static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v)
 {
 	return i + xadd(&v->counter, i);
 }
 
-static inline long atomic64_sub_return(long i, atomic64_t *v)
+static inline long arch_atomic64_sub_return(long i, atomic64_t *v)
 {
-	return atomic64_add_return(-i, v);
+	return arch_atomic64_add_return(-i, v);
 }
 
-static inline long atomic64_fetch_add(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_add(long i, atomic64_t *v)
 {
 	return xadd(&v->counter, i);
 }
 
-static inline long atomic64_fetch_sub(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_sub(long i, atomic64_t *v)
 {
 	return xadd(&v->counter, -i);
 }
 
-#define atomic64_inc_return(v)  (atomic64_add_return(1, (v)))
-#define atomic64_dec_return(v)  (atomic64_sub_return(1, (v)))
+#define arch_atomic64_inc_return(v)  (arch_atomic64_add_return(1, (v)))
+#define arch_atomic64_dec_return(v)  (arch_atomic64_sub_return(1, (v)))
 
-static inline long atomic64_cmpxchg(atomic64_t *v, long old, long new)
+static inline long arch_atomic64_cmpxchg(atomic64_t *v, long old, long new)
 {
-	return cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, old, new);
 }
 
-#define atomic64_try_cmpxchg atomic64_try_cmpxchg
-static __always_inline bool atomic64_try_cmpxchg(atomic64_t *v, s64 *old, long new)
+#define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
+static __always_inline bool arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, long new)
 {
 	return try_cmpxchg(&v->counter, old, new);
 }
 
-static inline long atomic64_xchg(atomic64_t *v, long new)
+static inline long arch_atomic64_xchg(atomic64_t *v, long new)
 {
 	return xchg(&v->counter, new);
 }
 
 /**
- * atomic64_add_unless - add unless the number is a given value
+ * arch_atomic64_add_unless - add unless the number is a given value
  * @v: pointer of type atomic64_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -197,37 +197,37 @@ static inline long atomic64_xchg(atomic64_t *v, long new)
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns the old value of @v.
  */
-static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
+static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
 {
-	s64 c = atomic64_read(v);
+	s64 c = arch_atomic64_read(v);
 	do {
 		if (unlikely(c == u))
 			return false;
-	} while (!atomic64_try_cmpxchg(v, &c, c + a));
+	} while (!arch_atomic64_try_cmpxchg(v, &c, c + a));
 	return true;
 }
 
-#define atomic64_inc_not_zero(v) atomic64_add_unless((v), 1, 0)
+#define arch_atomic64_inc_not_zero(v) arch_atomic64_add_unless((v), 1, 0)
 
 /*
- * atomic64_dec_if_positive - decrement by 1 if old value positive
+ * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
  * @v: pointer of type atomic_t
  *
  * The function returns the old value of *v minus 1, even if
  * the atomic variable, v, was not decremented.
  */
-static inline long atomic64_dec_if_positive(atomic64_t *v)
+static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
-	s64 dec, c = atomic64_read(v);
+	s64 dec, c = arch_atomic64_read(v);
 	do {
 		dec = c - 1;
 		if (unlikely(dec < 0))
 			break;
-	} while (!atomic64_try_cmpxchg(v, &c, dec));
+	} while (!arch_atomic64_try_cmpxchg(v, &c, dec));
 	return dec;
 }
 
-static inline void atomic64_and(long i, atomic64_t *v)
+static inline void arch_atomic64_and(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "andq %1,%0"
 			: "+m" (v->counter)
@@ -235,16 +235,16 @@ static inline void atomic64_and(long i, atomic64_t *v)
 			: "memory");
 }
 
-static inline long atomic64_fetch_and(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_and(long i, atomic64_t *v)
 {
-	s64 val = atomic64_read(v);
+	s64 val = arch_atomic64_read(v);
 
 	do {
-	} while (!atomic64_try_cmpxchg(v, &val, val & i));
+	} while (!arch_atomic64_try_cmpxchg(v, &val, val & i));
 	return val;
 }
 
-static inline void atomic64_or(long i, atomic64_t *v)
+static inline void arch_atomic64_or(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "orq %1,%0"
 			: "+m" (v->counter)
@@ -252,16 +252,16 @@ static inline void atomic64_or(long i, atomic64_t *v)
 			: "memory");
 }
 
-static inline long atomic64_fetch_or(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_or(long i, atomic64_t *v)
 {
-	s64 val = atomic64_read(v);
+	s64 val = arch_atomic64_read(v);
 
 	do {
-	} while (!atomic64_try_cmpxchg(v, &val, val | i));
+	} while (!arch_atomic64_try_cmpxchg(v, &val, val | i));
 	return val;
 }
 
-static inline void atomic64_xor(long i, atomic64_t *v)
+static inline void arch_atomic64_xor(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "xorq %1,%0"
 			: "+m" (v->counter)
@@ -269,12 +269,12 @@ static inline void atomic64_xor(long i, atomic64_t *v)
 			: "memory");
 }
 
-static inline long atomic64_fetch_xor(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_xor(long i, atomic64_t *v)
 {
-	s64 val = atomic64_read(v);
+	s64 val = arch_atomic64_read(v);
 
 	do {
-	} while (!atomic64_try_cmpxchg(v, &val, val ^ i));
+	} while (!arch_atomic64_try_cmpxchg(v, &val, val ^ i));
 	return val;
 }
 
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -145,13 +145,13 @@ extern void __add_wrong_size(void)
 # include <asm/cmpxchg_64.h>
 #endif
 
-#define cmpxchg(ptr, old, new) \
+#define arch_cmpxchg(ptr, old, new) \
 	__cmpxchg(ptr, old, new, sizeof(*(ptr)))
 
-#define sync_cmpxchg(ptr, old, new) \
+#define arch_sync_cmpxchg(ptr, old, new) \
 	__sync_cmpxchg(ptr, old, new, sizeof(*(ptr)))
 
-#define cmpxchg_local(ptr, old, new) \
+#define arch_cmpxchg_local(ptr, old, new) \
 	__cmpxchg_local(ptr, old, new, sizeof(*(ptr)))
 
 
@@ -221,7 +221,7 @@ extern void __add_wrong_size(void)
 #define __try_cmpxchg(ptr, pold, new, size) \
 	__raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
 
-#define try_cmpxchg(ptr, pold, new) 					\
+#define try_cmpxchg(ptr, pold, new)					\
 	__try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
 
 /*
@@ -250,10 +250,10 @@ extern void __add_wrong_size(void)
 	__ret; \
 })
 
-#define cmpxchg_double(p1, p2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \
 	__cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2)
 
-#define cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
 	__cmpxchg_double(, p1, p2, o1, o2, n1, n2)
 
 #endif	/* ASM_X86_CMPXCHG_H */
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -36,10 +36,10 @@ static inline void set_64bit(volatile u64 *ptr, u64 value)
 }
 
 #ifdef CONFIG_X86_CMPXCHG64
-#define cmpxchg64(ptr, o, n) \
+#define arch_cmpxchg64(ptr, o, n) \
 	((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \
 					 (unsigned long long)(n)))
-#define cmpxchg64_local(ptr, o, n) \
+#define arch_cmpxchg64_local(ptr, o, n) \
 	((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \
 					       (unsigned long long)(n)))
 #endif
@@ -76,7 +76,7 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 * to simulate the cmpxchg8b on the 80386 and 80486 CPU.
 */
 
-#define cmpxchg64(ptr, o, n) \
+#define arch_cmpxchg64(ptr, o, n) \
 ({ \
 	__typeof__(*(ptr)) __ret; \
 	__typeof__(*(ptr)) __old = (o); \
@@ -93,7 +93,7 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 	__ret; })
 
 
-#define cmpxchg64_local(ptr, o, n) \
+#define arch_cmpxchg64_local(ptr, o, n) \
 ({ \
 	__typeof__(*(ptr)) __ret; \
 	__typeof__(*(ptr)) __old = (o); \
@ -7,13 +7,13 @@ static inline void set_64bit(volatile u64 *ptr, u64 val)
|
|||
*ptr = val;
|
||||
}
|
||||
|
||||
#define cmpxchg64(ptr, o, n) \
|
||||
#define arch_cmpxchg64(ptr, o, n) \
|
||||
({ \
|
||||
BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
|
||||
cmpxchg((ptr), (o), (n)); \
|
||||
})
|
||||
|
||||
#define cmpxchg64_local(ptr, o, n) \
|
||||
#define arch_cmpxchg64_local(ptr, o, n) \
|
||||
({ \
|
||||
BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
|
||||
cmpxchg_local((ptr), (o), (n)); \
|
||||
|
|
|
@ -0,0 +1,476 @@
|
|||
/*
|
||||
* This file provides wrappers with KASAN instrumentation for atomic operations.
|
||||
* To use this functionality an arch's atomic.h file needs to define all
|
||||
* atomic operations with arch_ prefix (e.g. arch_atomic_read()) and include
|
||||
* this file at the end. This file provides atomic_read() that forwards to
|
||||
* arch_atomic_read() for actual atomic operation.
|
||||
* Note: if an arch atomic operation is implemented by means of other atomic
|
||||
* operations (e.g. atomic_read()/atomic_cmpxchg() loop), then it needs to use
|
||||
* arch_ variants (i.e. arch_atomic_read()/arch_atomic_cmpxchg()) to avoid
|
||||
* double instrumentation.
|
||||
*/
|
||||
|
||||
#ifndef _LINUX_ATOMIC_INSTRUMENTED_H
|
||||
#define _LINUX_ATOMIC_INSTRUMENTED_H
|
||||
|
||||
#include <linux/build_bug.h>
|
||||
#include <linux/kasan-checks.h>
|
||||
|
||||
static __always_inline int atomic_read(const atomic_t *v)
|
||||
{
|
||||
kasan_check_read(v, sizeof(*v));
|
||||
return arch_atomic_read(v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_read(const atomic64_t *v)
|
||||
{
|
||||
kasan_check_read(v, sizeof(*v));
|
||||
return arch_atomic64_read(v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_set(atomic_t *v, int i)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_set(v, i);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_set(atomic64_t *v, s64 i)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_set(v, i);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_xchg(atomic_t *v, int i)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_xchg(v, i);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_xchg(atomic64_t *v, s64 i)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_xchg(v, i);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_cmpxchg(v, old, new);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_cmpxchg(v, old, new);
|
||||
}
|
||||
|
||||
#ifdef arch_atomic_try_cmpxchg
|
||||
#define atomic_try_cmpxchg atomic_try_cmpxchg
|
||||
static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
kasan_check_read(old, sizeof(*old));
|
||||
return arch_atomic_try_cmpxchg(v, old, new);
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef arch_atomic64_try_cmpxchg
|
||||
#define atomic64_try_cmpxchg atomic64_try_cmpxchg
|
||||
static __always_inline bool atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
kasan_check_read(old, sizeof(*old));
|
||||
return arch_atomic64_try_cmpxchg(v, old, new);
|
||||
}
|
||||
#endif
|
||||
|
||||
static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return __arch_atomic_add_unless(v, a, u);
|
||||
}
|
||||
|
||||
|
||||
static __always_inline bool atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_add_unless(v, a, u);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_inc(atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_inc(v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_inc(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_inc(v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_dec(atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_dec(v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_dec(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_dec(v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_add(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_add(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_add(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_add(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_sub(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_sub(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_sub(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_sub(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_and(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_and(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_and(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_and(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_or(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_or(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_or(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_or(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic_xor(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic_xor(i, v);
|
||||
}
|
||||
|
||||
static __always_inline void atomic64_xor(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
arch_atomic64_xor(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_inc_return(atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_inc_return(v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_inc_return(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_inc_return(v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_dec_return(atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_dec_return(v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_dec_return(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_dec_return(v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_inc_not_zero(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_inc_not_zero(v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_dec_if_positive(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_dec_if_positive(v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic_dec_and_test(atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_dec_and_test(v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic64_dec_and_test(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_dec_and_test(v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic_inc_and_test(atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_inc_and_test(v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic64_inc_and_test(atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_inc_and_test(v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_add_return(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_add_return(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_add_return(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_add_return(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_sub_return(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_sub_return(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_sub_return(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_sub_return(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_fetch_add(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_fetch_add(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_fetch_add(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_fetch_add(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_fetch_sub(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_fetch_sub(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_fetch_sub(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_fetch_and(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_fetch_and(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_fetch_and(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_fetch_and(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_fetch_or(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_fetch_or(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_fetch_or(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_fetch_or(i, v);
|
||||
}
|
||||
|
||||
static __always_inline int atomic_fetch_xor(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_fetch_xor(i, v);
|
||||
}
|
||||
|
||||
static __always_inline s64 atomic64_fetch_xor(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_fetch_xor(i, v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_sub_and_test(i, v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic64_sub_and_test(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_sub_and_test(i, v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic_add_negative(int i, atomic_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic_add_negative(i, v);
|
||||
}
|
||||
|
||||
static __always_inline bool atomic64_add_negative(s64 i, atomic64_t *v)
|
||||
{
|
||||
kasan_check_write(v, sizeof(*v));
|
||||
return arch_atomic64_add_negative(i, v);
|
||||
}
|
||||
|
||||
static __always_inline unsigned long
|
||||
cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size)
|
||||
{
|
||||
kasan_check_write(ptr, size);
|
||||
switch (size) {
|
||||
case 1:
|
||||
return arch_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
|
||||
case 2:
|
||||
return arch_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
|
||||
case 4:
|
||||
return arch_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
|
||||
case 8:
|
||||
BUILD_BUG_ON(sizeof(unsigned long) != 8);
|
||||
return arch_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
|
||||
}
|
||||
BUILD_BUG();
|
||||
return 0;
|
||||
}
|
||||
|
||||
#define cmpxchg(ptr, old, new) \
|
||||
({ \
|
||||
((__typeof__(*(ptr)))cmpxchg_size((ptr), (unsigned long)(old), \
|
||||
(unsigned long)(new), sizeof(*(ptr)))); \
|
||||
})
|
||||
|
||||
static __always_inline unsigned long
|
||||
sync_cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new,
|
||||
int size)
|
||||
{
|
||||
kasan_check_write(ptr, size);
|
||||
switch (size) {
|
||||
case 1:
|
||||
return arch_sync_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
|
||||
case 2:
|
||||
return arch_sync_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
|
||||
case 4:
|
||||
return arch_sync_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
|
||||
case 8:
|
||||
BUILD_BUG_ON(sizeof(unsigned long) != 8);
|
||||
return arch_sync_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
|
||||
}
|
||||
BUILD_BUG();
|
||||
return 0;
|
||||
}
|
||||
|
||||
#define sync_cmpxchg(ptr, old, new) \
|
||||
({ \
|
||||
((__typeof__(*(ptr)))sync_cmpxchg_size((ptr), \
|
||||
(unsigned long)(old), (unsigned long)(new), \
|
||||
sizeof(*(ptr)))); \
|
||||
})
|
||||
|
||||
static __always_inline unsigned long
|
||||
cmpxchg_local_size(volatile void *ptr, unsigned long old, unsigned long new,
|
||||
int size)
|
||||
{
|
||||
kasan_check_write(ptr, size);
|
||||
switch (size) {
|
||||
case 1:
|
||||
return arch_cmpxchg_local((u8 *)ptr, (u8)old, (u8)new);
|
||||
case 2:
|
||||
return arch_cmpxchg_local((u16 *)ptr, (u16)old, (u16)new);
|
||||
case 4:
|
||||
return arch_cmpxchg_local((u32 *)ptr, (u32)old, (u32)new);
|
||||
case 8:
|
||||
BUILD_BUG_ON(sizeof(unsigned long) != 8);
|
||||
return arch_cmpxchg_local((u64 *)ptr, (u64)old, (u64)new);
|
||||
}
|
||||
BUILD_BUG();
|
||||
return 0;
|
||||
}
|
||||
|
||||
#define cmpxchg_local(ptr, old, new) \
|
||||
({ \
|
||||
((__typeof__(*(ptr)))cmpxchg_local_size((ptr), \
|
||||
(unsigned long)(old), (unsigned long)(new), \
|
||||
sizeof(*(ptr)))); \
|
||||
})
|
||||
|
||||
static __always_inline u64
|
||||
cmpxchg64_size(volatile u64 *ptr, u64 old, u64 new)
|
||||
{
|
||||
kasan_check_write(ptr, sizeof(*ptr));
|
||||
return arch_cmpxchg64(ptr, old, new);
|
||||
}
|
||||
|
||||
#define cmpxchg64(ptr, old, new) \
|
||||
({ \
|
||||
((__typeof__(*(ptr)))cmpxchg64_size((ptr), (u64)(old), \
|
||||
(u64)(new))); \
|
||||
})
|
||||
|
||||
static __always_inline u64
|
||||
cmpxchg64_local_size(volatile u64 *ptr, u64 old, u64 new)
|
||||
{
|
||||
kasan_check_write(ptr, sizeof(*ptr));
|
||||
return arch_cmpxchg64_local(ptr, old, new);
|
||||
}
|
||||
|
||||
#define cmpxchg64_local(ptr, old, new) \
|
||||
({ \
|
||||
((__typeof__(*(ptr)))cmpxchg64_local_size((ptr), (u64)(old), \
|
||||
(u64)(new))); \
|
||||
})
|
||||
|
||||
/*
|
||||
* Originally we had the following code here:
|
||||
* __typeof__(p1) ____p1 = (p1);
|
||||
* kasan_check_write(____p1, 2 * sizeof(*____p1));
|
||||
* arch_cmpxchg_double(____p1, (p2), (o1), (o2), (n1), (n2));
|
||||
* But it leads to compilation failures (see gcc issue 72873).
|
||||
* So for now it's left non-instrumented.
|
||||
* There are few callers of cmpxchg_double(), so it's not critical.
|
||||
*/
|
||||
#define cmpxchg_double(p1, p2, o1, o2, n1, n2) \
|
||||
({ \
|
||||
arch_cmpxchg_double((p1), (p2), (o1), (o2), (n1), (n2)); \
|
||||
})
|
||||
|
||||
#define cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
|
||||
({ \
|
||||
arch_cmpxchg_double_local((p1), (p2), (o1), (o2), (n1), (n2)); \
|
||||
})
|
||||
|
||||
#endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
|
|
@ -14,7 +14,6 @@
|
|||
#include <asm/current.h>
|
||||
#include <linux/list.h>
|
||||
#include <linux/spinlock_types.h>
|
||||
#include <linux/linkage.h>
|
||||
#include <linux/lockdep.h>
|
||||
#include <linux/atomic.h>
|
||||
#include <asm/processor.h>
|
||||
|
|
|
@ -556,9 +556,9 @@ static void print_lock(struct held_lock *hlock)
|
|||
return;
|
||||
}
|
||||
|
||||
printk(KERN_CONT "%p", hlock->instance);
|
||||
print_lock_name(lock_classes + class_idx - 1);
|
||||
printk(KERN_CONT ", at: [<%p>] %pS\n",
|
||||
(void *)hlock->acquire_ip, (void *)hlock->acquire_ip);
|
||||
printk(KERN_CONT ", at: %pS\n", (void *)hlock->acquire_ip);
|
||||
}
|
||||
|
||||
static void lockdep_print_held_locks(struct task_struct *curr)
|
||||
|
@ -808,7 +808,7 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
|
|||
if (verbose(class)) {
|
||||
graph_unlock();
|
||||
|
||||
printk("\nnew class %p: %s", class->key, class->name);
|
||||
printk("\nnew class %px: %s", class->key, class->name);
|
||||
if (class->name_version > 1)
|
||||
printk(KERN_CONT "#%d", class->name_version);
|
||||
printk(KERN_CONT "\n");
|
||||
|
@ -1407,7 +1407,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
|
|||
}
|
||||
printk("%*s }\n", depth, "");
|
||||
|
||||
printk("%*s ... key at: [<%p>] %pS\n",
|
||||
printk("%*s ... key at: [<%px>] %pS\n",
|
||||
depth, "", class->key, class->key);
|
||||
}
|
||||
|
||||
|
@ -2340,7 +2340,7 @@ cache_hit:
|
|||
|
||||
if (very_verbose(class)) {
|
||||
printk("\nhash chain already cached, key: "
|
||||
"%016Lx tail class: [%p] %s\n",
|
||||
"%016Lx tail class: [%px] %s\n",
|
||||
(unsigned long long)chain_key,
|
||||
class->key, class->name);
|
||||
}
|
||||
|
@ -2349,7 +2349,7 @@ cache_hit:
|
|||
}
|
||||
|
||||
if (very_verbose(class)) {
|
||||
printk("\nnew hash chain, key: %016Lx tail class: [%p] %s\n",
|
||||
printk("\nnew hash chain, key: %016Lx tail class: [%px] %s\n",
|
||||
(unsigned long long)chain_key, class->key, class->name);
|
||||
}
|
||||
|
||||
|
@ -2676,16 +2676,16 @@ check_usage_backwards(struct task_struct *curr, struct held_lock *this,
|
|||
void print_irqtrace_events(struct task_struct *curr)
|
||||
{
|
||||
printk("irq event stamp: %u\n", curr->irq_events);
|
||||
printk("hardirqs last enabled at (%u): [<%p>] %pS\n",
|
||||
printk("hardirqs last enabled at (%u): [<%px>] %pS\n",
|
||||
curr->hardirq_enable_event, (void *)curr->hardirq_enable_ip,
|
||||
(void *)curr->hardirq_enable_ip);
|
||||
printk("hardirqs last disabled at (%u): [<%p>] %pS\n",
|
||||
printk("hardirqs last disabled at (%u): [<%px>] %pS\n",
|
||||
curr->hardirq_disable_event, (void *)curr->hardirq_disable_ip,
|
||||
(void *)curr->hardirq_disable_ip);
|
||||
printk("softirqs last enabled at (%u): [<%p>] %pS\n",
|
||||
printk("softirqs last enabled at (%u): [<%px>] %pS\n",
|
||||
curr->softirq_enable_event, (void *)curr->softirq_enable_ip,
|
||||
(void *)curr->softirq_enable_ip);
|
||||
printk("softirqs last disabled at (%u): [<%p>] %pS\n",
|
||||
printk("softirqs last disabled at (%u): [<%px>] %pS\n",
|
||||
curr->softirq_disable_event, (void *)curr->softirq_disable_ip,
|
||||
(void *)curr->softirq_disable_ip);
|
||||
}
|
||||
|
@ -3207,7 +3207,7 @@ static void __lockdep_init_map(struct lockdep_map *lock, const char *name,
|
|||
* Sanity check, the lock-class key must be persistent:
|
||||
*/
|
||||
if (!static_obj(key)) {
|
||||
printk("BUG: key %p not in .data!\n", key);
|
||||
printk("BUG: key %px not in .data!\n", key);
|
||||
/*
|
||||
* What it says above ^^^^^, I suggest you read it.
|
||||
*/
|
||||
|
@ -3322,7 +3322,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
|
|||
}
|
||||
atomic_inc((atomic_t *)&class->ops);
|
||||
if (very_verbose(class)) {
|
||||
printk("\nacquire class [%p] %s", class->key, class->name);
|
||||
printk("\nacquire class [%px] %s", class->key, class->name);
|
||||
if (class->name_version > 1)
|
||||
printk(KERN_CONT "#%d", class->name_version);
|
||||
printk(KERN_CONT "\n");
|
||||
|
@ -4376,7 +4376,7 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
|
|||
pr_warn("WARNING: held lock freed!\n");
|
||||
print_kernel_ident();
|
||||
pr_warn("-------------------------\n");
|
||||
pr_warn("%s/%d is freeing memory %p-%p, with a lock still held there!\n",
|
||||
pr_warn("%s/%d is freeing memory %px-%px, with a lock still held there!\n",
|
||||
curr->comm, task_pid_nr(curr), mem_from, mem_to-1);
|
||||
print_lock(hlock);
|
||||
lockdep_print_held_locks(curr);
|
||||
|
|
|
@ -1268,8 +1268,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
|
|||
|
||||
if (unlikely(ret)) {
|
||||
__set_current_state(TASK_RUNNING);
|
||||
if (rt_mutex_has_waiters(lock))
|
||||
remove_waiter(lock, &waiter);
|
||||
remove_waiter(lock, &waiter);
|
||||
rt_mutex_handle_deadlock(ret, chwalk, &waiter);
|
||||
}
|
||||
|
||||
|
|
|
@ -52,12 +52,13 @@ static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
|
|||
static inline struct rt_mutex_waiter *
|
||||
rt_mutex_top_waiter(struct rt_mutex *lock)
|
||||
{
|
||||
struct rt_mutex_waiter *w;
|
||||
|
||||
w = rb_entry(lock->waiters.rb_leftmost,
|
||||
struct rt_mutex_waiter, tree_entry);
|
||||
BUG_ON(w->lock != lock);
|
||||
struct rb_node *leftmost = rb_first_cached(&lock->waiters);
|
||||
struct rt_mutex_waiter *w = NULL;
|
||||
|
||||
if (leftmost) {
|
||||
w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry);
|
||||
BUG_ON(w->lock != lock);
|
||||
}
|
||||
return w;
|
||||
}
|
||||
|
||||
|
|
|
@ -117,6 +117,7 @@ EXPORT_SYMBOL(down_write_trylock);
|
|||
void up_read(struct rw_semaphore *sem)
|
||||
{
|
||||
rwsem_release(&sem->dep_map, 1, _RET_IP_);
|
||||
DEBUG_RWSEMS_WARN_ON(sem->owner != RWSEM_READER_OWNED);
|
||||
|
||||
__up_read(sem);
|
||||
}
|
||||
|
@ -129,6 +130,7 @@ EXPORT_SYMBOL(up_read);
|
|||
void up_write(struct rw_semaphore *sem)
|
||||
{
|
||||
rwsem_release(&sem->dep_map, 1, _RET_IP_);
|
||||
DEBUG_RWSEMS_WARN_ON(sem->owner != current);
|
||||
|
||||
rwsem_clear_owner(sem);
|
||||
__up_write(sem);
|
||||
|
@ -142,6 +144,7 @@ EXPORT_SYMBOL(up_write);
|
|||
void downgrade_write(struct rw_semaphore *sem)
|
||||
{
|
||||
lock_downgrade(&sem->dep_map, _RET_IP_);
|
||||
DEBUG_RWSEMS_WARN_ON(sem->owner != current);
|
||||
|
||||
rwsem_set_reader_owned(sem);
|
||||
__downgrade_write(sem);
|
||||
|
@ -211,6 +214,7 @@ EXPORT_SYMBOL(down_write_killable_nested);
|
|||
|
||||
void up_read_non_owner(struct rw_semaphore *sem)
|
||||
{
|
||||
DEBUG_RWSEMS_WARN_ON(sem->owner != RWSEM_READER_OWNED);
|
||||
__up_read(sem);
|
||||
}
|
||||
|
||||
|
|
|
@ -16,6 +16,12 @@
|
|||
*/
|
||||
#define RWSEM_READER_OWNED ((struct task_struct *)1UL)
|
||||
|
||||
#ifdef CONFIG_DEBUG_RWSEMS
|
||||
# define DEBUG_RWSEMS_WARN_ON(c) DEBUG_LOCKS_WARN_ON(c)
|
||||
#else
|
||||
# define DEBUG_RWSEMS_WARN_ON(c)
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
|
||||
/*
|
||||
* All writes to owner are protected by WRITE_ONCE() to make sure that
|
||||
|
@ -41,7 +47,7 @@ static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
|
|||
* do a write to the rwsem cacheline when it is really necessary
|
||||
* to minimize cacheline contention.
|
||||
*/
|
||||
if (sem->owner != RWSEM_READER_OWNED)
|
||||
if (READ_ONCE(sem->owner) != RWSEM_READER_OWNED)
|
||||
WRITE_ONCE(sem->owner, RWSEM_READER_OWNED);
|
||||
}
|
||||
|
||||
|
|
|
@ -1034,69 +1034,20 @@ config DEBUG_PREEMPT
|
|||
|
||||
menu "Lock Debugging (spinlocks, mutexes, etc...)"
|
||||
|
||||
config DEBUG_RT_MUTEXES
|
||||
bool "RT Mutex debugging, deadlock detection"
|
||||
depends on DEBUG_KERNEL && RT_MUTEXES
|
||||
help
|
||||
This allows rt mutex semantics violations and rt mutex related
|
||||
deadlocks (lockups) to be detected and reported automatically.
|
||||
|
||||
config DEBUG_SPINLOCK
|
||||
bool "Spinlock and rw-lock debugging: basic checks"
|
||||
depends on DEBUG_KERNEL
|
||||
select UNINLINE_SPIN_UNLOCK
|
||||
help
|
||||
Say Y here and build SMP to catch missing spinlock initialization
|
||||
and certain other kinds of spinlock errors commonly made. This is
|
||||
best used in conjunction with the NMI watchdog so that spinlock
|
||||
deadlocks are also debuggable.
|
||||
|
||||
config DEBUG_MUTEXES
|
||||
bool "Mutex debugging: basic checks"
|
||||
depends on DEBUG_KERNEL
|
||||
help
|
||||
This feature allows mutex semantics violations to be detected and
|
||||
reported.
|
||||
|
||||
config DEBUG_WW_MUTEX_SLOWPATH
|
||||
bool "Wait/wound mutex debugging: Slowpath testing"
|
||||
depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
|
||||
select DEBUG_LOCK_ALLOC
|
||||
select DEBUG_SPINLOCK
|
||||
select DEBUG_MUTEXES
|
||||
help
|
||||
This feature enables slowpath testing for w/w mutex users by
|
||||
injecting additional -EDEADLK wound/backoff cases. Together with
|
||||
the full mutex checks enabled with (CONFIG_PROVE_LOCKING) this
|
||||
will test all possible w/w mutex interface abuse with the
|
||||
exception of simply not acquiring all the required locks.
|
||||
Note that this feature can introduce significant overhead, so
|
||||
it really should not be enabled in a production or distro kernel,
|
||||
even a debug kernel. If you are a driver writer, enable it. If
|
||||
you are a distro, do not.
|
||||
|
||||
config DEBUG_LOCK_ALLOC
|
||||
bool "Lock debugging: detect incorrect freeing of live locks"
|
||||
depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
|
||||
select DEBUG_SPINLOCK
|
||||
select DEBUG_MUTEXES
|
||||
select DEBUG_RT_MUTEXES if RT_MUTEXES
|
||||
select LOCKDEP
|
||||
help
|
||||
This feature will check whether any held lock (spinlock, rwlock,
|
||||
mutex or rwsem) is incorrectly freed by the kernel, via any of the
|
||||
memory-freeing routines (kfree(), kmem_cache_free(), free_pages(),
|
||||
vfree(), etc.), whether a live lock is incorrectly reinitialized via
|
||||
spin_lock_init()/mutex_init()/etc., or whether there is any lock
|
||||
held during task exit.
|
||||
config LOCK_DEBUGGING_SUPPORT
|
||||
bool
|
||||
depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
|
||||
default y
|
||||
|
||||
config PROVE_LOCKING
|
||||
bool "Lock debugging: prove locking correctness"
|
||||
depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
|
||||
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
|
||||
select LOCKDEP
|
||||
select DEBUG_SPINLOCK
|
||||
select DEBUG_MUTEXES
|
||||
select DEBUG_RT_MUTEXES if RT_MUTEXES
|
||||
select DEBUG_RWSEMS if RWSEM_SPIN_ON_OWNER
|
||||
select DEBUG_WW_MUTEX_SLOWPATH
|
||||
select DEBUG_LOCK_ALLOC
|
||||
select TRACE_IRQFLAGS
|
||||
default n
|
||||
|
@ -1134,20 +1085,9 @@ config PROVE_LOCKING
|
|||
|
||||
For more details, see Documentation/locking/lockdep-design.txt.
|
||||
|
||||
config LOCKDEP
|
||||
bool
|
||||
depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
|
||||
select STACKTRACE
|
||||
select FRAME_POINTER if !MIPS && !PPC && !ARM_UNWIND && !S390 && !MICROBLAZE && !ARC && !SCORE && !X86
|
||||
select KALLSYMS
|
||||
select KALLSYMS_ALL
|
||||
|
||||
config LOCKDEP_SMALL
|
||||
bool
|
||||
|
||||
config LOCK_STAT
|
||||
bool "Lock usage statistics"
|
||||
depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
|
||||
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
|
||||
select LOCKDEP
|
||||
select DEBUG_SPINLOCK
|
||||
select DEBUG_MUTEXES
|
||||
|
@ -1167,6 +1107,80 @@ config LOCK_STAT
|
|||
CONFIG_LOCK_STAT defines "contended" and "acquired" lock events.
|
||||
(CONFIG_LOCKDEP defines "acquire" and "release" events.)
|
||||
|
||||
config DEBUG_RT_MUTEXES
|
||||
bool "RT Mutex debugging, deadlock detection"
|
||||
depends on DEBUG_KERNEL && RT_MUTEXES
|
||||
help
|
||||
This allows rt mutex semantics violations and rt mutex related
|
||||
deadlocks (lockups) to be detected and reported automatically.
|
||||
|
||||
config DEBUG_SPINLOCK
|
||||
bool "Spinlock and rw-lock debugging: basic checks"
|
||||
depends on DEBUG_KERNEL
|
||||
select UNINLINE_SPIN_UNLOCK
|
||||
help
|
||||
Say Y here and build SMP to catch missing spinlock initialization
|
||||
and certain other kinds of spinlock errors commonly made. This is
|
||||
best used in conjunction with the NMI watchdog so that spinlock
|
||||
deadlocks are also debuggable.
|
||||
|
||||
config DEBUG_MUTEXES
|
||||
bool "Mutex debugging: basic checks"
|
||||
depends on DEBUG_KERNEL
|
||||
help
|
||||
This feature allows mutex semantics violations to be detected and
|
||||
reported.
|
||||
|
||||
config DEBUG_WW_MUTEX_SLOWPATH
|
||||
bool "Wait/wound mutex debugging: Slowpath testing"
|
||||
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
|
||||
select DEBUG_LOCK_ALLOC
|
||||
select DEBUG_SPINLOCK
|
||||
select DEBUG_MUTEXES
|
||||
help
|
||||
This feature enables slowpath testing for w/w mutex users by
|
||||
injecting additional -EDEADLK wound/backoff cases. Together with
|
||||
the full mutex checks enabled with (CONFIG_PROVE_LOCKING) this
|
||||
will test all possible w/w mutex interface abuse with the
|
||||
exception of simply not acquiring all the required locks.
|
||||
Note that this feature can introduce significant overhead, so
|
||||
it really should not be enabled in a production or distro kernel,
|
||||
even a debug kernel. If you are a driver writer, enable it. If
|
||||
you are a distro, do not.
|
||||
|
||||
config DEBUG_RWSEMS
|
||||
bool "RW Semaphore debugging: basic checks"
|
||||
depends on DEBUG_KERNEL && RWSEM_SPIN_ON_OWNER
|
||||
help
|
||||
This debugging feature allows mismatched rw semaphore locks and unlocks
|
||||
to be detected and reported.
|
||||
|
||||
config DEBUG_LOCK_ALLOC
|
||||
bool "Lock debugging: detect incorrect freeing of live locks"
|
||||
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
|
||||
select DEBUG_SPINLOCK
|
||||
select DEBUG_MUTEXES
|
||||
select DEBUG_RT_MUTEXES if RT_MUTEXES
|
||||
select LOCKDEP
|
||||
help
|
||||
This feature will check whether any held lock (spinlock, rwlock,
|
||||
mutex or rwsem) is incorrectly freed by the kernel, via any of the
|
||||
memory-freeing routines (kfree(), kmem_cache_free(), free_pages(),
|
||||
vfree(), etc.), whether a live lock is incorrectly reinitialized via
|
||||
spin_lock_init()/mutex_init()/etc., or whether there is any lock
|
||||
held during task exit.
|
||||
|
||||
config LOCKDEP
|
||||
bool
|
||||
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
|
||||
select STACKTRACE
|
||||
select FRAME_POINTER if !MIPS && !PPC && !ARM_UNWIND && !S390 && !MICROBLAZE && !ARC && !SCORE && !X86
|
||||
select KALLSYMS
|
||||
select KALLSYMS_ALL
|
||||
|
||||
config LOCKDEP_SMALL
|
||||
bool
|
||||
|
||||
config DEBUG_LOCKDEP
|
||||
bool "Lock dependency engine debugging"
|
||||
depends on DEBUG_KERNEL && LOCKDEP
|
||||
|
|
|
@ -0,0 +1,29 @@
|
|||
Prior Operation Subsequent Operation
--------------- ---------------------------
C Self R W RMW Self R W DR DW RMW SV
-- ---- - - --- ---- - - -- -- --- --

Store, e.g., WRITE_ONCE() Y Y
Load, e.g., READ_ONCE() Y Y Y Y
Unsuccessful RMW operation Y Y Y Y
rcu_dereference() Y Y Y Y
Successful *_acquire() R Y Y Y Y Y Y
Successful *_release() C Y Y Y W Y
smp_rmb() Y R Y Y R
smp_wmb() Y W Y Y W
smp_mb() & synchronize_rcu() CP Y Y Y Y Y Y Y Y
Successful full non-void RMW CP Y Y Y Y Y Y Y Y Y Y Y
smp_mb__before_atomic() CP Y Y Y a a a a Y
smp_mb__after_atomic() CP a a Y Y Y Y Y


Key: C:   Ordering is cumulative
     P:   Ordering propagates
     R:   Read, for example, READ_ONCE(), or read portion of RMW
     W:   Write, for example, WRITE_ONCE(), or write portion of RMW
     Y:   Provides ordering
     a:   Provides ordering given intervening RMW atomic operation
     DR:  Dependent read (address dependency)
     DW:  Dependent write (address, data, or control dependency)
     RMW: Atomic read-modify-write operation
     SV:  Same-variable access
|
File diff suppressed because it is too large
|
@ -0,0 +1,570 @@
|
|||
This document provides "recipes", that is, litmus tests for commonly
|
||||
occurring situations, as well as a few that illustrate subtly broken but
|
||||
attractive nuisances. Many of these recipes include example code from
|
||||
v4.13 of the Linux kernel.
|
||||
|
||||
The first section covers simple special cases, the second section
|
||||
takes off the training wheels to cover more involved examples,
|
||||
and the third section provides a few rules of thumb.
|
||||
|
||||
|
||||
Simple special cases
|
||||
====================
|
||||
|
||||
This section presents two simple special cases, the first being where
|
||||
there is only one CPU or only one memory location is accessed, and the
|
||||
second being use of that old concurrency workhorse, locking.
|
||||
|
||||
|
||||
Single CPU or single memory location
|
||||
------------------------------------
|
||||
|
||||
If there is only one CPU on the one hand or only one variable
|
||||
on the other, the code will execute in order. There are (as
|
||||
usual) some things to be careful of:
|
||||
|
||||
1. Some aspects of the C language are unordered. For example,
|
||||
in the expression "f(x) + g(y)", the order in which f and g are
|
||||
called is not defined; the object code is allowed to use either
|
||||
order or even to interleave the computations.
|
||||
|
||||
2. Compilers are permitted to use the "as-if" rule. That is, a
|
||||
compiler can emit whatever code it likes for normal accesses,
|
||||
as long as the results of a single-threaded execution appear
|
||||
just as if the compiler had followed all the relevant rules.
|
||||
To see this, compile with a high level of optimization and run
|
||||
the debugger on the resulting binary.
|
||||
|
||||
3. If there is only one variable but multiple CPUs, that variable
|
||||
must be properly aligned and all accesses to that variable must
|
||||
be full sized. Variables that straddle cachelines or pages void
|
||||
your full-ordering warranty, as do undersized accesses that load
|
||||
from or store to only part of the variable.
|
||||
|
||||
4. If there are multiple CPUs, accesses to shared variables should
|
||||
use READ_ONCE() and WRITE_ONCE() or stronger to prevent load/store
|
||||
tearing, load/store fusing, and invented loads and stores; a short
sketch follows this list.
|
||||
There are exceptions to this rule, including:
|
||||
|
||||
i. When there is no possibility of a given shared variable
|
||||
being updated by some other CPU, for example, while
|
||||
holding the update-side lock, reads from that variable
|
||||
need not use READ_ONCE().
|
||||
|
||||
ii. When there is no possibility of a given shared variable
|
||||
being either read or updated by other CPUs, for example,
|
||||
when running during early boot, reads from that variable
|
||||
need not use READ_ONCE() and writes to that variable
|
||||
need not use WRITE_ONCE().
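
For example, consider this sketch (not taken from the kernel; "stop" is a
made-up flag) of one CPU polling a flag that another CPU sets. Without
READ_ONCE(), the compiler would be within its rights to fuse the loads in
CPU1()'s loop into a single load and spin forever:

	int stop;

	void CPU0(void)
	{
		WRITE_ONCE(stop, 1);		/* marked store: no tearing or invented stores */
	}

	void CPU1(void)
	{
		while (!READ_ONCE(stop))	/* marked load: no load fusing, so the */
			cpu_relax();		/* flag really is reread on each pass */
	}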
|
||||
|
||||
|
||||
Locking
|
||||
-------
|
||||
|
||||
Locking is well-known and straightforward, at least if you don't think
|
||||
about it too hard. And the basic rule is indeed quite simple: Any CPU that
|
||||
has acquired a given lock sees any changes previously seen or made by any
|
||||
CPU before it released that same lock. Note that this statement is a bit
|
||||
stronger than "Any CPU holding a given lock sees all changes made by any
|
||||
CPU during the time that CPU was holding this same lock". For example,
|
||||
consider the following pair of code fragments:
|
||||
|
||||
/* See MP+polocks.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
spin_lock(&mylock);
|
||||
WRITE_ONCE(y, 1);
|
||||
spin_unlock(&mylock);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
spin_lock(&mylock);
|
||||
r0 = READ_ONCE(y);
|
||||
spin_unlock(&mylock);
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
The basic rule guarantees that if CPU0() acquires mylock before CPU1(),
|
||||
then both r0 and r1 must be set to the value 1. This also has the
|
||||
consequence that if the final value of r0 is equal to 1, then the final
|
||||
value of r1 must also be equal to 1. In contrast, the weaker rule would
|
||||
say nothing about the final value of r1.
|
||||
|
||||
The converse to the basic rule also holds, as illustrated by the
|
||||
following litmus test:
|
||||
|
||||
/* See MP+porevlocks.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
r0 = READ_ONCE(y);
|
||||
spin_lock(&mylock);
|
||||
r1 = READ_ONCE(x);
|
||||
spin_unlock(&mylock);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
spin_lock(&mylock);
|
||||
WRITE_ONCE(x, 1);
|
||||
spin_unlock(&mylock);
|
||||
WRITE_ONCE(y, 1);
|
||||
}
|
||||
|
||||
This converse to the basic rule guarantees that if CPU0() acquires
|
||||
mylock before CPU1(), then both r0 and r1 must be set to the value 0.
|
||||
This also has the consequence that if the final value of r1 is equal
|
||||
to 0, then the final value of r0 must also be equal to 0. In contrast,
|
||||
the weaker rule would say nothing about the final value of r0.
|
||||
|
||||
These examples show only a single pair of CPUs, but the effects of the
|
||||
locking basic rule extend across multiple acquisitions of a given lock
|
||||
across multiple CPUs.
|
||||
|
||||
However, it is not necessarily the case that accesses ordered by
|
||||
locking will be seen as ordered by CPUs not holding that lock.
|
||||
Consider this example:
|
||||
|
||||
/* See Z6.0+pooncelock+pooncelock+pombonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
spin_lock(&mylock);
|
||||
WRITE_ONCE(x, 1);
|
||||
WRITE_ONCE(y, 1);
|
||||
spin_unlock(&mylock);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
spin_lock(&mylock);
|
||||
r0 = READ_ONCE(y);
|
||||
WRITE_ONCE(z, 1);
|
||||
spin_unlock(&mylock);
|
||||
}
|
||||
|
||||
void CPU2(void)
|
||||
{
|
||||
WRITE_ONCE(z, 2);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
Counter-intuitive though it might be, it is quite possible to have
|
||||
the final value of r0 be 1, the final value of z be 2, and the final
|
||||
value of r1 be 0. The reason for this surprising outcome is that
|
||||
CPU2() never acquired the lock, and thus did not benefit from the
|
||||
lock's ordering properties.
|
||||
|
||||
Ordering can be extended to CPUs not holding the lock by careful use
|
||||
of smp_mb__after_spinlock():
|
||||
|
||||
/* See Z6.0+pooncelock+poonceLock+pombonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
spin_lock(&mylock);
|
||||
WRITE_ONCE(x, 1);
|
||||
WRITE_ONCE(y, 1);
|
||||
spin_unlock(&mylock);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
spin_lock(&mylock);
|
||||
smp_mb__after_spinlock();
|
||||
r0 = READ_ONCE(y);
|
||||
WRITE_ONCE(z, 1);
|
||||
spin_unlock(&mylock);
|
||||
}
|
||||
|
||||
void CPU2(void)
|
||||
{
|
||||
WRITE_ONCE(z, 2);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
This addition of smp_mb__after_spinlock() strengthens the lock acquisition
|
||||
sufficiently to rule out the counter-intuitive outcome.
|
||||
|
||||
|
||||
Taking off the training wheels
|
||||
==============================
|
||||
|
||||
This section looks at more complex examples, including message passing,
|
||||
load buffering, release-acquire chains, and store buffering.
|
||||
Many classes of litmus tests have abbreviated names, which may be found
|
||||
here: https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test6.pdf
|
||||
|
||||
|
||||
Message passing (MP)
|
||||
--------------------
|
||||
|
||||
The MP pattern has one CPU execute a pair of stores to a pair of variables
|
||||
and another CPU execute a pair of loads from this same pair of variables,
|
||||
but in the opposite order. The goal is to avoid the counter-intuitive
|
||||
outcome in which the first load sees the value written by the second store
|
||||
but the second load does not see the value written by the first store.
|
||||
In the absence of any ordering, this goal may not be met, as can be seen
|
||||
in the MP+poonceonces.litmus litmus test. This section therefore looks at
|
||||
a number of ways of meeting this goal.
|
||||
|
||||
|
||||
Release and acquire
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Use of smp_store_release() and smp_load_acquire() is one way to force
|
||||
the desired MP ordering. The general approach is shown below:
|
||||
|
||||
/* See MP+pooncerelease+poacquireonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_store_release(&y, 1);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
r0 = smp_load_acquire(&y);
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
The smp_store_release() macro orders any prior accesses against the
|
||||
store, while the smp_load_acquire() macro orders the load against any
|
||||
subsequent accesses. Therefore, if the final value of r0 is the value 1,
|
||||
the final value of r1 must also be the value 1.
|
||||
|
||||
The init_stack_slab() function in lib/stackdepot.c uses release-acquire
|
||||
in this way to safely initialize a slab of the stack. Working out
|
||||
the mutual-exclusion design is left as an exercise for the reader.
|
||||
|
||||
|
||||
Assign and dereference
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Use of rcu_assign_pointer() and rcu_dereference() is quite similar to the
|
||||
use of smp_store_release() and smp_load_acquire(), except that both
|
||||
rcu_assign_pointer() and rcu_dereference() operate on RCU-protected
|
||||
pointers. The general approach is shown below:
|
||||
|
||||
/* See MP+onceassign+derefonce.litmus. */
|
||||
int z;
|
||||
int *y = &z;
|
||||
int x;
|
||||
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
rcu_assign_pointer(y, &x);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
rcu_read_lock();
|
||||
r0 = rcu_dereference(y);
|
||||
r1 = READ_ONCE(*r0);
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
In this example, if the final value of r0 is &x then the final value of
|
||||
r1 must be 1.
|
||||
|
||||
The rcu_assign_pointer() macro has the same ordering properties as does
|
||||
smp_store_release(), but the rcu_dereference() macro orders the load only
|
||||
against later accesses that depend on the value loaded. A dependency
|
||||
is present if the value loaded determines the address of a later access
|
||||
(address dependency, as shown above), the value written by a later store
|
||||
(data dependency), or whether or not a later store is executed in the
|
||||
first place (control dependency). Note that the term "data dependency"
|
||||
is sometimes casually used to cover both address and data dependencies.
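
As a sketch of these three kinds of dependency (gp, x and y are made-up
variables and "struct foo" is a made-up type; only accesses that depend
on the pointer returned by rcu_dereference() are ordered after it):

	struct foo *p;

	p = rcu_dereference(gp);
	r0 = READ_ONCE(p->a);		/* address dependency: p determines the address */
	WRITE_ONCE(x, r0 + 1);		/* data dependency: the loaded value feeds the store */
	if (r0 > 0)			/* control dependency: the loaded value determines */
		WRITE_ONCE(y, 1);	/*   whether this store executes at all */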
|
||||
|
||||
In lib/prime_numbers.c, the expand_to_next_prime() function invokes
|
||||
rcu_assign_pointer(), and the next_prime_number() function invokes
|
||||
rcu_dereference(). This combination mediates access to a bit vector
|
||||
that is expanded as additional primes are needed.
|
||||
|
||||
|
||||
Write and read memory barriers
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
It is usually better to use smp_store_release() instead of smp_wmb()
|
||||
and to use smp_load_acquire() instead of smp_rmb(). However, the older
|
||||
smp_wmb() and smp_rmb() APIs are still heavily used, so it is important
|
||||
to understand their use cases. The general approach is shown below:
|
||||
|
||||
/* See MP+wmbonceonce+rmbonceonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_wmb();
|
||||
WRITE_ONCE(y, 1);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
r0 = READ_ONCE(y);
|
||||
smp_rmb();
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
The smp_wmb() macro orders prior stores against later stores, and the
|
||||
smp_rmb() macro orders prior loads against later loads. Therefore, if
|
||||
the final value of r0 is 1, the final value of r1 must also be 1.
|
||||
|
||||
The xlog_state_switch_iclogs() function in fs/xfs/xfs_log.c contains
|
||||
the following write-side code fragment:
|
||||
|
||||
log->l_curr_block -= log->l_logBBsize;
|
||||
ASSERT(log->l_curr_block >= 0);
|
||||
smp_wmb();
|
||||
log->l_curr_cycle++;
|
||||
|
||||
And the xlog_valid_lsn() function in fs/xfs/xfs_log_priv.h contains
|
||||
the corresponding read-side code fragment:
|
||||
|
||||
cur_cycle = ACCESS_ONCE(log->l_curr_cycle);
|
||||
smp_rmb();
|
||||
cur_block = ACCESS_ONCE(log->l_curr_block);
|
||||
|
||||
Alternatively, consider the following comment in function
|
||||
perf_output_put_handle() in kernel/events/ring_buffer.c:
|
||||
|
||||
* kernel user
|
||||
*
|
||||
* if (LOAD ->data_tail) { LOAD ->data_head
|
||||
* (A) smp_rmb() (C)
|
||||
* STORE $data LOAD $data
|
||||
* smp_wmb() (B) smp_mb() (D)
|
||||
* STORE ->data_head STORE ->data_tail
|
||||
* }
|
||||
|
||||
The B/C pairing is an example of the MP pattern using smp_wmb() on the
|
||||
write side and smp_rmb() on the read side.
|
||||
|
||||
Of course, given that smp_mb() is strictly stronger than either smp_wmb()
|
||||
or smp_rmb(), any code fragment that would work with smp_rmb() and
|
||||
smp_wmb() would also work with smp_mb() replacing either or both of the
|
||||
weaker barriers.
|
||||
|
||||
|
||||
Load buffering (LB)
|
||||
-------------------
|
||||
|
||||
The LB pattern has one CPU load from one variable and then store to a
|
||||
second, while another CPU loads from the second variable and then stores
|
||||
to the first. The goal is to avoid the counter-intuitive situation where
|
||||
each load reads the value written by the other CPU's store. In the
|
||||
absence of any ordering it is quite possible that this may happen, as
|
||||
can be seen in the LB+poonceonces.litmus litmus test.
|
||||
|
||||
One way of avoiding the counter-intuitive outcome is through the use of a
|
||||
control dependency paired with a full memory barrier:
|
||||
|
||||
/* See LB+ctrlonceonce+mbonceonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
r0 = READ_ONCE(x);
|
||||
if (r0)
|
||||
WRITE_ONCE(y, 1);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
r1 = READ_ONCE(y);
|
||||
smp_mb();
|
||||
WRITE_ONCE(x, 1);
|
||||
}
|
||||
|
||||
This pairing of a control dependency in CPU0() with a full memory
|
||||
barrier in CPU1() prevents r0 and r1 from both ending up equal to 1.
|
||||
|
||||
The A/D pairing from the ring-buffer use case shown earlier also
|
||||
illustrates LB. Here is a repeat of the comment in
|
||||
perf_output_put_handle() in kernel/events/ring_buffer.c, showing a
|
||||
control dependency on the kernel side and a full memory barrier on
|
||||
the user side:
|
||||
|
||||
* kernel user
|
||||
*
|
||||
* if (LOAD ->data_tail) { LOAD ->data_head
|
||||
* (A) smp_rmb() (C)
|
||||
* STORE $data LOAD $data
|
||||
* smp_wmb() (B) smp_mb() (D)
|
||||
* STORE ->data_head STORE ->data_tail
|
||||
* }
|
||||
*
|
||||
* Where A pairs with D, and B pairs with C.
|
||||
|
||||
The kernel's control dependency between the load from ->data_tail
|
||||
and the store to data combined with the user's full memory barrier
|
||||
between the load from data and the store to ->data_tail prevents
|
||||
the counter-intuitive outcome where the kernel overwrites the data
|
||||
before the user gets done loading it.
|
||||
|
||||
|
||||
Release-acquire chains
|
||||
----------------------
|
||||
|
||||
Release-acquire chains are a low-overhead, flexible, and easy-to-use
|
||||
method of maintaining order. However, they do have some limitations that
|
||||
need to be fully understood. Here is an example that maintains order:
|
||||
|
||||
/* See ISA2+pooncerelease+poacquirerelease+poacquireonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_store_release(&y, 1);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
r0 = smp_load_acquire(&y);
|
||||
smp_store_release(&z, 1);
|
||||
}
|
||||
|
||||
void CPU2(void)
|
||||
{
|
||||
r1 = smp_load_acquire(&z);
|
||||
r2 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
In this case, if r0 and r1 both have final values of 1, then r2 must
|
||||
also have a final value of 1.
|
||||
|
||||
The ordering in this example is stronger than it needs to be. For
|
||||
example, ordering would still be preserved if CPU1()'s smp_load_acquire()
|
||||
invocation was replaced with READ_ONCE().
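
That weaker variant of CPU1() would look as follows (a sketch: the
smp_store_release() still orders the plain READ_ONCE() before the store
to z, which is what keeps the chain intact):

	void CPU1(void)
	{
		r0 = READ_ONCE(y);
		smp_store_release(&z, 1);
	}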
|
||||
|
||||
It is tempting to assume that CPU0()'s store to x is globally ordered
|
||||
before CPU1()'s store to z, but this is not the case:
|
||||
|
||||
/* See Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_store_release(&y, 1);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
r0 = smp_load_acquire(&y);
|
||||
smp_store_release(&z, 1);
|
||||
}
|
||||
|
||||
void CPU2(void)
|
||||
{
|
||||
WRITE_ONCE(z, 2);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
One might hope that if the final value of r0 is 1 and the final value
|
||||
of z is 2, then the final value of r1 must also be 1, but it really is
|
||||
possible for r1 to have the final value of 0. The reason, of course,
|
||||
is that in this version, CPU2() is not part of the release-acquire chain.
|
||||
This situation is accounted for in the rules of thumb below.
|
||||
|
||||
Despite this limitation, release-acquire chains are low-overhead as
|
||||
well as simple and powerful, at least as memory-ordering mechanisms go.
|
||||
|
||||
|
||||
Store buffering
|
||||
---------------
|
||||
|
||||
Store buffering can be thought of as upside-down load buffering, so
|
||||
that one CPU first stores to one variable and then loads from a second,
|
||||
while another CPU stores to the second variable and then loads from the
|
||||
first. Preserving order requires nothing less than full barriers:
|
||||
|
||||
/* See SB+mbonceonces.litmus. */
|
||||
void CPU0(void)
|
||||
{
|
||||
WRITE_ONCE(x, 1);
|
||||
smp_mb();
|
||||
r0 = READ_ONCE(y);
|
||||
}
|
||||
|
||||
void CPU1(void)
|
||||
{
|
||||
WRITE_ONCE(y, 1);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(x);
|
||||
}
|
||||
|
||||
Omitting either smp_mb() will allow both r0 and r1 to have final
|
||||
values of 0, but providing both full barriers as shown above prevents
|
||||
this counter-intuitive outcome.
|
||||
|
||||
This pattern most famously appears as part of Dekker's locking
|
||||
algorithm, but it has a much more practical use within the Linux kernel
|
||||
of ordering wakeups. The following comment taken from waitqueue_active()
|
||||
in include/linux/wait.h shows the canonical pattern:
|
||||
|
||||
* CPU0 - waker CPU1 - waiter
|
||||
*
|
||||
* for (;;) {
|
||||
* @cond = true; prepare_to_wait(&wq_head, &wait, state);
|
||||
* smp_mb(); // smp_mb() from set_current_state()
|
||||
* if (waitqueue_active(wq_head)) if (@cond)
|
||||
* wake_up(wq_head); break;
|
||||
* schedule();
|
||||
* }
|
||||
* finish_wait(&wq_head, &wait);
|
||||
|
||||
On CPU0, the store is to @cond and the load is in waitqueue_active().
|
||||
On CPU1, prepare_to_wait() contains both a store to wq_head and a call
|
||||
to set_current_state(), which contains an smp_mb() barrier; the load is
|
||||
"if (@cond)". The full barriers prevent the undesirable outcome where
|
||||
CPU1 puts the waiting task to sleep and CPU0 fails to wake it up.
|
||||
|
||||
Note that use of locking can greatly simplify this pattern.
|
||||
|
||||
|
||||
Rules of thumb
|
||||
==============
|
||||
|
||||
There might seem to be no pattern governing what ordering primitives are
|
||||
needed in which situations, but this is not the case. There is a pattern
|
||||
based on the relation between the accesses linking successive CPUs in a
|
||||
given litmus test. There are three types of linkage:
|
||||
|
||||
1. Write-to-read, where the next CPU reads the value that the
|
||||
previous CPU wrote. The LB litmus-test patterns contain only
|
||||
this type of relation. In formal memory-modeling texts, this
|
||||
relation is called "reads-from" and is usually abbreviated "rf".
|
||||
|
||||
2. Read-to-write, where the next CPU overwrites the value that the
|
||||
previous CPU read. The SB litmus test contains only this type
|
||||
of relation. In formal memory-modeling texts, this relation is
|
||||
often called "from-reads" and is sometimes abbreviated "fr".
|
||||
|
||||
3. Write-to-write, where the next CPU overwrites the value written
|
||||
by the previous CPU. The Z6.0 litmus test pattern contains a
|
||||
write-to-write relation between the last access of CPU1() and
|
||||
the first access of CPU2(). In formal memory-modeling texts,
|
||||
this relation is often called "coherence order" and is sometimes
|
||||
abbreviated "co". In the C++ standard, it is instead called
|
||||
"modification order" and often abbreviated "mo".
|
||||
|
||||
The strength of memory ordering required for a given litmus test to
|
||||
avoid a counter-intuitive outcome depends on the types of relations
|
||||
linking the memory accesses for the outcome in question:
|
||||
|
||||
o If all links are write-to-read links, then the weakest
|
||||
possible ordering within each CPU suffices. For example, in
|
||||
the LB litmus test, a control dependency was enough to do the
|
||||
job.
|
||||
|
||||
o If all but one of the links are write-to-read links, then a
|
||||
release-acquire chain suffices. Both the MP and the ISA2
|
||||
litmus tests illustrate this case.
|
||||
|
||||
o If more than one of the links are something other than
|
||||
write-to-read links, then a full memory barrier is required
|
||||
between each successive pair of non-write-to-read links. This
|
||||
case is illustrated by the Z6.0 litmus tests, both in the
|
||||
locking and in the release-acquire sections.
|
||||
|
||||
However, if you find yourself having to stretch these rules of thumb
|
||||
to fit your situation, you should consider creating a litmus test and
|
||||
running it on the model.
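
For reference, a litmus test for the MP pattern discussed earlier looks
roughly as follows (see MP+pooncerelease+poacquireonce.litmus in the
litmus-tests directory for the real thing), and can be run from the
tools/memory-model directory with a command along the lines of
"herd7 -conf linux-kernel.cfg litmus-tests/MP+pooncerelease+poacquireonce.litmus":

	C MP+pooncerelease+poacquireonce

	{}

	P0(int *x, int *y)
	{
		WRITE_ONCE(*x, 1);
		smp_store_release(y, 1);
	}

	P1(int *x, int *y)
	{
		int r0;
		int r1;

		r0 = smp_load_acquire(y);
		r1 = READ_ONCE(*x);
	}

	exists (1:r0=1 /\ 1:r1=0)

Here herd7 reports whether the "exists" clause, which describes the
counter-intuitive MP outcome, can ever be satisfied under the model.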
|
|
@ -0,0 +1,107 @@
|
|||
This document provides background reading for memory models and related
|
||||
tools. These documents are aimed at kernel hackers who are interested
|
||||
in memory models.
|
||||
|
||||
|
||||
Hardware manuals and models
|
||||
===========================
|
||||
|
||||
o SPARC International Inc. (Ed.). 1994. "The SPARC Architecture
|
||||
Reference Manual Version 9". SPARC International Inc.
|
||||
|
||||
o Compaq Computer Corporation (Ed.). 2002. "Alpha Architecture
|
||||
Reference Manual". Compaq Computer Corporation.
|
||||
|
||||
o Intel Corporation (Ed.). 2002. "A Formal Specification of Intel
|
||||
Itanium Processor Family Memory Ordering". Intel Corporation.
|
||||
|
||||
o Intel Corporation (Ed.). 2002. "Intel 64 and IA-32 Architectures
|
||||
Software Developer’s Manual". Intel Corporation.
|
||||
|
||||
o Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli,
|
||||
and Magnus O. Myreen. 2010. "x86-TSO: A Rigorous and Usable
|
||||
Programmer's Model for x86 Multiprocessors". Commun. ACM 53, 7
|
||||
(July, 2010), 89-97. http://doi.acm.org/10.1145/1785414.1785443
|
||||
|
||||
o IBM Corporation (Ed.). 2009. "Power ISA Version 2.06". IBM
|
||||
Corporation.
|
||||
|
||||
o ARM Ltd. (Ed.). 2009. "ARM Barrier Litmus Tests and Cookbook".
|
||||
ARM Ltd.
|
||||
|
||||
o Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, and
|
||||
Derek Williams. 2011. "Understanding POWER Multiprocessors". In
|
||||
Proceedings of the 32Nd ACM SIGPLAN Conference on Programming
|
||||
Language Design and Implementation (PLDI ’11). ACM, New York,
|
||||
NY, USA, 175–186.
|
||||
|
||||
o Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty,
|
||||
Peter Sewell, Luc Maranget, Jade Alglave, and Derek Williams.
|
||||
2012. "Synchronising C/C++ and POWER". In Proceedings of the 33rd
|
||||
ACM SIGPLAN Conference on Programming Language Design and
|
||||
Implementation (PLDI '12). ACM, New York, NY, USA, 311-322.
|
||||
|
||||
o ARM Ltd. (Ed.). 2014. "ARM Architecture Reference Manual (ARMv8,
|
||||
for ARMv8-A architecture profile)". ARM Ltd.
|
||||
|
||||
o Imagination Technologies, LTD. 2015. "MIPS(R) Architecture
|
||||
For Programmers, Volume II-A: The MIPS64(R) Instruction,
|
||||
Set Reference Manual". Imagination Technologies,
|
||||
LTD. https://imgtec.com/?do-download=4302.
|
||||
|
||||
o Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit
|
||||
Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter
|
||||
Sewell. 2016. "Modelling the ARMv8 Architecture, Operationally:
|
||||
Concurrency and ISA". In Proceedings of the 43rd Annual ACM
|
||||
SIGPLAN-SIGACT Symposium on Principles of Programming Languages
|
||||
(POPL ’16). ACM, New York, NY, USA, 608–621.
|
||||
|
||||
o Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan Nienhuis,
|
||||
Luc Maranget, Kathryn E. Gray, Ali Sezgin, Mark Batty, and Peter
|
||||
Sewell. 2017. "Mixed-size Concurrency: ARM, POWER, C/C++11,
|
||||
and SC". In Proceedings of the 44th ACM SIGPLAN Symposium on
|
||||
Principles of Programming Languages (POPL 2017). ACM, New York,
|
||||
NY, USA, 429–442.
|
||||
|
||||
|
||||
Linux-kernel memory model
|
||||
=========================
|
||||
|
||||
o Andrea Parri, Alan Stern, Luc Maranget, Paul E. McKenney,
|
||||
and Jade Alglave. 2017. "A formal model of
|
||||
Linux-kernel memory ordering - companion webpage".
|
||||
http://moscova.inria.fr/~maranget/cats7/linux/. (2017). [Online;
|
||||
accessed 30-January-2017].
|
||||
|
||||
o Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, and
|
||||
Alan Stern. 2017. "A formal kernel memory-ordering model (part 1)"
|
||||
Linux Weekly News. https://lwn.net/Articles/718628/
|
||||
|
||||
o Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, and
|
||||
Alan Stern. 2017. "A formal kernel memory-ordering model (part 2)"
|
||||
Linux Weekly News. https://lwn.net/Articles/720550/
|
||||
|
||||
|
||||
Memory-model tooling
|
||||
====================
|
||||
|
||||
o Daniel Jackson. 2002. "Alloy: A Lightweight Object Modelling
|
||||
Notation". ACM Trans. Softw. Eng. Methodol. 11, 2 (April 2002),
|
||||
256–290. http://doi.acm.org/10.1145/505145.505149
|
||||
|
||||
o Jade Alglave, Luc Maranget, and Michael Tautschnig. 2014. "Herding
|
||||
Cats: Modelling, Simulation, Testing, and Data Mining for Weak
|
||||
Memory". ACM Trans. Program. Lang. Syst. 36, 2, Article 7 (July
|
||||
2014), 7:1–7:74 pages.
|
||||
|
||||
o Jade Alglave, Patrick Cousot, and Luc Maranget. 2016. "Syntax and
|
||||
semantics of the weak consistency model specification language
|
||||
cat". CoRR abs/1608.07531 (2016). http://arxiv.org/abs/1608.07531
|
||||
|
||||
|
||||
Memory-model comparisons
|
||||
========================
|
||||
|
||||
o Paul E. McKenney, Ulrich Weigand, Andrea Parri, and Boqun
|
||||
Feng. 2016. "Linux-Kernel Memory Model". (6 June 2016).
|
||||
http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0124r2.html.
|
|
+++ b/tools/memory-model/README
@@ -0,0 +1,206 @@
|
|||
=====================================
|
||||
LINUX KERNEL MEMORY CONSISTENCY MODEL
|
||||
=====================================
|
||||
|
||||
============
|
||||
INTRODUCTION
|
||||
============
|
||||
|
||||
This directory contains the memory consistency model (memory model, for
|
||||
short) of the Linux kernel, written in the "cat" language and executable
|
||||
by the externally provided "herd7" simulator, which exhaustively explores
|
||||
the state space of small litmus tests.
|
||||
|
||||
In addition, the "klitmus7" tool (also externally provided) may be used
|
||||
to convert a litmus test to a Linux kernel module, which in turn allows
|
||||
that litmus test to be exercised within the Linux kernel.
|
||||
|
||||
|
||||
============
|
||||
REQUIREMENTS
|
||||
============
|
||||
|
||||
Version 7.48 of the "herd7" and "klitmus7" tools must be downloaded
|
||||
separately:
|
||||
|
||||
https://github.com/herd/herdtools7
|
||||
|
||||
See "herdtools7/INSTALL.md" for installation instructions.
|
||||
|
||||
|
||||
==================
|
||||
BASIC USAGE: HERD7
|
||||
==================
|
||||
|
||||
The memory model is used, in conjunction with "herd7", to exhaustively
|
||||
explore the state space of small litmus tests.
|
||||
|
||||
For example, to run SB+mbonceonces.litmus against the memory model:

  $ herd7 -conf linux-kernel.cfg litmus-tests/SB+mbonceonces.litmus

Here is the corresponding output:

  Test SB+mbonceonces Allowed
  States 3
  0:r0=0; 1:r0=1;
  0:r0=1; 1:r0=0;
  0:r0=1; 1:r0=1;
  No
  Witnesses
  Positive: 0 Negative: 3
  Condition exists (0:r0=0 /\ 1:r0=0)
  Observation SB+mbonceonces Never 0 3
  Time SB+mbonceonces 0.01
  Hash=d66d99523e2cac6b06e66f4c995ebb48
|
||||
|
||||
The "Positive: 0 Negative: 3" and the "Never 0 3" each indicate that
|
||||
this litmus test's "exists" clause can not be satisfied.
|
||||
|
||||
See "herd7 -help" or "herdtools7/doc/" for more information.
|
||||
|
||||
|
||||
=====================
|
||||
BASIC USAGE: KLITMUS7
|
||||
=====================
|
||||
|
||||
The "klitmus7" tool converts a litmus test into a Linux kernel module,
|
||||
which may then be loaded and run.
|
||||
|
||||
For example, to run SB+mbonceonces.litmus against hardware:

  $ mkdir mymodules
  $ klitmus7 -o mymodules litmus-tests/SB+mbonceonces.litmus
  $ cd mymodules ; make
  $ sudo sh run.sh

The corresponding output includes:

  Test SB+mbonceonces Allowed
  Histogram (3 states)
  644580  :>0:r0=1; 1:r0=0;
  644328  :>0:r0=0; 1:r0=1;
  711092  :>0:r0=1; 1:r0=1;
  No
  Witnesses
  Positive: 0, Negative: 2000000
  Condition exists (0:r0=0 /\ 1:r0=0) is NOT validated
  Hash=d66d99523e2cac6b06e66f4c995ebb48
  Observation SB+mbonceonces Never 0 2000000
  Time SB+mbonceonces 0.16

The "Positive: 0, Negative: 2000000" and the "Never 0 2000000" indicate
that during two million trials, the state specified in this litmus
test's "exists" clause was not reached.
|
||||
|
||||
And, as with "herd7", please see "klitmus7 -help" or "herdtools7/doc/"
|
||||
for more information.
|
||||
|
||||
|
||||
====================
|
||||
DESCRIPTION OF FILES
|
||||
====================
|
||||
|
||||
Documentation/cheatsheet.txt
|
||||
Quick-reference guide to the Linux-kernel memory model.
|
||||
|
||||
Documentation/explanation.txt
|
||||
Describes the memory model in detail.
|
||||
|
||||
Documentation/recipes.txt
|
||||
Lists common memory-ordering patterns.
|
||||
|
||||
Documentation/references.txt
|
||||
Provides background reading.
|
||||
|
||||
linux-kernel.bell
|
||||
Categorizes the relevant instructions, including memory
|
||||
references, memory barriers, atomic read-modify-write operations,
|
||||
lock acquisition/release, and RCU operations.
|
||||
|
||||
More formally, this file (1) lists the subtypes of the various
|
||||
event types used by the memory model and (2) performs RCU
|
||||
read-side critical section nesting analysis.
|
||||
|
||||
linux-kernel.cat
|
||||
Specifies what reorderings are forbidden by memory references,
|
||||
memory barriers, atomic read-modify-write operations, and RCU.
|
||||
|
||||
More formally, this file specifies what executions are forbidden
|
||||
by the memory model. Allowed executions are those which
|
||||
satisfy the model's "coherence", "atomic", "happens-before",
|
||||
"propagation", and "rcu" axioms, which are defined in the file.
|
||||
|
||||
linux-kernel.cfg
|
||||
Convenience file that gathers the common-case herd7 command-line
|
||||
arguments.
|
||||
|
||||
linux-kernel.def
|
||||
Maps from C-like syntax to herd7's internal litmus-test
|
||||
instruction-set architecture.
|
||||
|
||||
litmus-tests
|
||||
Directory containing a few representative litmus tests, which
|
||||
are listed in litmus-tests/README.  A great many more litmus
|
||||
tests are available at https://github.com/paulmckrcu/litmus.
|
||||
|
||||
lock.cat
|
||||
Provides a front-end analysis of lock acquisition and release,
|
||||
for example, associating a lock acquisition with the preceding
|
||||
and following releases and checking for self-deadlock.
|
||||
|
||||
More formally, this file defines a performance-enhanced scheme
|
||||
for generation of the possible reads-from and coherence order
|
||||
relations on the locking primitives.
|
||||
|
||||
README
|
||||
This file.
|
||||
|
||||
|
||||
===========
|
||||
LIMITATIONS
|
||||
===========
|
||||
|
||||
The Linux-kernel memory model has the following limitations:
|
||||
|
||||
1. Compiler optimizations are not modeled. Of course, the use
|
||||
of READ_ONCE() and WRITE_ONCE() limits the compiler's ability
|
||||
to optimize, but there is Linux-kernel code that uses bare C
|
||||
memory accesses. Handling this code is on the to-do list.
|
||||
For more information, see Documentation/explanation.txt (in
|
||||
particular, the "THE PROGRAM ORDER RELATION: po AND po-loc"
|
||||
and "A WARNING" sections).
|
||||
|
||||
2. Multiple access sizes for a single variable are not supported,
|
||||
and neither are misaligned or partially overlapping accesses.
|
||||
|
||||
3. Exceptions and interrupts are not modeled.  In some cases,
   this limitation can be overcome by modeling the interrupt or
   exception with an additional process (see the sketch just after
   this list).

4. I/O such as MMIO or DMA is not supported.

5. Self-modifying code (such as that found in the kernel's
   alternatives mechanism, function tracer, Berkeley Packet Filter
   JIT compiler, and module loader) is not supported.

6. Complete modeling of all variants of atomic read-modify-write
   operations, locking primitives, and RCU is not provided.
   For example, call_rcu() and rcu_barrier() are not supported.
   However, a substantial amount of support is provided for these
   operations, as shown in the linux-kernel.def file.
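Here is a hedged sketch of the workaround mentioned in item 3 above
(the test name and variable names are the editor's invention, not a file
in this series): P1() stands in for an interrupt handler that observes a
flag and then reads the data published before that flag, so ordinary
message-passing reasoning applies and the stale-data outcome is
forbidden by the model:

	C MP+irqhandler-sketch

	(*
	 * Illustrative only: the "interrupt handler" is modeled as the
	 * ordinary process P1(), which acquires the flag and then reads
	 * the data that was published before the flag was released.
	 *)

	{}

	P0(int *flag, int *data)
	{
		WRITE_ONCE(*data, 1);
		smp_store_release(flag, 1);
	}

	P1(int *flag, int *data)
	{
		int r0;
		int r1;

		r0 = smp_load_acquire(flag);
		r1 = READ_ONCE(*data);
	}

	exists (1:r0=1 /\ 1:r1=0)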
|
||||
|
||||
The "herd7" tool has some additional limitations of its own, apart from
|
||||
the memory model:
|
||||
|
||||
1. Non-trivial data structures such as arrays or structures are
|
||||
not supported. However, pointers are supported, allowing trivial
|
||||
linked lists to be constructed.
|
||||
|
||||
2. Dynamic memory allocation is not supported, although this can
|
||||
be worked around in some cases by supplying multiple statically
|
||||
allocated variables.
|
||||
|
||||
Some of these limitations may be overcome in the future, but others are
|
||||
more likely to be addressed by incorporating the Linux-kernel memory model
|
||||
into other tools.
|
|
+++ b/tools/memory-model/linux-kernel.bell
@@ -0,0 +1,52 @@
|
|||
// SPDX-License-Identifier: GPL-2.0+
|
||||
(*
|
||||
* Copyright (C) 2015 Jade Alglave <j.alglave@ucl.ac.uk>,
|
||||
* Copyright (C) 2016 Luc Maranget <luc.maranget@inria.fr> for Inria
|
||||
* Copyright (C) 2017 Alan Stern <stern@rowland.harvard.edu>,
|
||||
* Andrea Parri <parri.andrea@gmail.com>
|
||||
*
|
||||
* An earlier version of this file appears in the companion webpage for
|
||||
* "Frightening small children and disconcerting grown-ups: Concurrency
|
||||
* in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
|
||||
* which is to appear in ASPLOS 2018.
|
||||
*)
|
||||
|
||||
"Linux-kernel memory consistency model"
|
||||
|
||||
enum Accesses = 'once (*READ_ONCE,WRITE_ONCE,ACCESS_ONCE*) ||
|
||||
'release (*smp_store_release*) ||
|
||||
'acquire (*smp_load_acquire*) ||
|
||||
'noreturn (* R of non-return RMW *)
|
||||
instructions R[{'once,'acquire,'noreturn}]
|
||||
instructions W[{'once,'release}]
|
||||
instructions RMW[{'once,'acquire,'release}]
|
||||
|
||||
enum Barriers = 'wmb (*smp_wmb*) ||
|
||||
'rmb (*smp_rmb*) ||
|
||||
'mb (*smp_mb*) ||
|
||||
'rcu-lock (*rcu_read_lock*) ||
|
||||
'rcu-unlock (*rcu_read_unlock*) ||
|
||||
'sync-rcu (*synchronize_rcu*) ||
|
||||
'before-atomic (*smp_mb__before_atomic*) ||
|
||||
'after-atomic (*smp_mb__after_atomic*) ||
|
||||
'after-spinlock (*smp_mb__after_spinlock*)
|
||||
instructions F[Barriers]
|
||||
|
||||
(* Compute matching pairs of nested Rcu-lock and Rcu-unlock *)
|
||||
let matched = let rec
|
||||
unmatched-locks = Rcu-lock \ domain(matched)
|
||||
and unmatched-unlocks = Rcu-unlock \ range(matched)
|
||||
and unmatched = unmatched-locks | unmatched-unlocks
|
||||
and unmatched-po = [unmatched] ; po ; [unmatched]
|
||||
and unmatched-locks-to-unlocks =
|
||||
[unmatched-locks] ; po ; [unmatched-unlocks]
|
||||
and matched = matched | (unmatched-locks-to-unlocks \
|
||||
(unmatched-po ; unmatched-po))
|
||||
in matched
|
||||
|
||||
(* Validate nesting *)
|
||||
flag ~empty Rcu-lock \ domain(matched) as unbalanced-rcu-locking
|
||||
flag ~empty Rcu-unlock \ range(matched) as unbalanced-rcu-locking
|
||||
|
||||
(* Outermost level of nesting only *)
|
||||
let crit = matched \ (po^-1 ; matched ; po^-1)
|
|
+++ b/tools/memory-model/linux-kernel.cat
@@ -0,0 +1,121 @@
|
|||
// SPDX-License-Identifier: GPL-2.0+
|
||||
(*
|
||||
* Copyright (C) 2015 Jade Alglave <j.alglave@ucl.ac.uk>,
|
||||
* Copyright (C) 2016 Luc Maranget <luc.maranget@inria.fr> for Inria
|
||||
* Copyright (C) 2017 Alan Stern <stern@rowland.harvard.edu>,
|
||||
* Andrea Parri <parri.andrea@gmail.com>
|
||||
*
|
||||
* An earlier version of this file appears in the companion webpage for
|
||||
* "Frightening small children and disconcerting grown-ups: Concurrency
|
||||
* in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
|
||||
* which is to appear in ASPLOS 2018.
|
||||
*)
|
||||
|
||||
"Linux-kernel memory consistency model"
|
||||
|
||||
(*
|
||||
* File "lock.cat" handles locks and is experimental.
|
||||
* It can be replaced by include "cos.cat" for tests that do not use locks.
|
||||
*)
|
||||
|
||||
include "lock.cat"
|
||||
|
||||
(*******************)
|
||||
(* Basic relations *)
|
||||
(*******************)
|
||||
|
||||
(* Fences *)
|
||||
let rmb = [R \ Noreturn] ; fencerel(Rmb) ; [R \ Noreturn]
|
||||
let wmb = [W] ; fencerel(Wmb) ; [W]
|
||||
let mb = ([M] ; fencerel(Mb) ; [M]) |
|
||||
([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
|
||||
([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
|
||||
([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M])
|
||||
let gp = po ; [Sync-rcu] ; po?
|
||||
|
||||
let strong-fence = mb | gp
|
||||
|
||||
(* Release Acquire *)
|
||||
let acq-po = [Acquire] ; po ; [M]
|
||||
let po-rel = [M] ; po ; [Release]
|
||||
let rfi-rel-acq = [Release] ; rfi ; [Acquire]
|
||||
|
||||
(**********************************)
|
||||
(* Fundamental coherence ordering *)
|
||||
(**********************************)
|
||||
|
||||
(* Sequential Consistency Per Variable *)
|
||||
let com = rf | co | fr
|
||||
acyclic po-loc | com as coherence
|
||||
|
||||
(* Atomic Read-Modify-Write *)
|
||||
empty rmw & (fre ; coe) as atomic
|
||||
|
||||
(**********************************)
|
||||
(* Instruction execution ordering *)
|
||||
(**********************************)
|
||||
|
||||
(* Preserved Program Order *)
|
||||
let dep = addr | data
|
||||
let rwdep = (dep | ctrl) ; [W]
|
||||
let overwrite = co | fr
|
||||
let to-w = rwdep | (overwrite & int)
|
||||
let to-r = addr | (dep ; rfi) | rfi-rel-acq
|
||||
let fence = strong-fence | wmb | po-rel | rmb | acq-po
|
||||
let ppo = to-r | to-w | fence
|
||||
|
||||
(* Propagation: Ordering from release operations and strong fences. *)
|
||||
let A-cumul(r) = rfe? ; r
|
||||
let cumul-fence = A-cumul(strong-fence | po-rel) | wmb
|
||||
let prop = (overwrite & ext)? ; cumul-fence* ; rfe?
|
||||
|
||||
(*
|
||||
* Happens Before: Ordering from the passage of time.
|
||||
* No fences needed here for prop because relation confined to one process.
|
||||
*)
|
||||
let hb = ppo | rfe | ((prop \ id) & int)
|
||||
acyclic hb as happens-before
|
||||
|
||||
(****************************************)
|
||||
(* Write and fence propagation ordering *)
|
||||
(****************************************)
|
||||
|
||||
(* Propagation: Each non-rf link needs a strong fence. *)
|
||||
let pb = prop ; strong-fence ; hb*
|
||||
acyclic pb as propagation
|
||||
|
||||
(*******)
|
||||
(* RCU *)
|
||||
(*******)
|
||||
|
||||
(*
|
||||
* Effect of read-side critical section proceeds from the rcu_read_lock()
|
||||
* onward on the one hand and from the rcu_read_unlock() backwards on the
|
||||
* other hand.
|
||||
*)
|
||||
let rscs = po ; crit^-1 ; po?
|
||||
|
||||
(*
|
||||
* The synchronize_rcu() strong fence is special in that it can order not
|
||||
* one but two non-rf relations, but only in conjunction with an RCU
|
||||
* read-side critical section.
|
||||
*)
|
||||
let link = hb* ; pb* ; prop
|
||||
|
||||
(* Chains that affect the RCU grace-period guarantee *)
|
||||
let gp-link = gp ; link
|
||||
let rscs-link = rscs ; link
|
||||
|
||||
(*
|
||||
* A cycle containing at least as many grace periods as RCU read-side
|
||||
* critical sections is forbidden.
|
||||
*)
|
||||
let rec rcu-path =
|
||||
gp-link |
|
||||
(gp-link ; rscs-link) |
|
||||
(rscs-link ; gp-link) |
|
||||
(rcu-path ; rcu-path) |
|
||||
(gp-link ; rcu-path ; rscs-link) |
|
||||
(rscs-link ; rcu-path ; gp-link)
|
||||
|
||||
irreflexive rcu-path as rcu
|
|
+++ b/tools/memory-model/linux-kernel.cfg
@@ -0,0 +1,21 @@
|
|||
macros linux-kernel.def
|
||||
bell linux-kernel.bell
|
||||
model linux-kernel.cat
|
||||
graph columns
|
||||
squished true
|
||||
showevents noregs
|
||||
movelabel true
|
||||
fontsize 8
|
||||
xscale 2.0
|
||||
yscale 1.5
|
||||
arrowsize 0.8
|
||||
showinitrf false
|
||||
showfinalrf false
|
||||
showinitwrites false
|
||||
splines spline
|
||||
pad 0.1
|
||||
edgeattr hb,color,indigo
|
||||
edgeattr co,color,blue
|
||||
edgeattr mb,color,darkgreen
|
||||
edgeattr wmb,color,darkgreen
|
||||
edgeattr rmb,color,darkgreen
|
|
+++ b/tools/memory-model/linux-kernel.def
@@ -0,0 +1,106 @@
|
|||
// SPDX-License-Identifier: GPL-2.0+
|
||||
//
|
||||
// An earlier version of this file appears in the companion webpage for
|
||||
// "Frightening small children and disconcerting grown-ups: Concurrency
|
||||
// in the Linux kernel" by Alglave, Maranget, McKenney, Parri, and Stern,
|
||||
// which is to appear in ASPLOS 2018.
|
||||
|
||||
// ONCE
|
||||
READ_ONCE(X) __load{once}(X)
|
||||
WRITE_ONCE(X,V) { __store{once}(X,V); }
|
||||
|
||||
// Release Acquire and friends
|
||||
smp_store_release(X,V) { __store{release}(*X,V); }
|
||||
smp_load_acquire(X) __load{acquire}(*X)
|
||||
rcu_assign_pointer(X,V) { __store{release}(X,V); }
|
||||
rcu_dereference(X) __load{once}(X)
|
||||
|
||||
// Fences
|
||||
smp_mb() { __fence{mb} ; }
|
||||
smp_rmb() { __fence{rmb} ; }
|
||||
smp_wmb() { __fence{wmb} ; }
|
||||
smp_mb__before_atomic() { __fence{before-atomic} ; }
|
||||
smp_mb__after_atomic() { __fence{after-atomic} ; }
|
||||
smp_mb__after_spinlock() { __fence{after-spinlock} ; }
|
||||
|
||||
// Exchange
|
||||
xchg(X,V) __xchg{mb}(X,V)
|
||||
xchg_relaxed(X,V) __xchg{once}(X,V)
|
||||
xchg_release(X,V) __xchg{release}(X,V)
|
||||
xchg_acquire(X,V) __xchg{acquire}(X,V)
|
||||
cmpxchg(X,V,W) __cmpxchg{mb}(X,V,W)
|
||||
cmpxchg_relaxed(X,V,W) __cmpxchg{once}(X,V,W)
|
||||
cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
|
||||
cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
|
||||
|
||||
// Spinlocks
|
||||
spin_lock(X) { __lock(X) ; }
|
||||
spin_unlock(X) { __unlock(X) ; }
|
||||
spin_trylock(X) __trylock(X)
|
||||
|
||||
// RCU
|
||||
rcu_read_lock() { __fence{rcu-lock}; }
|
||||
rcu_read_unlock() { __fence{rcu-unlock};}
|
||||
synchronize_rcu() { __fence{sync-rcu}; }
|
||||
synchronize_rcu_expedited() { __fence{sync-rcu}; }
|
||||
|
||||
// Atomic
|
||||
atomic_read(X) READ_ONCE(*X)
|
||||
atomic_set(X,V) { WRITE_ONCE(*X,V) ; }
|
||||
atomic_read_acquire(X) smp_load_acquire(X)
|
||||
atomic_set_release(X,V) { smp_store_release(X,V); }
|
||||
|
||||
atomic_add(V,X) { __atomic_op(X,+,V) ; }
|
||||
atomic_sub(V,X) { __atomic_op(X,-,V) ; }
|
||||
atomic_inc(X) { __atomic_op(X,+,1) ; }
|
||||
atomic_dec(X) { __atomic_op(X,-,1) ; }
|
||||
|
||||
atomic_add_return(V,X) __atomic_op_return{mb}(X,+,V)
|
||||
atomic_add_return_relaxed(V,X) __atomic_op_return{once}(X,+,V)
|
||||
atomic_add_return_acquire(V,X) __atomic_op_return{acquire}(X,+,V)
|
||||
atomic_add_return_release(V,X) __atomic_op_return{release}(X,+,V)
|
||||
atomic_fetch_add(V,X) __atomic_fetch_op{mb}(X,+,V)
|
||||
atomic_fetch_add_relaxed(V,X) __atomic_fetch_op{once}(X,+,V)
|
||||
atomic_fetch_add_acquire(V,X) __atomic_fetch_op{acquire}(X,+,V)
|
||||
atomic_fetch_add_release(V,X) __atomic_fetch_op{release}(X,+,V)
|
||||
|
||||
atomic_inc_return(X) __atomic_op_return{mb}(X,+,1)
|
||||
atomic_inc_return_relaxed(X) __atomic_op_return{once}(X,+,1)
|
||||
atomic_inc_return_acquire(X) __atomic_op_return{acquire}(X,+,1)
|
||||
atomic_inc_return_release(X) __atomic_op_return{release}(X,+,1)
|
||||
atomic_fetch_inc(X) __atomic_fetch_op{mb}(X,+,1)
|
||||
atomic_fetch_inc_relaxed(X) __atomic_fetch_op{once}(X,+,1)
|
||||
atomic_fetch_inc_acquire(X) __atomic_fetch_op{acquire}(X,+,1)
|
||||
atomic_fetch_inc_release(X) __atomic_fetch_op{release}(X,+,1)
|
||||
|
||||
atomic_sub_return(V,X) __atomic_op_return{mb}(X,-,V)
|
||||
atomic_sub_return_relaxed(V,X) __atomic_op_return{once}(X,-,V)
|
||||
atomic_sub_return_acquire(V,X) __atomic_op_return{acquire}(X,-,V)
|
||||
atomic_sub_return_release(V,X) __atomic_op_return{release}(X,-,V)
|
||||
atomic_fetch_sub(V,X) __atomic_fetch_op{mb}(X,-,V)
|
||||
atomic_fetch_sub_relaxed(V,X) __atomic_fetch_op{once}(X,-,V)
|
||||
atomic_fetch_sub_acquire(V,X) __atomic_fetch_op{acquire}(X,-,V)
|
||||
atomic_fetch_sub_release(V,X) __atomic_fetch_op{release}(X,-,V)
|
||||
|
||||
atomic_dec_return(X) __atomic_op_return{mb}(X,-,1)
|
||||
atomic_dec_return_relaxed(X) __atomic_op_return{once}(X,-,1)
|
||||
atomic_dec_return_acquire(X) __atomic_op_return{acquire}(X,-,1)
|
||||
atomic_dec_return_release(X) __atomic_op_return{release}(X,-,1)
|
||||
atomic_fetch_dec(X) __atomic_fetch_op{mb}(X,-,1)
|
||||
atomic_fetch_dec_relaxed(X) __atomic_fetch_op{once}(X,-,1)
|
||||
atomic_fetch_dec_acquire(X) __atomic_fetch_op{acquire}(X,-,1)
|
||||
atomic_fetch_dec_release(X) __atomic_fetch_op{release}(X,-,1)
|
||||
|
||||
atomic_xchg(X,V) __xchg{mb}(X,V)
|
||||
atomic_xchg_relaxed(X,V) __xchg{once}(X,V)
|
||||
atomic_xchg_release(X,V) __xchg{release}(X,V)
|
||||
atomic_xchg_acquire(X,V) __xchg{acquire}(X,V)
|
||||
atomic_cmpxchg(X,V,W) __cmpxchg{mb}(X,V,W)
|
||||
atomic_cmpxchg_relaxed(X,V,W) __cmpxchg{once}(X,V,W)
|
||||
atomic_cmpxchg_acquire(X,V,W) __cmpxchg{acquire}(X,V,W)
|
||||
atomic_cmpxchg_release(X,V,W) __cmpxchg{release}(X,V,W)
|
||||
|
||||
atomic_sub_and_test(V,X) __atomic_op_return{mb}(X,-,V) == 0
|
||||
atomic_dec_and_test(X) __atomic_op_return{mb}(X,-,1) == 0
|
||||
atomic_inc_and_test(X) __atomic_op_return{mb}(X,+,1) == 0
|
||||
atomic_add_negative(V,X) __atomic_op_return{mb}(X,+,V) < 0
|
|
+++ b/tools/memory-model/litmus-tests/CoRR+poonceonce+Once.litmus
@@ -0,0 +1,26 @@
|
|||
C CoRR+poonceonce+Once
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Test of read-read coherence, that is, whether or not two successive
|
||||
* reads from the same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
P1(int *x)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/CoRW+poonceonce+Once.litmus
@@ -0,0 +1,25 @@
|
|||
C CoRW+poonceonce+Once
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Test of read-write coherence, that is, whether or not a read from
|
||||
* a given variable and a later write to that same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
P1(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 2);
|
||||
}
|
||||
|
||||
exists (x=2 /\ 0:r0=2)
|
|
+++ b/tools/memory-model/litmus-tests/CoWR+poonceonce+Once.litmus
@@ -0,0 +1,25 @@
|
|||
C CoWR+poonceonce+Once
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Test of write-read coherence, that is, whether or not a write to a
|
||||
* given variable and a later read from that same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*x, 1);
|
||||
r0 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
P1(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 2);
|
||||
}
|
||||
|
||||
exists (x=1 /\ 0:r0=2)
|
|
+++ b/tools/memory-model/litmus-tests/CoWW+poonceonce.litmus
@@ -0,0 +1,18 @@
|
|||
C CoWW+poonceonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Test of write-write coherence, that is, whether or not two successive
|
||||
* writes to the same variable are ordered.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*x, 2);
|
||||
}
|
||||
|
||||
exists (x=1)
|
|
+++ b/tools/memory-model/litmus-tests/IRIW+mbonceonces+OnceOnce.litmus
@@ -0,0 +1,45 @@
|
|||
C IRIW+mbonceonces+OnceOnce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Test of independent reads from independent writes with smp_mb()
|
||||
* between each pair of reads.  In other words, is smp_mb() sufficient to
|
||||
* cause two different reading processes to agree on the order of a pair
|
||||
* of writes, where each write is to a different variable by a different
|
||||
* process?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(*y);
|
||||
}
|
||||
|
||||
P2(int *y)
|
||||
{
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P3(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0 /\ 3:r0=1 /\ 3:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/IRIW+poonceonces+OnceOnce.litmus
@@ -0,0 +1,43 @@
|
|||
C IRIW+poonceonces+OnceOnce
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* Test of independent reads from independent writes with nothing
|
||||
* between each pair of reads.  In other words, is anything at all
|
||||
* needed to cause two different reading processes to agree on the order
|
||||
* of a pair of writes, where each write is to a different variable by a
|
||||
* different process?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
r1 = READ_ONCE(*y);
|
||||
}
|
||||
|
||||
P2(int *y)
|
||||
{
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P3(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0 /\ 3:r0=1 /\ 3:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/ISA2+pooncelock+pooncelock+pombonce.litmus
@@ -0,0 +1,41 @@
|
|||
C ISA2+pooncelock+pooncelock+pombonce
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* This test shows that the ordering provided by a lock-protected S
|
||||
* litmus test (P0() and P1()) is not visible to the external process P2().
|
||||
* This is likely to change soon.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *y, int *z, spinlock_t *mylock)
|
||||
{
|
||||
int r0;
|
||||
|
||||
spin_lock(mylock);
|
||||
r0 = READ_ONCE(*y);
|
||||
WRITE_ONCE(*z, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P2(int *x, int *z)
|
||||
{
|
||||
int r1;
|
||||
int r2;
|
||||
|
||||
r2 = READ_ONCE(*z);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 2:r2=1 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/ISA2+poonceonces.litmus
@@ -0,0 +1,37 @@
|
|||
C ISA2+poonceonces
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* Given a release-acquire chain ordering the first process's store
|
||||
* against the last process's load, is ordering preserved if all of the
|
||||
* smp_store_release() invocations are replaced by WRITE_ONCE() and all
|
||||
* of the smp_load_acquire() invocations are replaced by READ_ONCE()?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *y, int *z)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
WRITE_ONCE(*z, 1);
|
||||
}
|
||||
|
||||
P2(int *x, int *z)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*z);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/ISA2+pooncerelease+poacquirerelease+poacquireonce.litmus
@@ -0,0 +1,39 @@
|
|||
C ISA2+pooncerelease+poacquirerelease+poacquireonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates that a release-acquire chain suffices
|
||||
* to order P0()'s initial write against P2()'s final read. The reason
|
||||
* that the release-acquire chain suffices is because in all but one
|
||||
* case (P2() to P0()), each process reads from the preceding process's
|
||||
* write. In memory-model-speak, there is only one non-reads-from
|
||||
* (AKA non-rf) link, so release-acquire is all that is needed.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_store_release(y, 1);
|
||||
}
|
||||
|
||||
P1(int *y, int *z)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = smp_load_acquire(y);
|
||||
smp_store_release(z, 1);
|
||||
}
|
||||
|
||||
P2(int *x, int *z)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = smp_load_acquire(z);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/LB+ctrlonceonce+mbonceonce.litmus
@@ -0,0 +1,34 @@
|
|||
C LB+ctrlonceonce+mbonceonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates that lightweight ordering suffices for
|
||||
* the load-buffering pattern, in other words, preventing all processes
|
||||
* reading from the preceding process's write. In this example, the
|
||||
* combination of a control dependency and a full memory barrier is enough
|
||||
* to do the trick. (But the full memory barrier could be replaced with
|
||||
* another control dependency and order would still be maintained.)
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
if (r0)
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
smp_mb();
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
exists (0:r0=1 /\ 1:r0=1)
|
|
+++ b/tools/memory-model/litmus-tests/LB+poacquireonce+pooncerelease.litmus
@@ -0,0 +1,29 @@
|
|||
C LB+poacquireonce+pooncerelease
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Does a release-acquire pair suffice for the load-buffering litmus
|
||||
* test, where each process reads from one of two variables then writes
|
||||
* to the other?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
smp_store_release(y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = smp_load_acquire(y);
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
exists (0:r0=1 /\ 1:r0=1)
|
|
+++ b/tools/memory-model/litmus-tests/LB+poonceonces.litmus
@@ -0,0 +1,28 @@
|
|||
C LB+poonceonces
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* Can the counter-intuitive outcome for the load-buffering pattern
|
||||
* be prevented even with no explicit ordering?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
exists (0:r0=1 /\ 1:r0=1)
|
|
+++ b/tools/memory-model/litmus-tests/MP+onceassign+derefonce.litmus
@@ -0,0 +1,34 @@
|
|||
C MP+onceassign+derefonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates that rcu_assign_pointer() and
|
||||
* rcu_dereference() suffice to ensure that an RCU reader will not see
|
||||
* pre-initialization garbage when it traverses an RCU-protected data
|
||||
* structure containing a newly inserted element.
|
||||
*)
|
||||
|
||||
{
|
||||
y=z;
|
||||
z=0;
|
||||
}
|
||||
|
||||
P0(int *x, int **y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
rcu_assign_pointer(*y, x);
|
||||
}
|
||||
|
||||
P1(int *x, int **y)
|
||||
{
|
||||
int *r0;
|
||||
int r1;
|
||||
|
||||
rcu_read_lock();
|
||||
r0 = rcu_dereference(*y);
|
||||
r1 = READ_ONCE(*r0);
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
exists (1:r0=x /\ 1:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/MP+polocks.litmus
@@ -0,0 +1,35 @@
|
|||
C MP+polocks
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates how lock acquisitions and releases can
|
||||
* stand in for smp_load_acquire() and smp_store_release(), respectively.
|
||||
* In other words, when holding a given lock (or indeed after releasing a
|
||||
* given lock), a CPU is not only guaranteed to see the accesses that other
|
||||
* CPUs made while previously holding that lock, it is also guaranteed
|
||||
* to see all prior accesses by those other CPUs.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*y, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
spin_lock(mylock);
|
||||
r0 = READ_ONCE(*y);
|
||||
spin_unlock(mylock);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/MP+poonceonces.litmus
@@ -0,0 +1,27 @@
|
|||
C MP+poonceonces
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* Can the counter-intuitive message-passing outcome be prevented with
|
||||
* no ordering at all?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/MP+pooncerelease+poacquireonce.litmus
@@ -0,0 +1,28 @@
|
|||
C MP+pooncerelease+poacquireonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates that smp_store_release() and
|
||||
* smp_load_acquire() provide sufficient ordering for the message-passing
|
||||
* pattern.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_store_release(y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = smp_load_acquire(y);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/MP+porevlocks.litmus
@@ -0,0 +1,35 @@
|
|||
C MP+porevlocks
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates how lock acquisitions and releases can
|
||||
* stand in for smp_load_acquire() and smp_store_release(), respectively.
|
||||
* In other words, when holding a given lock (or indeed after releasing a
|
||||
* given lock), a CPU is not only guaranteed to see the accesses that other
|
||||
* CPUs made while previously holding that lock, it is also guaranteed to
|
||||
* see all prior accesses by those other CPUs.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
spin_lock(mylock);
|
||||
r1 = READ_ONCE(*x);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*x, 1);
|
||||
spin_unlock(mylock);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
exists (0:r0=1 /\ 0:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/MP+wmbonceonce+rmbonceonce.litmus
@@ -0,0 +1,30 @@
|
|||
C MP+wmbonceonce+rmbonceonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates that smp_wmb() and smp_rmb() provide
|
||||
* sufficient ordering for the message-passing pattern. However, it
|
||||
* is usually better to use smp_store_release() and smp_load_acquire().
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_wmb();
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
smp_rmb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 1:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/R+mbonceonces.litmus
@@ -0,0 +1,30 @@
|
|||
C R+mbonceonces
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This is the fully ordered (via smp_mb()) version of one of the classic
|
||||
* counterintuitive litmus tests that illustrates the effects of store
|
||||
* propagation delays. Note that weakening either of the barriers would
|
||||
* cause the resulting test to be allowed.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_mb();
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*y, 2);
|
||||
smp_mb();
|
||||
r0 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (y=2 /\ 1:r0=0)
|
|
+++ b/tools/memory-model/litmus-tests/R+poonceonces.litmus
@@ -0,0 +1,27 @@
|
|||
C R+poonceonces
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* This is the unordered (thus lacking smp_mb()) version of one of the
|
||||
* classic counterintuitive litmus tests that illustrates the effects of
|
||||
* store propagation delays.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*y, 2);
|
||||
r0 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (y=2 /\ 1:r0=0)
|
|
+++ b/tools/memory-model/litmus-tests/README
@@ -0,0 +1,131 @@
|
|||
This directory contains the following litmus tests:
|
||||
|
||||
CoRR+poonceonce+Once.litmus
|
||||
Test of read-read coherence, that is, whether or not two
|
||||
successive reads from the same variable are ordered.
|
||||
|
||||
CoRW+poonceonce+Once.litmus
|
||||
Test of read-write coherence, that is, whether or not a read
|
||||
from a given variable followed by a write to that same variable
|
||||
are ordered.
|
||||
|
||||
CoWR+poonceonce+Once.litmus
|
||||
Test of write-read coherence, that is, whether or not a write
|
||||
to a given variable followed by a read from that same variable
|
||||
are ordered.
|
||||
|
||||
CoWW+poonceonce.litmus
|
||||
Test of write-write coherence, that is, whether or not two
|
||||
successive writes to the same variable are ordered.
|
||||
|
||||
IRIW+mbonceonces+OnceOnce.litmus
|
||||
Test of independent reads from independent writes with smp_mb()
|
||||
between each pair of reads.  In other words, is smp_mb()
|
||||
sufficient to cause two different reading processes to agree on
|
||||
the order of a pair of writes, where each write is to a different
|
||||
variable by a different process?
|
||||
|
||||
IRIW+poonceonces+OnceOnce.litmus
|
||||
Test of independent reads from independent writes with nothing
|
||||
between each pair of reads.  In other words, is anything at all
|
||||
needed to cause two different reading processes to agree on the
|
||||
order of a pair of writes, where each write is to a different
|
||||
variable by a different process?
|
||||
|
||||
ISA2+pooncelock+pooncelock+pombonce.litmus
|
||||
Tests whether the ordering provided by a lock-protected S
|
||||
litmus test is visible to an external process whose accesses are
|
||||
separated by smp_mb(). This addition of an external process to
|
||||
S is otherwise known as ISA2.
|
||||
|
||||
ISA2+poonceonces.litmus
|
||||
As below, but with store-release replaced with WRITE_ONCE()
|
||||
and load-acquire replaced with READ_ONCE().
|
||||
|
||||
ISA2+pooncerelease+poacquirerelease+poacquireonce.litmus
|
||||
Can a release-acquire chain order a prior store against
|
||||
a later load?
|
||||
|
||||
LB+ctrlonceonce+mbonceonce.litmus
|
||||
Does a control dependency and an smp_mb() suffice for the
|
||||
load-buffering litmus test, where each process reads from one
|
||||
of two variables then writes to the other?
|
||||
|
||||
LB+poacquireonce+pooncerelease.litmus
|
||||
Does a release-acquire pair suffice for the load-buffering
|
||||
litmus test, where each process reads from one of two variables then
|
||||
writes to the other?
|
||||
|
||||
LB+poonceonces.litmus
|
||||
As above, but with store-release replaced with WRITE_ONCE()
|
||||
and load-acquire replaced with READ_ONCE().
|
||||
|
||||
MP+onceassign+derefonce.litmus
|
||||
As below, but with rcu_assign_pointer() and an rcu_dereference().
|
||||
|
||||
MP+polocks.litmus
|
||||
As below, but with the second access of the writer process
|
||||
and the first access of the reader process protected by a lock.
|
||||
|
||||
MP+poonceonces.litmus
|
||||
As below, but without the smp_rmb() and smp_wmb().
|
||||
|
||||
MP+pooncerelease+poacquireonce.litmus
|
||||
As below, but with a release-acquire chain.
|
||||
|
||||
MP+porevlocks.litmus
|
||||
As below, but with the first access of the writer process
|
||||
and the second access of the reader process protected by a lock.
|
||||
|
||||
MP+wmbonceonce+rmbonceonce.litmus
|
||||
Do an smp_wmb() (between the stores) and an smp_rmb() (between
|
||||
the loads) suffice for the message-passing litmus test, where one
|
||||
process writes data and then a flag, and the other process reads
|
||||
the flag and then the data. (This is similar to the ISA2 tests,
|
||||
but with two processes instead of three.)
|
||||
|
||||
R+mbonceonces.litmus
|
||||
This is the fully ordered (via smp_mb()) version of one of
|
||||
the classic counterintuitive litmus tests that illustrates the
|
||||
effects of store propagation delays.
|
||||
|
||||
R+poonceonces.litmus
|
||||
As above, but without the smp_mb() invocations.
|
||||
|
||||
SB+mbonceonces.litmus
|
||||
This is the fully ordered (again, via smp_mb()) version of store
|
||||
buffering, which forms the core of Dekker's mutual-exclusion
|
||||
algorithm.
|
||||
|
||||
SB+poonceonces.litmus
|
||||
As above, but without the smp_mb() invocations.
|
||||
|
||||
S+poonceonces.litmus
|
||||
As below, but without the smp_wmb() and acquire load.
|
||||
|
||||
S+wmbonceonce+poacquireonce.litmus
|
||||
Can an smp_wmb(), instead of a release, and an acquire order
|
||||
a prior store against a subsequent store?
|
||||
|
||||
WRC+poonceonces+Once.litmus
|
||||
WRC+pooncerelease+rmbonceonce+Once.litmus
|
||||
These two are members of an extension of the MP litmus-test class
|
||||
in which the first write is moved to a separate process.
|
||||
|
||||
Z6.0+pooncelock+pooncelock+pombonce.litmus
|
||||
Is the ordering provided by a spin_unlock() and a subsequent
|
||||
spin_lock() sufficient to make ordering apparent to accesses
|
||||
by a process not holding the lock?
|
||||
|
||||
Z6.0+pooncelock+poonceLock+pombonce.litmus
|
||||
As above, but with smp_mb__after_spinlock() immediately
|
||||
following the spin_lock().
|
||||
|
||||
Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus
|
||||
Is the ordering provided by a release-acquire chain sufficient
|
||||
to make ordering apparent to accesses by a process that does
|
||||
not participate in that release-acquire chain?
|
||||
|
||||
A great many more litmus tests are available here:
|
||||
|
||||
https://github.com/paulmckrcu/litmus
|
|
+++ b/tools/memory-model/litmus-tests/S+poonceonces.litmus
@@ -0,0 +1,28 @@
|
|||
C S+poonceonces
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* Starting with a two-process release-acquire chain ordering P0()'s
|
||||
* first store against P1()'s final load, if the smp_store_release()
|
||||
* is replaced by WRITE_ONCE() and the smp_load_acquire() replaced by
|
||||
* READ_ONCE(), is ordering preserved?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 2);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
exists (x=2 /\ 1:r0=1)
|
|
+++ b/tools/memory-model/litmus-tests/S+wmbonceonce+poacquireonce.litmus
@@ -0,0 +1,27 @@
|
|||
C S+wmbonceonce+poacquireonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* Can an smp_wmb(), instead of a release, and an acquire order a prior
|
||||
* store against a subsequent store?
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 2);
|
||||
smp_wmb();
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = smp_load_acquire(y);
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
exists (x=2 /\ 1:r0=1)
|
|
+++ b/tools/memory-model/litmus-tests/SB+mbonceonces.litmus
@@ -0,0 +1,32 @@
|
|||
C SB+mbonceonces
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates that full memory barriers suffice to
|
||||
* order the store-buffering pattern, where each process writes to the
|
||||
* variable that the preceding process reads. (Locking and RCU can also
|
||||
* suffice, but not much else.)
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_mb();
|
||||
r0 = READ_ONCE(*y);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*y, 1);
|
||||
smp_mb();
|
||||
r0 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (0:r0=0 /\ 1:r0=0)
|
|
+++ b/tools/memory-model/litmus-tests/SB+poonceonces.litmus
@@ -0,0 +1,29 @@
|
|||
C SB+poonceonces
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* This litmus test demonstrates that at least some ordering is required
|
||||
* to order the store-buffering pattern, where each process writes to the
|
||||
* variable that the preceding process reads.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*x, 1);
|
||||
r0 = READ_ONCE(*y);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
WRITE_ONCE(*y, 1);
|
||||
r0 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (0:r0=0 /\ 1:r0=0)
|
|
+++ b/tools/memory-model/litmus-tests/WRC+poonceonces+Once.litmus
@@ -0,0 +1,35 @@
|
|||
C WRC+poonceonces+Once
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* This litmus test is an extension of the message-passing pattern,
|
||||
* where the first write is moved to a separate process. Note that this
|
||||
* test has no ordering at all.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
WRITE_ONCE(*y, 1);
|
||||
}
|
||||
|
||||
P2(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/WRC+pooncerelease+rmbonceonce+Once.litmus
@@ -0,0 +1,36 @@
|
|||
C WRC+pooncerelease+rmbonceonce+Once
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test is an extension of the message-passing pattern, where
|
||||
* the first write is moved to a separate process. Because it features
|
||||
* a release and a read memory barrier, the counter-intuitive outcome is forbidden.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
}
|
||||
|
||||
P1(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = READ_ONCE(*x);
|
||||
smp_store_release(y, 1);
|
||||
}
|
||||
|
||||
P2(int *x, int *y)
|
||||
{
|
||||
int r0;
|
||||
int r1;
|
||||
|
||||
r0 = READ_ONCE(*y);
|
||||
smp_rmb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/Z6.0+pooncelock+poonceLock+pombonce.litmus
@@ -0,0 +1,42 @@
|
|||
C Z6.0+pooncelock+poonceLock+pombonce
|
||||
|
||||
(*
|
||||
* Result: Never
|
||||
*
|
||||
* This litmus test demonstrates how smp_mb__after_spinlock() may be
|
||||
* used to ensure that accesses in different critical sections for a
|
||||
* given lock running on different CPUs are nevertheless seen in order
|
||||
* by CPUs not holding that lock.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *y, int *z, spinlock_t *mylock)
|
||||
{
|
||||
int r0;
|
||||
|
||||
spin_lock(mylock);
|
||||
smp_mb__after_spinlock();
|
||||
r0 = READ_ONCE(*y);
|
||||
WRITE_ONCE(*z, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P2(int *x, int *z)
|
||||
{
|
||||
int r1;
|
||||
|
||||
WRITE_ONCE(*z, 2);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ z=2 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/Z6.0+pooncelock+pooncelock+pombonce.litmus
@@ -0,0 +1,40 @@
|
|||
C Z6.0+pooncelock+pooncelock+pombonce
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* This example demonstrates that a pair of accesses made by different
|
||||
* processes each while holding a given lock will not necessarily be
|
||||
* seen as ordered by a third process not holding that lock.
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y, spinlock_t *mylock)
|
||||
{
|
||||
spin_lock(mylock);
|
||||
WRITE_ONCE(*x, 1);
|
||||
WRITE_ONCE(*y, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P1(int *y, int *z, spinlock_t *mylock)
|
||||
{
|
||||
int r0;
|
||||
|
||||
spin_lock(mylock);
|
||||
r0 = READ_ONCE(*y);
|
||||
WRITE_ONCE(*z, 1);
|
||||
spin_unlock(mylock);
|
||||
}
|
||||
|
||||
P2(int *x, int *z)
|
||||
{
|
||||
int r1;
|
||||
|
||||
WRITE_ONCE(*z, 2);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ z=2 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/litmus-tests/Z6.0+pooncerelease+poacquirerelease+mbonceonce.litmus
@@ -0,0 +1,42 @@
|
|||
C Z6.0+pooncerelease+poacquirerelease+mbonceonce
|
||||
|
||||
(*
|
||||
* Result: Sometimes
|
||||
*
|
||||
* This litmus test shows that a release-acquire chain, while sufficient
|
||||
* when there is but one non-reads-from (AKA non-rf) link, does not suffice
|
||||
* if there is more than one. Of the three processes, only P1() reads from
|
||||
* P0's write, which means that there are two non-rf links: P1() to P2()
|
||||
* is a write-to-write link (AKA a "coherence" or just "co" link) and P2()
|
||||
* to P0() is a read-to-write link (AKA a "from-reads" or just "fr" link).
|
||||
* When there are two or more non-rf links, you typically will need one
|
||||
* full barrier for each non-rf link. (Exceptions include some cases
|
||||
* involving locking.)
|
||||
*)
|
||||
|
||||
{}
|
||||
|
||||
P0(int *x, int *y)
|
||||
{
|
||||
WRITE_ONCE(*x, 1);
|
||||
smp_store_release(y, 1);
|
||||
}
|
||||
|
||||
P1(int *y, int *z)
|
||||
{
|
||||
int r0;
|
||||
|
||||
r0 = smp_load_acquire(y);
|
||||
smp_store_release(z, 1);
|
||||
}
|
||||
|
||||
P2(int *x, int *z)
|
||||
{
|
||||
int r1;
|
||||
|
||||
WRITE_ONCE(*z, 2);
|
||||
smp_mb();
|
||||
r1 = READ_ONCE(*x);
|
||||
}
|
||||
|
||||
exists (1:r0=1 /\ z=2 /\ 2:r1=0)
|
|
+++ b/tools/memory-model/lock.cat
@@ -0,0 +1,99 @@
|
|||
// SPDX-License-Identifier: GPL-2.0+
|
||||
(*
|
||||
* Copyright (C) 2016 Luc Maranget <luc.maranget@inria.fr> for Inria
|
||||
* Copyright (C) 2017 Alan Stern <stern@rowland.harvard.edu>
|
||||
*)
|
||||
|
||||
(* Generate coherence orders and handle lock operations *)
|
||||
|
||||
include "cross.cat"
|
||||
|
||||
(* From lock reads to their partner lock writes *)
|
||||
let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)
|
||||
let rmw = rmw | lk-rmw
|
||||
|
||||
(*
|
||||
* A paired LKR must always see an unlocked value; spin_lock() calls nested
|
||||
* inside a critical section (for the same lock) always deadlock.
|
||||
*)
|
||||
empty ([LKW] ; po-loc ; [domain(lk-rmw)]) \ (po-loc ; [UL] ; po-loc)
|
||||
as lock-nest
|
||||
|
||||
(* The litmus test is invalid if an LKW event is not part of an RMW pair *)
|
||||
flag ~empty LKW \ range(lk-rmw) as unpaired-LKW
|
||||
|
||||
(* This will be allowed if we implement spin_is_locked() *)
|
||||
flag ~empty LKR \ domain(lk-rmw) as unpaired-LKR
|
||||
|
||||
(* There should be no R or W accesses to spinlocks *)
|
||||
let ALL-LOCKS = LKR | LKW | UL | LF
|
||||
flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
|
||||
|
||||
(* The final value of a spinlock should not be tested *)
|
||||
flag ~empty [FW] ; loc ; [ALL-LOCKS] as lock-final
|
||||
|
||||
|
||||
(*
|
||||
* Put lock operations in their appropriate classes, but leave UL out of W
|
||||
* until after the co relation has been generated.
|
||||
*)
|
||||
let R = R | LKR | LF
|
||||
let W = W | LKW
|
||||
|
||||
let Release = Release | UL
|
||||
let Acquire = Acquire | LKR
|
||||
|
||||
|
||||
(* Match LKW events to their corresponding UL events *)
|
||||
let critical = ([LKW] ; po-loc ; [UL]) \ (po-loc ; [LKW | UL] ; po-loc)
|
||||
|
||||
flag ~empty UL \ range(critical) as unmatched-unlock
|
||||
|
||||
(* Allow up to one unmatched LKW per location; more must deadlock *)
|
||||
let UNMATCHED-LKW = LKW \ domain(critical)
|
||||
empty ([UNMATCHED-LKW] ; loc ; [UNMATCHED-LKW]) \ id as unmatched-locks
|
||||
|
||||
|
||||
(* rfi for LF events: link each LKW to the LF events in its critical section *)
|
||||
let rfi-lf = ([LKW] ; po-loc ; [LF]) \ ([LKW] ; po-loc ; [UL] ; po-loc)
|
||||
|
||||
(* rfe for LF events *)
|
||||
let all-possible-rfe-lf =
|
||||
(*
|
||||
* Given an LF event r, compute the possible rfe edges for that event
|
||||
* (all those starting from LKW events in other threads),
|
||||
* and then convert that relation to a set of single-edge relations.
|
||||
*)
|
||||
let possible-rfe-lf r =
|
||||
let pair-to-relation p = p ++ 0
|
||||
in map pair-to-relation ((LKW * {r}) & loc & ext)
|
||||
(* Do this for each LF event r that isn't in rfi-lf *)
|
||||
in map possible-rfe-lf (LF \ range(rfi-lf))
|
||||
|
||||
(* Generate all rf relations for LF events *)
|
||||
with rfe-lf from cross(all-possible-rfe-lf)
|
||||
let rf = rf | rfi-lf | rfe-lf
|
||||
|
||||
|
||||
(* Generate all co relations, including LKW events but not UL *)
|
||||
let co0 = co0 | ([IW] ; loc ; [LKW]) |
|
||||
(([LKW] ; loc ; [UNMATCHED-LKW]) \ [UNMATCHED-LKW])
|
||||
include "cos-opt.cat"
|
||||
let W = W | UL
|
||||
let M = R | W
|
||||
|
||||
(* Merge UL events into co *)
|
||||
let co = (co | critical | (critical^-1 ; co))+
|
||||
let coe = co & ext
|
||||
let coi = co & int
|
||||
|
||||
(* Merge LKR events into rf *)
|
||||
let rf = rf | ([IW | UL] ; singlestep(co) ; lk-rmw^-1)
|
||||
let rfe = rf & ext
|
||||
let rfi = rf & int
|
||||
|
||||
let fr = rf^-1 ; co
|
||||
let fre = fr & ext
|
||||
let fri = fr & int
|
||||
|
||||
show co,rf,fr
|