locking: Introduce local_lock()
preempt_disable() and local_irq_disable/save() are in principle per CPU big
kernel locks. This has several downsides:
- The protection scope is unknown
- Violation of protection rules is hard to detect by instrumentation
- For PREEMPT_RT such sections, unless in low level critical code, can
violate the preemptability constraints.
To address this PREEMPT_RT introduced the concept of local_locks which are
strictly per CPU.
The lock operations map to preempt_disable(), local_irq_disable/save() and
the enabling counterparts on non RT enabled kernels.
If lockdep is enabled local locks gain a lock map which tracks the usage
context. This will catch cases where an area is protected by
preempt_disable() but the access also happens from interrupt context. local
locks have identified quite a few such issues over the years, the most
recent example is:
b7d5dc2107
("random: add a spinlock_t to struct batched_entropy")
Aside of the lockdep coverage this also improves code readability as it
precisely annotates the protection scope.
PREEMPT_RT substitutes these local locks with 'sleeping' spinlocks to
protect such sections while maintaining preemtability and CPU locality.
local locks can replace:
- preempt_enable()/disable() pairs
- local_irq_disable/enable() pairs
- local_irq_save/restore() pairs
They are also used to replace code which implicitly disables preemption
like:
- get_cpu()/put_cpu()
- get_cpu_var()/put_cpu_var()
with PREEMPT_RT friendly constructs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-2-bigeasy@linutronix.de
This commit is contained in:
parent
4f470fff67
commit
91710728d1
|
@ -13,6 +13,7 @@ The kernel provides a variety of locking primitives which can be divided
|
|||
into two categories:
|
||||
|
||||
- Sleeping locks
|
||||
- CPU local locks
|
||||
- Spinning locks
|
||||
|
||||
This document conceptually describes these lock types and provides rules
|
||||
|
@ -44,9 +45,23 @@ Sleeping lock types:
|
|||
|
||||
On PREEMPT_RT kernels, these lock types are converted to sleeping locks:
|
||||
|
||||
- local_lock
|
||||
- spinlock_t
|
||||
- rwlock_t
|
||||
|
||||
|
||||
CPU local locks
|
||||
---------------
|
||||
|
||||
- local_lock
|
||||
|
||||
On non-PREEMPT_RT kernels, local_lock functions are wrappers around
|
||||
preemption and interrupt disabling primitives. Contrary to other locking
|
||||
mechanisms, disabling preemption or interrupts are pure CPU local
|
||||
concurrency control mechanisms and not suited for inter-CPU concurrency
|
||||
control.
|
||||
|
||||
|
||||
Spinning locks
|
||||
--------------
|
||||
|
||||
|
@ -67,6 +82,7 @@ can have suffixes which apply further protections:
|
|||
_irqsave/restore() Save and disable / restore interrupt disabled state
|
||||
=================== ====================================================
|
||||
|
||||
|
||||
Owner semantics
|
||||
===============
|
||||
|
||||
|
@ -139,6 +155,56 @@ implementation, thus changing the fairness:
|
|||
writer from starving readers.
|
||||
|
||||
|
||||
local_lock
|
||||
==========
|
||||
|
||||
local_lock provides a named scope to critical sections which are protected
|
||||
by disabling preemption or interrupts.
|
||||
|
||||
On non-PREEMPT_RT kernels local_lock operations map to the preemption and
|
||||
interrupt disabling and enabling primitives:
|
||||
|
||||
=========================== ======================
|
||||
local_lock(&llock) preempt_disable()
|
||||
local_unlock(&llock) preempt_enable()
|
||||
local_lock_irq(&llock) local_irq_disable()
|
||||
local_unlock_irq(&llock) local_irq_enable()
|
||||
local_lock_save(&llock) local_irq_save()
|
||||
local_lock_restore(&llock) local_irq_save()
|
||||
=========================== ======================
|
||||
|
||||
The named scope of local_lock has two advantages over the regular
|
||||
primitives:
|
||||
|
||||
- The lock name allows static analysis and is also a clear documentation
|
||||
of the protection scope while the regular primitives are scopeless and
|
||||
opaque.
|
||||
|
||||
- If lockdep is enabled the local_lock gains a lockmap which allows to
|
||||
validate the correctness of the protection. This can detect cases where
|
||||
e.g. a function using preempt_disable() as protection mechanism is
|
||||
invoked from interrupt or soft-interrupt context. Aside of that
|
||||
lockdep_assert_held(&llock) works as with any other locking primitive.
|
||||
|
||||
local_lock and PREEMPT_RT
|
||||
-------------------------
|
||||
|
||||
PREEMPT_RT kernels map local_lock to a per-CPU spinlock_t, thus changing
|
||||
semantics:
|
||||
|
||||
- All spinlock_t changes also apply to local_lock.
|
||||
|
||||
local_lock usage
|
||||
----------------
|
||||
|
||||
local_lock should be used in situations where disabling preemption or
|
||||
interrupts is the appropriate form of concurrency control to protect
|
||||
per-CPU data structures on a non PREEMPT_RT kernel.
|
||||
|
||||
local_lock is not suitable to protect against preemption or interrupts on a
|
||||
PREEMPT_RT kernel due to the PREEMPT_RT specific spinlock_t semantics.
|
||||
|
||||
|
||||
raw_spinlock_t and spinlock_t
|
||||
=============================
|
||||
|
||||
|
@ -258,10 +324,82 @@ implementation, thus changing semantics:
|
|||
PREEMPT_RT caveats
|
||||
==================
|
||||
|
||||
local_lock on RT
|
||||
----------------
|
||||
|
||||
The mapping of local_lock to spinlock_t on PREEMPT_RT kernels has a few
|
||||
implications. For example, on a non-PREEMPT_RT kernel the following code
|
||||
sequence works as expected::
|
||||
|
||||
local_lock_irq(&local_lock);
|
||||
raw_spin_lock(&lock);
|
||||
|
||||
and is fully equivalent to::
|
||||
|
||||
raw_spin_lock_irq(&lock);
|
||||
|
||||
On a PREEMPT_RT kernel this code sequence breaks because local_lock_irq()
|
||||
is mapped to a per-CPU spinlock_t which neither disables interrupts nor
|
||||
preemption. The following code sequence works perfectly correct on both
|
||||
PREEMPT_RT and non-PREEMPT_RT kernels::
|
||||
|
||||
local_lock_irq(&local_lock);
|
||||
spin_lock(&lock);
|
||||
|
||||
Another caveat with local locks is that each local_lock has a specific
|
||||
protection scope. So the following substitution is wrong::
|
||||
|
||||
func1()
|
||||
{
|
||||
local_irq_save(flags); -> local_lock_irqsave(&local_lock_1, flags);
|
||||
func3();
|
||||
local_irq_restore(flags); -> local_lock_irqrestore(&local_lock_1, flags);
|
||||
}
|
||||
|
||||
func2()
|
||||
{
|
||||
local_irq_save(flags); -> local_lock_irqsave(&local_lock_2, flags);
|
||||
func3();
|
||||
local_irq_restore(flags); -> local_lock_irqrestore(&local_lock_2, flags);
|
||||
}
|
||||
|
||||
func3()
|
||||
{
|
||||
lockdep_assert_irqs_disabled();
|
||||
access_protected_data();
|
||||
}
|
||||
|
||||
On a non-PREEMPT_RT kernel this works correctly, but on a PREEMPT_RT kernel
|
||||
local_lock_1 and local_lock_2 are distinct and cannot serialize the callers
|
||||
of func3(). Also the lockdep assert will trigger on a PREEMPT_RT kernel
|
||||
because local_lock_irqsave() does not disable interrupts due to the
|
||||
PREEMPT_RT-specific semantics of spinlock_t. The correct substitution is::
|
||||
|
||||
func1()
|
||||
{
|
||||
local_irq_save(flags); -> local_lock_irqsave(&local_lock, flags);
|
||||
func3();
|
||||
local_irq_restore(flags); -> local_lock_irqrestore(&local_lock, flags);
|
||||
}
|
||||
|
||||
func2()
|
||||
{
|
||||
local_irq_save(flags); -> local_lock_irqsave(&local_lock, flags);
|
||||
func3();
|
||||
local_irq_restore(flags); -> local_lock_irqrestore(&local_lock, flags);
|
||||
}
|
||||
|
||||
func3()
|
||||
{
|
||||
lockdep_assert_held(&local_lock);
|
||||
access_protected_data();
|
||||
}
|
||||
|
||||
|
||||
spinlock_t and rwlock_t
|
||||
-----------------------
|
||||
|
||||
These changes in spinlock_t and rwlock_t semantics on PREEMPT_RT kernels
|
||||
The changes in spinlock_t and rwlock_t semantics on PREEMPT_RT kernels
|
||||
have a few implications. For example, on a non-PREEMPT_RT kernel the
|
||||
following code sequence works as expected::
|
||||
|
||||
|
@ -282,9 +420,61 @@ local_lock mechanism. Acquiring the local_lock pins the task to a CPU,
|
|||
allowing things like per-CPU interrupt disabled locks to be acquired.
|
||||
However, this approach should be used only where absolutely necessary.
|
||||
|
||||
A typical scenario is protection of per-CPU variables in thread context::
|
||||
|
||||
raw_spinlock_t
|
||||
--------------
|
||||
struct foo *p = get_cpu_ptr(&var1);
|
||||
|
||||
spin_lock(&p->lock);
|
||||
p->count += this_cpu_read(var2);
|
||||
|
||||
This is correct code on a non-PREEMPT_RT kernel, but on a PREEMPT_RT kernel
|
||||
this breaks. The PREEMPT_RT-specific change of spinlock_t semantics does
|
||||
not allow to acquire p->lock because get_cpu_ptr() implicitly disables
|
||||
preemption. The following substitution works on both kernels::
|
||||
|
||||
struct foo *p;
|
||||
|
||||
migrate_disable();
|
||||
p = this_cpu_ptr(&var1);
|
||||
spin_lock(&p->lock);
|
||||
p->count += this_cpu_read(var2);
|
||||
|
||||
On a non-PREEMPT_RT kernel migrate_disable() maps to preempt_disable()
|
||||
which makes the above code fully equivalent. On a PREEMPT_RT kernel
|
||||
migrate_disable() ensures that the task is pinned on the current CPU which
|
||||
in turn guarantees that the per-CPU access to var1 and var2 are staying on
|
||||
the same CPU.
|
||||
|
||||
The migrate_disable() substitution is not valid for the following
|
||||
scenario::
|
||||
|
||||
func()
|
||||
{
|
||||
struct foo *p;
|
||||
|
||||
migrate_disable();
|
||||
p = this_cpu_ptr(&var1);
|
||||
p->val = func2();
|
||||
|
||||
While correct on a non-PREEMPT_RT kernel, this breaks on PREEMPT_RT because
|
||||
here migrate_disable() does not protect against reentrancy from a
|
||||
preempting task. A correct substitution for this case is::
|
||||
|
||||
func()
|
||||
{
|
||||
struct foo *p;
|
||||
|
||||
local_lock(&foo_lock);
|
||||
p = this_cpu_ptr(&var1);
|
||||
p->val = func2();
|
||||
|
||||
On a non-PREEMPT_RT kernel this protects against reentrancy by disabling
|
||||
preemption. On a PREEMPT_RT kernel this is achieved by acquiring the
|
||||
underlying per-CPU spinlock.
|
||||
|
||||
|
||||
raw_spinlock_t on RT
|
||||
--------------------
|
||||
|
||||
Acquiring a raw_spinlock_t disables preemption and possibly also
|
||||
interrupts, so the critical section must avoid acquiring a regular
|
||||
|
@ -325,22 +515,25 @@ Lock type nesting rules
|
|||
|
||||
The most basic rules are:
|
||||
|
||||
- Lock types of the same lock category (sleeping, spinning) can nest
|
||||
arbitrarily as long as they respect the general lock ordering rules to
|
||||
prevent deadlocks.
|
||||
- Lock types of the same lock category (sleeping, CPU local, spinning)
|
||||
can nest arbitrarily as long as they respect the general lock ordering
|
||||
rules to prevent deadlocks.
|
||||
|
||||
- Sleeping lock types cannot nest inside spinning lock types.
|
||||
- Sleeping lock types cannot nest inside CPU local and spinning lock types.
|
||||
|
||||
- Spinning lock types can nest inside sleeping lock types.
|
||||
- CPU local and spinning lock types can nest inside sleeping lock types.
|
||||
|
||||
- Spinning lock types can nest inside all lock types
|
||||
|
||||
These constraints apply both in PREEMPT_RT and otherwise.
|
||||
|
||||
The fact that PREEMPT_RT changes the lock category of spinlock_t and
|
||||
rwlock_t from spinning to sleeping means that they cannot be acquired while
|
||||
holding a raw spinlock. This results in the following nesting ordering:
|
||||
rwlock_t from spinning to sleeping and substitutes local_lock with a
|
||||
per-CPU spinlock_t means that they cannot be acquired while holding a raw
|
||||
spinlock. This results in the following nesting ordering:
|
||||
|
||||
1) Sleeping locks
|
||||
2) spinlock_t and rwlock_t
|
||||
2) spinlock_t, rwlock_t, local_lock
|
||||
3) raw_spinlock_t and bit spinlocks
|
||||
|
||||
Lockdep will complain if these constraints are violated, both in
|
||||
|
|
|
@ -0,0 +1,54 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#ifndef _LINUX_LOCAL_LOCK_H
|
||||
#define _LINUX_LOCAL_LOCK_H
|
||||
|
||||
#include <linux/local_lock_internal.h>
|
||||
|
||||
/**
|
||||
* local_lock_init - Runtime initialize a lock instance
|
||||
*/
|
||||
#define local_lock_init(lock) __local_lock_init(lock)
|
||||
|
||||
/**
|
||||
* local_lock - Acquire a per CPU local lock
|
||||
* @lock: The lock variable
|
||||
*/
|
||||
#define local_lock(lock) __local_lock(lock)
|
||||
|
||||
/**
|
||||
* local_lock_irq - Acquire a per CPU local lock and disable interrupts
|
||||
* @lock: The lock variable
|
||||
*/
|
||||
#define local_lock_irq(lock) __local_lock_irq(lock)
|
||||
|
||||
/**
|
||||
* local_lock_irqsave - Acquire a per CPU local lock, save and disable
|
||||
* interrupts
|
||||
* @lock: The lock variable
|
||||
* @flags: Storage for interrupt flags
|
||||
*/
|
||||
#define local_lock_irqsave(lock, flags) \
|
||||
__local_lock_irqsave(lock, flags)
|
||||
|
||||
/**
|
||||
* local_unlock - Release a per CPU local lock
|
||||
* @lock: The lock variable
|
||||
*/
|
||||
#define local_unlock(lock) __local_unlock(lock)
|
||||
|
||||
/**
|
||||
* local_unlock_irq - Release a per CPU local lock and enable interrupts
|
||||
* @lock: The lock variable
|
||||
*/
|
||||
#define local_unlock_irq(lock) __local_unlock_irq(lock)
|
||||
|
||||
/**
|
||||
* local_unlock_irqrestore - Release a per CPU local lock and restore
|
||||
* interrupt flags
|
||||
* @lock: The lock variable
|
||||
* @flags: Interrupt flags to restore
|
||||
*/
|
||||
#define local_unlock_irqrestore(lock, flags) \
|
||||
__local_unlock_irqrestore(lock, flags)
|
||||
|
||||
#endif
|
|
@ -0,0 +1,90 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#ifndef _LINUX_LOCAL_LOCK_H
|
||||
# error "Do not include directly, include linux/local_lock.h"
|
||||
#endif
|
||||
|
||||
#include <linux/percpu-defs.h>
|
||||
#include <linux/lockdep.h>
|
||||
|
||||
typedef struct {
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
struct lockdep_map dep_map;
|
||||
struct task_struct *owner;
|
||||
#endif
|
||||
} local_lock_t;
|
||||
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
# define LL_DEP_MAP_INIT(lockname) \
|
||||
.dep_map = { \
|
||||
.name = #lockname, \
|
||||
.wait_type_inner = LD_WAIT_CONFIG, \
|
||||
}
|
||||
#else
|
||||
# define LL_DEP_MAP_INIT(lockname)
|
||||
#endif
|
||||
|
||||
#define INIT_LOCAL_LOCK(lockname) { LL_DEP_MAP_INIT(lockname) }
|
||||
|
||||
#define __local_lock_init(lock) \
|
||||
do { \
|
||||
static struct lock_class_key __key; \
|
||||
\
|
||||
debug_check_no_locks_freed((void *)lock, sizeof(*lock));\
|
||||
lockdep_init_map_wait(&(lock)->dep_map, #lock, &__key, 0, LD_WAIT_CONFIG);\
|
||||
} while (0)
|
||||
|
||||
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
||||
static inline void local_lock_acquire(local_lock_t *l)
|
||||
{
|
||||
lock_map_acquire(&l->dep_map);
|
||||
DEBUG_LOCKS_WARN_ON(l->owner);
|
||||
l->owner = current;
|
||||
}
|
||||
|
||||
static inline void local_lock_release(local_lock_t *l)
|
||||
{
|
||||
DEBUG_LOCKS_WARN_ON(l->owner != current);
|
||||
l->owner = NULL;
|
||||
lock_map_release(&l->dep_map);
|
||||
}
|
||||
|
||||
#else /* CONFIG_DEBUG_LOCK_ALLOC */
|
||||
static inline void local_lock_acquire(local_lock_t *l) { }
|
||||
static inline void local_lock_release(local_lock_t *l) { }
|
||||
#endif /* !CONFIG_DEBUG_LOCK_ALLOC */
|
||||
|
||||
#define __local_lock(lock) \
|
||||
do { \
|
||||
preempt_disable(); \
|
||||
local_lock_acquire(this_cpu_ptr(lock)); \
|
||||
} while (0)
|
||||
|
||||
#define __local_lock_irq(lock) \
|
||||
do { \
|
||||
local_irq_disable(); \
|
||||
local_lock_acquire(this_cpu_ptr(lock)); \
|
||||
} while (0)
|
||||
|
||||
#define __local_lock_irqsave(lock, flags) \
|
||||
do { \
|
||||
local_irq_save(flags); \
|
||||
local_lock_acquire(this_cpu_ptr(lock)); \
|
||||
} while (0)
|
||||
|
||||
#define __local_unlock(lock) \
|
||||
do { \
|
||||
local_lock_release(this_cpu_ptr(lock)); \
|
||||
preempt_enable(); \
|
||||
} while (0)
|
||||
|
||||
#define __local_unlock_irq(lock) \
|
||||
do { \
|
||||
local_lock_release(this_cpu_ptr(lock)); \
|
||||
local_irq_enable(); \
|
||||
} while (0)
|
||||
|
||||
#define __local_unlock_irqrestore(lock, flags) \
|
||||
do { \
|
||||
local_lock_release(this_cpu_ptr(lock)); \
|
||||
local_irq_restore(flags); \
|
||||
} while (0)
|
Loading…
Reference in New Issue