membarrier: Execute SYNC_CORE on the calling thread

membarrier()'s MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is documented as
syncing the core on all sibling threads but not necessarily the calling
thread.  This behavior is fundamentally buggy and cannot be used safely.

Suppose a user program has two threads.  Thread A is on CPU 0 and thread B
is on CPU 1.  Thread A modifies some text and calls
membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE).

Then thread B executes the modified code.  If, at any point after
membarrier() decides which CPUs to target, thread A could be preempted and
replaced by thread B on CPU 0.  This could even happen on exit from the
membarrier() syscall.  If this happens, thread B will end up running on CPU
0 without having synced.

In principle, this could be fixed by arranging for the scheduler to issue
sync_core_before_usermode() whenever switching between two threads in the
same mm if there is any possibility of a concurrent membarrier() call, but
this would have considerable overhead.  Instead, make membarrier() sync the
calling CPU as well.

As an optimization, this avoids an extra smp_mb() in the default
barrier-only mode and an extra rseq preempt on the caller.

Fixes: 70216e18e5 ("membarrier: Provide core serializing command, *_SYNC_CORE")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/250ded637696d490c69bef1877148db86066881c.1607058304.git.luto@kernel.org
This commit is contained in:
Andy Lutomirski 2020-12-03 21:07:06 -08:00 committed by Thomas Gleixner
parent 758c9373d8
commit e45cdc71d1
1 changed files with 33 additions and 18 deletions

View File

@ -194,7 +194,8 @@ static int membarrier_private_expedited(int flags, int cpu_id)
return -EPERM;
}
if (atomic_read(&mm->mm_users) == 1 || num_online_cpus() == 1)
if (flags != MEMBARRIER_FLAG_SYNC_CORE &&
(atomic_read(&mm->mm_users) == 1 || num_online_cpus() == 1))
return 0;
/*
@ -213,8 +214,6 @@ static int membarrier_private_expedited(int flags, int cpu_id)
if (cpu_id >= nr_cpu_ids || !cpu_online(cpu_id))
goto out;
if (cpu_id == raw_smp_processor_id())
goto out;
rcu_read_lock();
p = rcu_dereference(cpu_rq(cpu_id)->curr);
if (!p || p->mm != mm) {
@ -229,16 +228,6 @@ static int membarrier_private_expedited(int flags, int cpu_id)
for_each_online_cpu(cpu) {
struct task_struct *p;
/*
* Skipping the current CPU is OK even through we can be
* migrated at any point. The current CPU, at the point
* where we read raw_smp_processor_id(), is ensured to
* be in program order with respect to the caller
* thread. Therefore, we can skip this CPU from the
* iteration.
*/
if (cpu == raw_smp_processor_id())
continue;
p = rcu_dereference(cpu_rq(cpu)->curr);
if (p && p->mm == mm)
__cpumask_set_cpu(cpu, tmpmask);
@ -246,12 +235,38 @@ static int membarrier_private_expedited(int flags, int cpu_id)
rcu_read_unlock();
}
preempt_disable();
if (cpu_id >= 0)
if (cpu_id >= 0) {
/*
* smp_call_function_single() will call ipi_func() if cpu_id
* is the calling CPU.
*/
smp_call_function_single(cpu_id, ipi_func, NULL, 1);
else
smp_call_function_many(tmpmask, ipi_func, NULL, 1);
preempt_enable();
} else {
/*
* For regular membarrier, we can save a few cycles by
* skipping the current cpu -- we're about to do smp_mb()
* below, and if we migrate to a different cpu, this cpu
* and the new cpu will execute a full barrier in the
* scheduler.
*
* For SYNC_CORE, we do need a barrier on the current cpu --
* otherwise, if we are migrated and replaced by a different
* task in the same mm just before, during, or after
* membarrier, we will end up with some thread in the mm
* running without a core sync.
*
* For RSEQ, don't rseq_preempt() the caller. User code
* is not supposed to issue syscalls at all from inside an
* rseq critical section.
*/
if (flags != MEMBARRIER_FLAG_SYNC_CORE) {
preempt_disable();
smp_call_function_many(tmpmask, ipi_func, NULL, true);
preempt_enable();
} else {
on_each_cpu_mask(tmpmask, ipi_func, NULL, true);
}
}
out:
if (cpu_id < 0)