cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order

Paul reported a very sporadic, rcutorture induced, workqueue failure.
When the planets align, the workqueue rescuer's self-migrate fails and
then triggers a WARN for running a work on the wrong CPU.

Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call
could be ignored! When stopper->enabled is false, stop_machine will
insta complete the work, without actually doing the work. Worse, it
will not WARN about this (we really should fix this).

It turns out there is a small window where a freshly online'ed CPU is
marked 'online' but doesn't yet have the stopper task running:

	BP				AP

	bringup_cpu()
	  __cpu_up(cpu, idle)	 -->	start_secondary()
					...
					cpu_startup_entry()
	  bringup_wait_for_ap()
	    wait_for_ap_thread() <--	  cpuhp_online_idle()
					  while (1)
					    do_idle()

					... available to run kthreads ...

	    stop_machine_unpark()
	      stopper->enabled = true;

Close this by moving the stop_machine_unpark() into
cpuhp_online_idle(), such that the stopper thread is ready before we
start the idle loop and schedule.

Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Debugged-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Paul E. McKenney" <paulmck@kernel.org>

parent cde6519450
commit 45178ac0ce
13
kernel/cpu.c
@@ -525,8 +525,7 @@ static int bringup_wait_for_ap(unsigned int cpu)
 	if (WARN_ON_ONCE((!cpu_online(cpu))))
 		return -ECANCELED;
 
-	/* Unpark the stopper thread and the hotplug thread of the target cpu */
-	stop_machine_unpark(cpu);
+	/* Unpark the hotplug thread of the target cpu */
 	kthread_unpark(st->thread);
 
 	/*
@@ -1089,8 +1088,8 @@ void notify_cpu_starting(unsigned int cpu)
 
 /*
  * Called from the idle task. Wake up the controlling task which brings the
- * stopper and the hotplug thread of the upcoming CPU up and then delegates
- * the rest of the online bringup to the hotplug thread.
+ * hotplug thread of the upcoming CPU up and then delegates the rest of the
+ * online bringup to the hotplug thread.
  */
 void cpuhp_online_idle(enum cpuhp_state state)
 {
@@ -1100,6 +1099,12 @@ void cpuhp_online_idle(enum cpuhp_state state)
 	if (state != CPUHP_AP_ONLINE_IDLE)
 		return;
 
+	/*
+	 * Unpark the stopper thread before we start the idle loop (and start
+	 * scheduling); this ensures the stopper task is always available.
+	 */
+	stop_machine_unpark(smp_processor_id());
+
 	st->state = CPUHP_AP_ONLINE_IDLE;
 	complete_ap_thread(st, true);
 }
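
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: every `model_*` name is hypothetical, and the "complete without running" behaviour is taken from the commit message's description rather than from the stop_machine sources. It shows that, with the old order, a stop request issued after the AP starts scheduling but before the BP unparks the stopper is dropped, whereas with the new order no such window exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU stopper state. */
struct model_stopper {
	bool enabled;		/* set once the stopper thread is unparked */
};

/* Hypothetical stand-in for the state of one CPU during bringup. */
struct model_cpu {
	bool online;		/* CPU is marked online */
	bool scheduling;	/* idle loop entered; kthreads may run here */
	struct model_stopper stopper;
};

typedef int (*model_stop_fn)(void *arg);

/*
 * Models the behaviour described in the commit message: if the stopper
 * is not enabled, the request completes immediately and fn never runs.
 * Returns true only if fn was actually executed.
 */
static bool model_stop_one_cpu(struct model_cpu *cpu, model_stop_fn fn,
			       void *arg)
{
	assert(fn != NULL);

	if (!cpu->stopper.enabled)
		return false;	/* completed, but no work was done */

	fn(arg);
	return true;
}

/* Old order, AP side: mark online and enter the idle loop. */
static void model_bringup_old_ap(struct model_cpu *cpu)
{
	cpu->online = true;
	cpu->scheduling = true;
}

/* Old order, BP side: the stopper is unparked only after the AP is idle. */
static void model_bringup_old_bp(struct model_cpu *cpu)
{
	cpu->stopper.enabled = true;
}

/* New order: the AP unparks its own stopper before it starts scheduling. */
static void model_bringup_new_ap(struct model_cpu *cpu)
{
	cpu->online = true;
	cpu->stopper.enabled = true;
	cpu->scheduling = true;
}

/* Example stop function: counts how many times it really ran. */
static int model_migrate(void *arg)
{
	*(int *)arg += 1;
	return 0;
}
```

In this model, calling `model_stop_one_cpu()` between `model_bringup_old_ap()` and `model_bringup_old_bp()` returns false and leaves the counter untouched, which corresponds to the rescuer's self-migrate being skipped. After `model_bringup_new_ap()` the same call always runs the function, because `scheduling` never becomes true while `stopper.enabled` is still false.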