sched/numa: Use select_idle_sibling() to select a destination for task_numa_move()

The code in task_numa_compare() will only examine at most one idle CPU per node,
because they all have the same score. However, some idle CPUs are better
candidates than others, due to busy or idle SMT siblings, etc...

The scheduler has logic to find the best CPU within an LLC to place a
task. The NUMA code should probably use it.

This seems to reduce the standard deviation for single instance SPECjbb2005
with a low warehouse count on my 4 node test system.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: mgorman@suse.de
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140904163530.189d410a@cuia.bos.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit is contained in:
Rik van Riel 2014-09-04 16:35:30 -04:00 committed by Ingo Molnar
parent 13924d2a98
commit ba7e5a279e
1 changed files with 8 additions and 0 deletions

View File

@ -665,6 +665,7 @@ static u64 sched_vslice(struct cfs_rq *cfs_rq, struct sched_entity *se)
}
#ifdef CONFIG_SMP
static int select_idle_sibling(struct task_struct *p, int cpu);
static unsigned long task_h_load(struct task_struct *p);
static inline void __update_task_entity_contrib(struct sched_entity *se);
@ -1257,6 +1258,13 @@ balance:
if (load_too_imbalanced(src_load, dst_load, env))
goto unlock;
/*
* One idle CPU per node is evaluated for a task numa move.
* Call select_idle_sibling to maybe find a better one.
*/
if (!cur)
env->dst_cpu = select_idle_sibling(env->p, env->dst_cpu);
assign:
task_numa_assign(env, cur, imp);
unlock: