x86/mm/tlb: Skip atomic operations for 'init_mm' in switch_mm_irqs_off()

Song Liu noticed switch_mm_irqs_off() taking a lot of CPU time in recent
kernels, using 1.8% of a 48 CPU system during a netperf to localhost run.
Digging into the profile, we noticed that cpumask_clear_cpu() and
cpumask_set_cpu() together take about half of the CPU time taken by
switch_mm_irqs_off().

However, the CPUs running netperf end up switching back and forth between
netperf and the idle task, which does not require changes to the
mm_cpumask. Furthermore, the init_mm cpumask ends up being the most
heavily contended one in the system.

Simply skipping changes to mm_cpumask(&init_mm) reduces overhead.

Reported-and-tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: kernel-team@fb.com
Cc: luto@kernel.org
Link: http://lkml.kernel.org/r/20180716190337.26133-8-riel@surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

parent 95b0e6357d
commit e9d8c61557

@@ -310,14 +310,21 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			sync_current_stack_to_mm(next);
 		}
 
-		/* Stop remote flushes for the previous mm */
-		VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(real_prev)) &&
-				real_prev != &init_mm);
+		/*
+		 * Stop remote flushes for the previous mm.
+		 * Skip kernel threads; we never send init_mm TLB flushing IPIs,
+		 * but the bitmap manipulation can cause cache line contention.
+		 */
+		if (real_prev != &init_mm) {
+			VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
+						mm_cpumask(real_prev)));
+			cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+		}
 
 		/*
 		 * Start remote flushes and then read tlb_gen.
 		 */
+		if (next != &init_mm)
+			cpumask_set_cpu(cpu, mm_cpumask(next));
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
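
The effect of skipping init_mm, as described in the commit message, can be modeled in user space. The following is a minimal sketch, not kernel code: it assumes C11 atomics, a single 64-bit word in place of a real cpumask, and hypothetical names (model_mm, model_init_mm, model_switch_mm, model_cpu_is_set) that do not exist in the kernel. It also leaves out the prev == next fast path and everything else switch_mm_irqs_off() does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mm_struct: one 64-bit mask, up to 64 CPUs. */
struct model_mm {
	_Atomic uint64_t cpumask;
};

/* Stand-in for init_mm, the address space used by idle and kernel threads. */
static struct model_mm model_init_mm;

/* Analogue of cpumask_test_cpu(cpu, mm_cpumask(mm)). */
static bool model_cpu_is_set(struct model_mm *mm, unsigned int cpu)
{
	return (atomic_load(&mm->cpumask) >> cpu) & 1;
}

/*
 * Model of the cpumask handling in the patched code path.
 * Returns how many atomic read-modify-write operations were issued,
 * so the saving can be observed directly.
 */
static int model_switch_mm(struct model_mm *prev, struct model_mm *next,
			   unsigned int cpu)
{
	int rmw_ops = 0;

	/* Stop remote flushes for the previous mm, unless it is init_mm. */
	if (prev != &model_init_mm) {
		/* Plays the role of VM_WARN_ON_ONCE() in the real code. */
		assert(model_cpu_is_set(prev, cpu));
		atomic_fetch_and(&prev->cpumask, ~(UINT64_C(1) << cpu));
		rmw_ops++;
	}

	/* Start remote flushes for the next mm, unless it is init_mm. */
	if (next != &model_init_mm) {
		atomic_fetch_or(&next->cpumask, UINT64_C(1) << cpu);
		rmw_ops++;
	}

	return rmw_ops;
}
```

In this model, a switch between a user mm and model_init_mm issues one atomic operation instead of the two an unconditional clear-and-set would need, and the shared model_init_mm mask is never written at all.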