powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN

On a 16-socket 192-core POWER8 system, the context_switch1_threads
benchmark from will-it-scale (see earlier changelog), upstream can achieve
a rate of about 1 million context switches per second, due to contention
on the mm refcount.

64s meets the prerequisites for CONFIG_MMU_LAZY_TLB_SHOOTDOWN, so enable
the option.  This increases the above benchmark to 118 million context
switches per second.

This generates 314 additional IPI interrupts on a 144 CPU system doing a
kernel compile, which is in the noise in terms of kernel cycles.

Link: https://lkml.kernel.org/r/20230203071837.1136453-6-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This commit is contained in:
Nicholas Piggin 2023-02-03 17:18:37 +10:00 committed by Andrew Morton
parent 2655421ae6
commit 77f68ebeee
1 changed files with 1 additions and 0 deletions

View File

@ -266,6 +266,7 @@ config PPC
select MMU_GATHER_PAGE_SIZE
select MMU_GATHER_RCU_TABLE_FREE
select MMU_GATHER_MERGE_VMAS
select MMU_LAZY_TLB_SHOOTDOWN if PPC_BOOK3S_64
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE if PPC64 || NOT_COHERENT_CACHE
select NEED_PER_CPU_EMBED_FIRST_CHUNK if PPC64