OpenCloudOS-Kernel

Mel Gorman 1378447598 sched/numa: Stagger NUMA balancing scan periods for new threads	2018-05-14 09:12:24 +02:00

Threads share an address space and each can change the protections of the same address space to trap NUMA faults. This is redundant and potentially counter-productive as any thread doing the update will suffice. Potentially only one thread is required but that thread may be idle or it may not have any locality concerns and pick an unsuitable scan rate.

This patch uses independent scan periods, but they are staggered based on the number of address space users when the thread is created. The intent is that threads will avoid scanning at the same time and have a chance to adapt their scan rate later if necessary. This reduces the total scan activity early in the lifetime of the threads.

The difference in headline performance across a range of machines and workloads is marginal but the system CPU usage is reduced as well as overall scan activity. The following is the time reported by NAS Parallel Benchmark using unbound openmp threads and a D size class:

                      4.17.0-rc1             4.17.0-rc1
                         vanilla           stagger-v1r1
Time bt.D      442.77 (   0.00%)      419.70 (   5.21%)
Time cg.D      171.90 (   0.00%)      180.85 (  -5.21%)
Time ep.D       33.10 (   0.00%)       32.90 (   0.60%)
Time is.D        9.59 (   0.00%)        9.42 (   1.77%)
Time lu.D      306.75 (   0.00%)      304.65 (   0.68%)
Time mg.D       54.56 (   0.00%)       52.38 (   4.00%)
Time sp.D     1020.03 (   0.00%)      903.77 (  11.40%)
Time ua.D      400.58 (   0.00%)      386.49 (   3.52%)

Note it's not a universal win: we have no prior knowledge of which thread matters, and the number of threads created often exceeds the size of the node when the threads are not bound. However, there is a reduction of overall system CPU usage:

                          4.17.0-rc1             4.17.0-rc1
                             vanilla           stagger-v1r1
sys-time-bt.D       48.78 (   0.00%)       48.22 (   1.15%)
sys-time-cg.D       25.31 (   0.00%)       26.63 (  -5.22%)
sys-time-ep.D        1.65 (   0.00%)        0.62 (  62.42%)
sys-time-is.D       40.05 (   0.00%)       24.45 (  38.95%)
sys-time-lu.D       37.55 (   0.00%)       29.02 (  22.72%)
sys-time-mg.D       47.52 (   0.00%)       34.92 (  26.52%)
sys-time-sp.D      119.01 (   0.00%)      109.05 (   8.37%)
sys-time-ua.D       51.52 (   0.00%)       45.13 (  12.40%)

NUMA scan activity is also reduced:

NUMA alloc local               1042828     1342670
NUMA base PTE updates        140481138    93577468
NUMA huge PMD updates           272171      180766
NUMA page range updates      279832690   186129660
NUMA hint faults               1395972     1193897
NUMA hint local faults          877925      855053
NUMA hint local percent             62          71
NUMA pages migrated           12057909     9158023

Similar observations are made for other thread-intensive workloads. System CPU usage is lower even though the headline gains in performance tend to be small. For example, specjbb 2005 shows almost no difference in performance but scan activity is reduced by a third on a 4-socket box. I didn't find a workload (thread intensive or otherwise) that suffered badly.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504154109.mvrha2qo5wdl65vr@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
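The staggering idea in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code from the patch: the function name `staggered_first_scan_delay_ms`, the `ScanConfig` fields, and the default values (a 1000 ms base period capped at 60000 ms) are all hypothetical and chosen for illustration. It assumes the delay before a new thread's first scan grows linearly with the number of address space users at creation time, up to a cap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanConfig:
    # Hypothetical values for illustration; not taken from the patch.
    scan_period_ms: int = 1000       # assumed base scan period of the creating thread
    scan_period_max_ms: int = 60000  # assumed upper bound on any scan delay


def staggered_first_scan_delay_ms(mm_users: int, cfg: ScanConfig = ScanConfig()) -> int:
    """Return the modelled delay before a new thread's first NUMA scan.

    mm_users is the number of users of the shared address space at the
    moment the thread is created. More existing users means a later first
    scan, so sibling threads do not all scan at the same time.
    """
    if mm_users < 1:
        raise ValueError("an address space has at least one user")
    # Linear stagger, capped so late-created threads still scan eventually.
    return min(cfg.scan_period_max_ms, cfg.scan_period_ms * mm_users)


def stagger_schedule(thread_count: int, cfg: ScanConfig = ScanConfig()) -> list[int]:
    """Delays for threads created one after another in the same process."""
    return [staggered_first_scan_delay_ms(n, cfg) for n in range(1, thread_count + 1)]
```

In this model, `stagger_schedule(4)` spreads the first scans of four sibling threads over four consecutive base periods instead of letting them all start together. The cap is a design choice of the sketch: without it, a process with hundreds of threads would push the last thread's first scan out indefinitely.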
bpf	bpf: use array_index_nospec in find_prog_type	2018-05-03 19:29:35 -07:00
cgroup	Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2018-04-03 18:00:13 -07:00
configs	KVM changes for 4.16	2018-02-10 13:16:35 -08:00
debug	* Fix 2032 time access issues and new compiler warnings	2018-04-12 10:21:19 -07:00
events	perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]	2018-05-05 08:37:27 +02:00
gcov	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
irq	genirq/affinity: Spread irq vectors among present CPUs as far as possible	2018-04-06 12:19:51 +02:00
livepatch	livepatch: Allow to call a custom callback when freeing shadow variables	2018-04-17 13:42:48 +02:00
locking	locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches	2018-03-31 07:30:50 +02:00
power	PM / QoS: mark expected switch fall-throughs	2018-04-09 13:49:40 +02:00
printk	New features:	2018-04-10 11:27:30 -07:00
rcu	Merge branches 'fixes.2018.02.23a', 'srcu.2018.02.20a' and 'torture.2018.02.20a' into HEAD	2018-02-23 15:15:41 -08:00
sched	sched/numa: Stagger NUMA balancing scan periods for new threads	2018-05-14 09:12:24 +02:00
time	clocksource: Rework stale comment	2018-05-02 16:10:41 +02:00
trace	tracing: Fix regex_match_front() to not over compare the test string	2018-05-11 10:56:42 -04:00
.gitignore	…
Kconfig.freezer	…
Kconfig.hz	…
Kconfig.locks	…
Kconfig.preempt	…
Makefile	error-injection: Support fault injection framework	2018-01-12 17:33:38 -08:00
acct.c	kernel/acct.c: fix the acct->needcheck check in check_free_space()	2018-01-04 16:45:09 -08:00
async.c	kernel/async.c: revert "async: simplify lowest_in_progress()"	2018-02-06 18:32:44 -08:00
audit.c	audit/stable-4.17 PR 20180403	2018-04-06 15:01:25 -07:00
audit.h	audit: track the owner of the command mutex ourselves	2018-02-23 11:22:22 -05:00
audit_fsnotify.c	…
audit_tree.c	audit: track the owner of the command mutex ourselves	2018-02-23 11:22:22 -05:00
audit_watch.c	audit/stable-4.13 PR 20170816	2017-08-16 16:48:34 -07:00
auditfilter.c	audit: deprecate the AUDIT_FILTER_ENTRY filter	2018-02-15 14:36:29 -05:00
auditsc.c	audit: bail before bug check if audit disabled	2018-02-15 14:40:25 -05:00
backtracetest.c	…
bounds.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
capability.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
compat.c	compat: fix 4-byte infoleak via uninitialized struct field	2018-05-10 17:51:58 -07:00
configs.c	…
context_tracking.c	…
cpu.c	cpu/hotplug: Fix unused function warning	2018-03-15 20:34:40 +01:00
cpu_pm.c	PM / CPU: replace raw_notifier with atomic_notifier	2017-07-31 13:09:49 +02:00
crash_core.c	kexec: export PG_swapbacked to VMCOREINFO	2018-04-13 17:10:27 -07:00
crash_dump.c	…
cred.c	…
delayacct.c	delayacct: Account blkio completion on the correct task	2018-01-16 03:29:36 +01:00
dma.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
elfcore.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
exec_domain.c	get rid of pointless includes of fs_struct.h	2018-02-22 14:28:50 -05:00
exit.c	kernel: use kernel_wait4() instead of sys_wait4()	2018-04-02 20:14:51 +02:00
extable.c	extable: Make init_kernel_text() global	2018-02-21 16:54:06 +01:00
fail_function.c	error-injection: Fix to prohibit jump optimization	2018-03-12 16:16:00 +01:00
fork.c	fork: unconditionally clear stack on fork	2018-04-20 17:18:35 -07:00
freezer.c	…
futex.c	pids: introduce find_get_task_by_vpid() helper	2018-02-06 18:32:46 -08:00
futex_compat.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
groups.c	kernel: make groups_sort calling a responsibility group_info allocators	2017-12-14 16:00:49 -08:00
hung_task.c	…
irq_work.c	irq/work: Improve the flag definitions	2018-01-08 19:43:15 +01:00
jump_label.c	jump_label: Disable jump labels in __exit code	2018-03-20 08:57:17 +01:00
kallsyms.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk	2018-02-01 13:36:15 -08:00
kcmp.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
kcov.c	kcov: detect double association with a single task	2018-02-06 18:32:46 -08:00
kexec.c	kexec: call do_kexec_load() in compat syscall directly	2018-04-02 20:15:01 +02:00
kexec_core.c	x86/mm, kexec: Allow kexec to be used with SME	2017-07-18 11:38:04 +02:00
kexec_file.c	kernel/kexec_file.c: allow archs to set purgatory load address	2018-04-13 17:10:28 -07:00
kexec_internal.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
kmod.c	kmod: move #ifdef CONFIG_MODULES wrapper to Makefile	2017-09-08 18:26:51 -07:00
kprobes.c	kprobes: Fix random address output of blacklist file	2018-04-25 10:27:56 -04:00
ksysfs.c	kexec: move vmcoreinfo out of the kernel's .bss section	2017-07-12 16:25:59 -07:00
kthread.c	kthread, sched/wait: Fix kthread_parkme() completion issue	2018-05-03 07:38:05 +02:00
latencytop.c	…
memremap.c	kernel/memremap: Remove stale devres_free() call	2018-03-06 10:58:54 -08:00
module-internal.h	…
module.c	init: fix false positives in W+X checking	2018-05-11 17:28:45 -07:00
module_signing.c	…
notifier.c	…
nsproxy.c	…
padata.c	padata: add SPDX identifier	2018-01-05 18:43:00 +11:00
panic.c	taint: add taint for randstruct	2018-04-11 10:28:35 -07:00
params.c	kernel/params.c: downgrade warning for unsafe parameters	2018-04-11 10:28:37 -07:00
pid.c	xarray: add the xa_lock to the radix_tree_root	2018-04-11 10:28:39 -07:00
pid_namespace.c	Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2018-04-03 19:15:32 -07:00
profile.c	…
ptrace.c	pids: introduce find_get_task_by_vpid() helper	2018-02-06 18:32:46 -08:00
range.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
reboot.c	kernel/reboot.c: add devm_register_reboot_notifier()	2017-11-17 16:10:04 -08:00
relay.c	kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE	2018-02-21 15:35:43 -08:00
resource.c	resource: fix integer overflow at reallocation	2018-04-13 17:10:27 -07:00
seccomp.c	- Fix seccomp GET_METADATA to deal with field sizes correctly (Tycho Andersen)	2018-02-22 10:50:24 -08:00
signal.c	sched/core: Introduce set_special_state()	2018-05-04 07:54:54 +02:00
smp.c	smp/core: Use lockdep to assert IRQs are disabled/enabled	2017-11-08 11:13:50 +01:00
smpboot.c	watchdog/core, powerpc: Lock cpus across reconfiguration	2017-10-04 10:53:54 +02:00
smpboot.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
softirq.c	softirq: Consolidate common code in tasklet_[hi]_action()	2018-03-09 11:50:55 +01:00
stacktrace.c	…
stop_machine.c	stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock	2018-05-03 07:38:03 +02:00
sys.c	kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()	2018-04-02 20:16:06 +02:00
sys_ni.c	syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls	2018-04-05 16:59:38 +02:00
sysctl.c	kernel/sysctl.c: add kdoc comments to do_proc_do{u}intvec_minmax_conv_param	2018-04-11 10:28:38 -07:00
sysctl_binary.c	staging: irda: remove remaining remants of irda code removal	2018-04-16 11:26:49 +02:00
task_work.c	locking/barriers: Convert users of lockless_dereference() to READ_ONCE()	2017-12-17 13:57:15 +01:00
taskstats.c	pids: introduce find_get_task_by_vpid() helper	2018-02-06 18:32:46 -08:00
test_kprobes.c	kprobes: Disable the jprobes test code	2017-10-20 11:02:54 +02:00
torture.c	torture: Save a line in stutter_wait(): while -> for	2017-12-11 09:18:30 -08:00
tracepoint.c	tracepoint: Do not warn on ENOMEM	2018-04-30 12:09:56 -04:00
tsacct.c	…
ucount.c	headers: untangle kmemleak.h from mm.h	2018-04-05 21:36:27 -07:00
uid16.c	fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers	2018-04-02 20:15:59 +02:00
uid16.h	kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c	2018-04-02 20:15:30 +02:00
umh.c	kernel: use kernel_wait4() instead of sys_wait4()	2018-04-02 20:14:51 +02:00
up.c	smp: Avoid using two cache lines for struct call_single_data	2017-08-29 15:14:38 +02:00
user-return-notifier.c	…
user.c	efivarfs: Limit the rate for non-root to read files	2018-02-22 10:21:02 -08:00
user_namespace.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2017-11-16 12:20:15 -08:00
utsname.c	uts: create "struct uts_namespace" from kmem_cache	2018-04-11 10:28:35 -07:00
utsname_sysctl.c	…
watchdog.c	Merge branch 'linus' into sched/core, to pick up fixes	2017-11-08 10:17:15 +01:00
watchdog_hld.c	Merge branch 'linus' into core/urgent, to pick up dependent commits	2017-11-04 08:53:04 +01:00
workqueue.c	Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2018-04-03 18:00:13 -07:00
workqueue_internal.h	Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2017-11-06 12:26:49 -08:00