linux-sg2042

Shakeel Butt 1f828223b7 memcg: flush lruvec stats in the refault	2021-09-23 10:09:13 -07:00

Prior to the commit `7e1c0d6f58` ("memcg: switch lruvec stats to rstat") and the commit `aa48e47e39` ("memcg: infrastructure to flush memcg stats"), each lruvec memcg stat could be off by (nr_cgroups * nr_cpus * 32) at worst and for an unbounded amount of time. The commit `aa48e47e39` moved the lruvec stats to rstat infrastructure and the commit `7e1c0d6f58` bounded the error for all the lruvec stats to (nr_cpus * 32) at worst for at most 2 seconds. More specifically, it decoupled the number of stats and the number of cgroups from the error rate.

However, this reduction in error comes at the cost of triggering the slowpath of the stats update more frequently. Previously, in the slowpath the kernel added the stats up the memcg tree. After `aa48e47e39`, the kernel triggers the async lruvec stats flush through queue_work(). This caused regression reports from the 0day kernel bot [1] as well as from the Phoronix Test Suite [2].

We tried two options to fix the regression:

1) Increase the threshold to trigger the slowpath in the lruvec stats update codepath from 32 to 512.

2) Remove the slowpath from the lruvec stats update codepath and instead flush the stats in the page refault codepath. The assumption is that the kernel flushes the stats in a timely manner, so the update tree would be small enough in the refault codepath not to cause a performance impact.

Following are the results of the will-it-scale/page_fault[1|2|3] benchmark on four settings, i.e. (1) 5.15-rc1 as baseline, (2) 5.15-rc1 with `aa48e47e39` and `7e1c0d6f58` reverted, (3) 5.15-rc1 with option-1, (4) 5.15-rc1 with option-2.

| test | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| pg_f1 | 368563 | 406277 (10.23%) | 399693 (8.44%) | 416398 (12.97%) |
| pg_f2 | 338399 | 372133 (9.96%) | 369180 (9.09%) | 381024 (12.59%) |
| pg_f3 | 500853 | 575399 (14.88%) | 570388 (13.88%) | 576083 (15.02%) |

From the above results, it seems that option-2 not only solves the regression but also improves the performance for at least these benchmarks.

Feng Tang (Intel) ran the aim7 benchmark with these two options and confirms that option-1 reduces the regression but option-2 removes the regression.

Michael Larabel (Phoronix) ran multiple benchmarks with these options and reported the results at [3]; they show that for most benchmarks option-2 removes the regression introduced by the commit `aa48e47e39` ("memcg: infrastructure to flush memcg stats").

Based on the experiment results, this patch proposes option-2 as the solution to resolve the regression.

Link: https://lore.kernel.org/all/20210726022421.GB21872@xsang-OptiPlex-9020 [1]
Link: https://www.phoronix.com/scan.php?page=article&item=linux515-compile-regress [2]
Link: https://openbenchmarking.org/result/2109226-DEBU-LINUX5104 [3]
Fixes: `aa48e47e39` ("memcg: infrastructure to flush memcg stats")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Tested-by: Michael Larabel <Michael@phoronix.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
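The two worst-case error bounds quoted in the commit message can be made concrete with a little arithmetic. The following is only an illustrative sketch, not kernel code: the helper names `worst_case_error_before()` and `worst_case_error_after()` and the macro `BATCH_THRESHOLD` are hypothetical, and the only values taken from the commit message are the per-CPU threshold of 32 and the two formulas.

```c
#include <assert.h>
#include <stdint.h>

/* Per-CPU batching threshold quoted in the commit message (hypothetical macro name). */
#define BATCH_THRESHOLD 32u

/*
 * Bound before the rstat conversion, as stated in the commit message:
 * nr_cgroups * nr_cpus * 32, with no time limit on the drift.
 */
static uint64_t worst_case_error_before(uint64_t nr_cgroups, uint64_t nr_cpus)
{
	assert(nr_cgroups > 0 && nr_cpus > 0);
	return nr_cgroups * nr_cpus * BATCH_THRESHOLD;
}

/*
 * Bound after the rstat conversion, as stated in the commit message:
 * nr_cpus * 32, independent of the number of cgroups
 * (and, per the message, held for at most 2 seconds).
 */
static uint64_t worst_case_error_after(uint64_t nr_cpus)
{
	assert(nr_cpus > 0);
	return nr_cpus * BATCH_THRESHOLD;
}
```

For example, on an assumed machine with 64 CPUs and 100 cgroups, `worst_case_error_before(100, 64)` gives 204800 while `worst_case_error_after(64)` gives 2048, which shows how the newer bound no longer grows with the number of cgroups.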
..
damon	mm/damon: add kunit tests	2021-09-08 11:50:25 -07:00
kasan	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
kfence	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
Kconfig	mm/idle_page_tracking: make PG_idle reusable	2021-09-08 11:50:24 -07:00
Kconfig.debug	mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO	2020-12-15 12:13:46 -08:00
Makefile	mm: introduce Data Access MONitor (DAMON)	2021-09-08 11:50:24 -07:00
backing-dev.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
balloon_compaction.c	mm: fix typos in comments	2021-05-07 00:26:35 -07:00
bootmem_info.c	mm/bootmem_info.c: mark __init on register_page_bootmem_info_section	2021-09-03 09:58:14 -07:00
cleancache.c	…
cma.c	mm: use proper type for cma_[alloc|release]	2021-05-05 11:27:24 -07:00
cma.h	mm: cma: support sysfs	2021-05-05 11:27:24 -07:00
cma_debug.c	mm/cma: change cma mutex to irq safe spinlock	2021-05-05 11:27:21 -07:00
cma_sysfs.c	mm: cma: support sysfs	2021-05-05 11:27:24 -07:00
compaction.c	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
debug.c	mm/debug: factor PagePoisoned out of __dump_page	2021-06-29 10:53:53 -07:00
debug_page_ref.c	…
debug_vm_pgtable.c	mm/debug_vm_pgtable: fix corrupted page flag	2021-09-03 09:58:10 -07:00
dmapool.c	mm/dmapool: use DEVICE_ATTR_RO macro	2021-06-29 10:53:52 -07:00
early_ioremap.c	mm/early_ioremap.c: remove redundant early_ioremap_shutdown()	2021-09-08 11:50:24 -07:00
fadvise.c	mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED	2020-10-13 18:38:29 -07:00
failslab.c	…
filemap.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
frontswap.c	mm/mempool: minor coding style tweaks	2021-05-05 11:27:27 -07:00
gup.c	Revert "mm/gup: remove try_get_page(), call try_get_compound_head() directly"	2021-09-07 11:03:45 -07:00
gup_test.c	selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages	2021-05-05 11:27:26 -07:00
gup_test.h	selftests/vm: gup_test: fix test flag	2021-05-05 11:27:26 -07:00
highmem.c	mm: in_irq() cleanup	2021-09-08 11:50:24 -07:00
hmm.c	mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled	2021-09-08 18:45:52 -07:00
huge_memory.c	mm,do_huge_pmd_numa_page: remove unnecessary TLB flushing code	2021-09-03 09:58:13 -07:00
hugetlb.c	mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY	2021-09-03 09:58:17 -07:00
hugetlb_cgroup.c	hugetlb: make free_huge_page irq safe	2021-05-05 11:27:22 -07:00
hugetlb_vmemmap.c	mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON	2021-06-30 20:47:26 -07:00
hugetlb_vmemmap.h	mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate	2021-06-30 20:47:25 -07:00
hwpoison-inject.c	mm: hwpoison: don't drop slab caches for offlining non-LRU page	2021-09-03 09:58:15 -07:00
init-mm.c	mm: add setup_initial_init_mm() helper	2021-07-08 11:48:21 -07:00
internal.h	mm/numa: automatically generate node migration order	2021-09-03 09:58:16 -07:00
interval_tree.c	mm/interval_tree: add comments to improve code readability	2021-04-30 11:20:38 -07:00
io-mapping.c	mm: add a io_mapping_map_user helper	2021-04-30 11:20:39 -07:00
ioremap.c	mm: move ioremap_page_range to vmalloc.c	2021-09-08 11:50:24 -07:00
khugepaged.c	huge tmpfs: SGP_NOALLOC to stop collapse_file() on race	2021-09-03 09:58:12 -07:00
kmemleak.c	mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp	2021-09-08 18:45:53 -07:00
ksm.c	mm/ksm: remove old GCC 4.9+ check	2021-09-13 10:18:28 -07:00
list_lru.c	mm: vmscan: consolidate shrinker_maps handling code	2021-05-05 11:27:23 -07:00
maccess.c	ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault	2021-08-20 11:39:25 +01:00
madvise.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
mapping_dirty_helpers.c	mm/mapping_dirty_helpers: remove double Note in kerneldoc	2021-07-01 11:06:02 -07:00
memblock.c	memblock: introduce saner 'memblock_free_ptr()' interface	2021-09-14 13:23:22 -07:00
memcontrol.c	memcg: flush lruvec stats in the refault	2021-09-23 10:09:13 -07:00
memfd.c	Reimplement RLIMIT_MEMLOCK on top of ucounts	2021-04-30 14:14:02 -05:00
memory-failure.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
memory.c	afs: Fix mmap coherency vs 3rd-party changes	2021-09-13 09:10:39 +01:00
memory_hotplug.c	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
mempolicy.c	Merge branches 'akpm' and 'akpm-hotfixes' (patches from Andrew)	2021-09-08 18:52:05 -07:00
mempool.c	kasan: use separate (un)poison implementation for integrated init	2021-06-04 19:32:21 +01:00
memremap.c	mm/memory_hotplug: remove nid parameter from arch_remove_memory()	2021-09-08 11:50:23 -07:00
memtest.c	…
migrate.c	compat: remove some compat entry points	2021-09-08 15:32:35 -07:00
mincore.c	inode: make init and permission helpers idmapped mount aware	2021-01-24 14:27:16 +01:00
mlock.c	mm: introduce memfd_secret system call to create "secret" memory areas	2021-07-08 11:48:21 -07:00
mm_init.c	include/linux/page-flags-layout.h: cleanups	2021-04-30 11:20:42 -07:00
mmap.c	Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux	2021-09-04 11:35:47 -07:00
mmap_lock.c	mm: mmap_lock: fix disabling preemption directly	2021-07-23 17:43:28 -07:00
mmu_gather.c	mm: eliminate "expecting prototype" kernel-doc warnings	2021-04-16 16:10:36 -07:00
mmu_notifier.c	mm/mmu_notifiers: ensure range_end() is paired with range_start()	2021-03-25 09:22:55 -07:00
mmzone.c	mm/lru: replace pgdat lru_lock with lruvec lock	2020-12-15 14:48:04 -08:00
mprotect.c	mm: device exclusive memory access	2021-07-01 11:06:03 -07:00
mremap.c	mm/mremap: fix memory account on do_munmap() failure	2021-09-03 09:58:14 -07:00
msync.c	mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start	2021-04-30 11:20:37 -07:00
nommu.c	Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux	2021-09-04 11:35:47 -07:00
oom_kill.c	mm: introduce process_mrelease system call	2021-09-03 09:58:17 -07:00
page-writeback.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
page_alloc.c	mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype	2021-09-08 18:45:53 -07:00
page_counter.c	mm: page_counter: mitigate consequences of a page_counter underflow	2021-04-30 11:20:38 -07:00
page_ext.c	mm/idle_page_tracking: make PG_idle reusable	2021-09-08 11:50:24 -07:00
page_idle.c	mm/idle_page_tracking: make PG_idle reusable	2021-09-08 11:50:24 -07:00
page_io.c	swap: fix swapfile read/write offset	2021-03-02 17:25:46 -07:00
page_isolation.c	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
page_owner.c	mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE	2021-09-08 11:50:22 -07:00
page_poison.c	mm: page_poison: print page info when corruption is caught	2021-04-30 11:20:36 -07:00
page_reporting.c	mm/page_reporting: allow driver to specify reporting order	2021-06-29 10:53:47 -07:00
page_reporting.h	mm/page_reporting: export reporting order as module parameter	2021-06-29 10:53:47 -07:00
page_vma_mapped.c	mm: device exclusive memory access	2021-07-01 11:06:03 -07:00
pagewalk.c	mm: pagewalk: fix walk for hugepage tables	2021-06-29 10:53:49 -07:00
percpu-internal.h	Merge branch 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu	2021-07-01 17:17:24 -07:00
percpu-km.c	percpu: flush tlb in pcpu_reclaim_populated()	2021-07-04 18:30:17 +00:00
percpu-stats.c	percpu: rework memcg accounting	2021-06-05 20:43:15 +00:00
percpu-vm.c	percpu: flush tlb in pcpu_reclaim_populated()	2021-07-04 18:30:17 +00:00
percpu.c	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
pgalloc-track.h	mm: fix typos in comments	2021-05-07 00:26:35 -07:00
pgtable-generic.c	mm/thp: fix __split_huge_pmd_locked() on shmem migration entry	2021-06-16 09:24:42 -07:00
process_vm_access.c	mm/process_vm_access.c: remove duplicate include	2021-05-05 11:27:27 -07:00
ptdump.c	mm: ptdump: fix build failure	2021-04-16 16:10:37 -07:00
readahead.c	mm: Protect operations adding pages to page cache with invalidate_lock	2021-07-13 13:14:27 +02:00
rmap.c	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
rodata_test.c	mm/rodata_test.c: fix missing function declaration	2020-08-21 09:52:53 -07:00
secretmem.c	mm/secretmem: use refcount_t instead of atomic_t	2021-09-08 11:50:24 -07:00
shmem.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
shuffle.c	mm: eliminate "expecting prototype" kernel-doc warnings	2021-04-16 16:10:36 -07:00
shuffle.h	mm/shuffle: fix section mismatch warning	2021-05-22 15:09:07 -10:00
slab.c	mm: fix typos in comments	2021-05-07 00:26:35 -07:00
slab.h	mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()	2021-07-30 10:14:39 -07:00
slab_common.c	mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context	2021-09-04 01:12:23 +02:00
slob.c	mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels	2021-03-08 14:18:46 -08:00
slub.c	mm, slub: convert kmem_cpu_slab protection to local_lock	2021-09-04 10:22:01 +02:00
sparse-vmemmap.c	mm: sparsemem: split the huge PMD mapping of vmemmap pages	2021-06-30 20:47:26 -07:00
sparse.c	mm: introduce memmap_alloc() to unify memory map allocation	2021-09-03 09:58:15 -07:00
swap.c	mm: delete unused get_kernel_page()	2021-09-03 09:58:11 -07:00
swap_cgroup.c	mm: memcontrol: make swap tracking an integral part of memory control	2020-06-03 20:09:48 -07:00
swap_slots.c	mm: Replace deprecated CPU-hotplug functions.	2021-08-28 01:46:17 +02:00
swap_state.c	Revert "mm: swap: check if swap backing device is congested or not"	2021-08-20 11:31:42 -07:00
swapfile.c	mm, memcg: inline swap-related functions to improve disabled memcg config	2021-09-03 09:58:12 -07:00
truncate.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
usercopy.c	mm/usercopy.c: delete duplicated word	2020-08-12 10:57:58 -07:00
userfaultfd.c	userfaultfd: change mmap_changing to atomic	2021-09-03 09:58:16 -07:00
util.c	mm: don't allow oversized kvmalloc() calls	2021-09-02 09:47:01 -07:00
vmacache.c	kernel: better document the use_mm/unuse_mm API contract	2020-06-10 19:14:18 -07:00
vmalloc.c	Merge branch 'akpm' (patches from Andrew)	2021-09-08 12:55:35 -07:00
vmpressure.c	mm/vmpressure: replace vmpressure_to_css() with vmpressure_to_memcg()	2021-09-03 09:58:17 -07:00
vmscan.c	mm,vmscan: fix divide by zero in get_scan_count	2021-09-08 18:45:53 -07:00
vmstat.c	mm/vmstat: protect per cpu variables with preempt disable on RT	2021-09-08 15:32:34 -07:00
workingset.c	memcg: flush lruvec stats in the refault	2021-09-23 10:09:13 -07:00
z3fold.c	mm/z3fold: add kerneldoc fields for z3fold_pool	2021-07-01 11:06:03 -07:00
zbud.c	mm/zbud: add kerneldoc fields for zbud_pool	2021-07-01 11:06:03 -07:00
zpool.c	mm: fix typos in comments	2021-05-07 00:26:35 -07:00
zsmalloc.c	mm/zsmalloc.c: improve readability for async_free_zspage()	2021-07-01 11:06:02 -07:00
zswap.c	mm/zswap.c: fix two bugs in zswap_writeback_entry()	2021-06-30 20:47:31 -07:00