linux-sg2042

Linus Torvalds edf445ad7c Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes)	2019-09-28 14:26:47 -07:00

Merge hugepage allocation updates from David Rientjes:
 "We (mostly Linus, Andrea, and myself) have been discussing offlist how
  to implement a sane default allocation strategy for hugepages on NUMA
  platforms.

  With these reverts in place, the page allocator will happily allocate
  a remote hugepage immediately rather than try to make a local hugepage
  available. This incurs a substantial performance degradation when
  memory compaction would have otherwise made a local hugepage
  available.

  This series reverts those reverts and attempts to propose a more sane
  default allocation strategy specifically for hugepages. Andrea
  acknowledges this is likely to fix the swap storms that he originally
  reported that resulted in the patches that removed __GFP_THISNODE from
  hugepage allocations.

  The immediate goal is to return 5.3 to the behavior the kernel has
  implemented over the past several years so that remote hugepages are
  not immediately allocated when local hugepages could have been made
  available because the increased access latency is untenable.

  The next goal is to introduce a sane default allocation strategy for
  hugepages allocations in general regardless of the configuration of
  the system so that we prevent thrashing of local memory when
  compaction is unlikely to succeed and can prefer remote hugepages over
  remote native pages when the local node is low on memory."

Note on timing: this reverts the hugepage VM behavior changes that got
introduced fairly late in the 5.3 cycle, and that fixed a huge
performance regression for certain loads that had been around since
4.18.

Andrea had this note:

 "The regression of 4.18 was that it was taking hours to start a VM
  where 3.10 was only taking a few seconds, I reported all the details
  on lkml when it was finally tracked down in August 2018.

     https://lore.kernel.org/linux-mm/20180820032640.9896-2-aarcange@redhat.com/

  __GFP_THISNODE in MADV_HUGEPAGE made the above enterprise vfio
  workload degrade like in the "current upstream" above. And it still
  would have been that bad as above until 5.3-rc5"

where the bad behavior ends up happening as you fill up a local node,
and without that change, you'd get into the nasty swap storm behavior
due to compaction working overtime to make room for more memory on the
nodes.

As a result 5.3 got the two performance fix reverts in rc5.

However, David Rientjes then noted that those performance fixes in turn
regressed performance for other loads - although not quite to the same
degree. He suggested reverting the reverts and instead replacing them
with two small changes to how hugepage allocations are done (patch
descriptions rephrased by me):

 - "avoid expensive reclaim when compaction may not succeed": just admit
   that the allocation failed when you're trying to allocate a huge-page
   and compaction wasn't successful.

 - "allow hugepage fallback to remote nodes when madvised": when that
   node-local huge-page allocation failed, retry without forcing the
   local node.

but by then I judged it too late to replace the fixes for a 5.3 release.
So 5.3 was released with behavior that harked back to the pre-4.18
logic.

But now we're in the merge window for 5.4, and we can see if this
alternate model fixes not just the horrendous swap storm behavior, but
also restores the performance regression that the late reverts caused.

Fingers crossed.

* emailed patches from David Rientjes <rientjes@google.com>:
  mm, page_alloc: allow hugepage fallback to remote nodes when madvised
  mm, page_alloc: avoid expensive reclaim when compaction may not succeed
  Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask""
  Revert "Revert "mm, thp: restore node-local hugepage allocations""
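
The two replacement changes described in the merge message above can be read as a small decision procedure. The sketch below is only a toy model of that description, not the kernel's implementation: the names `thp_result`, `node_state`, and `model_thp_alloc` are hypothetical and appear nowhere in the tree. It also collapses the "retry without forcing the local node" step into a single check for whether a remote hugepage is available.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this sketch; not kernel identifiers. */
enum thp_result {
	THP_LOCAL,	/* hugepage obtained on the local node */
	THP_REMOTE,	/* hugepage obtained on a remote node */
	THP_FAILED	/* no hugepage; caller falls back to native pages */
};

/* Hypothetical snapshot of the conditions the merge message talks about. */
struct node_state {
	bool local_free_hugepage;	/* a local hugepage is free right now */
	bool local_compaction_ok;	/* compaction can make a local hugepage */
	bool remote_free_hugepage;	/* a remote node can supply a hugepage */
};

/*
 * Toy model of the described policy:
 *  1. Try the local node first, including compaction.
 *  2. If compaction was not successful, do not enter expensive reclaim;
 *     treat the node-local attempt as failed.
 *  3. Only when the range was madvised, retry without forcing the
 *     local node, which may yield a remote hugepage.
 */
enum thp_result model_thp_alloc(const struct node_state *s, bool madvised)
{
	if (s->local_free_hugepage || s->local_compaction_ok)
		return THP_LOCAL;

	/* Node-local attempt failed; no reclaim is attempted in this model. */
	if (madvised && s->remote_free_hugepage)
		return THP_REMOTE;

	return THP_FAILED;
}
```

In this model a non-madvised fault on a full local node returns `THP_FAILED` even when remote hugepages exist, which mirrors the message's point that remote hugepages should not be handed out immediately by default.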
..
kasan	mm: introduce compound_nr()	2019-09-24 15:54:08 -07:00
Kconfig	mm,thp: add read-only THP support for (non-shmem) FS	2019-09-24 15:54:11 -07:00
Kconfig.debug	mm, page_owner, debug_pagealloc: save and dump freeing stack trace	2019-09-24 15:54:08 -07:00
Makefile	mm: silence -Woverride-init/initializer-overrides	2019-09-24 15:54:10 -07:00
backing-dev.c	writeback: Separate out wb_get_lookup() from wb_get_create()	2019-08-27 09:22:38 -06:00
balloon_compaction.c	mm/balloon_compaction: suppress allocation warnings	2019-09-04 07:42:01 -04:00
cleancache.c	Driver Core and debugfs changes for 5.3-rc1	2019-07-12 12:24:03 -07:00
cma.c	mm/cma.c: fail if fixed declaration can't be honored	2019-07-16 19:23:21 -07:00
cma.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
cma_debug.c	mm/cma_debug.c: fix the break condition in cma_maxchunk_get()	2019-05-14 09:47:45 -07:00
compaction.c	mm/compaction.c: remove unnecessary zone parameter in isolate_migratepages()	2019-09-24 15:54:10 -07:00
debug.c	mm: update references to page _refcount	2019-05-14 19:52:47 -07:00
debug_page_ref.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
dmapool.c	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options	2019-07-12 11:05:46 -07:00
early_ioremap.c	mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep	2017-12-11 14:54:44 +01:00
fadvise.c	fs: Export generic_fadvise()	2019-08-30 22:43:58 -07:00
failslab.c	mm/failslab.c: by default, do not fail allocations with direct reclaim only	2019-07-12 11:05:43 -07:00
filemap.c	mm,thp: avoid writes to file with THP in pagecache	2019-09-24 15:54:11 -07:00
frame_vector.c	mm: untag user pointers in get_vaddr_frames	2019-09-25 17:51:41 -07:00
frontswap.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482	2019-06-19 17:09:52 +02:00
gup.c	mm: untag user pointers in mm/gup.c	2019-09-25 17:51:41 -07:00
gup_benchmark.c	mm/gup: replace get_user_pages_longterm() with FOLL_LONGTERM	2019-05-14 09:47:45 -07:00
highmem.c	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
hmm.c	pagewalk: separate function pointers from iterator data	2019-09-07 04:28:04 -03:00
huge_memory.c	Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes)	2019-09-28 14:26:47 -07:00
hugetlb.c	hugetlbfs: don't retry when pool page allocations start to fail	2019-09-24 15:54:10 -07:00
hugetlb_cgroup.c	mm: introduce compound_nr()	2019-09-24 15:54:08 -07:00
hwpoison-inject.c	hwpoison-inject: no need to check return value of debugfs_create functions	2019-06-03 15:39:40 +02:00
init-mm.c	mm: use CPU_BITS_NONE to initialize init_mm.cpu_bitmask	2019-09-24 15:54:10 -07:00
internal.h	mm: introduce MADV_COLD	2019-09-25 17:51:41 -07:00
interval_tree.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248	2019-06-19 17:09:08 +02:00
khugepaged.c	khugepaged: enable collapse pmd for pte-mapped THP	2019-09-24 15:54:11 -07:00
kmemleak-test.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333	2019-06-05 17:37:06 +02:00
kmemleak.c	mm/kmemleak.c: record the current memory pool size	2019-09-24 15:54:07 -07:00
ksm.c	mm: move memcmp_pages() and pages_identical()	2019-09-24 15:54:11 -07:00
list_lru.c	mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages	2019-07-12 11:05:44 -07:00
maccess.c	The main changes in this release include:	2019-07-18 11:51:00 -07:00
madvise.c	mm: factor out common parts between MADV_COLD and MADV_PAGEOUT	2019-09-25 17:51:41 -07:00
memblock.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	2019-05-30 11:26:32 -07:00
memcontrol.c	memcg, kmem: do not fail __GFP_NOFAIL charges	2019-09-25 17:51:39 -07:00
memfd.c	mm: page cache: store only head pages in i_pages	2019-09-24 15:54:08 -07:00
memory-failure.c	HMM patches for 5.3	2019-07-14 19:42:11 -07:00
memory.c	mm: do not hash address in print_bad_pte()	2019-09-24 15:54:09 -07:00
memory_hotplug.c	mm/memory_hotplug.c: s/is/if	2019-09-24 15:54:09 -07:00
mempolicy.c	Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes)	2019-09-28 14:26:47 -07:00
mempool.c	docs/core-api/mm: fix return value descriptions in mm/	2019-03-05 21:07:20 -08:00
memremap.c	Merge branch 'odp_fixes' into hmm.git	2019-08-21 20:58:18 -03:00
memtest.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
migrate.c	mm: untag user pointers passed to memory syscalls	2019-09-25 17:51:41 -07:00
mincore.c	mm: untag user pointers passed to memory syscalls	2019-09-25 17:51:41 -07:00
mlock.c	mm: untag user pointers passed to memory syscalls	2019-09-25 17:51:41 -07:00
mm_init.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
mmap.c	mm: untag user pointers in mmap/munmap/mremap/brk	2019-09-25 17:51:41 -07:00
mmu_context.c	sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h>	2017-03-02 08:42:38 +01:00
mmu_gather.c	mm: remove quicklist page table caches	2019-09-24 15:54:09 -07:00
mmu_notifier.c	mm, notifier: Catch sleeping/blocking for !blockable	2019-09-07 04:28:05 -03:00
mmzone.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
mprotect.c	mm: untag user pointers passed to memory syscalls	2019-09-25 17:51:41 -07:00
mremap.c	mm: untag user pointers in mmap/munmap/mremap/brk	2019-09-25 17:51:41 -07:00
msync.c	mm: untag user pointers passed to memory syscalls	2019-09-25 17:51:41 -07:00
nommu.c	mm: introduce page_size()	2019-09-24 15:54:08 -07:00
oom_kill.c	mm: introduce MADV_COLD	2019-09-25 17:51:41 -07:00
page-writeback.c	writeback, memcg: Implement foreign dirty flushing	2019-08-27 09:22:38 -06:00
page_alloc.c	Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes)	2019-09-28 14:26:47 -07:00
page_counter.c	memcg: introduce memory.min	2018-06-07 17:34:36 -07:00
page_ext.c	mm, debug_pagealloc: use a page type instead of page_ext flag	2019-07-12 11:05:43 -07:00
page_idle.c	mm/page_idle.c: fix oops because end_pfn is larger than max_pfn	2019-06-29 16:43:45 +08:00
page_io.c	mm, swap: use rbtree for swap_extent	2019-07-12 11:05:43 -07:00
page_isolation.c	mm/page_isolation.c: change the prototype of undo_isolate_page_range()	2019-07-12 11:05:43 -07:00
page_owner.c	mm, page_owner, debug_pagealloc: save and dump freeing stack trace	2019-09-24 15:54:08 -07:00
page_poison.c	mm/page_poison.c: fix a typo in a comment	2019-09-24 15:54:08 -07:00
page_vma_mapped.c	mm: introduce page_size()	2019-09-24 15:54:08 -07:00
pagewalk.c	pagewalk: use lockdep_assert_held for locking validation	2019-09-07 04:28:04 -03:00
percpu-internal.h	percpu: convert chunk hints to be based on pcpu_block_md	2019-03-13 12:25:31 -07:00
percpu-km.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
percpu-stats.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
percpu-vm.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428	2019-06-05 17:37:16 +02:00
percpu.c	percpu: Use struct_size() helper	2019-09-04 13:40:49 -07:00
pgtable-generic.c	x86/mm: Page size aware flush_tlb_mm_range()	2018-10-09 16:51:11 +02:00
process_vm_access.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152	2019-05-30 11:26:32 -07:00
readahead.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
rmap.c	mm,thp: add read-only THP support for (non-shmem) FS	2019-09-24 15:54:11 -07:00
rodata_test.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441	2019-06-05 17:37:17 +02:00
shmem.c	Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes)	2019-09-28 14:26:47 -07:00
shuffle.c	mm: maintain randomization of page free lists	2019-05-14 19:52:48 -07:00
shuffle.h	mm: maintain randomization of page free lists	2019-05-14 19:52:48 -07:00
slab.c	mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options	2019-07-12 11:05:46 -07:00
slab.h	mm, slab: move memcg_cache_params structure to mm/slab.h	2019-09-24 15:54:07 -07:00
slab_common.c	mm, slab: extend slab/shrink to shrink all memcg caches	2019-09-24 15:54:07 -07:00
slob.c	mm: introduce page_size()	2019-09-24 15:54:08 -07:00
slub.c	mm: introduce page_size()	2019-09-24 15:54:08 -07:00
sparse-vmemmap.c	mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()	2019-07-18 17:08:07 -07:00
sparse.c	mm/sparse.c: remove NULL check in clear_hwpoisoned_pages()	2019-09-24 15:54:09 -07:00
swap.c	mm: introduce MADV_COLD	2019-09-25 17:51:41 -07:00
swap_cgroup.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
swap_slots.c	mm, swap, get_swap_pages: use entry_size instead of cluster in parameter	2018-08-22 10:52:44 -07:00
swap_state.c	mm: page cache: store only head pages in i_pages	2019-09-24 15:54:08 -07:00
swapfile.c	vfs: don't allow writes to swap files	2019-08-20 07:55:16 -07:00
truncate.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
usercopy.c	usercopy: Avoid HIGHMEM pfn warning	2019-09-17 15:20:17 -07:00
userfaultfd.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499	2019-06-19 17:09:53 +02:00
util.c	arm64, mm: make randomization selected by generic topdown mmap layout	2019-09-24 15:54:11 -07:00
vmacache.c	mm: get rid of vmacache_flush_all() entirely	2018-09-13 15:18:04 -10:00
vmalloc.c	augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macro	2019-09-25 17:51:39 -07:00
vmpressure.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500	2019-06-19 17:09:55 +02:00
vmscan.c	mm: introduce MADV_PAGEOUT	2019-09-25 17:51:41 -07:00
vmstat.c	mm,thp: stats for file backed THP	2019-09-24 15:54:11 -07:00
workingset.c	mm: workingset: fix vmstat counters for shadow nodes	2019-08-13 16:06:52 -07:00
z3fold.c	z3fold: fix memory leak in kmem cache	2019-09-24 15:54:10 -07:00
zbud.c	treewide: Add SPDX license identifier for more missed files	2019-05-21 10:50:45 +02:00
zpool.c	zpool: add malloc_support_movable to zpool_driver	2019-09-24 15:54:12 -07:00
zsmalloc.c	mm/zsmalloc.c: fix a -Wunused-function warning	2019-09-24 15:54:12 -07:00
zswap.c	zswap: do not map same object twice	2019-09-24 15:54:12 -07:00