OpenCloudOS-Kernel/arch/x86/mm
Mike Rapoport (IBM) 0f32b0b2bd x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
[ Upstream commit a1e2b8b36820d8c91275f207e77e91645b7c6836 ]

Qi Zheng reported crashes in a production environment and provided a
simplified example as a reproducer:

 |  For example, if we use Qemu to start a two NUMA node kernel,
 |  one of the nodes has 2M memory (less than NODE_MIN_SIZE),
 |  and the other node has 2G, then we will encounter the
 |  following panic:
 |
 |    BUG: kernel NULL pointer dereference, address: 0000000000000000
 |    <...>
 |    RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
 |    <...>
 |    Call Trace:
 |      <TASK>
 |      deactivate_slab()
 |      bootstrap()
 |      kmem_cache_init()
 |      start_kernel()
 |      secondary_startup_64_no_verify()

The crashes happen because of inconsistency between the nodemask that
has nodes with less than 4MB as memoryless, and the actual memory fed
into the core mm.

The commit:

  9391a3f9c7 ("[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing")

... that introduced minimal size of a NUMA node does not explain why
a node size cannot be less than 4MB and what boot failures this
restriction might fix.

Fixes have been submitted to the core MM code to tighten up the
memory topologies it accepts and to not crash on weird input:

  mm: page_alloc: skip memoryless nodes entirely
  mm: memory_hotplug: drop memoryless node from fallback lists

Andrew has accepted them into the -mm tree, but there are no
stable SHA1's yet.

This patch drops the limitation for minimal node size on x86:

  - which works around the crash without the fixes to the core MM.
  - makes x86 topologies less weird,
  - removes an arbitrary and undocumented limitation on NUMA topologies.

[ mingo: Improved changelog clarity. ]

Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/ZS+2qqjEO5/867br@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:19:36 +00:00
..
pat x86/mm: Remove _PAGE_DIRTY from kernel RO pages 2023-07-11 14:12:19 -07:00
Makefile x86: kmsan: handle CPU entry area 2022-10-03 14:03:26 -07:00
amdtopology.c
cpu_entry_area.c x86/mm: Do not shuffle CPU entry areas without KASLR 2023-03-22 10:42:47 -07:00
debug_pagetables.c x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:54 -07:00
dump_pagetables.c
extable.c x86-64: mm: clarify the 'positive addresses' user address rules 2023-05-03 10:37:22 -07:00
fault.c Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
highmem_32.c x86/mm: Include asm/numa.h for set_highmem_pages_init() 2023-05-18 11:56:18 -07:00
hugetlbpage.c arch/x86/mm/hugetlbpage.c: pud_huge() returns 0 when using 2-level paging 2022-11-08 15:57:25 -08:00
ident_map.c
init.c - Remove unnecessary "INVPCID single" feature tracking 2023-08-30 09:54:00 -07:00
init_32.c x86/mm: Remove Xen-PV leftovers from init_32.c 2023-06-09 11:00:21 +02:00
init_64.c mm/sparse-vmemmap: generalise vmemmap_populate_hugepages() 2022-12-11 18:12:12 -08:00
iomap_32.c
ioremap.c x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM 2023-03-26 23:42:40 +02:00
kasan_init_64.c x86/kasan: Populate shadow for shared chunk of the CPU entry area 2022-12-15 10:37:28 -08:00
kaslr.c x86/mm: Avoid using set_pgd() outside of real PGD pages 2023-06-16 11:46:42 -07:00
kmmio.c x86/mm/kmmio: Remove redundant preempt_disable() 2022-12-12 10:54:48 -05:00
kmsan_shadow.c x86: kmsan: handle CPU entry area 2022-10-03 14:03:26 -07:00
maccess.c x86/sev-es: Allow copy_from_kernel_nofault() in earlier boot 2023-11-20 11:58:53 +01:00
mem_encrypt.c virtio: replace arch_has_restricted_virtio_memory_access() 2022-06-06 08:22:01 +02:00
mem_encrypt_amd.c x86/sev: Make enc_dec_hypercall() accept a size instead of npages 2023-08-25 13:33:48 +02:00
mem_encrypt_boot.S x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros 2022-12-15 10:37:27 -08:00
mem_encrypt_identity.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
mm_internal.h
mmap.c
mmio-mod.c
numa.c x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size 2023-11-28 17:19:36 +00:00
numa_32.c
numa_64.c
numa_emulation.c
numa_internal.h
pf_in.c
pf_in.h
pgprot.c x86/mm: move protection_map[] inside the platform 2022-07-17 17:14:38 -07:00
pgtable.c Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
pgtable_32.c
physaddr.c
physaddr.h
pkeys.c x86/pkeys: Clarify PKRU_AD_KEY macro 2022-06-07 16:06:33 -07:00
pti.c x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros 2022-12-15 10:37:27 -08:00
srat.c x86/apic: Wrap APIC ID validation into an inline 2023-08-09 11:58:30 -07:00
testmmiotrace.c
tlb.c - Remove unnecessary "INVPCID single" feature tracking 2023-08-30 09:54:00 -07:00