OpenCloudOS-Kernel/mm
Naoya Horiguchi 2bc6b781f8 mm/hwpoison: fix error page recovered but reported "not recovered"
commit 046545a661 upstream.

When an uncorrected memory error is consumed there is a race between the CMCI from the memory controller reporting an uncorrected error with a UCNA signature, and the core reporting and SRAR signature machine check when the data is about to be consumed.

If the CMCI wins that race, the page is marked poisoned when
uc_decode_notifier() calls memory_failure() and the machine check processing code finds the page already poisoned.  It calls
kill_accessing_process() to make sure a SIGBUS is sent.  But returns the wrong error code.

Console log looks like this:

[34775.674296] mce: Uncorrected hardware memory error in user-access at 3710b3400 [34775.675413] Memory failure: 0x3710b3: recovery action for dirty LRU page: Recovered [34775.690310] Memory failure: 0x3710b3: already hardware poisoned [34775.696247] Memory failure: 0x3710b3: Sending SIGBUS to einj_mem_uc:361438 due to hardware memory corruption [34775.706072] mce: Memory error not recovered

kill_accessing_process() is supposed to return -EHWPOISON to notify that SIGBUS is already set to the process and kill_me_maybe() doesn't have to send it again.  But current code simply fails to do this, so fix it to make sure to work as intended.  This change avoids the noise message "Memory error not recovered" and skips duplicate SIGBUSs.

Intel-SIG: commit 046545a661 mm/hwpoison: fix error page recovered but reported "not
 recovered".
Backport to kernel 5.4 to enhance MCA-R.

[tony.luck@intel.com: reword some parts of commit message]
Link: https://lkml.kernel.org/r/20220113231117.1021405-1-naoya.horiguchi@linux.dev
Fixes: a3f5d80ea4 ("mm,hwpoison: send SIGBUS with error virutal address")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Youquan Song <youquan.song@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Youquan Song: amend commit log ]
Signed-off-by: Youquan Song <youquan.song@intel.com>
2024-06-11 21:18:21 +08:00
..
kasan ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
Kconfig ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
Kconfig.debug mm, page_owner, debug_pagealloc: save and dump freeing stack trace 2019-09-24 15:54:08 -07:00
Makefile mm: silence -Woverride-init/initializer-overrides 2019-09-24 15:54:10 -07:00
backing-dev.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
balloon_compaction.c mm/balloon_compaction: suppress allocation warnings 2019-09-04 07:42:01 -04:00
cleancache.c Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
cma.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
cma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cma_debug.c mm/cma_debug.c: fix the break condition in cma_maxchunk_get() 2019-05-14 09:47:45 -07:00
compaction.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
debug.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
debug_page_ref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dmapool.c mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options 2019-07-12 11:05:46 -07:00
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2017-12-11 14:54:44 +01:00
fadvise.c fs: Export generic_fadvise() 2019-08-30 22:43:58 -07:00
failslab.c mm/failslab.c: by default, do not fail allocations with direct reclaim only 2019-07-12 11:05:43 -07:00
filemap.c Intel: generic_perform_write()/iomap_write_actor(): saner logics for short copy 2024-06-11 21:18:20 +08:00
frame_vector.c mm: untag user pointers in get_vaddr_frames 2019-09-25 17:51:41 -07:00
frontswap.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482 2019-06-19 17:09:52 +02:00
gup.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
gup_benchmark.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
highmem.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
hmm.c pagewalk: separate function pointers from iterator data 2019-09-07 04:28:04 -03:00
huge_memory.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
hugetlb.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
hugetlb_cgroup.c mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup() 2019-11-15 18:34:00 -08:00
hwpoison-inject.c hwpoison-inject: no need to check return value of debugfs_create functions 2019-06-03 15:39:40 +02:00
init-mm.c kernel/fork: Initialize mm's PASID 2024-06-11 21:12:36 +08:00
internal.h tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
interval_tree.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248 2019-06-19 17:09:08 +02:00
khugepaged.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
kmemleak-test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
kmemleak.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
ksm.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
list_lru.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
maccess.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
madvise.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
memblock.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
memcontrol.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
memfd.c mm: page cache: store only head pages in i_pages 2019-09-24 15:54:08 -07:00
memory-failure.c mm/hwpoison: fix error page recovered but reported "not recovered" 2024-06-11 21:18:21 +08:00
memory.c x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() 2024-06-11 21:05:59 +08:00
memory_hotplug.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
mempolicy.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
mempool.c docs/core-api/mm: fix return value descriptions in mm/ 2019-03-05 21:07:20 -08:00
memremap.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
memtest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
migrate.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
mincore.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
mlock.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
mm_init.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
mmap.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
mmu_context.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
mmu_gather.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
mmu_notifier.c mm/mmu_notifiers: use the right return code for WARN_ON 2019-11-06 08:47:50 -08:00
mmzone.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mprotect.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
mremap.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
msync.c mm: untag user pointers passed to memory syscalls 2019-09-25 17:51:41 -07:00
nommu.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
oom_kill.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page-writeback.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
page_alloc.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_counter.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_ext.c mm, page_owner: fix off-by-one error in __set_page_owner_handle() 2019-10-14 15:04:00 -07:00
page_idle.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_io.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_isolation.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_owner.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
page_poison.c mm/page_poison.c: fix a typo in a comment 2019-09-24 15:54:08 -07:00
page_vma_mapped.c mm: introduce page_size() 2019-09-24 15:54:08 -07:00
pagewalk.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
percpu-internal.h percpu: convert chunk hints to be based on pcpu_block_md 2019-03-13 12:25:31 -07:00
percpu-km.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu-stats.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu-vm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
percpu.c bitmap: genericize percpu bitmap region iterators 2024-06-11 21:16:24 +08:00
pgtable-generic.c x86/mm: Page size aware flush_tlb_mm_range() 2018-10-09 16:51:11 +02:00
process_vm_access.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
readahead.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
rmap.c mm: include <linux/huge_mm.h> for is_vma_temporary_stack 2019-10-19 06:32:32 -04:00
rodata_test.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
shmem.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
shuffle.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
shuffle.h mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
slab.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
slab.h mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly 2019-11-06 08:47:50 -08:00
slab_common.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
slob.c mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two) 2019-10-07 15:47:20 -07:00
slub.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
sparse-vmemmap.c mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap() 2019-07-18 17:08:07 -07:00
sparse.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
swap.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
swap_cgroup.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swap_slots.c mm, swap, get_swap_pages: use entry_size instead of cluster in parameter 2018-08-22 10:52:44 -07:00
swap_state.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
swapfile.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
truncate.c mm/thp: allow dropping THP from page cache 2019-10-19 06:32:33 -04:00
usercopy.c usercopy: Avoid HIGHMEM pfn warning 2019-09-17 15:20:17 -07:00
userfaultfd.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
util.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-13 15:18:04 -10:00
vmalloc.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
vmpressure.c mm/vmpressure.c: fix a signedness bug in vmpressure_register_event() 2019-10-07 15:47:19 -07:00
vmscan.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
vmstat.c tkernel: add base tlinux kernel interfaces 2024-06-11 20:09:33 +08:00
workingset.c mm: workingset: fix vmstat counters for shadow nodes 2019-08-13 16:06:52 -07:00
z3fold.c mm/z3fold.c: claim page in the beginning of free 2019-10-07 15:47:19 -07:00
zbud.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
zpool.c zpool: add malloc_support_movable to zpool_driver 2019-09-24 15:54:12 -07:00
zsmalloc.c ock: sync codes to ock 5.4.119-20.0009.21 2024-06-11 20:27:38 +08:00
zswap.c zswap: do not map same object twice 2019-09-24 15:54:12 -07:00