OpenCloudOS-Kernel

History

Paul E. McKenney 2112a9df40 x86/nmi: Fix out-of-order NMI nesting checks & false positive warning

[ Upstream commit f44075ecafb726830e63d33fbca29413149eeeb8 ]

The ->idt_seq and ->recv_jiffies variables added by:

  `1a3ea611fc` ("x86/nmi: Accumulate NMI-progress evidence in exc_nmi()")

... place the exit-time check of the bottom bit of ->idt_seq after the this_cpu_dec_return() that re-enables NMI nesting. This can result in the following sequence of events on a given CPU in kernels built with CONFIG_NMI_CHECK_CPU=y:

o An NMI arrives, and ->idt_seq is incremented to an odd number. In addition, nmi_state is set to NMI_EXECUTING==1.

o The NMI is processed.

o The this_cpu_dec_return(nmi_state) zeroes nmi_state and returns NMI_EXECUTING==1, thus opting out of the "goto nmi_restart".

o Another NMI arrives and ->idt_seq is incremented to an even number, triggering the warning. But all is just fine, at least assuming we don't get so many closely spaced NMIs that the stack overflows or some such.

Experience on the fleet indicates that the MTBF of this false positive is about 70 years. Or, for those who are not quite that patient, the MTBF appears to be about one per week per 4,000 systems.

Fix this false-positive warning by moving the "nmi_restart" label before the initial ->idt_seq increment/check and moving the this_cpu_dec_return() to follow the final ->idt_seq increment/check. This way, all nested NMIs that get past the NMI_NOT_RUNNING check get a clean ->idt_seq slate. And if they don't get past that check, they will set nmi_state to NMI_LATCHED, which will cause the this_cpu_dec_return(nmi_state) to restart.

Fixes: `1a3ea611fc` ("x86/nmi: Accumulate NMI-progress evidence in exc_nmi()")
Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/0cbff831-6e3d-431c-9830-ee65ee7787ff@paulmck-laptop
Signed-off-by: Sasha Levin <sashal@kernel.org>

2023-11-20 11:58:53 +01:00
..
alpha	Kbuild updates for v6.6	2023-09-05 11:01:47 -07:00
arc	ARC updates for v6.6	2023-09-04 15:38:24 -07:00
arm	Few minor fixes for omaps	2023-10-18 15:29:11 +02:00
arm64	ARM: SoC fixes for 6.7, part 3	2023-10-26 08:17:26 -10:00
csky	arch/csky 2nd patches for 6.6	2023-09-01 08:02:45 -07:00
hexagon	Add x86 shadow stack support	2023-08-31 12:20:12 -07:00
ia64	cpu-hotplug: Provide prototypes for arch CPU registration	2023-10-11 14:27:37 +02:00
loongarch	LoongArch: Disable WUC for pgprot_writecombine() like ioremap_wc()	2023-10-18 08:42:52 +08:00
m68k	ata changes for 6.6	2023-09-05 12:37:28 -07:00
microblaze	Microblaze patches for 6.6-rc1	2023-09-05 10:15:22 -07:00
mips	KVM: MIPS: fix -Wunused-but-set-variable warning	2023-10-12 11:25:40 -04:00
nios2	Add x86 shadow stack support	2023-08-31 12:20:12 -07:00
openrisc	OpenRISC updates for 6.6	2023-09-05 10:09:31 -07:00
parisc	parisc architecture fixes for kernel v6.6-rc5:	2023-10-07 13:05:43 -07:00
powerpc	powerpc fixes for 6.6 #6	2023-10-27 05:40:42 -10:00
riscv	ARM: SoC fixes for 6.7, part 3	2023-10-26 08:17:26 -10:00
s390	s390 updates for 6.6-rc7	2023-10-21 10:11:11 -07:00
sh	sh: mm: re-add lost __ref to ioremap_prot() to fix modpost warning	2023-09-19 13:21:32 -07:00
sparc	sparc32: fix a braino in fault handling in csum_and_copy_..._user()	2023-10-27 20:06:06 -04:00
um	This pull request contains the following changes for UML:	2023-09-04 11:32:21 -07:00
x86	x86/nmi: Fix out-of-order NMI nesting checks & false positive warning	2023-11-20 11:58:53 +01:00
xtensa	xtensa: boot/lib: fix function prototypes	2023-09-20 05:03:30 -07:00
.gitignore	…
Kconfig	Add x86 shadow stack support	2023-08-31 12:20:12 -07:00