Fix kexec() flow in HyperV (#415)

When invoking kexec() on a Linux guest running on a Hyper-V host, the kernel panics. Add and apply a kernel patch that fixes this issue.
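
The effect of the patched condition in hv_synic_cleanup() can be sketched in plain user-space C. This is only an illustrative model, not the kernel code: the `sketch_*` names, the enum, and the constant are hypothetical stand-ins for `vmbus_connection.conn_state` and `VMBUS_CONNECT_CPU`, and the remainder of the real function is omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the VMBus connection state (not the kernel enum). */
enum sketch_conn_state {
    SKETCH_DISCONNECTED,
    SKETCH_CONNECTED
};

/* Hypothetical stand-in for VMBUS_CONNECT_CPU. */
#define SKETCH_CONNECT_CPU 0u

/* Model of the check before the patch: the connect CPU is always refused. */
static int sketch_cleanup_before(unsigned int cpu, enum sketch_conn_state state)
{
    (void)state; /* the old check ignored the connection state */
    if (cpu == SKETCH_CONNECT_CPU)
        return -EBUSY;
    return 0;
}

/* Model of the check after the patch: refuse only while still connected. */
static int sketch_cleanup_after(unsigned int cpu, enum sketch_conn_state state)
{
    if (cpu == SKETCH_CONNECT_CPU && state == SKETCH_CONNECTED)
        return -EBUSY;
    return 0;
}

/* Walks through the cases described in the commit message. */
static void sketch_self_check(void)
{
    /* kexec() path: VMBus already disconnected, connect CPU being torn down. */
    assert(sketch_cleanup_before(SKETCH_CONNECT_CPU, SKETCH_DISCONNECTED) == -EBUSY);
    assert(sketch_cleanup_after(SKETCH_CONNECT_CPU, SKETCH_DISCONNECTED) == 0);

    /* Normal operation: offlining the connect CPU is still refused. */
    assert(sketch_cleanup_after(SKETCH_CONNECT_CPU, SKETCH_CONNECTED) == -EBUSY);

    /* Other CPUs are not affected by this particular check. */
    assert(sketch_cleanup_after(1u, SKETCH_CONNECTED) == 0);
}
```

Calling `sketch_self_check()` from a small `main()` exercises all four cases. The only behavioral difference between the two models is the disconnected connect-CPU case, which is the one the kexec() path hits.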
Christopher Co 2020-11-30 16:14:43 -08:00 committed by GitHub
parent cf46eb9bca
commit c51c6d44f9
4 changed files with 79 additions and 3 deletions

@@ -3,7 +3,7 @@
Summary: Signed Linux Kernel for aarch64 systems
Name: kernel-signed-aarch64
Version: 5.4.72
-Release: 3%{?dist}
+Release: 4%{?dist}
License: GPLv2
Vendor: Microsoft Corporation
Distribution: Mariner
@@ -80,6 +80,9 @@ ln -sf linux-%{uname_r}.cfg /boot/mariner.cfg
%config %{_localstatedir}/lib/initramfs/kernel/%{uname_r}
%changelog
+* Mon Nov 23 2020 Chris Co <chrco@microsoft.com> - 5.4.72-4
+- Update release number to match kernel spec
* Mon Nov 16 2020 Suresh Babu Chalamalasetty <schalam@microsoft.com> - 5.4.72-3
- Update release number

@@ -3,7 +3,7 @@
Summary: Signed Linux Kernel for x86_64 systems
Name: kernel-signed-x64
Version: 5.4.72
-Release: 3%{?dist}
+Release: 4%{?dist}
License: GPLv2
Vendor: Microsoft Corporation
Distribution: Mariner
@@ -80,6 +80,9 @@ ln -sf linux-%{uname_r}.cfg /boot/mariner.cfg
%config %{_localstatedir}/lib/initramfs/kernel/%{uname_r}
%changelog
+* Mon Nov 23 2020 Chris Co <chrco@microsoft.com> - 5.4.72-4
+- Update release number to match kernel spec
* Mon Nov 16 2020 Suresh Babu Chalamalasetty <schalam@microsoft.com> - 5.4.72-3
- Update release number

@@ -0,0 +1,64 @@
From 4ffcffdc2e06186f35ffd41f46489d2027f32940 Mon Sep 17 00:00:00 2001
From: Chris Co <chrco@microsoft.com>
Date: Fri, 30 Oct 2020 05:20:29 +0000
Subject: [PATCH] Drivers: hv: vmbus: Allow cleanup of VMBUS_CONNECT_CPU if
 disconnected

When invoking kexec() on a Linux guest running on a Hyper-V host, the
kernel panics.

    RIP: 0010:cpuhp_issue_call+0x137/0x140
    Call Trace:
    __cpuhp_remove_state_cpuslocked+0x99/0x100
    __cpuhp_remove_state+0x1c/0x30
    hv_kexec_handler+0x23/0x30 [hv_vmbus]
    hv_machine_shutdown+0x1e/0x30
    machine_shutdown+0x10/0x20
    kernel_kexec+0x6d/0x96
    __do_sys_reboot+0x1ef/0x230
    __x64_sys_reboot+0x1d/0x20
    do_syscall_64+0x6b/0x3d8
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

This was due to hv_synic_cleanup() callback returning -EBUSY to
cpuhp_issue_call() when tearing down the VMBUS_CONNECT_CPU, even
if the vmbus_connection.conn_state = DISCONNECTED. hv_synic_cleanup()
should succeed in the case where vmbus_connection.conn_state
is DISCONNECTED.

Fix is to add an extra condition to test for
vmbus_connection.conn_state == CONNECTED on the VMBUS_CONNECT_CPU and
only return early if true. This way the kexec() path can still shut
everything down while preserving the initial behavior of preventing
CPU offlining on the VMBUS_CONNECT_CPU while the VM is running.

Fixes: 8a857c55420f29 ("Drivers: hv: vmbus: Always handle the VMBus messages on CPU0")
Signed-off-by: Chris Co <chrco@microsoft.com>
Cc: stable@vger.kernel.org
---
drivers/hv/hv.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 0cde10fe0e71..f202ac7f4b3d 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -244,9 +244,13 @@ int hv_synic_cleanup(unsigned int cpu)
/*
* Hyper-V does not provide a way to change the connect CPU once
- * it is set; we must prevent the connect CPU from going offline.
+ * it is set; we must prevent the connect CPU from going offline
+ * while the VM is running normally. But in the panic or kexec()
+ * path where the vmbus is already disconnected, the CPU must be
+ * allowed to shut down.
*/
- if (cpu == VMBUS_CONNECT_CPU)
+ if (cpu == VMBUS_CONNECT_CPU &&
+ vmbus_connection.conn_state == CONNECTED)
return -EBUSY;
/*
--
2.17.1

@@ -3,7 +3,7 @@
Summary: Linux Kernel
Name: kernel
Version: 5.4.72
-Release: 3%{?dist}
+Release: 4%{?dist}
License: GPLv2
Vendor: Microsoft Corporation
Distribution: Mariner
@@ -14,6 +14,7 @@ Source1: config
Source2: config_aarch64
# Arm64 HyperV support required patch
Patch0: ver5_4_72_arm64_hyperv_support.patch
+Patch1: 0001-Drivers-hv-vmbus-Allow-cleanup-of-VMBUS_CONNECT_CPU-.patch
# Kernel CVEs are addressed by moving to a newer version of the stable kernel.
# Since kernel CVEs are filed against the upstream kernel version and not the
# stable kernel version, our automated tooling will still flag the CVE as not
@@ -184,6 +185,8 @@ This package contains the 'perf' performance analysis tools for Linux kernel.
%patch0 -p1
%endif
+%patch1 -p1
%build
make mrproper
@@ -403,6 +406,9 @@ ln -sf linux-%{uname_r}.cfg /boot/mariner.cfg
%{_libdir}/perf/include/bpf/*
%changelog
+* Mon Nov 23 2020 Chris Co <chrco@microsoft.com> - 5.4.72-4
+- Apply patch to fix kexec in HyperV
* Mon Nov 16 2020 Suresh Babu Chalamalasetty <schalam@microsoft.com> - 5.4.72-3
- Disable kernel config SLUB_DEBUG_ON due to tcp throughput perf impact