DOCUMENTATION: Protected virtual machine introduction and IPL
Add documentation about protected KVM guests and description of changes that are necessary to move a KVM VM into Protected Virtualization mode. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: fixing and conversion to rst] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This commit is contained in:
parent
8a8378fa61
commit
a421027987
|
@ -18,6 +18,8 @@ KVM
|
|||
nested-vmx
|
||||
ppc-pv
|
||||
s390-diag
|
||||
s390-pv
|
||||
s390-pv-boot
|
||||
timekeeping
|
||||
vcpu-requests
|
||||
|
||||
|
|
|
@ -0,0 +1,84 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
======================================
|
||||
s390 (IBM Z) Boot/IPL of Protected VMs
|
||||
======================================
|
||||
|
||||
Summary
|
||||
-------
|
||||
The memory of Protected Virtual Machines (PVMs) is not accessible to
|
||||
I/O or the hypervisor. In those cases where the hypervisor needs to
|
||||
access the memory of a PVM, that memory must be made accessible.
|
||||
Memory made accessible to the hypervisor will be encrypted. See
|
||||
:doc:`s390-pv` for details."
|
||||
|
||||
On IPL (boot) a small plaintext bootloader is started, which provides
|
||||
information about the encrypted components and necessary metadata to
|
||||
KVM to decrypt the protected virtual machine.
|
||||
|
||||
Based on this data, KVM will make the protected virtual machine known
|
||||
to the Ultravisor (UV) and instruct it to secure the memory of the
|
||||
PVM, decrypt the components and verify the data and address list
|
||||
hashes, to ensure integrity. Afterwards KVM can run the PVM via the
|
||||
SIE instruction which the UV will intercept and execute on KVM's
|
||||
behalf.
|
||||
|
||||
As the guest image is just like an opaque kernel image that does the
|
||||
switch into PV mode itself, the user can load encrypted guest
|
||||
executables and data via every available method (network, dasd, scsi,
|
||||
direct kernel, ...) without the need to change the boot process.
|
||||
|
||||
|
||||
Diag308
|
||||
-------
|
||||
This diagnose instruction is the basic mechanism to handle IPL and
|
||||
related operations for virtual machines. The VM can set and retrieve
|
||||
IPL information blocks, that specify the IPL method/devices and
|
||||
request VM memory and subsystem resets, as well as IPLs.
|
||||
|
||||
For PVMs this concept has been extended with new subcodes:
|
||||
|
||||
Subcode 8: Set an IPL Information Block of type 5 (information block
|
||||
for PVMs)
|
||||
Subcode 9: Store the saved block in guest memory
|
||||
Subcode 10: Move into Protected Virtualization mode
|
||||
|
||||
The new PV load-device-specific-parameters field specifies all data
|
||||
that is necessary to move into PV mode.
|
||||
|
||||
* PV Header origin
|
||||
* PV Header length
|
||||
* List of Components composed of
|
||||
* AES-XTS Tweak prefix
|
||||
* Origin
|
||||
* Size
|
||||
|
||||
The PV header contains the keys and hashes, which the UV will use to
|
||||
decrypt and verify the PV, as well as control flags and a start PSW.
|
||||
|
||||
The components are for instance an encrypted kernel, kernel parameters
|
||||
and initrd. The components are decrypted by the UV.
|
||||
|
||||
After the initial import of the encrypted data, all defined pages will
|
||||
contain the guest content. All non-specified pages will start out as
|
||||
zero pages on first access.
|
||||
|
||||
|
||||
When running in protected virtualization mode, some subcodes will result in
|
||||
exceptions or return error codes.
|
||||
|
||||
Subcodes 4 and 7, which specify operations that do not clear the guest
|
||||
memory, will result in specification exceptions. This is because the
|
||||
UV will clear all memory when a secure VM is removed, and therefore
|
||||
non-clearing IPL subcodes are not allowed.
|
||||
|
||||
Subcodes 8, 9, 10 will result in specification exceptions.
|
||||
Re-IPL into a protected mode is only possible via a detour into non
|
||||
protected mode.
|
||||
|
||||
Keys
|
||||
----
|
||||
Every CEC will have a unique public key to enable tooling to build
|
||||
encrypted images.
|
||||
See `s390-tools <https://github.com/ibm-s390-tools/s390-tools/>`_
|
||||
for the tooling.
|
|
@ -0,0 +1,116 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=========================================
|
||||
s390 (IBM Z) Ultravisor and Protected VMs
|
||||
=========================================
|
||||
|
||||
Summary
|
||||
-------
|
||||
Protected virtual machines (PVM) are KVM VMs that do not allow KVM to
|
||||
access VM state like guest memory or guest registers. Instead, the
|
||||
PVMs are mostly managed by a new entity called Ultravisor (UV). The UV
|
||||
provides an API that can be used by PVMs and KVM to request management
|
||||
actions.
|
||||
|
||||
Each guest starts in non-protected mode and then may make a request to
|
||||
transition into protected mode. On transition, KVM registers the guest
|
||||
and its VCPUs with the Ultravisor and prepares everything for running
|
||||
it.
|
||||
|
||||
The Ultravisor will secure and decrypt the guest's boot memory
|
||||
(i.e. kernel/initrd). It will safeguard state changes like VCPU
|
||||
starts/stops and injected interrupts while the guest is running.
|
||||
|
||||
As access to the guest's state, such as the SIE state description, is
|
||||
normally needed to be able to run a VM, some changes have been made in
|
||||
the behavior of the SIE instruction. A new format 4 state description
|
||||
has been introduced, where some fields have different meanings for a
|
||||
PVM. SIE exits are minimized as much as possible to improve speed and
|
||||
reduce exposed guest state.
|
||||
|
||||
|
||||
Interrupt injection
|
||||
-------------------
|
||||
Interrupt injection is safeguarded by the Ultravisor. As KVM doesn't
|
||||
have access to the VCPUs' lowcores, injection is handled via the
|
||||
format 4 state description.
|
||||
|
||||
Machine check, external, IO and restart interruptions each can be
|
||||
injected on SIE entry via a bit in the interrupt injection control
|
||||
field (offset 0x54). If the guest cpu is not enabled for the interrupt
|
||||
at the time of injection, a validity interception is recognized. The
|
||||
format 4 state description contains fields in the interception data
|
||||
block where data associated with the interrupt can be transported.
|
||||
|
||||
Program and Service Call exceptions have another layer of
|
||||
safeguarding; they can only be injected for instructions that have
|
||||
been intercepted into KVM. The exceptions need to be a valid outcome
|
||||
of an instruction emulation by KVM, e.g. we can never inject a
|
||||
addressing exception as they are reported by SIE since KVM has no
|
||||
access to the guest memory.
|
||||
|
||||
|
||||
Mask notification interceptions
|
||||
-------------------------------
|
||||
KVM cannot intercept lctl(g) and lpsw(e) anymore in order to be
|
||||
notified when a PVM enables a certain class of interrupt. As a
|
||||
replacement, two new interception codes have been introduced: One
|
||||
indicating that the contents of CRs 0, 6, or 14 have been changed,
|
||||
indicating different interruption subclasses; and one indicating that
|
||||
PSW bit 13 has been changed, indicating that a machine check
|
||||
intervention was requested and those are now enabled.
|
||||
|
||||
Instruction emulation
|
||||
---------------------
|
||||
With the format 4 state description for PVMs, the SIE instruction already
|
||||
interprets more instructions than it does with format 2. It is not able
|
||||
to interpret every instruction, but needs to hand some tasks to KVM;
|
||||
therefore, the SIE and the ultravisor safeguard emulation inputs and outputs.
|
||||
|
||||
The control structures associated with SIE provide the Secure
|
||||
Instruction Data Area (SIDA), the Interception Parameters (IP) and the
|
||||
Secure Interception General Register Save Area. Guest GRs and most of
|
||||
the instruction data, such as I/O data structures, are filtered.
|
||||
Instruction data is copied to and from the SIDA when needed. Guest
|
||||
GRs are put into / retrieved from the Secure Interception General
|
||||
Register Save Area.
|
||||
|
||||
Only GR values needed to emulate an instruction will be copied into this
|
||||
save area and the real register numbers will be hidden.
|
||||
|
||||
The Interception Parameters state description field still contains the
|
||||
the bytes of the instruction text, but with pre-set register values
|
||||
instead of the actual ones. I.e. each instruction always uses the same
|
||||
instruction text, in order not to leak guest instruction text.
|
||||
This also implies that the register content that a guest had in r<n>
|
||||
may be in r<m> from the hypervisor's point of view.
|
||||
|
||||
The Secure Instruction Data Area contains instruction storage
|
||||
data. Instruction data, i.e. data being referenced by an instruction
|
||||
like the SCCB for sclp, is moved via the SIDA. When an instruction is
|
||||
intercepted, the SIE will only allow data and program interrupts for
|
||||
this instruction to be moved to the guest via the two data areas
|
||||
discussed before. Other data is either ignored or results in validity
|
||||
interceptions.
|
||||
|
||||
|
||||
Instruction emulation interceptions
|
||||
-----------------------------------
|
||||
There are two types of SIE secure instruction intercepts: the normal
|
||||
and the notification type. Normal secure instruction intercepts will
|
||||
make the guest pending for instruction completion of the intercepted
|
||||
instruction type, i.e. on SIE entry it is attempted to complete
|
||||
emulation of the instruction with the data provided by KVM. That might
|
||||
be a program exception or instruction completion.
|
||||
|
||||
The notification type intercepts inform KVM about guest environment
|
||||
changes due to guest instruction interpretation. Such an interception
|
||||
is recognized, for example, for the store prefix instruction to provide
|
||||
the new lowcore location. On SIE reentry, any KVM data in the data areas
|
||||
is ignored and execution continues as if the guest instruction had
|
||||
completed. For that reason KVM is not allowed to inject a program
|
||||
interrupt.
|
||||
|
||||
Links
|
||||
-----
|
||||
`KVM Forum 2019 presentation <https://static.sched.com/hosted_files/kvmforum2019/3b/ibm_protected_vms_s390x.pdf>`_
|
|
@ -9209,6 +9209,7 @@ L: kvm@vger.kernel.org
|
|||
W: http://www.ibm.com/developerworks/linux/linux390/
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
|
||||
S: Supported
|
||||
F: Documentation/virt/kvm/s390*
|
||||
F: arch/s390/include/uapi/asm/kvm*
|
||||
F: arch/s390/include/asm/gmap.h
|
||||
F: arch/s390/include/asm/kvm*
|
||||
|
|
Loading…
Reference in New Issue