KVM updates for v4.9-rc1
All architectures: Move `make kvmconfig` stubs from x86; use 64 bits for debugfs stats. ARM: Important fixes for not using an in-kernel irqchip; handle SError exceptions and present them to guests if appropriate; proxying of GICV access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8; preparations for GICv3 save/restore, including ABI docs; cleanups and a bit of optimizations. MIPS: A couple of fixes in preparation for supporting MIPS EVA host kernels; MIPS SMP host & TLB invalidation fixes. PPC: Fix the bug which caused guests to falsely report lockups; other minor fixes; a small optimization. s390: Lazy enablement of runtime instrumentation; up to 255 CPUs for nested guests; rework of machine check deliver; cleanups and fixes. x86: IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in nVMX; cleanups and fixes. -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4 K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18= =fZRB -----END PGP SIGNATURE----- Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Radim Krčmář: "All architectures: - move `make kvmconfig` stubs from x86 - use 64 bits for debugfs stats ARM: - Important fixes for not using an in-kernel irqchip - handle SError exceptions and present them to guests if appropriate - proxying of GICV access at EL2 if guest mappings are unsafe - GICv3 on AArch32 on ARMv8 - preparations for GICv3 save/restore, including ABI docs - cleanups and a bit of optimizations MIPS: - A couple of fixes in preparation for supporting MIPS EVA host kernels - MIPS SMP host & TLB invalidation fixes PPC: - Fix the bug which caused guests to falsely report lockups - other minor fixes - a small optimization s390: - Lazy enablement of runtime instrumentation - up to 255 CPUs for nested guests - rework of machine check deliver - cleanups and fixes x86: - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery - Hyper-V TSC page - per-vcpu tsc_offset in debugfs - accelerated INS/OUTS in nVMX - cleanups and fixes" * tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits) KVM: MIPS: Drop dubious EntryHi optimisation KVM: MIPS: Invalidate TLB by regenerating ASIDs KVM: MIPS: Split kernel/user ASID regeneration KVM: MIPS: Drop other CPU ASIDs on guest MMU changes KVM: arm/arm64: vgic: Don't flush/sync without a working vgic KVM: arm64: Require in-kernel irqchip for PMU support KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie KVM: PPC: BookE: Fix a sanity check KVM: PPC: Book3S HV: Take out virtual core piggybacking code KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread ARM: gic-v3: Work around definition of gic_write_bpr1 KVM: nVMX: Fix the NMI IDT-vectoring handling KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive KVM: nVMX: Fix reload apic access page warning kvmconfig: add virtio-gpu to config fragment config: move x86 kvm_guest.config to a common location arm64: KVM: Remove duplicating init code for setting VMID ARM: KVM: Support vgic-v3 ...
This commit is contained in:
commit
6218590bcb
|
@ -460,6 +460,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||
driver will print ACPI tables for AMD IOMMU during
|
||||
IOMMU initialization.
|
||||
|
||||
amd_iommu_intr= [HW,X86-64]
|
||||
Specifies one of the following AMD IOMMU interrupt
|
||||
remapping modes:
|
||||
legacy - Use legacy interrupt remapping mode.
|
||||
vapic - Use virtual APIC mode, which allows IOMMU
|
||||
to inject interrupts directly into guest.
|
||||
This mode requires kvm-amd.avic=1.
|
||||
(Default when IOMMU HW support is present.)
|
||||
|
||||
amijoy.map= [HW,JOY] Amiga joystick support
|
||||
Map of devices attached to JOY0DAT and JOY1DAT
|
||||
Format: <a>,<b>
|
||||
|
|
|
@ -0,0 +1,38 @@
|
|||
ARM Virtual Interrupt Translation Service (ITS)
|
||||
===============================================
|
||||
|
||||
Device types supported:
|
||||
KVM_DEV_TYPE_ARM_VGIC_ITS ARM Interrupt Translation Service Controller
|
||||
|
||||
The ITS allows MSI(-X) interrupts to be injected into guests. This extension is
|
||||
optional. Creating a virtual ITS controller also requires a host GICv3 (see
|
||||
arm-vgic-v3.txt), but does not depend on having physical ITS controllers.
|
||||
|
||||
There can be multiple ITS controllers per guest, each of them has to have
|
||||
a separate, non-overlapping MMIO region.
|
||||
|
||||
|
||||
Groups:
|
||||
KVM_DEV_ARM_VGIC_GRP_ADDR
|
||||
Attributes:
|
||||
KVM_VGIC_ITS_ADDR_TYPE (rw, 64-bit)
|
||||
Base address in the guest physical address space of the GICv3 ITS
|
||||
control register frame.
|
||||
This address needs to be 64K aligned and the region covers 128K.
|
||||
Errors:
|
||||
-E2BIG: Address outside of addressable IPA range
|
||||
-EINVAL: Incorrectly aligned address
|
||||
-EEXIST: Address already configured
|
||||
-EFAULT: Invalid user pointer for attr->addr.
|
||||
-ENODEV: Incorrect attribute or the ITS is not supported.
|
||||
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_CTRL
|
||||
Attributes:
|
||||
KVM_DEV_ARM_VGIC_CTRL_INIT
|
||||
request the initialization of the ITS, no additional parameter in
|
||||
kvm_device_attr.addr.
|
||||
Errors:
|
||||
-ENXIO: ITS not properly configured as required prior to setting
|
||||
this attribute
|
||||
-ENOMEM: Memory shortage when allocating ITS internal data
|
|
@ -0,0 +1,206 @@
|
|||
ARM Virtual Generic Interrupt Controller v3 and later (VGICv3)
|
||||
==============================================================
|
||||
|
||||
|
||||
Device types supported:
|
||||
KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0
|
||||
|
||||
Only one VGIC instance may be instantiated through this API. The created VGIC
|
||||
will act as the VM interrupt controller, requiring emulated user-space devices
|
||||
to inject interrupts to the VGIC instead of directly to CPUs. It is not
|
||||
possible to create both a GICv3 and GICv2 on the same VM.
|
||||
|
||||
Creating a guest GICv3 device requires a host GICv3 as well.
|
||||
|
||||
|
||||
Groups:
|
||||
KVM_DEV_ARM_VGIC_GRP_ADDR
|
||||
Attributes:
|
||||
KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit)
|
||||
Base address in the guest physical address space of the GICv3 distributor
|
||||
register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
|
||||
This address needs to be 64K aligned and the region covers 64 KByte.
|
||||
|
||||
KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit)
|
||||
Base address in the guest physical address space of the GICv3
|
||||
redistributor register mappings. There are two 64K pages for each
|
||||
VCPU and all of the redistributor pages are contiguous.
|
||||
Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
|
||||
This address needs to be 64K aligned.
|
||||
Errors:
|
||||
-E2BIG: Address outside of addressable IPA range
|
||||
-EINVAL: Incorrectly aligned address
|
||||
-EEXIST: Address already configured
|
||||
-ENXIO: The group or attribute is unknown/unsupported for this device
|
||||
or hardware support is missing.
|
||||
-EFAULT: Invalid user pointer for attr->addr.
|
||||
|
||||
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_DIST_REGS
|
||||
KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
|
||||
Attributes:
|
||||
The attr field of kvm_device_attr encodes two values:
|
||||
bits: | 63 .... 32 | 31 .... 0 |
|
||||
values: | mpidr | offset |
|
||||
|
||||
All distributor regs are (rw, 32-bit) and kvm_device_attr.addr points to a
|
||||
__u32 value. 64-bit registers must be accessed by separately accessing the
|
||||
lower and higher word.
|
||||
|
||||
Writes to read-only registers are ignored by the kernel.
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_DIST_REGS accesses the main distributor registers.
|
||||
KVM_DEV_ARM_VGIC_GRP_REDIST_REGS accesses the redistributor of the CPU
|
||||
specified by the mpidr.
|
||||
|
||||
The offset is relative to the "[Re]Distributor base address" as defined
|
||||
in the GICv3/4 specs. Getting or setting such a register has the same
|
||||
effect as reading or writing the register on real hardware, except for the
|
||||
following registers: GICD_STATUSR, GICR_STATUSR, GICD_ISPENDR,
|
||||
GICR_ISPENDR0, GICD_ICPENDR, and GICR_ICPENDR0. These registers behave
|
||||
differently when accessed via this interface compared to their
|
||||
architecturally defined behavior to allow software a full view of the
|
||||
VGIC's internal state.
|
||||
|
||||
The mpidr field is used to specify which
|
||||
redistributor is accessed. The mpidr is ignored for the distributor.
|
||||
|
||||
The mpidr encoding is based on the affinity information in the
|
||||
architecture defined MPIDR, and the field is encoded as follows:
|
||||
| 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
|
||||
| Aff3 | Aff2 | Aff1 | Aff0 |
|
||||
|
||||
Note that distributor fields are not banked, but return the same value
|
||||
regardless of the mpidr used to access the register.
|
||||
|
||||
The GICD_STATUSR and GICR_STATUSR registers are architecturally defined such
|
||||
that a write of a clear bit has no effect, whereas a write with a set bit
|
||||
clears that value. To allow userspace to freely set the values of these two
|
||||
registers, setting the attributes with the register offsets for these two
|
||||
registers simply sets the non-reserved bits to the value written.
|
||||
|
||||
|
||||
Accesses (reads and writes) to the GICD_ISPENDR register region and
|
||||
GICR_ISPENDR0 registers get/set the value of the latched pending state for
|
||||
the interrupts.
|
||||
|
||||
This is identical to the value returned by a guest read from ISPENDR for an
|
||||
edge triggered interrupt, but may differ for level triggered interrupts.
|
||||
For edge triggered interrupts, once an interrupt becomes pending (whether
|
||||
because of an edge detected on the input line or because of a guest write
|
||||
to ISPENDR) this state is "latched", and only cleared when either the
|
||||
interrupt is activated or when the guest writes to ICPENDR. A level
|
||||
triggered interrupt may be pending either because the level input is held
|
||||
high by a device, or because of a guest write to the ISPENDR register. Only
|
||||
ISPENDR writes are latched; if the device lowers the line level then the
|
||||
interrupt is no longer pending unless the guest also wrote to ISPENDR, and
|
||||
conversely writes to ICPENDR or activations of the interrupt do not clear
|
||||
the pending status if the line level is still being held high. (These
|
||||
rules are documented in the GICv3 specification descriptions of the ICPENDR
|
||||
and ISPENDR registers.) For a level triggered interrupt the value accessed
|
||||
here is that of the latch which is set by ISPENDR and cleared by ICPENDR or
|
||||
interrupt activation, whereas the value returned by a guest read from
|
||||
ISPENDR is the logical OR of the latch value and the input line level.
|
||||
|
||||
Raw access to the latch state is provided to userspace so that it can save
|
||||
and restore the entire GIC internal state (which is defined by the
|
||||
combination of the current input line level and the latch state, and cannot
|
||||
be deduced from purely the line level and the value of the ISPENDR
|
||||
registers).
|
||||
|
||||
Accesses to GICD_ICPENDR register region and GICR_ICPENDR0 registers have
|
||||
RAZ/WI semantics, meaning that reads always return 0 and writes are always
|
||||
ignored.
|
||||
|
||||
Errors:
|
||||
-ENXIO: Getting or setting this register is not yet supported
|
||||
-EBUSY: One or more VCPUs are running
|
||||
|
||||
|
||||
KVM_DEV_ARM_VGIC_CPU_SYSREGS
|
||||
Attributes:
|
||||
The attr field of kvm_device_attr encodes two values:
|
||||
bits: | 63 .... 32 | 31 .... 16 | 15 .... 0 |
|
||||
values: | mpidr | RES | instr |
|
||||
|
||||
The mpidr field encodes the CPU ID based on the affinity information in the
|
||||
architecture defined MPIDR, and the field is encoded as follows:
|
||||
| 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
|
||||
| Aff3 | Aff2 | Aff1 | Aff0 |
|
||||
|
||||
The instr field encodes the system register to access based on the fields
|
||||
defined in the A64 instruction set encoding for system register access
|
||||
(RES means the bits are reserved for future use and should be zero):
|
||||
|
||||
| 15 ... 14 | 13 ... 11 | 10 ... 7 | 6 ... 3 | 2 ... 0 |
|
||||
| Op 0 | Op1 | CRn | CRm | Op2 |
|
||||
|
||||
All system regs accessed through this API are (rw, 64-bit) and
|
||||
kvm_device_attr.addr points to a __u64 value.
|
||||
|
||||
KVM_DEV_ARM_VGIC_CPU_SYSREGS accesses the CPU interface registers for the
|
||||
CPU specified by the mpidr field.
|
||||
|
||||
Errors:
|
||||
-ENXIO: Getting or setting this register is not yet supported
|
||||
-EBUSY: VCPU is running
|
||||
-EINVAL: Invalid mpidr supplied
|
||||
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_NR_IRQS
|
||||
Attributes:
|
||||
A value describing the number of interrupts (SGI, PPI and SPI) for
|
||||
this GIC instance, ranging from 64 to 1024, in increments of 32.
|
||||
|
||||
kvm_device_attr.addr points to a __u32 value.
|
||||
|
||||
Errors:
|
||||
-EINVAL: Value set is out of the expected range
|
||||
-EBUSY: Value has already be set.
|
||||
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_CTRL
|
||||
Attributes:
|
||||
KVM_DEV_ARM_VGIC_CTRL_INIT
|
||||
request the initialization of the VGIC, no additional parameter in
|
||||
kvm_device_attr.addr.
|
||||
Errors:
|
||||
-ENXIO: VGIC not properly configured as required prior to calling
|
||||
this attribute
|
||||
-ENODEV: no online VCPU
|
||||
-ENOMEM: memory shortage when allocating vgic internal data
|
||||
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO
|
||||
Attributes:
|
||||
The attr field of kvm_device_attr encodes the following values:
|
||||
bits: | 63 .... 32 | 31 .... 10 | 9 .... 0 |
|
||||
values: | mpidr | info | vINTID |
|
||||
|
||||
The vINTID specifies which set of IRQs is reported on.
|
||||
|
||||
The info field specifies which information userspace wants to get or set
|
||||
using this interface. Currently we support the following info values:
|
||||
|
||||
VGIC_LEVEL_INFO_LINE_LEVEL:
|
||||
Get/Set the input level of the IRQ line for a set of 32 contiguously
|
||||
numbered interrupts.
|
||||
vINTID must be a multiple of 32.
|
||||
|
||||
kvm_device_attr.addr points to a __u32 value which will contain a
|
||||
bitmap where a set bit means the interrupt level is asserted.
|
||||
|
||||
Bit[n] indicates the status for interrupt vINTID + n.
|
||||
|
||||
SGIs and any interrupt with a higher ID than the number of interrupts
|
||||
supported, will be RAZ/WI. LPIs are always edge-triggered and are
|
||||
therefore not supported by this interface.
|
||||
|
||||
PPIs are reported per VCPU as specified in the mpidr field, and SPIs are
|
||||
reported with the same value regardless of the mpidr specified.
|
||||
|
||||
The mpidr field encodes the CPU ID based on the affinity information in the
|
||||
architecture defined MPIDR, and the field is encoded as follows:
|
||||
| 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
|
||||
| Aff3 | Aff2 | Aff1 | Aff0 |
|
|
@ -1,24 +1,19 @@
|
|||
ARM Virtual Generic Interrupt Controller (VGIC)
|
||||
===============================================
|
||||
ARM Virtual Generic Interrupt Controller v2 (VGIC)
|
||||
==================================================
|
||||
|
||||
Device types supported:
|
||||
KVM_DEV_TYPE_ARM_VGIC_V2 ARM Generic Interrupt Controller v2.0
|
||||
KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0
|
||||
KVM_DEV_TYPE_ARM_VGIC_ITS ARM Interrupt Translation Service Controller
|
||||
|
||||
Only one VGIC instance of the V2/V3 types above may be instantiated through
|
||||
either this API or the legacy KVM_CREATE_IRQCHIP api. The created VGIC will
|
||||
act as the VM interrupt controller, requiring emulated user-space devices to
|
||||
inject interrupts to the VGIC instead of directly to CPUs.
|
||||
Only one VGIC instance may be instantiated through either this API or the
|
||||
legacy KVM_CREATE_IRQCHIP API. The created VGIC will act as the VM interrupt
|
||||
controller, requiring emulated user-space devices to inject interrupts to the
|
||||
VGIC instead of directly to CPUs.
|
||||
|
||||
Creating a guest GICv3 device requires a host GICv3 as well.
|
||||
GICv3 implementations with hardware compatibility support allow a guest GICv2
|
||||
as well.
|
||||
GICv3 implementations with hardware compatibility support allow creating a
|
||||
guest GICv2 through this interface. For information on creating a guest GICv3
|
||||
device and guest ITS devices, see arm-vgic-v3.txt. It is not possible to
|
||||
create both a GICv3 and GICv2 device on the same VM.
|
||||
|
||||
Creating a virtual ITS controller requires a host GICv3 (but does not depend
|
||||
on having physical ITS controllers).
|
||||
There can be multiple ITS controllers per guest, each of them has to have
|
||||
a separate, non-overlapping MMIO region.
|
||||
|
||||
Groups:
|
||||
KVM_DEV_ARM_VGIC_GRP_ADDR
|
||||
|
@ -32,26 +27,13 @@ Groups:
|
|||
Base address in the guest physical address space of the GIC virtual cpu
|
||||
interface register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V2.
|
||||
This address needs to be 4K aligned and the region covers 4 KByte.
|
||||
|
||||
KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit)
|
||||
Base address in the guest physical address space of the GICv3 distributor
|
||||
register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
|
||||
This address needs to be 64K aligned and the region covers 64 KByte.
|
||||
|
||||
KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit)
|
||||
Base address in the guest physical address space of the GICv3
|
||||
redistributor register mappings. There are two 64K pages for each
|
||||
VCPU and all of the redistributor pages are contiguous.
|
||||
Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
|
||||
This address needs to be 64K aligned.
|
||||
|
||||
KVM_VGIC_V3_ADDR_TYPE_ITS (rw, 64-bit)
|
||||
Base address in the guest physical address space of the GICv3 ITS
|
||||
control register frame. The ITS allows MSI(-X) interrupts to be
|
||||
injected into guests. This extension is optional. If the kernel
|
||||
does not support the ITS, the call returns -ENODEV.
|
||||
Only valid for KVM_DEV_TYPE_ARM_VGIC_ITS.
|
||||
This address needs to be 64K aligned and the region covers 128K.
|
||||
Errors:
|
||||
-E2BIG: Address outside of addressable IPA range
|
||||
-EINVAL: Incorrectly aligned address
|
||||
-EEXIST: Address already configured
|
||||
-ENXIO: The group or attribute is unknown/unsupported for this device
|
||||
or hardware support is missing.
|
||||
-EFAULT: Invalid user pointer for attr->addr.
|
||||
|
||||
KVM_DEV_ARM_VGIC_GRP_DIST_REGS
|
||||
Attributes:
|
||||
|
|
|
@ -30,4 +30,6 @@ Returns: -ENODEV: PMUv3 not supported
|
|||
attribute
|
||||
-EBUSY: PMUv3 already initialized
|
||||
|
||||
Request the initialization of the PMUv3.
|
||||
Request the initialization of the PMUv3. This must be done after creating the
|
||||
in-kernel irqchip. Creating a PMU with a userspace irqchip is currently not
|
||||
supported.
|
||||
|
|
|
@ -22,9 +22,7 @@
|
|||
|
||||
#include <linux/io.h>
|
||||
#include <asm/barrier.h>
|
||||
|
||||
#define __ACCESS_CP15(CRn, Op1, CRm, Op2) p15, Op1, %0, CRn, CRm, Op2
|
||||
#define __ACCESS_CP15_64(Op1, CRm) p15, Op1, %Q0, %R0, CRm
|
||||
#include <asm/cp15.h>
|
||||
|
||||
#define ICC_EOIR1 __ACCESS_CP15(c12, 0, c12, 1)
|
||||
#define ICC_DIR __ACCESS_CP15(c12, 0, c11, 1)
|
||||
|
@ -99,68 +97,129 @@
|
|||
#define ICH_AP1R2 __AP1Rx(2)
|
||||
#define ICH_AP1R3 __AP1Rx(3)
|
||||
|
||||
/* A32-to-A64 mappings used by VGIC save/restore */
|
||||
|
||||
#define CPUIF_MAP(a32, a64) \
|
||||
static inline void write_ ## a64(u32 val) \
|
||||
{ \
|
||||
write_sysreg(val, a32); \
|
||||
} \
|
||||
static inline u32 read_ ## a64(void) \
|
||||
{ \
|
||||
return read_sysreg(a32); \
|
||||
} \
|
||||
|
||||
#define CPUIF_MAP_LO_HI(a32lo, a32hi, a64) \
|
||||
static inline void write_ ## a64(u64 val) \
|
||||
{ \
|
||||
write_sysreg(lower_32_bits(val), a32lo);\
|
||||
write_sysreg(upper_32_bits(val), a32hi);\
|
||||
} \
|
||||
static inline u64 read_ ## a64(void) \
|
||||
{ \
|
||||
u64 val = read_sysreg(a32lo); \
|
||||
\
|
||||
val |= (u64)read_sysreg(a32hi) << 32; \
|
||||
\
|
||||
return val; \
|
||||
}
|
||||
|
||||
CPUIF_MAP(ICH_HCR, ICH_HCR_EL2)
|
||||
CPUIF_MAP(ICH_VTR, ICH_VTR_EL2)
|
||||
CPUIF_MAP(ICH_MISR, ICH_MISR_EL2)
|
||||
CPUIF_MAP(ICH_EISR, ICH_EISR_EL2)
|
||||
CPUIF_MAP(ICH_ELSR, ICH_ELSR_EL2)
|
||||
CPUIF_MAP(ICH_VMCR, ICH_VMCR_EL2)
|
||||
CPUIF_MAP(ICH_AP0R3, ICH_AP0R3_EL2)
|
||||
CPUIF_MAP(ICH_AP0R2, ICH_AP0R2_EL2)
|
||||
CPUIF_MAP(ICH_AP0R1, ICH_AP0R1_EL2)
|
||||
CPUIF_MAP(ICH_AP0R0, ICH_AP0R0_EL2)
|
||||
CPUIF_MAP(ICH_AP1R3, ICH_AP1R3_EL2)
|
||||
CPUIF_MAP(ICH_AP1R2, ICH_AP1R2_EL2)
|
||||
CPUIF_MAP(ICH_AP1R1, ICH_AP1R1_EL2)
|
||||
CPUIF_MAP(ICH_AP1R0, ICH_AP1R0_EL2)
|
||||
CPUIF_MAP(ICC_HSRE, ICC_SRE_EL2)
|
||||
CPUIF_MAP(ICC_SRE, ICC_SRE_EL1)
|
||||
|
||||
CPUIF_MAP_LO_HI(ICH_LR15, ICH_LRC15, ICH_LR15_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR14, ICH_LRC14, ICH_LR14_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR13, ICH_LRC13, ICH_LR13_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR12, ICH_LRC12, ICH_LR12_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR11, ICH_LRC11, ICH_LR11_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR10, ICH_LRC10, ICH_LR10_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR9, ICH_LRC9, ICH_LR9_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR8, ICH_LRC8, ICH_LR8_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR7, ICH_LRC7, ICH_LR7_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR6, ICH_LRC6, ICH_LR6_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR5, ICH_LRC5, ICH_LR5_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR4, ICH_LRC4, ICH_LR4_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR3, ICH_LRC3, ICH_LR3_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR2, ICH_LRC2, ICH_LR2_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR1, ICH_LRC1, ICH_LR1_EL2)
|
||||
CPUIF_MAP_LO_HI(ICH_LR0, ICH_LRC0, ICH_LR0_EL2)
|
||||
|
||||
#define read_gicreg(r) read_##r()
|
||||
#define write_gicreg(v, r) write_##r(v)
|
||||
|
||||
/* Low-level accessors */
|
||||
|
||||
static inline void gic_write_eoir(u32 irq)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_EOIR1) : : "r" (irq));
|
||||
write_sysreg(irq, ICC_EOIR1);
|
||||
isb();
|
||||
}
|
||||
|
||||
static inline void gic_write_dir(u32 val)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_DIR) : : "r" (val));
|
||||
write_sysreg(val, ICC_DIR);
|
||||
isb();
|
||||
}
|
||||
|
||||
static inline u32 gic_read_iar(void)
|
||||
{
|
||||
u32 irqstat;
|
||||
u32 irqstat = read_sysreg(ICC_IAR1);
|
||||
|
||||
asm volatile("mrc " __stringify(ICC_IAR1) : "=r" (irqstat));
|
||||
dsb(sy);
|
||||
|
||||
return irqstat;
|
||||
}
|
||||
|
||||
static inline void gic_write_pmr(u32 val)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_PMR) : : "r" (val));
|
||||
write_sysreg(val, ICC_PMR);
|
||||
}
|
||||
|
||||
static inline void gic_write_ctlr(u32 val)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_CTLR) : : "r" (val));
|
||||
write_sysreg(val, ICC_CTLR);
|
||||
isb();
|
||||
}
|
||||
|
||||
static inline void gic_write_grpen1(u32 val)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_IGRPEN1) : : "r" (val));
|
||||
write_sysreg(val, ICC_IGRPEN1);
|
||||
isb();
|
||||
}
|
||||
|
||||
static inline void gic_write_sgi1r(u64 val)
|
||||
{
|
||||
asm volatile("mcrr " __stringify(ICC_SGI1R) : : "r" (val));
|
||||
write_sysreg(val, ICC_SGI1R);
|
||||
}
|
||||
|
||||
static inline u32 gic_read_sre(void)
|
||||
{
|
||||
u32 val;
|
||||
|
||||
asm volatile("mrc " __stringify(ICC_SRE) : "=r" (val));
|
||||
return val;
|
||||
return read_sysreg(ICC_SRE);
|
||||
}
|
||||
|
||||
static inline void gic_write_sre(u32 val)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_SRE) : : "r" (val));
|
||||
write_sysreg(val, ICC_SRE);
|
||||
isb();
|
||||
}
|
||||
|
||||
static inline void gic_write_bpr1(u32 val)
|
||||
{
|
||||
asm volatile("mcr " __stringify(ICC_BPR1) : : "r" (val));
|
||||
write_sysreg(val, ICC_BPR1);
|
||||
}
|
||||
|
||||
/*
|
||||
|
|
|
@ -49,6 +49,21 @@
|
|||
|
||||
#ifdef CONFIG_CPU_CP15
|
||||
|
||||
#define __ACCESS_CP15(CRn, Op1, CRm, Op2) \
|
||||
"mrc", "mcr", __stringify(p15, Op1, %0, CRn, CRm, Op2), u32
|
||||
#define __ACCESS_CP15_64(Op1, CRm) \
|
||||
"mrrc", "mcrr", __stringify(p15, Op1, %Q0, %R0, CRm), u64
|
||||
|
||||
#define __read_sysreg(r, w, c, t) ({ \
|
||||
t __val; \
|
||||
asm volatile(r " " c : "=r" (__val)); \
|
||||
__val; \
|
||||
})
|
||||
#define read_sysreg(...) __read_sysreg(__VA_ARGS__)
|
||||
|
||||
#define __write_sysreg(v, r, w, c, t) asm volatile(w " " c : : "r" ((t)(v)))
|
||||
#define write_sysreg(v, ...) __write_sysreg(v, __VA_ARGS__)
|
||||
|
||||
extern unsigned long cr_alignment; /* defined in entry-armv.S */
|
||||
|
||||
static inline unsigned long get_cr(void)
|
||||
|
|
|
@ -55,6 +55,7 @@
|
|||
|
||||
#define MPIDR_LEVEL_BITS 8
|
||||
#define MPIDR_LEVEL_MASK ((1 << MPIDR_LEVEL_BITS) - 1)
|
||||
#define MPIDR_LEVEL_SHIFT(level) (MPIDR_LEVEL_BITS * level)
|
||||
|
||||
#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
|
||||
((mpidr >> (MPIDR_LEVEL_BITS * level)) & MPIDR_LEVEL_MASK)
|
||||
|
|
|
@ -21,6 +21,10 @@
|
|||
|
||||
#include <asm/virt.h>
|
||||
|
||||
#define ARM_EXIT_WITH_ABORT_BIT 31
|
||||
#define ARM_EXCEPTION_CODE(x) ((x) & ~(1U << ARM_EXIT_WITH_ABORT_BIT))
|
||||
#define ARM_ABORT_PENDING(x) !!((x) & (1U << ARM_EXIT_WITH_ABORT_BIT))
|
||||
|
||||
#define ARM_EXCEPTION_RESET 0
|
||||
#define ARM_EXCEPTION_UNDEFINED 1
|
||||
#define ARM_EXCEPTION_SOFTWARE 2
|
||||
|
@ -68,6 +72,9 @@ extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
|
|||
extern void __init_stage2_translation(void);
|
||||
|
||||
extern void __kvm_hyp_reset(unsigned long);
|
||||
|
||||
extern u64 __vgic_v3_get_ich_vtr_el2(void);
|
||||
extern void __vgic_v3_init_lrs(void);
|
||||
#endif
|
||||
|
||||
#endif /* __ARM_KVM_ASM_H__ */
|
||||
|
|
|
@ -40,18 +40,29 @@ static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
|
|||
*vcpu_reg(vcpu, reg_num) = val;
|
||||
}
|
||||
|
||||
bool kvm_condition_valid(struct kvm_vcpu *vcpu);
|
||||
void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr);
|
||||
bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
|
||||
void kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr);
|
||||
void kvm_inject_undefined(struct kvm_vcpu *vcpu);
|
||||
void kvm_inject_vabt(struct kvm_vcpu *vcpu);
|
||||
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
|
||||
static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return kvm_condition_valid32(vcpu);
|
||||
}
|
||||
|
||||
static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
|
||||
{
|
||||
kvm_skip_instr32(vcpu, is_wide_instr);
|
||||
}
|
||||
|
||||
static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
vcpu->arch.hcr = HCR_GUEST_MASK;
|
||||
}
|
||||
|
||||
static inline unsigned long vcpu_get_hcr(struct kvm_vcpu *vcpu)
|
||||
static inline unsigned long vcpu_get_hcr(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return vcpu->arch.hcr;
|
||||
}
|
||||
|
@ -61,7 +72,7 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
|
|||
vcpu->arch.hcr = hcr;
|
||||
}
|
||||
|
||||
static inline bool vcpu_mode_is_32bit(struct kvm_vcpu *vcpu)
|
||||
static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return 1;
|
||||
}
|
||||
|
@ -71,9 +82,9 @@ static inline unsigned long *vcpu_pc(struct kvm_vcpu *vcpu)
|
|||
return &vcpu->arch.ctxt.gp_regs.usr_regs.ARM_pc;
|
||||
}
|
||||
|
||||
static inline unsigned long *vcpu_cpsr(struct kvm_vcpu *vcpu)
|
||||
static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return &vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr;
|
||||
return (unsigned long *)&vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr;
|
||||
}
|
||||
|
||||
static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
|
||||
|
@ -93,11 +104,21 @@ static inline bool vcpu_mode_priv(struct kvm_vcpu *vcpu)
|
|||
return cpsr_mode > USR_MODE;;
|
||||
}
|
||||
|
||||
static inline u32 kvm_vcpu_get_hsr(struct kvm_vcpu *vcpu)
|
||||
static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return vcpu->arch.fault.hsr;
|
||||
}
|
||||
|
||||
static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u32 hsr = kvm_vcpu_get_hsr(vcpu);
|
||||
|
||||
if (hsr & HSR_CV)
|
||||
return (hsr & HSR_COND) >> HSR_COND_SHIFT;
|
||||
|
||||
return -1;
|
||||
}
|
||||
|
||||
static inline unsigned long kvm_vcpu_get_hfar(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return vcpu->arch.fault.hxfar;
|
||||
|
|
|
@ -39,7 +39,12 @@
|
|||
|
||||
#include <kvm/arm_vgic.h>
|
||||
|
||||
|
||||
#ifdef CONFIG_ARM_GIC_V3
|
||||
#define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
|
||||
#else
|
||||
#define KVM_MAX_VCPUS VGIC_V2_MAX_CPUS
|
||||
#endif
|
||||
|
||||
#define KVM_REQ_VCPU_EXIT 8
|
||||
|
||||
|
@ -183,15 +188,15 @@ struct kvm_vcpu_arch {
|
|||
};
|
||||
|
||||
struct kvm_vm_stat {
|
||||
u32 remote_tlb_flush;
|
||||
ulong remote_tlb_flush;
|
||||
};
|
||||
|
||||
struct kvm_vcpu_stat {
|
||||
u32 halt_successful_poll;
|
||||
u32 halt_attempted_poll;
|
||||
u32 halt_poll_invalid;
|
||||
u32 halt_wakeup;
|
||||
u32 hvc_exit_stat;
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_wakeup;
|
||||
u64 hvc_exit_stat;
|
||||
u64 wfe_exit_stat;
|
||||
u64 wfi_exit_stat;
|
||||
u64 mmio_exit_user;
|
||||
|
|
|
@ -20,28 +20,15 @@
|
|||
|
||||
#include <linux/compiler.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <asm/cp15.h>
|
||||
#include <asm/kvm_mmu.h>
|
||||
#include <asm/vfp.h>
|
||||
|
||||
#define __hyp_text __section(.hyp.text) notrace
|
||||
|
||||
#define __ACCESS_CP15(CRn, Op1, CRm, Op2) \
|
||||
"mrc", "mcr", __stringify(p15, Op1, %0, CRn, CRm, Op2), u32
|
||||
#define __ACCESS_CP15_64(Op1, CRm) \
|
||||
"mrrc", "mcrr", __stringify(p15, Op1, %Q0, %R0, CRm), u64
|
||||
#define __ACCESS_VFP(CRn) \
|
||||
"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
|
||||
|
||||
#define __write_sysreg(v, r, w, c, t) asm volatile(w " " c : : "r" ((t)(v)))
|
||||
#define write_sysreg(v, ...) __write_sysreg(v, __VA_ARGS__)
|
||||
|
||||
#define __read_sysreg(r, w, c, t) ({ \
|
||||
t __val; \
|
||||
asm volatile(r " " c : "=r" (__val)); \
|
||||
__val; \
|
||||
})
|
||||
#define read_sysreg(...) __read_sysreg(__VA_ARGS__)
|
||||
|
||||
#define write_special(v, r) \
|
||||
asm volatile("msr " __stringify(r) ", %0" : : "r" (v))
|
||||
#define read_special(r) ({ \
|
||||
|
@ -119,6 +106,9 @@ void __vgic_v2_restore_state(struct kvm_vcpu *vcpu);
|
|||
void __sysreg_save_state(struct kvm_cpu_context *ctxt);
|
||||
void __sysreg_restore_state(struct kvm_cpu_context *ctxt);
|
||||
|
||||
void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
|
||||
void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
|
||||
|
||||
void asmlinkage __vfp_save_state(struct vfp_hard_struct *vfp);
|
||||
void asmlinkage __vfp_restore_state(struct vfp_hard_struct *vfp);
|
||||
static inline bool __vfp_enabled(void)
|
||||
|
|
|
@ -63,37 +63,13 @@ void kvm_clear_hyp_idmap(void);
|
|||
static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
|
||||
{
|
||||
*pmd = new_pmd;
|
||||
flush_pmd_entry(pmd);
|
||||
dsb(ishst);
|
||||
}
|
||||
|
||||
static inline void kvm_set_pte(pte_t *pte, pte_t new_pte)
|
||||
{
|
||||
*pte = new_pte;
|
||||
/*
|
||||
* flush_pmd_entry just takes a void pointer and cleans the necessary
|
||||
* cache entries, so we can reuse the function for ptes.
|
||||
*/
|
||||
flush_pmd_entry(pte);
|
||||
}
|
||||
|
||||
static inline void kvm_clean_pgd(pgd_t *pgd)
|
||||
{
|
||||
clean_dcache_area(pgd, PTRS_PER_S2_PGD * sizeof(pgd_t));
|
||||
}
|
||||
|
||||
static inline void kvm_clean_pmd(pmd_t *pmd)
|
||||
{
|
||||
clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
|
||||
}
|
||||
|
||||
static inline void kvm_clean_pmd_entry(pmd_t *pmd)
|
||||
{
|
||||
clean_pmd_entry(pmd);
|
||||
}
|
||||
|
||||
static inline void kvm_clean_pte(pte_t *pte)
|
||||
{
|
||||
clean_pte_table(pte);
|
||||
dsb(ishst);
|
||||
}
|
||||
|
||||
static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
|
||||
|
|
|
@ -84,6 +84,13 @@ struct kvm_regs {
|
|||
#define KVM_VGIC_V2_DIST_SIZE 0x1000
|
||||
#define KVM_VGIC_V2_CPU_SIZE 0x2000
|
||||
|
||||
/* Supported VGICv3 address types */
|
||||
#define KVM_VGIC_V3_ADDR_TYPE_DIST 2
|
||||
#define KVM_VGIC_V3_ADDR_TYPE_REDIST 3
|
||||
|
||||
#define KVM_VGIC_V3_DIST_SIZE SZ_64K
|
||||
#define KVM_VGIC_V3_REDIST_SIZE (2 * SZ_64K)
|
||||
|
||||
#define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */
|
||||
#define KVM_ARM_VCPU_PSCI_0_2 1 /* CPU uses PSCI v0.2 */
|
||||
|
||||
|
|
|
@ -21,13 +21,16 @@ obj-$(CONFIG_KVM_ARM_HOST) += hyp/
|
|||
obj-y += kvm-arm.o init.o interrupts.o
|
||||
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
|
||||
obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o
|
||||
obj-y += $(KVM)/arm/aarch32.o
|
||||
|
||||
obj-y += $(KVM)/arm/vgic/vgic.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-init.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-irqfd.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-v2.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-v3.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-mmio.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-mmio-v2.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-mmio-v3.o
|
||||
obj-y += $(KVM)/arm/vgic/vgic-kvm-device.o
|
||||
obj-y += $(KVM)/irqchip.o
|
||||
obj-y += $(KVM)/arm/arch_timer.o
|
||||
|
|
|
@ -144,6 +144,16 @@ out_fail_alloc:
|
|||
return ret;
|
||||
}
|
||||
|
||||
bool kvm_arch_has_vcpu_debugfs(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
|
||||
{
|
||||
return VM_FAULT_SIGBUS;
|
||||
|
@ -1176,6 +1186,10 @@ static int init_common_resources(void)
|
|||
return -ENOMEM;
|
||||
}
|
||||
|
||||
/* set size of VMID supported by CPU */
|
||||
kvm_vmid_bits = kvm_get_vmid_bits();
|
||||
kvm_info("%d-bit VMID\n", kvm_vmid_bits);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@ -1241,10 +1255,6 @@ static void teardown_hyp_mode(void)
|
|||
|
||||
static int init_vhe_mode(void)
|
||||
{
|
||||
/* set size of VMID supported by CPU */
|
||||
kvm_vmid_bits = kvm_get_vmid_bits();
|
||||
kvm_info("%d-bit VMID\n", kvm_vmid_bits);
|
||||
|
||||
kvm_info("VHE mode initialized successfully\n");
|
||||
return 0;
|
||||
}
|
||||
|
@ -1328,10 +1338,6 @@ static int init_hyp_mode(void)
|
|||
}
|
||||
}
|
||||
|
||||
/* set size of VMID supported by CPU */
|
||||
kvm_vmid_bits = kvm_get_vmid_bits();
|
||||
kvm_info("%d-bit VMID\n", kvm_vmid_bits);
|
||||
|
||||
kvm_info("Hyp mode initialized successfully\n");
|
||||
|
||||
return 0;
|
||||
|
|
|
@ -228,6 +228,35 @@ bool access_vm_reg(struct kvm_vcpu *vcpu,
|
|||
return true;
|
||||
}
|
||||
|
||||
static bool access_gic_sgi(struct kvm_vcpu *vcpu,
|
||||
const struct coproc_params *p,
|
||||
const struct coproc_reg *r)
|
||||
{
|
||||
u64 reg;
|
||||
|
||||
if (!p->is_write)
|
||||
return read_from_write_only(vcpu, p);
|
||||
|
||||
reg = (u64)*vcpu_reg(vcpu, p->Rt2) << 32;
|
||||
reg |= *vcpu_reg(vcpu, p->Rt1) ;
|
||||
|
||||
vgic_v3_dispatch_sgi(vcpu, reg);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static bool access_gic_sre(struct kvm_vcpu *vcpu,
|
||||
const struct coproc_params *p,
|
||||
const struct coproc_reg *r)
|
||||
{
|
||||
if (p->is_write)
|
||||
return ignore_write(vcpu, p);
|
||||
|
||||
*vcpu_reg(vcpu, p->Rt1) = vcpu->arch.vgic_cpu.vgic_v3.vgic_sre;
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
/*
|
||||
* We could trap ID_DFR0 and tell the guest we don't support performance
|
||||
* monitoring. Unfortunately the patch to make the kernel check ID_DFR0 was
|
||||
|
@ -361,10 +390,16 @@ static const struct coproc_reg cp15_regs[] = {
|
|||
{ CRn(10), CRm( 3), Op1( 0), Op2( 1), is32,
|
||||
access_vm_reg, reset_unknown, c10_AMAIR1},
|
||||
|
||||
/* ICC_SGI1R */
|
||||
{ CRm64(12), Op1( 0), is64, access_gic_sgi},
|
||||
|
||||
/* VBAR: swapped by interrupt.S. */
|
||||
{ CRn(12), CRm( 0), Op1( 0), Op2( 0), is32,
|
||||
NULL, reset_val, c12_VBAR, 0x00000000 },
|
||||
|
||||
/* ICC_SRE */
|
||||
{ CRn(12), CRm(12), Op1( 0), Op2(5), is32, access_gic_sre },
|
||||
|
||||
/* CONTEXTIDR/TPIDRURW/TPIDRURO/TPIDRPRW: swapped by interrupt.S. */
|
||||
{ CRn(13), CRm( 0), Op1( 0), Op2( 1), is32,
|
||||
access_vm_reg, reset_val, c13_CID, 0x00000000 },
|
||||
|
|
|
@ -161,105 +161,6 @@ unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu)
|
|||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* A conditional instruction is allowed to trap, even though it
|
||||
* wouldn't be executed. So let's re-implement the hardware, in
|
||||
* software!
|
||||
*/
|
||||
bool kvm_condition_valid(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
unsigned long cpsr, cond, insn;
|
||||
|
||||
/*
|
||||
* Exception Code 0 can only happen if we set HCR.TGE to 1, to
|
||||
* catch undefined instructions, and then we won't get past
|
||||
* the arm_exit_handlers test anyway.
|
||||
*/
|
||||
BUG_ON(!kvm_vcpu_trap_get_class(vcpu));
|
||||
|
||||
/* Top two bits non-zero? Unconditional. */
|
||||
if (kvm_vcpu_get_hsr(vcpu) >> 30)
|
||||
return true;
|
||||
|
||||
cpsr = *vcpu_cpsr(vcpu);
|
||||
|
||||
/* Is condition field valid? */
|
||||
if ((kvm_vcpu_get_hsr(vcpu) & HSR_CV) >> HSR_CV_SHIFT)
|
||||
cond = (kvm_vcpu_get_hsr(vcpu) & HSR_COND) >> HSR_COND_SHIFT;
|
||||
else {
|
||||
/* This can happen in Thumb mode: examine IT state. */
|
||||
unsigned long it;
|
||||
|
||||
it = ((cpsr >> 8) & 0xFC) | ((cpsr >> 25) & 0x3);
|
||||
|
||||
/* it == 0 => unconditional. */
|
||||
if (it == 0)
|
||||
return true;
|
||||
|
||||
/* The cond for this insn works out as the top 4 bits. */
|
||||
cond = (it >> 4);
|
||||
}
|
||||
|
||||
/* Shift makes it look like an ARM-mode instruction */
|
||||
insn = cond << 28;
|
||||
return arm_check_condition(insn, cpsr) != ARM_OPCODE_CONDTEST_FAIL;
|
||||
}
|
||||
|
||||
/**
|
||||
* adjust_itstate - adjust ITSTATE when emulating instructions in IT-block
|
||||
* @vcpu: The VCPU pointer
|
||||
*
|
||||
* When exceptions occur while instructions are executed in Thumb IF-THEN
|
||||
* blocks, the ITSTATE field of the CPSR is not advanced (updated), so we have
|
||||
* to do this little bit of work manually. The fields map like this:
|
||||
*
|
||||
* IT[7:0] -> CPSR[26:25],CPSR[15:10]
|
||||
*/
|
||||
static void kvm_adjust_itstate(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
unsigned long itbits, cond;
|
||||
unsigned long cpsr = *vcpu_cpsr(vcpu);
|
||||
bool is_arm = !(cpsr & PSR_T_BIT);
|
||||
|
||||
BUG_ON(is_arm && (cpsr & PSR_IT_MASK));
|
||||
|
||||
if (!(cpsr & PSR_IT_MASK))
|
||||
return;
|
||||
|
||||
cond = (cpsr & 0xe000) >> 13;
|
||||
itbits = (cpsr & 0x1c00) >> (10 - 2);
|
||||
itbits |= (cpsr & (0x3 << 25)) >> 25;
|
||||
|
||||
/* Perform ITAdvance (see page A-52 in ARM DDI 0406C) */
|
||||
if ((itbits & 0x7) == 0)
|
||||
itbits = cond = 0;
|
||||
else
|
||||
itbits = (itbits << 1) & 0x1f;
|
||||
|
||||
cpsr &= ~PSR_IT_MASK;
|
||||
cpsr |= cond << 13;
|
||||
cpsr |= (itbits & 0x1c) << (10 - 2);
|
||||
cpsr |= (itbits & 0x3) << 25;
|
||||
*vcpu_cpsr(vcpu) = cpsr;
|
||||
}
|
||||
|
||||
/**
|
||||
* kvm_skip_instr - skip a trapped instruction and proceed to the next
|
||||
* @vcpu: The vcpu pointer
|
||||
*/
|
||||
void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
|
||||
{
|
||||
bool is_thumb;
|
||||
|
||||
is_thumb = !!(*vcpu_cpsr(vcpu) & PSR_T_BIT);
|
||||
if (is_thumb && !is_wide_instr)
|
||||
*vcpu_pc(vcpu) += 2;
|
||||
else
|
||||
*vcpu_pc(vcpu) += 4;
|
||||
kvm_adjust_itstate(vcpu);
|
||||
}
|
||||
|
||||
|
||||
/******************************************************************************
|
||||
* Inject exceptions into the guest
|
||||
*/
|
||||
|
@ -402,3 +303,15 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr)
|
|||
{
|
||||
inject_abt(vcpu, true, addr);
|
||||
}
|
||||
|
||||
/**
|
||||
* kvm_inject_vabt - inject an async abort / SError into the guest
|
||||
* @vcpu: The VCPU to receive the exception
|
||||
*
|
||||
* It is assumed that this code is called from the VCPU thread and that the
|
||||
* VCPU therefore is not currently executing guest code.
|
||||
*/
|
||||
void kvm_inject_vabt(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VA);
|
||||
}
|
||||
|
|
|
@ -28,14 +28,6 @@
|
|||
|
||||
typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
|
||||
|
||||
static int handle_svc_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
{
|
||||
/* SVC called from Hyp mode should never get here */
|
||||
kvm_debug("SVC called from Hyp mode shouldn't go here\n");
|
||||
BUG();
|
||||
return -EINVAL; /* Squash warning */
|
||||
}
|
||||
|
||||
static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
{
|
||||
int ret;
|
||||
|
@ -59,22 +51,6 @@ static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
|||
return 1;
|
||||
}
|
||||
|
||||
static int handle_pabt_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
{
|
||||
/* The hypervisor should never cause aborts */
|
||||
kvm_err("Prefetch Abort taken from Hyp mode at %#08lx (HSR: %#08x)\n",
|
||||
kvm_vcpu_get_hfar(vcpu), kvm_vcpu_get_hsr(vcpu));
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
static int handle_dabt_hyp(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
{
|
||||
/* This is either an error in the ws. code or an external abort */
|
||||
kvm_err("Data Abort taken from Hyp mode at %#08lx (HSR: %#08x)\n",
|
||||
kvm_vcpu_get_hfar(vcpu), kvm_vcpu_get_hsr(vcpu));
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
/**
|
||||
* kvm_handle_wfx - handle a WFI or WFE instructions trapped in guests
|
||||
* @vcpu: the vcpu pointer
|
||||
|
@ -112,13 +88,10 @@ static exit_handle_fn arm_exit_handlers[] = {
|
|||
[HSR_EC_CP14_64] = kvm_handle_cp14_access,
|
||||
[HSR_EC_CP_0_13] = kvm_handle_cp_0_13_access,
|
||||
[HSR_EC_CP10_ID] = kvm_handle_cp10_id,
|
||||
[HSR_EC_SVC_HYP] = handle_svc_hyp,
|
||||
[HSR_EC_HVC] = handle_hvc,
|
||||
[HSR_EC_SMC] = handle_smc,
|
||||
[HSR_EC_IABT] = kvm_handle_guest_abort,
|
||||
[HSR_EC_IABT_HYP] = handle_pabt_hyp,
|
||||
[HSR_EC_DABT] = kvm_handle_guest_abort,
|
||||
[HSR_EC_DABT_HYP] = handle_dabt_hyp,
|
||||
};
|
||||
|
||||
static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
|
||||
|
@ -144,6 +117,25 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
|||
{
|
||||
exit_handle_fn exit_handler;
|
||||
|
||||
if (ARM_ABORT_PENDING(exception_index)) {
|
||||
u8 hsr_ec = kvm_vcpu_trap_get_class(vcpu);
|
||||
|
||||
/*
|
||||
* HVC/SMC already have an adjusted PC, which we need
|
||||
* to correct in order to return to after having
|
||||
* injected the abort.
|
||||
*/
|
||||
if (hsr_ec == HSR_EC_HVC || hsr_ec == HSR_EC_SMC) {
|
||||
u32 adj = kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;
|
||||
*vcpu_pc(vcpu) -= adj;
|
||||
}
|
||||
|
||||
kvm_inject_vabt(vcpu);
|
||||
return 1;
|
||||
}
|
||||
|
||||
exception_index = ARM_EXCEPTION_CODE(exception_index);
|
||||
|
||||
switch (exception_index) {
|
||||
case ARM_EXCEPTION_IRQ:
|
||||
return 1;
|
||||
|
@ -160,6 +152,9 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
|||
exit_handler = kvm_get_exit_handler(vcpu);
|
||||
|
||||
return exit_handler(vcpu, run);
|
||||
case ARM_EXCEPTION_DATA_ABORT:
|
||||
kvm_inject_vabt(vcpu);
|
||||
return 1;
|
||||
default:
|
||||
kvm_pr_unimpl("Unsupported exception type: %d",
|
||||
exception_index);
|
||||
|
|
|
@ -5,6 +5,7 @@
|
|||
KVM=../../../../virt/kvm
|
||||
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
|
||||
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += tlb.o
|
||||
|
|
|
@ -18,6 +18,7 @@
|
|||
#include <linux/linkage.h>
|
||||
#include <asm/asm-offsets.h>
|
||||
#include <asm/kvm_arm.h>
|
||||
#include <asm/kvm_asm.h>
|
||||
|
||||
.arch_extension virt
|
||||
|
||||
|
@ -63,6 +64,36 @@ ENTRY(__guest_exit)
|
|||
ldr lr, [r0, #4]
|
||||
|
||||
mov r0, r1
|
||||
mrs r1, SPSR
|
||||
mrs r2, ELR_hyp
|
||||
mrc p15, 4, r3, c5, c2, 0 @ HSR
|
||||
|
||||
/*
|
||||
* Force loads and stores to complete before unmasking aborts
|
||||
* and forcing the delivery of the exception. This gives us a
|
||||
* single instruction window, which the handler will try to
|
||||
* match.
|
||||
*/
|
||||
dsb sy
|
||||
cpsie a
|
||||
|
||||
.global abort_guest_exit_start
|
||||
abort_guest_exit_start:
|
||||
|
||||
isb
|
||||
|
||||
.global abort_guest_exit_end
|
||||
abort_guest_exit_end:
|
||||
|
||||
/*
|
||||
* If we took an abort, r0[31] will be set, and cmp will set
|
||||
* the N bit in PSTATE.
|
||||
*/
|
||||
cmp r0, #0
|
||||
msrmi SPSR_cxsf, r1
|
||||
msrmi ELR_hyp, r2
|
||||
mcrmi p15, 4, r3, c5, c2, 0 @ HSR
|
||||
|
||||
bx lr
|
||||
ENDPROC(__guest_exit)
|
||||
|
||||
|
|
|
@ -81,7 +81,6 @@ __kvm_hyp_vector:
|
|||
invalid_vector hyp_undef ARM_EXCEPTION_UNDEFINED
|
||||
invalid_vector hyp_svc ARM_EXCEPTION_SOFTWARE
|
||||
invalid_vector hyp_pabt ARM_EXCEPTION_PREF_ABORT
|
||||
invalid_vector hyp_dabt ARM_EXCEPTION_DATA_ABORT
|
||||
invalid_vector hyp_fiq ARM_EXCEPTION_FIQ
|
||||
|
||||
ENTRY(__hyp_do_panic)
|
||||
|
@ -164,6 +163,21 @@ hyp_irq:
|
|||
load_vcpu r0 @ Load VCPU pointer to r0
|
||||
b __guest_exit
|
||||
|
||||
hyp_dabt:
|
||||
push {r0, r1}
|
||||
mrs r0, ELR_hyp
|
||||
ldr r1, =abort_guest_exit_start
|
||||
THUMB( add r1, r1, #1)
|
||||
cmp r0, r1
|
||||
ldrne r1, =abort_guest_exit_end
|
||||
THUMB( addne r1, r1, #1)
|
||||
cmpne r0, r1
|
||||
pop {r0, r1}
|
||||
bne __hyp_panic
|
||||
|
||||
orr r0, r0, #(1 << ARM_EXIT_WITH_ABORT_BIT)
|
||||
eret
|
||||
|
||||
.ltorg
|
||||
|
||||
.popsection
|
||||
|
|
|
@ -14,6 +14,7 @@
|
|||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program. If not, see <http://www.gnu.org/licenses/>.
|
||||
*/
|
||||
#include <linux/jump_label.h>
|
||||
|
||||
#include <asm/kvm_asm.h>
|
||||
#include <asm/kvm_hyp.h>
|
||||
|
@ -54,6 +55,15 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
|
|||
{
|
||||
u32 val;
|
||||
|
||||
/*
|
||||
* If we pended a virtual abort, preserve it until it gets
|
||||
* cleared. See B1.9.9 (Virtual Abort exception) for details,
|
||||
* but the crucial bit is the zeroing of HCR.VA in the
|
||||
* pseudocode.
|
||||
*/
|
||||
if (vcpu->arch.hcr & HCR_VA)
|
||||
vcpu->arch.hcr = read_sysreg(HCR);
|
||||
|
||||
write_sysreg(0, HCR);
|
||||
write_sysreg(0, HSTR);
|
||||
val = read_sysreg(HDCR);
|
||||
|
@ -74,13 +84,20 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu)
|
|||
write_sysreg(read_sysreg(MIDR), VPIDR);
|
||||
}
|
||||
|
||||
|
||||
static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
|
||||
__vgic_v3_save_state(vcpu);
|
||||
else
|
||||
__vgic_v2_save_state(vcpu);
|
||||
}
|
||||
|
||||
static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
|
||||
__vgic_v3_restore_state(vcpu);
|
||||
else
|
||||
__vgic_v2_restore_state(vcpu);
|
||||
}
|
||||
|
||||
|
@ -134,7 +151,7 @@ static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
|
|||
return true;
|
||||
}
|
||||
|
||||
static int __hyp_text __guest_run(struct kvm_vcpu *vcpu)
|
||||
int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpu_context *host_ctxt;
|
||||
struct kvm_cpu_context *guest_ctxt;
|
||||
|
@ -191,8 +208,6 @@ again:
|
|||
return exit_code;
|
||||
}
|
||||
|
||||
__alias(__guest_run) int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
|
||||
|
||||
static const char * const __hyp_panic_string[] = {
|
||||
[ARM_EXCEPTION_RESET] = "\nHYP panic: RST PC:%08x CPSR:%08x",
|
||||
[ARM_EXCEPTION_UNDEFINED] = "\nHYP panic: UNDEF PC:%08x CPSR:%08x",
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
* As v7 does not support flushing per IPA, just nuke the whole TLB
|
||||
* instead, ignoring the ipa value.
|
||||
*/
|
||||
static void __hyp_text __tlb_flush_vmid(struct kvm *kvm)
|
||||
void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
|
||||
{
|
||||
dsb(ishst);
|
||||
|
||||
|
@ -50,21 +50,14 @@ static void __hyp_text __tlb_flush_vmid(struct kvm *kvm)
|
|||
write_sysreg(0, VTTBR);
|
||||
}
|
||||
|
||||
__alias(__tlb_flush_vmid) void __kvm_tlb_flush_vmid(struct kvm *kvm);
|
||||
|
||||
static void __hyp_text __tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
|
||||
void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
|
||||
{
|
||||
__tlb_flush_vmid(kvm);
|
||||
__kvm_tlb_flush_vmid(kvm);
|
||||
}
|
||||
|
||||
__alias(__tlb_flush_vmid_ipa) void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm,
|
||||
phys_addr_t ipa);
|
||||
|
||||
static void __hyp_text __tlb_flush_vm_context(void)
|
||||
void __hyp_text __kvm_flush_vm_context(void)
|
||||
{
|
||||
write_sysreg(0, TLBIALLNSNHIS);
|
||||
write_sysreg(0, ICIALLUIS);
|
||||
dsb(ish);
|
||||
}
|
||||
|
||||
__alias(__tlb_flush_vm_context) void __kvm_flush_vm_context(void);
|
||||
|
|
|
@ -126,12 +126,6 @@ static int decode_hsr(struct kvm_vcpu *vcpu, bool *is_write, int *len)
|
|||
int access_size;
|
||||
bool sign_extend;
|
||||
|
||||
if (kvm_vcpu_dabt_isextabt(vcpu)) {
|
||||
/* cache operation on I/O addr, tell guest unsupported */
|
||||
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (kvm_vcpu_dabt_iss1tw(vcpu)) {
|
||||
/* page table accesses IO mem: tell guest to fix its TTBR */
|
||||
kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
|
||||
|
|
|
@ -744,7 +744,6 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm)
|
|||
if (!pgd)
|
||||
return -ENOMEM;
|
||||
|
||||
kvm_clean_pgd(pgd);
|
||||
kvm->arch.pgd = pgd;
|
||||
return 0;
|
||||
}
|
||||
|
@ -936,7 +935,6 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
|
|||
if (!cache)
|
||||
return 0; /* ignore calls from kvm_set_spte_hva */
|
||||
pte = mmu_memory_cache_alloc(cache);
|
||||
kvm_clean_pte(pte);
|
||||
pmd_populate_kernel(NULL, pmd, pte);
|
||||
get_page(virt_to_page(pmd));
|
||||
}
|
||||
|
@ -1434,6 +1432,11 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
|||
int ret, idx;
|
||||
|
||||
is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
|
||||
if (unlikely(!is_iabt && kvm_vcpu_dabt_isextabt(vcpu))) {
|
||||
kvm_inject_vabt(vcpu);
|
||||
return 1;
|
||||
}
|
||||
|
||||
fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
|
||||
|
||||
trace_kvm_guest_fault(*vcpu_pc(vcpu), kvm_vcpu_get_hsr(vcpu),
|
||||
|
|
|
@ -80,6 +80,19 @@
|
|||
#include <linux/stringify.h>
|
||||
#include <asm/barrier.h>
|
||||
|
||||
#define read_gicreg(r) \
|
||||
({ \
|
||||
u64 reg; \
|
||||
asm volatile("mrs_s %0, " __stringify(r) : "=r" (reg)); \
|
||||
reg; \
|
||||
})
|
||||
|
||||
#define write_gicreg(v,r) \
|
||||
do { \
|
||||
u64 __val = (v); \
|
||||
asm volatile("msr_s " __stringify(r) ", %0" : : "r" (__val));\
|
||||
} while (0)
|
||||
|
||||
/*
|
||||
* Low-level accessors
|
||||
*
|
||||
|
|
|
@ -50,7 +50,7 @@
|
|||
#define HCR_BSU (3 << 10)
|
||||
#define HCR_BSU_IS (UL(1) << 10)
|
||||
#define HCR_FB (UL(1) << 9)
|
||||
#define HCR_VA (UL(1) << 8)
|
||||
#define HCR_VSE (UL(1) << 8)
|
||||
#define HCR_VI (UL(1) << 7)
|
||||
#define HCR_VF (UL(1) << 6)
|
||||
#define HCR_AMO (UL(1) << 5)
|
||||
|
@ -80,7 +80,7 @@
|
|||
#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
|
||||
HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
|
||||
HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW)
|
||||
#define HCR_VIRT_EXCP_MASK (HCR_VA | HCR_VI | HCR_VF)
|
||||
#define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
|
||||
#define HCR_INT_OVERRIDE (HCR_FMO | HCR_IMO)
|
||||
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
|
||||
|
||||
|
|
|
@ -20,10 +20,15 @@
|
|||
|
||||
#include <asm/virt.h>
|
||||
|
||||
#define ARM_EXIT_WITH_SERROR_BIT 31
|
||||
#define ARM_EXCEPTION_CODE(x) ((x) & ~(1U << ARM_EXIT_WITH_SERROR_BIT))
|
||||
#define ARM_SERROR_PENDING(x) !!((x) & (1U << ARM_EXIT_WITH_SERROR_BIT))
|
||||
|
||||
#define ARM_EXCEPTION_IRQ 0
|
||||
#define ARM_EXCEPTION_TRAP 1
|
||||
#define ARM_EXCEPTION_EL1_SERROR 1
|
||||
#define ARM_EXCEPTION_TRAP 2
|
||||
/* The hyp-stub will return this for any kvm_call_hyp() call */
|
||||
#define ARM_EXCEPTION_HYP_GONE 2
|
||||
#define ARM_EXCEPTION_HYP_GONE 3
|
||||
|
||||
#define KVM_ARM64_DEBUG_DIRTY_SHIFT 0
|
||||
#define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
|
||||
|
|
|
@ -38,6 +38,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
|
|||
void kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr);
|
||||
|
||||
void kvm_inject_undefined(struct kvm_vcpu *vcpu);
|
||||
void kvm_inject_vabt(struct kvm_vcpu *vcpu);
|
||||
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
|
||||
|
@ -147,6 +148,16 @@ static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
|
|||
return vcpu->arch.fault.esr_el2;
|
||||
}
|
||||
|
||||
static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u32 esr = kvm_vcpu_get_hsr(vcpu);
|
||||
|
||||
if (esr & ESR_ELx_CV)
|
||||
return (esr & ESR_ELx_COND_MASK) >> ESR_ELx_COND_SHIFT;
|
||||
|
||||
return -1;
|
||||
}
|
||||
|
||||
static inline unsigned long kvm_vcpu_get_hfar(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return vcpu->arch.fault.far_el2;
|
||||
|
|
|
@ -290,15 +290,15 @@ struct kvm_vcpu_arch {
|
|||
#endif
|
||||
|
||||
struct kvm_vm_stat {
|
||||
u32 remote_tlb_flush;
|
||||
ulong remote_tlb_flush;
|
||||
};
|
||||
|
||||
struct kvm_vcpu_stat {
|
||||
u32 halt_successful_poll;
|
||||
u32 halt_attempted_poll;
|
||||
u32 halt_poll_invalid;
|
||||
u32 halt_wakeup;
|
||||
u32 hvc_exit_stat;
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_wakeup;
|
||||
u64 hvc_exit_stat;
|
||||
u64 wfe_exit_stat;
|
||||
u64 wfi_exit_stat;
|
||||
u64 mmio_exit_user;
|
||||
|
|
|
@ -123,6 +123,7 @@ typeof(orig) * __hyp_text fname(void) \
|
|||
|
||||
void __vgic_v2_save_state(struct kvm_vcpu *vcpu);
|
||||
void __vgic_v2_restore_state(struct kvm_vcpu *vcpu);
|
||||
int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
|
||||
|
||||
void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
|
||||
void __vgic_v3_restore_state(struct kvm_vcpu *vcpu);
|
||||
|
|
|
@ -162,12 +162,6 @@ void kvm_clear_hyp_idmap(void);
|
|||
#define kvm_set_pte(ptep, pte) set_pte(ptep, pte)
|
||||
#define kvm_set_pmd(pmdp, pmd) set_pmd(pmdp, pmd)
|
||||
|
||||
static inline void kvm_clean_pgd(pgd_t *pgd) {}
|
||||
static inline void kvm_clean_pmd(pmd_t *pmd) {}
|
||||
static inline void kvm_clean_pmd_entry(pmd_t *pmd) {}
|
||||
static inline void kvm_clean_pte(pte_t *pte) {}
|
||||
static inline void kvm_clean_pte_entry(pte_t *pte) {}
|
||||
|
||||
static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
|
||||
{
|
||||
pte_val(pte) |= PTE_S2_RDWR;
|
||||
|
|
|
@ -16,7 +16,7 @@ menuconfig VIRTUALIZATION
|
|||
|
||||
if VIRTUALIZATION
|
||||
|
||||
config KVM_ARM_VGIC_V3
|
||||
config KVM_ARM_VGIC_V3_ITS
|
||||
bool
|
||||
|
||||
config KVM
|
||||
|
@ -34,7 +34,7 @@ config KVM
|
|||
select KVM_VFIO
|
||||
select HAVE_KVM_EVENTFD
|
||||
select HAVE_KVM_IRQFD
|
||||
select KVM_ARM_VGIC_V3
|
||||
select KVM_ARM_VGIC_V3_ITS
|
||||
select KVM_ARM_PMU if HW_PERF_EVENTS
|
||||
select HAVE_KVM_MSI
|
||||
select HAVE_KVM_IRQCHIP
|
||||
|
|
|
@ -16,9 +16,10 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
|
|||
kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/arm.o $(ARM)/mmu.o $(ARM)/mmio.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o
|
||||
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += emulate.o inject_fault.o regmap.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
|
||||
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-init.o
|
||||
|
|
|
@ -170,9 +170,32 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
|
|||
{
|
||||
exit_handle_fn exit_handler;
|
||||
|
||||
if (ARM_SERROR_PENDING(exception_index)) {
|
||||
u8 hsr_ec = ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
|
||||
|
||||
/*
|
||||
* HVC/SMC already have an adjusted PC, which we need
|
||||
* to correct in order to return to after having
|
||||
* injected the SError.
|
||||
*/
|
||||
if (hsr_ec == ESR_ELx_EC_HVC32 || hsr_ec == ESR_ELx_EC_HVC64 ||
|
||||
hsr_ec == ESR_ELx_EC_SMC32 || hsr_ec == ESR_ELx_EC_SMC64) {
|
||||
u32 adj = kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;
|
||||
*vcpu_pc(vcpu) -= adj;
|
||||
}
|
||||
|
||||
kvm_inject_vabt(vcpu);
|
||||
return 1;
|
||||
}
|
||||
|
||||
exception_index = ARM_EXCEPTION_CODE(exception_index);
|
||||
|
||||
switch (exception_index) {
|
||||
case ARM_EXCEPTION_IRQ:
|
||||
return 1;
|
||||
case ARM_EXCEPTION_EL1_SERROR:
|
||||
kvm_inject_vabt(vcpu);
|
||||
return 1;
|
||||
case ARM_EXCEPTION_TRAP:
|
||||
/*
|
||||
* See ARM ARM B1.14.1: "Hyp traps on instructions
|
||||
|
|
|
@ -5,9 +5,9 @@
|
|||
KVM=../../../../virt/kvm
|
||||
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v2-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/vgic-v3-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/hyp/timer-sr.o
|
||||
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += vgic-v3-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += sysreg-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += debug-sr.o
|
||||
obj-$(CONFIG_KVM_ARM_HOST) += entry.o
|
||||
|
|
|
@ -131,9 +131,7 @@ void __hyp_text __debug_cond_restore_host_state(struct kvm_vcpu *vcpu)
|
|||
vcpu->arch.debug_flags &= ~KVM_ARM64_DEBUG_DIRTY;
|
||||
}
|
||||
|
||||
static u32 __hyp_text __debug_read_mdcr_el2(void)
|
||||
u32 __hyp_text __kvm_get_mdcr_el2(void)
|
||||
{
|
||||
return read_sysreg(mdcr_el2);
|
||||
}
|
||||
|
||||
__alias(__debug_read_mdcr_el2) u32 __kvm_get_mdcr_el2(void);
|
||||
|
|
|
@ -55,79 +55,111 @@
|
|||
*/
|
||||
ENTRY(__guest_enter)
|
||||
// x0: vcpu
|
||||
// x1: host/guest context
|
||||
// x2-x18: clobbered by macros
|
||||
// x1: host context
|
||||
// x2-x17: clobbered by macros
|
||||
// x18: guest context
|
||||
|
||||
// Store the host regs
|
||||
save_callee_saved_regs x1
|
||||
|
||||
// Preserve vcpu & host_ctxt for use at exit time
|
||||
stp x0, x1, [sp, #-16]!
|
||||
// Store the host_ctxt for use at exit time
|
||||
str x1, [sp, #-16]!
|
||||
|
||||
add x1, x0, #VCPU_CONTEXT
|
||||
add x18, x0, #VCPU_CONTEXT
|
||||
|
||||
// Prepare x0-x1 for later restore by pushing them onto the stack
|
||||
ldp x2, x3, [x1, #CPU_XREG_OFFSET(0)]
|
||||
stp x2, x3, [sp, #-16]!
|
||||
// Restore guest regs x0-x17
|
||||
ldp x0, x1, [x18, #CPU_XREG_OFFSET(0)]
|
||||
ldp x2, x3, [x18, #CPU_XREG_OFFSET(2)]
|
||||
ldp x4, x5, [x18, #CPU_XREG_OFFSET(4)]
|
||||
ldp x6, x7, [x18, #CPU_XREG_OFFSET(6)]
|
||||
ldp x8, x9, [x18, #CPU_XREG_OFFSET(8)]
|
||||
ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)]
|
||||
ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)]
|
||||
ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)]
|
||||
ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)]
|
||||
|
||||
// x2-x18
|
||||
ldp x2, x3, [x1, #CPU_XREG_OFFSET(2)]
|
||||
ldp x4, x5, [x1, #CPU_XREG_OFFSET(4)]
|
||||
ldp x6, x7, [x1, #CPU_XREG_OFFSET(6)]
|
||||
ldp x8, x9, [x1, #CPU_XREG_OFFSET(8)]
|
||||
ldp x10, x11, [x1, #CPU_XREG_OFFSET(10)]
|
||||
ldp x12, x13, [x1, #CPU_XREG_OFFSET(12)]
|
||||
ldp x14, x15, [x1, #CPU_XREG_OFFSET(14)]
|
||||
ldp x16, x17, [x1, #CPU_XREG_OFFSET(16)]
|
||||
ldr x18, [x1, #CPU_XREG_OFFSET(18)]
|
||||
// Restore guest regs x19-x29, lr
|
||||
restore_callee_saved_regs x18
|
||||
|
||||
// x19-x29, lr
|
||||
restore_callee_saved_regs x1
|
||||
|
||||
// Last bits of the 64bit state
|
||||
ldp x0, x1, [sp], #16
|
||||
// Restore guest reg x18
|
||||
ldr x18, [x18, #CPU_XREG_OFFSET(18)]
|
||||
|
||||
// Do not touch any register after this!
|
||||
eret
|
||||
ENDPROC(__guest_enter)
|
||||
|
||||
ENTRY(__guest_exit)
|
||||
// x0: vcpu
|
||||
// x1: return code
|
||||
// x2-x3: free
|
||||
// x4-x29,lr: vcpu regs
|
||||
// vcpu x0-x3 on the stack
|
||||
// x0: return code
|
||||
// x1: vcpu
|
||||
// x2-x29,lr: vcpu regs
|
||||
// vcpu x0-x1 on the stack
|
||||
|
||||
add x2, x0, #VCPU_CONTEXT
|
||||
add x1, x1, #VCPU_CONTEXT
|
||||
|
||||
stp x4, x5, [x2, #CPU_XREG_OFFSET(4)]
|
||||
stp x6, x7, [x2, #CPU_XREG_OFFSET(6)]
|
||||
stp x8, x9, [x2, #CPU_XREG_OFFSET(8)]
|
||||
stp x10, x11, [x2, #CPU_XREG_OFFSET(10)]
|
||||
stp x12, x13, [x2, #CPU_XREG_OFFSET(12)]
|
||||
stp x14, x15, [x2, #CPU_XREG_OFFSET(14)]
|
||||
stp x16, x17, [x2, #CPU_XREG_OFFSET(16)]
|
||||
str x18, [x2, #CPU_XREG_OFFSET(18)]
|
||||
ALTERNATIVE(nop, SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
|
||||
|
||||
ldp x6, x7, [sp], #16 // x2, x3
|
||||
ldp x4, x5, [sp], #16 // x0, x1
|
||||
// Store the guest regs x2 and x3
|
||||
stp x2, x3, [x1, #CPU_XREG_OFFSET(2)]
|
||||
|
||||
stp x4, x5, [x2, #CPU_XREG_OFFSET(0)]
|
||||
stp x6, x7, [x2, #CPU_XREG_OFFSET(2)]
|
||||
// Retrieve the guest regs x0-x1 from the stack
|
||||
ldp x2, x3, [sp], #16 // x0, x1
|
||||
|
||||
save_callee_saved_regs x2
|
||||
// Store the guest regs x0-x1 and x4-x18
|
||||
stp x2, x3, [x1, #CPU_XREG_OFFSET(0)]
|
||||
stp x4, x5, [x1, #CPU_XREG_OFFSET(4)]
|
||||
stp x6, x7, [x1, #CPU_XREG_OFFSET(6)]
|
||||
stp x8, x9, [x1, #CPU_XREG_OFFSET(8)]
|
||||
stp x10, x11, [x1, #CPU_XREG_OFFSET(10)]
|
||||
stp x12, x13, [x1, #CPU_XREG_OFFSET(12)]
|
||||
stp x14, x15, [x1, #CPU_XREG_OFFSET(14)]
|
||||
stp x16, x17, [x1, #CPU_XREG_OFFSET(16)]
|
||||
str x18, [x1, #CPU_XREG_OFFSET(18)]
|
||||
|
||||
// Store the guest regs x19-x29, lr
|
||||
save_callee_saved_regs x1
|
||||
|
||||
// Restore the host_ctxt from the stack
|
||||
ldr x2, [sp], #16
|
||||
|
||||
// Restore vcpu & host_ctxt from the stack
|
||||
// (preserving return code in x1)
|
||||
ldp x0, x2, [sp], #16
|
||||
// Now restore the host regs
|
||||
restore_callee_saved_regs x2
|
||||
|
||||
mov x0, x1
|
||||
ret
|
||||
// If we have a pending asynchronous abort, now is the
|
||||
// time to find out. From your VAXorcist book, page 666:
|
||||
// "Threaten me not, oh Evil one! For I speak with
|
||||
// the power of DEC, and I command thee to show thyself!"
|
||||
mrs x2, elr_el2
|
||||
mrs x3, esr_el2
|
||||
mrs x4, spsr_el2
|
||||
mov x5, x0
|
||||
|
||||
dsb sy // Synchronize against in-flight ld/st
|
||||
msr daifclr, #4 // Unmask aborts
|
||||
|
||||
// This is our single instruction exception window. A pending
|
||||
// SError is guaranteed to occur at the earliest when we unmask
|
||||
// it, and at the latest just after the ISB.
|
||||
.global abort_guest_exit_start
|
||||
abort_guest_exit_start:
|
||||
|
||||
isb
|
||||
|
||||
.global abort_guest_exit_end
|
||||
abort_guest_exit_end:
|
||||
|
||||
// If the exception took place, restore the EL1 exception
|
||||
// context so that we can report some information.
|
||||
// Merge the exception code with the SError pending bit.
|
||||
tbz x0, #ARM_EXIT_WITH_SERROR_BIT, 1f
|
||||
msr elr_el2, x2
|
||||
msr esr_el2, x3
|
||||
msr spsr_el2, x4
|
||||
orr x0, x0, x5
|
||||
1: ret
|
||||
ENDPROC(__guest_exit)
|
||||
|
||||
ENTRY(__fpsimd_guest_restore)
|
||||
stp x2, x3, [sp, #-16]!
|
||||
stp x4, lr, [sp, #-16]!
|
||||
|
||||
alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
|
||||
|
|
|
@ -27,16 +27,6 @@
|
|||
.text
|
||||
.pushsection .hyp.text, "ax"
|
||||
|
||||
.macro save_x0_to_x3
|
||||
stp x0, x1, [sp, #-16]!
|
||||
stp x2, x3, [sp, #-16]!
|
||||
.endm
|
||||
|
||||
.macro restore_x0_to_x3
|
||||
ldp x2, x3, [sp], #16
|
||||
ldp x0, x1, [sp], #16
|
||||
.endm
|
||||
|
||||
.macro do_el2_call
|
||||
/*
|
||||
* Shuffle the parameters before calling the function
|
||||
|
@ -79,23 +69,23 @@ ENTRY(__kvm_hyp_teardown)
|
|||
ENDPROC(__kvm_hyp_teardown)
|
||||
|
||||
el1_sync: // Guest trapped into EL2
|
||||
save_x0_to_x3
|
||||
stp x0, x1, [sp, #-16]!
|
||||
|
||||
alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
|
||||
mrs x1, esr_el2
|
||||
alternative_else
|
||||
mrs x1, esr_el1
|
||||
alternative_endif
|
||||
lsr x2, x1, #ESR_ELx_EC_SHIFT
|
||||
lsr x0, x1, #ESR_ELx_EC_SHIFT
|
||||
|
||||
cmp x2, #ESR_ELx_EC_HVC64
|
||||
cmp x0, #ESR_ELx_EC_HVC64
|
||||
b.ne el1_trap
|
||||
|
||||
mrs x3, vttbr_el2 // If vttbr is valid, the 64bit guest
|
||||
cbnz x3, el1_trap // called HVC
|
||||
mrs x1, vttbr_el2 // If vttbr is valid, the 64bit guest
|
||||
cbnz x1, el1_trap // called HVC
|
||||
|
||||
/* Here, we're pretty sure the host called HVC. */
|
||||
restore_x0_to_x3
|
||||
ldp x0, x1, [sp], #16
|
||||
|
||||
cmp x0, #HVC_GET_VECTORS
|
||||
b.ne 1f
|
||||
|
@ -113,24 +103,51 @@ alternative_endif
|
|||
|
||||
el1_trap:
|
||||
/*
|
||||
* x1: ESR
|
||||
* x2: ESR_EC
|
||||
* x0: ESR_EC
|
||||
*/
|
||||
|
||||
/* Guest accessed VFP/SIMD registers, save host, restore Guest */
|
||||
cmp x2, #ESR_ELx_EC_FP_ASIMD
|
||||
cmp x0, #ESR_ELx_EC_FP_ASIMD
|
||||
b.eq __fpsimd_guest_restore
|
||||
|
||||
mrs x0, tpidr_el2
|
||||
mov x1, #ARM_EXCEPTION_TRAP
|
||||
mrs x1, tpidr_el2
|
||||
mov x0, #ARM_EXCEPTION_TRAP
|
||||
b __guest_exit
|
||||
|
||||
el1_irq:
|
||||
save_x0_to_x3
|
||||
mrs x0, tpidr_el2
|
||||
mov x1, #ARM_EXCEPTION_IRQ
|
||||
stp x0, x1, [sp, #-16]!
|
||||
mrs x1, tpidr_el2
|
||||
mov x0, #ARM_EXCEPTION_IRQ
|
||||
b __guest_exit
|
||||
|
||||
el1_error:
|
||||
stp x0, x1, [sp, #-16]!
|
||||
mrs x1, tpidr_el2
|
||||
mov x0, #ARM_EXCEPTION_EL1_SERROR
|
||||
b __guest_exit
|
||||
|
||||
el2_error:
|
||||
/*
|
||||
* Only two possibilities:
|
||||
* 1) Either we come from the exit path, having just unmasked
|
||||
* PSTATE.A: change the return code to an EL2 fault, and
|
||||
* carry on, as we're already in a sane state to handle it.
|
||||
* 2) Or we come from anywhere else, and that's a bug: we panic.
|
||||
*
|
||||
* For (1), x0 contains the original return code and x1 doesn't
|
||||
* contain anything meaningful at that stage. We can reuse them
|
||||
* as temp registers.
|
||||
* For (2), who cares?
|
||||
*/
|
||||
mrs x0, elr_el2
|
||||
adr x1, abort_guest_exit_start
|
||||
cmp x0, x1
|
||||
adr x1, abort_guest_exit_end
|
||||
ccmp x0, x1, #4, ne
|
||||
b.ne __hyp_panic
|
||||
mov x0, #(1 << ARM_EXIT_WITH_SERROR_BIT)
|
||||
eret
|
||||
|
||||
ENTRY(__hyp_do_panic)
|
||||
mov lr, #(PSR_F_BIT | PSR_I_BIT | PSR_A_BIT | PSR_D_BIT |\
|
||||
PSR_MODE_EL1h)
|
||||
|
@ -155,11 +172,9 @@ ENDPROC(\label)
|
|||
invalid_vector el2h_sync_invalid
|
||||
invalid_vector el2h_irq_invalid
|
||||
invalid_vector el2h_fiq_invalid
|
||||
invalid_vector el2h_error_invalid
|
||||
invalid_vector el1_sync_invalid
|
||||
invalid_vector el1_irq_invalid
|
||||
invalid_vector el1_fiq_invalid
|
||||
invalid_vector el1_error_invalid
|
||||
|
||||
.ltorg
|
||||
|
||||
|
@ -174,15 +189,15 @@ ENTRY(__kvm_hyp_vector)
|
|||
ventry el2h_sync_invalid // Synchronous EL2h
|
||||
ventry el2h_irq_invalid // IRQ EL2h
|
||||
ventry el2h_fiq_invalid // FIQ EL2h
|
||||
ventry el2h_error_invalid // Error EL2h
|
||||
ventry el2_error // Error EL2h
|
||||
|
||||
ventry el1_sync // Synchronous 64-bit EL1
|
||||
ventry el1_irq // IRQ 64-bit EL1
|
||||
ventry el1_fiq_invalid // FIQ 64-bit EL1
|
||||
ventry el1_error_invalid // Error 64-bit EL1
|
||||
ventry el1_error // Error 64-bit EL1
|
||||
|
||||
ventry el1_sync // Synchronous 32-bit EL1
|
||||
ventry el1_irq // IRQ 32-bit EL1
|
||||
ventry el1_fiq_invalid // FIQ 32-bit EL1
|
||||
ventry el1_error_invalid // Error 32-bit EL1
|
||||
ventry el1_error // Error 32-bit EL1
|
||||
ENDPROC(__kvm_hyp_vector)
|
||||
|
|
|
@ -16,7 +16,10 @@
|
|||
*/
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/jump_label.h>
|
||||
|
||||
#include <asm/kvm_asm.h>
|
||||
#include <asm/kvm_emulate.h>
|
||||
#include <asm/kvm_hyp.h>
|
||||
|
||||
static bool __hyp_text __fpsimd_enabled_nvhe(void)
|
||||
|
@ -109,6 +112,15 @@ static hyp_alternate_select(__deactivate_traps_arch,
|
|||
|
||||
static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
/*
|
||||
* If we pended a virtual abort, preserve it until it gets
|
||||
* cleared. See D1.14.3 (Virtual Interrupts) for details, but
|
||||
* the crucial bit is "On taking a vSError interrupt,
|
||||
* HCR_EL2.VSE is cleared to 0."
|
||||
*/
|
||||
if (vcpu->arch.hcr_el2 & HCR_VSE)
|
||||
vcpu->arch.hcr_el2 = read_sysreg(hcr_el2);
|
||||
|
||||
__deactivate_traps_arch()();
|
||||
write_sysreg(0, hstr_el2);
|
||||
write_sysreg(read_sysreg(mdcr_el2) & MDCR_EL2_HPMN_MASK, mdcr_el2);
|
||||
|
@ -126,17 +138,13 @@ static void __hyp_text __deactivate_vm(struct kvm_vcpu *vcpu)
|
|||
write_sysreg(0, vttbr_el2);
|
||||
}
|
||||
|
||||
static hyp_alternate_select(__vgic_call_save_state,
|
||||
__vgic_v2_save_state, __vgic_v3_save_state,
|
||||
ARM64_HAS_SYSREG_GIC_CPUIF);
|
||||
|
||||
static hyp_alternate_select(__vgic_call_restore_state,
|
||||
__vgic_v2_restore_state, __vgic_v3_restore_state,
|
||||
ARM64_HAS_SYSREG_GIC_CPUIF);
|
||||
|
||||
static void __hyp_text __vgic_save_state(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
__vgic_call_save_state()(vcpu);
|
||||
if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
|
||||
__vgic_v3_save_state(vcpu);
|
||||
else
|
||||
__vgic_v2_save_state(vcpu);
|
||||
|
||||
write_sysreg(read_sysreg(hcr_el2) & ~HCR_INT_OVERRIDE, hcr_el2);
|
||||
}
|
||||
|
||||
|
@ -149,7 +157,10 @@ static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu)
|
|||
val |= vcpu->arch.irq_lines;
|
||||
write_sysreg(val, hcr_el2);
|
||||
|
||||
__vgic_call_restore_state()(vcpu);
|
||||
if (static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
|
||||
__vgic_v3_restore_state(vcpu);
|
||||
else
|
||||
__vgic_v2_restore_state(vcpu);
|
||||
}
|
||||
|
||||
static bool __hyp_text __true_value(void)
|
||||
|
@ -232,7 +243,22 @@ static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
|
|||
return true;
|
||||
}
|
||||
|
||||
static int __hyp_text __guest_run(struct kvm_vcpu *vcpu)
|
||||
static void __hyp_text __skip_instr(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
*vcpu_pc(vcpu) = read_sysreg_el2(elr);
|
||||
|
||||
if (vcpu_mode_is_32bit(vcpu)) {
|
||||
vcpu->arch.ctxt.gp_regs.regs.pstate = read_sysreg_el2(spsr);
|
||||
kvm_skip_instr32(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
|
||||
write_sysreg_el2(vcpu->arch.ctxt.gp_regs.regs.pstate, spsr);
|
||||
} else {
|
||||
*vcpu_pc(vcpu) += 4;
|
||||
}
|
||||
|
||||
write_sysreg_el2(*vcpu_pc(vcpu), elr);
|
||||
}
|
||||
|
||||
int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpu_context *host_ctxt;
|
||||
struct kvm_cpu_context *guest_ctxt;
|
||||
|
@ -267,9 +293,43 @@ again:
|
|||
exit_code = __guest_enter(vcpu, host_ctxt);
|
||||
/* And we're baaack! */
|
||||
|
||||
/*
|
||||
* We're using the raw exception code in order to only process
|
||||
* the trap if no SError is pending. We will come back to the
|
||||
* same PC once the SError has been injected, and replay the
|
||||
* trapping instruction.
|
||||
*/
|
||||
if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
|
||||
goto again;
|
||||
|
||||
if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
|
||||
exit_code == ARM_EXCEPTION_TRAP) {
|
||||
bool valid;
|
||||
|
||||
valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
|
||||
kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
|
||||
kvm_vcpu_dabt_isvalid(vcpu) &&
|
||||
!kvm_vcpu_dabt_isextabt(vcpu) &&
|
||||
!kvm_vcpu_dabt_iss1tw(vcpu);
|
||||
|
||||
if (valid) {
|
||||
int ret = __vgic_v2_perform_cpuif_access(vcpu);
|
||||
|
||||
if (ret == 1) {
|
||||
__skip_instr(vcpu);
|
||||
goto again;
|
||||
}
|
||||
|
||||
if (ret == -1) {
|
||||
/* Promote an illegal access to an SError */
|
||||
__skip_instr(vcpu);
|
||||
exit_code = ARM_EXCEPTION_EL1_SERROR;
|
||||
}
|
||||
|
||||
/* 0 falls through to be handler out of EL2 */
|
||||
}
|
||||
}
|
||||
|
||||
fp_enabled = __fpsimd_enabled();
|
||||
|
||||
__sysreg_save_guest_state(guest_ctxt);
|
||||
|
@ -293,8 +353,6 @@ again:
|
|||
return exit_code;
|
||||
}
|
||||
|
||||
__alias(__guest_run) int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
|
||||
|
||||
static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";
|
||||
|
||||
static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par)
|
||||
|
|
|
@ -17,7 +17,7 @@
|
|||
|
||||
#include <asm/kvm_hyp.h>
|
||||
|
||||
static void __hyp_text __tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
|
||||
void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
|
||||
{
|
||||
dsb(ishst);
|
||||
|
||||
|
@ -48,10 +48,7 @@ static void __hyp_text __tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
|
|||
write_sysreg(0, vttbr_el2);
|
||||
}
|
||||
|
||||
__alias(__tlb_flush_vmid_ipa) void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm,
|
||||
phys_addr_t ipa);
|
||||
|
||||
static void __hyp_text __tlb_flush_vmid(struct kvm *kvm)
|
||||
void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
|
||||
{
|
||||
dsb(ishst);
|
||||
|
||||
|
@ -67,14 +64,10 @@ static void __hyp_text __tlb_flush_vmid(struct kvm *kvm)
|
|||
write_sysreg(0, vttbr_el2);
|
||||
}
|
||||
|
||||
__alias(__tlb_flush_vmid) void __kvm_tlb_flush_vmid(struct kvm *kvm);
|
||||
|
||||
static void __hyp_text __tlb_flush_vm_context(void)
|
||||
void __hyp_text __kvm_flush_vm_context(void)
|
||||
{
|
||||
dsb(ishst);
|
||||
asm volatile("tlbi alle1is \n"
|
||||
"ic ialluis ": : );
|
||||
dsb(ish);
|
||||
}
|
||||
|
||||
__alias(__tlb_flush_vm_context) void __kvm_flush_vm_context(void);
|
||||
|
|
|
@ -231,3 +231,15 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
|
|||
else
|
||||
inject_undef64(vcpu);
|
||||
}
|
||||
|
||||
/**
|
||||
* kvm_inject_vabt - inject an async abort / SError into the guest
|
||||
* @vcpu: The VCPU to receive the exception
|
||||
*
|
||||
* It is assumed that this code is called from the VCPU thread and that the
|
||||
* VCPU therefore is not currently executing guest code.
|
||||
*/
|
||||
void kvm_inject_vabt(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
|
||||
}
|
||||
|
|
|
@ -107,35 +107,49 @@
|
|||
#define KVM_INVALID_INST 0xdeadbeef
|
||||
#define KVM_INVALID_ADDR 0xdeadbeef
|
||||
|
||||
/*
|
||||
* EVA has overlapping user & kernel address spaces, so user VAs may be >
|
||||
* PAGE_OFFSET. For this reason we can't use the default KVM_HVA_ERR_BAD of
|
||||
* PAGE_OFFSET.
|
||||
*/
|
||||
|
||||
#define KVM_HVA_ERR_BAD (-1UL)
|
||||
#define KVM_HVA_ERR_RO_BAD (-2UL)
|
||||
|
||||
static inline bool kvm_is_error_hva(unsigned long addr)
|
||||
{
|
||||
return IS_ERR_VALUE(addr);
|
||||
}
|
||||
|
||||
extern atomic_t kvm_mips_instance;
|
||||
|
||||
struct kvm_vm_stat {
|
||||
u32 remote_tlb_flush;
|
||||
ulong remote_tlb_flush;
|
||||
};
|
||||
|
||||
struct kvm_vcpu_stat {
|
||||
u32 wait_exits;
|
||||
u32 cache_exits;
|
||||
u32 signal_exits;
|
||||
u32 int_exits;
|
||||
u32 cop_unusable_exits;
|
||||
u32 tlbmod_exits;
|
||||
u32 tlbmiss_ld_exits;
|
||||
u32 tlbmiss_st_exits;
|
||||
u32 addrerr_st_exits;
|
||||
u32 addrerr_ld_exits;
|
||||
u32 syscall_exits;
|
||||
u32 resvd_inst_exits;
|
||||
u32 break_inst_exits;
|
||||
u32 trap_inst_exits;
|
||||
u32 msa_fpe_exits;
|
||||
u32 fpe_exits;
|
||||
u32 msa_disabled_exits;
|
||||
u32 flush_dcache_exits;
|
||||
u32 halt_successful_poll;
|
||||
u32 halt_attempted_poll;
|
||||
u32 halt_poll_invalid;
|
||||
u32 halt_wakeup;
|
||||
u64 wait_exits;
|
||||
u64 cache_exits;
|
||||
u64 signal_exits;
|
||||
u64 int_exits;
|
||||
u64 cop_unusable_exits;
|
||||
u64 tlbmod_exits;
|
||||
u64 tlbmiss_ld_exits;
|
||||
u64 tlbmiss_st_exits;
|
||||
u64 addrerr_st_exits;
|
||||
u64 addrerr_ld_exits;
|
||||
u64 syscall_exits;
|
||||
u64 resvd_inst_exits;
|
||||
u64 break_inst_exits;
|
||||
u64 trap_inst_exits;
|
||||
u64 msa_fpe_exits;
|
||||
u64 fpe_exits;
|
||||
u64 msa_disabled_exits;
|
||||
u64 flush_dcache_exits;
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_wakeup;
|
||||
};
|
||||
|
||||
struct kvm_arch_memory_slot {
|
||||
|
@ -314,6 +328,9 @@ struct kvm_vcpu_arch {
|
|||
u32 guest_kernel_asid[NR_CPUS];
|
||||
struct mm_struct guest_kernel_mm, guest_user_mm;
|
||||
|
||||
/* Guest ASID of last user mode execution */
|
||||
unsigned int last_user_gasid;
|
||||
|
||||
int last_sched_cpu;
|
||||
|
||||
/* WAIT executed */
|
||||
|
|
|
@ -846,6 +846,47 @@ enum emulation_result kvm_mips_emul_tlbr(struct kvm_vcpu *vcpu)
|
|||
return EMULATE_FAIL;
|
||||
}
|
||||
|
||||
/**
|
||||
* kvm_mips_invalidate_guest_tlb() - Indicates a change in guest MMU map.
|
||||
* @vcpu: VCPU with changed mappings.
|
||||
* @tlb: TLB entry being removed.
|
||||
*
|
||||
* This is called to indicate a single change in guest MMU mappings, so that we
|
||||
* can arrange TLB flushes on this and other CPUs.
|
||||
*/
|
||||
static void kvm_mips_invalidate_guest_tlb(struct kvm_vcpu *vcpu,
|
||||
struct kvm_mips_tlb *tlb)
|
||||
{
|
||||
int cpu, i;
|
||||
bool user;
|
||||
|
||||
/* No need to flush for entries which are already invalid */
|
||||
if (!((tlb->tlb_lo[0] | tlb->tlb_lo[1]) & ENTRYLO_V))
|
||||
return;
|
||||
/* User address space doesn't need flushing for KSeg2/3 changes */
|
||||
user = tlb->tlb_hi < KVM_GUEST_KSEG0;
|
||||
|
||||
preempt_disable();
|
||||
|
||||
/*
|
||||
* Probe the shadow host TLB for the entry being overwritten, if one
|
||||
* matches, invalidate it
|
||||
*/
|
||||
kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi);
|
||||
|
||||
/* Invalidate the whole ASID on other CPUs */
|
||||
cpu = smp_processor_id();
|
||||
for_each_possible_cpu(i) {
|
||||
if (i == cpu)
|
||||
continue;
|
||||
if (user)
|
||||
vcpu->arch.guest_user_asid[i] = 0;
|
||||
vcpu->arch.guest_kernel_asid[i] = 0;
|
||||
}
|
||||
|
||||
preempt_enable();
|
||||
}
|
||||
|
||||
/* Write Guest TLB Entry @ Index */
|
||||
enum emulation_result kvm_mips_emul_tlbwi(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
|
@ -865,11 +906,8 @@ enum emulation_result kvm_mips_emul_tlbwi(struct kvm_vcpu *vcpu)
|
|||
}
|
||||
|
||||
tlb = &vcpu->arch.guest_tlb[index];
|
||||
/*
|
||||
* Probe the shadow host TLB for the entry being overwritten, if one
|
||||
* matches, invalidate it
|
||||
*/
|
||||
kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi);
|
||||
|
||||
kvm_mips_invalidate_guest_tlb(vcpu, tlb);
|
||||
|
||||
tlb->tlb_mask = kvm_read_c0_guest_pagemask(cop0);
|
||||
tlb->tlb_hi = kvm_read_c0_guest_entryhi(cop0);
|
||||
|
@ -898,11 +936,7 @@ enum emulation_result kvm_mips_emul_tlbwr(struct kvm_vcpu *vcpu)
|
|||
|
||||
tlb = &vcpu->arch.guest_tlb[index];
|
||||
|
||||
/*
|
||||
* Probe the shadow host TLB for the entry being overwritten, if one
|
||||
* matches, invalidate it
|
||||
*/
|
||||
kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi);
|
||||
kvm_mips_invalidate_guest_tlb(vcpu, tlb);
|
||||
|
||||
tlb->tlb_mask = kvm_read_c0_guest_pagemask(cop0);
|
||||
tlb->tlb_hi = kvm_read_c0_guest_entryhi(cop0);
|
||||
|
@ -1026,6 +1060,7 @@ enum emulation_result kvm_mips_emulate_CP0(union mips_instruction inst,
|
|||
enum emulation_result er = EMULATE_DONE;
|
||||
u32 rt, rd, sel;
|
||||
unsigned long curr_pc;
|
||||
int cpu, i;
|
||||
|
||||
/*
|
||||
* Update PC and hold onto current PC in case there is
|
||||
|
@ -1127,16 +1162,31 @@ enum emulation_result kvm_mips_emulate_CP0(union mips_instruction inst,
|
|||
} else if (rd == MIPS_CP0_TLB_HI && sel == 0) {
|
||||
u32 nasid =
|
||||
vcpu->arch.gprs[rt] & KVM_ENTRYHI_ASID;
|
||||
if ((KSEGX(vcpu->arch.gprs[rt]) != CKSEG0) &&
|
||||
((kvm_read_c0_guest_entryhi(cop0) &
|
||||
if (((kvm_read_c0_guest_entryhi(cop0) &
|
||||
KVM_ENTRYHI_ASID) != nasid)) {
|
||||
trace_kvm_asid_change(vcpu,
|
||||
kvm_read_c0_guest_entryhi(cop0)
|
||||
& KVM_ENTRYHI_ASID,
|
||||
nasid);
|
||||
|
||||
/* Blow away the shadow host TLBs */
|
||||
kvm_mips_flush_host_tlb(1);
|
||||
/*
|
||||
* Regenerate/invalidate kernel MMU
|
||||
* context.
|
||||
* The user MMU context will be
|
||||
* regenerated lazily on re-entry to
|
||||
* guest user if the guest ASID actually
|
||||
* changes.
|
||||
*/
|
||||
preempt_disable();
|
||||
cpu = smp_processor_id();
|
||||
kvm_get_new_mmu_context(&vcpu->arch.guest_kernel_mm,
|
||||
cpu, vcpu);
|
||||
vcpu->arch.guest_kernel_asid[cpu] =
|
||||
vcpu->arch.guest_kernel_mm.context.asid[cpu];
|
||||
for_each_possible_cpu(i)
|
||||
if (i != cpu)
|
||||
vcpu->arch.guest_kernel_asid[i] = 0;
|
||||
preempt_enable();
|
||||
}
|
||||
kvm_write_c0_guest_entryhi(cop0,
|
||||
vcpu->arch.gprs[rt]);
|
||||
|
|
|
@ -140,6 +140,16 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
|
|||
return 0;
|
||||
}
|
||||
|
||||
bool kvm_arch_has_vcpu_debugfs(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
void kvm_mips_free_vcpus(struct kvm *kvm)
|
||||
{
|
||||
unsigned int i;
|
||||
|
@ -411,6 +421,31 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
|
|||
return -ENOIOCTLCMD;
|
||||
}
|
||||
|
||||
/* Must be called with preemption disabled, just before entering guest */
|
||||
static void kvm_mips_check_asids(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct mips_coproc *cop0 = vcpu->arch.cop0;
|
||||
int cpu = smp_processor_id();
|
||||
unsigned int gasid;
|
||||
|
||||
/*
|
||||
* Lazy host ASID regeneration for guest user mode.
|
||||
* If the guest ASID has changed since the last guest usermode
|
||||
* execution, regenerate the host ASID so as to invalidate stale TLB
|
||||
* entries.
|
||||
*/
|
||||
if (!KVM_GUEST_KERNEL_MODE(vcpu)) {
|
||||
gasid = kvm_read_c0_guest_entryhi(cop0) & KVM_ENTRYHI_ASID;
|
||||
if (gasid != vcpu->arch.last_user_gasid) {
|
||||
kvm_get_new_mmu_context(&vcpu->arch.guest_user_mm, cpu,
|
||||
vcpu);
|
||||
vcpu->arch.guest_user_asid[cpu] =
|
||||
vcpu->arch.guest_user_mm.context.asid[cpu];
|
||||
vcpu->arch.last_user_gasid = gasid;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
{
|
||||
int r = 0;
|
||||
|
@ -438,6 +473,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
|||
htw_stop();
|
||||
|
||||
trace_kvm_enter(vcpu);
|
||||
|
||||
kvm_mips_check_asids(vcpu);
|
||||
|
||||
r = vcpu->arch.vcpu_run(run, vcpu);
|
||||
trace_kvm_out(vcpu);
|
||||
|
||||
|
@ -1551,6 +1589,8 @@ skip_emul:
|
|||
if (ret == RESUME_GUEST) {
|
||||
trace_kvm_reenter(vcpu);
|
||||
|
||||
kvm_mips_check_asids(vcpu);
|
||||
|
||||
/*
|
||||
* If FPU / MSA are enabled (i.e. the guest's FPU / MSA context
|
||||
* is live), restore FCR31 / MSACSR.
|
||||
|
|
|
@ -250,15 +250,27 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
|||
kvm_get_new_mmu_context(&vcpu->arch.guest_kernel_mm, cpu, vcpu);
|
||||
vcpu->arch.guest_kernel_asid[cpu] =
|
||||
vcpu->arch.guest_kernel_mm.context.asid[cpu];
|
||||
kvm_get_new_mmu_context(&vcpu->arch.guest_user_mm, cpu, vcpu);
|
||||
vcpu->arch.guest_user_asid[cpu] =
|
||||
vcpu->arch.guest_user_mm.context.asid[cpu];
|
||||
newasid++;
|
||||
|
||||
kvm_debug("[%d]: cpu_context: %#lx\n", cpu,
|
||||
cpu_context(cpu, current->mm));
|
||||
kvm_debug("[%d]: Allocated new ASID for Guest Kernel: %#x\n",
|
||||
cpu, vcpu->arch.guest_kernel_asid[cpu]);
|
||||
}
|
||||
|
||||
if ((vcpu->arch.guest_user_asid[cpu] ^ asid_cache(cpu)) &
|
||||
asid_version_mask(cpu)) {
|
||||
u32 gasid = kvm_read_c0_guest_entryhi(vcpu->arch.cop0) &
|
||||
KVM_ENTRYHI_ASID;
|
||||
|
||||
kvm_get_new_mmu_context(&vcpu->arch.guest_user_mm, cpu, vcpu);
|
||||
vcpu->arch.guest_user_asid[cpu] =
|
||||
vcpu->arch.guest_user_mm.context.asid[cpu];
|
||||
vcpu->arch.last_user_gasid = gasid;
|
||||
newasid++;
|
||||
|
||||
kvm_debug("[%d]: cpu_context: %#lx\n", cpu,
|
||||
cpu_context(cpu, current->mm));
|
||||
kvm_debug("[%d]: Allocated new ASID for Guest User: %#x\n", cpu,
|
||||
vcpu->arch.guest_user_asid[cpu]);
|
||||
}
|
||||
|
|
|
@ -175,6 +175,24 @@ static int kvm_trap_emul_handle_tlb_miss(struct kvm_vcpu *vcpu, bool store)
|
|||
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
|
||||
ret = RESUME_HOST;
|
||||
}
|
||||
} else if (KVM_GUEST_KERNEL_MODE(vcpu)
|
||||
&& (KSEGX(badvaddr) == CKSEG0 || KSEGX(badvaddr) == CKSEG1)) {
|
||||
/*
|
||||
* With EVA we may get a TLB exception instead of an address
|
||||
* error when the guest performs MMIO to KSeg1 addresses.
|
||||
*/
|
||||
kvm_debug("Emulate %s MMIO space\n",
|
||||
store ? "Store to" : "Load from");
|
||||
er = kvm_mips_emulate_inst(cause, opc, run, vcpu);
|
||||
if (er == EMULATE_FAIL) {
|
||||
kvm_err("Emulate %s MMIO space failed\n",
|
||||
store ? "Store to" : "Load from");
|
||||
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
|
||||
ret = RESUME_HOST;
|
||||
} else {
|
||||
run->exit_reason = KVM_EXIT_MMIO;
|
||||
ret = RESUME_HOST;
|
||||
}
|
||||
} else {
|
||||
kvm_err("Illegal TLB %s fault address , cause %#x, PC: %p, BadVaddr: %#lx\n",
|
||||
store ? "ST" : "LD", cause, opc, badvaddr);
|
||||
|
|
|
@ -244,6 +244,43 @@ static inline int segment_shift(int ssize)
|
|||
return SID_SHIFT_1T;
|
||||
}
|
||||
|
||||
/*
|
||||
* This array is indexed by the LP field of the HPTE second dword.
|
||||
* Since this field may contain some RPN bits, some entries are
|
||||
* replicated so that we get the same value irrespective of RPN.
|
||||
* The top 4 bits are the page size index (MMU_PAGE_*) for the
|
||||
* actual page size, the bottom 4 bits are the base page size.
|
||||
*/
|
||||
extern u8 hpte_page_sizes[1 << LP_BITS];
|
||||
|
||||
static inline unsigned long __hpte_page_size(unsigned long h, unsigned long l,
|
||||
bool is_base_size)
|
||||
{
|
||||
unsigned int i, lp;
|
||||
|
||||
if (!(h & HPTE_V_LARGE))
|
||||
return 1ul << 12;
|
||||
|
||||
/* Look at the 8 bit LP value */
|
||||
lp = (l >> LP_SHIFT) & ((1 << LP_BITS) - 1);
|
||||
i = hpte_page_sizes[lp];
|
||||
if (!i)
|
||||
return 0;
|
||||
if (!is_base_size)
|
||||
i >>= 4;
|
||||
return 1ul << mmu_psize_defs[i & 0xf].shift;
|
||||
}
|
||||
|
||||
static inline unsigned long hpte_page_size(unsigned long h, unsigned long l)
|
||||
{
|
||||
return __hpte_page_size(h, l, 0);
|
||||
}
|
||||
|
||||
static inline unsigned long hpte_base_page_size(unsigned long h, unsigned long l)
|
||||
{
|
||||
return __hpte_page_size(h, l, 1);
|
||||
}
|
||||
|
||||
/*
|
||||
* The current system page and segment sizes
|
||||
*/
|
||||
|
|
|
@ -241,6 +241,35 @@ static inline void out_be64(volatile u64 __iomem *addr, u64 val)
|
|||
#endif
|
||||
#endif /* __powerpc64__ */
|
||||
|
||||
|
||||
/*
|
||||
* Simple Cache inhibited accessors
|
||||
* Unlike the DEF_MMIO_* macros, these don't include any h/w memory
|
||||
* barriers, callers need to manage memory barriers on their own.
|
||||
* These can only be used in hypervisor real mode.
|
||||
*/
|
||||
|
||||
static inline u32 _lwzcix(unsigned long addr)
|
||||
{
|
||||
u32 ret;
|
||||
|
||||
__asm__ __volatile__("lwzcix %0,0, %1"
|
||||
: "=r" (ret) : "r" (addr) : "memory");
|
||||
return ret;
|
||||
}
|
||||
|
||||
static inline void _stbcix(u64 addr, u8 val)
|
||||
{
|
||||
__asm__ __volatile__("stbcix %0,0,%1"
|
||||
: : "r" (val), "r" (addr) : "memory");
|
||||
}
|
||||
|
||||
static inline void _stwcix(u64 addr, u32 val)
|
||||
{
|
||||
__asm__ __volatile__("stwcix %0,0,%1"
|
||||
: : "r" (val), "r" (addr) : "memory");
|
||||
}
|
||||
|
||||
/*
|
||||
* Low level IO stream instructions are defined out of line for now
|
||||
*/
|
||||
|
|
|
@ -105,6 +105,15 @@
|
|||
#define BOOK3S_INTERRUPT_FAC_UNAVAIL 0xf60
|
||||
#define BOOK3S_INTERRUPT_H_FAC_UNAVAIL 0xf80
|
||||
|
||||
/* book3s_hv */
|
||||
|
||||
/*
|
||||
* Special trap used to indicate to host that this is a
|
||||
* passthrough interrupt that could not be handled
|
||||
* completely in the guest.
|
||||
*/
|
||||
#define BOOK3S_INTERRUPT_HV_RM_HARD 0x5555
|
||||
|
||||
#define BOOK3S_IRQPRIO_SYSTEM_RESET 0
|
||||
#define BOOK3S_IRQPRIO_DATA_SEGMENT 1
|
||||
#define BOOK3S_IRQPRIO_INST_SEGMENT 2
|
||||
|
@ -136,6 +145,7 @@
|
|||
#define RESUME_FLAG_NV (1<<0) /* Reload guest nonvolatile state? */
|
||||
#define RESUME_FLAG_HOST (1<<1) /* Resume host? */
|
||||
#define RESUME_FLAG_ARCH1 (1<<2)
|
||||
#define RESUME_FLAG_ARCH2 (1<<3)
|
||||
|
||||
#define RESUME_GUEST 0
|
||||
#define RESUME_GUEST_NV RESUME_FLAG_NV
|
||||
|
|
|
@ -69,6 +69,43 @@ struct hpte_cache {
|
|||
int pagesize;
|
||||
};
|
||||
|
||||
/*
|
||||
* Struct for a virtual core.
|
||||
* Note: entry_exit_map combines a bitmap of threads that have entered
|
||||
* in the bottom 8 bits and a bitmap of threads that have exited in the
|
||||
* next 8 bits. This is so that we can atomically set the entry bit
|
||||
* iff the exit map is 0 without taking a lock.
|
||||
*/
|
||||
struct kvmppc_vcore {
|
||||
int n_runnable;
|
||||
int num_threads;
|
||||
int entry_exit_map;
|
||||
int napping_threads;
|
||||
int first_vcpuid;
|
||||
u16 pcpu;
|
||||
u16 last_cpu;
|
||||
u8 vcore_state;
|
||||
u8 in_guest;
|
||||
struct kvmppc_vcore *master_vcore;
|
||||
struct kvm_vcpu *runnable_threads[MAX_SMT_THREADS];
|
||||
struct list_head preempt_list;
|
||||
spinlock_t lock;
|
||||
struct swait_queue_head wq;
|
||||
spinlock_t stoltb_lock; /* protects stolen_tb and preempt_tb */
|
||||
u64 stolen_tb;
|
||||
u64 preempt_tb;
|
||||
struct kvm_vcpu *runner;
|
||||
struct kvm *kvm;
|
||||
u64 tb_offset; /* guest timebase - host timebase */
|
||||
ulong lpcr;
|
||||
u32 arch_compat;
|
||||
ulong pcr;
|
||||
ulong dpdes; /* doorbell state (POWER8) */
|
||||
ulong vtb; /* virtual timebase */
|
||||
ulong conferring_threads;
|
||||
unsigned int halt_poll_ns;
|
||||
};
|
||||
|
||||
struct kvmppc_vcpu_book3s {
|
||||
struct kvmppc_sid_map sid_map[SID_MAP_NUM];
|
||||
struct {
|
||||
|
@ -83,6 +120,7 @@ struct kvmppc_vcpu_book3s {
|
|||
u64 sdr1;
|
||||
u64 hior;
|
||||
u64 msr_mask;
|
||||
u64 vtb;
|
||||
#ifdef CONFIG_PPC_BOOK3S_32
|
||||
u32 vsid_pool[VSID_POOL_SIZE];
|
||||
u32 vsid_next;
|
||||
|
@ -191,6 +229,7 @@ extern void kvmppc_copy_to_svcpu(struct kvmppc_book3s_shadow_vcpu *svcpu,
|
|||
struct kvm_vcpu *vcpu);
|
||||
extern void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
|
||||
struct kvmppc_book3s_shadow_vcpu *svcpu);
|
||||
extern int kvm_irq_bypass;
|
||||
|
||||
static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
|
|
|
@ -20,6 +20,8 @@
|
|||
#ifndef __ASM_KVM_BOOK3S_64_H__
|
||||
#define __ASM_KVM_BOOK3S_64_H__
|
||||
|
||||
#include <asm/book3s/64/mmu-hash.h>
|
||||
|
||||
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
|
||||
static inline struct kvmppc_book3s_shadow_vcpu *svcpu_get(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
|
@ -97,56 +99,20 @@ static inline void __unlock_hpte(__be64 *hpte, unsigned long hpte_v)
|
|||
hpte[0] = cpu_to_be64(hpte_v);
|
||||
}
|
||||
|
||||
static inline int __hpte_actual_psize(unsigned int lp, int psize)
|
||||
{
|
||||
int i, shift;
|
||||
unsigned int mask;
|
||||
|
||||
/* start from 1 ignoring MMU_PAGE_4K */
|
||||
for (i = 1; i < MMU_PAGE_COUNT; i++) {
|
||||
|
||||
/* invalid penc */
|
||||
if (mmu_psize_defs[psize].penc[i] == -1)
|
||||
continue;
|
||||
/*
|
||||
* encoding bits per actual page size
|
||||
* PTE LP actual page size
|
||||
* rrrr rrrz >=8KB
|
||||
* rrrr rrzz >=16KB
|
||||
* rrrr rzzz >=32KB
|
||||
* rrrr zzzz >=64KB
|
||||
* .......
|
||||
*/
|
||||
shift = mmu_psize_defs[i].shift - LP_SHIFT;
|
||||
if (shift > LP_BITS)
|
||||
shift = LP_BITS;
|
||||
mask = (1 << shift) - 1;
|
||||
if ((lp & mask) == mmu_psize_defs[psize].penc[i])
|
||||
return i;
|
||||
}
|
||||
return -1;
|
||||
}
|
||||
|
||||
static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r,
|
||||
unsigned long pte_index)
|
||||
{
|
||||
int b_psize = MMU_PAGE_4K, a_psize = MMU_PAGE_4K;
|
||||
int i, b_psize = MMU_PAGE_4K, a_psize = MMU_PAGE_4K;
|
||||
unsigned int penc;
|
||||
unsigned long rb = 0, va_low, sllp;
|
||||
unsigned int lp = (r >> LP_SHIFT) & ((1 << LP_BITS) - 1);
|
||||
|
||||
if (v & HPTE_V_LARGE) {
|
||||
for (b_psize = 0; b_psize < MMU_PAGE_COUNT; b_psize++) {
|
||||
|
||||
/* valid entries have a shift value */
|
||||
if (!mmu_psize_defs[b_psize].shift)
|
||||
continue;
|
||||
|
||||
a_psize = __hpte_actual_psize(lp, b_psize);
|
||||
if (a_psize != -1)
|
||||
break;
|
||||
}
|
||||
i = hpte_page_sizes[lp];
|
||||
b_psize = i & 0xf;
|
||||
a_psize = i >> 4;
|
||||
}
|
||||
|
||||
/*
|
||||
* Ignore the top 14 bits of va
|
||||
* v have top two bits covering segment size, hence move
|
||||
|
@ -159,7 +125,6 @@ static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r,
|
|||
/* This covers 14..54 bits of va*/
|
||||
rb = (v & ~0x7fUL) << 16; /* AVA field */
|
||||
|
||||
rb |= (v >> HPTE_V_SSIZE_SHIFT) << 8; /* B field */
|
||||
/*
|
||||
* AVA in v had cleared lower 23 bits. We need to derive
|
||||
* that from pteg index
|
||||
|
@ -211,49 +176,10 @@ static inline unsigned long compute_tlbie_rb(unsigned long v, unsigned long r,
|
|||
break;
|
||||
}
|
||||
}
|
||||
rb |= (v >> 54) & 0x300; /* B field */
|
||||
rb |= (v >> HPTE_V_SSIZE_SHIFT) << 8; /* B field */
|
||||
return rb;
|
||||
}
|
||||
|
||||
static inline unsigned long __hpte_page_size(unsigned long h, unsigned long l,
|
||||
bool is_base_size)
|
||||
{
|
||||
|
||||
int size, a_psize;
|
||||
/* Look at the 8 bit LP value */
|
||||
unsigned int lp = (l >> LP_SHIFT) & ((1 << LP_BITS) - 1);
|
||||
|
||||
/* only handle 4k, 64k and 16M pages for now */
|
||||
if (!(h & HPTE_V_LARGE))
|
||||
return 1ul << 12;
|
||||
else {
|
||||
for (size = 0; size < MMU_PAGE_COUNT; size++) {
|
||||
/* valid entries have a shift value */
|
||||
if (!mmu_psize_defs[size].shift)
|
||||
continue;
|
||||
|
||||
a_psize = __hpte_actual_psize(lp, size);
|
||||
if (a_psize != -1) {
|
||||
if (is_base_size)
|
||||
return 1ul << mmu_psize_defs[size].shift;
|
||||
return 1ul << mmu_psize_defs[a_psize].shift;
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline unsigned long hpte_page_size(unsigned long h, unsigned long l)
|
||||
{
|
||||
return __hpte_page_size(h, l, 0);
|
||||
}
|
||||
|
||||
static inline unsigned long hpte_base_page_size(unsigned long h, unsigned long l)
|
||||
{
|
||||
return __hpte_page_size(h, l, 1);
|
||||
}
|
||||
|
||||
static inline unsigned long hpte_rpn(unsigned long ptel, unsigned long psize)
|
||||
{
|
||||
return ((ptel & HPTE_R_RPN) & ~(psize - 1)) >> PAGE_SHIFT;
|
||||
|
|
|
@ -43,6 +43,8 @@
|
|||
#include <asm/cputhreads.h>
|
||||
#define KVM_MAX_VCPU_ID (threads_per_subcore * KVM_MAX_VCORES)
|
||||
|
||||
#define __KVM_HAVE_ARCH_INTC_INITIALIZED
|
||||
|
||||
#ifdef CONFIG_KVM_MMIO
|
||||
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
|
||||
#endif
|
||||
|
@ -95,42 +97,49 @@ struct kvmppc_vcpu_book3s;
|
|||
struct kvmppc_book3s_shadow_vcpu;
|
||||
|
||||
struct kvm_vm_stat {
|
||||
u32 remote_tlb_flush;
|
||||
ulong remote_tlb_flush;
|
||||
};
|
||||
|
||||
struct kvm_vcpu_stat {
|
||||
u32 sum_exits;
|
||||
u32 mmio_exits;
|
||||
u32 signal_exits;
|
||||
u32 light_exits;
|
||||
u64 sum_exits;
|
||||
u64 mmio_exits;
|
||||
u64 signal_exits;
|
||||
u64 light_exits;
|
||||
/* Account for special types of light exits: */
|
||||
u32 itlb_real_miss_exits;
|
||||
u32 itlb_virt_miss_exits;
|
||||
u32 dtlb_real_miss_exits;
|
||||
u32 dtlb_virt_miss_exits;
|
||||
u32 syscall_exits;
|
||||
u32 isi_exits;
|
||||
u32 dsi_exits;
|
||||
u32 emulated_inst_exits;
|
||||
u32 dec_exits;
|
||||
u32 ext_intr_exits;
|
||||
u32 halt_successful_poll;
|
||||
u32 halt_attempted_poll;
|
||||
u32 halt_poll_invalid;
|
||||
u32 halt_wakeup;
|
||||
u32 dbell_exits;
|
||||
u32 gdbell_exits;
|
||||
u32 ld;
|
||||
u32 st;
|
||||
u64 itlb_real_miss_exits;
|
||||
u64 itlb_virt_miss_exits;
|
||||
u64 dtlb_real_miss_exits;
|
||||
u64 dtlb_virt_miss_exits;
|
||||
u64 syscall_exits;
|
||||
u64 isi_exits;
|
||||
u64 dsi_exits;
|
||||
u64 emulated_inst_exits;
|
||||
u64 dec_exits;
|
||||
u64 ext_intr_exits;
|
||||
u64 halt_poll_success_ns;
|
||||
u64 halt_poll_fail_ns;
|
||||
u64 halt_wait_ns;
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_successful_wait;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_wakeup;
|
||||
u64 dbell_exits;
|
||||
u64 gdbell_exits;
|
||||
u64 ld;
|
||||
u64 st;
|
||||
#ifdef CONFIG_PPC_BOOK3S
|
||||
u32 pf_storage;
|
||||
u32 pf_instruc;
|
||||
u32 sp_storage;
|
||||
u32 sp_instruc;
|
||||
u32 queue_intr;
|
||||
u32 ld_slow;
|
||||
u32 st_slow;
|
||||
u64 pf_storage;
|
||||
u64 pf_instruc;
|
||||
u64 sp_storage;
|
||||
u64 sp_instruc;
|
||||
u64 queue_intr;
|
||||
u64 ld_slow;
|
||||
u64 st_slow;
|
||||
#endif
|
||||
u64 pthru_all;
|
||||
u64 pthru_host;
|
||||
u64 pthru_bad_aff;
|
||||
};
|
||||
|
||||
enum kvm_exit_types {
|
||||
|
@ -197,6 +206,8 @@ struct kvmppc_spapr_tce_table {
|
|||
struct kvmppc_xics;
|
||||
struct kvmppc_icp;
|
||||
|
||||
struct kvmppc_passthru_irqmap;
|
||||
|
||||
/*
|
||||
* The reverse mapping array has one entry for each HPTE,
|
||||
* which stores the guest's view of the second word of the HPTE
|
||||
|
@ -267,6 +278,7 @@ struct kvm_arch {
|
|||
#endif
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
struct kvmppc_xics *xics;
|
||||
struct kvmppc_passthru_irqmap *pimap;
|
||||
#endif
|
||||
struct kvmppc_ops *kvm_ops;
|
||||
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
|
||||
|
@ -275,41 +287,6 @@ struct kvm_arch {
|
|||
#endif
|
||||
};
|
||||
|
||||
/*
|
||||
* Struct for a virtual core.
|
||||
* Note: entry_exit_map combines a bitmap of threads that have entered
|
||||
* in the bottom 8 bits and a bitmap of threads that have exited in the
|
||||
* next 8 bits. This is so that we can atomically set the entry bit
|
||||
* iff the exit map is 0 without taking a lock.
|
||||
*/
|
||||
struct kvmppc_vcore {
|
||||
int n_runnable;
|
||||
int num_threads;
|
||||
int entry_exit_map;
|
||||
int napping_threads;
|
||||
int first_vcpuid;
|
||||
u16 pcpu;
|
||||
u16 last_cpu;
|
||||
u8 vcore_state;
|
||||
u8 in_guest;
|
||||
struct kvmppc_vcore *master_vcore;
|
||||
struct list_head runnable_threads;
|
||||
struct list_head preempt_list;
|
||||
spinlock_t lock;
|
||||
struct swait_queue_head wq;
|
||||
spinlock_t stoltb_lock; /* protects stolen_tb and preempt_tb */
|
||||
u64 stolen_tb;
|
||||
u64 preempt_tb;
|
||||
struct kvm_vcpu *runner;
|
||||
struct kvm *kvm;
|
||||
u64 tb_offset; /* guest timebase - host timebase */
|
||||
ulong lpcr;
|
||||
u32 arch_compat;
|
||||
ulong pcr;
|
||||
ulong dpdes; /* doorbell state (POWER8) */
|
||||
ulong conferring_threads;
|
||||
};
|
||||
|
||||
#define VCORE_ENTRY_MAP(vc) ((vc)->entry_exit_map & 0xff)
|
||||
#define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
|
||||
#define VCORE_IS_EXITING(vc) (VCORE_EXIT_MAP(vc) != 0)
|
||||
|
@ -329,6 +306,7 @@ struct kvmppc_vcore {
|
|||
#define VCORE_SLEEPING 3
|
||||
#define VCORE_RUNNING 4
|
||||
#define VCORE_EXITING 5
|
||||
#define VCORE_POLLING 6
|
||||
|
||||
/*
|
||||
* Struct used to manage memory for a virtual processor area
|
||||
|
@ -397,6 +375,20 @@ struct kvmhv_tb_accumulator {
|
|||
u64 tb_max; /* max time */
|
||||
};
|
||||
|
||||
#ifdef CONFIG_PPC_BOOK3S_64
|
||||
struct kvmppc_irq_map {
|
||||
u32 r_hwirq;
|
||||
u32 v_hwirq;
|
||||
struct irq_desc *desc;
|
||||
};
|
||||
|
||||
#define KVMPPC_PIRQ_MAPPED 1024
|
||||
struct kvmppc_passthru_irqmap {
|
||||
int n_mapped;
|
||||
struct kvmppc_irq_map mapped[KVMPPC_PIRQ_MAPPED];
|
||||
};
|
||||
#endif
|
||||
|
||||
# ifdef CONFIG_PPC_FSL_BOOK3E
|
||||
#define KVMPPC_BOOKE_IAC_NUM 2
|
||||
#define KVMPPC_BOOKE_DAC_NUM 2
|
||||
|
@ -483,7 +475,6 @@ struct kvm_vcpu_arch {
|
|||
ulong purr;
|
||||
ulong spurr;
|
||||
ulong ic;
|
||||
ulong vtb;
|
||||
ulong dscr;
|
||||
ulong amr;
|
||||
ulong uamor;
|
||||
|
@ -668,7 +659,6 @@ struct kvm_vcpu_arch {
|
|||
long pgfault_index;
|
||||
unsigned long pgfault_hpte[2];
|
||||
|
||||
struct list_head run_list;
|
||||
struct task_struct *run_task;
|
||||
struct kvm_run *kvm_run;
|
||||
|
||||
|
|
|
@ -287,6 +287,10 @@ struct kvmppc_ops {
|
|||
long (*arch_vm_ioctl)(struct file *filp, unsigned int ioctl,
|
||||
unsigned long arg);
|
||||
int (*hcall_implemented)(unsigned long hcall);
|
||||
int (*irq_bypass_add_producer)(struct irq_bypass_consumer *,
|
||||
struct irq_bypass_producer *);
|
||||
void (*irq_bypass_del_producer)(struct irq_bypass_consumer *,
|
||||
struct irq_bypass_producer *);
|
||||
};
|
||||
|
||||
extern struct kvmppc_ops *kvmppc_hv_ops;
|
||||
|
@ -453,8 +457,19 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
|
|||
{
|
||||
return vcpu->arch.irq_type == KVMPPC_IRQ_XICS;
|
||||
}
|
||||
|
||||
static inline struct kvmppc_passthru_irqmap *kvmppc_get_passthru_irqmap(
|
||||
struct kvm *kvm)
|
||||
{
|
||||
if (kvm && kvm_irq_bypass)
|
||||
return kvm->arch.pimap;
|
||||
return NULL;
|
||||
}
|
||||
|
||||
extern void kvmppc_alloc_host_rm_ops(void);
|
||||
extern void kvmppc_free_host_rm_ops(void);
|
||||
extern void kvmppc_free_pimap(struct kvm *kvm);
|
||||
extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall);
|
||||
extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
|
||||
extern int kvmppc_xics_create_icp(struct kvm_vcpu *vcpu, unsigned long server);
|
||||
extern int kvm_vm_ioctl_xics_irq(struct kvm *kvm, struct kvm_irq_level *args);
|
||||
|
@ -464,10 +479,23 @@ extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
|
|||
extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
|
||||
struct kvm_vcpu *vcpu, u32 cpu);
|
||||
extern void kvmppc_xics_ipi_action(void);
|
||||
extern void kvmppc_xics_set_mapped(struct kvm *kvm, unsigned long guest_irq,
|
||||
unsigned long host_irq);
|
||||
extern void kvmppc_xics_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
|
||||
unsigned long host_irq);
|
||||
extern long kvmppc_deliver_irq_passthru(struct kvm_vcpu *vcpu, u32 xirr,
|
||||
struct kvmppc_irq_map *irq_map,
|
||||
struct kvmppc_passthru_irqmap *pimap);
|
||||
extern int h_ipi_redirect;
|
||||
#else
|
||||
static inline struct kvmppc_passthru_irqmap *kvmppc_get_passthru_irqmap(
|
||||
struct kvm *kvm)
|
||||
{ return NULL; }
|
||||
static inline void kvmppc_alloc_host_rm_ops(void) {};
|
||||
static inline void kvmppc_free_host_rm_ops(void) {};
|
||||
static inline void kvmppc_free_pimap(struct kvm *kvm) {};
|
||||
static inline int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall)
|
||||
{ return 0; }
|
||||
static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
|
||||
{ return 0; }
|
||||
static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
|
||||
|
|
|
@ -271,6 +271,7 @@ static inline bool early_radix_enabled(void)
|
|||
#define MMU_PAGE_16G 13
|
||||
#define MMU_PAGE_64G 14
|
||||
|
||||
/* N.B. we need to change the type of hpte_page_sizes if this gets to be > 16 */
|
||||
#define MMU_PAGE_COUNT 15
|
||||
|
||||
#ifdef CONFIG_PPC_BOOK3S_64
|
||||
|
|
|
@ -67,6 +67,7 @@ int64_t opal_pci_config_write_half_word(uint64_t phb_id, uint64_t bus_dev_func,
|
|||
int64_t opal_pci_config_write_word(uint64_t phb_id, uint64_t bus_dev_func,
|
||||
uint64_t offset, uint32_t data);
|
||||
int64_t opal_set_xive(uint32_t isn, uint16_t server, uint8_t priority);
|
||||
int64_t opal_rm_set_xive(uint32_t isn, uint16_t server, uint8_t priority);
|
||||
int64_t opal_get_xive(uint32_t isn, __be16 *server, uint8_t *priority);
|
||||
int64_t opal_register_exception_handler(uint64_t opal_exception,
|
||||
uint64_t handler_address,
|
||||
|
|
|
@ -12,6 +12,7 @@
|
|||
|
||||
#include <linux/pci.h>
|
||||
#include <linux/pci_hotplug.h>
|
||||
#include <linux/irq.h>
|
||||
#include <misc/cxl-base.h>
|
||||
#include <asm/opal-api.h>
|
||||
|
||||
|
@ -33,6 +34,8 @@ int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num);
|
|||
void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num);
|
||||
int pnv_cxl_get_irq_count(struct pci_dev *dev);
|
||||
struct device_node *pnv_pci_get_phb_node(struct pci_dev *dev);
|
||||
int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq);
|
||||
bool is_pnv_opal_msi(struct irq_chip *chip);
|
||||
|
||||
#ifdef CONFIG_CXL_BASE
|
||||
int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs,
|
||||
|
|
|
@ -737,6 +737,7 @@
|
|||
#define MMCR0_FCHV 0x00000001UL /* freeze conditions in hypervisor mode */
|
||||
#define SPRN_MMCR1 798
|
||||
#define SPRN_MMCR2 785
|
||||
#define SPRN_UMMCR2 769
|
||||
#define SPRN_MMCRA 0x312
|
||||
#define MMCRA_SDSYNC 0x80000000UL /* SDAR synced with SIAR */
|
||||
#define MMCRA_SDAR_DCACHE_MISS 0x40000000UL
|
||||
|
|
|
@ -506,7 +506,6 @@ int main(void)
|
|||
DEFINE(VCPU_PURR, offsetof(struct kvm_vcpu, arch.purr));
|
||||
DEFINE(VCPU_SPURR, offsetof(struct kvm_vcpu, arch.spurr));
|
||||
DEFINE(VCPU_IC, offsetof(struct kvm_vcpu, arch.ic));
|
||||
DEFINE(VCPU_VTB, offsetof(struct kvm_vcpu, arch.vtb));
|
||||
DEFINE(VCPU_DSCR, offsetof(struct kvm_vcpu, arch.dscr));
|
||||
DEFINE(VCPU_AMR, offsetof(struct kvm_vcpu, arch.amr));
|
||||
DEFINE(VCPU_UAMOR, offsetof(struct kvm_vcpu, arch.uamor));
|
||||
|
@ -557,6 +556,7 @@ int main(void)
|
|||
DEFINE(VCORE_LPCR, offsetof(struct kvmppc_vcore, lpcr));
|
||||
DEFINE(VCORE_PCR, offsetof(struct kvmppc_vcore, pcr));
|
||||
DEFINE(VCORE_DPDES, offsetof(struct kvmppc_vcore, dpdes));
|
||||
DEFINE(VCORE_VTB, offsetof(struct kvmppc_vcore, vtb));
|
||||
DEFINE(VCPU_SLB_E, offsetof(struct kvmppc_slb, orige));
|
||||
DEFINE(VCPU_SLB_V, offsetof(struct kvmppc_slb, origv));
|
||||
DEFINE(VCPU_SLB_SIZE, sizeof(struct kvmppc_slb));
|
||||
|
|
|
@ -22,6 +22,9 @@ config KVM
|
|||
select ANON_INODES
|
||||
select HAVE_KVM_EVENTFD
|
||||
select SRCU
|
||||
select KVM_VFIO
|
||||
select IRQ_BYPASS_MANAGER
|
||||
select HAVE_KVM_IRQ_BYPASS
|
||||
|
||||
config KVM_BOOK3S_HANDLER
|
||||
bool
|
||||
|
|
|
@ -7,16 +7,16 @@ subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
|
|||
ccflags-y := -Ivirt/kvm -Iarch/powerpc/kvm
|
||||
KVM := ../../../virt/kvm
|
||||
|
||||
common-objs-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
|
||||
$(KVM)/eventfd.o
|
||||
common-objs-y = $(KVM)/kvm_main.o $(KVM)/eventfd.o
|
||||
common-objs-$(CONFIG_KVM_VFIO) += $(KVM)/vfio.o
|
||||
common-objs-$(CONFIG_KVM_MMIO) += $(KVM)/coalesced_mmio.o
|
||||
|
||||
CFLAGS_e500_mmu.o := -I.
|
||||
CFLAGS_e500_mmu_host.o := -I.
|
||||
CFLAGS_emulate.o := -I.
|
||||
CFLAGS_emulate_loadstore.o := -I.
|
||||
|
||||
common-objs-y += powerpc.o emulate.o emulate_loadstore.o
|
||||
common-objs-y += powerpc.o emulate_loadstore.o
|
||||
obj-$(CONFIG_KVM_EXIT_TIMING) += timing.o
|
||||
obj-$(CONFIG_KVM_BOOK3S_HANDLER) += book3s_exports.o
|
||||
|
||||
|
@ -24,6 +24,7 @@ AFLAGS_booke_interrupts.o := -I$(objtree)/$(obj)
|
|||
|
||||
kvm-e500-objs := \
|
||||
$(common-objs-y) \
|
||||
emulate.o \
|
||||
booke.o \
|
||||
booke_emulate.o \
|
||||
booke_interrupts.o \
|
||||
|
@ -35,6 +36,7 @@ kvm-objs-$(CONFIG_KVM_E500V2) := $(kvm-e500-objs)
|
|||
|
||||
kvm-e500mc-objs := \
|
||||
$(common-objs-y) \
|
||||
emulate.o \
|
||||
booke.o \
|
||||
booke_emulate.o \
|
||||
bookehv_interrupts.o \
|
||||
|
@ -61,9 +63,6 @@ kvm-pr-y := \
|
|||
book3s_32_mmu.o
|
||||
|
||||
ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
|
||||
kvm-book3s_64-module-objs := \
|
||||
$(KVM)/coalesced_mmio.o
|
||||
|
||||
kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
|
||||
book3s_rmhandlers.o
|
||||
endif
|
||||
|
@ -89,11 +88,8 @@ endif
|
|||
kvm-book3s_64-objs-$(CONFIG_KVM_XICS) += \
|
||||
book3s_xics.o
|
||||
|
||||
kvm-book3s_64-module-objs += \
|
||||
$(KVM)/kvm_main.o \
|
||||
$(KVM)/eventfd.o \
|
||||
powerpc.o \
|
||||
emulate_loadstore.o \
|
||||
kvm-book3s_64-module-objs := \
|
||||
$(common-objs-y) \
|
||||
book3s.o \
|
||||
book3s_64_vio.o \
|
||||
book3s_rtas.o \
|
||||
|
@ -103,6 +99,7 @@ kvm-objs-$(CONFIG_KVM_BOOK3S_64) := $(kvm-book3s_64-module-objs)
|
|||
|
||||
kvm-book3s_32-objs := \
|
||||
$(common-objs-y) \
|
||||
emulate.o \
|
||||
fpu.o \
|
||||
book3s_paired_singles.o \
|
||||
book3s.o \
|
||||
|
|
|
@ -52,8 +52,12 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
|
|||
{ "dec", VCPU_STAT(dec_exits) },
|
||||
{ "ext_intr", VCPU_STAT(ext_intr_exits) },
|
||||
{ "queue_intr", VCPU_STAT(queue_intr) },
|
||||
{ "halt_poll_success_ns", VCPU_STAT(halt_poll_success_ns) },
|
||||
{ "halt_poll_fail_ns", VCPU_STAT(halt_poll_fail_ns) },
|
||||
{ "halt_wait_ns", VCPU_STAT(halt_wait_ns) },
|
||||
{ "halt_successful_poll", VCPU_STAT(halt_successful_poll), },
|
||||
{ "halt_attempted_poll", VCPU_STAT(halt_attempted_poll), },
|
||||
{ "halt_successful_wait", VCPU_STAT(halt_successful_wait) },
|
||||
{ "halt_poll_invalid", VCPU_STAT(halt_poll_invalid) },
|
||||
{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
|
||||
{ "pf_storage", VCPU_STAT(pf_storage) },
|
||||
|
@ -64,6 +68,9 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
|
|||
{ "ld_slow", VCPU_STAT(ld_slow) },
|
||||
{ "st", VCPU_STAT(st) },
|
||||
{ "st_slow", VCPU_STAT(st_slow) },
|
||||
{ "pthru_all", VCPU_STAT(pthru_all) },
|
||||
{ "pthru_host", VCPU_STAT(pthru_host) },
|
||||
{ "pthru_bad_aff", VCPU_STAT(pthru_bad_aff) },
|
||||
{ NULL }
|
||||
};
|
||||
|
||||
|
@ -592,9 +599,6 @@ int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
|
|||
case KVM_REG_PPC_BESCR:
|
||||
*val = get_reg_val(id, vcpu->arch.bescr);
|
||||
break;
|
||||
case KVM_REG_PPC_VTB:
|
||||
*val = get_reg_val(id, vcpu->arch.vtb);
|
||||
break;
|
||||
case KVM_REG_PPC_IC:
|
||||
*val = get_reg_val(id, vcpu->arch.ic);
|
||||
break;
|
||||
|
@ -666,9 +670,6 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
|
|||
case KVM_REG_PPC_BESCR:
|
||||
vcpu->arch.bescr = set_reg_val(id, *val);
|
||||
break;
|
||||
case KVM_REG_PPC_VTB:
|
||||
vcpu->arch.vtb = set_reg_val(id, *val);
|
||||
break;
|
||||
case KVM_REG_PPC_IC:
|
||||
vcpu->arch.ic = set_reg_val(id, *val);
|
||||
break;
|
||||
|
|
|
@ -498,6 +498,7 @@ int kvmppc_core_emulate_mtspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
|
|||
case SPRN_MMCR0:
|
||||
case SPRN_MMCR1:
|
||||
case SPRN_MMCR2:
|
||||
case SPRN_UMMCR2:
|
||||
#endif
|
||||
break;
|
||||
unprivileged:
|
||||
|
@ -579,7 +580,7 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
|
|||
*spr_val = vcpu->arch.spurr;
|
||||
break;
|
||||
case SPRN_VTB:
|
||||
*spr_val = vcpu->arch.vtb;
|
||||
*spr_val = to_book3s(vcpu)->vtb;
|
||||
break;
|
||||
case SPRN_IC:
|
||||
*spr_val = vcpu->arch.ic;
|
||||
|
@ -640,6 +641,7 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
|
|||
case SPRN_MMCR0:
|
||||
case SPRN_MMCR1:
|
||||
case SPRN_MMCR2:
|
||||
case SPRN_UMMCR2:
|
||||
case SPRN_TIR:
|
||||
#endif
|
||||
*spr_val = 0;
|
||||
|
|
|
@ -53,11 +53,15 @@
|
|||
#include <asm/smp.h>
|
||||
#include <asm/dbell.h>
|
||||
#include <asm/hmi.h>
|
||||
#include <asm/pnv-pci.h>
|
||||
#include <linux/gfp.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <linux/highmem.h>
|
||||
#include <linux/hugetlb.h>
|
||||
#include <linux/kvm_irqfd.h>
|
||||
#include <linux/irqbypass.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/compiler.h>
|
||||
|
||||
#include "book3s.h"
|
||||
|
||||
|
@ -70,6 +74,8 @@
|
|||
|
||||
/* Used to indicate that a guest page fault needs to be handled */
|
||||
#define RESUME_PAGE_FAULT (RESUME_GUEST | RESUME_FLAG_ARCH1)
|
||||
/* Used to indicate that a guest passthrough interrupt needs to be handled */
|
||||
#define RESUME_PASSTHROUGH (RESUME_GUEST | RESUME_FLAG_ARCH2)
|
||||
|
||||
/* Used as a "null" value for timebase values */
|
||||
#define TB_NIL (~(u64)0)
|
||||
|
@ -89,14 +95,55 @@ static struct kernel_param_ops module_param_ops = {
|
|||
.get = param_get_int,
|
||||
};
|
||||
|
||||
module_param_cb(kvm_irq_bypass, &module_param_ops, &kvm_irq_bypass,
|
||||
S_IRUGO | S_IWUSR);
|
||||
MODULE_PARM_DESC(kvm_irq_bypass, "Bypass passthrough interrupt optimization");
|
||||
|
||||
module_param_cb(h_ipi_redirect, &module_param_ops, &h_ipi_redirect,
|
||||
S_IRUGO | S_IWUSR);
|
||||
MODULE_PARM_DESC(h_ipi_redirect, "Redirect H_IPI wakeup to a free host core");
|
||||
#endif
|
||||
|
||||
/* Maximum halt poll interval defaults to KVM_HALT_POLL_NS_DEFAULT */
|
||||
static unsigned int halt_poll_max_ns = KVM_HALT_POLL_NS_DEFAULT;
|
||||
module_param(halt_poll_max_ns, uint, S_IRUGO | S_IWUSR);
|
||||
MODULE_PARM_DESC(halt_poll_max_ns, "Maximum halt poll time in ns");
|
||||
|
||||
/* Factor by which the vcore halt poll interval is grown, default is to double
|
||||
*/
|
||||
static unsigned int halt_poll_ns_grow = 2;
|
||||
module_param(halt_poll_ns_grow, int, S_IRUGO);
|
||||
MODULE_PARM_DESC(halt_poll_ns_grow, "Factor halt poll time is grown by");
|
||||
|
||||
/* Factor by which the vcore halt poll interval is shrunk, default is to reset
|
||||
*/
|
||||
static unsigned int halt_poll_ns_shrink;
|
||||
module_param(halt_poll_ns_shrink, int, S_IRUGO);
|
||||
MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll time is shrunk by");
|
||||
|
||||
static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
|
||||
static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
|
||||
|
||||
static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc,
|
||||
int *ip)
|
||||
{
|
||||
int i = *ip;
|
||||
struct kvm_vcpu *vcpu;
|
||||
|
||||
while (++i < MAX_SMT_THREADS) {
|
||||
vcpu = READ_ONCE(vc->runnable_threads[i]);
|
||||
if (vcpu) {
|
||||
*ip = i;
|
||||
return vcpu;
|
||||
}
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
/* Used to traverse the list of runnable threads for a given vcore */
|
||||
#define for_each_runnable_thread(i, vcpu, vc) \
|
||||
for (i = -1; (vcpu = next_runnable_thread(vc, &i)); )
|
||||
|
||||
static bool kvmppc_ipi_thread(int cpu)
|
||||
{
|
||||
/* On POWER8 for IPIs to threads in the same core, use msgsnd */
|
||||
|
@ -991,6 +1038,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
|
|||
kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
|
||||
r = RESUME_GUEST;
|
||||
break;
|
||||
case BOOK3S_INTERRUPT_HV_RM_HARD:
|
||||
r = RESUME_PASSTHROUGH;
|
||||
break;
|
||||
default:
|
||||
kvmppc_dump_regs(vcpu);
|
||||
printk(KERN_EMERG "trap=0x%x | pc=0x%lx | msr=0x%llx\n",
|
||||
|
@ -1149,6 +1199,9 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
|
|||
case KVM_REG_PPC_DPDES:
|
||||
*val = get_reg_val(id, vcpu->arch.vcore->dpdes);
|
||||
break;
|
||||
case KVM_REG_PPC_VTB:
|
||||
*val = get_reg_val(id, vcpu->arch.vcore->vtb);
|
||||
break;
|
||||
case KVM_REG_PPC_DAWR:
|
||||
*val = get_reg_val(id, vcpu->arch.dawr);
|
||||
break;
|
||||
|
@ -1341,6 +1394,9 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
|
|||
case KVM_REG_PPC_DPDES:
|
||||
vcpu->arch.vcore->dpdes = set_reg_val(id, *val);
|
||||
break;
|
||||
case KVM_REG_PPC_VTB:
|
||||
vcpu->arch.vcore->vtb = set_reg_val(id, *val);
|
||||
break;
|
||||
case KVM_REG_PPC_DAWR:
|
||||
vcpu->arch.dawr = set_reg_val(id, *val);
|
||||
break;
|
||||
|
@ -1493,7 +1549,6 @@ static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
|
|||
if (vcore == NULL)
|
||||
return NULL;
|
||||
|
||||
INIT_LIST_HEAD(&vcore->runnable_threads);
|
||||
spin_lock_init(&vcore->lock);
|
||||
spin_lock_init(&vcore->stoltb_lock);
|
||||
init_swait_queue_head(&vcore->wq);
|
||||
|
@ -1802,7 +1857,7 @@ static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
|
|||
vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;
|
||||
spin_unlock_irq(&vcpu->arch.tbacct_lock);
|
||||
--vc->n_runnable;
|
||||
list_del(&vcpu->arch.run_list);
|
||||
WRITE_ONCE(vc->runnable_threads[vcpu->arch.ptid], NULL);
|
||||
}
|
||||
|
||||
static int kvmppc_grab_hwthread(int cpu)
|
||||
|
@ -2048,66 +2103,6 @@ static void init_master_vcore(struct kvmppc_vcore *vc)
|
|||
vc->conferring_threads = 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* See if the existing subcores can be split into 3 (or fewer) subcores
|
||||
* of at most two threads each, so we can fit in another vcore. This
|
||||
* assumes there are at most two subcores and at most 6 threads in total.
|
||||
*/
|
||||
static bool can_split_piggybacked_subcores(struct core_info *cip)
|
||||
{
|
||||
int sub, new_sub;
|
||||
int large_sub = -1;
|
||||
int thr;
|
||||
int n_subcores = cip->n_subcores;
|
||||
struct kvmppc_vcore *vc, *vcnext;
|
||||
struct kvmppc_vcore *master_vc = NULL;
|
||||
|
||||
for (sub = 0; sub < cip->n_subcores; ++sub) {
|
||||
if (cip->subcore_threads[sub] <= 2)
|
||||
continue;
|
||||
if (large_sub >= 0)
|
||||
return false;
|
||||
large_sub = sub;
|
||||
vc = list_first_entry(&cip->vcs[sub], struct kvmppc_vcore,
|
||||
preempt_list);
|
||||
if (vc->num_threads > 2)
|
||||
return false;
|
||||
n_subcores += (cip->subcore_threads[sub] - 1) >> 1;
|
||||
}
|
||||
if (large_sub < 0 || !subcore_config_ok(n_subcores + 1, 2))
|
||||
return false;
|
||||
|
||||
/*
|
||||
* Seems feasible, so go through and move vcores to new subcores.
|
||||
* Note that when we have two or more vcores in one subcore,
|
||||
* all those vcores must have only one thread each.
|
||||
*/
|
||||
new_sub = cip->n_subcores;
|
||||
thr = 0;
|
||||
sub = large_sub;
|
||||
list_for_each_entry_safe(vc, vcnext, &cip->vcs[sub], preempt_list) {
|
||||
if (thr >= 2) {
|
||||
list_del(&vc->preempt_list);
|
||||
list_add_tail(&vc->preempt_list, &cip->vcs[new_sub]);
|
||||
/* vc->num_threads must be 1 */
|
||||
if (++cip->subcore_threads[new_sub] == 1) {
|
||||
cip->subcore_vm[new_sub] = vc->kvm;
|
||||
init_master_vcore(vc);
|
||||
master_vc = vc;
|
||||
++cip->n_subcores;
|
||||
} else {
|
||||
vc->master_vcore = master_vc;
|
||||
++new_sub;
|
||||
}
|
||||
}
|
||||
thr += vc->num_threads;
|
||||
}
|
||||
cip->subcore_threads[large_sub] = 2;
|
||||
cip->max_subcore_threads = 2;
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
static bool can_dynamic_split(struct kvmppc_vcore *vc, struct core_info *cip)
|
||||
{
|
||||
int n_threads = vc->num_threads;
|
||||
|
@ -2118,23 +2113,9 @@ static bool can_dynamic_split(struct kvmppc_vcore *vc, struct core_info *cip)
|
|||
|
||||
if (n_threads < cip->max_subcore_threads)
|
||||
n_threads = cip->max_subcore_threads;
|
||||
if (subcore_config_ok(cip->n_subcores + 1, n_threads)) {
|
||||
if (!subcore_config_ok(cip->n_subcores + 1, n_threads))
|
||||
return false;
|
||||
cip->max_subcore_threads = n_threads;
|
||||
} else if (cip->n_subcores <= 2 && cip->total_threads <= 6 &&
|
||||
vc->num_threads <= 2) {
|
||||
/*
|
||||
* We may be able to fit another subcore in by
|
||||
* splitting an existing subcore with 3 or 4
|
||||
* threads into two 2-thread subcores, or one
|
||||
* with 5 or 6 threads into three subcores.
|
||||
* We can only do this if those subcores have
|
||||
* piggybacked virtual cores.
|
||||
*/
|
||||
if (!can_split_piggybacked_subcores(cip))
|
||||
return false;
|
||||
} else {
|
||||
return false;
|
||||
}
|
||||
|
||||
sub = cip->n_subcores;
|
||||
++cip->n_subcores;
|
||||
|
@ -2148,43 +2129,6 @@ static bool can_dynamic_split(struct kvmppc_vcore *vc, struct core_info *cip)
|
|||
return true;
|
||||
}
|
||||
|
||||
static bool can_piggyback_subcore(struct kvmppc_vcore *pvc,
|
||||
struct core_info *cip, int sub)
|
||||
{
|
||||
struct kvmppc_vcore *vc;
|
||||
int n_thr;
|
||||
|
||||
vc = list_first_entry(&cip->vcs[sub], struct kvmppc_vcore,
|
||||
preempt_list);
|
||||
|
||||
/* require same VM and same per-core reg values */
|
||||
if (pvc->kvm != vc->kvm ||
|
||||
pvc->tb_offset != vc->tb_offset ||
|
||||
pvc->pcr != vc->pcr ||
|
||||
pvc->lpcr != vc->lpcr)
|
||||
return false;
|
||||
|
||||
/* P8 guest with > 1 thread per core would see wrong TIR value */
|
||||
if (cpu_has_feature(CPU_FTR_ARCH_207S) &&
|
||||
(vc->num_threads > 1 || pvc->num_threads > 1))
|
||||
return false;
|
||||
|
||||
n_thr = cip->subcore_threads[sub] + pvc->num_threads;
|
||||
if (n_thr > cip->max_subcore_threads) {
|
||||
if (!subcore_config_ok(cip->n_subcores, n_thr))
|
||||
return false;
|
||||
cip->max_subcore_threads = n_thr;
|
||||
}
|
||||
|
||||
cip->total_threads += pvc->num_threads;
|
||||
cip->subcore_threads[sub] = n_thr;
|
||||
pvc->master_vcore = vc;
|
||||
list_del(&pvc->preempt_list);
|
||||
list_add_tail(&pvc->preempt_list, &cip->vcs[sub]);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
/*
|
||||
* Work out whether it is possible to piggyback the execution of
|
||||
* vcore *pvc onto the execution of the other vcores described in *cip.
|
||||
|
@ -2192,27 +2136,18 @@ static bool can_piggyback_subcore(struct kvmppc_vcore *pvc,
|
|||
static bool can_piggyback(struct kvmppc_vcore *pvc, struct core_info *cip,
|
||||
int target_threads)
|
||||
{
|
||||
int sub;
|
||||
|
||||
if (cip->total_threads + pvc->num_threads > target_threads)
|
||||
return false;
|
||||
for (sub = 0; sub < cip->n_subcores; ++sub)
|
||||
if (cip->subcore_threads[sub] &&
|
||||
can_piggyback_subcore(pvc, cip, sub))
|
||||
return true;
|
||||
|
||||
if (can_dynamic_split(pvc, cip))
|
||||
return true;
|
||||
|
||||
return false;
|
||||
return can_dynamic_split(pvc, cip);
|
||||
}
|
||||
|
||||
static void prepare_threads(struct kvmppc_vcore *vc)
|
||||
{
|
||||
struct kvm_vcpu *vcpu, *vnext;
|
||||
int i;
|
||||
struct kvm_vcpu *vcpu;
|
||||
|
||||
list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
|
||||
arch.run_list) {
|
||||
for_each_runnable_thread(i, vcpu, vc) {
|
||||
if (signal_pending(vcpu->arch.run_task))
|
||||
vcpu->arch.ret = -EINTR;
|
||||
else if (vcpu->arch.vpa.update_pending ||
|
||||
|
@ -2259,15 +2194,14 @@ static void collect_piggybacks(struct core_info *cip, int target_threads)
|
|||
|
||||
static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
|
||||
{
|
||||
int still_running = 0;
|
||||
int still_running = 0, i;
|
||||
u64 now;
|
||||
long ret;
|
||||
struct kvm_vcpu *vcpu, *vnext;
|
||||
struct kvm_vcpu *vcpu;
|
||||
|
||||
spin_lock(&vc->lock);
|
||||
now = get_tb();
|
||||
list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
|
||||
arch.run_list) {
|
||||
for_each_runnable_thread(i, vcpu, vc) {
|
||||
/* cancel pending dec exception if dec is positive */
|
||||
if (now < vcpu->arch.dec_expires &&
|
||||
kvmppc_core_pending_dec(vcpu))
|
||||
|
@ -2307,8 +2241,8 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
|
|||
}
|
||||
if (vc->n_runnable > 0 && vc->runner == NULL) {
|
||||
/* make sure there's a candidate runner awake */
|
||||
vcpu = list_first_entry(&vc->runnable_threads,
|
||||
struct kvm_vcpu, arch.run_list);
|
||||
i = -1;
|
||||
vcpu = next_runnable_thread(vc, &i);
|
||||
wake_up(&vcpu->arch.cpu_run);
|
||||
}
|
||||
}
|
||||
|
@ -2361,7 +2295,7 @@ static inline void kvmppc_set_host_core(int cpu)
|
|||
*/
|
||||
static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
|
||||
{
|
||||
struct kvm_vcpu *vcpu, *vnext;
|
||||
struct kvm_vcpu *vcpu;
|
||||
int i;
|
||||
int srcu_idx;
|
||||
struct core_info core_info;
|
||||
|
@ -2397,8 +2331,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
|
|||
*/
|
||||
if ((threads_per_core > 1) &&
|
||||
((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
|
||||
list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
|
||||
arch.run_list) {
|
||||
for_each_runnable_thread(i, vcpu, vc) {
|
||||
vcpu->arch.ret = -EBUSY;
|
||||
kvmppc_remove_runnable(vc, vcpu);
|
||||
wake_up(&vcpu->arch.cpu_run);
|
||||
|
@ -2477,8 +2410,7 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
|
|||
active |= 1 << thr;
|
||||
list_for_each_entry(pvc, &core_info.vcs[sub], preempt_list) {
|
||||
pvc->pcpu = pcpu + thr;
|
||||
list_for_each_entry(vcpu, &pvc->runnable_threads,
|
||||
arch.run_list) {
|
||||
for_each_runnable_thread(i, vcpu, pvc) {
|
||||
kvmppc_start_thread(vcpu, pvc);
|
||||
kvmppc_create_dtl_entry(vcpu, pvc);
|
||||
trace_kvm_guest_enter(vcpu);
|
||||
|
@ -2604,33 +2536,91 @@ static void kvmppc_wait_for_exec(struct kvmppc_vcore *vc,
|
|||
finish_wait(&vcpu->arch.cpu_run, &wait);
|
||||
}
|
||||
|
||||
static void grow_halt_poll_ns(struct kvmppc_vcore *vc)
|
||||
{
|
||||
/* 10us base */
|
||||
if (vc->halt_poll_ns == 0 && halt_poll_ns_grow)
|
||||
vc->halt_poll_ns = 10000;
|
||||
else
|
||||
vc->halt_poll_ns *= halt_poll_ns_grow;
|
||||
|
||||
if (vc->halt_poll_ns > halt_poll_max_ns)
|
||||
vc->halt_poll_ns = halt_poll_max_ns;
|
||||
}
|
||||
|
||||
static void shrink_halt_poll_ns(struct kvmppc_vcore *vc)
|
||||
{
|
||||
if (halt_poll_ns_shrink == 0)
|
||||
vc->halt_poll_ns = 0;
|
||||
else
|
||||
vc->halt_poll_ns /= halt_poll_ns_shrink;
|
||||
}
|
||||
|
||||
/* Check to see if any of the runnable vcpus on the vcore have pending
|
||||
* exceptions or are no longer ceded
|
||||
*/
|
||||
static int kvmppc_vcore_check_block(struct kvmppc_vcore *vc)
|
||||
{
|
||||
struct kvm_vcpu *vcpu;
|
||||
int i;
|
||||
|
||||
for_each_runnable_thread(i, vcpu, vc) {
|
||||
if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded)
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* All the vcpus in this vcore are idle, so wait for a decrementer
|
||||
* or external interrupt to one of the vcpus. vc->lock is held.
|
||||
*/
|
||||
static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
|
||||
{
|
||||
struct kvm_vcpu *vcpu;
|
||||
ktime_t cur, start_poll, start_wait;
|
||||
int do_sleep = 1;
|
||||
u64 block_ns;
|
||||
DECLARE_SWAITQUEUE(wait);
|
||||
|
||||
prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
|
||||
/* Poll for pending exceptions and ceded state */
|
||||
cur = start_poll = ktime_get();
|
||||
if (vc->halt_poll_ns) {
|
||||
ktime_t stop = ktime_add_ns(start_poll, vc->halt_poll_ns);
|
||||
++vc->runner->stat.halt_attempted_poll;
|
||||
|
||||
/*
|
||||
* Check one last time for pending exceptions and ceded state after
|
||||
* we put ourselves on the wait queue
|
||||
*/
|
||||
list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
|
||||
if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
|
||||
vc->vcore_state = VCORE_POLLING;
|
||||
spin_unlock(&vc->lock);
|
||||
|
||||
do {
|
||||
if (kvmppc_vcore_check_block(vc)) {
|
||||
do_sleep = 0;
|
||||
break;
|
||||
}
|
||||
}
|
||||
cur = ktime_get();
|
||||
} while (single_task_running() && ktime_before(cur, stop));
|
||||
|
||||
spin_lock(&vc->lock);
|
||||
vc->vcore_state = VCORE_INACTIVE;
|
||||
|
||||
if (!do_sleep) {
|
||||
finish_swait(&vc->wq, &wait);
|
||||
return;
|
||||
++vc->runner->stat.halt_successful_poll;
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
prepare_to_swait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
|
||||
|
||||
if (kvmppc_vcore_check_block(vc)) {
|
||||
finish_swait(&vc->wq, &wait);
|
||||
do_sleep = 0;
|
||||
/* If we polled, count this as a successful poll */
|
||||
if (vc->halt_poll_ns)
|
||||
++vc->runner->stat.halt_successful_poll;
|
||||
goto out;
|
||||
}
|
||||
|
||||
start_wait = ktime_get();
|
||||
|
||||
vc->vcore_state = VCORE_SLEEPING;
|
||||
trace_kvmppc_vcore_blocked(vc, 0);
|
||||
|
@ -2640,13 +2630,52 @@ static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
|
|||
spin_lock(&vc->lock);
|
||||
vc->vcore_state = VCORE_INACTIVE;
|
||||
trace_kvmppc_vcore_blocked(vc, 1);
|
||||
++vc->runner->stat.halt_successful_wait;
|
||||
|
||||
cur = ktime_get();
|
||||
|
||||
out:
|
||||
block_ns = ktime_to_ns(cur) - ktime_to_ns(start_poll);
|
||||
|
||||
/* Attribute wait time */
|
||||
if (do_sleep) {
|
||||
vc->runner->stat.halt_wait_ns +=
|
||||
ktime_to_ns(cur) - ktime_to_ns(start_wait);
|
||||
/* Attribute failed poll time */
|
||||
if (vc->halt_poll_ns)
|
||||
vc->runner->stat.halt_poll_fail_ns +=
|
||||
ktime_to_ns(start_wait) -
|
||||
ktime_to_ns(start_poll);
|
||||
} else {
|
||||
/* Attribute successful poll time */
|
||||
if (vc->halt_poll_ns)
|
||||
vc->runner->stat.halt_poll_success_ns +=
|
||||
ktime_to_ns(cur) -
|
||||
ktime_to_ns(start_poll);
|
||||
}
|
||||
|
||||
/* Adjust poll time */
|
||||
if (halt_poll_max_ns) {
|
||||
if (block_ns <= vc->halt_poll_ns)
|
||||
;
|
||||
/* We slept and blocked for longer than the max halt time */
|
||||
else if (vc->halt_poll_ns && block_ns > halt_poll_max_ns)
|
||||
shrink_halt_poll_ns(vc);
|
||||
/* We slept and our poll time is too small */
|
||||
else if (vc->halt_poll_ns < halt_poll_max_ns &&
|
||||
block_ns < halt_poll_max_ns)
|
||||
grow_halt_poll_ns(vc);
|
||||
} else
|
||||
vc->halt_poll_ns = 0;
|
||||
|
||||
trace_kvmppc_vcore_wakeup(do_sleep, block_ns);
|
||||
}
|
||||
|
||||
static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
|
||||
{
|
||||
int n_ceded;
|
||||
int n_ceded, i;
|
||||
struct kvmppc_vcore *vc;
|
||||
struct kvm_vcpu *v, *vn;
|
||||
struct kvm_vcpu *v;
|
||||
|
||||
trace_kvmppc_run_vcpu_enter(vcpu);
|
||||
|
||||
|
@ -2666,7 +2695,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
|
|||
vcpu->arch.stolen_logged = vcore_stolen_time(vc, mftb());
|
||||
vcpu->arch.state = KVMPPC_VCPU_RUNNABLE;
|
||||
vcpu->arch.busy_preempt = TB_NIL;
|
||||
list_add_tail(&vcpu->arch.run_list, &vc->runnable_threads);
|
||||
WRITE_ONCE(vc->runnable_threads[vcpu->arch.ptid], vcpu);
|
||||
++vc->n_runnable;
|
||||
|
||||
/*
|
||||
|
@ -2706,8 +2735,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
|
|||
kvmppc_wait_for_exec(vc, vcpu, TASK_INTERRUPTIBLE);
|
||||
continue;
|
||||
}
|
||||
list_for_each_entry_safe(v, vn, &vc->runnable_threads,
|
||||
arch.run_list) {
|
||||
for_each_runnable_thread(i, v, vc) {
|
||||
kvmppc_core_prepare_to_enter(v);
|
||||
if (signal_pending(v->arch.run_task)) {
|
||||
kvmppc_remove_runnable(vc, v);
|
||||
|
@ -2720,7 +2748,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
|
|||
if (!vc->n_runnable || vcpu->arch.state != KVMPPC_VCPU_RUNNABLE)
|
||||
break;
|
||||
n_ceded = 0;
|
||||
list_for_each_entry(v, &vc->runnable_threads, arch.run_list) {
|
||||
for_each_runnable_thread(i, v, vc) {
|
||||
if (!v->arch.pending_exceptions)
|
||||
n_ceded += v->arch.ceded;
|
||||
else
|
||||
|
@ -2759,8 +2787,8 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
|
|||
|
||||
if (vc->n_runnable && vc->vcore_state == VCORE_INACTIVE) {
|
||||
/* Wake up some vcpu to run the core */
|
||||
v = list_first_entry(&vc->runnable_threads,
|
||||
struct kvm_vcpu, arch.run_list);
|
||||
i = -1;
|
||||
v = next_runnable_thread(vc, &i);
|
||||
wake_up(&v->arch.cpu_run);
|
||||
}
|
||||
|
||||
|
@ -2818,7 +2846,8 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu)
|
|||
r = kvmppc_book3s_hv_page_fault(run, vcpu,
|
||||
vcpu->arch.fault_dar, vcpu->arch.fault_dsisr);
|
||||
srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
|
||||
}
|
||||
} else if (r == RESUME_PASSTHROUGH)
|
||||
r = kvmppc_xics_rm_complete(vcpu, 0);
|
||||
} while (is_kvmppc_resume_guest(r));
|
||||
|
||||
out:
|
||||
|
@ -3247,6 +3276,8 @@ static void kvmppc_core_destroy_vm_hv(struct kvm *kvm)
|
|||
kvmppc_free_vcores(kvm);
|
||||
|
||||
kvmppc_free_hpt(kvm);
|
||||
|
||||
kvmppc_free_pimap(kvm);
|
||||
}
|
||||
|
||||
/* We don't need to emulate any privileged instructions or dcbz */
|
||||
|
@ -3282,6 +3313,184 @@ static int kvmppc_core_check_processor_compat_hv(void)
|
|||
return 0;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
|
||||
void kvmppc_free_pimap(struct kvm *kvm)
|
||||
{
|
||||
kfree(kvm->arch.pimap);
|
||||
}
|
||||
|
||||
static struct kvmppc_passthru_irqmap *kvmppc_alloc_pimap(void)
|
||||
{
|
||||
return kzalloc(sizeof(struct kvmppc_passthru_irqmap), GFP_KERNEL);
|
||||
}
|
||||
|
||||
static int kvmppc_set_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi)
|
||||
{
|
||||
struct irq_desc *desc;
|
||||
struct kvmppc_irq_map *irq_map;
|
||||
struct kvmppc_passthru_irqmap *pimap;
|
||||
struct irq_chip *chip;
|
||||
int i;
|
||||
|
||||
if (!kvm_irq_bypass)
|
||||
return 1;
|
||||
|
||||
desc = irq_to_desc(host_irq);
|
||||
if (!desc)
|
||||
return -EIO;
|
||||
|
||||
mutex_lock(&kvm->lock);
|
||||
|
||||
pimap = kvm->arch.pimap;
|
||||
if (pimap == NULL) {
|
||||
/* First call, allocate structure to hold IRQ map */
|
||||
pimap = kvmppc_alloc_pimap();
|
||||
if (pimap == NULL) {
|
||||
mutex_unlock(&kvm->lock);
|
||||
return -ENOMEM;
|
||||
}
|
||||
kvm->arch.pimap = pimap;
|
||||
}
|
||||
|
||||
/*
|
||||
* For now, we only support interrupts for which the EOI operation
|
||||
* is an OPAL call followed by a write to XIRR, since that's
|
||||
* what our real-mode EOI code does.
|
||||
*/
|
||||
chip = irq_data_get_irq_chip(&desc->irq_data);
|
||||
if (!chip || !is_pnv_opal_msi(chip)) {
|
||||
pr_warn("kvmppc_set_passthru_irq_hv: Could not assign IRQ map for (%d,%d)\n",
|
||||
host_irq, guest_gsi);
|
||||
mutex_unlock(&kvm->lock);
|
||||
return -ENOENT;
|
||||
}
|
||||
|
||||
/*
|
||||
* See if we already have an entry for this guest IRQ number.
|
||||
* If it's mapped to a hardware IRQ number, that's an error,
|
||||
* otherwise re-use this entry.
|
||||
*/
|
||||
for (i = 0; i < pimap->n_mapped; i++) {
|
||||
if (guest_gsi == pimap->mapped[i].v_hwirq) {
|
||||
if (pimap->mapped[i].r_hwirq) {
|
||||
mutex_unlock(&kvm->lock);
|
||||
return -EINVAL;
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
if (i == KVMPPC_PIRQ_MAPPED) {
|
||||
mutex_unlock(&kvm->lock);
|
||||
return -EAGAIN; /* table is full */
|
||||
}
|
||||
|
||||
irq_map = &pimap->mapped[i];
|
||||
|
||||
irq_map->v_hwirq = guest_gsi;
|
||||
irq_map->desc = desc;
|
||||
|
||||
/*
|
||||
* Order the above two stores before the next to serialize with
|
||||
* the KVM real mode handler.
|
||||
*/
|
||||
smp_wmb();
|
||||
irq_map->r_hwirq = desc->irq_data.hwirq;
|
||||
|
||||
if (i == pimap->n_mapped)
|
||||
pimap->n_mapped++;
|
||||
|
||||
kvmppc_xics_set_mapped(kvm, guest_gsi, desc->irq_data.hwirq);
|
||||
|
||||
mutex_unlock(&kvm->lock);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int kvmppc_clr_passthru_irq(struct kvm *kvm, int host_irq, int guest_gsi)
|
||||
{
|
||||
struct irq_desc *desc;
|
||||
struct kvmppc_passthru_irqmap *pimap;
|
||||
int i;
|
||||
|
||||
if (!kvm_irq_bypass)
|
||||
return 0;
|
||||
|
||||
desc = irq_to_desc(host_irq);
|
||||
if (!desc)
|
||||
return -EIO;
|
||||
|
||||
mutex_lock(&kvm->lock);
|
||||
|
||||
if (kvm->arch.pimap == NULL) {
|
||||
mutex_unlock(&kvm->lock);
|
||||
return 0;
|
||||
}
|
||||
pimap = kvm->arch.pimap;
|
||||
|
||||
for (i = 0; i < pimap->n_mapped; i++) {
|
||||
if (guest_gsi == pimap->mapped[i].v_hwirq)
|
||||
break;
|
||||
}
|
||||
|
||||
if (i == pimap->n_mapped) {
|
||||
mutex_unlock(&kvm->lock);
|
||||
return -ENODEV;
|
||||
}
|
||||
|
||||
kvmppc_xics_clr_mapped(kvm, guest_gsi, pimap->mapped[i].r_hwirq);
|
||||
|
||||
/* invalidate the entry */
|
||||
pimap->mapped[i].r_hwirq = 0;
|
||||
|
||||
/*
|
||||
* We don't free this structure even when the count goes to
|
||||
* zero. The structure is freed when we destroy the VM.
|
||||
*/
|
||||
|
||||
mutex_unlock(&kvm->lock);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int kvmppc_irq_bypass_add_producer_hv(struct irq_bypass_consumer *cons,
|
||||
struct irq_bypass_producer *prod)
|
||||
{
|
||||
int ret = 0;
|
||||
struct kvm_kernel_irqfd *irqfd =
|
||||
container_of(cons, struct kvm_kernel_irqfd, consumer);
|
||||
|
||||
irqfd->producer = prod;
|
||||
|
||||
ret = kvmppc_set_passthru_irq(irqfd->kvm, prod->irq, irqfd->gsi);
|
||||
if (ret)
|
||||
pr_info("kvmppc_set_passthru_irq (irq %d, gsi %d) fails: %d\n",
|
||||
prod->irq, irqfd->gsi, ret);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void kvmppc_irq_bypass_del_producer_hv(struct irq_bypass_consumer *cons,
|
||||
struct irq_bypass_producer *prod)
|
||||
{
|
||||
int ret;
|
||||
struct kvm_kernel_irqfd *irqfd =
|
||||
container_of(cons, struct kvm_kernel_irqfd, consumer);
|
||||
|
||||
irqfd->producer = NULL;
|
||||
|
||||
/*
|
||||
* When producer of consumer is unregistered, we change back to
|
||||
* default external interrupt handling mode - KVM real mode
|
||||
* will switch back to host.
|
||||
*/
|
||||
ret = kvmppc_clr_passthru_irq(irqfd->kvm, prod->irq, irqfd->gsi);
|
||||
if (ret)
|
||||
pr_warn("kvmppc_clr_passthru_irq (irq %d, gsi %d) fails: %d\n",
|
||||
prod->irq, irqfd->gsi, ret);
|
||||
}
|
||||
#endif
|
||||
|
||||
static long kvm_arch_vm_ioctl_hv(struct file *filp,
|
||||
unsigned int ioctl, unsigned long arg)
|
||||
{
|
||||
|
@ -3400,6 +3609,10 @@ static struct kvmppc_ops kvm_ops_hv = {
|
|||
.fast_vcpu_kick = kvmppc_fast_vcpu_kick_hv,
|
||||
.arch_vm_ioctl = kvm_arch_vm_ioctl_hv,
|
||||
.hcall_implemented = kvmppc_hcall_impl_hv,
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
.irq_bypass_add_producer = kvmppc_irq_bypass_add_producer_hv,
|
||||
.irq_bypass_del_producer = kvmppc_irq_bypass_del_producer_hv,
|
||||
#endif
|
||||
};
|
||||
|
||||
static int kvm_init_subcore_bitmap(void)
|
||||
|
|
|
@ -25,6 +25,7 @@
|
|||
#include <asm/xics.h>
|
||||
#include <asm/dbell.h>
|
||||
#include <asm/cputhreads.h>
|
||||
#include <asm/io.h>
|
||||
|
||||
#define KVM_CMA_CHUNK_ORDER 18
|
||||
|
||||
|
@ -286,3 +287,158 @@ void kvmhv_commence_exit(int trap)
|
|||
|
||||
struct kvmppc_host_rm_ops *kvmppc_host_rm_ops_hv;
|
||||
EXPORT_SYMBOL_GPL(kvmppc_host_rm_ops_hv);
|
||||
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
static struct kvmppc_irq_map *get_irqmap(struct kvmppc_passthru_irqmap *pimap,
|
||||
u32 xisr)
|
||||
{
|
||||
int i;
|
||||
|
||||
/*
|
||||
* We access the mapped array here without a lock. That
|
||||
* is safe because we never reduce the number of entries
|
||||
* in the array and we never change the v_hwirq field of
|
||||
* an entry once it is set.
|
||||
*
|
||||
* We have also carefully ordered the stores in the writer
|
||||
* and the loads here in the reader, so that if we find a matching
|
||||
* hwirq here, the associated GSI and irq_desc fields are valid.
|
||||
*/
|
||||
for (i = 0; i < pimap->n_mapped; i++) {
|
||||
if (xisr == pimap->mapped[i].r_hwirq) {
|
||||
/*
|
||||
* Order subsequent reads in the caller to serialize
|
||||
* with the writer.
|
||||
*/
|
||||
smp_rmb();
|
||||
return &pimap->mapped[i];
|
||||
}
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
/*
|
||||
* If we have an interrupt that's not an IPI, check if we have a
|
||||
* passthrough adapter and if so, check if this external interrupt
|
||||
* is for the adapter.
|
||||
* We will attempt to deliver the IRQ directly to the target VCPU's
|
||||
* ICP, the virtual ICP (based on affinity - the xive value in ICS).
|
||||
*
|
||||
* If the delivery fails or if this is not for a passthrough adapter,
|
||||
* return to the host to handle this interrupt. We earlier
|
||||
* saved a copy of the XIRR in the PACA, it will be picked up by
|
||||
* the host ICP driver.
|
||||
*/
|
||||
static int kvmppc_check_passthru(u32 xisr, __be32 xirr)
|
||||
{
|
||||
struct kvmppc_passthru_irqmap *pimap;
|
||||
struct kvmppc_irq_map *irq_map;
|
||||
struct kvm_vcpu *vcpu;
|
||||
|
||||
vcpu = local_paca->kvm_hstate.kvm_vcpu;
|
||||
if (!vcpu)
|
||||
return 1;
|
||||
pimap = kvmppc_get_passthru_irqmap(vcpu->kvm);
|
||||
if (!pimap)
|
||||
return 1;
|
||||
irq_map = get_irqmap(pimap, xisr);
|
||||
if (!irq_map)
|
||||
return 1;
|
||||
|
||||
/* We're handling this interrupt, generic code doesn't need to */
|
||||
local_paca->kvm_hstate.saved_xirr = 0;
|
||||
|
||||
return kvmppc_deliver_irq_passthru(vcpu, xirr, irq_map, pimap);
|
||||
}
|
||||
|
||||
#else
|
||||
static inline int kvmppc_check_passthru(u32 xisr, __be32 xirr)
|
||||
{
|
||||
return 1;
|
||||
}
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Determine what sort of external interrupt is pending (if any).
|
||||
* Returns:
|
||||
* 0 if no interrupt is pending
|
||||
* 1 if an interrupt is pending that needs to be handled by the host
|
||||
* 2 Passthrough that needs completion in the host
|
||||
* -1 if there was a guest wakeup IPI (which has now been cleared)
|
||||
* -2 if there is PCI passthrough external interrupt that was handled
|
||||
*/
|
||||
|
||||
long kvmppc_read_intr(void)
|
||||
{
|
||||
unsigned long xics_phys;
|
||||
u32 h_xirr;
|
||||
__be32 xirr;
|
||||
u32 xisr;
|
||||
u8 host_ipi;
|
||||
|
||||
/* see if a host IPI is pending */
|
||||
host_ipi = local_paca->kvm_hstate.host_ipi;
|
||||
if (host_ipi)
|
||||
return 1;
|
||||
|
||||
/* Now read the interrupt from the ICP */
|
||||
xics_phys = local_paca->kvm_hstate.xics_phys;
|
||||
if (unlikely(!xics_phys))
|
||||
return 1;
|
||||
|
||||
/*
|
||||
* Save XIRR for later. Since we get control in reverse endian
|
||||
* on LE systems, save it byte reversed and fetch it back in
|
||||
* host endian. Note that xirr is the value read from the
|
||||
* XIRR register, while h_xirr is the host endian version.
|
||||
*/
|
||||
xirr = _lwzcix(xics_phys + XICS_XIRR);
|
||||
h_xirr = be32_to_cpu(xirr);
|
||||
local_paca->kvm_hstate.saved_xirr = h_xirr;
|
||||
xisr = h_xirr & 0xffffff;
|
||||
/*
|
||||
* Ensure that the store/load complete to guarantee all side
|
||||
* effects of loading from XIRR has completed
|
||||
*/
|
||||
smp_mb();
|
||||
|
||||
/* if nothing pending in the ICP */
|
||||
if (!xisr)
|
||||
return 0;
|
||||
|
||||
/* We found something in the ICP...
|
||||
*
|
||||
* If it is an IPI, clear the MFRR and EOI it.
|
||||
*/
|
||||
if (xisr == XICS_IPI) {
|
||||
_stbcix(xics_phys + XICS_MFRR, 0xff);
|
||||
_stwcix(xics_phys + XICS_XIRR, xirr);
|
||||
/*
|
||||
* Need to ensure side effects of above stores
|
||||
* complete before proceeding.
|
||||
*/
|
||||
smp_mb();
|
||||
|
||||
/*
|
||||
* We need to re-check host IPI now in case it got set in the
|
||||
* meantime. If it's clear, we bounce the interrupt to the
|
||||
* guest
|
||||
*/
|
||||
host_ipi = local_paca->kvm_hstate.host_ipi;
|
||||
if (unlikely(host_ipi != 0)) {
|
||||
/* We raced with the host,
|
||||
* we need to resend that IPI, bummer
|
||||
*/
|
||||
_stbcix(xics_phys + XICS_MFRR, IPI_PRIORITY);
|
||||
/* Let side effects complete */
|
||||
smp_mb();
|
||||
return 1;
|
||||
}
|
||||
|
||||
/* OK, it's an IPI for us */
|
||||
local_paca->kvm_hstate.saved_xirr = 0;
|
||||
return -1;
|
||||
}
|
||||
|
||||
return kvmppc_check_passthru(xisr, xirr);
|
||||
}
|
||||
|
|
|
@ -10,6 +10,7 @@
|
|||
#include <linux/kernel.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/kernel_stat.h>
|
||||
|
||||
#include <asm/kvm_book3s.h>
|
||||
#include <asm/kvm_ppc.h>
|
||||
|
@ -18,7 +19,10 @@
|
|||
#include <asm/debug.h>
|
||||
#include <asm/synch.h>
|
||||
#include <asm/cputhreads.h>
|
||||
#include <asm/pgtable.h>
|
||||
#include <asm/ppc-opcode.h>
|
||||
#include <asm/pnv-pci.h>
|
||||
#include <asm/opal.h>
|
||||
|
||||
#include "book3s_xics.h"
|
||||
|
||||
|
@ -26,9 +30,12 @@
|
|||
|
||||
int h_ipi_redirect = 1;
|
||||
EXPORT_SYMBOL(h_ipi_redirect);
|
||||
int kvm_irq_bypass = 1;
|
||||
EXPORT_SYMBOL(kvm_irq_bypass);
|
||||
|
||||
static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp,
|
||||
u32 new_irq);
|
||||
static int xics_opal_rm_set_server(unsigned int hw_irq, int server_cpu);
|
||||
|
||||
/* -- ICS routines -- */
|
||||
static void ics_rm_check_resend(struct kvmppc_xics *xics,
|
||||
|
@ -708,10 +715,123 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
|
|||
icp->rm_action |= XICS_RM_NOTIFY_EOI;
|
||||
icp->rm_eoied_irq = irq;
|
||||
}
|
||||
|
||||
if (state->host_irq) {
|
||||
++vcpu->stat.pthru_all;
|
||||
if (state->intr_cpu != -1) {
|
||||
int pcpu = raw_smp_processor_id();
|
||||
|
||||
pcpu = cpu_first_thread_sibling(pcpu);
|
||||
++vcpu->stat.pthru_host;
|
||||
if (state->intr_cpu != pcpu) {
|
||||
++vcpu->stat.pthru_bad_aff;
|
||||
xics_opal_rm_set_server(state->host_irq, pcpu);
|
||||
}
|
||||
state->intr_cpu = -1;
|
||||
}
|
||||
}
|
||||
bail:
|
||||
return check_too_hard(xics, icp);
|
||||
}
|
||||
|
||||
unsigned long eoi_rc;
|
||||
|
||||
static void icp_eoi(struct irq_chip *c, u32 hwirq, u32 xirr)
|
||||
{
|
||||
unsigned long xics_phys;
|
||||
int64_t rc;
|
||||
|
||||
rc = pnv_opal_pci_msi_eoi(c, hwirq);
|
||||
|
||||
if (rc)
|
||||
eoi_rc = rc;
|
||||
|
||||
iosync();
|
||||
|
||||
/* EOI it */
|
||||
xics_phys = local_paca->kvm_hstate.xics_phys;
|
||||
_stwcix(xics_phys + XICS_XIRR, xirr);
|
||||
}
|
||||
|
||||
static int xics_opal_rm_set_server(unsigned int hw_irq, int server_cpu)
|
||||
{
|
||||
unsigned int mangle_cpu = get_hard_smp_processor_id(server_cpu) << 2;
|
||||
|
||||
return opal_rm_set_xive(hw_irq, mangle_cpu, DEFAULT_PRIORITY);
|
||||
}
|
||||
|
||||
/*
|
||||
* Increment a per-CPU 32-bit unsigned integer variable.
|
||||
* Safe to call in real-mode. Handles vmalloc'ed addresses
|
||||
*
|
||||
* ToDo: Make this work for any integral type
|
||||
*/
|
||||
|
||||
static inline void this_cpu_inc_rm(unsigned int __percpu *addr)
|
||||
{
|
||||
unsigned long l;
|
||||
unsigned int *raddr;
|
||||
int cpu = smp_processor_id();
|
||||
|
||||
raddr = per_cpu_ptr(addr, cpu);
|
||||
l = (unsigned long)raddr;
|
||||
|
||||
if (REGION_ID(l) == VMALLOC_REGION_ID) {
|
||||
l = vmalloc_to_phys(raddr);
|
||||
raddr = (unsigned int *)l;
|
||||
}
|
||||
++*raddr;
|
||||
}
|
||||
|
||||
/*
|
||||
* We don't try to update the flags in the irq_desc 'istate' field in
|
||||
* here as would happen in the normal IRQ handling path for several reasons:
|
||||
* - state flags represent internal IRQ state and are not expected to be
|
||||
* updated outside the IRQ subsystem
|
||||
* - more importantly, these are useful for edge triggered interrupts,
|
||||
* IRQ probing, etc., but we are only handling MSI/MSIx interrupts here
|
||||
* and these states shouldn't apply to us.
|
||||
*
|
||||
* However, we do update irq_stats - we somewhat duplicate the code in
|
||||
* kstat_incr_irqs_this_cpu() for this since this function is defined
|
||||
* in irq/internal.h which we don't want to include here.
|
||||
* The only difference is that desc->kstat_irqs is an allocated per CPU
|
||||
* variable and could have been vmalloc'ed, so we can't directly
|
||||
* call __this_cpu_inc() on it. The kstat structure is a static
|
||||
* per CPU variable and it should be accessible by real-mode KVM.
|
||||
*
|
||||
*/
|
||||
static void kvmppc_rm_handle_irq_desc(struct irq_desc *desc)
|
||||
{
|
||||
this_cpu_inc_rm(desc->kstat_irqs);
|
||||
__this_cpu_inc(kstat.irqs_sum);
|
||||
}
|
||||
|
||||
long kvmppc_deliver_irq_passthru(struct kvm_vcpu *vcpu,
|
||||
u32 xirr,
|
||||
struct kvmppc_irq_map *irq_map,
|
||||
struct kvmppc_passthru_irqmap *pimap)
|
||||
{
|
||||
struct kvmppc_xics *xics;
|
||||
struct kvmppc_icp *icp;
|
||||
u32 irq;
|
||||
|
||||
irq = irq_map->v_hwirq;
|
||||
xics = vcpu->kvm->arch.xics;
|
||||
icp = vcpu->arch.icp;
|
||||
|
||||
kvmppc_rm_handle_irq_desc(irq_map->desc);
|
||||
icp_rm_deliver_irq(xics, icp, irq);
|
||||
|
||||
/* EOI the interrupt */
|
||||
icp_eoi(irq_desc_get_chip(irq_map->desc), irq_map->r_hwirq, xirr);
|
||||
|
||||
if (check_too_hard(xics, icp) == H_TOO_HARD)
|
||||
return 2;
|
||||
else
|
||||
return -2;
|
||||
}
|
||||
|
||||
/* --- Non-real mode XICS-related built-in routines --- */
|
||||
|
||||
/**
|
||||
|
|
|
@ -221,6 +221,13 @@ kvmppc_primary_no_guest:
|
|||
li r3, 0 /* Don't wake on privileged (OS) doorbell */
|
||||
b kvm_do_nap
|
||||
|
||||
/*
|
||||
* kvm_novcpu_wakeup
|
||||
* Entered from kvm_start_guest if kvm_hstate.napping is set
|
||||
* to NAPPING_NOVCPU
|
||||
* r2 = kernel TOC
|
||||
* r13 = paca
|
||||
*/
|
||||
kvm_novcpu_wakeup:
|
||||
ld r1, HSTATE_HOST_R1(r13)
|
||||
ld r5, HSTATE_KVM_VCORE(r13)
|
||||
|
@ -230,6 +237,13 @@ kvm_novcpu_wakeup:
|
|||
/* check the wake reason */
|
||||
bl kvmppc_check_wake_reason
|
||||
|
||||
/*
|
||||
* Restore volatile registers since we could have called
|
||||
* a C routine in kvmppc_check_wake_reason.
|
||||
* r5 = VCORE
|
||||
*/
|
||||
ld r5, HSTATE_KVM_VCORE(r13)
|
||||
|
||||
/* see if any other thread is already exiting */
|
||||
lwz r0, VCORE_ENTRY_EXIT(r5)
|
||||
cmpwi r0, 0x100
|
||||
|
@ -322,6 +336,11 @@ kvm_start_guest:
|
|||
|
||||
/* Check the wake reason in SRR1 to see why we got here */
|
||||
bl kvmppc_check_wake_reason
|
||||
/*
|
||||
* kvmppc_check_wake_reason could invoke a C routine, but we
|
||||
* have no volatile registers to restore when we return.
|
||||
*/
|
||||
|
||||
cmpdi r3, 0
|
||||
bge kvm_no_guest
|
||||
|
||||
|
@ -625,9 +644,11 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_207S)
|
|||
38:
|
||||
|
||||
BEGIN_FTR_SECTION
|
||||
/* DPDES is shared between threads */
|
||||
/* DPDES and VTB are shared between threads */
|
||||
ld r8, VCORE_DPDES(r5)
|
||||
ld r7, VCORE_VTB(r5)
|
||||
mtspr SPRN_DPDES, r8
|
||||
mtspr SPRN_VTB, r7
|
||||
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
|
||||
|
||||
/* Mark the subcore state as inside guest */
|
||||
|
@ -787,10 +808,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
|
|||
mtspr SPRN_CIABR, r7
|
||||
mtspr SPRN_TAR, r8
|
||||
ld r5, VCPU_IC(r4)
|
||||
ld r6, VCPU_VTB(r4)
|
||||
mtspr SPRN_IC, r5
|
||||
mtspr SPRN_VTB, r6
|
||||
ld r8, VCPU_EBBHR(r4)
|
||||
mtspr SPRN_IC, r5
|
||||
mtspr SPRN_EBBHR, r8
|
||||
ld r5, VCPU_EBBRR(r4)
|
||||
ld r6, VCPU_BESCR(r4)
|
||||
|
@ -881,6 +900,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
|
|||
cmpwi r3, 512 /* 1 microsecond */
|
||||
blt hdec_soon
|
||||
|
||||
deliver_guest_interrupt:
|
||||
ld r6, VCPU_CTR(r4)
|
||||
ld r7, VCPU_XER(r4)
|
||||
|
||||
|
@ -895,7 +915,6 @@ kvmppc_cede_reentry: /* r4 = vcpu, r13 = paca */
|
|||
mtspr SPRN_SRR0, r6
|
||||
mtspr SPRN_SRR1, r7
|
||||
|
||||
deliver_guest_interrupt:
|
||||
/* r11 = vcpu->arch.msr & ~MSR_HV */
|
||||
rldicl r11, r11, 63 - MSR_HV_LG, 1
|
||||
rotldi r11, r11, 1 + MSR_HV_LG
|
||||
|
@ -1155,10 +1174,54 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
|
|||
* set, we know the host wants us out so let's do it now
|
||||
*/
|
||||
bl kvmppc_read_intr
|
||||
|
||||
/*
|
||||
* Restore the active volatile registers after returning from
|
||||
* a C function.
|
||||
*/
|
||||
ld r9, HSTATE_KVM_VCPU(r13)
|
||||
li r12, BOOK3S_INTERRUPT_EXTERNAL
|
||||
|
||||
/*
|
||||
* kvmppc_read_intr return codes:
|
||||
*
|
||||
* Exit to host (r3 > 0)
|
||||
* 1 An interrupt is pending that needs to be handled by the host
|
||||
* Exit guest and return to host by branching to guest_exit_cont
|
||||
*
|
||||
* 2 Passthrough that needs completion in the host
|
||||
* Exit guest and return to host by branching to guest_exit_cont
|
||||
* However, we also set r12 to BOOK3S_INTERRUPT_HV_RM_HARD
|
||||
* to indicate to the host to complete handling the interrupt
|
||||
*
|
||||
* Before returning to guest, we check if any CPU is heading out
|
||||
* to the host and if so, we head out also. If no CPUs are heading
|
||||
* check return values <= 0.
|
||||
*
|
||||
* Return to guest (r3 <= 0)
|
||||
* 0 No external interrupt is pending
|
||||
* -1 A guest wakeup IPI (which has now been cleared)
|
||||
* In either case, we return to guest to deliver any pending
|
||||
* guest interrupts.
|
||||
*
|
||||
* -2 A PCI passthrough external interrupt was handled
|
||||
* (interrupt was delivered directly to guest)
|
||||
* Return to guest to deliver any pending guest interrupts.
|
||||
*/
|
||||
|
||||
cmpdi r3, 1
|
||||
ble 1f
|
||||
|
||||
/* Return code = 2 */
|
||||
li r12, BOOK3S_INTERRUPT_HV_RM_HARD
|
||||
stw r12, VCPU_TRAP(r9)
|
||||
b guest_exit_cont
|
||||
|
||||
1: /* Return code <= 1 */
|
||||
cmpdi r3, 0
|
||||
bgt guest_exit_cont
|
||||
|
||||
/* Check if any CPU is heading out to the host, if so head out too */
|
||||
/* Return code <= 0 */
|
||||
4: ld r5, HSTATE_KVM_VCORE(r13)
|
||||
lwz r0, VCORE_ENTRY_EXIT(r5)
|
||||
cmpwi r0, 0x100
|
||||
|
@ -1271,10 +1334,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
|
|||
stw r6, VCPU_PSPB(r9)
|
||||
std r7, VCPU_FSCR(r9)
|
||||
mfspr r5, SPRN_IC
|
||||
mfspr r6, SPRN_VTB
|
||||
mfspr r7, SPRN_TAR
|
||||
std r5, VCPU_IC(r9)
|
||||
std r6, VCPU_VTB(r9)
|
||||
std r7, VCPU_TAR(r9)
|
||||
mfspr r8, SPRN_EBBHR
|
||||
std r8, VCPU_EBBHR(r9)
|
||||
|
@ -1501,9 +1562,11 @@ kvmhv_switch_to_host:
|
|||
isync
|
||||
|
||||
BEGIN_FTR_SECTION
|
||||
/* DPDES is shared between threads */
|
||||
/* DPDES and VTB are shared between threads */
|
||||
mfspr r7, SPRN_DPDES
|
||||
mfspr r8, SPRN_VTB
|
||||
std r7, VCORE_DPDES(r5)
|
||||
std r8, VCORE_VTB(r5)
|
||||
/* clear DPDES so we don't get guest doorbells in the host */
|
||||
li r8, 0
|
||||
mtspr SPRN_DPDES, r8
|
||||
|
@ -2217,6 +2280,16 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
|
|||
/* Check the wake reason in SRR1 to see why we got here */
|
||||
bl kvmppc_check_wake_reason
|
||||
|
||||
/*
|
||||
* Restore volatile registers since we could have called a
|
||||
* C routine in kvmppc_check_wake_reason
|
||||
* r4 = VCPU
|
||||
* r3 tells us whether we need to return to host or not
|
||||
* WARNING: it gets checked further down:
|
||||
* should not modify r3 until this check is done.
|
||||
*/
|
||||
ld r4, HSTATE_KVM_VCPU(r13)
|
||||
|
||||
/* clear our bit in vcore->napping_threads */
|
||||
34: ld r5,HSTATE_KVM_VCORE(r13)
|
||||
lbz r7,HSTATE_PTID(r13)
|
||||
|
@ -2230,7 +2303,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
|
|||
li r0,0
|
||||
stb r0,HSTATE_NAPPING(r13)
|
||||
|
||||
/* See if the wake reason means we need to exit */
|
||||
/* See if the wake reason saved in r3 means we need to exit */
|
||||
stw r12, VCPU_TRAP(r4)
|
||||
mr r9, r4
|
||||
cmpdi r3, 0
|
||||
|
@ -2297,10 +2370,14 @@ machine_check_realmode:
|
|||
* 0 if nothing needs to be done
|
||||
* 1 if something happened that needs to be handled by the host
|
||||
* -1 if there was a guest wakeup (IPI or msgsnd)
|
||||
* -2 if we handled a PCI passthrough interrupt (returned by
|
||||
* kvmppc_read_intr only)
|
||||
*
|
||||
* Also sets r12 to the interrupt vector for any interrupt that needs
|
||||
* to be handled now by the host (0x500 for external interrupt), or zero.
|
||||
* Modifies r0, r6, r7, r8.
|
||||
* Modifies all volatile registers (since it may call a C function).
|
||||
* This routine calls kvmppc_read_intr, a C function, if an external
|
||||
* interrupt is pending.
|
||||
*/
|
||||
kvmppc_check_wake_reason:
|
||||
mfspr r6, SPRN_SRR1
|
||||
|
@ -2310,8 +2387,7 @@ FTR_SECTION_ELSE
|
|||
rlwinm r6, r6, 45-31, 0xe /* P7 wake reason field is 3 bits */
|
||||
ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_207S)
|
||||
cmpwi r6, 8 /* was it an external interrupt? */
|
||||
li r12, BOOK3S_INTERRUPT_EXTERNAL
|
||||
beq kvmppc_read_intr /* if so, see what it was */
|
||||
beq 7f /* if so, see what it was */
|
||||
li r3, 0
|
||||
li r12, 0
|
||||
cmpwi r6, 6 /* was it the decrementer? */
|
||||
|
@ -2350,83 +2426,28 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
|
|||
li r3, 1
|
||||
blr
|
||||
|
||||
/*
|
||||
* Determine what sort of external interrupt is pending (if any).
|
||||
* Returns:
|
||||
* 0 if no interrupt is pending
|
||||
* 1 if an interrupt is pending that needs to be handled by the host
|
||||
* -1 if there was a guest wakeup IPI (which has now been cleared)
|
||||
* Modifies r0, r6, r7, r8, returns value in r3.
|
||||
*/
|
||||
kvmppc_read_intr:
|
||||
/* see if a host IPI is pending */
|
||||
li r3, 1
|
||||
lbz r0, HSTATE_HOST_IPI(r13)
|
||||
cmpwi r0, 0
|
||||
bne 1f
|
||||
/* external interrupt - create a stack frame so we can call C */
|
||||
7: mflr r0
|
||||
std r0, PPC_LR_STKOFF(r1)
|
||||
stdu r1, -PPC_MIN_STKFRM(r1)
|
||||
bl kvmppc_read_intr
|
||||
nop
|
||||
li r12, BOOK3S_INTERRUPT_EXTERNAL
|
||||
cmpdi r3, 1
|
||||
ble 1f
|
||||
|
||||
/* Now read the interrupt from the ICP */
|
||||
ld r6, HSTATE_XICS_PHYS(r13)
|
||||
li r7, XICS_XIRR
|
||||
cmpdi r6, 0
|
||||
beq- 1f
|
||||
lwzcix r0, r6, r7
|
||||
/*
|
||||
* Save XIRR for later. Since we get in in reverse endian on LE
|
||||
* systems, save it byte reversed and fetch it back in host endian.
|
||||
* Return code of 2 means PCI passthrough interrupt, but
|
||||
* we need to return back to host to complete handling the
|
||||
* interrupt. Trap reason is expected in r12 by guest
|
||||
* exit code.
|
||||
*/
|
||||
li r3, HSTATE_SAVED_XIRR
|
||||
STWX_BE r0, r3, r13
|
||||
#ifdef __LITTLE_ENDIAN__
|
||||
lwz r3, HSTATE_SAVED_XIRR(r13)
|
||||
#else
|
||||
mr r3, r0
|
||||
#endif
|
||||
rlwinm. r3, r3, 0, 0xffffff
|
||||
sync
|
||||
beq 1f /* if nothing pending in the ICP */
|
||||
|
||||
/* We found something in the ICP...
|
||||
*
|
||||
* If it's not an IPI, stash it in the PACA and return to
|
||||
* the host, we don't (yet) handle directing real external
|
||||
* interrupts directly to the guest
|
||||
*/
|
||||
cmpwi r3, XICS_IPI /* if there is, is it an IPI? */
|
||||
bne 42f
|
||||
|
||||
/* It's an IPI, clear the MFRR and EOI it */
|
||||
li r3, 0xff
|
||||
li r8, XICS_MFRR
|
||||
stbcix r3, r6, r8 /* clear the IPI */
|
||||
stwcix r0, r6, r7 /* EOI it */
|
||||
sync
|
||||
|
||||
/* We need to re-check host IPI now in case it got set in the
|
||||
* meantime. If it's clear, we bounce the interrupt to the
|
||||
* guest
|
||||
*/
|
||||
lbz r0, HSTATE_HOST_IPI(r13)
|
||||
cmpwi r0, 0
|
||||
bne- 43f
|
||||
|
||||
/* OK, it's an IPI for us */
|
||||
li r12, 0
|
||||
li r3, -1
|
||||
1: blr
|
||||
|
||||
42: /* It's not an IPI and it's for the host. We saved a copy of XIRR in
|
||||
* the PACA earlier, it will be picked up by the host ICP driver
|
||||
*/
|
||||
li r3, 1
|
||||
b 1b
|
||||
|
||||
43: /* We raced with the host, we need to resend that IPI, bummer */
|
||||
li r0, IPI_PRIORITY
|
||||
stbcix r0, r6, r8 /* set the IPI */
|
||||
sync
|
||||
li r3, 1
|
||||
b 1b
|
||||
li r12, BOOK3S_INTERRUPT_HV_RM_HARD
|
||||
1:
|
||||
ld r0, PPC_MIN_STKFRM+PPC_LR_STKOFF(r1)
|
||||
addi r1, r1, PPC_MIN_STKFRM
|
||||
mtlr r0
|
||||
blr
|
||||
|
||||
/*
|
||||
* Save away FP, VMX and VSX registers.
|
||||
|
|
|
@ -226,7 +226,7 @@ void kvmppc_copy_from_svcpu(struct kvm_vcpu *vcpu,
|
|||
*/
|
||||
vcpu->arch.purr += get_tb() - vcpu->arch.entry_tb;
|
||||
vcpu->arch.spurr += get_tb() - vcpu->arch.entry_tb;
|
||||
vcpu->arch.vtb += get_vtb() - vcpu->arch.entry_vtb;
|
||||
to_book3s(vcpu)->vtb += get_vtb() - vcpu->arch.entry_vtb;
|
||||
if (cpu_has_feature(CPU_FTR_ARCH_207S))
|
||||
vcpu->arch.ic += mfspr(SPRN_IC) - vcpu->arch.entry_ic;
|
||||
svcpu->in_use = false;
|
||||
|
@ -448,6 +448,8 @@ void kvmppc_set_pvr_pr(struct kvm_vcpu *vcpu, u32 pvr)
|
|||
case PVR_POWER7:
|
||||
case PVR_POWER7p:
|
||||
case PVR_POWER8:
|
||||
case PVR_POWER8E:
|
||||
case PVR_POWER8NVL:
|
||||
vcpu->arch.hflags |= BOOK3S_HFLAG_MULTI_PGSIZE |
|
||||
BOOK3S_HFLAG_NEW_TLBIE;
|
||||
break;
|
||||
|
@ -1361,6 +1363,9 @@ static int kvmppc_get_one_reg_pr(struct kvm_vcpu *vcpu, u64 id,
|
|||
case KVM_REG_PPC_HIOR:
|
||||
*val = get_reg_val(id, to_book3s(vcpu)->hior);
|
||||
break;
|
||||
case KVM_REG_PPC_VTB:
|
||||
*val = get_reg_val(id, to_book3s(vcpu)->vtb);
|
||||
break;
|
||||
case KVM_REG_PPC_LPCR:
|
||||
case KVM_REG_PPC_LPCR_64:
|
||||
/*
|
||||
|
@ -1397,6 +1402,9 @@ static int kvmppc_set_one_reg_pr(struct kvm_vcpu *vcpu, u64 id,
|
|||
to_book3s(vcpu)->hior = set_reg_val(id, *val);
|
||||
to_book3s(vcpu)->hior_explicit = true;
|
||||
break;
|
||||
case KVM_REG_PPC_VTB:
|
||||
to_book3s(vcpu)->vtb = set_reg_val(id, *val);
|
||||
break;
|
||||
case KVM_REG_PPC_LPCR:
|
||||
case KVM_REG_PPC_LPCR_64:
|
||||
kvmppc_set_lpcr_pr(vcpu, set_reg_val(id, *val));
|
||||
|
|
|
@ -99,6 +99,10 @@ static int ics_deliver_irq(struct kvmppc_xics *xics, u32 irq, u32 level)
|
|||
return 0;
|
||||
}
|
||||
|
||||
/* Record which CPU this arrived on for passed-through interrupts */
|
||||
if (state->host_irq)
|
||||
state->intr_cpu = raw_smp_processor_id();
|
||||
|
||||
/* Attempt delivery */
|
||||
icp_deliver_irq(xics, NULL, irq);
|
||||
|
||||
|
@ -812,7 +816,7 @@ static noinline int kvmppc_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
|
|||
return H_SUCCESS;
|
||||
}
|
||||
|
||||
static noinline int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall)
|
||||
int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall)
|
||||
{
|
||||
struct kvmppc_xics *xics = vcpu->kvm->arch.xics;
|
||||
struct kvmppc_icp *icp = vcpu->arch.icp;
|
||||
|
@ -841,6 +845,7 @@ static noinline int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall)
|
|||
|
||||
return H_SUCCESS;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_xics_rm_complete);
|
||||
|
||||
int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 req)
|
||||
{
|
||||
|
@ -892,6 +897,21 @@ EXPORT_SYMBOL_GPL(kvmppc_xics_hcall);
|
|||
|
||||
/* -- Initialisation code etc. -- */
|
||||
|
||||
static void xics_debugfs_irqmap(struct seq_file *m,
|
||||
struct kvmppc_passthru_irqmap *pimap)
|
||||
{
|
||||
int i;
|
||||
|
||||
if (!pimap)
|
||||
return;
|
||||
seq_printf(m, "========\nPIRQ mappings: %d maps\n===========\n",
|
||||
pimap->n_mapped);
|
||||
for (i = 0; i < pimap->n_mapped; i++) {
|
||||
seq_printf(m, "r_hwirq=%x, v_hwirq=%x\n",
|
||||
pimap->mapped[i].r_hwirq, pimap->mapped[i].v_hwirq);
|
||||
}
|
||||
}
|
||||
|
||||
static int xics_debug_show(struct seq_file *m, void *private)
|
||||
{
|
||||
struct kvmppc_xics *xics = m->private;
|
||||
|
@ -913,6 +933,8 @@ static int xics_debug_show(struct seq_file *m, void *private)
|
|||
t_check_resend = 0;
|
||||
t_reject = 0;
|
||||
|
||||
xics_debugfs_irqmap(m, kvm->arch.pimap);
|
||||
|
||||
seq_printf(m, "=========\nICP state\n=========\n");
|
||||
|
||||
kvm_for_each_vcpu(i, vcpu, kvm) {
|
||||
|
@ -1252,6 +1274,8 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level,
|
|||
{
|
||||
struct kvmppc_xics *xics = kvm->arch.xics;
|
||||
|
||||
if (!xics)
|
||||
return -ENODEV;
|
||||
return ics_deliver_irq(xics, irq, level);
|
||||
}
|
||||
|
||||
|
@ -1418,3 +1442,34 @@ int kvm_irq_map_chip_pin(struct kvm *kvm, unsigned irqchip, unsigned pin)
|
|||
{
|
||||
return pin;
|
||||
}
|
||||
|
||||
void kvmppc_xics_set_mapped(struct kvm *kvm, unsigned long irq,
|
||||
unsigned long host_irq)
|
||||
{
|
||||
struct kvmppc_xics *xics = kvm->arch.xics;
|
||||
struct kvmppc_ics *ics;
|
||||
u16 idx;
|
||||
|
||||
ics = kvmppc_xics_find_ics(xics, irq, &idx);
|
||||
if (!ics)
|
||||
return;
|
||||
|
||||
ics->irq_state[idx].host_irq = host_irq;
|
||||
ics->irq_state[idx].intr_cpu = -1;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_xics_set_mapped);
|
||||
|
||||
void kvmppc_xics_clr_mapped(struct kvm *kvm, unsigned long irq,
|
||||
unsigned long host_irq)
|
||||
{
|
||||
struct kvmppc_xics *xics = kvm->arch.xics;
|
||||
struct kvmppc_ics *ics;
|
||||
u16 idx;
|
||||
|
||||
ics = kvmppc_xics_find_ics(xics, irq, &idx);
|
||||
if (!ics)
|
||||
return;
|
||||
|
||||
ics->irq_state[idx].host_irq = 0;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_xics_clr_mapped);
|
||||
|
|
|
@ -42,6 +42,8 @@ struct ics_irq_state {
|
|||
u8 lsi; /* level-sensitive interrupt */
|
||||
u8 asserted; /* Only for LSI */
|
||||
u8 exists;
|
||||
int intr_cpu;
|
||||
u32 host_irq;
|
||||
};
|
||||
|
||||
/* Atomic ICP state, updated with a single compare & swap */
|
||||
|
|
|
@ -2038,7 +2038,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
|
|||
if (type == KVMPPC_DEBUG_NONE)
|
||||
continue;
|
||||
|
||||
if (type & !(KVMPPC_DEBUG_WATCH_READ |
|
||||
if (type & ~(KVMPPC_DEBUG_WATCH_READ |
|
||||
KVMPPC_DEBUG_WATCH_WRITE |
|
||||
KVMPPC_DEBUG_BREAKPOINT))
|
||||
return -EINVAL;
|
||||
|
|
|
@ -743,7 +743,7 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
|
|||
char *virt;
|
||||
struct page **pages;
|
||||
struct tlbe_priv *privs[2] = {};
|
||||
u64 *g2h_bitmap = NULL;
|
||||
u64 *g2h_bitmap;
|
||||
size_t array_len;
|
||||
u32 sets;
|
||||
int num_pages, ret, i;
|
||||
|
@ -779,41 +779,44 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
|
|||
|
||||
num_pages = DIV_ROUND_UP(cfg->array + array_len - 1, PAGE_SIZE) -
|
||||
cfg->array / PAGE_SIZE;
|
||||
pages = kmalloc(sizeof(struct page *) * num_pages, GFP_KERNEL);
|
||||
pages = kmalloc_array(num_pages, sizeof(*pages), GFP_KERNEL);
|
||||
if (!pages)
|
||||
return -ENOMEM;
|
||||
|
||||
ret = get_user_pages_fast(cfg->array, num_pages, 1, pages);
|
||||
if (ret < 0)
|
||||
goto err_pages;
|
||||
goto free_pages;
|
||||
|
||||
if (ret != num_pages) {
|
||||
num_pages = ret;
|
||||
ret = -EFAULT;
|
||||
goto err_put_page;
|
||||
goto put_pages;
|
||||
}
|
||||
|
||||
virt = vmap(pages, num_pages, VM_MAP, PAGE_KERNEL);
|
||||
if (!virt) {
|
||||
ret = -ENOMEM;
|
||||
goto err_put_page;
|
||||
goto put_pages;
|
||||
}
|
||||
|
||||
privs[0] = kzalloc(sizeof(struct tlbe_priv) * params.tlb_sizes[0],
|
||||
GFP_KERNEL);
|
||||
privs[1] = kzalloc(sizeof(struct tlbe_priv) * params.tlb_sizes[1],
|
||||
GFP_KERNEL);
|
||||
|
||||
if (!privs[0] || !privs[1]) {
|
||||
privs[0] = kcalloc(params.tlb_sizes[0], sizeof(*privs[0]), GFP_KERNEL);
|
||||
if (!privs[0]) {
|
||||
ret = -ENOMEM;
|
||||
goto err_privs;
|
||||
goto put_pages;
|
||||
}
|
||||
|
||||
g2h_bitmap = kzalloc(sizeof(u64) * params.tlb_sizes[1],
|
||||
privs[1] = kcalloc(params.tlb_sizes[1], sizeof(*privs[1]), GFP_KERNEL);
|
||||
if (!privs[1]) {
|
||||
ret = -ENOMEM;
|
||||
goto free_privs_first;
|
||||
}
|
||||
|
||||
g2h_bitmap = kcalloc(params.tlb_sizes[1],
|
||||
sizeof(*g2h_bitmap),
|
||||
GFP_KERNEL);
|
||||
if (!g2h_bitmap) {
|
||||
ret = -ENOMEM;
|
||||
goto err_privs;
|
||||
goto free_privs_second;
|
||||
}
|
||||
|
||||
free_gtlb(vcpu_e500);
|
||||
|
@ -845,16 +848,14 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu,
|
|||
|
||||
kvmppc_recalc_tlb1map_range(vcpu_e500);
|
||||
return 0;
|
||||
|
||||
err_privs:
|
||||
kfree(privs[0]);
|
||||
free_privs_second:
|
||||
kfree(privs[1]);
|
||||
|
||||
err_put_page:
|
||||
free_privs_first:
|
||||
kfree(privs[0]);
|
||||
put_pages:
|
||||
for (i = 0; i < num_pages; i++)
|
||||
put_page(pages[i]);
|
||||
|
||||
err_pages:
|
||||
free_pages:
|
||||
kfree(pages);
|
||||
return ret;
|
||||
}
|
||||
|
@ -904,11 +905,9 @@ static int vcpu_mmu_init(struct kvm_vcpu *vcpu,
|
|||
int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
|
||||
{
|
||||
struct kvm_vcpu *vcpu = &vcpu_e500->vcpu;
|
||||
int entry_size = sizeof(struct kvm_book3e_206_tlb_entry);
|
||||
int entries = KVM_E500_TLB0_SIZE + KVM_E500_TLB1_SIZE;
|
||||
|
||||
if (e500_mmu_host_init(vcpu_e500))
|
||||
goto err;
|
||||
goto free_vcpu;
|
||||
|
||||
vcpu_e500->gtlb_params[0].entries = KVM_E500_TLB0_SIZE;
|
||||
vcpu_e500->gtlb_params[1].entries = KVM_E500_TLB1_SIZE;
|
||||
|
@ -920,37 +919,39 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500)
|
|||
vcpu_e500->gtlb_params[1].ways = KVM_E500_TLB1_SIZE;
|
||||
vcpu_e500->gtlb_params[1].sets = 1;
|
||||
|
||||
vcpu_e500->gtlb_arch = kmalloc(entries * entry_size, GFP_KERNEL);
|
||||
vcpu_e500->gtlb_arch = kmalloc_array(KVM_E500_TLB0_SIZE +
|
||||
KVM_E500_TLB1_SIZE,
|
||||
sizeof(*vcpu_e500->gtlb_arch),
|
||||
GFP_KERNEL);
|
||||
if (!vcpu_e500->gtlb_arch)
|
||||
return -ENOMEM;
|
||||
|
||||
vcpu_e500->gtlb_offset[0] = 0;
|
||||
vcpu_e500->gtlb_offset[1] = KVM_E500_TLB0_SIZE;
|
||||
|
||||
vcpu_e500->gtlb_priv[0] = kzalloc(sizeof(struct tlbe_ref) *
|
||||
vcpu_e500->gtlb_params[0].entries,
|
||||
vcpu_e500->gtlb_priv[0] = kcalloc(vcpu_e500->gtlb_params[0].entries,
|
||||
sizeof(struct tlbe_ref),
|
||||
GFP_KERNEL);
|
||||
if (!vcpu_e500->gtlb_priv[0])
|
||||
goto err;
|
||||
goto free_vcpu;
|
||||
|
||||
vcpu_e500->gtlb_priv[1] = kzalloc(sizeof(struct tlbe_ref) *
|
||||
vcpu_e500->gtlb_params[1].entries,
|
||||
vcpu_e500->gtlb_priv[1] = kcalloc(vcpu_e500->gtlb_params[1].entries,
|
||||
sizeof(struct tlbe_ref),
|
||||
GFP_KERNEL);
|
||||
if (!vcpu_e500->gtlb_priv[1])
|
||||
goto err;
|
||||
goto free_vcpu;
|
||||
|
||||
vcpu_e500->g2h_tlb1_map = kzalloc(sizeof(u64) *
|
||||
vcpu_e500->gtlb_params[1].entries,
|
||||
vcpu_e500->g2h_tlb1_map = kcalloc(vcpu_e500->gtlb_params[1].entries,
|
||||
sizeof(*vcpu_e500->g2h_tlb1_map),
|
||||
GFP_KERNEL);
|
||||
if (!vcpu_e500->g2h_tlb1_map)
|
||||
goto err;
|
||||
goto free_vcpu;
|
||||
|
||||
vcpu_mmu_init(vcpu, vcpu_e500->gtlb_params);
|
||||
|
||||
kvmppc_recalc_tlb1map_range(vcpu_e500);
|
||||
return 0;
|
||||
|
||||
err:
|
||||
free_vcpu:
|
||||
free_gtlb(vcpu_e500);
|
||||
return -1;
|
||||
}
|
||||
|
|
|
@ -27,6 +27,8 @@
|
|||
#include <linux/slab.h>
|
||||
#include <linux/file.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/irqbypass.h>
|
||||
#include <linux/kvm_irqfd.h>
|
||||
#include <asm/cputable.h>
|
||||
#include <asm/uaccess.h>
|
||||
#include <asm/kvm_ppc.h>
|
||||
|
@ -436,6 +438,16 @@ err_out:
|
|||
return -EINVAL;
|
||||
}
|
||||
|
||||
bool kvm_arch_has_vcpu_debugfs(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
void kvm_arch_destroy_vm(struct kvm *kvm)
|
||||
{
|
||||
unsigned int i;
|
||||
|
@ -739,6 +751,42 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
|
|||
#endif
|
||||
}
|
||||
|
||||
/*
|
||||
* irq_bypass_add_producer and irq_bypass_del_producer are only
|
||||
* useful if the architecture supports PCI passthrough.
|
||||
* irq_bypass_stop and irq_bypass_start are not needed and so
|
||||
* kvm_ops are not defined for them.
|
||||
*/
|
||||
bool kvm_arch_has_irq_bypass(void)
|
||||
{
|
||||
return ((kvmppc_hv_ops && kvmppc_hv_ops->irq_bypass_add_producer) ||
|
||||
(kvmppc_pr_ops && kvmppc_pr_ops->irq_bypass_add_producer));
|
||||
}
|
||||
|
||||
int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
|
||||
struct irq_bypass_producer *prod)
|
||||
{
|
||||
struct kvm_kernel_irqfd *irqfd =
|
||||
container_of(cons, struct kvm_kernel_irqfd, consumer);
|
||||
struct kvm *kvm = irqfd->kvm;
|
||||
|
||||
if (kvm->arch.kvm_ops->irq_bypass_add_producer)
|
||||
return kvm->arch.kvm_ops->irq_bypass_add_producer(cons, prod);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
|
||||
struct irq_bypass_producer *prod)
|
||||
{
|
||||
struct kvm_kernel_irqfd *irqfd =
|
||||
container_of(cons, struct kvm_kernel_irqfd, consumer);
|
||||
struct kvm *kvm = irqfd->kvm;
|
||||
|
||||
if (kvm->arch.kvm_ops->irq_bypass_del_producer)
|
||||
kvm->arch.kvm_ops->irq_bypass_del_producer(cons, prod);
|
||||
}
|
||||
|
||||
static void kvmppc_complete_mmio_load(struct kvm_vcpu *vcpu,
|
||||
struct kvm_run *run)
|
||||
{
|
||||
|
@ -1167,6 +1215,19 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
|
|||
return r;
|
||||
}
|
||||
|
||||
bool kvm_arch_intc_initialized(struct kvm *kvm)
|
||||
{
|
||||
#ifdef CONFIG_KVM_MPIC
|
||||
if (kvm->arch.mpic)
|
||||
return true;
|
||||
#endif
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
if (kvm->arch.xics)
|
||||
return true;
|
||||
#endif
|
||||
return false;
|
||||
}
|
||||
|
||||
int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
|
||||
struct kvm_mp_state *mp_state)
|
||||
{
|
||||
|
|
|
@ -432,6 +432,28 @@ TRACE_EVENT(kvmppc_vcore_blocked,
|
|||
__entry->runner_vcpu, __entry->n_runnable, __entry->tgid)
|
||||
);
|
||||
|
||||
TRACE_EVENT(kvmppc_vcore_wakeup,
|
||||
TP_PROTO(int do_sleep, __u64 ns),
|
||||
|
||||
TP_ARGS(do_sleep, ns),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field(__u64, ns)
|
||||
__field(int, waited)
|
||||
__field(pid_t, tgid)
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->ns = ns;
|
||||
__entry->waited = do_sleep;
|
||||
__entry->tgid = current->tgid;
|
||||
),
|
||||
|
||||
TP_printk("%s time %lld ns, tgid=%d",
|
||||
__entry->waited ? "wait" : "poll",
|
||||
__entry->ns, __entry->tgid)
|
||||
);
|
||||
|
||||
TRACE_EVENT(kvmppc_run_vcpu_enter,
|
||||
TP_PROTO(struct kvm_vcpu *vcpu),
|
||||
|
||||
|
|
|
@ -493,36 +493,6 @@ static void native_hugepage_invalidate(unsigned long vsid,
|
|||
}
|
||||
#endif
|
||||
|
||||
static inline int __hpte_actual_psize(unsigned int lp, int psize)
|
||||
{
|
||||
int i, shift;
|
||||
unsigned int mask;
|
||||
|
||||
/* start from 1 ignoring MMU_PAGE_4K */
|
||||
for (i = 1; i < MMU_PAGE_COUNT; i++) {
|
||||
|
||||
/* invalid penc */
|
||||
if (mmu_psize_defs[psize].penc[i] == -1)
|
||||
continue;
|
||||
/*
|
||||
* encoding bits per actual page size
|
||||
* PTE LP actual page size
|
||||
* rrrr rrrz >=8KB
|
||||
* rrrr rrzz >=16KB
|
||||
* rrrr rzzz >=32KB
|
||||
* rrrr zzzz >=64KB
|
||||
* .......
|
||||
*/
|
||||
shift = mmu_psize_defs[i].shift - LP_SHIFT;
|
||||
if (shift > LP_BITS)
|
||||
shift = LP_BITS;
|
||||
mask = (1 << shift) - 1;
|
||||
if ((lp & mask) == mmu_psize_defs[psize].penc[i])
|
||||
return i;
|
||||
}
|
||||
return -1;
|
||||
}
|
||||
|
||||
static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
|
||||
int *psize, int *apsize, int *ssize, unsigned long *vpn)
|
||||
{
|
||||
|
@ -538,16 +508,8 @@ static void hpte_decode(struct hash_pte *hpte, unsigned long slot,
|
|||
size = MMU_PAGE_4K;
|
||||
a_size = MMU_PAGE_4K;
|
||||
} else {
|
||||
for (size = 0; size < MMU_PAGE_COUNT; size++) {
|
||||
|
||||
/* valid entries have a shift value */
|
||||
if (!mmu_psize_defs[size].shift)
|
||||
continue;
|
||||
|
||||
a_size = __hpte_actual_psize(lp, size);
|
||||
if (a_size != -1)
|
||||
break;
|
||||
}
|
||||
size = hpte_page_sizes[lp] & 0xf;
|
||||
a_size = hpte_page_sizes[lp] >> 4;
|
||||
}
|
||||
/* This works for all page sizes, and for 256M and 1T segments */
|
||||
if (cpu_has_feature(CPU_FTR_ARCH_300))
|
||||
|
|
|
@ -93,6 +93,9 @@ static unsigned long _SDR1;
|
|||
struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
|
||||
EXPORT_SYMBOL_GPL(mmu_psize_defs);
|
||||
|
||||
u8 hpte_page_sizes[1 << LP_BITS];
|
||||
EXPORT_SYMBOL_GPL(hpte_page_sizes);
|
||||
|
||||
struct hash_pte *htab_address;
|
||||
unsigned long htab_size_bytes;
|
||||
unsigned long htab_hash_mask;
|
||||
|
@ -564,8 +567,60 @@ static void __init htab_scan_page_sizes(void)
|
|||
#endif /* CONFIG_HUGETLB_PAGE */
|
||||
}
|
||||
|
||||
/*
|
||||
* Fill in the hpte_page_sizes[] array.
|
||||
* We go through the mmu_psize_defs[] array looking for all the
|
||||
* supported base/actual page size combinations. Each combination
|
||||
* has a unique pagesize encoding (penc) value in the low bits of
|
||||
* the LP field of the HPTE. For actual page sizes less than 1MB,
|
||||
* some of the upper LP bits are used for RPN bits, meaning that
|
||||
* we need to fill in several entries in hpte_page_sizes[].
|
||||
*
|
||||
* In diagrammatic form, with r = RPN bits and z = page size bits:
|
||||
* PTE LP actual page size
|
||||
* rrrr rrrz >=8KB
|
||||
* rrrr rrzz >=16KB
|
||||
* rrrr rzzz >=32KB
|
||||
* rrrr zzzz >=64KB
|
||||
* ...
|
||||
*
|
||||
* The zzzz bits are implementation-specific but are chosen so that
|
||||
* no encoding for a larger page size uses the same value in its
|
||||
* low-order N bits as the encoding for the 2^(12+N) byte page size
|
||||
* (if it exists).
|
||||
*/
|
||||
static void init_hpte_page_sizes(void)
|
||||
{
|
||||
long int ap, bp;
|
||||
long int shift, penc;
|
||||
|
||||
for (bp = 0; bp < MMU_PAGE_COUNT; ++bp) {
|
||||
if (!mmu_psize_defs[bp].shift)
|
||||
continue; /* not a supported page size */
|
||||
for (ap = bp; ap < MMU_PAGE_COUNT; ++ap) {
|
||||
penc = mmu_psize_defs[bp].penc[ap];
|
||||
if (penc == -1)
|
||||
continue;
|
||||
shift = mmu_psize_defs[ap].shift - LP_SHIFT;
|
||||
if (shift <= 0)
|
||||
continue; /* should never happen */
|
||||
/*
|
||||
* For page sizes less than 1MB, this loop
|
||||
* replicates the entry for all possible values
|
||||
* of the rrrr bits.
|
||||
*/
|
||||
while (penc < (1 << LP_BITS)) {
|
||||
hpte_page_sizes[penc] = (ap << 4) | bp;
|
||||
penc += 1 << shift;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static void __init htab_init_page_sizes(void)
|
||||
{
|
||||
init_hpte_page_sizes();
|
||||
|
||||
if (!debug_pagealloc_enabled()) {
|
||||
/*
|
||||
* Pick a size for the linear mapping. Currently, we only
|
||||
|
|
|
@ -208,6 +208,7 @@ OPAL_CALL(opal_pci_config_write_byte, OPAL_PCI_CONFIG_WRITE_BYTE);
|
|||
OPAL_CALL(opal_pci_config_write_half_word, OPAL_PCI_CONFIG_WRITE_HALF_WORD);
|
||||
OPAL_CALL(opal_pci_config_write_word, OPAL_PCI_CONFIG_WRITE_WORD);
|
||||
OPAL_CALL(opal_set_xive, OPAL_SET_XIVE);
|
||||
OPAL_CALL_REAL(opal_rm_set_xive, OPAL_SET_XIVE);
|
||||
OPAL_CALL(opal_get_xive, OPAL_GET_XIVE);
|
||||
OPAL_CALL(opal_register_exception_handler, OPAL_REGISTER_OPAL_EXCEPTION_HANDLER);
|
||||
OPAL_CALL(opal_pci_eeh_freeze_status, OPAL_PCI_EEH_FREEZE_STATUS);
|
||||
|
|
|
@ -2718,15 +2718,21 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
|
|||
}
|
||||
|
||||
#ifdef CONFIG_PCI_MSI
|
||||
static void pnv_ioda2_msi_eoi(struct irq_data *d)
|
||||
int64_t pnv_opal_pci_msi_eoi(struct irq_chip *chip, unsigned int hw_irq)
|
||||
{
|
||||
unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
|
||||
struct irq_chip *chip = irq_data_get_irq_chip(d);
|
||||
struct pnv_phb *phb = container_of(chip, struct pnv_phb,
|
||||
ioda.irq_chip);
|
||||
int64_t rc;
|
||||
|
||||
rc = opal_pci_msi_eoi(phb->opal_id, hw_irq);
|
||||
return opal_pci_msi_eoi(phb->opal_id, hw_irq);
|
||||
}
|
||||
|
||||
static void pnv_ioda2_msi_eoi(struct irq_data *d)
|
||||
{
|
||||
int64_t rc;
|
||||
unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
|
||||
struct irq_chip *chip = irq_data_get_irq_chip(d);
|
||||
|
||||
rc = pnv_opal_pci_msi_eoi(chip, hw_irq);
|
||||
WARN_ON_ONCE(rc);
|
||||
|
||||
icp_native_eoi(d);
|
||||
|
@ -2756,6 +2762,16 @@ void pnv_set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq)
|
|||
irq_set_chip(virq, &phb->ioda.irq_chip);
|
||||
}
|
||||
|
||||
/*
|
||||
* Returns true iff chip is something that we could call
|
||||
* pnv_opal_pci_msi_eoi for.
|
||||
*/
|
||||
bool is_pnv_opal_msi(struct irq_chip *chip)
|
||||
{
|
||||
return chip->irq_eoi == pnv_ioda2_msi_eoi;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(is_pnv_opal_msi);
|
||||
|
||||
static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
|
||||
unsigned int hwirq, unsigned int virq,
|
||||
unsigned int is_64, struct msi_msg *msg)
|
||||
|
|
|
@ -28,7 +28,7 @@
|
|||
|
||||
#define KVM_S390_BSCA_CPU_SLOTS 64
|
||||
#define KVM_S390_ESCA_CPU_SLOTS 248
|
||||
#define KVM_MAX_VCPUS KVM_S390_ESCA_CPU_SLOTS
|
||||
#define KVM_MAX_VCPUS 255
|
||||
#define KVM_USER_MEM_SLOTS 32
|
||||
|
||||
/*
|
||||
|
@ -245,72 +245,72 @@ struct sie_page {
|
|||
} __packed;
|
||||
|
||||
struct kvm_vcpu_stat {
|
||||
u32 exit_userspace;
|
||||
u32 exit_null;
|
||||
u32 exit_external_request;
|
||||
u32 exit_external_interrupt;
|
||||
u32 exit_stop_request;
|
||||
u32 exit_validity;
|
||||
u32 exit_instruction;
|
||||
u32 exit_pei;
|
||||
u32 halt_successful_poll;
|
||||
u32 halt_attempted_poll;
|
||||
u32 halt_poll_invalid;
|
||||
u32 halt_wakeup;
|
||||
u32 instruction_lctl;
|
||||
u32 instruction_lctlg;
|
||||
u32 instruction_stctl;
|
||||
u32 instruction_stctg;
|
||||
u32 exit_program_interruption;
|
||||
u32 exit_instr_and_program;
|
||||
u32 exit_operation_exception;
|
||||
u32 deliver_external_call;
|
||||
u32 deliver_emergency_signal;
|
||||
u32 deliver_service_signal;
|
||||
u32 deliver_virtio_interrupt;
|
||||
u32 deliver_stop_signal;
|
||||
u32 deliver_prefix_signal;
|
||||
u32 deliver_restart_signal;
|
||||
u32 deliver_program_int;
|
||||
u32 deliver_io_int;
|
||||
u32 exit_wait_state;
|
||||
u32 instruction_pfmf;
|
||||
u32 instruction_stidp;
|
||||
u32 instruction_spx;
|
||||
u32 instruction_stpx;
|
||||
u32 instruction_stap;
|
||||
u32 instruction_storage_key;
|
||||
u32 instruction_ipte_interlock;
|
||||
u32 instruction_stsch;
|
||||
u32 instruction_chsc;
|
||||
u32 instruction_stsi;
|
||||
u32 instruction_stfl;
|
||||
u32 instruction_tprot;
|
||||
u32 instruction_sie;
|
||||
u32 instruction_essa;
|
||||
u32 instruction_sthyi;
|
||||
u32 instruction_sigp_sense;
|
||||
u32 instruction_sigp_sense_running;
|
||||
u32 instruction_sigp_external_call;
|
||||
u32 instruction_sigp_emergency;
|
||||
u32 instruction_sigp_cond_emergency;
|
||||
u32 instruction_sigp_start;
|
||||
u32 instruction_sigp_stop;
|
||||
u32 instruction_sigp_stop_store_status;
|
||||
u32 instruction_sigp_store_status;
|
||||
u32 instruction_sigp_store_adtl_status;
|
||||
u32 instruction_sigp_arch;
|
||||
u32 instruction_sigp_prefix;
|
||||
u32 instruction_sigp_restart;
|
||||
u32 instruction_sigp_init_cpu_reset;
|
||||
u32 instruction_sigp_cpu_reset;
|
||||
u32 instruction_sigp_unknown;
|
||||
u32 diagnose_10;
|
||||
u32 diagnose_44;
|
||||
u32 diagnose_9c;
|
||||
u32 diagnose_258;
|
||||
u32 diagnose_308;
|
||||
u32 diagnose_500;
|
||||
u64 exit_userspace;
|
||||
u64 exit_null;
|
||||
u64 exit_external_request;
|
||||
u64 exit_external_interrupt;
|
||||
u64 exit_stop_request;
|
||||
u64 exit_validity;
|
||||
u64 exit_instruction;
|
||||
u64 exit_pei;
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_wakeup;
|
||||
u64 instruction_lctl;
|
||||
u64 instruction_lctlg;
|
||||
u64 instruction_stctl;
|
||||
u64 instruction_stctg;
|
||||
u64 exit_program_interruption;
|
||||
u64 exit_instr_and_program;
|
||||
u64 exit_operation_exception;
|
||||
u64 deliver_external_call;
|
||||
u64 deliver_emergency_signal;
|
||||
u64 deliver_service_signal;
|
||||
u64 deliver_virtio_interrupt;
|
||||
u64 deliver_stop_signal;
|
||||
u64 deliver_prefix_signal;
|
||||
u64 deliver_restart_signal;
|
||||
u64 deliver_program_int;
|
||||
u64 deliver_io_int;
|
||||
u64 exit_wait_state;
|
||||
u64 instruction_pfmf;
|
||||
u64 instruction_stidp;
|
||||
u64 instruction_spx;
|
||||
u64 instruction_stpx;
|
||||
u64 instruction_stap;
|
||||
u64 instruction_storage_key;
|
||||
u64 instruction_ipte_interlock;
|
||||
u64 instruction_stsch;
|
||||
u64 instruction_chsc;
|
||||
u64 instruction_stsi;
|
||||
u64 instruction_stfl;
|
||||
u64 instruction_tprot;
|
||||
u64 instruction_sie;
|
||||
u64 instruction_essa;
|
||||
u64 instruction_sthyi;
|
||||
u64 instruction_sigp_sense;
|
||||
u64 instruction_sigp_sense_running;
|
||||
u64 instruction_sigp_external_call;
|
||||
u64 instruction_sigp_emergency;
|
||||
u64 instruction_sigp_cond_emergency;
|
||||
u64 instruction_sigp_start;
|
||||
u64 instruction_sigp_stop;
|
||||
u64 instruction_sigp_stop_store_status;
|
||||
u64 instruction_sigp_store_status;
|
||||
u64 instruction_sigp_store_adtl_status;
|
||||
u64 instruction_sigp_arch;
|
||||
u64 instruction_sigp_prefix;
|
||||
u64 instruction_sigp_restart;
|
||||
u64 instruction_sigp_init_cpu_reset;
|
||||
u64 instruction_sigp_cpu_reset;
|
||||
u64 instruction_sigp_unknown;
|
||||
u64 diagnose_10;
|
||||
u64 diagnose_44;
|
||||
u64 diagnose_9c;
|
||||
u64 diagnose_258;
|
||||
u64 diagnose_308;
|
||||
u64 diagnose_500;
|
||||
};
|
||||
|
||||
#define PGM_OPERATION 0x01
|
||||
|
@ -577,7 +577,7 @@ struct kvm_vcpu_arch {
|
|||
};
|
||||
|
||||
struct kvm_vm_stat {
|
||||
u32 remote_tlb_flush;
|
||||
ulong remote_tlb_flush;
|
||||
};
|
||||
|
||||
struct kvm_arch_memory_slot {
|
||||
|
|
|
@ -125,6 +125,7 @@ int main(void)
|
|||
OFFSET(__LC_STFL_FAC_LIST, lowcore, stfl_fac_list);
|
||||
OFFSET(__LC_STFLE_FAC_LIST, lowcore, stfle_fac_list);
|
||||
OFFSET(__LC_MCCK_CODE, lowcore, mcck_interruption_code);
|
||||
OFFSET(__LC_EXT_DAMAGE_CODE, lowcore, external_damage_code);
|
||||
OFFSET(__LC_MCCK_FAIL_STOR_ADDR, lowcore, failing_storage_address);
|
||||
OFFSET(__LC_LAST_BREAK, lowcore, breaking_event_addr);
|
||||
OFFSET(__LC_RST_OLD_PSW, lowcore, restart_old_psw);
|
||||
|
|
|
@ -495,6 +495,18 @@ static int trans_exc(struct kvm_vcpu *vcpu, int code, unsigned long gva,
|
|||
tec = (struct trans_exc_code_bits *)&pgm->trans_exc_code;
|
||||
|
||||
switch (code) {
|
||||
case PGM_PROTECTION:
|
||||
switch (prot) {
|
||||
case PROT_TYPE_ALC:
|
||||
tec->b60 = 1;
|
||||
/* FALL THROUGH */
|
||||
case PROT_TYPE_DAT:
|
||||
tec->b61 = 1;
|
||||
break;
|
||||
default: /* LA and KEYC set b61 to 0, other params undefined */
|
||||
return code;
|
||||
}
|
||||
/* FALL THROUGH */
|
||||
case PGM_ASCE_TYPE:
|
||||
case PGM_PAGE_TRANSLATION:
|
||||
case PGM_REGION_FIRST_TRANS:
|
||||
|
@ -504,8 +516,7 @@ static int trans_exc(struct kvm_vcpu *vcpu, int code, unsigned long gva,
|
|||
/*
|
||||
* op_access_id only applies to MOVE_PAGE -> set bit 61
|
||||
* exc_access_id has to be set to 0 for some instructions. Both
|
||||
* cases have to be handled by the caller. We can always store
|
||||
* exc_access_id, as it is undefined for non-ar cases.
|
||||
* cases have to be handled by the caller.
|
||||
*/
|
||||
tec->addr = gva >> PAGE_SHIFT;
|
||||
tec->fsi = mode == GACC_STORE ? FSI_STORE : FSI_FETCH;
|
||||
|
@ -516,25 +527,13 @@ static int trans_exc(struct kvm_vcpu *vcpu, int code, unsigned long gva,
|
|||
case PGM_ASTE_VALIDITY:
|
||||
case PGM_ASTE_SEQUENCE:
|
||||
case PGM_EXTENDED_AUTHORITY:
|
||||
/*
|
||||
* We can always store exc_access_id, as it is
|
||||
* undefined for non-ar cases. It is undefined for
|
||||
* most DAT protection exceptions.
|
||||
*/
|
||||
pgm->exc_access_id = ar;
|
||||
break;
|
||||
case PGM_PROTECTION:
|
||||
switch (prot) {
|
||||
case PROT_TYPE_ALC:
|
||||
tec->b60 = 1;
|
||||
/* FALL THROUGH */
|
||||
case PROT_TYPE_DAT:
|
||||
tec->b61 = 1;
|
||||
tec->addr = gva >> PAGE_SHIFT;
|
||||
tec->fsi = mode == GACC_STORE ? FSI_STORE : FSI_FETCH;
|
||||
tec->as = psw_bits(vcpu->arch.sie_block->gpsw).as;
|
||||
/* exc_access_id is undefined for most cases */
|
||||
pgm->exc_access_id = ar;
|
||||
break;
|
||||
default: /* LA and KEYC set b61 to 0, other params undefined */
|
||||
break;
|
||||
}
|
||||
break;
|
||||
}
|
||||
return code;
|
||||
}
|
||||
|
|
|
@ -206,7 +206,7 @@ static int __import_wp_info(struct kvm_vcpu *vcpu,
|
|||
int kvm_s390_import_bp_data(struct kvm_vcpu *vcpu,
|
||||
struct kvm_guest_debug *dbg)
|
||||
{
|
||||
int ret = 0, nr_wp = 0, nr_bp = 0, i, size;
|
||||
int ret = 0, nr_wp = 0, nr_bp = 0, i;
|
||||
struct kvm_hw_breakpoint *bp_data = NULL;
|
||||
struct kvm_hw_wp_info_arch *wp_info = NULL;
|
||||
struct kvm_hw_bp_info_arch *bp_info = NULL;
|
||||
|
@ -216,17 +216,10 @@ int kvm_s390_import_bp_data(struct kvm_vcpu *vcpu,
|
|||
else if (dbg->arch.nr_hw_bp > MAX_BP_COUNT)
|
||||
return -EINVAL;
|
||||
|
||||
size = dbg->arch.nr_hw_bp * sizeof(struct kvm_hw_breakpoint);
|
||||
bp_data = kmalloc(size, GFP_KERNEL);
|
||||
if (!bp_data) {
|
||||
ret = -ENOMEM;
|
||||
goto error;
|
||||
}
|
||||
|
||||
if (copy_from_user(bp_data, dbg->arch.hw_bp, size)) {
|
||||
ret = -EFAULT;
|
||||
goto error;
|
||||
}
|
||||
bp_data = memdup_user(dbg->arch.hw_bp,
|
||||
sizeof(*bp_data) * dbg->arch.nr_hw_bp);
|
||||
if (IS_ERR(bp_data))
|
||||
return PTR_ERR(bp_data);
|
||||
|
||||
for (i = 0; i < dbg->arch.nr_hw_bp; i++) {
|
||||
switch (bp_data[i].type) {
|
||||
|
@ -241,17 +234,19 @@ int kvm_s390_import_bp_data(struct kvm_vcpu *vcpu,
|
|||
}
|
||||
}
|
||||
|
||||
size = nr_wp * sizeof(struct kvm_hw_wp_info_arch);
|
||||
if (size > 0) {
|
||||
wp_info = kmalloc(size, GFP_KERNEL);
|
||||
if (nr_wp > 0) {
|
||||
wp_info = kmalloc_array(nr_wp,
|
||||
sizeof(*wp_info),
|
||||
GFP_KERNEL);
|
||||
if (!wp_info) {
|
||||
ret = -ENOMEM;
|
||||
goto error;
|
||||
}
|
||||
}
|
||||
size = nr_bp * sizeof(struct kvm_hw_bp_info_arch);
|
||||
if (size > 0) {
|
||||
bp_info = kmalloc(size, GFP_KERNEL);
|
||||
if (nr_bp > 0) {
|
||||
bp_info = kmalloc_array(nr_bp,
|
||||
sizeof(*bp_info),
|
||||
GFP_KERNEL);
|
||||
if (!bp_info) {
|
||||
ret = -ENOMEM;
|
||||
goto error;
|
||||
|
@ -382,14 +377,20 @@ void kvm_s390_prepare_debug_exit(struct kvm_vcpu *vcpu)
|
|||
vcpu->guest_debug &= ~KVM_GUESTDBG_EXIT_PENDING;
|
||||
}
|
||||
|
||||
#define PER_CODE_MASK (PER_EVENT_MASK >> 24)
|
||||
#define PER_CODE_BRANCH (PER_EVENT_BRANCH >> 24)
|
||||
#define PER_CODE_IFETCH (PER_EVENT_IFETCH >> 24)
|
||||
#define PER_CODE_STORE (PER_EVENT_STORE >> 24)
|
||||
#define PER_CODE_STORE_REAL (PER_EVENT_STORE_REAL >> 24)
|
||||
|
||||
#define per_bp_event(code) \
|
||||
(code & (PER_EVENT_IFETCH | PER_EVENT_BRANCH))
|
||||
(code & (PER_CODE_IFETCH | PER_CODE_BRANCH))
|
||||
#define per_write_wp_event(code) \
|
||||
(code & (PER_EVENT_STORE | PER_EVENT_STORE_REAL))
|
||||
(code & (PER_CODE_STORE | PER_CODE_STORE_REAL))
|
||||
|
||||
static int debug_exit_required(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u32 perc = (vcpu->arch.sie_block->perc << 24);
|
||||
u8 perc = vcpu->arch.sie_block->perc;
|
||||
struct kvm_debug_exit_arch *debug_exit = &vcpu->run->debug.arch;
|
||||
struct kvm_hw_wp_info_arch *wp_info = NULL;
|
||||
struct kvm_hw_bp_info_arch *bp_info = NULL;
|
||||
|
@ -444,7 +445,7 @@ int kvm_s390_handle_per_ifetch_icpt(struct kvm_vcpu *vcpu)
|
|||
const u8 ilen = kvm_s390_get_ilen(vcpu);
|
||||
struct kvm_s390_pgm_info pgm_info = {
|
||||
.code = PGM_PER,
|
||||
.per_code = PER_EVENT_IFETCH >> 24,
|
||||
.per_code = PER_CODE_IFETCH,
|
||||
.per_address = __rewind_psw(vcpu->arch.sie_block->gpsw, ilen),
|
||||
};
|
||||
|
||||
|
@ -458,33 +459,33 @@ int kvm_s390_handle_per_ifetch_icpt(struct kvm_vcpu *vcpu)
|
|||
|
||||
static void filter_guest_per_event(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u32 perc = vcpu->arch.sie_block->perc << 24;
|
||||
const u8 perc = vcpu->arch.sie_block->perc;
|
||||
u64 peraddr = vcpu->arch.sie_block->peraddr;
|
||||
u64 addr = vcpu->arch.sie_block->gpsw.addr;
|
||||
u64 cr9 = vcpu->arch.sie_block->gcr[9];
|
||||
u64 cr10 = vcpu->arch.sie_block->gcr[10];
|
||||
u64 cr11 = vcpu->arch.sie_block->gcr[11];
|
||||
/* filter all events, demanded by the guest */
|
||||
u32 guest_perc = perc & cr9 & PER_EVENT_MASK;
|
||||
u8 guest_perc = perc & (cr9 >> 24) & PER_CODE_MASK;
|
||||
|
||||
if (!guest_per_enabled(vcpu))
|
||||
guest_perc = 0;
|
||||
|
||||
/* filter "successful-branching" events */
|
||||
if (guest_perc & PER_EVENT_BRANCH &&
|
||||
if (guest_perc & PER_CODE_BRANCH &&
|
||||
cr9 & PER_CONTROL_BRANCH_ADDRESS &&
|
||||
!in_addr_range(addr, cr10, cr11))
|
||||
guest_perc &= ~PER_EVENT_BRANCH;
|
||||
guest_perc &= ~PER_CODE_BRANCH;
|
||||
|
||||
/* filter "instruction-fetching" events */
|
||||
if (guest_perc & PER_EVENT_IFETCH &&
|
||||
if (guest_perc & PER_CODE_IFETCH &&
|
||||
!in_addr_range(peraddr, cr10, cr11))
|
||||
guest_perc &= ~PER_EVENT_IFETCH;
|
||||
guest_perc &= ~PER_CODE_IFETCH;
|
||||
|
||||
/* All other PER events will be given to the guest */
|
||||
/* TODO: Check altered address/address space */
|
||||
|
||||
vcpu->arch.sie_block->perc = guest_perc >> 24;
|
||||
vcpu->arch.sie_block->perc = guest_perc;
|
||||
|
||||
if (!guest_perc)
|
||||
vcpu->arch.sie_block->iprcc &= ~PGM_PER;
|
||||
|
|
|
@ -29,6 +29,7 @@ static const intercept_handler_t instruction_handlers[256] = {
|
|||
[0x01] = kvm_s390_handle_01,
|
||||
[0x82] = kvm_s390_handle_lpsw,
|
||||
[0x83] = kvm_s390_handle_diag,
|
||||
[0xaa] = kvm_s390_handle_aa,
|
||||
[0xae] = kvm_s390_handle_sigp,
|
||||
[0xb2] = kvm_s390_handle_b2,
|
||||
[0xb6] = kvm_s390_handle_stctl,
|
||||
|
|
|
@ -24,6 +24,8 @@
|
|||
#include <asm/sclp.h>
|
||||
#include <asm/isc.h>
|
||||
#include <asm/gmap.h>
|
||||
#include <asm/switch_to.h>
|
||||
#include <asm/nmi.h>
|
||||
#include "kvm-s390.h"
|
||||
#include "gaccess.h"
|
||||
#include "trace-s390.h"
|
||||
|
@ -40,6 +42,7 @@ static int sca_ext_call_pending(struct kvm_vcpu *vcpu, int *src_id)
|
|||
if (!(atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_ECALL_PEND))
|
||||
return 0;
|
||||
|
||||
BUG_ON(!kvm_s390_use_sca_entries());
|
||||
read_lock(&vcpu->kvm->arch.sca_lock);
|
||||
if (vcpu->kvm->arch.use_esca) {
|
||||
struct esca_block *sca = vcpu->kvm->arch.sca;
|
||||
|
@ -68,6 +71,7 @@ static int sca_inject_ext_call(struct kvm_vcpu *vcpu, int src_id)
|
|||
{
|
||||
int expect, rc;
|
||||
|
||||
BUG_ON(!kvm_s390_use_sca_entries());
|
||||
read_lock(&vcpu->kvm->arch.sca_lock);
|
||||
if (vcpu->kvm->arch.use_esca) {
|
||||
struct esca_block *sca = vcpu->kvm->arch.sca;
|
||||
|
@ -109,6 +113,8 @@ static void sca_clear_ext_call(struct kvm_vcpu *vcpu)
|
|||
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
|
||||
int rc, expect;
|
||||
|
||||
if (!kvm_s390_use_sca_entries())
|
||||
return;
|
||||
atomic_andnot(CPUSTAT_ECALL_PEND, li->cpuflags);
|
||||
read_lock(&vcpu->kvm->arch.sca_lock);
|
||||
if (vcpu->kvm->arch.use_esca) {
|
||||
|
@ -400,12 +406,78 @@ static int __must_check __deliver_pfault_init(struct kvm_vcpu *vcpu)
|
|||
return rc ? -EFAULT : 0;
|
||||
}
|
||||
|
||||
static int __write_machine_check(struct kvm_vcpu *vcpu,
|
||||
struct kvm_s390_mchk_info *mchk)
|
||||
{
|
||||
unsigned long ext_sa_addr;
|
||||
freg_t fprs[NUM_FPRS];
|
||||
union mci mci;
|
||||
int rc;
|
||||
|
||||
mci.val = mchk->mcic;
|
||||
/* take care of lazy register loading via vcpu load/put */
|
||||
save_fpu_regs();
|
||||
save_access_regs(vcpu->run->s.regs.acrs);
|
||||
|
||||
/* Extended save area */
|
||||
rc = read_guest_lc(vcpu, __LC_VX_SAVE_AREA_ADDR, &ext_sa_addr,
|
||||
sizeof(unsigned long));
|
||||
/* Only bits 0-53 are used for address formation */
|
||||
ext_sa_addr &= ~0x3ffUL;
|
||||
if (!rc && mci.vr && ext_sa_addr && test_kvm_facility(vcpu->kvm, 129)) {
|
||||
if (write_guest_abs(vcpu, ext_sa_addr, vcpu->run->s.regs.vrs,
|
||||
512))
|
||||
mci.vr = 0;
|
||||
} else {
|
||||
mci.vr = 0;
|
||||
}
|
||||
|
||||
/* General interruption information */
|
||||
rc |= put_guest_lc(vcpu, 1, (u8 __user *) __LC_AR_MODE_ID);
|
||||
rc |= write_guest_lc(vcpu, __LC_MCK_OLD_PSW,
|
||||
&vcpu->arch.sie_block->gpsw, sizeof(psw_t));
|
||||
rc |= read_guest_lc(vcpu, __LC_MCK_NEW_PSW,
|
||||
&vcpu->arch.sie_block->gpsw, sizeof(psw_t));
|
||||
rc |= put_guest_lc(vcpu, mci.val, (u64 __user *) __LC_MCCK_CODE);
|
||||
|
||||
/* Register-save areas */
|
||||
if (MACHINE_HAS_VX) {
|
||||
convert_vx_to_fp(fprs, (__vector128 *) vcpu->run->s.regs.vrs);
|
||||
rc |= write_guest_lc(vcpu, __LC_FPREGS_SAVE_AREA, fprs, 128);
|
||||
} else {
|
||||
rc |= write_guest_lc(vcpu, __LC_FPREGS_SAVE_AREA,
|
||||
vcpu->run->s.regs.fprs, 128);
|
||||
}
|
||||
rc |= write_guest_lc(vcpu, __LC_GPREGS_SAVE_AREA,
|
||||
vcpu->run->s.regs.gprs, 128);
|
||||
rc |= put_guest_lc(vcpu, current->thread.fpu.fpc,
|
||||
(u32 __user *) __LC_FP_CREG_SAVE_AREA);
|
||||
rc |= put_guest_lc(vcpu, vcpu->arch.sie_block->todpr,
|
||||
(u32 __user *) __LC_TOD_PROGREG_SAVE_AREA);
|
||||
rc |= put_guest_lc(vcpu, kvm_s390_get_cpu_timer(vcpu),
|
||||
(u64 __user *) __LC_CPU_TIMER_SAVE_AREA);
|
||||
rc |= put_guest_lc(vcpu, vcpu->arch.sie_block->ckc >> 8,
|
||||
(u64 __user *) __LC_CLOCK_COMP_SAVE_AREA);
|
||||
rc |= write_guest_lc(vcpu, __LC_AREGS_SAVE_AREA,
|
||||
&vcpu->run->s.regs.acrs, 64);
|
||||
rc |= write_guest_lc(vcpu, __LC_CREGS_SAVE_AREA,
|
||||
&vcpu->arch.sie_block->gcr, 128);
|
||||
|
||||
/* Extended interruption information */
|
||||
rc |= put_guest_lc(vcpu, mchk->ext_damage_code,
|
||||
(u32 __user *) __LC_EXT_DAMAGE_CODE);
|
||||
rc |= put_guest_lc(vcpu, mchk->failing_storage_address,
|
||||
(u64 __user *) __LC_MCCK_FAIL_STOR_ADDR);
|
||||
rc |= write_guest_lc(vcpu, __LC_PSW_SAVE_AREA, &mchk->fixed_logout,
|
||||
sizeof(mchk->fixed_logout));
|
||||
return rc ? -EFAULT : 0;
|
||||
}
|
||||
|
||||
static int __must_check __deliver_machine_check(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_s390_float_interrupt *fi = &vcpu->kvm->arch.float_int;
|
||||
struct kvm_s390_local_interrupt *li = &vcpu->arch.local_int;
|
||||
struct kvm_s390_mchk_info mchk = {};
|
||||
unsigned long adtl_status_addr;
|
||||
int deliver = 0;
|
||||
int rc = 0;
|
||||
|
||||
|
@ -446,29 +518,9 @@ static int __must_check __deliver_machine_check(struct kvm_vcpu *vcpu)
|
|||
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
|
||||
KVM_S390_MCHK,
|
||||
mchk.cr14, mchk.mcic);
|
||||
|
||||
rc = kvm_s390_vcpu_store_status(vcpu,
|
||||
KVM_S390_STORE_STATUS_PREFIXED);
|
||||
rc |= read_guest_lc(vcpu, __LC_VX_SAVE_AREA_ADDR,
|
||||
&adtl_status_addr,
|
||||
sizeof(unsigned long));
|
||||
rc |= kvm_s390_vcpu_store_adtl_status(vcpu,
|
||||
adtl_status_addr);
|
||||
rc |= put_guest_lc(vcpu, mchk.mcic,
|
||||
(u64 __user *) __LC_MCCK_CODE);
|
||||
rc |= put_guest_lc(vcpu, mchk.failing_storage_address,
|
||||
(u64 __user *) __LC_MCCK_FAIL_STOR_ADDR);
|
||||
rc |= write_guest_lc(vcpu, __LC_PSW_SAVE_AREA,
|
||||
&mchk.fixed_logout,
|
||||
sizeof(mchk.fixed_logout));
|
||||
rc |= write_guest_lc(vcpu, __LC_MCK_OLD_PSW,
|
||||
&vcpu->arch.sie_block->gpsw,
|
||||
sizeof(psw_t));
|
||||
rc |= read_guest_lc(vcpu, __LC_MCK_NEW_PSW,
|
||||
&vcpu->arch.sie_block->gpsw,
|
||||
sizeof(psw_t));
|
||||
rc = __write_machine_check(vcpu, &mchk);
|
||||
}
|
||||
return rc ? -EFAULT : 0;
|
||||
return rc;
|
||||
}
|
||||
|
||||
static int __must_check __deliver_restart(struct kvm_vcpu *vcpu)
|
||||
|
|
|
@ -384,7 +384,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
|||
case KVM_CAP_NR_VCPUS:
|
||||
case KVM_CAP_MAX_VCPUS:
|
||||
r = KVM_S390_BSCA_CPU_SLOTS;
|
||||
if (sclp.has_esca && sclp.has_64bscao)
|
||||
if (!kvm_s390_use_sca_entries())
|
||||
r = KVM_MAX_VCPUS;
|
||||
else if (sclp.has_esca && sclp.has_64bscao)
|
||||
r = KVM_S390_ESCA_CPU_SLOTS;
|
||||
break;
|
||||
case KVM_CAP_NR_MEMSLOTS:
|
||||
|
@ -1498,6 +1500,16 @@ out_err:
|
|||
return rc;
|
||||
}
|
||||
|
||||
bool kvm_arch_has_vcpu_debugfs(void)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
VCPU_EVENT(vcpu, 3, "%s", "free cpu");
|
||||
|
@ -1561,6 +1573,8 @@ static int __kvm_ucontrol_vcpu_init(struct kvm_vcpu *vcpu)
|
|||
|
||||
static void sca_del_vcpu(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if (!kvm_s390_use_sca_entries())
|
||||
return;
|
||||
read_lock(&vcpu->kvm->arch.sca_lock);
|
||||
if (vcpu->kvm->arch.use_esca) {
|
||||
struct esca_block *sca = vcpu->kvm->arch.sca;
|
||||
|
@ -1578,6 +1592,13 @@ static void sca_del_vcpu(struct kvm_vcpu *vcpu)
|
|||
|
||||
static void sca_add_vcpu(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if (!kvm_s390_use_sca_entries()) {
|
||||
struct bsca_block *sca = vcpu->kvm->arch.sca;
|
||||
|
||||
/* we still need the basic sca for the ipte control */
|
||||
vcpu->arch.sie_block->scaoh = (__u32)(((__u64)sca) >> 32);
|
||||
vcpu->arch.sie_block->scaol = (__u32)(__u64)sca;
|
||||
}
|
||||
read_lock(&vcpu->kvm->arch.sca_lock);
|
||||
if (vcpu->kvm->arch.use_esca) {
|
||||
struct esca_block *sca = vcpu->kvm->arch.sca;
|
||||
|
@ -1658,6 +1679,11 @@ static int sca_can_add_vcpu(struct kvm *kvm, unsigned int id)
|
|||
{
|
||||
int rc;
|
||||
|
||||
if (!kvm_s390_use_sca_entries()) {
|
||||
if (id < KVM_MAX_VCPUS)
|
||||
return true;
|
||||
return false;
|
||||
}
|
||||
if (id < KVM_S390_BSCA_CPU_SLOTS)
|
||||
return true;
|
||||
if (!sclp.has_esca || !sclp.has_64bscao)
|
||||
|
@ -1946,8 +1972,6 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
|
|||
vcpu->arch.sie_block->eca |= 1;
|
||||
if (sclp.has_sigpif)
|
||||
vcpu->arch.sie_block->eca |= 0x10000000U;
|
||||
if (test_kvm_facility(vcpu->kvm, 64))
|
||||
vcpu->arch.sie_block->ecb3 |= 0x01;
|
||||
if (test_kvm_facility(vcpu->kvm, 129)) {
|
||||
vcpu->arch.sie_block->eca |= 0x00020000;
|
||||
vcpu->arch.sie_block->ecd |= 0x20000000;
|
||||
|
@ -2704,6 +2728,19 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
|
|||
if (vcpu->arch.pfault_token == KVM_S390_PFAULT_TOKEN_INVALID)
|
||||
kvm_clear_async_pf_completion_queue(vcpu);
|
||||
}
|
||||
/*
|
||||
* If userspace sets the riccb (e.g. after migration) to a valid state,
|
||||
* we should enable RI here instead of doing the lazy enablement.
|
||||
*/
|
||||
if ((kvm_run->kvm_dirty_regs & KVM_SYNC_RICCB) &&
|
||||
test_kvm_facility(vcpu->kvm, 64)) {
|
||||
struct runtime_instr_cb *riccb =
|
||||
(struct runtime_instr_cb *) &kvm_run->s.regs.riccb;
|
||||
|
||||
if (riccb->valid)
|
||||
vcpu->arch.sie_block->ecb3 |= 0x01;
|
||||
}
|
||||
|
||||
kvm_run->kvm_dirty_regs = 0;
|
||||
}
|
||||
|
||||
|
@ -2847,38 +2884,6 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr)
|
|||
return kvm_s390_store_status_unloaded(vcpu, addr);
|
||||
}
|
||||
|
||||
/*
|
||||
* store additional status at address
|
||||
*/
|
||||
int kvm_s390_store_adtl_status_unloaded(struct kvm_vcpu *vcpu,
|
||||
unsigned long gpa)
|
||||
{
|
||||
/* Only bits 0-53 are used for address formation */
|
||||
if (!(gpa & ~0x3ff))
|
||||
return 0;
|
||||
|
||||
return write_guest_abs(vcpu, gpa & ~0x3ff,
|
||||
(void *)&vcpu->run->s.regs.vrs, 512);
|
||||
}
|
||||
|
||||
int kvm_s390_vcpu_store_adtl_status(struct kvm_vcpu *vcpu, unsigned long addr)
|
||||
{
|
||||
if (!test_kvm_facility(vcpu->kvm, 129))
|
||||
return 0;
|
||||
|
||||
/*
|
||||
* The guest VXRS are in the host VXRs due to the lazy
|
||||
* copying in vcpu load/put. We can simply call save_fpu_regs()
|
||||
* to save the current register state because we are in the
|
||||
* middle of a load/put cycle.
|
||||
*
|
||||
* Let's update our copies before we save it into the save area.
|
||||
*/
|
||||
save_fpu_regs();
|
||||
|
||||
return kvm_s390_store_adtl_status_unloaded(vcpu, addr);
|
||||
}
|
||||
|
||||
static void __disable_ibs_on_vcpu(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
kvm_check_request(KVM_REQ_ENABLE_IBS, vcpu);
|
||||
|
|
|
@ -20,6 +20,7 @@
|
|||
#include <linux/kvm_host.h>
|
||||
#include <asm/facility.h>
|
||||
#include <asm/processor.h>
|
||||
#include <asm/sclp.h>
|
||||
|
||||
typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
|
||||
|
||||
|
@ -245,6 +246,7 @@ static inline void kvm_s390_retry_instr(struct kvm_vcpu *vcpu)
|
|||
|
||||
/* implemented in priv.c */
|
||||
int is_valid_psw(psw_t *psw);
|
||||
int kvm_s390_handle_aa(struct kvm_vcpu *vcpu);
|
||||
int kvm_s390_handle_b2(struct kvm_vcpu *vcpu);
|
||||
int kvm_s390_handle_e5(struct kvm_vcpu *vcpu);
|
||||
int kvm_s390_handle_01(struct kvm_vcpu *vcpu);
|
||||
|
@ -273,10 +275,7 @@ int handle_sthyi(struct kvm_vcpu *vcpu);
|
|||
void kvm_s390_set_tod_clock(struct kvm *kvm, u64 tod);
|
||||
long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable);
|
||||
int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
int kvm_s390_store_adtl_status_unloaded(struct kvm_vcpu *vcpu,
|
||||
unsigned long addr);
|
||||
int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
int kvm_s390_vcpu_store_adtl_status(struct kvm_vcpu *vcpu, unsigned long addr);
|
||||
void kvm_s390_vcpu_start(struct kvm_vcpu *vcpu);
|
||||
void kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu);
|
||||
void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu);
|
||||
|
@ -389,4 +388,13 @@ static inline union ipte_control *kvm_s390_get_ipte_control(struct kvm *kvm)
|
|||
|
||||
return &sca->ipte_control;
|
||||
}
|
||||
static inline int kvm_s390_use_sca_entries(void)
|
||||
{
|
||||
/*
|
||||
* Without SIGP interpretation, only SRS interpretation (if available)
|
||||
* might use the entries. By not setting the entries and keeping them
|
||||
* invalid, hardware will not access them but intercept.
|
||||
*/
|
||||
return sclp.has_sigpif;
|
||||
}
|
||||
#endif
|
||||
|
|
|
@ -32,6 +32,24 @@
|
|||
#include "kvm-s390.h"
|
||||
#include "trace.h"
|
||||
|
||||
static int handle_ri(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if (test_kvm_facility(vcpu->kvm, 64)) {
|
||||
vcpu->arch.sie_block->ecb3 |= 0x01;
|
||||
kvm_s390_retry_instr(vcpu);
|
||||
return 0;
|
||||
} else
|
||||
return kvm_s390_inject_program_int(vcpu, PGM_OPERATION);
|
||||
}
|
||||
|
||||
int kvm_s390_handle_aa(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if ((vcpu->arch.sie_block->ipa & 0xf) <= 4)
|
||||
return handle_ri(vcpu);
|
||||
else
|
||||
return -EOPNOTSUPP;
|
||||
}
|
||||
|
||||
/* Handle SCK (SET CLOCK) interception */
|
||||
static int handle_set_clock(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
|
@ -1093,6 +1111,9 @@ static int handle_stctg(struct kvm_vcpu *vcpu)
|
|||
static const intercept_handler_t eb_handlers[256] = {
|
||||
[0x2f] = handle_lctlg,
|
||||
[0x25] = handle_stctg,
|
||||
[0x60] = handle_ri,
|
||||
[0x61] = handle_ri,
|
||||
[0x62] = handle_ri,
|
||||
};
|
||||
|
||||
int kvm_s390_handle_eb(struct kvm_vcpu *vcpu)
|
||||
|
|
|
@ -129,7 +129,7 @@ static notrace cycle_t vread_pvclock(int *mode)
|
|||
return 0;
|
||||
}
|
||||
|
||||
ret = __pvclock_read_cycles(pvti);
|
||||
ret = __pvclock_read_cycles(pvti, rdtsc_ordered());
|
||||
} while (pvclock_read_retry(pvti, version));
|
||||
|
||||
/* refer to vread_tsc() comment for rationale */
|
||||
|
|
|
@ -568,6 +568,7 @@ struct kvm_vcpu_arch {
|
|||
struct kvm_steal_time steal;
|
||||
} st;
|
||||
|
||||
u64 tsc_offset;
|
||||
u64 last_guest_tsc;
|
||||
u64 last_host_tsc;
|
||||
u64 tsc_offset_adjustment;
|
||||
|
@ -701,6 +702,8 @@ struct kvm_hv {
|
|||
/* Hyper-v based guest crash (NT kernel bugcheck) parameters */
|
||||
u64 hv_crash_param[HV_X64_MSR_CRASH_PARAMS];
|
||||
u64 hv_crash_ctl;
|
||||
|
||||
HV_REFERENCE_TSC_PAGE tsc_ref;
|
||||
};
|
||||
|
||||
struct kvm_arch {
|
||||
|
@ -781,54 +784,56 @@ struct kvm_arch {
|
|||
bool disabled_lapic_found;
|
||||
|
||||
/* Struct members for AVIC */
|
||||
u32 avic_vm_id;
|
||||
u32 ldr_mode;
|
||||
struct page *avic_logical_id_table_page;
|
||||
struct page *avic_physical_id_table_page;
|
||||
struct hlist_node hnode;
|
||||
|
||||
bool x2apic_format;
|
||||
bool x2apic_broadcast_quirk_disabled;
|
||||
};
|
||||
|
||||
struct kvm_vm_stat {
|
||||
u32 mmu_shadow_zapped;
|
||||
u32 mmu_pte_write;
|
||||
u32 mmu_pte_updated;
|
||||
u32 mmu_pde_zapped;
|
||||
u32 mmu_flooded;
|
||||
u32 mmu_recycled;
|
||||
u32 mmu_cache_miss;
|
||||
u32 mmu_unsync;
|
||||
u32 remote_tlb_flush;
|
||||
u32 lpages;
|
||||
ulong mmu_shadow_zapped;
|
||||
ulong mmu_pte_write;
|
||||
ulong mmu_pte_updated;
|
||||
ulong mmu_pde_zapped;
|
||||
ulong mmu_flooded;
|
||||
ulong mmu_recycled;
|
||||
ulong mmu_cache_miss;
|
||||
ulong mmu_unsync;
|
||||
ulong remote_tlb_flush;
|
||||
ulong lpages;
|
||||
};
|
||||
|
||||
struct kvm_vcpu_stat {
|
||||
u32 pf_fixed;
|
||||
u32 pf_guest;
|
||||
u32 tlb_flush;
|
||||
u32 invlpg;
|
||||
u64 pf_fixed;
|
||||
u64 pf_guest;
|
||||
u64 tlb_flush;
|
||||
u64 invlpg;
|
||||
|
||||
u32 exits;
|
||||
u32 io_exits;
|
||||
u32 mmio_exits;
|
||||
u32 signal_exits;
|
||||
u32 irq_window_exits;
|
||||
u32 nmi_window_exits;
|
||||
u32 halt_exits;
|
||||
u32 halt_successful_poll;
|
||||
u32 halt_attempted_poll;
|
||||
u32 halt_poll_invalid;
|
||||
u32 halt_wakeup;
|
||||
u32 request_irq_exits;
|
||||
u32 irq_exits;
|
||||
u32 host_state_reload;
|
||||
u32 efer_reload;
|
||||
u32 fpu_reload;
|
||||
u32 insn_emulation;
|
||||
u32 insn_emulation_fail;
|
||||
u32 hypercalls;
|
||||
u32 irq_injections;
|
||||
u32 nmi_injections;
|
||||
u64 exits;
|
||||
u64 io_exits;
|
||||
u64 mmio_exits;
|
||||
u64 signal_exits;
|
||||
u64 irq_window_exits;
|
||||
u64 nmi_window_exits;
|
||||
u64 halt_exits;
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_wakeup;
|
||||
u64 request_irq_exits;
|
||||
u64 irq_exits;
|
||||
u64 host_state_reload;
|
||||
u64 efer_reload;
|
||||
u64 fpu_reload;
|
||||
u64 insn_emulation;
|
||||
u64 insn_emulation_fail;
|
||||
u64 hypercalls;
|
||||
u64 irq_injections;
|
||||
u64 nmi_injections;
|
||||
};
|
||||
|
||||
struct x86_instruction_info;
|
||||
|
@ -951,7 +956,6 @@ struct kvm_x86_ops {
|
|||
|
||||
bool (*has_wbinvd_exit)(void);
|
||||
|
||||
u64 (*read_tsc_offset)(struct kvm_vcpu *vcpu);
|
||||
void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
|
||||
|
||||
u64 (*read_l1_tsc)(struct kvm_vcpu *vcpu, u64 host_tsc);
|
||||
|
|
|
@ -87,9 +87,10 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
|
|||
}
|
||||
|
||||
static __always_inline
|
||||
cycle_t __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src)
|
||||
cycle_t __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
|
||||
u64 tsc)
|
||||
{
|
||||
u64 delta = rdtsc_ordered() - src->tsc_timestamp;
|
||||
u64 delta = tsc - src->tsc_timestamp;
|
||||
cycle_t offset = pvclock_scale_delta(delta, src->tsc_to_system_mul,
|
||||
src->tsc_shift);
|
||||
return src->system_time + offset;
|
||||
|
|
|
@ -80,7 +80,7 @@ cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src)
|
|||
|
||||
do {
|
||||
version = pvclock_read_begin(src);
|
||||
ret = __pvclock_read_cycles(src);
|
||||
ret = __pvclock_read_cycles(src, rdtsc_ordered());
|
||||
flags = src->flags;
|
||||
} while (pvclock_read_retry(src, version));
|
||||
|
||||
|
|
|
@ -13,7 +13,7 @@ kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
|
|||
|
||||
kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
|
||||
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
|
||||
hyperv.o page_track.o
|
||||
hyperv.o page_track.o debugfs.o
|
||||
|
||||
kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += assigned-dev.o iommu.o
|
||||
|
||||
|
|
|
@ -366,7 +366,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
|
|||
F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
|
||||
F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
|
||||
F(ADX) | F(SMAP) | F(AVX512F) | F(AVX512PF) | F(AVX512ER) |
|
||||
F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB);
|
||||
F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | F(AVX512DQ) |
|
||||
F(AVX512BW) | F(AVX512VL);
|
||||
|
||||
/* cpuid 0xD.1.eax */
|
||||
const u32 kvm_cpuid_D_1_eax_x86_features =
|
||||
|
|
|
@ -0,0 +1,69 @@
|
|||
/*
|
||||
* Kernel-based Virtual Machine driver for Linux
|
||||
*
|
||||
* Copyright 2016 Red Hat, Inc. and/or its affiliates.
|
||||
*
|
||||
* This work is licensed under the terms of the GNU GPL, version 2. See
|
||||
* the COPYING file in the top-level directory.
|
||||
*
|
||||
*/
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/debugfs.h>
|
||||
|
||||
bool kvm_arch_has_vcpu_debugfs(void)
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
static int vcpu_get_tsc_offset(void *data, u64 *val)
|
||||
{
|
||||
struct kvm_vcpu *vcpu = (struct kvm_vcpu *) data;
|
||||
*val = vcpu->arch.tsc_offset;
|
||||
return 0;
|
||||
}
|
||||
|
||||
DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_offset_fops, vcpu_get_tsc_offset, NULL, "%lld\n");
|
||||
|
||||
static int vcpu_get_tsc_scaling_ratio(void *data, u64 *val)
|
||||
{
|
||||
struct kvm_vcpu *vcpu = (struct kvm_vcpu *) data;
|
||||
*val = vcpu->arch.tsc_scaling_ratio;
|
||||
return 0;
|
||||
}
|
||||
|
||||
DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_fops, vcpu_get_tsc_scaling_ratio, NULL, "%llu\n");
|
||||
|
||||
static int vcpu_get_tsc_scaling_frac_bits(void *data, u64 *val)
|
||||
{
|
||||
*val = kvm_tsc_scaling_ratio_frac_bits;
|
||||
return 0;
|
||||
}
|
||||
|
||||
DEFINE_SIMPLE_ATTRIBUTE(vcpu_tsc_scaling_frac_fops, vcpu_get_tsc_scaling_frac_bits, NULL, "%llu\n");
|
||||
|
||||
int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct dentry *ret;
|
||||
|
||||
ret = debugfs_create_file("tsc-offset", 0444,
|
||||
vcpu->debugfs_dentry,
|
||||
vcpu, &vcpu_tsc_offset_fops);
|
||||
if (!ret)
|
||||
return -ENOMEM;
|
||||
|
||||
if (kvm_has_tsc_control) {
|
||||
ret = debugfs_create_file("tsc-scaling-ratio", 0444,
|
||||
vcpu->debugfs_dentry,
|
||||
vcpu, &vcpu_tsc_scaling_fops);
|
||||
if (!ret)
|
||||
return -ENOMEM;
|
||||
ret = debugfs_create_file("tsc-scaling-ratio-frac-bits", 0444,
|
||||
vcpu->debugfs_dentry,
|
||||
vcpu, &vcpu_tsc_scaling_frac_fops);
|
||||
if (!ret)
|
||||
return -ENOMEM;
|
||||
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
|
@ -386,7 +386,21 @@ static void synic_init(struct kvm_vcpu_hv_synic *synic)
|
|||
|
||||
static u64 get_time_ref_counter(struct kvm *kvm)
|
||||
{
|
||||
return div_u64(get_kernel_ns() + kvm->arch.kvmclock_offset, 100);
|
||||
struct kvm_hv *hv = &kvm->arch.hyperv;
|
||||
struct kvm_vcpu *vcpu;
|
||||
u64 tsc;
|
||||
|
||||
/*
|
||||
* The guest has not set up the TSC page or the clock isn't
|
||||
* stable, fall back to get_kvmclock_ns.
|
||||
*/
|
||||
if (!hv->tsc_ref.tsc_sequence)
|
||||
return div_u64(get_kvmclock_ns(kvm), 100);
|
||||
|
||||
vcpu = kvm_get_vcpu(kvm, 0);
|
||||
tsc = kvm_read_l1_tsc(vcpu, rdtsc());
|
||||
return mul_u64_u64_shr(tsc, hv->tsc_ref.tsc_scale, 64)
|
||||
+ hv->tsc_ref.tsc_offset;
|
||||
}
|
||||
|
||||
static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer,
|
||||
|
@ -756,6 +770,129 @@ static int kvm_hv_msr_set_crash_data(struct kvm_vcpu *vcpu,
|
|||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* The kvmclock and Hyper-V TSC page use similar formulas, and converting
|
||||
* between them is possible:
|
||||
*
|
||||
* kvmclock formula:
|
||||
* nsec = (ticks - tsc_timestamp) * tsc_to_system_mul * 2^(tsc_shift-32)
|
||||
* + system_time
|
||||
*
|
||||
* Hyper-V formula:
|
||||
* nsec/100 = ticks * scale / 2^64 + offset
|
||||
*
|
||||
* When tsc_timestamp = system_time = 0, offset is zero in the Hyper-V formula.
|
||||
* By dividing the kvmclock formula by 100 and equating what's left we get:
|
||||
* ticks * scale / 2^64 = ticks * tsc_to_system_mul * 2^(tsc_shift-32) / 100
|
||||
* scale / 2^64 = tsc_to_system_mul * 2^(tsc_shift-32) / 100
|
||||
* scale = tsc_to_system_mul * 2^(32+tsc_shift) / 100
|
||||
*
|
||||
* Now expand the kvmclock formula and divide by 100:
|
||||
* nsec = ticks * tsc_to_system_mul * 2^(tsc_shift-32)
|
||||
* - tsc_timestamp * tsc_to_system_mul * 2^(tsc_shift-32)
|
||||
* + system_time
|
||||
* nsec/100 = ticks * tsc_to_system_mul * 2^(tsc_shift-32) / 100
|
||||
* - tsc_timestamp * tsc_to_system_mul * 2^(tsc_shift-32) / 100
|
||||
* + system_time / 100
|
||||
*
|
||||
* Replace tsc_to_system_mul * 2^(tsc_shift-32) / 100 by scale / 2^64:
|
||||
* nsec/100 = ticks * scale / 2^64
|
||||
* - tsc_timestamp * scale / 2^64
|
||||
* + system_time / 100
|
||||
*
|
||||
* Equate with the Hyper-V formula so that ticks * scale / 2^64 cancels out:
|
||||
* offset = system_time / 100 - tsc_timestamp * scale / 2^64
|
||||
*
|
||||
* These two equivalencies are implemented in this function.
|
||||
*/
|
||||
static bool compute_tsc_page_parameters(struct pvclock_vcpu_time_info *hv_clock,
|
||||
HV_REFERENCE_TSC_PAGE *tsc_ref)
|
||||
{
|
||||
u64 max_mul;
|
||||
|
||||
if (!(hv_clock->flags & PVCLOCK_TSC_STABLE_BIT))
|
||||
return false;
|
||||
|
||||
/*
|
||||
* check if scale would overflow, if so we use the time ref counter
|
||||
* tsc_to_system_mul * 2^(tsc_shift+32) / 100 >= 2^64
|
||||
* tsc_to_system_mul / 100 >= 2^(32-tsc_shift)
|
||||
* tsc_to_system_mul >= 100 * 2^(32-tsc_shift)
|
||||
*/
|
||||
max_mul = 100ull << (32 - hv_clock->tsc_shift);
|
||||
if (hv_clock->tsc_to_system_mul >= max_mul)
|
||||
return false;
|
||||
|
||||
/*
|
||||
* Otherwise compute the scale and offset according to the formulas
|
||||
* derived above.
|
||||
*/
|
||||
tsc_ref->tsc_scale =
|
||||
mul_u64_u32_div(1ULL << (32 + hv_clock->tsc_shift),
|
||||
hv_clock->tsc_to_system_mul,
|
||||
100);
|
||||
|
||||
tsc_ref->tsc_offset = hv_clock->system_time;
|
||||
do_div(tsc_ref->tsc_offset, 100);
|
||||
tsc_ref->tsc_offset -=
|
||||
mul_u64_u64_shr(hv_clock->tsc_timestamp, tsc_ref->tsc_scale, 64);
|
||||
return true;
|
||||
}
|
||||
|
||||
void kvm_hv_setup_tsc_page(struct kvm *kvm,
|
||||
struct pvclock_vcpu_time_info *hv_clock)
|
||||
{
|
||||
struct kvm_hv *hv = &kvm->arch.hyperv;
|
||||
u32 tsc_seq;
|
||||
u64 gfn;
|
||||
|
||||
BUILD_BUG_ON(sizeof(tsc_seq) != sizeof(hv->tsc_ref.tsc_sequence));
|
||||
BUILD_BUG_ON(offsetof(HV_REFERENCE_TSC_PAGE, tsc_sequence) != 0);
|
||||
|
||||
if (!(hv->hv_tsc_page & HV_X64_MSR_TSC_REFERENCE_ENABLE))
|
||||
return;
|
||||
|
||||
gfn = hv->hv_tsc_page >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
|
||||
/*
|
||||
* Because the TSC parameters only vary when there is a
|
||||
* change in the master clock, do not bother with caching.
|
||||
*/
|
||||
if (unlikely(kvm_read_guest(kvm, gfn_to_gpa(gfn),
|
||||
&tsc_seq, sizeof(tsc_seq))))
|
||||
return;
|
||||
|
||||
/*
|
||||
* While we're computing and writing the parameters, force the
|
||||
* guest to use the time reference count MSR.
|
||||
*/
|
||||
hv->tsc_ref.tsc_sequence = 0;
|
||||
if (kvm_write_guest(kvm, gfn_to_gpa(gfn),
|
||||
&hv->tsc_ref, sizeof(hv->tsc_ref.tsc_sequence)))
|
||||
return;
|
||||
|
||||
if (!compute_tsc_page_parameters(hv_clock, &hv->tsc_ref))
|
||||
return;
|
||||
|
||||
/* Ensure sequence is zero before writing the rest of the struct. */
|
||||
smp_wmb();
|
||||
if (kvm_write_guest(kvm, gfn_to_gpa(gfn), &hv->tsc_ref, sizeof(hv->tsc_ref)))
|
||||
return;
|
||||
|
||||
/*
|
||||
* Now switch to the TSC page mechanism by writing the sequence.
|
||||
*/
|
||||
tsc_seq++;
|
||||
if (tsc_seq == 0xFFFFFFFF || tsc_seq == 0)
|
||||
tsc_seq = 1;
|
||||
|
||||
/* Write the struct entirely before the non-zero sequence. */
|
||||
smp_wmb();
|
||||
|
||||
hv->tsc_ref.tsc_sequence = tsc_seq;
|
||||
kvm_write_guest(kvm, gfn_to_gpa(gfn),
|
||||
&hv->tsc_ref, sizeof(hv->tsc_ref.tsc_sequence));
|
||||
}
|
||||
|
||||
static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
|
||||
bool host)
|
||||
{
|
||||
|
@ -793,23 +930,11 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
|
|||
mark_page_dirty(kvm, gfn);
|
||||
break;
|
||||
}
|
||||
case HV_X64_MSR_REFERENCE_TSC: {
|
||||
u64 gfn;
|
||||
HV_REFERENCE_TSC_PAGE tsc_ref;
|
||||
|
||||
memset(&tsc_ref, 0, sizeof(tsc_ref));
|
||||
case HV_X64_MSR_REFERENCE_TSC:
|
||||
hv->hv_tsc_page = data;
|
||||
if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE))
|
||||
if (hv->hv_tsc_page & HV_X64_MSR_TSC_REFERENCE_ENABLE)
|
||||
kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
|
||||
break;
|
||||
gfn = data >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
|
||||
if (kvm_write_guest(
|
||||
kvm,
|
||||
gfn << HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT,
|
||||
&tsc_ref, sizeof(tsc_ref)))
|
||||
return 1;
|
||||
mark_page_dirty(kvm, gfn);
|
||||
break;
|
||||
}
|
||||
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
|
||||
return kvm_hv_msr_set_crash_data(vcpu,
|
||||
msr - HV_X64_MSR_CRASH_P0,
|
||||
|
|
|
@ -84,4 +84,7 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
|
|||
|
||||
void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
|
||||
|
||||
void kvm_hv_setup_tsc_page(struct kvm *kvm,
|
||||
struct pvclock_vcpu_time_info *hv_clock);
|
||||
|
||||
#endif
|
||||
|
|
|
@ -1761,10 +1761,11 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
|
|||
if (value & MSR_IA32_APICBASE_ENABLE) {
|
||||
kvm_apic_set_xapic_id(apic, vcpu->vcpu_id);
|
||||
static_key_slow_dec_deferred(&apic_hw_disabled);
|
||||
} else
|
||||
} else {
|
||||
static_key_slow_inc(&apic_hw_disabled.key);
|
||||
recalculate_apic_map(vcpu->kvm);
|
||||
}
|
||||
}
|
||||
|
||||
if ((old_value ^ value) & X2APIC_ENABLE) {
|
||||
if (value & X2APIC_ENABLE) {
|
||||
|
|
|
@ -1207,7 +1207,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)
|
|||
*
|
||||
* Return true if tlb need be flushed.
|
||||
*/
|
||||
static bool spte_write_protect(struct kvm *kvm, u64 *sptep, bool pt_protect)
|
||||
static bool spte_write_protect(u64 *sptep, bool pt_protect)
|
||||
{
|
||||
u64 spte = *sptep;
|
||||
|
||||
|
@ -1233,12 +1233,12 @@ static bool __rmap_write_protect(struct kvm *kvm,
|
|||
bool flush = false;
|
||||
|
||||
for_each_rmap_spte(rmap_head, &iter, sptep)
|
||||
flush |= spte_write_protect(kvm, sptep, pt_protect);
|
||||
flush |= spte_write_protect(sptep, pt_protect);
|
||||
|
||||
return flush;
|
||||
}
|
||||
|
||||
static bool spte_clear_dirty(struct kvm *kvm, u64 *sptep)
|
||||
static bool spte_clear_dirty(u64 *sptep)
|
||||
{
|
||||
u64 spte = *sptep;
|
||||
|
||||
|
@ -1256,12 +1256,12 @@ static bool __rmap_clear_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
|
|||
bool flush = false;
|
||||
|
||||
for_each_rmap_spte(rmap_head, &iter, sptep)
|
||||
flush |= spte_clear_dirty(kvm, sptep);
|
||||
flush |= spte_clear_dirty(sptep);
|
||||
|
||||
return flush;
|
||||
}
|
||||
|
||||
static bool spte_set_dirty(struct kvm *kvm, u64 *sptep)
|
||||
static bool spte_set_dirty(u64 *sptep)
|
||||
{
|
||||
u64 spte = *sptep;
|
||||
|
||||
|
@ -1279,7 +1279,7 @@ static bool __rmap_set_dirty(struct kvm *kvm, struct kvm_rmap_head *rmap_head)
|
|||
bool flush = false;
|
||||
|
||||
for_each_rmap_spte(rmap_head, &iter, sptep)
|
||||
flush |= spte_set_dirty(kvm, sptep);
|
||||
flush |= spte_set_dirty(sptep);
|
||||
|
||||
return flush;
|
||||
}
|
||||
|
|
|
@ -34,6 +34,8 @@
|
|||
#include <linux/sched.h>
|
||||
#include <linux/trace_events.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/amd-iommu.h>
|
||||
#include <linux/hashtable.h>
|
||||
|
||||
#include <asm/apic.h>
|
||||
#include <asm/perf_event.h>
|
||||
|
@ -41,6 +43,7 @@
|
|||
#include <asm/desc.h>
|
||||
#include <asm/debugreg.h>
|
||||
#include <asm/kvm_para.h>
|
||||
#include <asm/irq_remapping.h>
|
||||
|
||||
#include <asm/virtext.h>
|
||||
#include "trace.h"
|
||||
|
@ -96,6 +99,19 @@ MODULE_DEVICE_TABLE(x86cpu, svm_cpu_id);
|
|||
#define AVIC_UNACCEL_ACCESS_OFFSET_MASK 0xFF0
|
||||
#define AVIC_UNACCEL_ACCESS_VECTOR_MASK 0xFFFFFFFF
|
||||
|
||||
/* AVIC GATAG is encoded using VM and VCPU IDs */
|
||||
#define AVIC_VCPU_ID_BITS 8
|
||||
#define AVIC_VCPU_ID_MASK ((1 << AVIC_VCPU_ID_BITS) - 1)
|
||||
|
||||
#define AVIC_VM_ID_BITS 24
|
||||
#define AVIC_VM_ID_NR (1 << AVIC_VM_ID_BITS)
|
||||
#define AVIC_VM_ID_MASK ((1 << AVIC_VM_ID_BITS) - 1)
|
||||
|
||||
#define AVIC_GATAG(x, y) (((x & AVIC_VM_ID_MASK) << AVIC_VCPU_ID_BITS) | \
|
||||
(y & AVIC_VCPU_ID_MASK))
|
||||
#define AVIC_GATAG_TO_VMID(x) ((x >> AVIC_VCPU_ID_BITS) & AVIC_VM_ID_MASK)
|
||||
#define AVIC_GATAG_TO_VCPUID(x) (x & AVIC_VCPU_ID_MASK)
|
||||
|
||||
static bool erratum_383_found __read_mostly;
|
||||
|
||||
static const u32 host_save_user_msrs[] = {
|
||||
|
@ -185,6 +201,23 @@ struct vcpu_svm {
|
|||
struct page *avic_backing_page;
|
||||
u64 *avic_physical_id_cache;
|
||||
bool avic_is_running;
|
||||
|
||||
/*
|
||||
* Per-vcpu list of struct amd_svm_iommu_ir:
|
||||
* This is used mainly to store interrupt remapping information used
|
||||
* when update the vcpu affinity. This avoids the need to scan for
|
||||
* IRTE and try to match ga_tag in the IOMMU driver.
|
||||
*/
|
||||
struct list_head ir_list;
|
||||
spinlock_t ir_list_lock;
|
||||
};
|
||||
|
||||
/*
|
||||
* This is a wrapper of struct amd_iommu_ir_data.
|
||||
*/
|
||||
struct amd_svm_iommu_ir {
|
||||
struct list_head node; /* Used by SVM for per-vcpu ir_list */
|
||||
void *data; /* Storing pointer to struct amd_ir_data */
|
||||
};
|
||||
|
||||
#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF)
|
||||
|
@ -242,6 +275,10 @@ static int avic;
|
|||
module_param(avic, int, S_IRUGO);
|
||||
#endif
|
||||
|
||||
/* AVIC VM ID bit masks and lock */
|
||||
static DECLARE_BITMAP(avic_vm_id_bitmap, AVIC_VM_ID_NR);
|
||||
static DEFINE_SPINLOCK(avic_vm_id_lock);
|
||||
|
||||
static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
|
||||
static void svm_flush_tlb(struct kvm_vcpu *vcpu);
|
||||
static void svm_complete_interrupts(struct vcpu_svm *svm);
|
||||
|
@ -928,6 +965,55 @@ static void svm_disable_lbrv(struct vcpu_svm *svm)
|
|||
set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 0, 0);
|
||||
}
|
||||
|
||||
/* Note:
|
||||
* This hash table is used to map VM_ID to a struct kvm_arch,
|
||||
* when handling AMD IOMMU GALOG notification to schedule in
|
||||
* a particular vCPU.
|
||||
*/
|
||||
#define SVM_VM_DATA_HASH_BITS 8
|
||||
DECLARE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
|
||||
static spinlock_t svm_vm_data_hash_lock;
|
||||
|
||||
/* Note:
|
||||
* This function is called from IOMMU driver to notify
|
||||
* SVM to schedule in a particular vCPU of a particular VM.
|
||||
*/
|
||||
static int avic_ga_log_notifier(u32 ga_tag)
|
||||
{
|
||||
unsigned long flags;
|
||||
struct kvm_arch *ka = NULL;
|
||||
struct kvm_vcpu *vcpu = NULL;
|
||||
u32 vm_id = AVIC_GATAG_TO_VMID(ga_tag);
|
||||
u32 vcpu_id = AVIC_GATAG_TO_VCPUID(ga_tag);
|
||||
|
||||
pr_debug("SVM: %s: vm_id=%#x, vcpu_id=%#x\n", __func__, vm_id, vcpu_id);
|
||||
|
||||
spin_lock_irqsave(&svm_vm_data_hash_lock, flags);
|
||||
hash_for_each_possible(svm_vm_data_hash, ka, hnode, vm_id) {
|
||||
struct kvm *kvm = container_of(ka, struct kvm, arch);
|
||||
struct kvm_arch *vm_data = &kvm->arch;
|
||||
|
||||
if (vm_data->avic_vm_id != vm_id)
|
||||
continue;
|
||||
vcpu = kvm_get_vcpu_by_id(kvm, vcpu_id);
|
||||
break;
|
||||
}
|
||||
spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
|
||||
|
||||
if (!vcpu)
|
||||
return 0;
|
||||
|
||||
/* Note:
|
||||
* At this point, the IOMMU should have already set the pending
|
||||
* bit in the vAPIC backing page. So, we just need to schedule
|
||||
* in the vcpu.
|
||||
*/
|
||||
if (vcpu->mode == OUTSIDE_GUEST_MODE)
|
||||
kvm_vcpu_wake_up(vcpu);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static __init int svm_hardware_setup(void)
|
||||
{
|
||||
int cpu;
|
||||
|
@ -986,10 +1072,15 @@ static __init int svm_hardware_setup(void)
|
|||
if (avic) {
|
||||
if (!npt_enabled ||
|
||||
!boot_cpu_has(X86_FEATURE_AVIC) ||
|
||||
!IS_ENABLED(CONFIG_X86_LOCAL_APIC))
|
||||
!IS_ENABLED(CONFIG_X86_LOCAL_APIC)) {
|
||||
avic = false;
|
||||
else
|
||||
} else {
|
||||
pr_info("AVIC enabled\n");
|
||||
|
||||
hash_init(svm_vm_data_hash);
|
||||
spin_lock_init(&svm_vm_data_hash_lock);
|
||||
amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
@ -1028,13 +1119,6 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type)
|
|||
seg->base = 0;
|
||||
}
|
||||
|
||||
static u64 svm_read_tsc_offset(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
|
||||
return svm->vmcb->control.tsc_offset;
|
||||
}
|
||||
|
||||
static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
|
||||
{
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
|
@ -1280,19 +1364,55 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
|
|||
return 0;
|
||||
}
|
||||
|
||||
static inline int avic_get_next_vm_id(void)
|
||||
{
|
||||
int id;
|
||||
|
||||
spin_lock(&avic_vm_id_lock);
|
||||
|
||||
/* AVIC VM ID is one-based. */
|
||||
id = find_next_zero_bit(avic_vm_id_bitmap, AVIC_VM_ID_NR, 1);
|
||||
if (id <= AVIC_VM_ID_MASK)
|
||||
__set_bit(id, avic_vm_id_bitmap);
|
||||
else
|
||||
id = -EAGAIN;
|
||||
|
||||
spin_unlock(&avic_vm_id_lock);
|
||||
return id;
|
||||
}
|
||||
|
||||
static inline int avic_free_vm_id(int id)
|
||||
{
|
||||
if (id <= 0 || id > AVIC_VM_ID_MASK)
|
||||
return -EINVAL;
|
||||
|
||||
spin_lock(&avic_vm_id_lock);
|
||||
__clear_bit(id, avic_vm_id_bitmap);
|
||||
spin_unlock(&avic_vm_id_lock);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void avic_vm_destroy(struct kvm *kvm)
|
||||
{
|
||||
unsigned long flags;
|
||||
struct kvm_arch *vm_data = &kvm->arch;
|
||||
|
||||
avic_free_vm_id(vm_data->avic_vm_id);
|
||||
|
||||
if (vm_data->avic_logical_id_table_page)
|
||||
__free_page(vm_data->avic_logical_id_table_page);
|
||||
if (vm_data->avic_physical_id_table_page)
|
||||
__free_page(vm_data->avic_physical_id_table_page);
|
||||
|
||||
spin_lock_irqsave(&svm_vm_data_hash_lock, flags);
|
||||
hash_del(&vm_data->hnode);
|
||||
spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
|
||||
}
|
||||
|
||||
static int avic_vm_init(struct kvm *kvm)
|
||||
{
|
||||
int err = -ENOMEM;
|
||||
unsigned long flags;
|
||||
int vm_id, err = -ENOMEM;
|
||||
struct kvm_arch *vm_data = &kvm->arch;
|
||||
struct page *p_page;
|
||||
struct page *l_page;
|
||||
|
@ -1300,6 +1420,11 @@ static int avic_vm_init(struct kvm *kvm)
|
|||
if (!avic)
|
||||
return 0;
|
||||
|
||||
vm_id = avic_get_next_vm_id();
|
||||
if (vm_id < 0)
|
||||
return vm_id;
|
||||
vm_data->avic_vm_id = (u32)vm_id;
|
||||
|
||||
/* Allocating physical APIC ID table (4KB) */
|
||||
p_page = alloc_page(GFP_KERNEL);
|
||||
if (!p_page)
|
||||
|
@ -1316,6 +1441,10 @@ static int avic_vm_init(struct kvm *kvm)
|
|||
vm_data->avic_logical_id_table_page = l_page;
|
||||
clear_page(page_address(l_page));
|
||||
|
||||
spin_lock_irqsave(&svm_vm_data_hash_lock, flags);
|
||||
hash_add(svm_vm_data_hash, &vm_data->hnode, vm_data->avic_vm_id);
|
||||
spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
|
||||
|
||||
return 0;
|
||||
|
||||
free_avic:
|
||||
|
@ -1323,31 +1452,34 @@ free_avic:
|
|||
return err;
|
||||
}
|
||||
|
||||
/**
|
||||
* This function is called during VCPU halt/unhalt.
|
||||
*/
|
||||
static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
|
||||
static inline int
|
||||
avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int cpu, bool r)
|
||||
{
|
||||
u64 entry;
|
||||
int h_physical_id = kvm_cpu_get_apicid(vcpu->cpu);
|
||||
int ret = 0;
|
||||
unsigned long flags;
|
||||
struct amd_svm_iommu_ir *ir;
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
|
||||
if (!kvm_vcpu_apicv_active(vcpu))
|
||||
return;
|
||||
if (!kvm_arch_has_assigned_device(vcpu->kvm))
|
||||
return 0;
|
||||
|
||||
svm->avic_is_running = is_run;
|
||||
/*
|
||||
* Here, we go through the per-vcpu ir_list to update all existing
|
||||
* interrupt remapping table entry targeting this vcpu.
|
||||
*/
|
||||
spin_lock_irqsave(&svm->ir_list_lock, flags);
|
||||
|
||||
/* ID = 0xff (broadcast), ID > 0xff (reserved) */
|
||||
if (WARN_ON(h_physical_id >= AVIC_MAX_PHYSICAL_ID_COUNT))
|
||||
return;
|
||||
if (list_empty(&svm->ir_list))
|
||||
goto out;
|
||||
|
||||
entry = READ_ONCE(*(svm->avic_physical_id_cache));
|
||||
WARN_ON(is_run == !!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK));
|
||||
|
||||
entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
|
||||
if (is_run)
|
||||
entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
|
||||
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
|
||||
list_for_each_entry(ir, &svm->ir_list, node) {
|
||||
ret = amd_iommu_update_ga(cpu, r, ir->data);
|
||||
if (ret)
|
||||
break;
|
||||
}
|
||||
out:
|
||||
spin_unlock_irqrestore(&svm->ir_list_lock, flags);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
||||
|
@ -1374,6 +1506,8 @@ static void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
|||
entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
|
||||
|
||||
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
|
||||
avic_update_iommu_vcpu_affinity(vcpu, h_physical_id,
|
||||
svm->avic_is_running);
|
||||
}
|
||||
|
||||
static void avic_vcpu_put(struct kvm_vcpu *vcpu)
|
||||
|
@ -1385,10 +1519,27 @@ static void avic_vcpu_put(struct kvm_vcpu *vcpu)
|
|||
return;
|
||||
|
||||
entry = READ_ONCE(*(svm->avic_physical_id_cache));
|
||||
if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
|
||||
avic_update_iommu_vcpu_affinity(vcpu, -1, 0);
|
||||
|
||||
entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
|
||||
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
|
||||
}
|
||||
|
||||
/**
|
||||
* This function is called during VCPU halt/unhalt.
|
||||
*/
|
||||
static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
|
||||
{
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
|
||||
svm->avic_is_running = is_run;
|
||||
if (is_run)
|
||||
avic_vcpu_load(vcpu, vcpu->cpu);
|
||||
else
|
||||
avic_vcpu_put(vcpu);
|
||||
}
|
||||
|
||||
static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
|
||||
{
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
|
@ -1450,6 +1601,9 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
|
|||
err = avic_init_backing_page(&svm->vcpu);
|
||||
if (err)
|
||||
goto free_page4;
|
||||
|
||||
INIT_LIST_HEAD(&svm->ir_list);
|
||||
spin_lock_init(&svm->ir_list_lock);
|
||||
}
|
||||
|
||||
/* We initialize this flag to true to make sure that the is_running
|
||||
|
@ -4246,6 +4400,209 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
|
|||
kvm_vcpu_wake_up(vcpu);
|
||||
}
|
||||
|
||||
static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
|
||||
{
|
||||
unsigned long flags;
|
||||
struct amd_svm_iommu_ir *cur;
|
||||
|
||||
spin_lock_irqsave(&svm->ir_list_lock, flags);
|
||||
list_for_each_entry(cur, &svm->ir_list, node) {
|
||||
if (cur->data != pi->ir_data)
|
||||
continue;
|
||||
list_del(&cur->node);
|
||||
kfree(cur);
|
||||
break;
|
||||
}
|
||||
spin_unlock_irqrestore(&svm->ir_list_lock, flags);
|
||||
}
|
||||
|
||||
static int svm_ir_list_add(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
|
||||
{
|
||||
int ret = 0;
|
||||
unsigned long flags;
|
||||
struct amd_svm_iommu_ir *ir;
|
||||
|
||||
/**
|
||||
* In some cases, the existing irte is updaed and re-set,
|
||||
* so we need to check here if it's already been * added
|
||||
* to the ir_list.
|
||||
*/
|
||||
if (pi->ir_data && (pi->prev_ga_tag != 0)) {
|
||||
struct kvm *kvm = svm->vcpu.kvm;
|
||||
u32 vcpu_id = AVIC_GATAG_TO_VCPUID(pi->prev_ga_tag);
|
||||
struct kvm_vcpu *prev_vcpu = kvm_get_vcpu_by_id(kvm, vcpu_id);
|
||||
struct vcpu_svm *prev_svm;
|
||||
|
||||
if (!prev_vcpu) {
|
||||
ret = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
prev_svm = to_svm(prev_vcpu);
|
||||
svm_ir_list_del(prev_svm, pi);
|
||||
}
|
||||
|
||||
/**
|
||||
* Allocating new amd_iommu_pi_data, which will get
|
||||
* add to the per-vcpu ir_list.
|
||||
*/
|
||||
ir = kzalloc(sizeof(struct amd_svm_iommu_ir), GFP_KERNEL);
|
||||
if (!ir) {
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
ir->data = pi->ir_data;
|
||||
|
||||
spin_lock_irqsave(&svm->ir_list_lock, flags);
|
||||
list_add(&ir->node, &svm->ir_list);
|
||||
spin_unlock_irqrestore(&svm->ir_list_lock, flags);
|
||||
out:
|
||||
return ret;
|
||||
}
|
||||
|
||||
/**
|
||||
* Note:
|
||||
* The HW cannot support posting multicast/broadcast
|
||||
* interrupts to a vCPU. So, we still use legacy interrupt
|
||||
* remapping for these kind of interrupts.
|
||||
*
|
||||
* For lowest-priority interrupts, we only support
|
||||
* those with single CPU as the destination, e.g. user
|
||||
* configures the interrupts via /proc/irq or uses
|
||||
* irqbalance to make the interrupts single-CPU.
|
||||
*/
|
||||
static int
|
||||
get_pi_vcpu_info(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
|
||||
struct vcpu_data *vcpu_info, struct vcpu_svm **svm)
|
||||
{
|
||||
struct kvm_lapic_irq irq;
|
||||
struct kvm_vcpu *vcpu = NULL;
|
||||
|
||||
kvm_set_msi_irq(kvm, e, &irq);
|
||||
|
||||
if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
|
||||
pr_debug("SVM: %s: use legacy intr remap mode for irq %u\n",
|
||||
__func__, irq.vector);
|
||||
return -1;
|
||||
}
|
||||
|
||||
pr_debug("SVM: %s: use GA mode for irq %u\n", __func__,
|
||||
irq.vector);
|
||||
*svm = to_svm(vcpu);
|
||||
vcpu_info->pi_desc_addr = page_to_phys((*svm)->avic_backing_page);
|
||||
vcpu_info->vector = irq.vector;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* svm_update_pi_irte - set IRTE for Posted-Interrupts
|
||||
*
|
||||
* @kvm: kvm
|
||||
* @host_irq: host irq of the interrupt
|
||||
* @guest_irq: gsi of the interrupt
|
||||
* @set: set or unset PI
|
||||
* returns 0 on success, < 0 on failure
|
||||
*/
|
||||
static int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
|
||||
uint32_t guest_irq, bool set)
|
||||
{
|
||||
struct kvm_kernel_irq_routing_entry *e;
|
||||
struct kvm_irq_routing_table *irq_rt;
|
||||
int idx, ret = -EINVAL;
|
||||
|
||||
if (!kvm_arch_has_assigned_device(kvm) ||
|
||||
!irq_remapping_cap(IRQ_POSTING_CAP))
|
||||
return 0;
|
||||
|
||||
pr_debug("SVM: %s: host_irq=%#x, guest_irq=%#x, set=%#x\n",
|
||||
__func__, host_irq, guest_irq, set);
|
||||
|
||||
idx = srcu_read_lock(&kvm->irq_srcu);
|
||||
irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
|
||||
WARN_ON(guest_irq >= irq_rt->nr_rt_entries);
|
||||
|
||||
hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
|
||||
struct vcpu_data vcpu_info;
|
||||
struct vcpu_svm *svm = NULL;
|
||||
|
||||
if (e->type != KVM_IRQ_ROUTING_MSI)
|
||||
continue;
|
||||
|
||||
/**
|
||||
* Here, we setup with legacy mode in the following cases:
|
||||
* 1. When cannot target interrupt to a specific vcpu.
|
||||
* 2. Unsetting posted interrupt.
|
||||
* 3. APIC virtialization is disabled for the vcpu.
|
||||
*/
|
||||
if (!get_pi_vcpu_info(kvm, e, &vcpu_info, &svm) && set &&
|
||||
kvm_vcpu_apicv_active(&svm->vcpu)) {
|
||||
struct amd_iommu_pi_data pi;
|
||||
|
||||
/* Try to enable guest_mode in IRTE */
|
||||
pi.base = page_to_phys(svm->avic_backing_page) & AVIC_HPA_MASK;
|
||||
pi.ga_tag = AVIC_GATAG(kvm->arch.avic_vm_id,
|
||||
svm->vcpu.vcpu_id);
|
||||
pi.is_guest_mode = true;
|
||||
pi.vcpu_data = &vcpu_info;
|
||||
ret = irq_set_vcpu_affinity(host_irq, &pi);
|
||||
|
||||
/**
|
||||
* Here, we successfully setting up vcpu affinity in
|
||||
* IOMMU guest mode. Now, we need to store the posted
|
||||
* interrupt information in a per-vcpu ir_list so that
|
||||
* we can reference to them directly when we update vcpu
|
||||
* scheduling information in IOMMU irte.
|
||||
*/
|
||||
if (!ret && pi.is_guest_mode)
|
||||
svm_ir_list_add(svm, &pi);
|
||||
} else {
|
||||
/* Use legacy mode in IRTE */
|
||||
struct amd_iommu_pi_data pi;
|
||||
|
||||
/**
|
||||
* Here, pi is used to:
|
||||
* - Tell IOMMU to use legacy mode for this interrupt.
|
||||
* - Retrieve ga_tag of prior interrupt remapping data.
|
||||
*/
|
||||
pi.is_guest_mode = false;
|
||||
ret = irq_set_vcpu_affinity(host_irq, &pi);
|
||||
|
||||
/**
|
||||
* Check if the posted interrupt was previously
|
||||
* setup with the guest_mode by checking if the ga_tag
|
||||
* was cached. If so, we need to clean up the per-vcpu
|
||||
* ir_list.
|
||||
*/
|
||||
if (!ret && pi.prev_ga_tag) {
|
||||
int id = AVIC_GATAG_TO_VCPUID(pi.prev_ga_tag);
|
||||
struct kvm_vcpu *vcpu;
|
||||
|
||||
vcpu = kvm_get_vcpu_by_id(kvm, id);
|
||||
if (vcpu)
|
||||
svm_ir_list_del(to_svm(vcpu), &pi);
|
||||
}
|
||||
}
|
||||
|
||||
if (!ret && svm) {
|
||||
trace_kvm_pi_irte_update(svm->vcpu.vcpu_id,
|
||||
host_irq, e->gsi,
|
||||
vcpu_info.vector,
|
||||
vcpu_info.pi_desc_addr, set);
|
||||
}
|
||||
|
||||
if (ret < 0) {
|
||||
pr_err("%s: failed to update PI IRTE\n", __func__);
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
ret = 0;
|
||||
out:
|
||||
srcu_read_unlock(&kvm->irq_srcu, idx);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
|
@ -5064,7 +5421,6 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
|
|||
|
||||
.has_wbinvd_exit = svm_has_wbinvd_exit,
|
||||
|
||||
.read_tsc_offset = svm_read_tsc_offset,
|
||||
.write_tsc_offset = svm_write_tsc_offset,
|
||||
.adjust_tsc_offset_guest = svm_adjust_tsc_offset_guest,
|
||||
.read_l1_tsc = svm_read_l1_tsc,
|
||||
|
@ -5078,6 +5434,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
|
|||
|
||||
.pmu_ops = &amd_pmu_ops,
|
||||
.deliver_posted_interrupt = svm_deliver_avic_intr,
|
||||
.update_pi_irte = svm_update_pi_irte,
|
||||
};
|
||||
|
||||
static int __init svm_init(void)
|
||||
|
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
Reference in New Issue