Commit Graph

262 Commits

Author SHA1 Message Date
Oliver Upton 0d3b2b4d23 Merge branch kvm-arm64/nv-prefix into kvmarm/next
* kvm-arm64/nv-prefix:
  : Preamble to NV support, courtesy of Marc Zyngier.
  :
  : This brings in a set of prerequisite patches for supporting nested
  : virtualization in KVM/arm64. Of course, there is a long way to go until
  : NV is actually enabled in KVM.
  :
  :  - Introduce cpucap / vCPU feature flag to pivot the NV code on
  :
  :  - Add support for EL2 vCPU register state
  :
  :  - Basic nested exception handling
  :
  :  - Hide unsupported features from the ID registers for NV-capable VMs
  KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
  KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
  KVM: arm64: nv: Filter out unsupported features from ID regs
  KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
  KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
  KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
  KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
  KVM: arm64: nv: Handle SMCs taken from virtual EL2
  KVM: arm64: nv: Handle trapped ERET from virtual EL2
  KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
  KVM: arm64: nv: Support virtual EL2 exceptions
  KVM: arm64: nv: Handle HCR_EL2.NV system register traps
  KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
  KVM: arm64: nv: Add EL2 system registers to vcpu context
  KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
  KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
  KVM: arm64: nv: Introduce nested virtualization VCPU feature
  KVM: arm64: Use the S2 MMU context to iterate over S2 table
  arm64: Add ARM64_HAS_NESTED_VIRT cpufeature

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:33:41 +00:00
Oliver Upton 022d3f0800 Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Convert CPACR_EL1_TTA to the new, generated system register
  :    definitions.
  :
  :  - Serialize toggling CPACR_EL1.SMEN to avoid unexpected exceptions when
  :    accessing SVCR in the host.
  :
  :  - Avoid quiescing the guest if a vCPU accesses its own redistributor's
  :    SGIs/PPIs, eliminating the need to IPI. Largely an optimization for
  :    nested virtualization, as the L1 accesses the affected registers
  :    rather often.
  :
  :  - Conversion to kstrtobool()
  :
  :  - Common definition of INVALID_GPA across architectures
  :
  :  - Enable CONFIG_USERFAULTFD for CI runs of KVM selftests
  KVM: arm64: Fix non-kerneldoc comments
  KVM: selftests: Enable USERFAULTFD
  KVM: selftests: Remove redundant setbuf()
  arm64/sysreg: clean up some inconsistent indenting
  KVM: MMU: Make the definition of 'INVALID_GPA' common
  KVM: arm64: vgic-v3: Use kstrtobool() instead of strtobool()
  KVM: arm64: vgic-v3: Limit IPI-ing when accessing GICR_{C,S}ACTIVER0
  KVM: arm64: Synchronize SMEN on vcpu schedule out
  KVM: arm64: Kill CPACR_EL1_TTA definition

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:33:25 +00:00
Oliver Upton 3f1a14af5e Merge branch kvm-arm64/psci-relay-fixes into kvmarm/next
* kvm-arm64/psci-relay-fixes:
  : Fixes for CPU on/resume with pKVM, courtesy of Quentin Perret.
  :
  : A consequence of deprivileging the host is that pKVM relays PSCI calls
  : on behalf of the host. pKVM's CPU initialization failed to fully
  : initialize the CPU's EL2 state, which notably led to unexpected SVE
  : traps resulting in a hyp panic.
  :
  : The issue is addressed by reusing parts of __finalise_el2 to restore CPU
  : state in the PSCI relay.
  KVM: arm64: Finalise EL2 state from pKVM PSCI relay
  KVM: arm64: Use sanitized values in __check_override in nVHE
  KVM: arm64: Introduce finalise_el2_state macro
  KVM: arm64: Provide sanitized SYS_ID_AA64SMFR0_EL1 to nVHE
2023-02-13 23:30:37 +00:00
Oliver Upton e8789ab704 Merge branch kvm-arm64/virtual-cache-geometry into kvmarm/next
* kvm-arm64/virtual-cache-geometry:
  : Virtualized cache geometry for KVM guests, courtesy of Akihiko Odaki.
  :
  : KVM/arm64 has always exposed the host cache geometry directly to the
  : guest, even though non-secure software should never perform CMOs by
  : Set/Way. This was slightly wrong, as the cache geometry was derived from
  : the PE on which the vCPU thread was running and not a sanitized value.
  :
  : Altogether, this leads to issues migrating VMs on heterogeneous
  : systems, as the cache geometry saved/restored could be inconsistent.
  :
  : KVM/arm64 now presents 1 level of cache with 1 set and 1 way. The cache
  : geometry is entirely controlled by userspace, such that migrations from
  : older kernels continue to work.
  KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
  KVM: arm64: Normalize cache configuration
  KVM: arm64: Mask FEAT_CCIDX
  KVM: arm64: Always set HCR_TID2
  arm64/cache: Move CLIDR macro definitions
  arm64/sysreg: Add CCSIDR2_EL1
  arm64/sysreg: Convert CCSIDR_EL1 to automatic generation
  arm64: Allow the definition of UNKNOWN system register fields

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:32:40 +00:00
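To illustrate the virtual-cache-geometry merge above, the following C sketch encodes a cache with 1 set and 1 way in the 32-bit CCSIDR_EL1 layout (the format used when FEAT_CCIDX is not implemented, which the series masks). The function names and the line-size parameter are assumptions made for this example and are not taken from the KVM sources.

```c
#include <stdint.h>

/* CCSIDR_EL1 field positions in the 32-bit (non-CCIDX) format. */
#define SKETCH_CCSIDR_LINESIZE_SHIFT 0   /* log2(line bytes) - 4 */
#define SKETCH_CCSIDR_ASSOC_SHIFT    3   /* number of ways - 1   */
#define SKETCH_CCSIDR_NUMSETS_SHIFT  13  /* number of sets - 1   */

/*
 * Hypothetical helper: encode a cache geometry.
 * Assumes line_bytes is a power of two and at least 16.
 */
uint32_t sketch_ccsidr_encode(uint32_t line_bytes, uint32_t ways, uint32_t sets)
{
    uint32_t log2_line = 0;

    while ((1u << log2_line) < line_bytes)
        log2_line++;

    return ((log2_line - 4) << SKETCH_CCSIDR_LINESIZE_SHIFT) |
           ((ways - 1) << SKETCH_CCSIDR_ASSOC_SHIFT) |
           ((sets - 1) << SKETCH_CCSIDR_NUMSETS_SHIFT);
}

/* The minimal geometry described above: 1 set, 1 way. */
uint32_t sketch_ccsidr_minimal(uint32_t line_bytes)
{
    return sketch_ccsidr_encode(line_bytes, 1, 1);
}
```

With 1 set and 1 way, the associativity and set-count fields are both zero, so only the line-size field carries information.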
Jintack Lim 675cabc899 arm64: Add ARM64_HAS_NESTED_VIRT cpufeature
Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
CPU has the ARMv8.3 nested virtualization capability, together
with the 'kvm-arm.mode=nested' command line option.

This will be used to support nested virtualization in KVM.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
[maz: moved the command-line option to kvm-arm.mode]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 09:16:11 +00:00
Oliver Upton 5f623a598d KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
Generally speaking, any memory allocations that can be associated with a
particular VM should be charged to the cgroup of its process.
Nonetheless, there are a couple spots in KVM/arm64 that aren't currently
accounted:

 - the ccsidr array containing the virtualized cache hierarchy

 - the cpumask of supported cpus, for use of the vPMU on heterogeneous
   systems

Go ahead and set __GFP_ACCOUNT for these allocations.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230206235229.4174711-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-07 13:56:18 +00:00
Marc Zyngier 67d953d4d7 KVM: arm64: Fix non-kerneldoc comments
The robots amongst us have started spitting out irritating emails about
random errors such as:

<quote>
arch/arm64/kvm/arm.c:2207: warning: expecting prototype for Initialize Hyp().
Prototype was for kvm_arm_init() instead
</quote>

which makes little sense until you finally grok what they are on about:
comments that look like a kerneldoc, but that aren't.

Let's address this before I get even more irritated... ;-)

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/63e139e1.J5AHO6vmxaALh7xv%25lkp@intel.com
Link: https://lore.kernel.org/r/20230207094321.1238600-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-07 13:55:23 +00:00
Quentin Perret 8669651ce0 KVM: arm64: Provide sanitized SYS_ID_AA64SMFR0_EL1 to nVHE
We will need a sanitized copy of SYS_ID_AA64SMFR0_EL1 from the nVHE EL2
code shortly, so make sure to provide it with a copy.

Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230201103755.1398086-2-qperret@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-02 21:46:43 +00:00
Sean Christopherson 81a1cf9f89 KVM: Drop kvm_arch_check_processor_compat() hook
Drop kvm_arch_check_processor_compat() and its support code now that all
architecture implementations are nops.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20221130230934.1014142-33-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:28 -05:00
Sean Christopherson a578a0a9e3 KVM: Drop kvm_arch_{init,exit}() hooks
Drop kvm_arch_init() and kvm_arch_exit() now that all implementations
are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-30-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:23 -05:00
Sean Christopherson 53bf620a2c KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init
Tag kvm_arm_init() and its unique helper as __init, and tag data that is
only ever modified under the kvm_arm_init() umbrella as read-only after
init.

Opportunistically name the boolean param in kvm_timer_hyp_init()'s
prototype to match its definition.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-21-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:10 -05:00
Sean Christopherson 1dc0f02d53 KVM: arm64: Do arm/arch initialization without bouncing through kvm_init()
Do arm/arch specific initialization directly in arm's module_init(), now
called kvm_arm_init(), instead of bouncing through kvm_init() to reach
kvm_arch_init().  Invoking kvm_arch_init() is the very first action
performed by kvm_init(), so from an initialization perspective this is a
glorified nop.

Avoiding kvm_arch_init() also fixes a mostly benign bug as kvm_arch_exit()
doesn't properly unwind if a later stage of kvm_init() fails.  While the
soon-to-be-deleted comment about compiling as a module being unsupported
is correct, kvm_arch_exit() can still be called by kvm_init() if any step
after the call to kvm_arch_init() fails.

Add a FIXME to call out that pKVM initialization isn't unwound if
kvm_init() fails, which is a pre-existing problem inherited from
kvm_arch_exit().

Making kvm_arch_init() a nop will also allow dropping kvm_arch_init() and
kvm_arch_exit() entirely once all other architectures follow suit.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-20-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:08 -05:00
Sean Christopherson 78b3bf485d KVM: arm64: Unregister perf callbacks if hypervisor finalization fails
Undo everything done by init_subsystems() if a later initialization step
fails, i.e. unregister perf callbacks in addition to unregistering the
power management notifier.

Fixes: bfa79a8054 ("KVM: arm64: Elevate hypervisor mappings creation at EL2")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-19-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:07 -05:00
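The unwinding described in the commit above follows the usual pattern of undoing completed steps in reverse order when a later step fails. A minimal userspace sketch of that pattern is shown here; the struct, the function name, and the boolean flags are hypothetical stand-ins for the real registrations and do not reproduce the KVM code.

```c
#include <stdbool.h>

/* Hypothetical record of which setup steps are currently in effect. */
struct sketch_init_state {
    bool pm_notifier_registered;
    bool perf_callbacks_registered;
    bool finalized;
};

/*
 * Run the setup steps in order. If the final step fails, undo every
 * earlier step in reverse order so that nothing stays registered.
 * Returns 0 on success and -1 on failure.
 */
int sketch_init_with_unwind(struct sketch_init_state *s, bool fail_finalize)
{
    s->pm_notifier_registered = true;     /* step 1 */
    s->perf_callbacks_registered = true;  /* step 2 */

    if (fail_finalize)                    /* step 3 may fail */
        goto out_unwind;

    s->finalized = true;
    return 0;

out_unwind:
    s->perf_callbacks_registered = false; /* undo step 2 */
    s->pm_notifier_registered = false;    /* undo step 1 */
    return -1;
}
```

The point of the fix is that the error path has to cover every step already taken, not just the first one.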
Sean Christopherson 6baaeda878 KVM: arm64: Free hypervisor allocations if vector slot init fails
Teardown hypervisor mode if vector slot setup fails in order to avoid
leaking any allocations done by init_hyp_mode().

Fixes: b881cdce77 ("KVM: arm64: Allocate hyp vectors statically")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:06 -05:00
Marc Zyngier 466d27e48d KVM: arm64: Simplify the CPUHP logic
For a number of historical reasons, the KVM/arm64 hotplug setup is pretty
complicated, and we have two extra CPUHP notifiers for vGIC and timers.

It looks pretty pointless, and gets in the way of further changes.
So let's just expose some helpers that can be called from the core
CPUHP callback, and get rid of everything else.

This gives us the opportunity to drop a useless notifier entry,
as well as tidy-up the timer enable/disable, which was a bit odd.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221130230934.1014142-17-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:41:04 -05:00
Sean Christopherson 63a1bd8ad1 KVM: Drop arch hardware (un)setup hooks
Drop kvm_arch_hardware_setup() and kvm_arch_hardware_unsetup() now that
all implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
Message-Id: <20221130230934.1014142-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:40:54 -05:00
Paolo Bonzini eb5618911a KVM/arm64 updates for 6.2
Merge tag 'kvmarm-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.2

- Enable the per-vcpu dirty-ring tracking mechanism, together with an
  option to keep the good old dirty log around for pages that are
  dirtied by something other than a vcpu.

- Switch to the relaxed parallel fault handling, using RCU to delay
  page table reclaim and giving better performance under load.

- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
  option, which multi-process VMMs such as crosvm rely on.

- Merge the pKVM shadow vcpu state tracking that allows the hypervisor
  to have its own view of a vcpu, keeping that state private.

- Add support for the PMUv3p5 architecture revision, bringing support
  for 64bit counters on systems that support it, and fix the
  not-quite-compliant CHAIN-ed counter support for the machines that
  actually exist out there.

- Fix a handful of minor issues around 52bit VA/PA support (64kB pages
  only) as a prefix of the oncoming support for 4kB and 16kB pages.

- Add/Enable/Fix a bunch of selftests covering memslots, breakpoints,
  stage-2 faults and access tracking. You name it, we got it, we
  probably broke it.

- Pick a small set of documentation and spelling fixes, because no
  good merge window would be complete without those.

As a side effect, this tag also drags:

- The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring
  series

- A shared branch with the arm64 tree that repaints all the system
  registers to match the ARM ARM's naming, and resulting in
  interesting conflicts
2022-12-09 09:12:12 +01:00
Marc Zyngier 118bc846d4 Merge branch kvm-arm64/pmu-unchained into kvmarm-master/next
* kvm-arm64/pmu-unchained:
  : .
  : PMUv3 fixes and improvements:
  :
  : - Make the CHAIN event handling strictly follow the architecture
  :
  : - Add support for PMUv3p5 (64bit counters all the way)
  :
  : - Various fixes and cleanups
  : .
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run
  KVM: arm64: PMU: Simplify PMCR_EL0 reset handling
  KVM: arm64: PMU: Replace version number '0' with ID_AA64DFR0_EL1_PMUVer_NI
  KVM: arm64: PMU: Make kvm_pmc the main data structure
  KVM: arm64: PMU: Simplify vcpu computation on perf overflow notification
  KVM: arm64: PMU: Allow PMUv3p5 to be exposed to the guest
  KVM: arm64: PMU: Implement PMUv3p5 long counter support
  KVM: arm64: PMU: Allow ID_DFR0_EL1.PerfMon to be set from userspace
  KVM: arm64: PMU: Allow ID_AA64DFR0_EL1.PMUver to be set from userspace
  KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation
  KVM: arm64: PMU: Do not let AArch32 change the counters' top 32 bits
  KVM: arm64: PMU: Simplify setting a counter to a specific value
  KVM: arm64: PMU: Add counter_index_to_*reg() helpers
  KVM: arm64: PMU: Only narrow counters that are not 64bit wide
  KVM: arm64: PMU: Narrow the overflow checking when required
  KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow
  KVM: arm64: PMU: Always advertise the CHAIN event
  KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode
  arm64: Add ID_DFR0_EL1.PerfMon values for PMUv3p7 and IMP_DEF

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:38:44 +00:00
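The pmu-unchained merge above separates the width of a counter from the width at which it overflows. As a rough illustration, assuming the architectural behaviour that a PMUv3p5 event counter is 64 bits wide but only overflows at 64 bits when long-counter overflow is enabled (PMCR_EL0.LP), the check could be sketched as follows. The function name and parameters are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether adding 'increment' to a 64-bit
 * counter value crosses the overflow boundary.
 *
 * overflow_64bit == true:  overflow when the full 64-bit value wraps.
 * overflow_64bit == false: overflow when the low 32 bits wrap, even
 *                          though the counter keeps all 64 bits.
 */
bool sketch_counter_overflowed(uint64_t before, uint32_t increment,
                               bool overflow_64bit)
{
    uint64_t after = before + increment;

    if (overflow_64bit)
        return after < before;

    return (uint32_t)after < (uint32_t)before;
}
```

Because the increment is limited to 32 bits here, a wrap of the low word can be detected by comparing the truncated values before and after the addition.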
Marc Zyngier cfa72993d1 Merge branch kvm-arm64/pkvm-vcpu-state into kvmarm-master/next
* kvm-arm64/pkvm-vcpu-state: (25 commits)
  : .
  : Large drop of pKVM patches from Will Deacon and co, adding
  : a private vm/vcpu state at EL2, managed independently from
  : the EL1 state. From the cover letter:
  :
  : "This is version six of the pKVM EL2 state series, extending the pKVM
  : hypervisor code so that it can dynamically instantiate and manage VM
  : data structures without the host being able to access them directly.
  : These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
  : page-table for the MMU. The pages used to hold the hypervisor structures
  : are returned to the host when the VM is destroyed."
  : .
  KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run()
  KVM: arm64: Don't unnecessarily map host kernel sections at EL2
  KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2
  KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
  KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
  KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache
  KVM: arm64: Instantiate guest stage-2 page-tables at EL2
  KVM: arm64: Consolidate stage-2 initialisation into a single function
  KVM: arm64: Add generic hyp_memcache helpers
  KVM: arm64: Provide I-cache invalidation by virtual address at EL2
  KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
  KVM: arm64: Add per-cpu fixmap infrastructure at EL2
  KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1
  KVM: arm64: Add infrastructure to create and track pKVM instances at EL2
  KVM: arm64: Rename 'host_kvm' to 'host_mmu'
  KVM: arm64: Add hyp_spinlock_t static initializer
  KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h
  KVM: arm64: Add helpers to pin memory shared with the hypervisor at EL2
  KVM: arm64: Prevent the donation of no-map pages
  KVM: arm64: Implement do_donate() helper for donating memory
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:37:23 +00:00
Marc Zyngier a937f37d85 Merge branch kvm-arm64/dirty-ring into kvmarm-master/next
* kvm-arm64/dirty-ring:
  : .
  : Add support for the "per-vcpu dirty-ring tracking with a bitmap
  : and sprinkles on top", courtesy of Gavin Shan.
  :
  : This branch drags the kvmarm-fixes-6.1-3 tag which was already
  : merged in 6.1-rc4 so that the branch is in a working state.
  : .
  KVM: Push dirty information unconditionally to backup bitmap
  KVM: selftests: Automate choosing dirty ring size in dirty_log_test
  KVM: selftests: Clear dirty ring states between two modes in dirty_log_test
  KVM: selftests: Use host page size to map ring buffer in dirty_log_test
  KVM: arm64: Enable ring-based dirty memory tracking
  KVM: Support dirty ring in conjunction with bitmap
  KVM: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h
  KVM: x86: Introduce KVM_REQ_DIRTY_RING_SOFT_FULL

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:19:50 +00:00
Marc Zyngier 3d0dba5764 KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation
As further patches will enable the selection of a PMU revision
from userspace, sample the supported PMU revision at VM creation
time, rather than building each time the ID_AA64DFR0_EL1 register
is accessed.

This shouldn't result in any change in behaviour.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-11-maz@kernel.org
2022-11-19 12:56:39 +00:00
Will Deacon 73f38ef2ae KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
Sharing 'kvm_arm_vmid_bits' between EL1 and EL2 allows the host to
modify the variable arbitrarily, potentially leading to all sorts of
shenanigans as this is used to configure the VTTBR register for the
guest stage-2.

In preparation for unmapping host sections entirely from EL2, maintain
a copy of 'kvm_arm_vmid_bits' in the pKVM hypervisor and initialise it
from the host value while it is still trusted.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-23-will@kernel.org
2022-11-11 17:19:35 +00:00
Quentin Perret fe41a7f8c0 KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
When pKVM is enabled, the hypervisor at EL2 does not trust the host at
EL1 and must therefore prevent it from having unrestricted access to
internal hypervisor state.

The 'kvm_arm_hyp_percpu_base' array holds the offsets for hypervisor
per-cpu allocations, so move this into the nVHE code where it
cannot be modified by the untrusted host at EL1.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-22-will@kernel.org
2022-11-11 17:19:35 +00:00
Quentin Perret 315775ff7c KVM: arm64: Consolidate stage-2 initialisation into a single function
The initialisation of guest stage-2 page-tables is currently split
across two functions: kvm_init_stage2_mmu() and kvm_arm_setup_stage2().
That is presumably for historical reasons as kvm_arm_setup_stage2()
originates from the (now defunct) KVM port for 32-bit Arm.

Simplify this code path by merging both functions into one, taking care
to map the 'struct kvm' into the hypervisor stage-1 early on in order to
simplify the failure path.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-19-will@kernel.org
2022-11-11 17:16:25 +00:00
Will Deacon 13e248aab7 KVM: arm64: Provide I-cache invalidation by virtual address at EL2
In preparation for handling cache maintenance of guest pages from within
the pKVM hypervisor at EL2, introduce an EL2 copy of icache_inval_pou()
which will later be plumbed into the stage-2 page-table cache
maintenance callbacks, ensuring that the initial contents of pages
mapped as executable into the guest stage-2 page-table are visible to the
instruction fetcher.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-17-will@kernel.org
2022-11-11 17:16:25 +00:00
Will Deacon 6c165223e9 KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
The nVHE object at EL2 maintains its own copies of some host variables
so that, when pKVM is enabled, the host cannot directly modify the
hypervisor state. When running in normal nVHE mode, however, these
variables are still mirrored at EL2 but are not initialised.

Initialise the hypervisor symbols from the host copies regardless of
pKVM, ensuring that any reference to this data at EL2 with normal nVHE
will return a sensibly initialised value.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-16-will@kernel.org
2022-11-11 17:16:25 +00:00
Fuad Tabba 9d0c063a4d KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1
With the pKVM hypervisor at EL2 now offering hypercalls to the host for
creating and destroying VM and vCPU structures, plumb these in to the
existing arm64 KVM backend to ensure that the hypervisor data structures
are allocated and initialised on first vCPU run for a pKVM guest.

In the host, 'struct kvm_protected_vm' is introduced to hold the handle
of the pKVM VM instance as well as to track references to the memory
donated to the hypervisor so that it can be freed back to the host
allocator following VM teardown. The stage-2 page-table, hypervisor VM
and vCPU structures are allocated separately so as to avoid the need for
a large physically-contiguous allocation in the host at run-time.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-14-will@kernel.org
2022-11-11 17:16:24 +00:00
Ryan Roberts 579d7ebe90 KVM: arm64: Fix kvm init failure when mode!=vhe and VA_BITS=52.
For nvhe and protected modes, the hyp stage 1 page-tables were previously
configured to have the same number of VA bits as the kernel's idmap.
However, for kernel configs with VA_BITS=52 and where the kernel is
loaded in physical memory below 48 bits, the idmap VA bits is actually
smaller than the kernel's normal stage 1 VA bits. This can lead to
kernel addresses that can't be mapped into the hypervisor, leading to
kvm initialization failure during boot:

  kvm [1]: IPA Size Limit: 48 bits
  kvm [1]: Cannot map world-switch code
  kvm [1]: error initializing Hyp mode: -34

Fix this by ensuring that the hyp stage 1 VA size is the maximum of
what's used for the idmap and the regular kernel stage 1. At the same
time, refactor the code so that the hyp VA bits is only calculated in
one place.

Prior to 7ba8f2b2d6, the idmap was always 52 bits for a 52 VA bits
kernel and therefore the hyp stage1 was also always 52 bits.

Fixes: 7ba8f2b2d6 ("arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
[maz: commit message fixes]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221103150507.32948-2-ryan.roberts@arm.com
2022-11-10 19:22:51 +00:00
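The fix above sizes the hyp stage 1 address space as the maximum of the idmap size and the regular kernel stage 1 size, computed in a single place. A minimal sketch of that rule, with a hypothetical function name and plain integer parameters in place of the kernel's own helpers:

```c
/*
 * Hypothetical helper: pick the hyp stage 1 VA size so that both the
 * idmap and ordinary kernel addresses can be mapped.
 */
unsigned int sketch_hyp_va_bits(unsigned int idmap_bits,
                                unsigned int kernel_va_bits)
{
    return idmap_bits > kernel_va_bits ? idmap_bits : kernel_va_bits;
}
```

For the failing configuration described in the commit message (a 48-bit idmap with a 52-bit kernel VA space), this yields 52 bits rather than 48.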
Gavin Shan 9cb1096f85 KVM: arm64: Enable ring-based dirty memory tracking
Enable ring-based dirty memory tracking on ARM64:

  - Enable CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL.

  - Enable CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP.

  - Set KVM_DIRTY_LOG_PAGE_OFFSET for the ring buffer's physical page
    offset.

  - Add ARM64 specific kvm_arch_allow_write_without_running_vcpu() to
    keep the site of saving vgic/its tables out of the no-running-vcpu
    radar.

Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110104914.31280-5-gshan@redhat.com
2022-11-10 13:11:58 +00:00
Paolo Bonzini d663b8a285 KVM: replace direct irq.h inclusion
virt/kvm/irqchip.c is including "irq.h" from the arch-specific KVM source
directory (i.e. not from arch/*/include) for the sole purpose of retrieving
irqchip_in_kernel.

Making the function inline in a header that is already included,
such as asm/kvm_host.h, is not possible because it needs to look at
struct kvm which is defined after asm/kvm_host.h is included.  So add a
kvm_arch_irqchip_in_kernel non-inline function; irqchip_in_kernel() is
only performance critical on arm64 and x86, and the non-inline function
is enough on all other architectures.

irq.h can then be deleted from all architectures except x86.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-09 12:31:37 -05:00
Paolo Bonzini fe4d9e4abf KVM/arm64 updates for v6.1
Merge tag 'kvmarm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

2022-10-03 15:33:32 -04:00
Paolo Bonzini c99ad25b0d Merge tag 'kvm-x86-6.1-2' of https://github.com/sean-jc/linux into HEAD
KVM x86 updates for 6.1, batch #2:

 - Misc PMU fixes and cleanups.

 - Fixes for Hyper-V hypercall selftest
2022-09-30 07:09:48 -04:00
Paolo Bonzini c59fb12758 KVM: remove KVM_REQ_UNHALT
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return
value of kvm_vcpu_block/kvm_vcpu_halt.  Remove it.

No functional change intended.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20220921003201.1441511-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:37:21 -04:00
Elliot Berman b2a4d007c3 KVM: arm64: Ignore kvm-arm.mode if !is_hyp_mode_available()
Ignore kvm-arm.mode if !is_hyp_mode_available(). Specifically, we want
to avoid switching kvm_mode to KVM_MODE_PROTECTED if hypervisor mode is
not available. This prevents the "Protected KVM" cpu capability from
being reported when Linux boots at EL1 and therefore cannot enable KVM.
It is also reasonable to warn whenever the command line requests a KVM
mode while KVM isn't actually available. Allow "kvm-arm.mode=none" to
skip the warning, since this would disable KVM anyway.
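
The decision order described above can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel's parser: struct mode_cfg, parse_kvm_arm_mode() and the warned field (standing in for a kernel warning) are assumptions, hypervisor availability is passed in as a plain boolean, and only the "none" and "protected" values are handled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum kvm_mode { KVM_MODE_DEFAULT, KVM_MODE_PROTECTED, KVM_MODE_NONE };

struct mode_cfg {
	enum kvm_mode mode;
	bool warned; /* stands in for a kernel warning message */
};

static int parse_kvm_arm_mode(struct mode_cfg *cfg, const char *arg,
			      bool hyp_available)
{
	if (!arg)
		return -EINVAL;

	/* "none" is always honoured: it disables KVM, so no warning. */
	if (strcmp(arg, "none") == 0) {
		cfg->mode = KVM_MODE_NONE;
		return 0;
	}

	/* Any other request is ignored, with a warning, without hyp mode. */
	if (!hyp_available) {
		cfg->warned = true;
		return 0;
	}

	if (strcmp(arg, "protected") == 0) {
		cfg->mode = KVM_MODE_PROTECTED;
		return 0;
	}

	return -EINVAL;
}
```

Checking "none" before the availability test is what lets that value skip the warning.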

Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220920190658.2880184-1-quic_eberman@quicinc.com
2022-09-26 10:49:49 +01:00
Zenghui Yu 522c9a64c7 KVM: arm64: Use kmemleak_free_part_phys() to unregister hyp_mem_base
With commit 0c24e06119 ("mm: kmemleak: add rbtree and store physical
address for objects allocated with PA"), kmemleak started to put the
objects allocated with physical address onto object_phys_tree_root tree.
The kmemleak_free_part() therefore no longer worked as expected on
physically allocated objects (hyp_mem_base in this case) as it attempted to
search and remove things in object_tree_root tree.

Fix it by using kmemleak_free_part_phys() to unregister hyp_mem_base. This
fixes an immediate crash when booting a KVM host in protected mode with
kmemleak enabled.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220908130659.2021-1-yuzenghui@huawei.com
2022-09-19 17:59:48 +01:00
Paolo Bonzini 959d6c4ae2 KVM/arm64 fixes for 6.0, take #1
- Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK
 
 - Tidy-up handling of AArch32 on asymmetric systems

Merge tag 'kvmarm-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

2022-08-19 05:43:53 -04:00
Oliver Upton f3c6efc72f KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systems
KVM does not support AArch32 on asymmetric systems. To that end, enforce
AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system.

Fixes: 2122a83331 ("arm64: Allow mismatched 32-bit EL0 support")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220816192554.1455559-2-oliver.upton@linux.dev
2022-08-17 10:29:07 +01:00
Paolo Bonzini c4edb2babc Merge tag 'kvmarm-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 5.20:

- Unwinder implementations for both nVHE modes (classic and
  protected), complete with an overflow stack

- Rework of the sysreg access from userspace, with a complete
  rewrite of the vgic-v3 view to align with the rest of the
  infrastructure

- Disaggregation of the vcpu flags into separate sets to better track
  their use model.

- A fix for the GICv2-on-v3 selftest

- A small set of cosmetic fixes
2022-08-01 03:24:12 -04:00
Marc Zyngier 0982c8d859 Merge branch kvm-arm64/nvhe-stacktrace into kvmarm-master/next
* kvm-arm64/nvhe-stacktrace: (27 commits)
  : .
  : Add an overflow stack to the nVHE EL2 code, allowing
  : the implementation of an unwinder, courtesy of
  : Kalesh Singh. From the cover letter (slightly edited):
  :
  : "nVHE has two modes of operation: protected (pKVM) and unprotected
  : (conventional nVHE). Depending on the mode, a slightly different approach
  : is used to dump the hypervisor stacktrace but the core unwinding logic
  : remains the same.
  :
  : * Protected nVHE (pKVM) stacktraces:
  :
  : In protected nVHE mode, the host cannot directly access hypervisor memory.
  :
  : The hypervisor stack unwinding happens in EL2 and is made accessible to
  : the host via a shared buffer. Symbolizing and printing the stacktrace
  : addresses is delegated to the host and happens in EL1.
  :
  : * Non-protected (Conventional) nVHE stacktraces:
  :
  : In non-protected mode, the host is able to directly access the hypervisor
  : stack pages.
  :
  : The hypervisor stack unwinding and dumping of the stacktrace is performed
  : by the host in EL1, as this avoids the memory overhead of setting up
  : shared buffers between the host and hypervisor."
  :
  : Additional patches from Oliver Upton and Marc Zyngier, tidying up
  : the initial series.
  : .
  arm64: Update 'unwinder howto'
  KVM: arm64: Don't open code ARRAY_SIZE()
  KVM: arm64: Move nVHE-only helpers into kvm/stacktrace.c
  KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions
  KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit
  KVM: arm64: Move PROTECTED_NVHE_STACKTRACE around
  KVM: arm64: Introduce pkvm_dump_backtrace()
  KVM: arm64: Implement protected nVHE hyp stack unwinder
  KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
  KVM: arm64: Stub implementation of pKVM HYP stack unwinder
  KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
  KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
  KVM: arm64: Introduce hyp_dump_backtrace()
  KVM: arm64: Implement non-protected nVHE hyp stack unwinder
  KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
  KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder
  KVM: arm64: On stack overflow switch to hyp overflow_stack
  arm64: stacktrace: Add description of stacktrace/common.h
  arm64: stacktrace: Factor out common unwind()
  arm64: stacktrace: Handle frame pointer from different address spaces
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-27 18:33:27 +01:00
Kalesh Singh db129d486e KVM: arm64: Implement non-protected nVHE hyp stack unwinder
Implements the common framework necessary for unwind() to work
for non-protected nVHE mode:
    - on_accessible_stack()
    - on_overflow_stack()
    - unwind_next()

Non-protected nVHE unwind() is used to unwind and dump the hypervisor
stacktrace by the host in EL1.
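
To illustrate how these pieces fit together, here is a simplified frame-pointer unwinder in plain C. It is a sketch under stated assumptions rather than the kernel's implementation: the structure layouts, the function signatures and the single list of stack ranges (which would hold both the regular and the overflow stack) are hypothetical, and only the names on_accessible_stack(), unwind_next() and unwind() are borrowed from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* AArch64 frame record: saved frame pointer, then saved link register. */
struct frame_record {
	uint64_t fp;
	uint64_t pc;
};

struct stack_range {
	uintptr_t low;  /* inclusive */
	uintptr_t high; /* exclusive */
};

struct unwind_state {
	uintptr_t fp;
	uint64_t pc;
	const struct stack_range *stacks; /* e.g. regular + overflow stack */
	size_t nr_stacks;
};

static bool on_stack(const struct stack_range *s, uintptr_t addr, size_t size)
{
	return addr >= s->low && addr <= s->high && size <= s->high - addr;
}

/* An address is accessible if it lies entirely within a known stack. */
static bool on_accessible_stack(const struct unwind_state *st,
				uintptr_t addr, size_t size)
{
	for (size_t i = 0; i < st->nr_stacks; i++)
		if (on_stack(&st->stacks[i], addr, size))
			return true;
	return false;
}

/* Step to the caller's frame; fail once the record is off every stack. */
static int unwind_next(struct unwind_state *st)
{
	const struct frame_record *rec;

	if (!on_accessible_stack(st, st->fp, sizeof(*rec)))
		return -1;

	rec = (const struct frame_record *)st->fp;
	st->fp = (uintptr_t)rec->fp;
	st->pc = rec->pc;
	return 0;
}

/* Record the current PC, then keep stepping until unwind_next() fails. */
static size_t unwind(struct unwind_state *st, uint64_t *pcs, size_t max)
{
	size_t n = 0;

	while (n < max) {
		pcs[n++] = st->pc;
		if (unwind_next(st))
			break;
	}
	return n;
}
```

The bound on the number of recorded entries also guards against a corrupted chain of frame records that loops back on itself.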

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-11-kaleshsingh@google.com
2022-07-26 10:49:39 +01:00
Marc Zyngier ae98a4a989 Merge branch kvm-arm64/sysreg-cleanup-5.20 into kvmarm-master/next
* kvm-arm64/sysreg-cleanup-5.20:
  : .
  : Long overdue cleanup of the sysreg userspace access,
  : with extra scrubbing on the vgic side of things.
  : From the cover letter:
  :
  : "Schspa Shi recently reported[1] that some of the vgic code interacting
  : with userspace was reading uninitialised stack memory, and although
  : that read wasn't used any further, it prompted me to revisit this part
  : of the code.
  :
  : Needless to say, this area of the kernel is pretty crufty, and shows a
  : bunch of issues in other parts of the KVM/arm64 infrastructure. This
  : series tries to remedy a bunch of them:
  :
  : - Sanitise the way we deal with sysregs from userspace: at the moment,
  :   each and every .set_user/.get_user callback has to implement its own
  :   userspace accesses (directly or indirectly). It'd be much better if
  :   that was centralised so that we can reason about it.
  :
  : - Enforce that all AArch64 sysregs are 64bit. Always. This was sort of
  :   implied by the code, but it took some effort to convince myself that
  :   this was actually the case.
  :
  : - Move the vgic-v3 sysreg userspace accessors to the userspace
  :   callbacks instead of hijacking the vcpu trap callback. This allows
  :   us to reuse the sysreg infrastructure.
  :
  : - Consolidate userspace accesses for both GICv2, GICv3 and common code
  :   as much as possible.
  :
  : - Cleanup a bunch of not-very-useful helpers, tidy up some of the code
  :   as we touch it.
  :
  : [1] https://lore.kernel.org/r/m2h740zz1i.fsf@gmail.com"
  : .
  KVM: arm64: Get rid or outdated comments
  KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg()
  KVM: arm64: Get rid of find_reg_by_id()
  KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr()
  KVM: arm64: vgic: Consolidate userspace access for base address setting
  KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting
  KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user
  KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers
  KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers
  KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace
  KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP
  KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API
  KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess()
  KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr()
  KVM: arm64: Get rid of reg_from/to_user()
  KVM: arm64: Consolidate sysreg userspace accesses
  KVM: arm64: Rely on index_to_param() for size checks on userspace access
  KVM: arm64: Introduce generic get_user/set_user helpers for system registers
  KVM: arm64: Reorder handling of invariant sysregs from userspace
  KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:58 +01:00
Marc Zyngier 9f968c9266 KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting
We carry a legacy interface to set the base addresses for GICv2.
As this is currently plumbed into the same handling code as
the modern interface, it limits the evolution we can make there.

Add a helper dedicated to this handling, with a view to maybe
removing this in the future.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier dc94f89ae6 Merge branch kvm-arm64/burn-the-flags into kvmarm-master/next
* kvm-arm64/burn-the-flags:
  : .
  : Rework the per-vcpu flags to make them more manageable,
  : splitting them in different sets that have specific
  : uses:
  :
  : - configuration flags
  : - input to the world-switch
  : - state bookkeeping for the kernel itself
  :
  : The FP tracking is also simplified and tracked outside
  : of the flags as a separate state.
  : .
  KVM: arm64: Move the handling of !FP outside of the fast path
  KVM: arm64: Document why pause cannot be turned into a flag
  KVM: arm64: Reduce the size of the vcpu flag members
  KVM: arm64: Add build-time sanity checks for flags
  KVM: arm64: Warn when PENDING_EXCEPTION and INCREMENT_PC are set together
  KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag
  KVM: arm64: Kill unused vcpu flags field
  KVM: arm64: Move vcpu WFIT flag to the state flag set
  KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag set
  KVM: arm64: Move vcpu SVE/SME flags to the state flag set
  KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set
  KVM: arm64: Move vcpu PC/Exception flags to the input flag set
  KVM: arm64: Move vcpu configuration flags into their own set
  KVM: arm64: Add three sets of flags to the vcpu state
  KVM: arm64: Add helpers to manipulate vcpu flags among a set
  KVM: arm64: Move FP state ownership from flag to a tristate
  KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor code
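
The idea of splitting the flags into separate sets can be sketched as below. This is an illustrative model only: the type names, helper names and the three example flags are hypothetical and do not reproduce the kernel's accessors. Each flag carries the set it belongs to, so a helper cannot accidentally touch the wrong set.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three sets named in the description above. */
enum flag_set { CONFIG_FLAGS, INPUT_FLAGS, STATE_FLAGS, NR_FLAG_SETS };

struct vcpu_flags {
	uint8_t sets[NR_FLAG_SETS];
};

/* A flag is a (set, mask) pair rather than a bare bit number. */
struct vcpu_flag {
	enum flag_set set;
	uint8_t mask;
};

/* Hypothetical example flags, one per set. */
static const struct vcpu_flag FLAG_HAS_SVE     = { CONFIG_FLAGS, 1u << 0 };
static const struct vcpu_flag FLAG_PENDING_EXC = { INPUT_FLAGS,  1u << 0 };
static const struct vcpu_flag FLAG_IN_WFIT     = { STATE_FLAGS,  1u << 0 };

static void vcpu_flag_set(struct vcpu_flags *f, struct vcpu_flag flag)
{
	f->sets[flag.set] |= flag.mask;
}

static void vcpu_flag_clear(struct vcpu_flags *f, struct vcpu_flag flag)
{
	f->sets[flag.set] &= (uint8_t)~flag.mask;
}

static bool vcpu_flag_test(const struct vcpu_flags *f, struct vcpu_flag flag)
{
	return (f->sets[flag.set] & flag.mask) != 0;
}
```

Because the same bit position can be reused in every set, each set can stay small, and a reader can tell from a flag's definition whether it is configuration, world-switch input, or kernel-side bookkeeping.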

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29 10:30:10 +01:00
Marc Zyngier b4da91879e KVM: arm64: Move the handling of !FP outside of the fast path
We currently start by assuming that the host owns the FP unit
at load time, then check again whether this is the case as
we are about to run. Only at this point do we account for the
fact that there is a (vanishingly small) chance that we're running
on a system without an FPSIMD unit (yes, this is madness).

We can actually move this FPSIMD check as early as load-time,
and drop the check at run time.

No intended change in behaviour.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29 10:23:56 +01:00
Marc Zyngier eebc538d8e KVM: arm64: Move vcpu WFIT flag to the state flag set
The host kernel uses the WFIT flag to remember that a vcpu has used
this instruction and wake it up as required. Move it to the state
set, as nothing in the hypervisor uses this information.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29 10:23:23 +01:00
Quentin Perret 56961c6331 KVM: arm64: Prevent kmemleak from accessing pKVM memory
Commit a7259df767 ("memblock: make memblock_find_in_range method
private") changed the API through which memory is reserved for the pKVM
hypervisor. However, memblock_phys_alloc() differs from the original API in
terms of kmemleak semantics -- the old one didn't report the reserved
regions to kmemleak while the new one does. Unfortunately, when protected
KVM is enabled, all kernel accesses to pKVM-private memory result in a
fatal exception, which can now happen because of kmemleak scans:

$ echo scan > /sys/kernel/debug/kmemleak
[   34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290!
[   34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000
[   34.991813] Kernel panic - not syncing: HYP panic:
[   34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
[   34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
[   34.991813] VCPU:0000000000000000
[   34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102
[   34.994059] Hardware name: linux,dummy-virt (DT)
[   34.994452] Call trace:
[   34.994641]  dump_backtrace.part.0+0xcc/0xe0
[   34.994932]  show_stack+0x18/0x6c
[   34.995094]  dump_stack_lvl+0x68/0x84
[   34.995276]  dump_stack+0x18/0x34
[   34.995484]  panic+0x16c/0x354
[   34.995673]  __hyp_pgtable_total_pages+0x0/0x60
[   34.995933]  scan_block+0x74/0x12c
[   34.996129]  scan_gray_list+0xd8/0x19c
[   34.996332]  kmemleak_scan+0x2c8/0x580
[   34.996535]  kmemleak_write+0x340/0x4a0
[   34.996744]  full_proxy_write+0x60/0xbc
[   34.996967]  vfs_write+0xc4/0x2b0
[   34.997136]  ksys_write+0x68/0xf4
[   34.997311]  __arm64_sys_write+0x20/0x2c
[   34.997532]  invoke_syscall+0x48/0x114
[   34.997779]  el0_svc_common.constprop.0+0x44/0xec
[   34.998029]  do_el0_svc+0x2c/0xc0
[   34.998205]  el0_svc+0x2c/0x84
[   34.998421]  el0t_64_sync_handler+0xf4/0x100
[   34.998653]  el0t_64_sync+0x18c/0x190
[   34.999252] SMP: stopping secondary CPUs
[   35.000034] Kernel Offset: disabled
[   35.000261] CPU features: 0x800,00007831,00001086
[   35.000642] Memory Limit: none
[   35.001329] ---[ end Kernel panic - not syncing: HYP panic:
[   35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800
[   35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000
[   35.001329] VCPU:0000000000000000 ]---

Fix this by explicitly excluding the hypervisor's memory pool from
kmemleak like we already do for the hyp BSS.

Cc: Mike Rapoport <rppt@kernel.org>
Fixes: a7259df767 ("memblock: make memblock_find_in_range method private")
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220616161135.3997786-1-qperret@google.com
2022-06-17 09:48:38 +01:00
Marc Zyngier 699bb2e0c6 KVM: arm64: Move vcpu PC/Exception flags to the input flag set
The PC update flags (which also deal with exception injection)
are one of the most complicated uses of the flags we have. Make them
more foolproof by:

- moving them over to the new accessors and assigning them to the
  input flag set

- turning the combination of generic ELx flags with another flag
  indicating the target EL itself into an explicit set of
  flags for each EL and vector combination

- adding a new accessor to pend the exception

This is otherwise a pretty straightforward conversion.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-10 09:54:34 +01:00
Will Deacon cde5042adf KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHE
Ignore 'kvm-arm.mode=protected' when using VHE so that kvm_get_mode()
only returns KVM_MODE_PROTECTED on systems where the feature is available.

Cc: David Brazdil <dbrazdil@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-4-will@kernel.org
2022-06-09 13:24:02 +01:00
Will Deacon ae187fec75 KVM: arm64: Return error from kvm_arch_init_vm() on allocation failure
If we fail to allocate the 'supported_cpus' cpumask in kvm_arch_init_vm()
then be sure to return -ENOMEM instead of success (0) on the failure
path.
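
The class of bug being fixed can be shown with a small userspace sketch. Everything here is hypothetical (struct vm, alloc_cpumask() and init_vm() are illustrative names, and calloc() stands in for the kernel allocator); the point is only that the error code must be set before jumping to the common exit path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct vm {
	unsigned long *supported_cpus;
};

/* Hypothetical allocator wrapper so that a failure can be simulated. */
static unsigned long *alloc_cpumask(bool fail)
{
	return fail ? NULL : calloc(1, sizeof(unsigned long));
}

static int init_vm(struct vm *vm, bool simulate_alloc_failure)
{
	int ret = 0;

	vm->supported_cpus = alloc_cpumask(simulate_alloc_failure);
	if (!vm->supported_cpus) {
		/* Without this assignment the function would return 0. */
		ret = -ENOMEM;
		goto out;
	}

	/* ... further initialisation would go here ... */
out:
	return ret;
}
```

A caller that sees 0 would go on to use a VM whose mask was never allocated, which is why the failure path has to report -ENOMEM.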

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-2-will@kernel.org
2022-06-09 13:24:02 +01:00
Linus Torvalds bf9095424d S390:
* ultravisor communication device driver
 
 * fix TEID on terminating storage key ops
 
 RISC-V:
 
 * Added Sv57x4 support for G-stage page table
 
 * Added range based local HFENCE functions
 
 * Added remote HFENCE functions based on VCPU requests
 
 * Added ISA extension registers in ONE_REG interface
 
 * Updated KVM RISC-V maintainers entry to cover selftests support
 
 ARM:
 
 * Add support for the ARMv8.6 WFxT extension
 
 * Guard pages for the EL2 stacks
 
 * Trap and emulate AArch32 ID registers to hide unsupported features
 
 * Ability to select and save/restore the set of hypercalls exposed
   to the guest
 
 * Support for PSCI-initiated suspend in collaboration with userspace
 
 * GICv3 register-based LPI invalidation support
 
 * Move host PMU event merging into the vcpu data structure
 
 * GICv3 ITS save/restore fixes
 
 * The usual set of small-scale cleanups and fixes
 
 x86:
 
 * New ioctls to get/set TSC frequency for a whole VM
 
 * Allow userspace to opt out of hypercall patching
 
 * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr
 
 AMD SEV improvements:
 
 * Add KVM_EXIT_SHUTDOWN metadata for SEV-ES
 
 * V_TSC_AUX support
 
 Nested virtualization improvements for AMD:
 
 * Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE,
   nested vGIF)
 
 * Allow AVIC to co-exist with a nested guest running
 
 * Fixes for LBR virtualizations when a nested guest is running,
   and nested LBR virtualization support
 
 * PAUSE filtering for nested hypervisors
 
 Guest support:
 
 * Decoupling of vcpu_is_preempted from PV spinlocks

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "S390:

   - ultravisor communication device driver

   - fix TEID on terminating storage key ops

  RISC-V:

   - Added Sv57x4 support for G-stage page table

   - Added range based local HFENCE functions

   - Added remote HFENCE functions based on VCPU requests

   - Added ISA extension registers in ONE_REG interface

   - Updated KVM RISC-V maintainers entry to cover selftests support

  ARM:

   - Add support for the ARMv8.6 WFxT extension

   - Guard pages for the EL2 stacks

   - Trap and emulate AArch32 ID registers to hide unsupported features

   - Ability to select and save/restore the set of hypercalls exposed to
     the guest

   - Support for PSCI-initiated suspend in collaboration with userspace

   - GICv3 register-based LPI invalidation support

   - Move host PMU event merging into the vcpu data structure

   - GICv3 ITS save/restore fixes

   - The usual set of small-scale cleanups and fixes

  x86:

   - New ioctls to get/set TSC frequency for a whole VM

   - Allow userspace to opt out of hypercall patching

   - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr

  AMD SEV improvements:

   - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

   - V_TSC_AUX support

  Nested virtualization improvements for AMD:

   - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE,
     nested vGIF)

   - Allow AVIC to co-exist with a nested guest running

   - Fixes for LBR virtualizations when a nested guest is running, and
     nested LBR virtualization support

   - PAUSE filtering for nested hypervisors

  Guest support:

   - Decoupling of vcpu_is_preempted from PV spinlocks"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits)
  KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest
  KVM: selftests: x86: Sync the new name of the test case to .gitignore
  Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND
  x86, kvm: use correct GFP flags for preemption disabled
  KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer
  x86/kvm: Alloc dummy async #PF token outside of raw spinlock
  KVM: x86: avoid calling x86 emulator without a decoded instruction
  KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak
  x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave)
  s390/uv_uapi: depend on CONFIG_S390
  KVM: selftests: x86: Fix test failure on arch lbr capable platforms
  KVM: LAPIC: Trace LAPIC timer expiration on every vmentry
  KVM: s390: selftest: Test suppression indication on key prot exception
  KVM: s390: Don't indicate suppression on dirtying, failing memop
  selftests: drivers/s390x: Add uvdevice tests
  drivers/s390/char: Add Ultravisor io device
  MAINTAINERS: Update KVM RISC-V entry to cover selftests support
  RISC-V: KVM: Introduce ISA extension register
  RISC-V: KVM: Cleanup stale TLB entries when host CPU changes
  RISC-V: KVM: Add remote HFENCE functions based on VCPU requests
  ...
2022-05-26 14:20:14 -07:00