Commit Graph

1043 Commits

Author SHA1 Message Date
Marc Zyngier 4ff3fc316d KVM: arm64: Move AArch32 exceptions over to AArch64 sysregs
The use of the AArch32-specific accessors have always been a bit
annoying on 64bit, and it is time for a change.

Let's move the AArch32 exception injection over to the AArch64 encoding,
which requires us to split the two halves of FAR_EL1 into DFAR and IFAR.
This enables us to drop the preempt_disable() games on VHE, and to kill
the last user of the vcpu_cp15() macro.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:51 +00:00
Marc Zyngier ca4e514774 KVM: arm64: Introduce handling of AArch32 TTBCR2 traps
ARMv8.2 introduced TTBCR2, which shares TCR_EL1 with TTBCR.
Gracefully handle traps to this register when HCR_EL2.TVM is set.

Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:50 +00:00
Marc Zyngier 90c1f934ed KVM: arm64: Get rid of the AArch32 register mapping code
The only use of the register mapping code was for the sake of the LR
mapping, which we trivially solved in a previous patch. Get rid of
the whole thing now.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:27 +00:00
Marc Zyngier dcfba39932 KVM: arm64: Consolidate exception injection
Move the AArch32 exception injection code back into the inject_fault.c
file, removing the need for a few non-static functions now that AArch32
host support is a thing of the past.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:26 +00:00
Marc Zyngier 7d76b8a603 KVM: arm64: Remove SPSR manipulation primitives
The SPSR setting code is now completely unused, including that dealing
with banked AArch32 SPSRs. Cleanup time.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:26 +00:00
Marc Zyngier 41613b519c KVM: arm64: Inject AArch32 exceptions from HYP
Similarly to what has been done for AArch64, move the AArch32 exception
injection to HYP.

In order to not use the regmap selection code at EL2, simplify the code
populating the target mode's LR register by useing the compatibility
aliases for LR_abt and LR_und.

We also introduce new accessors for SPSR_abt and SPSR_und, and
move VBAR/SCTLR to using the AArch64 accessors (the use of the AArch32
names was an ARMv7 leftover).

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:26 +00:00
Marc Zyngier bb666c472c KVM: arm64: Inject AArch64 exceptions from HYP
Move the AArch64 exception injection code from EL1 to HYP, leaving
only the ESR_EL1 updates to EL1. In order to come with the differences
between VHE and nVHE, two set of system register accessors are provided.

SPSR, ELR, PC and PSTATE are now completely handled in the hypervisor.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:26 +00:00
Marc Zyngier e650b64f1a KVM: arm64: Add basic hooks for injecting exceptions from EL2
Add the basic infrastructure to describe injection of exceptions
into a guest. So far, nothing uses this code path.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:25 +00:00
Marc Zyngier 21c810017c KVM: arm64: Move VHE direct sysreg accessors into kvm_host.h
As we are about to need to access system registers from the HYP
code based on their internal encoding, move the direct sysreg
accessors to a common include file, with a VHE-specific guard.

No functionnal change.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:25 +00:00
Marc Zyngier defe21f49b KVM: arm64: Move PC rollback on SError to HYP
Instead of handling the "PC rollback on SError during HVC" at EL1 (which
requires disclosing PC to a potentially untrusted kernel), let's move
this fixup to ... fixup_guest_exit(), which is where we do all fixups.

Isn't that neat?

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:25 +00:00
Marc Zyngier cdb5e02ed1 KVM: arm64: Make kvm_skip_instr() and co private to HYP
In an effort to remove the vcpu PC manipulations from EL1 on nVHE
systems, move kvm_skip_instr() to be HYP-specific. EL1's intent
to increment PC post emulation is now signalled via a flag in the
vcpu structure.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:24 +00:00
Marc Zyngier 6ddbc281e2 KVM: arm64: Move kvm_vcpu_trap_il_is32bit into kvm_skip_instr32()
There is no need to feed the result of kvm_vcpu_trap_il_is32bit()
to kvm_skip_instr(), as only AArch32 has a variable length ISA, and
this helper can equally be called from kvm_skip_instr32(), reducing
the complexity at all the call sites.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:24 +00:00
Marc Zyngier c22588c996 KVM: arm64: Don't adjust PC on SError during SMC trap
On SMC trap, the prefered return address is set to that of the SMC
instruction itself. It is thus wrong to try and roll it back when
an SError occurs while trapping on SMC. It is still necessary on
HVC though, as HVC doesn't cause a trap, and sets ELR to returning
*after* the HVC.

It also became apparent that there is no 16bit encoding for an AArch32
HVC instruction, meaning that the displacement is always 4 bytes,
no matter what the ISA is. Take this opportunity to simplify it.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:23 +00:00
Linus Torvalds 407ab57963 ARM:
- Fix compilation error when PMD and PUD are folded
 - Fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1
 - Add aarch64 get-reg-list test
 
 x86:
 - fix semantic conflict between two series merged for 5.10
 - fix (and test) enforcement of paravirtual cpuid features
 
 Generic:
 - various cleanups to memory management selftests
 - new selftests testcase for performance of dirty logging
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+pVjkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO3fAf/ZniW/7FC4pD/M0txXUst3mKNcC16
 AbMfN36dvzdWBnAuTVsP2d+XM/sbPNacomcJGfJ5II9TKrb00FUNxU37In7vdbbm
 WjpyDEpRDXnCY+OXs7dwY66dEXzv9GTzlQaGuah67AeGpzSuu3zrXlu07di446Gv
 ZtHvbzFEvos7cByp3LoPfvbnvv9kkD5mQkOW7wG42hUPrxMNxtHC+qyP92DIpV8d
 etDNC95rhdhhZM3LAlvO6Bp4I1uFXpYHEHtIOOT05IB9clNhfdgsuD8wiqWfEo0l
 sVhg3yXWbbfGaP3vEZp5QY9qko8I0XjwIWc5hWsIHST7uPqgi8a/wIbbEA==
 =jBcA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - fix compilation error when PMD and PUD are folded
   - fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1
   - add aarch64 get-reg-list test

  x86:
   - fix semantic conflict between two series merged for 5.10
   - fix (and test) enforcement of paravirtual cpuid features

  selftests:
   - various cleanups to memory management selftests
   - new selftests testcase for performance of dirty logging"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
  KVM: selftests: allow two iterations of dirty_log_perf_test
  KVM: selftests: Introduce the dirty log perf test
  KVM: selftests: Make the number of vcpus global
  KVM: selftests: Make the per vcpu memory size global
  KVM: selftests: Drop pointless vm_create wrapper
  KVM: selftests: Add wrfract to common guest code
  KVM: selftests: Simplify demand_paging_test with timespec_diff_now
  KVM: selftests: Remove address rounding in guest code
  KVM: selftests: Factor code out of demand_paging_test
  KVM: selftests: Use a single binary for dirty/clear log test
  KVM: selftests: Always clear dirty bitmap after iteration
  KVM: selftests: Add blessed SVE registers to get-reg-list
  KVM: selftests: Add aarch64 get-reg-list test
  selftests: kvm: test enforcement of paravirtual cpuid features
  selftests: kvm: Add exception handling to selftests
  selftests: kvm: Clear uc so UCALL_NONE is being properly reported
  selftests: kvm: Fix the segment descriptor layout to match the actual layout
  KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrs
  kvm: x86: request masterclock update any time guest uses different msr
  kvm: x86: ensure pv_cpuid.features is initialized when enabling cap
  ...
2020-11-09 13:58:10 -08:00
Marc Zyngier 7cd0aaafaa KVM: arm64: Turn host HVC handling into a dispatch table
Now that we can use function pointer, use a dispatch table to call
the individual HVC handlers, leading to more maintainable code.

Further improvements include helpers to declare the mapping of
local variables to values passed in the host context.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-09 17:08:15 +00:00
Marc Zyngier 1db9d9ded7 KVM: arm64: Add kimg_hyp_va() helper
KVM/arm64 is so far unable to deal with function pointers, as the compiler
will generate the kernel's runtime VA, and not the linear mapping address,
meaning that kern_hyp_va() will give the wrong result.

We so far have been able to use PC-relative addressing, but that's not
always easy to use, and prevents the implementation of things like
the mapping of an index to a pointer.

To allow this, provide a new helper that computes the required
translation from the kernel image to the HYP VA space.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-09 16:56:39 +00:00
Paolo Bonzini ff2bb93f53 KVM/arm64 fixes for v5.10, take #2
- Fix compilation error when PMD and PUD are folded
 - Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+lep8PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDwNEP/3KJfcJtA//6JlQkRXRkYjXVZ2/2Crr0IHdu
 TqzQZ7Mg8w281HuZjrpvYwzHlXbQ89RJT+G/avG5EmfQcmJU5eQna9T1w1Vq2d2q
 3K35HdQskfFJYJ5MMvxSZ1WsE+EMWOXJGwL3jss/ThS+qzD+Ag7Fdg3Eg6kTv0Ic
 eMFtnBzI7UddNwZcrPM43dZTh9JEls9mySF6kjsIleUm3Xnk+6NKP6nDnJMukBOF
 b+9DaGx9cdXI7bqm3elvaWIeSpJQIBhLvYQqyD0OyF2qAqWrGHEULx2qZbMHRG2x
 lhQDcMyKjtv0hzKxmotVjhDaz/Af+yDJ57IRHLfEq/v5ytIqg76vxDwIjyLUHXyk
 3lPoycCDtcgRKlkoz1jQ35oDCo0LUG2sUgWIn6D3Pim1aZfppnlDjCuuOMGyf5Db
 RS0jIBm5u1YDchnL43HsQRBiVUD1In+QHtR/ZECMMUqjtnCojSZp30BGlqoVSlb8
 aSpzecBaA+C1lRFqLTHCldONloE/85vpsYEIfxB1SqVwrpDkrXaEwZPU6im5a8om
 Q9aJ+TqIUGHLLWsI4SGNrS/kjSpt/GAP4Kkfg+wBqPUgYL+lDTSekFWY6DbekwDP
 +CdopqFCSa1Jfby/6rTYzGK1152NH03O63Ky1Z3uGKpCNK9lAptdUIYMkYCj69C2
 m8zS3zyz
 =LmMm
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for v5.10, take #2

- Fix compilation error when PMD and PUD are folded
- Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
2020-11-08 04:15:53 -05:00
Andrew Jones c512298eed KVM: arm64: Remove AA64ZFR0_EL1 accessors
The AA64ZFR0_EL1 accessors are just the general accessors with
its visibility function open-coded. It also skips the if-else
chain in read_id_reg, but there's no reason not to go there.
Indeed consolidating ID register accessors and removing lines
of code make it worthwhile.

Remove the AA64ZFR0_EL1 accessors, replacing them with the
general accessors for sanitized ID registers.

No functional change intended.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-5-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Andrew Jones 912dee5726 KVM: arm64: Check RAZ visibility in ID register accessors
The instruction encodings of ID registers are preallocated. Until an
encoding is assigned a purpose the register is RAZ. KVM's general ID
register accessor functions already support both paths, RAZ or not.
If for each ID register we can determine if it's RAZ or not, then all
ID registers can build on the general functions. The register visibility
function allows us to check whether a register should be completely
hidden or not, extending it to also report when the register should
be RAZ or not allows us to use it for ID registers as well.

Check for RAZ visibility in the ID register accessor functions,
allowing the RAZ case to be handled in a generic way for all system
registers.

The new REG_RAZ flag will be used in a later patch. This patch has
no intended functional change.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-4-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Andrew Jones 01fe5ace92 KVM: arm64: Consolidate REG_HIDDEN_GUEST/USER
REG_HIDDEN_GUEST and REG_HIDDEN_USER are always used together.
Consolidate them into a single REG_HIDDEN flag. We can always
add another flag later if some register needs to expose itself
differently to the guest than it does to userspace.

No functional change intended.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-3-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Andrew Jones f81cb2c3ad KVM: arm64: Don't hide ID registers from userspace
ID registers are RAZ until they've been allocated a purpose, but
that doesn't mean they should be removed from the KVM_GET_REG_LIST
list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that
is hidden from userspace when its function, SVE, is not present.

Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not
implemented. Removing the userspace visibility checks is enough
to reexpose it, as it will already return zero to userspace when
SVE is not present. The register already behaves as RAZ for the
guest when SVE is not present.

Fixes: 73433762fc ("KVM: arm64/sve: System register context switch and access support")
Reported-by: 张东旭 <xu910121@sina.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org#v5.2+
Link: https://lore.kernel.org/r/20201105091022.15373-2-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Gavin Shan faf000397e KVM: arm64: Fix build error in user_mem_abort()
The PUD and PMD are folded into PGD when the following options are
enabled. In that case, PUD_SHIFT is equal to PMD_SHIFT and we fail
to build with the indicated errors:

   CONFIG_ARM64_VA_BITS_42=y
   CONFIG_ARM64_PAGE_SHIFT=16
   CONFIG_PGTABLE_LEVELS=3

   arch/arm64/kvm/mmu.c: In function ‘user_mem_abort’:
   arch/arm64/kvm/mmu.c:798:2: error: duplicate case value
     case PMD_SHIFT:
     ^~~~
   arch/arm64/kvm/mmu.c:791:2: note: previously used here
     case PUD_SHIFT:
     ^~~~

This fixes the issue by skipping the check on PUD huge page when PUD
and PMD are folded into PGD.

Fixes: 2f40c46021 ("KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201103003009.32955-1-gshan@redhat.com
2020-11-06 16:00:29 +00:00
Linus Torvalds 2d38c80d5b ARM:
* selftest fix
 * Force PTE mapping on device pages provided via VFIO
 * Fix detection of cacheable mapping at S2
 * Fallback to PMD/PTE mappings for composite huge pages
 * Fix accounting of Stage-2 PGD allocation
 * Fix AArch32 handling of some of the debug registers
 * Simplify host HYP entry
 * Fix stray pointer conversion on nVHE TLB invalidation
 * Fix initialization of the nVHE code
 * Simplify handling of capabilities exposed to HYP
 * Nuke VCPUs caught using a forbidden AArch32 EL0
 
 x86:
 * new nested virtualization selftest
 * Miscellaneous fixes
 * make W=1 fixes
 * Reserve new CPUID bit in the KVM leaves
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+dhRAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPWCgf/U997UW/11IdNtkehQO/DFdx7lHev
 +IahN1Pnbt92ZoR5nGhK9pgvDahIVhqTmUvgV+3fD24OnqXTpYTu1fliBvL6ynbN
 J9Ycf0zFAgwfgTTD5UexTlEovnhX4xz7NDmd6rpxGDZdMaBHQFPkCXBFK45pf4nd
 O349aHV0X1AA7Tt/sLhpXpi74Vake1xErLHKhIVLHKyo/zDm+Q0UZry068NNBzTr
 St3+QSGlFXhuekVrZLh+DShh6rZGLyY9tcySt6o0Jk7fSs1lmEnPbBgeeqYmyHMd
 Yn+ybhthmNkkpI8so70TA9roiVar4UmjnMBOiav62bo7ue26pKE5cWQyXw==
 =mvBr
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - selftest fix
   - force PTE mapping on device pages provided via VFIO
   - fix detection of cacheable mapping at S2
   - fallback to PMD/PTE mappings for composite huge pages
   - fix accounting of Stage-2 PGD allocation
   - fix AArch32 handling of some of the debug registers
   - simplify host HYP entry
   - fix stray pointer conversion on nVHE TLB invalidation
   - fix initialization of the nVHE code
   - simplify handling of capabilities exposed to HYP
   - nuke VCPUs caught using a forbidden AArch32 EL0

  x86:
   - new nested virtualization selftest
   - miscellaneous fixes
   - make W=1 fixes
   - reserve new CPUID bit in the KVM leaves"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: vmx: remove unused variable
  KVM: selftests: Don't require THP to run tests
  KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
  KVM: selftests: test behavior of unmapped L2 APIC-access address
  KVM: x86: Fix NULL dereference at kvm_msr_ignored_check()
  KVM: x86: replace static const variables with macros
  KVM: arm64: Handle Asymmetric AArch32 systems
  arm64: cpufeature: upgrade hyp caps to final
  arm64: cpufeature: reorder cpus_have_{const, final}_cap()
  KVM: arm64: Factor out is_{vhe,nvhe}_hyp_code()
  KVM: arm64: Force PTE mapping on fault resulting in a device mapping
  KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
  KVM: arm64: Fix masks in stage2_pte_cacheable()
  KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
  KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
  KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition
  KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation
  KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call
  x86/kvm: Reserve KVM_FEATURE_MSI_EXT_DEST_ID
2020-11-01 09:43:32 -08:00
Paolo Bonzini 699116c45e KVM/arm64 fixes for 5.10, take #1
- Force PTE mapping on device pages provided via VFIO
 - Fix detection of cacheable mapping at S2
 - Fallback to PMD/PTE mappings for composite huge pages
 - Fix accounting of Stage-2 PGD allocation
 - Fix AArch32 handling of some of the debug registers
 - Simplify host HYP entry
 - Fix stray pointer conversion on nVHE TLB invalidation
 - Fix initialization of the nVHE code
 - Simplify handling of capabilities exposed to HYP
 - Nuke VCPUs caught using a forbidden AArch32 EL0
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+cO3oPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDJdoP/jiKYR8iVkq/RmIsQl383KwQiJGTMi0iL2Zw
 /tHnf8bKowAPyG8bqyXMJqlWOb7tcp6U3m+WhENAZHWH02r2M921q0DGVW5p48ou
 Ek4zJnFF1iL5ryOBgROKK1nymUZOi3W1a1SsD6ZPImQsKsjNGbqKgWsGs8i9ft0P
 vkNZwlqebzJp+OR3agJemc8dkXcGlcRHk7fffdMcU8jsF5RJ9zC0XU0+scKryxhV
 o8PzKSlwCeisyL+Vz+s7POzoD3Rt+P+qjblz5NWqy/NHuLh+V9hzUSDOjWbZb70f
 Er29vGv7Yjb4nKK2KUzNqirSfXsRylfsjGr+YibP6uKEUMuUm/V41DqzT7nMalIm
 cOBGtPk6W9wOL8JNDmlyVGCfATI+5RrErQ8nFClrPu3qw4Hv4pb1Ad5OgAhNE0u1
 PfUyBBtQKNAjTdVCRfSuFL4d2yegy1rrpCmYWrvdQjLlXemwgYgKnSQN98cZHgjA
 foCAP5gJpAWGualyhKJx2CkY/5deeWKS39ISiNgHo5eRvKsGEnMN7j9UX77VbhRr
 PkwCmeUJ3kjzaAfmtcBN/iLjwQbWypidjX2Vbfl5WoVdLuiYXFZIvsdaqRHGl56F
 5zhYxM8DKODNEJKMl7a89oEFGKy8x1PQ0kqer9a6GBWkNDrQMOSL4+FkxCyM2m9g
 RoHtmdy0
 =gVaX
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.10, take #1

- Force PTE mapping on device pages provided via VFIO
- Fix detection of cacheable mapping at S2
- Fallback to PMD/PTE mappings for composite huge pages
- Fix accounting of Stage-2 PGD allocation
- Fix AArch32 handling of some of the debug registers
- Simplify host HYP entry
- Fix stray pointer conversion on nVHE TLB invalidation
- Fix initialization of the nVHE code
- Simplify handling of capabilities exposed to HYP
- Nuke VCPUs caught using a forbidden AArch32 EL0
2020-10-30 13:25:09 -04:00
Qais Yousef 22f553842b KVM: arm64: Handle Asymmetric AArch32 systems
On a system without uniform support for AArch32 at EL0, it is possible
for the guest to force run AArch32 at EL0 and potentially cause an
illegal exception if running on a core without AArch32. Add an extra
check so that if we catch the guest doing that, then we prevent it from
running again by resetting vcpu->arch.target and return
ARM_EXCEPTION_IL.

We try to catch this misbehaviour as early as possible and not rely on
an illegal exception occuring to signal the problem. Attempting to run a
32bit app in the guest will produce an error from QEMU if the guest
exits while running in AArch32 EL0.

Tested on Juno by instrumenting the host to fake asym aarch32 and
instrumenting KVM to make the asymmetry visible to the guest.

[will: Incorporated feedback from Marc]

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201021104611.2744565-2-qais.yousef@arm.com
Link: https://lore.kernel.org/r/20201027215118.27003-2-will@kernel.org
2020-10-30 16:06:22 +00:00
Santosh Shukla 91a2c34b7d KVM: arm64: Force PTE mapping on fault resulting in a device mapping
VFIO allows a device driver to resolve a fault by mapping a MMIO
range. This can be subsequently result in user_mem_abort() to
try and compute a huge mapping based on the MMIO pfn, which is
a sure recipe for things to go wrong.

Instead, force a PTE mapping when the pfn faulted in has a device
mapping.

Fixes: 6d674e28f6 ("KVM: arm/arm64: Properly handle faulting of device mappings")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Santosh Shukla <sashukla@nvidia.com>
[maz: rewritten commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1603711447-11998-2-git-send-email-sashukla@nvidia.com
2020-10-29 20:41:04 +00:00
Gavin Shan 2f40c46021 KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
Although huge pages can be created out of multiple contiguous PMDs
or PTEs, the corresponding sizes are not supported at Stage-2 yet.

Instead of failing the mapping, fall back to the nearer supported
mapping size (CONT_PMD to PMD and CONT_PTE to PTE respectively).

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
[maz: rewritten commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201025230626.18501-1-gshan@redhat.com
2020-10-29 20:39:46 +00:00
Will Deacon e2fc6a9f68 KVM: arm64: Fix masks in stage2_pte_cacheable()
stage2_pte_cacheable() tries to figure out whether the mapping installed
in its 'pte' parameter is cacheable or not. Unfortunately, it fails
miserably because it extracts the memory attributes from the entry using
FIELD_GET(), which returns the attributes shifted down to bit 0, but then
compares this with the unshifted value generated by the PAGE_S2_MEMATTR()
macro.

A direct consequence of this bug is that cache maintenance is silently
skipped, which in turn causes 32-bit guests to crash early on when their
set/way maintenance is trapped but not emulated correctly.

Fix the broken masks by avoiding the use of FIELD_GET() altogether.

Fixes: 6d9d2115c4 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table")
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20201029144716.30476-1-will@kernel.org
2020-10-29 19:49:03 +00:00
Marc Zyngier 4a1c2c7f63 KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
The DBGD{CCINT,SCRext} and DBGVCR register entries in the cp14 array
are missing their target register, resulting in all accesses being
targetted at the guard sysreg (indexed by __INVALID_SYSREG__).

Point the emulation code at the actual register entries.

Fixes: bdfb4b389c ("arm64: KVM: add trap handlers for AArch32 debug registers")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201029172409.2768336-1-maz@kernel.org
2020-10-29 19:49:03 +00:00
Will Deacon 7efe8ef274 KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
For consistency with the rest of the stage-2 page-table page allocations
(performing using a kvm_mmu_memory_cache), ensure that __GFP_ACCOUNT is
included in the GFP flags for the PGD pages.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20201026144423.24683-1-will@kernel.org
2020-10-29 19:49:03 +00:00
Marc Zyngier d2782505fb KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition
Setting PSTATE.PAN when entering EL2 on nVHE doesn't make much
sense as this bit only means something for translation regimes
that include EL0. This obviously isn't the case in the nVHE case,
so let's drop this setting.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Link: https://lore.kernel.org/r/20201026095116.72051-4-maz@kernel.org
2020-10-29 19:49:03 +00:00
Marc Zyngier b6d6db4de8 KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation
The new calling convention says that pointers coming from the SMCCC
interface are turned into their HYP version in the host HVC handler.
However, there is still a stray kern_hyp_va() in the TLB invalidation
code, which could result in a corrupted pointer.

Drop the spurious conversion.

Fixes: a071261d93 ("KVM: arm64: nVHE: Fix pointers during SMCCC convertion")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201026095116.72051-3-maz@kernel.org
2020-10-29 19:49:03 +00:00
Marc Zyngier 28e81c6270 KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call
The hyp-init code starts by stashing a register in TPIDR_EL2
in in order to free a register. This happens no matter if the
HVC call is legal or not.

Although nothing wrong seems to come out of it, it feels odd
to alter the EL2 state for something that eventually returns
an error.

Instead, use the fact that we know exactly which bits of the
__kvm_hyp_init call are non-zero to perform the check with
a series of EOR/ROR instructions, combined with a build-time
check that the value is the one we expect.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201026095116.72051-2-maz@kernel.org
2020-10-29 19:49:02 +00:00
Rob Herring 96d389ca10 arm64: Add workaround for Arm Cortex-A77 erratum 1508412
On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
and a store exclusive or PAR_EL1 read can cause a deadlock.

The workaround requires a DMB SY before and after a PAR_EL1 register
read. In addition, it's possible an interrupt (doing a device read) or
KVM guest exit could be taken between the DMB and PAR read, so we
also need a DMB before returning from interrupt and before returning to
a guest.

A deadlock is still possible with the workaround as KVM guests must also
have the workaround. IOW, a malicious guest can deadlock an affected
systems.

This workaround also depends on a firmware counterpart to enable the h/w
to insert DMB SY after load and store exclusive instructions. See the
errata document SDEN-1152370 v10 [1] for more information.

[1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-29 12:56:01 +00:00
Stephen Boyd 1de111b51b KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED
According to the SMCCC spec[1](7.5.2 Discovery) the
ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
SMCCC_RET_NOT_SUPPORTED.

 0 is "workaround required and safe to call this function"
 1 is "workaround not required but safe to call this function"
 SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!"

SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except
calling this function may not work because it isn't implemented in some
cases". Wonderful. We map this SMC call to

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

For KVM hypercalls (hvc), we've implemented this function id to return
SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those
isn't supposed to be there. Per the code we call
arm64_get_spectre_v2_state() to figure out what to return for this
feature discovery call.

 0 is SPECTRE_MITIGATED
 SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Let's clean this up so that KVM tells the guest this mapping:

 0 is SPECTRE_MITIGATED
 1 is SPECTRE_UNAFFECTED
 SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE

Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec

Fixes: c118bbb527 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://developer.arm.com/documentation/den0028/latest [1]
Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28 11:13:36 +00:00
Linus Torvalds f9a705ad1c ARM:
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 
 PPC:
 - Fix for running nested guests with in-kernel IRQ chip
 - Fix race condition causing occasional host hard lockup
 - Minor cleanups and bugfixes
 
 x86:
 - allow trapping unknown MSRs to userspace
 - allow userspace to force #GP on specific MSRs
 - INVPCID support on AMD
 - nested AMD cleanup, on demand allocation of nested SVM state
 - hide PV MSRs and hypercalls for features not enabled in CPUID
 - new test for MSR_IA32_TSC writes from host and guest
 - cleanups: MMU, CPUID, shared MSRs
 - LAPIC latency optimizations ad bugfixes
 
 For x86, also included in this pull request is a new alternative and
 (in the future) more scalable implementation of extended page tables
 that does not need a reverse map from guest physical addresses to
 host physical addresses.  For now it is disabled by default because
 it is still lacking a few of the existing MMU's bells and whistles.
 However it is a very solid piece of work and it is already available
 for people to hammer on it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+S8dsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM40Af+M46NJmuS5rcwFfybvK/c42KT6svX
 Co1NrZDwzSQ2mMy3WQzH9qeLvb+nbY4sT3n5BPNPNsT+aIDPOTDt//qJ2/Ip9UUs
 tRNea0MAR96JWLE7MSeeRxnTaQIrw/AAZC0RXFzZvxcgytXwdqBExugw4im+b+dn
 Dcz8QxX1EkwT+4lTm5HC0hKZAuo4apnK1QkqCq4SdD2QVJ1YE6+z7pgj4wX7xitr
 STKD6q/Yt/0ndwqS0GSGbyg0jy6mE620SN6isFRkJYwqfwLJci6KnqvEK67EcNMu
 qeE017K+d93yIVC46/6TfVHzLR/D1FpQ8LZ16Yl6S13OuGIfAWBkQZtPRg==
 =AD6a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "For x86, there is a new alternative and (in the future) more scalable
  implementation of extended page tables that does not need a reverse
  map from guest physical addresses to host physical addresses.

  For now it is disabled by default because it is still lacking a few of
  the existing MMU's bells and whistles. However it is a very solid
  piece of work and it is already available for people to hammer on it.

  Other updates:

  ARM:
   - New page table code for both hypervisor and guest stage-2
   - Introduction of a new EL2-private host context
   - Allow EL2 to have its own private per-CPU variables
   - Support of PMU event filtering
   - Complete rework of the Spectre mitigation

  PPC:
   - Fix for running nested guests with in-kernel IRQ chip
   - Fix race condition causing occasional host hard lockup
   - Minor cleanups and bugfixes

  x86:
   - allow trapping unknown MSRs to userspace
   - allow userspace to force #GP on specific MSRs
   - INVPCID support on AMD
   - nested AMD cleanup, on demand allocation of nested SVM state
   - hide PV MSRs and hypercalls for features not enabled in CPUID
   - new test for MSR_IA32_TSC writes from host and guest
   - cleanups: MMU, CPUID, shared MSRs
   - LAPIC latency optimizations ad bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
  kvm: x86/mmu: NX largepage recovery for TDP MMU
  kvm: x86/mmu: Don't clear write flooding count for direct roots
  kvm: x86/mmu: Support MMIO in the TDP MMU
  kvm: x86/mmu: Support write protection for nesting in tdp MMU
  kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
  kvm: x86/mmu: Support dirty logging for the TDP MMU
  kvm: x86/mmu: Support changed pte notifier in tdp MMU
  kvm: x86/mmu: Add access tracking for tdp_mmu
  kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
  kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
  kvm: x86/mmu: Add TDP MMU PF handler
  kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
  kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
  KVM: Cache as_id in kvm_memory_slot
  kvm: x86/mmu: Add functions to handle changed TDP SPTEs
  kvm: x86/mmu: Allocate and free TDP MMU roots
  kvm: x86/mmu: Init / Uninit the TDP MMU
  kvm: x86/mmu: Introduce tdp_iter
  KVM: mmu: extract spte.h and spte.c
  KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
  ...
2020-10-23 11:17:56 -07:00
Paolo Bonzini c0623f5e5d Merge branch 'kvm-fixes' into 'next'
Pick up bugfixes from 5.9, otherwise various tests fail.
2020-10-21 18:05:58 -04:00
Paolo Bonzini 1b21c8db0e KVM/arm64 updates for Linux 5.10
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+B2vwPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDI20P/3zj+DT2INf6VrQr9nKg4uxwRk3AQYevRUhs
 id6s6OA9CCXNrN4OuboH3XXcPbI9bYJrp9xuxALAIPzVtLpQs4CLU0tgNclM3zw2
 Fe6u8wEIEFUxuNKlezd1J0gUqskTOloCsscMZ4s72ayDp56VR9fs6Ll/48rvtKXO
 x0FDkouIOAK8Cp/LVhFc0R4ZqaLvFiRr8OlJIqybM5JATl7CEq60jVlZqtoJcjOp
 M/Q/uPtoPZ+yaNizvzhNX9SSX5q7bW2hYqfjveCvmJXLoRZnUzmSiCGHk0DWSw86
 YO0X9cmK/E4feu7EwDppCbVi+FD/+Wcvhs87wAAa6jG2QgHab+OjGOg88gyNAQdL
 APMGALROiTPyC5s9SmrS8eMYZHJku9NRvBjXz5kdrXYhGhMSEmv9RfoBfwJKC4gz
 iqaQ9ejDm+uo6n9p9O8Fng34SqpIzCD1NwRW7AowM2WZH7wkpZ7d0RnbtToNzXsH
 TRKWrSZDydsy/rfL/e6k9CHdoIYCZcrRo/tfc7WGPoO6bwpUZP8O2K/dNz39fUTY
 eQls1fpEFJa8xRJFrxLHl3LuLcL4MuiTfN7TzbJpiwojXCFfUYcBJ3MGniONXIdQ
 2rtl6cx/FhUFekaC8S/k3UDK4i7AqjWsBm9nxRNBwSdwD/ecnqXMljq/DmOfjEmt
 3kOh0WS3
 =mjUF
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.10

- New page table code for both hypervisor and guest stage-2
- Introduction of a new EL2-private host context
- Allow EL2 to have its own private per-CPU variables
- Support of PMU event filtering
- Complete rework of the Spectre mitigation
2020-10-20 08:14:25 -04:00
Linus Torvalds 6734e20e39 arm64 updates for 5.10
- Userspace support for the Memory Tagging Extension introduced by Armv8.5.
   Kernel support (via KASAN) is likely to follow in 5.11.
 
 - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
   switching.
 
 - Fix and subsequent rewrite of our Spectre mitigations, including the
   addition of support for PR_SPEC_DISABLE_NOEXEC.
 
 - Support for the Armv8.3 Pointer Authentication enhancements.
 
 - Support for ASID pinning, which is required when sharing page-tables with
   the SMMU.
 
 - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op.
 
 - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and
   also support to handle CPU PMU IRQs as NMIs.
 
 - Allow prefetchable PCI BARs to be exposed to userspace using normal
   non-cacheable mappings.
 
 - Implementation of ARCH_STACKWALK for unwinding.
 
 - Improve reporting of unexpected kernel traps due to BPF JIT failure.
 
 - Improve robustness of user-visible HWCAP strings and their corresponding
   numerical constants.
 
 - Removal of TEXT_OFFSET.
 
 - Removal of some unused functions, parameters and prototypes.
 
 - Removal of MPIDR-based topology detection in favour of firmware
   description.
 
 - Cleanups to handling of SVE and FPSIMD register state in preparation
   for potential future optimisation of handling across syscalls.
 
 - Cleanups to the SDEI driver in preparation for support in KVM.
 
 - Miscellaneous cleanups and refactoring work.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+AUXMQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFc1B/4q2Kabe+pPu7s1f58Q+OTaEfqcr3F1qh27
 F1YpFZUYxg0GPfPsFrnbJpo5WKo7wdR9ceI9yF/GHjs7A/MSoQJis3pG6SlAd9c0
 nMU5tCwhg9wfq6asJtl0/IPWem6cqqhdzC6m808DjeHuyi2CCJTt0vFWH3OeHEhG
 cfmLfaSNXOXa/MjEkT8y1AXJ/8IpIpzkJeCRA1G5s18PXV9Kl5bafIo9iqyfKPLP
 0rJljBmoWbzuCSMc81HmGUQI4+8KRp6HHhyZC/k0WEVgj3LiumT7am02bdjZlTnK
 BeNDKQsv2Jk8pXP2SlrI3hIUTz0bM6I567FzJEokepvTUzZ+CVBi
 =9J8H
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's quite a lot of code here, but much of it is due to the
  addition of a new PMU driver as well as some arm64-specific selftests
  which is an area where we've traditionally been lagging a bit.

  In terms of exciting features, this includes support for the Memory
  Tagging Extension which narrowly missed 5.9, hopefully allowing
  userspace to run with use-after-free detection in production on CPUs
  that support it. Work is ongoing to integrate the feature with KASAN
  for 5.11.

  Another change that I'm excited about (assuming they get the hardware
  right) is preparing the ASID allocator for sharing the CPU page-table
  with the SMMU. Those changes will also come in via Joerg with the
  IOMMU pull.

  We do stray outside of our usual directories in a few places, mostly
  due to core changes required by MTE. Although much of this has been
  Acked, there were a couple of places where we unfortunately didn't get
  any review feedback.

  Other than that, we ran into a handful of minor conflicts in -next,
  but nothing that should post any issues.

  Summary:

   - Userspace support for the Memory Tagging Extension introduced by
     Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.

   - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
     switching.

   - Fix and subsequent rewrite of our Spectre mitigations, including
     the addition of support for PR_SPEC_DISABLE_NOEXEC.

   - Support for the Armv8.3 Pointer Authentication enhancements.

   - Support for ASID pinning, which is required when sharing
     page-tables with the SMMU.

   - MM updates, including treating flush_tlb_fix_spurious_fault() as a
     no-op.

   - Perf/PMU driver updates, including addition of the ARM CMN PMU
     driver and also support to handle CPU PMU IRQs as NMIs.

   - Allow prefetchable PCI BARs to be exposed to userspace using normal
     non-cacheable mappings.

   - Implementation of ARCH_STACKWALK for unwinding.

   - Improve reporting of unexpected kernel traps due to BPF JIT
     failure.

   - Improve robustness of user-visible HWCAP strings and their
     corresponding numerical constants.

   - Removal of TEXT_OFFSET.

   - Removal of some unused functions, parameters and prototypes.

   - Removal of MPIDR-based topology detection in favour of firmware
     description.

   - Cleanups to handling of SVE and FPSIMD register state in
     preparation for potential future optimisation of handling across
     syscalls.

   - Cleanups to the SDEI driver in preparation for support in KVM.

   - Miscellaneous cleanups and refactoring work"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  Revert "arm64: initialize per-cpu offsets earlier"
  arm64: random: Remove no longer needed prototypes
  arm64: initialize per-cpu offsets earlier
  kselftest/arm64: Check mte tagged user address in kernel
  kselftest/arm64: Verify KSM page merge for MTE pages
  kselftest/arm64: Verify all different mmap MTE options
  kselftest/arm64: Check forked child mte memory accessibility
  kselftest/arm64: Verify mte tag inclusion via prctl
  kselftest/arm64: Add utilities and a test to validate mte memory
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
  arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
  KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
  arm64: Get rid of arm64_ssbd_state
  KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
  KVM: arm64: Get rid of kvm_arm_have_ssbd()
  KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
  ...
2020-10-12 10:00:51 -07:00
Linus Torvalds 22fbc037cd Two bugfix patches.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl94P3QUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP7+AgAm4NqCkxr4SqfYX3WHCiJP5Go9wQY
 9P5SP9YZFfDY/lqPuFMhUv2u/IumO0ErgSzrdmGIqSfjY9/Qf8Y4csc22/XZzvD+
 t8Jxmst51td9sO8vDC2u7Qnb1/Kbfk8rg9DXEre2Gw6EaO8EtGGyOMXtPUmGFEDS
 AKk+k081Mebs3rao8oy8MMDq0/AddE2afi4Cgi0yRkeMqxVOxaEe1gsBbZpOKrXO
 0n1RwaAEsZlZaDIUdu8pmwh/KMNJwhn3qvyG4XgfJsIWQSphB5y5EMhCnnoI/SMb
 auK4XSUDP8Y+ZwDRKxuIxBgBY4xix0aFKdZVSoPGdapVL/o8daYjaRt4ig==
 =4KCW
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Two bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: update PFEC_MASK/PFEC_MATCH together with PF intercept
  KVM: arm64: Restore missing ISB on nVHE __tlb_switch_to_guest
2020-10-03 12:19:23 -07:00
Paolo Bonzini e2e1a1c86b KVM/arm64 fixes for 5.9, take #3
- Fix synchronization of VTTBR update on TLB invalidation for nVHE systems
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl91qtQPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpD2eoQAKEpLbbWycXVKikz+O1yev8EELmWMpvV/7uM
 Q+4+FjO2swcb5FEPBw6hicZKXIFRi7AP3xFp6hKuWyLfdnYST+WOf5KCI1DcDdG+
 /+5bpiS9F1Z+K9Inm6XwpXCnWFuo1P8i1T65mT5HKIm/9+zRwFY5X8svXXvnP4h1
 OVgDSI+8jn14yf9aMnvznmvAiSN9GXiVt4v3h9W/1B5FWw3sUT1bTdFIwjh4q7M5
 Q32fLeWYdLqTnFOaYtLJNRElE9JSUFkwNpSg1nqoFUH+8gK5oBEEnhzdPzJ9figz
 tXrMKlWylswX3ySHSe1L2m9hekwwF3p/h3r4QlpbR2feI8jhOG9mG1YNvPSCdgKh
 xWDXWLUnymw4EZGyREeFdMe26gg4xVKqNnVB7Na3PFOSQklg6oqdcUyQccgyWd/6
 i8ePA+djVFA/C+iVanv/xphAalT0DmNEe3isBSxkt0RZcrLoCDBoOsTRTRQv8d+y
 xBwl3k/DAfGtYjJwntxUNjuQUOzn9E3cc/L/z4Y0ON+sYGzDbv9cPU2oMNyrqFTj
 /AZVumCxz3gjjBah62iu7m0hkXIrh7H86ua/dAn/CC0dtAAcANQn+2373C02IPN9
 ntCEjc+8aMdw4yph31Ngxd/IT6CQxCOpVJd++kzP5BpLvysUsUGy1M/3Hsd1c88v
 hKWU5aS4
 =YFeP
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm64 fixes for 5.9, take #3

- Fix synchronization of VTTBR update on TLB invalidation for nVHE systems
2020-10-03 05:07:59 -04:00
Will Deacon baab853229 Merge branch 'for-next/mte' into for-next/core
Add userspace support for the Memory Tagging Extension introduced by
Armv8.5.

(Catalin Marinas and others)
* for-next/mte: (30 commits)
  arm64: mte: Fix typo in memory tagging ABI documentation
  arm64: mte: Add Memory Tagging Extension documentation
  arm64: mte: Kconfig entry
  arm64: mte: Save tags when hibernating
  arm64: mte: Enable swap of tagged pages
  mm: Add arch hooks for saving/restoring tags
  fs: Handle intra-page faults in copy_mount_options()
  arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
  arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
  arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
  arm64: mte: Restore the GCR_EL1 register after a suspend
  arm64: mte: Allow user control of the generated random tags via prctl()
  arm64: mte: Allow user control of the tag check mode via prctl()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Introduce arch_validate_flags()
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_calc_vm_flag_bits()
  arm64: mte: Tags-aware aware memcmp_pages() implementation
  arm64: Avoid unnecessary clear_user_page() indirection
  ...
2020-10-02 12:16:11 +01:00
Will Deacon 0a21ac0d30 Merge branch 'for-next/ghostbusters' into for-next/core
Fix and subsequently rewrite Spectre mitigations, including the addition
of support for PR_SPEC_DISABLE_NOEXEC.

(Will Deacon and Marc Zyngier)
* for-next/ghostbusters: (22 commits)
  arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
  arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
  KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
  arm64: Get rid of arm64_ssbd_state
  KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
  KVM: arm64: Get rid of kvm_arm_have_ssbd()
  KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
  arm64: Rewrite Spectre-v4 mitigation code
  arm64: Move SSBD prctl() handler alongside other spectre mitigation code
  arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4
  arm64: Treat SSBS as a non-strict system feature
  arm64: Group start_thread() functions together
  KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2
  arm64: Rewrite Spectre-v2 mitigation code
  arm64: Introduce separate file for spectre mitigations and reporting
  arm64: Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2
  KVM: arm64: Simplify install_bp_hardening_cb()
  KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE
  arm64: Remove Spectre-related CONFIG_* options
  arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs
  ...
2020-10-02 12:15:24 +01:00
Will Deacon 57b8b1b435 Merge branches 'for-next/acpi', 'for-next/boot', 'for-next/bpf', 'for-next/cpuinfo', 'for-next/fpsimd', 'for-next/misc', 'for-next/mm', 'for-next/pci', 'for-next/perf', 'for-next/ptrauth', 'for-next/sdei', 'for-next/selftests', 'for-next/stacktrace', 'for-next/svm', 'for-next/topology', 'for-next/tpyos' and 'for-next/vdso' into for-next/core
Remove unused functions and parameters from ACPI IORT code.
(Zenghui Yu via Lorenzo Pieralisi)
* for-next/acpi:
  ACPI/IORT: Remove the unused inline functions
  ACPI/IORT: Drop the unused @ops of iort_add_device_replay()

Remove redundant code and fix documentation of caching behaviour for the
HVC_SOFT_RESTART hypercall.
(Pingfan Liu)
* for-next/boot:
  Documentation/kvm/arm: improve description of HVC_SOFT_RESTART
  arm64/relocate_kernel: remove redundant code

Improve reporting of unexpected kernel traps due to BPF JIT failure.
(Will Deacon)
* for-next/bpf:
  arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM

Improve robustness of user-visible HWCAP strings and their corresponding
numerical constants.
(Anshuman Khandual)
* for-next/cpuinfo:
  arm64/cpuinfo: Define HWCAP name arrays per their actual bit definitions

Cleanups to handling of SVE and FPSIMD register state in preparation
for potential future optimisation of handling across syscalls.
(Julien Grall)
* for-next/fpsimd:
  arm64/sve: Implement a helper to load SVE registers from FPSIMD state
  arm64/sve: Implement a helper to flush SVE registers
  arm64/fpsimdmacros: Allow the macro "for" to be used in more cases
  arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN
  arm64/signal: Update the comment in preserve_sve_context
  arm64/fpsimd: Update documentation of do_sve_acc

Miscellaneous changes.
(Tian Tao and others)
* for-next/misc:
  arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE
  arm64: mm: Fix missing-prototypes in pageattr.c
  arm64/fpsimd: Fix missing-prototypes in fpsimd.c
  arm64: hibernate: Remove unused including <linux/version.h>
  arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR()
  arm64: Remove the unused include statements
  arm64: get rid of TEXT_OFFSET
  arm64: traps: Add str of description to panic() in die()

Memory management updates and cleanups.
(Anshuman Khandual and others)
* for-next/mm:
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64/mm: Unify CONT_PMD_SHIFT
  arm64/mm: Unify CONT_PTE_SHIFT
  arm64/mm: Remove CONT_RANGE_OFFSET
  arm64/mm: Enable THP migration
  arm64/mm: Change THP helpers to comply with generic MM semantics
  arm64/mm/ptdump: Add address markers for BPF regions

Allow prefetchable PCI BARs to be exposed to userspace using normal
non-cacheable mappings.
(Clint Sbisa)
* for-next/pci:
  arm64: Enable PCI write-combine resources under sysfs

Perf/PMU driver updates.
(Julien Thierry and others)
* for-next/perf:
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm_pmu: arm64: Use NMIs for PMU
  arm_pmu: Introduce pmu_irq_ops
  KVM: arm64: pmu: Make overflow handler NMI safe
  arm64: perf: Defer irq_work to IPI_IRQ_WORK
  arm64: perf: Remove PMU locking
  arm64: perf: Avoid PMXEV* indirection
  arm64: perf: Add missing ISB in armv8pmu_enable_counter()
  perf: Add Arm CMN-600 PMU driver
  perf: Add Arm CMN-600 DT binding
  arm64: perf: Add support caps under sysfs
  drivers/perf: thunderx2_pmu: Fix memory resource error handling
  drivers/perf: xgene_pmu: Fix uninitialized resource struct
  perf: arm_dsu: Support DSU ACPI devices
  arm64: perf: Remove unnecessary event_idx check
  drivers/perf: hisi: Add missing include of linux/module.h
  arm64: perf: Add general hardware LLC events for PMUv3

Support for the Armv8.3 Pointer Authentication enhancements.
(By Amit Daniel Kachhap)
* for-next/ptrauth:
  arm64: kprobe: clarify the comment of steppable hint instructions
  arm64: kprobe: disable probe of fault prone ptrauth instruction
  arm64: cpufeature: Modify address authentication cpufeature to exact
  arm64: ptrauth: Introduce Armv8.3 pointer authentication enhancements
  arm64: traps: Allow force_signal_inject to pass esr error code
  arm64: kprobe: add checks for ARMv8.3-PAuth combined instructions

Tonnes of cleanup to the SDEI driver.
(Gavin Shan)
* for-next/sdei:
  firmware: arm_sdei: Remove _sdei_event_unregister()
  firmware: arm_sdei: Remove _sdei_event_register()
  firmware: arm_sdei: Introduce sdei_do_local_call()
  firmware: arm_sdei: Cleanup on cross call function
  firmware: arm_sdei: Remove while loop in sdei_event_unregister()
  firmware: arm_sdei: Remove while loop in sdei_event_register()
  firmware: arm_sdei: Remove redundant error message in sdei_probe()
  firmware: arm_sdei: Remove duplicate check in sdei_get_conduit()
  firmware: arm_sdei: Unregister driver on error in sdei_init()
  firmware: arm_sdei: Avoid nested statements in sdei_init()
  firmware: arm_sdei: Retrieve event number from event instance
  firmware: arm_sdei: Common block for failing path in sdei_event_create()
  firmware: arm_sdei: Remove sdei_is_err()

Selftests for Pointer Authentication and FPSIMD/SVE context-switching.
(Mark Brown and Boyan Karatotev)
* for-next/selftests:
  selftests: arm64: Add build and documentation for FP tests
  selftests: arm64: Add wrapper scripts for stress tests
  selftests: arm64: Add utility to set SVE vector lengths
  selftests: arm64: Add stress tests for FPSMID and SVE context switching
  selftests: arm64: Add test for the SVE ptrace interface
  selftests: arm64: Test case for enumeration of SVE vector lengths
  kselftests/arm64: add PAuth tests for single threaded consistency and differently initialized keys
  kselftests/arm64: add PAuth test for whether exec() changes keys
  kselftests/arm64: add nop checks for PAuth tests
  kselftests/arm64: add a basic Pointer Authentication test

Implementation of ARCH_STACKWALK for unwinding.
(Mark Brown)
* for-next/stacktrace:
  arm64: Move console stack display code to stacktrace.c
  arm64: stacktrace: Convert to ARCH_STACKWALK
  arm64: stacktrace: Make stack walk callback consistent with generic code
  stacktrace: Remove reliable argument from arch_stack_walk() callback

Support for ASID pinning, which is required when sharing page-tables with
the SMMU.
(Jean-Philippe Brucker)
* for-next/svm:
  arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
  arm64: mm: Pin down ASIDs for sharing mm with devices

Rely on firmware tables for establishing CPU topology.
(Valentin Schneider)
* for-next/topology:
  arm64: topology: Stop using MPIDR for topology information

Spelling fixes.
(Xiaoming Ni and Yanfei Xu)
* for-next/tpyos:
  arm64/numa: Fix a typo in comment of arm64_numa_init
  arm64: fix some spelling mistakes in the comments by codespell

vDSO cleanups.
(Will Deacon)
* for-next/vdso:
  arm64: vdso: Fix unusual formatting in *setup_additional_pages()
  arm64: vdso32: Remove a bunch of #ifdef CONFIG_COMPAT_VDSO guards
2020-10-02 12:01:41 +01:00
Marc Zyngier 4e5dc64c43 Merge branches 'kvm-arm64/pt-new' and 'kvm-arm64/pmu-5.9' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-10-02 09:25:55 +01:00
Will Deacon ffd1b63a58 KVM: arm64: Ensure user_mem_abort() return value is initialised
If a change in the MMU notifier sequence number forces user_mem_abort()
to return early when attempting to handle a stage-2 fault, we return
uninitialised stack to kvm_handle_guest_abort(), which could potentially
result in the injection of an external abort into the guest or a spurious
return to userspace. Neither or these are what we want to do.

Initialise 'ret' to 0 in user_mem_abort() so that bailing due to a
change in the MMU notrifier sequence number is treated as though the
fault was handled.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200930102442.16142-1-will@kernel.org
2020-10-02 09:25:25 +01:00
Will Deacon b259d137e9 KVM: arm64: Pass level hint to TLBI during stage-2 permission fault
Alex pointed out that we don't pass a level hint to the TLBI instruction
when handling a stage-2 permission fault, even though the walker does
at some point have the level information in its hands.

Rework stage2_update_leaf_attrs() so that it can optionally return the
level of the updated pte to its caller, which can in turn be used to
provide the correct TLBI level hint.

Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/595cc73e-636e-8b3a-f93a-b4e9fb218db8@arm.com
Link: https://lore.kernel.org/r/20200930131801.16889-1-will@kernel.org
2020-10-02 09:16:07 +01:00
Marc Zyngier 452d622279 KVM: arm64: Restore missing ISB on nVHE __tlb_switch_to_guest
Commit a0e50aa3f4 ("KVM: arm64: Factor out stage 2 page table
data from struct kvm") dropped the ISB after __load_guest_stage2(),
only leaving the one that is required when the speculative AT
workaround is in effect.

As Andrew points it: "This alternative is 'backwards' to avoid a
double ISB as there is one in __load_guest_stage2 when the workaround
is active."

Restore the missing ISB, conditionned on the AT workaround not being
active.

Fixes: a0e50aa3f4 ("KVM: arm64: Factor out stage 2 page table data from struct kvm")
Reported-by: Andrew Scull <ascull@google.com>
Reported-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-10-01 09:53:45 +01:00
Marc Zyngier 14ef9d0492 Merge branch 'kvm-arm64/hyp-pcpu' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-30 14:05:35 +01:00
Marc Zyngier 816c347f3a Merge remote-tracking branch 'arm64/for-next/ghostbusters' into kvm-arm64/hyp-pcpu
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-30 09:48:30 +01:00
David Brazdil a3bb9c3a00 kvm: arm64: Remove unnecessary hyp mappings
With all nVHE per-CPU variables being part of the hyp per-CPU region,
mapping them individual is not necessary any longer. They are mapped to hyp
as part of the overall per-CPU region.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-11-dbrazdil@google.com
2020-09-30 08:37:57 +01:00
David Brazdil 30c953911c kvm: arm64: Set up hyp percpu data for nVHE
Add hyp percpu section to linker script and rename the corresponding ELF
sections of hyp/nvhe object files. This moves all nVHE-specific percpu
variables to the new hyp percpu section.

Allocate sufficient amount of memory for all percpu hyp regions at global KVM
init time and create corresponding hyp mappings.

The base addresses of hyp percpu regions are kept in a dynamically allocated
array in the kernel.

Add NULL checks in PMU event-reset code as it may run before KVM memory is
initialized.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com
2020-09-30 08:37:14 +01:00
David Brazdil 2a1198c9b4 kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE
Host CPU context is stored in a global per-cpu variable `kvm_host_data`.
In preparation for introducing independent per-CPU region for nVHE hyp,
create two separate instances of `kvm_host_data`, one for VHE and one
for nVHE.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-9-dbrazdil@google.com
2020-09-30 08:37:13 +01:00
David Brazdil df4c8214a1 kvm: arm64: Duplicate arm64_ssbd_callback_required for nVHE hyp
Hyp keeps track of which cores require SSBD callback by accessing a
kernel-proper global variable. Create an nVHE symbol of the same name
and copy the value from kernel proper to nVHE as KVM is being enabled
on a core.

Done in preparation for separating percpu memory owned by kernel
proper and nVHE.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-8-dbrazdil@google.com
2020-09-30 08:37:13 +01:00
David Brazdil ea391027d3 kvm: arm64: Remove hyp_adr/ldr_this_cpu
The hyp_adr/ldr_this_cpu helpers were introduced for use in hyp code
because they always needed to use TPIDR_EL2 for base, while
adr/ldr_this_cpu from kernel proper would select between TPIDR_EL2 and
_EL1 based on VHE/nVHE.

Simplify this now that the hyp mode case can be handled using the
__KVM_VHE/NVHE_HYPERVISOR__ macros.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-6-dbrazdil@google.com
2020-09-30 08:37:07 +01:00
David Brazdil 717cf94adb kvm: arm64: Remove __hyp_this_cpu_read
this_cpu_ptr is meant for use in kernel proper because it selects between
TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always
select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by
selecting _EL2 register under __KVM_NVHE_HYPERVISOR__. VHE continues
selecting the register using alternatives.

Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check
which is omitted by the hyp helpers. Preserve the behavior for nVHE by
overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend
the checks into VHE hyp code.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-5-dbrazdil@google.com
2020-09-30 08:33:52 +01:00
David Brazdil ab25464bda kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY
Relying on objcopy to prefix the ELF section names of the nVHE hyp code
is brittle and prevents us from using wildcards to match specific
section names.

Improve the build rules by partially linking all '.nvhe.o' files and
prefixing their ELF section names using a linker script. Continue using
objcopy for prefixing ELF symbol names.

One immediate advantage of this approach is that all subsections
matching a pattern can be merged into a single prefixed section, eg.
.text and .text.* can be linked into a single '.hyp.text'. This removes
the need for -fno-reorder-functions on GCC and will be useful in the
future too: LTO builds use .text subsections, compilers routinely
generate .rodata subsections, etc.

Partially linking all hyp code into a single object file also makes it
easier to analyze.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-2-dbrazdil@google.com
2020-09-30 08:33:52 +01:00
Will Deacon 9ef2b48be9 KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
Patching the EL2 exception vectors is integral to the Spectre-v2
workaround, where it can be necessary to execute CPU-specific sequences
to nobble the branch predictor before running the hypervisor text proper.

Remove the dependency on CONFIG_RANDOMIZE_BASE and allow the EL2 vectors
to be patched even when KASLR is not enabled.

Fixes: 7a132017e7a5 ("KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202009221053.Jv1XsQUZ%lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Marc Zyngier d63d975a71 KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
Convert the KVM WA2 code to using the Spectre infrastructure,
making the code much more readable. It also allows us to
take SSBS into account for the mitigation.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:17 +01:00
Marc Zyngier 29e8910a56 KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
Owing to the fact that the host kernel is always mitigated, we can
drastically simplify the WA2 handling by keeping the mitigation
state ON when entering the guest. This means the guest is either
unaffected or not mitigated.

This results in a nice simplification of the mitigation space,
and the removal of a lot of code that was never really used anyway.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon 9b0955baa4 arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4
In a similar manner to the renaming of ARM64_HARDEN_BRANCH_PREDICTOR
to ARM64_SPECTRE_V2, rename ARM64_SSBD to ARM64_SPECTRE_V4. This isn't
_entirely_ accurate, as we also need to take into account the interaction
with SSBS, but that will be taken care of in subsequent patches.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Marc Zyngier e1026237f9 KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2
If the system is not affected by Spectre-v2, then advertise to the KVM
guest that it is not affected, without the need for a safelist in the
guest.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Will Deacon d4647f0a2a arm64: Rewrite Spectre-v2 mitigation code
The Spectre-v2 mitigation code is pretty unwieldy and hard to maintain.
This is largely due to it being written hastily, without much clue as to
how things would pan out, and also because it ends up mixing policy and
state in such a way that it is very difficult to figure out what's going
on.

Rewrite the Spectre-v2 mitigation so that it clearly separates state from
policy and follows a more structured approach to handling the mitigation.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon 5359a87d5b KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE
The removal of CONFIG_HARDEN_BRANCH_PREDICTOR means that
CONFIG_KVM_INDIRECT_VECTORS is synonymous with CONFIG_RANDOMIZE_BASE,
so replace it.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Will Deacon 6e5f092784 arm64: Remove Spectre-related CONFIG_* options
The spectre mitigations are too configurable for their own good, leading
to confusing logic trying to figure out when we should mitigate and when
we shouldn't. Although the plethora of command-line options need to stick
around for backwards compatibility, the default-on CONFIG options that
depend on EXPERT can be dropped, as the mitigations only do anything if
the system is vulnerable, a mitigation is available and the command-line
hasn't disabled it.

Remove CONFIG_HARDEN_BRANCH_PREDICTOR and CONFIG_ARM64_SSBD in favour of
enabling this code unconditionally.

Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:15 +01:00
Marc Zyngier 2e02cbb236 Merge branch 'kvm-arm64/pmu-5.9' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 15:12:54 +01:00
Marc Zyngier 88865beca9 KVM: arm64: Mask out filtered events in PCMEID{0,1}_EL1
As we can now hide events from the guest, let's also adjust its view of
PCMEID{0,1}_EL1 so that it can figure out why some common events are not
counting as they should.

The astute user can still look into the TRM for their CPU and find out
they've been cheated, though. Nobody's perfect.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 14:19:39 +01:00
Marc Zyngier d7eec2360e KVM: arm64: Add PMU event filtering infrastructure
It can be desirable to expose a PMU to a guest, and yet not want the
guest to be able to count some of the implemented events (because this
would give information on shared resources, for example.

For this, let's extend the PMUv3 device API, and offer a way to setup a
bitmap of the allowed events (the default being no bitmap, and thus no
filtering).

Userspace can thus allow/deny ranges of event. The default policy
depends on the "polarity" of the first filter setup (default deny if the
filter allows events, and default allow if the filter denies events).
This allows to setup exactly what is allowed for a given guest.

Note that although the ioctl is per-vcpu, the map of allowed events is
global to the VM (it can be setup from any vcpu until the vcpu PMU is
initialized).

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 14:19:39 +01:00
Marc Zyngier fd65a3b5f8 KVM: arm64: Use event mask matching architecture revision
The PMU code suffers from a small defect where we assume that the event
number provided by the guest is always 16 bit wide, even if the CPU only
implements the ARMv8.0 architecture. This isn't really problematic in
the sense that the event number ends up in a system register, cropping
it to the right width, but still this needs fixing.

In order to make it work, let's probe the version of the PMU that the
guest is going to use. This is done by temporarily creating a kernel
event and looking at the PMUVer field that has been saved at probe time
in the associated arm_pmu structure. This in turn gets saved in the kvm
structure, and subsequently used to compute the event mask that gets
used throughout the PMU code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 14:19:38 +01:00
Marc Zyngier 42223fb100 KVM: arm64: Refactor PMU attribute error handling
The PMU emulation error handling is pretty messy when dealing with
attributes. Let's refactor it so that we have less duplication,
and that it is easy to extend later on.

A functional change is that kvm_arm_pmu_v3_init() used to return
-ENXIO when the PMU feature wasn't set. The error is now reported
as -ENODEV, matching the documentation. -ENXIO is still returned
when the interrupt isn't properly configured.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 14:19:38 +01:00
Julien Thierry 95e92e45a4 KVM: arm64: pmu: Make overflow handler NMI safe
kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from
NMI context, defer waking the vcpu to an irq_work queue.

A vcpu can be freed while it's not running by kvm_destroy_vm(). Prevent
running the irq_work for a non-existent vcpu by calling irq_work_sync() on
the PMU destroy path.

[Alexandru E.: Added irq_work_sync()]

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20200924110706.254996-6-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 19:00:17 +01:00
Paolo Bonzini bf3c0e5e71 Merge branch 'x86-seves-for-paolo' of https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD 2020-09-22 06:43:17 -04:00
Linus Torvalds beaeb4f39b ARM:
- fix fault on page table writes during instruction fetch
 
 s390:
 - doc improvement
 
 x86:
 - The obvious patches are always the ones that turn out to be
   completely broken.  /me hangs his head in shame.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl9nyjsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpcAf/bsW1B+Q8QJnyfU4RSiX28lG8Ki9F
 9A0aVJPW4U/x7COZuhldrQGkbHDA5agavCevghMuOqWkz2gs6ihpAGgzfG+FVIm7
 2yi4k9A90kPrMSBf8qaLgvybGNO6uxGpJmv54MjHpkLPUEz+J1MuB9D6eEqBZkWz
 ncOSsGS2eeUFpqulA9DCN3O3PbaFeAXPNJnDNGqxrGjV7CriosRlbK02PVxTQzvT
 nuGzDgaOmmRXntIQ7hrk9DJlHm7gH2jH8TK9gB2xm0yuVm2/nNlpkY6rP6NDUdLs
 OrJOzxWOcSO8HRgBhlFhED/8heTqHCJS1vMUI3M6Z62p324TjRDjRC+4ow==
 =tpZU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - fix fault on page table writes during instruction fetch

  s390:
   - doc improvement

  x86:
   - The obvious patches are always the ones that turn out to be
     completely broken. /me hangs his head in shame"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  Revert "KVM: Check the allocation of pv cpu mask"
  KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
  KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
  docs: kvm: add documentation for KVM_CAP_S390_DIAG318
2020-09-21 08:53:48 -07:00
Paolo Bonzini b73815a18d KVM/arm64 fixes for 5.9, take #2
- Fix handling of S1 Page Table Walk permission fault at S2
   on instruction fetch
 - Cleanup kvm_vcpu_dabt_iswrite()
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl9k6LYPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDe6UP+gMRK9yEgY7eBg23nfOVDiF7t8dvj+J5eSws
 8YP3nRew9pL710SLClGJCrVyVaCEMPN7cL7O926IALqGPMsPgKFPX7RcshjVUOUo
 otjQPfT3hMOKfmAGCHyQHeudQ8KMEZX9ikdvGd19zT3F2t2CojUn0uFZPgjIXKOe
 x4Un5F/QuOVFoI85BctqpNRJsc9yzjpIu/J2YjO+QeKyMoK1gp+GLIACDczfbaLq
 IWSRApDPP3vyxImxtWaahI/IlAk4XcDV4EFmUMN2GDuK64bFTcky1YLr88njp0ZV
 VVrxw4Z/lGGKl1MqNV8/OVBS+z1+L8G8+uNFELFL1veDc+eDn+P/Sr/SoDxT93uO
 Dz9/gWJd2GEtneJ/PspjxLdzpAkUluV7vRQBikjVtBld7OivnDYeylJsA88olXzI
 PUY+Y9p1K3T/u5oeiKEnDvufzisDiYZSKpE81KtJP+++biOULVIm7rkCmg5IvUAF
 GhxnQ/n2RW9Q+FCXy7FTss8gxl7FBKdWSCnGQ1ZYa+hwnIa9WU1GLG94Zx/qc/GO
 zIYstUKsYtn2ViHn9XTbbewUz1XeCaedmb/R1Uu14j6WrzgirhJOfVHE2US9Xlji
 w/GruENw4Xqlb842q5NbZnv2XtTFQWLAduthAotclnIecYRvqIWRR8JVf64XVvjR
 K0JmH08N
 =D7jg
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm64 fixes for 5.9, take #2

- Fix handling of S1 Page Table Walk permission fault at S2
  on instruction fetch
- Cleanup kvm_vcpu_dabt_iswrite()
2020-09-20 17:31:07 -04:00
Marc Zyngier c4ad98e4b7 KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
KVM currently assumes that an instruction abort can never be a write.
This is in general true, except when the abort is triggered by
a S1PTW on instruction fetch that tries to update the S1 page tables
(to set AF, for example).

This can happen if the page tables have been paged out and brought
back in without seeing a direct write to them (they are thus marked
read only), and the fault handling code will make the PT executable(!)
instead of writable. The guest gets stuck forever.

In these conditions, the permission fault must be considered as
a write so that the Stage-1 update can take place. This is essentially
the I-side equivalent of the problem fixed by 60e21a0ef5 ("arm64: KVM:
Take S1 walks into account when determining S2 write faults").

Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce
kvm_vcpu_trap_is_exec_fault() that only return true when no faulting
on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to
kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't
specific to data abort.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
2020-09-18 18:01:48 +01:00
Marc Zyngier 41fa0f5971 Merge branch 'kvm-arm64/misc-5.10' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-18 16:22:28 +01:00
Marc Zyngier 8910f08960 Merge branch 'kvm-arm64/pt-new' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>

# Conflicts:
#	arch/arm64/kvm/mmu.c
2020-09-18 16:22:18 +01:00
Xiaofei Tan c9c0279cc0 KVM: arm64: Fix doc warnings in mmu code
Fix following warnings caused by mismatch bewteen function parameters
and comments.
arch/arm64/kvm/mmu.c:128: warning: Function parameter or member 'mmu' not described in '__unmap_stage2_range'
arch/arm64/kvm/mmu.c:128: warning: Function parameter or member 'may_block' not described in '__unmap_stage2_range'
arch/arm64/kvm/mmu.c:128: warning: Excess function parameter 'kvm' description in '__unmap_stage2_range'
arch/arm64/kvm/mmu.c:499: warning: Function parameter or member 'writable' not described in 'kvm_phys_addr_ioremap'
arch/arm64/kvm/mmu.c:538: warning: Function parameter or member 'mmu' not described in 'stage2_wp_range'
arch/arm64/kvm/mmu.c:538: warning: Excess function parameter 'kvm' description in 'stage2_wp_range'

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1600307269-50957-1-git-send-email-tanxiaofei@huawei.com
2020-09-18 16:18:51 +01:00
Alexandru Elisei ada329e6b5 KVM: arm64: Do not flush memslot if FWB is supported
As a result of a KVM_SET_USER_MEMORY_REGION ioctl, KVM flushes the
dcache for the memslot being changed to ensure a consistent view of memory
between the host and the guest: the host runs with caches enabled, and
it is possible for the data written by the hypervisor to still be in the
caches, but the guest is running with stage 1 disabled, meaning data
accesses are to Device-nGnRnE memory, bypassing the caches entirely.

Flushing the dcache is not necessary when KVM enables FWB, because it
forces the guest to uses cacheable memory accesses.

The current behaviour does not change, as the dcache flush helpers execute
the cache operation only if FWB is not enabled, but walking the stage 2
table is avoided.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915170442.131635-1-alexandru.elisei@arm.com
2020-09-18 16:18:24 +01:00
Liu Shixin cb62e0b5c8 KVM: arm64: vgic-debug: Convert to use DEFINE_SEQ_ATTRIBUTE macro
Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200916025023.3992679-1-liushixin2@huawei.com
2020-09-18 16:17:27 +01:00
Tian Tao 8a4374f97d KVM: arm64: Fix inject_fault.c kernel-doc warnings
Fix kernel-doc warnings.
arch/arm64/kvm/inject_fault.c:210: warning: Function parameter or member
'vcpu' not described in 'kvm_inject_undefined'

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/1600154512-44624-1-git-send-email-tiantao6@hisilicon.com
2020-09-18 16:17:22 +01:00
Alexandru Elisei 523b3999e5 KVM: arm64: Try PMD block mappings if PUD mappings are not supported
When userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to
use the same block size to map the faulting IPA in stage 2. If stage 2
cannot the same block mapping because the block size doesn't fit in the
memslot or the memslot is not properly aligned, user_mem_abort() will fall
back to a page mapping, regardless of the block size. We can do better for
PUD backed hugetlbfs by checking if a PMD block mapping is supported before
deciding to use a page.

vma_pagesize is an unsigned long, use 1UL instead of 1ULL when assigning
its value.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200910133351.118191-1-alexandru.elisei@arm.com
2020-09-18 16:10:15 +01:00
Marc Zyngier 81867b75db Merge branch 'kvm-arm64/nvhe-hyp-context' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-16 10:59:17 +01:00
Andrew Scull a071261d93 KVM: arm64: nVHE: Fix pointers during SMCCC convertion
The host need not concern itself with the pointer differences for the
hyp interfaces that are shared between VHE and nVHE so leave it to the
hyp to handle.

As the SMCCC function IDs are converted into function calls, it is a
suitable place to also convert any pointer arguments into hyp pointers.
This, additionally, eases the reuse of the handlers in different
contexts.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-20-ascull@google.com
2020-09-15 18:39:04 +01:00
Andrew Scull 04e4caa8d3 KVM: arm64: nVHE: Migrate hyp-init to SMCCC
To complete the transition to SMCCC, the hyp initialization is given a
function ID. This looks neater than comparing the hyp stub function IDs
to the page table physical address.

Some care is taken to only clobber x0-3 before the host context is saved
as only those registers can be clobbered accoring to SMCCC. Fortunately,
only a few acrobatics are needed. The possible new tpidr_el2 is moved to
the argument in x2 so that it can be stashed in tpidr_el2 early to free
up a scratch register. The page table configuration then makes use of
x0-2.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-19-ascull@google.com
2020-09-15 18:39:04 +01:00
Andrew Scull 054698316d KVM: arm64: nVHE: Migrate hyp interface to SMCCC
Rather than passing arbitrary function pointers to run at hyp, define
and equivalent set of SMCCC functions.

Since the SMCCC functions are strongly tied to the original function
prototypes, it is not expected for the host to ever call an invalid ID
but a warning is raised if this does ever occur.

As __kvm_vcpu_run is used for every switch between the host and a guest,
it is explicitly singled out to be identified before the other function
IDs to improve the performance of the hot path.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-18-ascull@google.com
2020-09-15 18:39:04 +01:00
Andrew Scull 5dc33bd199 KVM: arm64: nVHE: Pass pointers consistently to hyp-init
Rather than some being kernel pointer and others being hyp pointers,
standardize on all pointers being hyp pointers.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-15-ascull@google.com
2020-09-15 18:39:03 +01:00
Andrew Scull a2e102e20f KVM: arm64: nVHE: Handle hyp panics
Restore the host context when panicking from hyp to give the best chance
of the panic being clean.

The host requires that registers be preserved such as x18 for the shadow
callstack. If the panic is caused by an exception from EL1, the host
context is still valid so the panic can return straight back to the
host. If the panic comes from EL2 then it's most likely that the hyp
context is active and the host context needs to be restored.

There are windows before and after the host context is saved and
restored that restoration is attempted incorrectly and the panic won't
be clean.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-14-ascull@google.com
2020-09-15 18:39:03 +01:00
Andrew Scull 4e3393a969 KVM: arm64: nVHE: Switch to hyp context for EL2
Save and restore the host context when switching to and from hyp. This
gives hyp its own context that the host will not see as a step towards a
full trust boundary between the two.

SP_EL0 and pointer authentication keys are currently shared between the
host and hyp so don't need to be switched yet.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-13-ascull@google.com
2020-09-15 18:39:03 +01:00
Andrew Scull 603d2bdaa5 KVM: arm64: Share context save and restore macros
To avoid duplicating the context save and restore macros, move them into
a shareable header.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-12-ascull@google.com
2020-09-15 18:39:03 +01:00
Andrew Scull 7db2153047 KVM: arm64: Restore hyp when panicking in guest context
If the guest context is loaded when a panic is triggered, restore the
hyp context so e.g. the shadow call stack works when hyp_panic() is
called and SP_EL0 is valid when the host's panic() is called.

Use the hyp context's __hyp_running_vcpu field to track when hyp
transitions to and from the guest vcpu so the exception handlers know
whether the context needs to be restored.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-11-ascull@google.com
2020-09-15 18:39:02 +01:00
Andrew Scull 7c2e76d87f KVM: arm64: Update context references from host to hyp
Hyp now has its own nominal context for saving and restoring its state
when switching to and from a guest. Update the related comments and
utilities to match the new name.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-10-ascull@google.com
2020-09-15 18:39:02 +01:00
Andrew Scull b619d9aa8b KVM: arm64: Introduce hyp context
During __guest_enter, save and restore from a new hyp context rather
than the host context. This is preparation for separation of the hyp and
host context in nVHE.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-9-ascull@google.com
2020-09-15 18:39:02 +01:00
Andrew Scull 472fc011cc KVM: arm64: nVHE: Don't consume host SErrors with ESB
The ESB at the start of the host vector may cause SErrors to be consumed
to DISR_EL1. However, this is not checked for the host so the SError
could go unhandled.

Remove the ESB so that SErrors are not consumed but are instead left
pending for the host to consume. __guest_enter already defers entry into
a guest if there are any SErrors pending.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200915104643.2543892-8-ascull@google.com
2020-09-15 18:39:02 +01:00
Andrew Scull 6e3bfbb22c KVM: arm64: nVHE: Use separate vector for the host
The host is treated differently from the guests when an exception is
taken so introduce a separate vector that is specialized for the host.
This also allows the nVHE specific code to move out of hyp-entry.S and
into nvhe/host.S.

The host is only expected to make HVC calls and anything else is
considered invalid and results in a panic.

Hyp initialization is now passed the vector that is used for the host
and it is swapped for the guest vector during the context switch.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-7-ascull@google.com
2020-09-15 18:39:02 +01:00
Andrew Scull a0e479523e KVM: arm64: Save chosen hyp vector to a percpu variable
Introduce a percpu variable to hold the address of the selected hyp
vector that will be used with guests. This avoids the selection process
each time a guest is being entered and can be used by nVHE when a
separate vector is introduced for the host.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-6-ascull@google.com
2020-09-15 18:39:02 +01:00
Andrew Scull d7ca1079d8 KVM: arm64: Remove kvm_host_data_t typedef
The kvm_host_data_t typedef is used inconsistently and goes against the
kernel's coding style. Remove it in favour of the full struct specifier.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-4-ascull@google.com
2020-09-15 18:39:01 +01:00
Andrew Scull 6a0259ed29 KVM: arm64: Remove hyp_panic arguments
hyp_panic is able to find all the context it needs from within itself so
remove the argument. The __hyp_panic wrapper becomes redundant so is
also removed.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-3-ascull@google.com
2020-09-15 18:39:01 +01:00
Andrew Scull 501a67a25d KVM: arm64: Remove __activate_vm wrapper
The __activate_vm wrapper serves no useful function and has a misleading
name as it simply calls __load_guest_stage2 and does not touch
HCR_EL2.VM so remove it.

Also rename __deactivate_vm to __load_host_stage2 to match naming
pattern.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200915104643.2543892-2-ascull@google.com
2020-09-15 18:39:01 +01:00
Linus Torvalds 84b1349972 ARM:
- Multiple stolen time fixes, with a new capability to match x86
 - Fix for hugetlbfs mappings when PUD and PMD are the same level
 - Fix for hugetlbfs mappings when PTE mappings are enforced
   (dirty logging, for example)
 - Fix tracing output of 64bit values
 
 x86:
 - nSVM state restore fixes
 - Async page fault fixes
 - Lots of small fixes everywhere
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl9dM5kUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM+Iwf+LbISO7ccpPMK1kKtOeug/jZv+xQA
 sVaBGRzYo+k2e0XtV8E8IV4N30FBtYSwXsbBKkMAoy2FpmMebgDWDQ7xspb6RJMS
 /y8t1iqPwdOaLIkUkgc7UihSTlZm05Es3f3q6uZ9+oaM4Fe+V7xWzTUX4Oy89JO7
 KcQsTD7pMqS4bfZGADK781ITR/WPgCi0aYx5s6dcwcZAQXhb1K1UKEjB8OGKnjUh
 jliReJtxRA16rjF+S5aJ7L07Ce/ksrfwkI4NXJ4GxW+lyOfVNdSBJUBaZt1m7G2M
 1We5+i5EjKCjuxmgtUUUfVdazpj1yl+gBGT7KKkLte9T9WZdXyDnixAbvg==
 =OFb3
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A bit on the bigger side, mostly due to me being on vacation, then
  busy, then on parental leave, but there's nothing worrisome.

  ARM:
   - Multiple stolen time fixes, with a new capability to match x86
   - Fix for hugetlbfs mappings when PUD and PMD are the same level
   - Fix for hugetlbfs mappings when PTE mappings are enforced (dirty
     logging, for example)
   - Fix tracing output of 64bit values

  x86:
   - nSVM state restore fixes
   - Async page fault fixes
   - Lots of small fixes everywhere"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: emulator: more strict rsm checks.
  KVM: nSVM: more strict SMM checks when returning to nested guest
  SVM: nSVM: setup nested msr permission bitmap on nested state load
  SVM: nSVM: correctly restore GIF on vmexit from nesting after migration
  x86/kvm: don't forget to ACK async PF IRQ
  x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro
  KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit
  KVM: SVM: avoid emulation with stale next_rip
  KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN
  KVM: SVM: Periodically schedule when unregistering regions on destroy
  KVM: MIPS: Change the definition of kvm type
  kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed
  KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control
  KVM: fix memory leak in kvm_io_bus_unregister_dev()
  KVM: Check the allocation of pv cpu mask
  KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected
  KVM: arm64: Update page shift if stage 2 block mapping not supported
  KVM: arm64: Fix address truncation in traces
  KVM: arm64: Do not try to map PUDs when they are folded into PMD
  arm64/x86: KVM: Introduce steal-time cap
  ...
2020-09-13 08:34:47 -07:00
Paolo Bonzini 1b67fd086d KVM/arm64 fixes for Linux 5.9, take #1
- Multiple stolen time fixes, with a new capability to match x86
 - Fix for hugetlbfs mappings when PUD and PMD are the same level
 - Fix for hugetlbfs mappings when PTE mappings are enforced
   (dirty logging, for example)
 - Fix tracing output of 64bit values
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl9SGP4PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDWSoP/iZdsgkcMXM1qGRpMtn0FDVQiL2zd8QOEki+
 /ldkeLvcUC9aqVOWdCM1fTweIvyPM0KSRxfbQRVGRGXym9bUMzx1laSl0CLXgi9q
 fa/lbmZOLG5PovQLe8eNon6aXybWWwuxfh/dhpBWLg5VGb0fXFutH3Cs2MGbX/Ec
 6qMvbwO09SGSTrXSQepbhFkAmViBAgUH2kiRhKReDfvRc4OwsbxdlmA5r9Ek6R5L
 x7tH8mYwOJVP1OEZWYRX+9GG1n8hCIKKSuRrhMbZtErTSNdf+YhldC6DuJF0zWVE
 nsoKzIEl15kIl0akkC6oA3MLNX1sfRh9C2J85Rt+odN4fY4MidrfcqRgWE2X3cuu
 CKDsr0Lb6aTtfZASHm7QQRsM4hujWoArBq6ZvUNjpfNXOPe4ovxX9TkPHp6OezuK
 v3PRzXQxUtmreK+02ZzalL6IBwAQrmLKxXM2P2Nuh4gDMgFC/BrHMxc1QVSnmb/m
 flMuKtvm+fkwKySQvX22FZrzhPfCMAuxCh28WdDSW2pnmZ8H0M3922y45xw3QTpg
 SltMtIcpO6ipzMsrVvO/hI/GvByFNN6jcLVGUV1Wx8mNdcf2kPeebA4hINKt5UDh
 gpwkz4zb2Bgp/YdgiG8NzBjpk2FMO0IPiAnouPrXenizbesAlKR2V3uFqa70PW/A
 BBkHnakS
 =VBrL
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for Linux 5.9, take #1

- Multiple stolen time fixes, with a new capability to match x86
- Fix for hugetlbfs mappings when PUD and PMD are the same level
- Fix for hugetlbfs mappings when PTE mappings are enforced
  (dirty logging, for example)
- Fix tracing output of 64bit values
2020-09-11 13:12:11 -04:00
Marc Zyngier ae8bd85ca8 Merge branch 'kvm-arm64/pt-new' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-11 15:54:30 +01:00
Will Deacon c9b69a0cf0 KVM: arm64: Don't constrain maximum IPA size based on host configuration
Now that the guest stage-2 page-tables are managed independently from
the host stage-1 page-tables, we can avoid constraining the IPA size
based on the host and instead limit it only based on the PARange field
of the ID_AA64MMFR0 register.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-22-will@kernel.org
2020-09-11 15:51:15 +01:00
Will Deacon 74cfa7ea66 KVM: arm64: Remove unused 'pgd' field from 'struct kvm_s2_mmu'
The stage-2 page-tables are entirely encapsulated by the 'pgt' field of
'struct kvm_s2_mmu', so remove the unused 'pgd' field.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-21-will@kernel.org
2020-09-11 15:51:15 +01:00
Will Deacon 3f26ab58e3 KVM: arm64: Remove unused page-table code
Now that KVM is using the generic page-table code to manage the guest
stage-2 page-tables, we can remove a bunch of unused macros, #defines
and static inline functions from the old implementation.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-20-will@kernel.org
2020-09-11 15:51:15 +01:00
Will Deacon 063deeb1f2 KVM: arm64: Check the pgt instead of the pgd when modifying page-table
In preparation for removing the 'pgd' field of 'struct kvm_s2_mmu',
update the few remaining users to check the 'pgt' field instead.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-19-will@kernel.org
2020-09-11 15:51:15 +01:00
Will Deacon 6f745f1bb5 KVM: arm64: Convert user_mem_abort() to generic page-table API
Convert user_mem_abort() to call kvm_pgtable_stage2_relax_perms() when
handling a stage-2 permission fault and kvm_pgtable_stage2_map() when
handling a stage-2 translation fault, rather than walking the page-table
manually.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-18-will@kernel.org
2020-09-11 15:51:15 +01:00
Will Deacon adcd4e2329 KVM: arm64: Add support for relaxing stage-2 perms in generic page-table code
Add support for relaxing the permissions of a stage-2 mapping (i.e.
adding additional permissions) to the generic page-table code.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-17-will@kernel.org
2020-09-11 15:51:15 +01:00
Quentin Perret 8d5207bef6 KVM: arm64: Convert memslot cache-flushing code to generic page-table API
Convert stage2_flush_memslot() to call the kvm_pgtable_stage2_flush()
function of the generic page-table code instead of walking the page-table
directly.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200911132529.19844-16-will@kernel.org
2020-09-11 15:51:14 +01:00
Quentin Perret 93c66b40d7 KVM: arm64: Add support for stage-2 cache flushing in generic page-table
Add support for cache flushing a range of the stage-2 address space to
the generic page-table code.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200911132529.19844-15-will@kernel.org
2020-09-11 15:51:14 +01:00
Quentin Perret cc38d61cac KVM: arm64: Convert write-protect operation to generic page-table API
Convert stage2_wp_range() to call the kvm_pgtable_stage2_wrprotect()
function of the generic page-table code instead of walking the page-table
directly.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200911132529.19844-14-will@kernel.org
2020-09-11 15:51:14 +01:00
Quentin Perret 73d49df2c3 KVM: arm64: Add support for stage-2 write-protect in generic page-table
Add a stage-2 wrprotect() operation to the generic page-table code.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200911132529.19844-13-will@kernel.org
2020-09-11 15:51:14 +01:00
Will Deacon ee8efad799 KVM: arm64: Convert page-aging and access faults to generic page-table API
Convert the page-aging functions and access fault handler to use the
generic page-table code instead of walking the page-table directly.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-12-will@kernel.org
2020-09-11 15:51:14 +01:00
Will Deacon e0e5a07f3f KVM: arm64: Add support for stage-2 page-aging in generic page-table
Add stage-2 mkyoung(), mkold() and is_young() operations to the generic
page-table code.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-11-will@kernel.org
2020-09-11 15:51:14 +01:00
Will Deacon 52bae936f0 KVM: arm64: Convert unmap_stage2_range() to generic page-table API
Convert unmap_stage2_range() to use kvm_pgtable_stage2_unmap() instead
of walking the page-table directly.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-10-will@kernel.org
2020-09-11 15:51:13 +01:00
Will Deacon e9edb17ae0 KVM: arm64: Convert kvm_set_spte_hva() to generic page-table API
Convert kvm_set_spte_hva() to use kvm_pgtable_stage2_map() instead
of stage2_set_pte().

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-9-will@kernel.org
2020-09-11 15:51:13 +01:00
Will Deacon 02bbd374ce KVM: arm64: Convert kvm_phys_addr_ioremap() to generic page-table API
Convert kvm_phys_addr_ioremap() to use kvm_pgtable_stage2_map() instead
of stage2_set_pte().

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-8-will@kernel.org
2020-09-11 15:51:13 +01:00
Will Deacon 6d9d2115c4 KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table
Add stage-2 map() and unmap() operations to the generic page-table code.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-7-will@kernel.org
2020-09-11 15:51:13 +01:00
Will Deacon 71233d05f4 KVM: arm64: Add support for creating kernel-agnostic stage-2 page tables
Introduce alloc() and free() functions to the generic page-table code
for guest stage-2 page-tables and plumb these into the existing KVM
page-table allocator. Subsequent patches will convert other operations
within the KVM allocator over to the generic code.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-6-will@kernel.org
2020-09-11 15:51:13 +01:00
Will Deacon 0f9d09b8e2 KVM: arm64: Use generic allocator for hyp stage-1 page-tables
Now that we have a shiny new page-table allocator, replace the hyp
page-table code with calls into the new API. This also allows us to
remove the extended idmap code, as we can now simply ensure that the
VA size is large enough to map everything we need.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-5-will@kernel.org
2020-09-11 15:51:13 +01:00
Will Deacon bb0e92cbbc KVM: arm64: Add support for creating kernel-agnostic stage-1 page tables
The generic page-table walker is pretty useless as it stands, because it
doesn't understand enough to allocate anything. Teach it about stage-1
page-tables, and hook up an API for allocating these for the hypervisor
at EL2.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-4-will@kernel.org
2020-09-11 15:51:12 +01:00
Will Deacon b1e57de62c KVM: arm64: Add stand-alone page-table walker infrastructure
The KVM page-table code is intricately tied into the kernel page-table
code and re-uses the pte/pmd/pud/p4d/pgd macros directly in an attempt
to reduce code duplication. Unfortunately, the reality is that there is
an awful lot of code required to make this work, and at the end of the
day you're limited to creating page-tables with the same configuration
as the host kernel. Furthermore, lifting the page-table code to run
directly at EL2 on a non-VHE system (as we plan to to do in future
patches) is practically impossible due to the number of dependencies it
has on the core kernel.

Introduce a framework for walking Armv8 page-tables configured
independently from the host kernel.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-3-will@kernel.org
2020-09-11 15:51:12 +01:00
Will Deacon 9af3e08baa KVM: arm64: Remove kvm_mmu_free_memory_caches()
kvm_mmu_free_memory_caches() is only called by kvm_arch_vcpu_destroy(),
so inline the implementation and get rid of the extra function.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200911132529.19844-2-will@kernel.org
2020-09-11 15:51:12 +01:00
Xiaoming Ni ad14c19242 arm64: fix some spelling mistakes in the comments by codespell
arch/arm64/include/asm/cpu_ops.h:24: necesary ==> necessary
arch/arm64/include/asm/kvm_arm.h:69: maintainance ==> maintenance
arch/arm64/include/asm/cpufeature.h:361: capabilties ==> capabilities
arch/arm64/kernel/perf_regs.c:19: compatability ==> compatibility
arch/arm64/kernel/smp_spin_table.c:86: endianess ==> endianness
arch/arm64/kernel/smp_spin_table.c:88: endianess ==> endianness
arch/arm64/kvm/vgic/vgic-mmio-v3.c:1004: targetting ==> targeting
arch/arm64/kvm/vgic/vgic-mmio-v3.c:1005: targetting ==> targeting

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Link: https://lore.kernel.org/r/20200828031822.35928-1-nixiaoming@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07 14:18:50 +01:00
Catalin Marinas 2ac638fc57 arm64: kvm: mte: Hide the MTE CPUID information from the guests
KVM does not support MTE in guests yet, so clear the corresponding field
in the ID_AA64PFR1_EL1 register. In addition, inject an undefined
exception in the guest if it accesses one of the GCR_EL1, RGSR_EL1,
TFSR_EL1 or TFSRE0_EL1 registers. While the emulate_sys_reg() function
already injects an undefined exception, this patch prevents the
unnecessary printk.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steven Price <steven.price@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
2020-09-04 12:45:44 +01:00
Alexandru Elisei 7b75cd5128 KVM: arm64: Update page shift if stage 2 block mapping not supported
Commit 196f878a7a (" KVM: arm/arm64: Signal SIGBUS when stage2 discovers
hwpoison memory") modifies user_mem_abort() to send a SIGBUS signal when
the fault IPA maps to a hwpoisoned page. Commit 1559b7583f ("KVM:
arm/arm64: Re-check VMA on detecting a poisoned page") changed
kvm_send_hwpoison_signal() to use the page shift instead of the VMA because
at that point the code had already released the mmap lock, which means
userspace could have modified the VMA.

If userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to
map the guest fault IPA using block mappings in stage 2. That is not always
possible, if, for example, userspace uses dirty page logging for the VM.
Update the page shift appropriately in those cases when we downgrade the
stage 2 entry from a block mapping to a page.

Fixes: 1559b7583f ("KVM: arm/arm64: Re-check VMA on detecting a poisoned page")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20200901133357.52640-2-alexandru.elisei@arm.com
2020-09-04 10:53:48 +01:00
Marc Zyngier 376426b1a9 KVM: arm64: Fix address truncation in traces
Owing to their ARMv7 origins, the trace events are truncating most
address values to 32bits. That's not really helpful.

Expand the printing of such values to their full glory.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-04 10:53:48 +01:00
Marc Zyngier 3fb884ffe9 KVM: arm64: Do not try to map PUDs when they are folded into PMD
For the obscure cases where PMD and PUD are the same size
(64kB pages with 42bit VA, for example, which results in only
two levels of page tables), we can't map anything as a PUD,
because there is... erm... no PUD to speak of. Everything is
either a PMD or a PTE.

So let's only try and map a PUD when its size is different from
that of a PMD.

Cc: stable@vger.kernel.org
Fixes: b8e0ba7c8b ("KVM: arm64: Add support for creating PUD hugepages at stage 2")
Reported-by: Gavin Shan <gshan@redhat.com>
Reported-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-04 10:52:49 +01:00
Linus Torvalds 96d454cd2c - Fix kernel build with the integrated LLVM assembler which doesn't
see the -Wa,-march option.
 
 - Fix "make vdso_install" when COMPAT_VDSO is disabled.
 
 - Make KVM more robust if the AT S1E1R instruction triggers an exception
   (architecture corner cases).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl9JPWEACgkQa9axLQDI
 XvFLuBAAjTz6SgaLVk5vtoNlNXR+zx/AcwG1hFthWaPqwRLjEVogwZ76Hx7qOStb
 M2+rEsr7q8BKAsI7nU1OJGfDMbyozqDxCIq89NmOgCm3TTze/BiZx7KIL+l5aQea
 5qiPIt3pwhPaFGAnQLDbdBJ7Iz34VbB8bqxLi9tz5RkbfFFEIkNgobrljVj71ZLp
 7xDn8+w54iVqnMrhSTQtPtbdIpgpBO0HL6PuX6jBY+sGfkwpaZCKMdgU4HVkhW8t
 kgmj3S/orMtPvZvQXDZflFdDn+dS0c0dyJlzTu7qyJjL/zgma5RYwLSaWSH2kcib
 lsna1Xoge1Iqzj7QKT8uzsfCHkZ+ANr17oB8YakQtu1HmVDgvOiDX5v44+aLKdJd
 mRwa+UtJT7cVl7I/3r7rOZyb+ApcmjD5Wft7Hi6lOQSfNp+kBRcBCaOcKdh0Gk4A
 KFlZYBnXywo1Xy06HkUSIL3k+qvrHMHC5g6S2XmIL6BYvj08poq6BUTlqSAIuzp4
 GzIzEusqPX80V8MQeRvJ8XmIPtzqgiA4AVCshAwrSiUcEgYWpsWb+yPTpKpnygpd
 UyuuUmfxR7I9ctNw25C4jebdi+gLQoCwCQqRqHR/0Fj4KvkQnAKIWDa52Dcl85Qp
 nedLKsEDc+Tb7ePfp9VzgJ3OmVQUL4hiYDIR3YSVeQrG3R9O72A=
 =wzsB
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix kernel build with the integrated LLVM assembler which doesn't see
   the -Wa,-march option.

 - Fix "make vdso_install" when COMPAT_VDSO is disabled.

 - Make KVM more robust if the AT S1E1R instruction triggers an
   exception (architecture corner cases).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception
  KVM: arm64: Survive synchronous exceptions caused by AT instructions
  KVM: arm64: Add kvm_extable for vaxorcism code
  arm64: vdso32: make vdso32 install conditional
  arm64: use a common .arch preamble for inline assembly
2020-08-28 11:37:33 -07:00
James Morse 88a84ccccb KVM: arm64: Survive synchronous exceptions caused by AT instructions
KVM doesn't expect any synchronous exceptions when executing, any such
exception leads to a panic(). AT instructions access the guest page
tables, and can cause a synchronous external abort to be taken.

The arm-arm is unclear on what should happen if the guest has configured
the hardware update of the access-flag, and a memory type in TCR_EL1 that
does not support atomic operations. B2.2.6 "Possible implementation
restrictions on using atomic instructions" from DDI0487F.a lists
synchronous external abort as a possible behaviour of atomic instructions
that target memory that isn't writeback cacheable, but the page table
walker may behave differently.

Make KVM robust to synchronous exceptions caused by AT instructions.
Add a get_user() style helper for AT instructions that returns -EFAULT
if an exception was generated.

While KVM's version of the exception table mixes synchronous and
asynchronous exceptions, only one of these can occur at each location.

Re-enter the guest when the AT instructions take an exception on the
assumption the guest will take the same exception. This isn't guaranteed
to make forward progress, as the AT instructions may always walk the page
tables, but guest execution may use the translation cached in the TLB.

This isn't a problem, as since commit 5dcd0fdbb4 ("KVM: arm64: Defer guest
entry when an asynchronous exception is pending"), KVM will return to the
host to process IRQs allowing the rest of the system to keep running.

Cc: stable@vger.kernel.org # <v5.3: 5dcd0fdbb4 ("KVM: arm64: Defer guest entry when an asynchronous exception is pending")
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-08-28 15:23:46 +01:00
James Morse e9ee186bb7 KVM: arm64: Add kvm_extable for vaxorcism code
KVM has a one instruction window where it will allow an SError exception
to be consumed by the hypervisor without treating it as a hypervisor bug.
This is used to consume asynchronous external abort that were caused by
the guest.

As we are about to add another location that survives unexpected exceptions,
generalise this code to make it behave like the host's extable.

KVM's version has to be mapped to EL2 to be accessible on nVHE systems.

The SError vaxorcism code is a one instruction window, so has two entries
in the extable. Because the KVM code is copied for VHE and nVHE, we end up
with four entries, half of which correspond with code that isn't mapped.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-08-28 15:23:42 +01:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds dd105d64a0 - Allow booting of late secondary CPUs affected by erratum 1418040
(currently they are parked if none of the early CPUs are affected by
   this erratum).
 
 - Add the 32-bit vdso Makefile to the vdso_install rule so that 'make
   vdso_install' installs the 32-bit compat vdso when it is compiled.
 
 - Print a warning that untrusted guests without a CPU erratum workaround
   (Cortex-A57 832075) may deadlock the affected system.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl9BOb8ACgkQa9axLQDI
 XvFnixAAladNjLswpC8gm0PSv2fXD+OJU7XiJKl1McG0UcnkRp1brgruOi7WMyQg
 I0cbY6Kjfb6mFJaHfEA9uuQ0P44FGmI+gz+3Injl+a0qJgdu0QLU1uJQG/Rae+zG
 kdoimf+/CnLnJTiIA5YXsdrFhSQsn2lVConJx26QrSJO0SB0TROy86aRSrRSMyYy
 mXrK4xCm/cx4LqJQJrFmShpgs/IjuK8T/LZBInjgB43e3y++SHwGTSA2kf3NOldz
 Rx4HqQVJ37IeROZ8B2v8MggZsTk/5C+DaxJu6QBk7REDKPnI9+RAr3BhPUAAVWDn
 BONYtsjBLgf/Q9bmXNsHlGAhjIIeOAgaIIr10oVhSFScCHsjEU15hqyNP2jfgLSC
 Q4cgU8bA0sm36CHNI0vd5phAnMBN6HJZtmSzu2xb/GJKYW4yiuXupYiOWbBAhhiu
 uAFjMRW+dwfOXY/59kwpmedBZ0WvHod2mPhp3n4FRzo3X0NfRgHzgg1kDRwqVAzv
 eh8ynUYMjuy5H3siNR0279M6epIK/hTtqSvnzIqkCTBkftsYX+32v4UzEardRGiv
 +/gKJ1XQGmWD3OCISuMZwaRK+tH5V68eT+3P0gGmssJLYMtRJFzzVtuv3mz1yXRD
 USsww/SY2etlGU+N8r+66NSqU0KPRdQ0tkK6eCX8DTyl3MPLQfQ=
 =VkR3
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Allow booting of late secondary CPUs affected by erratum 1418040
   (currently they are parked if none of the early CPUs are affected by
   this erratum).

 - Add the 32-bit vdso Makefile to the vdso_install rule so that 'make
   vdso_install' installs the 32-bit compat vdso when it is compiled.

 - Print a warning that untrusted guests without a CPU erratum
   workaround (Cortex-A57 832075) may deadlock the affected system.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  ARM64: vdso32: Install vdso32 from vdso_install
  KVM: arm64: Print warning when cpu erratum can cause guests to deadlock
  arm64: Allow booting of late CPUs affected by erratum 1418040
  arm64: Move handling of erratum 1418040 into C code
2020-08-22 10:17:36 -07:00
Will Deacon b5331379bc KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set
When an MMU notifier call results in unmapping a range that spans multiple
PGDs, we end up calling into cond_resched_lock() when crossing a PGD boundary,
since this avoids running into RCU stalls during VM teardown. Unfortunately,
if the VM is destroyed as a result of OOM, then blocking is not permitted
and the call to the scheduler triggers the following BUG():

 | BUG: sleeping function called from invalid context at arch/arm64/kvm/mmu.c:394
 | in_atomic(): 1, irqs_disabled(): 0, non_block: 1, pid: 36, name: oom_reaper
 | INFO: lockdep is turned off.
 | CPU: 3 PID: 36 Comm: oom_reaper Not tainted 5.8.0 #1
 | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
 | Call trace:
 |  dump_backtrace+0x0/0x284
 |  show_stack+0x1c/0x28
 |  dump_stack+0xf0/0x1a4
 |  ___might_sleep+0x2bc/0x2cc
 |  unmap_stage2_range+0x160/0x1ac
 |  kvm_unmap_hva_range+0x1a0/0x1c8
 |  kvm_mmu_notifier_invalidate_range_start+0x8c/0xf8
 |  __mmu_notifier_invalidate_range_start+0x218/0x31c
 |  mmu_notifier_invalidate_range_start_nonblock+0x78/0xb0
 |  __oom_reap_task_mm+0x128/0x268
 |  oom_reap_task+0xac/0x298
 |  oom_reaper+0x178/0x17c
 |  kthread+0x1e4/0x1fc
 |  ret_from_fork+0x10/0x30

Use the new 'flags' argument to kvm_unmap_hva_range() to ensure that we
only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is set in the notifier
flags.

Cc: <stable@vger.kernel.org>
Fixes: 8b3405e345 ("kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd")
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Message-Id: <20200811102725.7121-3-will@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 18:06:43 -04:00
Will Deacon fdfe7cbd58 KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()
The 'flags' field of 'struct mmu_notifier_range' is used to indicate
whether invalidate_range_{start,end}() are permitted to block. In the
case of kvm_mmu_notifier_invalidate_range_start(), this field is not
forwarded on to the architecture-specific implementation of
kvm_unmap_hva_range() and therefore the backend cannot sensibly decide
whether or not to block.

Add an extra 'flags' parameter to kvm_unmap_hva_range() so that
architectures are aware as to whether or not they are permitted to block.

Cc: <stable@vger.kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Message-Id: <20200811102725.7121-2-will@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 18:03:47 -04:00
Andrew Jones 004a01241c arm64/x86: KVM: Introduce steal-time cap
arm64 requires a vcpu fd (KVM_HAS_DEVICE_ATTR vcpu ioctl) to probe
support for steal-time. However this is unnecessary, as only a KVM
fd is required, and it complicates userspace (userspace may prefer
delaying vcpu creation until after feature probing). Introduce a cap
that can be checked instead. While x86 can already probe steal-time
support with a kvm fd (KVM_GET_SUPPORTED_CPUID), we add the cap there
too for consistency.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20200804170604.42662-7-drjones@redhat.com
2020-08-21 14:05:19 +01:00
Andrew Jones 53f985584e KVM: arm64: pvtime: Fix stolen time accounting across migration
When updating the stolen time we should always read the current
stolen time from the user provided memory, not from a kernel
cache. If we use a cache then we'll end up resetting stolen time
to zero on the first update after migration.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200804170604.42662-5-drjones@redhat.com
2020-08-21 14:04:14 +01:00
Andrew Jones 4d2d4ce001 KVM: arm64: Drop type input from kvm_put_guest
We can use typeof() to avoid the need for the type input.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200804170604.42662-4-drjones@redhat.com
2020-08-21 14:04:14 +01:00
Andrew Jones 2dbd780e34 KVM: arm64: pvtime: Fix potential loss of stolen time
We should only check current->sched_info.run_delay once when
updating stolen time. Otherwise there's a chance there could
be a change between checks that we miss (preemption disabling
comes after vcpu request checks).

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200804170604.42662-3-drjones@redhat.com
2020-08-21 14:04:14 +01:00
Andrew Jones 38480df564 KVM: arm64: pvtime: steal-time is only supported when configured
Don't confuse the guest by saying steal-time is supported when
it hasn't been configured by userspace and won't work.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200804170604.42662-2-drjones@redhat.com
2020-08-21 14:04:14 +01:00
Rob Herring abf532ccea KVM: arm64: Print warning when cpu erratum can cause guests to deadlock
If guests don't have certain CPU erratum workarounds implemented, then
there is a possibility a guest can deadlock the system. IOW, only trusted
guests should be used on systems with the erratum.

This is the case for Cortex-A57 erratum 832075.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20200803193127.3012242-2-robh@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-08-21 12:23:09 +01:00
Paolo Bonzini 0378daef0c KVM/arm64 updates for Linux 5.9:
- Split the VHE and nVHE hypervisor code bases, build the EL2 code
   separately, allowing for the VHE code to now be built with instrumentation
 
 - Level-based TLB invalidation support
 
 - Restructure of the vcpu register storage to accomodate the NV code
 
 - Pointer Authentication available for guests on nVHE hosts
 
 - Simplification of the system register table parsing
 
 - MMU cleanups and fixes
 
 - A number of post-32bit cleanups and other fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl8q5DEPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDQFAP/jtscnC5OxEOoGNW1gvg/1QI/BuU4zLvqQL1
 OEW72fUQlil7tmF/CbLLKnsBpxKmzO02C3wDdg3oaRi884bRtTXdok0nsFuCvrZD
 u/wrlMnP0zTjjk1uwIFfZJTx+nnUiT0jC6ffvGxB/jnTJk/8atvOUFL7ODFEfixz
 mS5g1jwwJkRmWKESFg7KGSghKuwXTvo4HVWCfME+t1rQwAa03stXFV8H5tkU6+cG
 BRIssxo7BkAV2AozwL7hgl/M6wd6QvbOrYJqgb67+sQ8qts0YNne96NN3InMedb1
 RENyDssXlA+VI0HoYyEbYnPtFy1Hoj1lOGDZLEZAEH1qcmWrV+hApnoSXSmuofvn
 QlfOWCyd92CZySu21MALRUVXbrKkA3zT2b9R93A5z7iEBPY+Wk0ryJCO6IxdZzF8
 48LNjtzb/Kd0SMU/issJlw+u6fJvLbpnSzXNsYYhiiTMUE9cbu2SEkq0SkonH0a4
 d3V8UifZyeffXsOfOAG0DJZOu/fWZp1/I3tfzujtG9rCb+jTQueJ4E1cFYrwSO6b
 sFNyiI1AzlwcCippG08zSUX61nGfKXBuMXuhIlMRk7GeiF95DmSXuxEgYndZX9I+
 E6zJr1iQk/1lrip41svDIIOBHuMbIeD/w1bsOKi7Zoa270MxB4r2Z3IqRMgosoE5
 l4YO9pl1
 =Ukr4
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next-5.6

KVM/arm64 updates for Linux 5.9:

- Split the VHE and nVHE hypervisor code bases, build the EL2 code
  separately, allowing for the VHE code to now be built with instrumentation

- Level-based TLB invalidation support

- Restructure of the vcpu register storage to accomodate the NV code

- Pointer Authentication available for guests on nVHE hosts

- Simplification of the system register table parsing

- MMU cleanups and fixes

- A number of post-32bit cleanups and other fixes
2020-08-09 12:58:23 -04:00
Linus Torvalds 921d2597ab s390: implement diag318
x86:
 * Report last CPU for debugging
 * Emulate smaller MAXPHYADDR in the guest than in the host
 * .noinstr and tracing fixes from Thomas
 * nested SVM page table switching optimization and fixes
 
 Generic:
 * Unify shadow MMU cache data structures across architectures
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8pC+oUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcOwgAjomqtEqQNlp7DdZT7VyyklzbxX1/
 ud7v+oOJ8K4sFlf64lSthjPo3N9rzZCcw+yOXmuyuITngXOGc3tzIwXpCzpLtuQ1
 WO1Ql3B/2dCi3lP5OMmsO1UAZqy9pKLg1dfeYUPk48P5+p7d/NPmk+Em5kIYzKm5
 JsaHfCp2EEXomwmljNJ8PQ1vTjIQSSzlgYUBZxmCkaaX7zbEUMtxAQCStHmt8B84
 33LczwXBm3viSWrzsoBV37I70+tseugiSGsCfUyupXOvq55d6D9FCqtCb45Hn4Vh
 Ik8ggKdalsk/reiGEwNw1/3nr6mRMkHSbl+Mhc4waOIFf9dn0urgQgOaDg==
 =YVx0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - implement diag318

  x86:
   - Report last CPU for debugging
   - Emulate smaller MAXPHYADDR in the guest than in the host
   - .noinstr and tracing fixes from Thomas
   - nested SVM page table switching optimization and fixes

  Generic:
   - Unify shadow MMU cache data structures across architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  KVM: SVM: Fix sev_pin_memory() error handling
  KVM: LAPIC: Set the TDCR settable bits
  KVM: x86: Specify max TDP level via kvm_configure_mmu()
  KVM: x86/mmu: Rename max_page_level to max_huge_page_level
  KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
  KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
  KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
  KVM: VMX: Make vmx_load_mmu_pgd() static
  KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
  KVM: VMX: Drop a duplicate declaration of construct_eptp()
  KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
  KVM: Using macros instead of magic values
  MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
  KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
  KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
  KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
  KVM: VMX: Add guest physical address check in EPT violation and misconfig
  KVM: VMX: introduce vmx_need_pf_intercept
  KVM: x86: update exception bitmap on CPUID changes
  KVM: x86: rename update_bp_intercept to update_exception_bitmap
  ...
2020-08-06 12:59:31 -07:00
Linus Torvalds 145ff1ec09 arm64 and cross-arch updates for 5.9:
- Removal of the tremendously unpopular read_barrier_depends() barrier,
   which is a NOP on all architectures apart from Alpha, in favour of
   allowing architectures to override READ_ONCE() and do whatever dance
   they need to do to ensure address dependencies provide LOAD ->
   LOAD/STORE ordering. This work also offers a potential solution if
   compilers are shown to convert LOAD -> LOAD address dependencies into
   control dependencies (e.g. under LTO), as weakly ordered architectures
   will effectively be able to upgrade READ_ONCE() to smp_load_acquire().
   The latter case is not used yet, but will be discussed further at LPC.
 
 - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment
   the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
   bus-specific parameter and apply the resulting changes to the device
   ID space provided by the Freescale FSL bus.
 
 - arm64 support for TLBI range operations and translation table level
   hints (part of the ARMv8.4 architecture version).
 
 - Time namespace support for arm64.
 
 - Export the virtual and physical address sizes in vmcoreinfo for
   makedumpfile and crash utilities.
 
 - CPU feature handling cleanups and checks for programmer errors
   (overlapping bit-fields).
 
 - ACPI updates for arm64: disallow AML accesses to EFI code regions and
   kernel memory.
 
 - perf updates for arm64.
 
 - Miscellaneous fixes and cleanups, most notably PLT counting
   optimisation for module loading, recordmcount fix to ignore
   relocations other than R_AARCH64_CALL26, CMA areas reserved for
   gigantic pages on 16K and 64K configurations.
 
 - Trivial typos, duplicate words.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl8oTcsACgkQa9axLQDI
 XvEj6hAAkn39mO5xrR/Vhpg3DyFPk63ZlMSX9SsOeVyaLbovT6stTs1XAZXPpnkt
 rV3gwACyGSrqH6+uey9pHgHJuPF2TdrGEVK08yVKo9KGW/6yXSIncdKFE4jUJ/WJ
 wF5j7eMET2aGzcpm5AlzMmq6HOrKB8nZac9H8/x6H+Ox2WdgJkEjOkDvyqACUyum
 N3FsTZkWj2pIkTXHNgDZ8KjxVLO8HlFaB2hkxFDl9NPlX2UTCQJ8Tg1KiPLafKaK
 gUvH4usQDFdb5RU/UWogre37J4emO0ZTApZOyju+U+PMMWlWVHjZ4isUIS9zz/AE
 JNZ23dnKZX2HrYa5p8HZx175zwj/vXUqUHCZPLvQXaAudCEhF8BVljPiG0e80FV5
 GHFUgUbylKspp01I/9L+2JvsG96Mr0e+P3Sx7L2HTI42cmtoSa14+MpoSRj7zlft
 Qcl8hfrVOjCjUnFRHa/1y1cGvnD9GbgnKJR7zgVxl9bD/Jd48r1HUtwRORZCzWFr
 mRPVbPS72fWxMzMV9DZYJm02jJY9kLX2BMl49njbB8MhAhzOvrMVzoVVtMMeRFLR
 XHeJpmg36W09FiRGe7LRXlkXIhCQzQG2bJfiphuupCfhjRAitPoq8I925G6Pig60
 c8RWaXGU7PrEsdMNrL83vekvGKgqrkoFkRVtsCoQ2X6Hvu/XdYI=
 =mh79
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 and cross-arch updates from Catalin Marinas:
 "Here's a slightly wider-spread set of updates for 5.9.

  Going outside the usual arch/arm64/ area is the removal of
  read_barrier_depends() series from Will and the MSI/IOMMU ID
  translation series from Lorenzo.

  The notable arm64 updates include ARMv8.4 TLBI range operations and
  translation level hint, time namespace support, and perf.

  Summary:

   - Removal of the tremendously unpopular read_barrier_depends()
     barrier, which is a NOP on all architectures apart from Alpha, in
     favour of allowing architectures to override READ_ONCE() and do
     whatever dance they need to do to ensure address dependencies
     provide LOAD -> LOAD/STORE ordering.

     This work also offers a potential solution if compilers are shown
     to convert LOAD -> LOAD address dependencies into control
     dependencies (e.g. under LTO), as weakly ordered architectures will
     effectively be able to upgrade READ_ONCE() to smp_load_acquire().
     The latter case is not used yet, but will be discussed further at
     LPC.

   - Make the MSI/IOMMU input/output ID translation PCI agnostic,
     augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID
     bus-specific parameter and apply the resulting changes to the
     device ID space provided by the Freescale FSL bus.

   - arm64 support for TLBI range operations and translation table level
     hints (part of the ARMv8.4 architecture version).

   - Time namespace support for arm64.

   - Export the virtual and physical address sizes in vmcoreinfo for
     makedumpfile and crash utilities.

   - CPU feature handling cleanups and checks for programmer errors
     (overlapping bit-fields).

   - ACPI updates for arm64: disallow AML accesses to EFI code regions
     and kernel memory.

   - perf updates for arm64.

   - Miscellaneous fixes and cleanups, most notably PLT counting
     optimisation for module loading, recordmcount fix to ignore
     relocations other than R_AARCH64_CALL26, CMA areas reserved for
     gigantic pages on 16K and 64K configurations.

   - Trivial typos, duplicate words"

Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org
Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits)
  arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack
  arm64/mm: save memory access in check_and_switch_context() fast switch path
  arm64: sigcontext.h: delete duplicated word
  arm64: ptrace.h: delete duplicated word
  arm64: pgtable-hwdef.h: delete duplicated words
  bus: fsl-mc: Add ACPI support for fsl-mc
  bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver
  of/irq: Make of_msi_map_rid() PCI bus agnostic
  of/irq: make of_msi_map_get_device_domain() bus agnostic
  dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus
  of/device: Add input id to of_dma_configure()
  of/iommu: Make of_map_rid() PCI agnostic
  ACPI/IORT: Add an input ID to acpi_dma_configure()
  ACPI/IORT: Remove useless PCI bus walk
  ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
  ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
  ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC
  arm64: enable time namespace support
  arm64/vdso: Restrict splitting VVAR VMA
  arm64/vdso: Handle faults on timens page
  ...
2020-08-03 14:11:08 -07:00
Catalin Marinas 0e4cd9f265 Merge branch 'for-next/read-barrier-depends' into for-next/core
* for-next/read-barrier-depends:
  : Allow architectures to override __READ_ONCE()
  arm64: Reduce the number of header files pulled into vmlinux.lds.S
  compiler.h: Move compiletime_assert() macros into compiler_types.h
  checkpatch: Remove checks relating to [smp_]read_barrier_depends()
  include/linux: Remove smp_read_barrier_depends() from comments
  tools/memory-model: Remove smp_read_barrier_depends() from informal doc
  Documentation/barriers/kokr: Remove references to [smp_]read_barrier_depends()
  Documentation/barriers: Remove references to [smp_]read_barrier_depends()
  locking/barriers: Remove definitions for [smp_]read_barrier_depends()
  alpha: Replace smp_read_barrier_depends() usage with smp_[r]mb()
  vhost: Remove redundant use of read_barrier_depends() barrier
  asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h'
  asm/rwonce: Remove smp_read_barrier_depends() invocation
  alpha: Override READ_ONCE() with barriered implementation
  asm/rwonce: Allow __READ_ONCE to be overridden by the architecture
  compiler.h: Split {READ,WRITE}_ONCE definitions out into rwonce.h
  tools: bpf: Use local copy of headers including uapi/linux/filter.h
2020-07-31 18:09:57 +01:00
Marc Zyngier 16314874b1 Merge branch 'kvm-arm64/misc-5.9' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-30 16:13:04 +01:00
Will Deacon 022c8328dc KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort()
To allow for re-injection of stage-2 faults on stage-1 page-table walks
due to either a missing or read-only memslot, move the triage logic out
of io_mem_abort() and into kvm_handle_guest_abort(), where these aborts
can be handled before anything else.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200729102821.23392-5-will@kernel.org
2020-07-30 16:02:37 +01:00
Will Deacon 54dc0d2404 KVM: arm64: Don't skip cache maintenance for read-only memslots
If a guest performs cache maintenance on a read-only memslot, we should
inform userspace rather than skip the instruction altogether.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200729102821.23392-4-will@kernel.org
2020-07-30 16:02:37 +01:00
Will Deacon 84b951a803 KVM: arm64: Handle data and instruction external aborts the same way
If the guest generates a synchronous external abort which is not handled
by the host, we inject it back into the guest as a virtual SError, but
only if the original fault was reported on the data side. Instruction
faults are reported as "Unsupported FSC", causing the vCPU run loop to
bail with -EFAULT.

Although synchronous external aborts from a guest are pretty unusual,
treat them the same regardless of whether they are taken as data or
instruction aborts by EL2.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200729102821.23392-3-will@kernel.org
2020-07-30 16:02:37 +01:00
Will Deacon c9a636f29b KVM: arm64: Rename kvm_vcpu_dabt_isextabt()
kvm_vcpu_dabt_isextabt() is not specific to data aborts and, unlike
kvm_vcpu_dabt_issext(), has nothing to do with sign extension.

Rename it to 'kvm_vcpu_abt_issea()'.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20200729102821.23392-2-will@kernel.org
2020-07-30 15:59:28 +01:00