Commit Graph

483 Commits

Author SHA1 Message Date
Davidlohr Bueso da4ad88cab kvm: Replace vcpu->swait with rcuwait
The use of any sort of waitqueue (simple or regular) for
wait/waking vcpus has always been an overkill and semantically
wrong. Because this is per-vcpu (which is blocked) there is
only ever a single waiting vcpu, thus no need for any sort of
queue.

As such, make use of the rcuwait primitive, with the following
considerations:

  - rcuwait already provides the proper barriers that serialize
  concurrent waiter and waker.

  - Task wakeup is done in rcu read critical region, with a
  stable task pointer.

  - Because there is no concurrency among waiters, we need
  not worry about rcuwait_wait_event() calls corrupting
  the wait->task. As a consequence, this saves the locking
  done in swait when modifying the queue. This also applies
  to per-vcore wait for powerpc kvm-hv.

The x86 tscdeadline_latency test mentioned in 8577370fb0
("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
latency is reduced by around 15-20% with this change.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-mips@vger.kernel.org
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-6-dave@stgolabs.net>
[Avoid extra logic changes. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:56 -04:00
Paolo Bonzini 4aef2ec902 Merge branch 'kvm-amd-fixes' into HEAD 2020-05-13 12:14:05 -04:00
Liangliang Huang c9b0299034 MIPS: Use fallthrough for arch/mips
Convert the various /* fallthrough */ comments to the pseudo-keyword
fallthrough;

Done via script:
https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/

Signed-off-by: Liangliang Huang <huangll@lemote.com>
Reviewed-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-05-07 11:55:47 +02:00
Tianjia Zhang 1b94f6f810 KVM: Remove redundant argument to kvm_arch_vcpu_ioctl_run
In earlier versions of kvm, 'kvm_run' was an independent structure
and was not included in the vcpu structure. At present, 'kvm_run'
is already included in the vcpu structure, so the parameter
'kvm_run' is redundant.

This patch simplifies the function definition, removes the extra
'kvm_run' parameter, and extracts it from the 'kvm_vcpu' structure
if necessary.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Message-Id: <20200416051057.26526-1-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21 09:13:11 -04:00
Emanuele Giuseppe Esposito 812756a82e kvm_host: unify VM_STAT and VCPU_STAT definitions in a single place
The macros VM_STAT and VCPU_STAT are redundantly implemented in multiple
files, each used by a different architecure to initialize the debugfs
entries for statistics. Since they all have the same purpose, they can be
unified in a single common definition in include/linux/kvm_host.h

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20200414155625.20559-1-eesposit@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21 09:13:01 -04:00
Linus Torvalds 9bb715260e virtio: fixes, vdpa
Some bug fixes.
 The new vdpa subsystem with two first drivers.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl6MS7wPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpGp8H/2H49Gya1cfVbGU13qgmBSQqQXC8hS3iNLuG
 ltRgU+jafJT//kvkdm3/DUzfK3eRUWUfqZLKEbAQDtMY0OGHi/KGEBYVLDde7Zxt
 Lg4VnwBhkYDR/f01ZZDbHxzj9JAr83i28nILjLIqf3a1BX4zf203+ZE0/JM8a7wL
 dOPoH7NAfyz5ul2F67bR1IOF8vC6TidpavzR2+HC/MocHYXb6Bgfvt+i4EcrfuMf
 9lnBfajgklKr9sNJniwvvR1pWVg+YyG3VeC6T8tIC/xzbCmIoNT+5b3q2XPSIHq1
 EuQTeXH9CBFXS0qcFlq2ktR1xd1Lx95hKwZpqLwLFDmfgjhV2QU=
 =/84P
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - Some bug fixes

 - The new vdpa subsystem with two first drivers

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-balloon: Revert "virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM"
  vdpa: move to drivers/vdpa
  virtio: Intel IFC VF driver for VDPA
  vdpasim: vDPA device simulator
  vhost: introduce vDPA-based backend
  virtio: introduce a vDPA based transport
  vDPA: introduce vDPA bus
  vringh: IOTLB support
  vhost: factor out IOTLB
  vhost: allow per device message handler
  vhost: refine vhost and vringh kconfig
  virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
  virtio-net: Introduce hash report feature
  virtio-net: Introduce RSS receive steering feature
  virtio-net: Introduce extended RSC feature
  tools/virtio: option to build an out of tree module
2020-04-08 10:51:53 -07:00
Jason Wang 20c384f1ea vhost: refine vhost and vringh kconfig
Currently, CONFIG_VHOST depends on CONFIG_VIRTUALIZATION. But vhost is
not necessarily for VM since it's a generic userspace and kernel
communication protocol. Such dependency may prevent archs without
virtualization support from using vhost.

To solve this, a dedicated vhost menu is created under drivers so
CONIFG_VHOST can be decoupled out of CONFIG_VIRTUALIZATION.

While at it, also squash Kconfig.vringh into vhost Kconfig file. This
avoids the trick of conditional inclusion from VOP or CAIF. Then it
will be easier to introduce new vringh users and common dependency for
both vringh and vhost.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-04-01 12:06:26 -04:00
Sean Christopherson b990408537 KVM: Pass kvm_init()'s opaque param to additional arch funcs
Pass @opaque to kvm_arch_hardware_setup() and
kvm_arch_check_processor_compat() to allow architecture specific code to
reference @opaque without having to stash it away in a temporary global
variable.  This will enable x86 to separate its vendor specific callback
ops, which are passed via @opaque, into "init" and "runtime" ops without
having to stash away the "init" ops.

No functional change intended.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com> #s390
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-31 10:48:03 -04:00
Sean Christopherson 0dff084607 KVM: Provide common implementation for generic dirty log functions
Move the implementations of KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG
for CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT into common KVM code.
The arch specific implemenations are extremely similar, differing
only in whether the dirty log needs to be sync'd from hardware (x86)
and how the TLBs are flushed.  Add new arch hooks to handle sync
and TLB flush; the sync will also be used for non-generic dirty log
support in a future patch (s390).

The ulterior motive for providing a common implementation is to
eliminate the dependency between arch and common code with respect to
the memslot referenced by the dirty log, i.e. to make it obvious in the
code that the validity of the memslot is guaranteed, as a future patch
will rework memslot handling such that id_to_memslot() can return NULL.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:24 +01:00
Sean Christopherson 9d4c197c0e KVM: Drop "const" attribute from old memslot in commit_memory_region()
Drop the "const" attribute from @old in kvm_arch_commit_memory_region()
to allow arch specific code to free arch specific resources in the old
memslot without having to cast away the attribute.  Freeing resources in
kvm_arch_commit_memory_region() paves the way for simplifying
kvm_free_memslot() by eliminating the last usage of its @dont param.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:20 +01:00
Sean Christopherson 414de7abbf KVM: Drop kvm_arch_create_memslot()
Remove kvm_arch_create_memslot() now that all arch implementations are
effectively nops.  Removing kvm_arch_create_memslot() eliminates the
possibility for arch specific code to allocate memory prior to setting
a memslot, which sets the stage for simplifying kvm_free_memslot().

Cc: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:17 +01:00
Sean Christopherson 879a37632b KVM: MIPS: Fold comparecount_func() into comparecount_wakeup()
Fold kvm_mips_comparecount_func() into kvm_mips_comparecount_wakeup() to
eliminate the nondescript function name as well as its unnecessary cast
of a vcpu to "unsigned long" and back to a vcpu.  Presumably func() was
used as a callback at some point during pre-upstream development, as
wakeup() is the only user of func() and has been the only user since
both with introduced by commit 669e846e6c ("KVM/MIPS32: MIPS arch
specific APIs for KVM").

Cc: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-05 15:29:55 +01:00
Sean Christopherson 09df630712 KVM: MIPS: Fix a build error due to referencing not-yet-defined function
Hoist kvm_mips_comparecount_wakeup() above its only user,
kvm_arch_vcpu_create() to fix a compilation error due to referencing an
undefined function.

Fixes: d11dfed5d7 ("KVM: MIPS: Move all vcpu init code into kvm_arch_vcpu_create()")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-05 15:29:01 +01:00
Sean Christopherson ddd259c9aa KVM: Drop kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit()
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all
arch specific implementations are nops.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:33 +01:00
Sean Christopherson d11dfed5d7 KVM: MIPS: Move all vcpu init code into kvm_arch_vcpu_create()
Fold init() into create() now that the two are called back-to-back by
common KVM code (kvm_vcpu_init() calls kvm_arch_vcpu_init() as its last
action, and kvm_vm_ioctl_create_vcpu() calls kvm_arch_vcpu_create()
immediately thereafter).  Rinse and repeat for kvm_arch_vcpu_uninit()
and kvm_arch_vcpu_destroy().  This paves the way for removing
kvm_arch_vcpu_{un}init() entirely.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:30 +01:00
Sean Christopherson afede96df5 KVM: Drop kvm_arch_vcpu_setup()
Remove kvm_arch_vcpu_setup() now that all arch specific implementations
are nops.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:28 +01:00
Sean Christopherson 5259878432 KVM: MIPS: Move .vcpu_setup() call to kvm_arch_vcpu_create()
Fold setup() into create() now that the two are called back-to-back by
common KVM code.  This paves the way for removing kvm_arch_vcpu_setup().
Note, there is no unwind function associated with kvm_arch_vcpu_setup(),
i.e. no teardown path that also needs to be moved.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:26 +01:00
Sean Christopherson e529ef66e6 KVM: Move vcpu alloc and init invocation to common code
Now that all architectures tightly couple vcpu allocation/free with the
mandatory calls to kvm_{un}init_vcpu(), move the sequences verbatim to
common KVM code.

Move both allocation and initialization in a single patch to eliminate
thrash in arch specific code.  The bisection benefits of moving the two
pieces in separate patches is marginal at best, whereas the odds of
introducing a transient arch specific bug are non-zero.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:20 +01:00
Sean Christopherson 4543bdc088 KVM: Introduce kvm_vcpu_destroy()
Add kvm_vcpu_destroy() and wire up all architectures to call the common
function instead of their arch specific implementation.  The common
destruction function will be used by future patches to move allocation
and initialization of vCPUs to common KVM code, i.e. to free resources
that are allocated by arch agnostic code.

No functional change intended.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:11 +01:00
Sean Christopherson aaf532c579 KVM: MIPS: Invoke kvm_vcpu_uninit() immediately prior to freeing vcpu
Move the call to kvm_vcpu_uninit() in kvm_arch_vcpu_destroy() down a few
lines so that it is invoked immediately prior to freeing the vCPU.  This
paves the way for moving the uninit and free sequence to common KVM code
without an associated functional change.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:09 +01:00
Sean Christopherson 897cc38eaa KVM: Add kvm_arch_vcpu_precreate() to handle pre-allocation issues
Add a pre-allocation arch hook to handle checks that are currently done
by arch specific code prior to allocating the vCPU object.  This paves
the way for moving the allocation to common KVM code.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:07 +01:00
Sean Christopherson 47d51e5eb5 KVM: MIPS: Drop kvm_arch_vcpu_free()
Remove the superfluous kvm_arch_vcpu_free() as it is no longer called
from commmon KVM code.  Note, kvm_arch_vcpu_destroy() *is* called from
common code, i.e. choosing which function to whack is not completely
arbitrary.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:02 +01:00
Sean Christopherson 5233009fab KVM: MIPS: Use kvm_vcpu_cache to allocate vCPUs
For reasons unknown, MIPS configures the vCPU allocation cache but
allocates vCPUs via kzalloc().  Allocate from the vCPU cache in
preparation for moving vCPU allocation to common KVM code.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:01 +01:00
Mike Rapoport 2bee1b5848
mips: add support for folded p4d page tables
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, replace 5leve-fixup.h with pgtable-nop4d.h and
drop usage of __ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Mike Rapoport <rppt@kernel.org>
2019-11-22 10:51:22 -08:00
Mike Rapoport 31168f033e
mips: drop __pXd_offset() macros that duplicate pXd_index() ones
The __pXd_offset() macros are identical to the pXd_index() macros and there
is no point to keep both of them. All architectures define and use
pXd_index() so let's keep only those to make mips consistent with the rest
of the kernel.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Mike Rapoport <rppt@kernel.org>
2019-11-22 10:51:17 -08:00
Linus Torvalds 7f20fd2337 Bugfixes (arm and x86) and cleanups.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJdTfRfAAoJEL/70l94x66DcN0IAIwyaU2+kwP0jd2miQuKxgwl
 WU4u7dZCoQC6meWEVmrSJIVMBONRubmZ9iCqT7807YP8YZSQpOth51FMbULUWuy1
 VW1eaRwqidX0EAihDhg2ZbBZ8H6RQ9Fn0aiEEh44dAZZAwGSVnO3PRKvQEJ15xjk
 q+OQ4hrxtoorwLj+myejmq3YenTFTCMMJfYwwvlCl+J1FfrLZi5k3X5Gjk+j8Ixd
 8CL8/6u5Lu6MCgfYVvxvo8/bUPiATBdF1sWJMMALwXTrDiSy4tQRD0NvZP1HM8G1
 hy0XnhgtsS9rWNLtAFOj+r/XhP9V5lOOGX8yBcj0XQQr+DC9MG6MCL+pXXOaMcA=
 =ZZh8
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Bugfixes (arm and x86) and cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: kvm: Adding config fragments
  KVM: selftests: Update gitignore file for latest changes
  kvm: remove unnecessary PageReserved check
  KVM: arm/arm64: vgic: Reevaluate level sensitive interrupts on enable
  KVM: arm: Don't write junk to CP15 registers on reset
  KVM: arm64: Don't write junk to sysregs on reset
  KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
  x86: kvm: remove useless calls to kvm_para_available
  KVM: no need to check return value of debugfs_create functions
  KVM: remove kvm_arch_has_vcpu_debugfs()
  KVM: Fix leak vCPU's VMCS value into other pCPU
  KVM: Check preempted_in_kernel for involuntary preemption
  KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire
  arm64: KVM: hyp: debug-sr: Mark expected switch fall-through
  KVM: arm64: Update kvm_arm_exception_class and esr_class_str for new EC
  KVM: arm: vgic-v3: Mark expected switch fall-through
  arm64: KVM: regmap: Fix unexpected switch fall-through
  KVM: arm/arm64: Introduce kvm_pmu_vcpu_init() to setup PMU counter index
2019-08-09 15:46:29 -07:00
Linus Torvalds 76d7961ff4 A few MIPS fixes for 5.3:
- Various switch fall through annotations to fixup warnings & errors
   resulting from -Wimplicit-fallthrough.
 
 - A fix for systems (at least jazz) using an i8253 PIT as clocksource
   when it's not suitably configured.
 
 - Set struct cacheinfo's cpu_map_populated field to true, indicating
   that we filled in cache info detected from cop0 registers & avoiding
   complaints about that info being (intentionally) missing in
   devicetree.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXUnSPBUccGF1bC5idXJ0
 b25AbWlwcy5jb20ACgkQPqefrLV1AN2u3gD/TaMPczS5027R0FMXskiroUHaMG4S
 JL0EYIVmfny4vwYBAIvLr5l1jEXEqegjYXFabuI5PybQlFmTZMhjauh6gKYJ
 =e6fl
 -----END PGP SIGNATURE-----

Merge tag 'mips_fixes_5.3_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:
 "A few MIPS fixes for 5.3:

   - Various switch fall through annotations to fixup warnings & errors
     resulting from -Wimplicit-fallthrough.

   - A fix for systems (at least jazz) using an i8253 PIT as clocksource
     when it's not suitably configured.

   - Set struct cacheinfo's cpu_map_populated field to true, indicating
     that we filled in cache info detected from cop0 registers &
     avoiding complaints about that info being (intentionally) missing
     in devicetree"

* tag 'mips_fixes_5.3_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: BCM63XX: Mark expected switch fall-through
  MIPS: OProfile: Mark expected switch fall-throughs
  MIPS: Annotate fall-through in Cavium Octeon code
  MIPS: Annotate fall-through in kvm/emulate.c
  mips: fix cacheinfo
  MIPS: kernel: only use i8253 clocksource with periodic clockevent
2019-08-06 14:01:08 -07:00
Paolo Bonzini 741cbbae07 KVM: remove kvm_arch_has_vcpu_debugfs()
There is no need for this function as all arches have to implement
kvm_arch_create_vcpu_debugfs() no matter what.  A #define symbol
let us actually simplify the code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-05 12:55:48 +02:00
Paul Burton 5bbe54346b
MIPS: Annotate fall-through in kvm/emulate.c
kvm_compute_return_epc contains a switch statement with an intentional
fall-through from a case handling jal (jump and link) instructions to
one handling j (jump) instructions. With -Wimplicit-fallthrough this
triggers a compile error (due to -Werror being enabled for arch/mips).

This can be reproduced using malta_kvm_defconfig.

Fix this by annotating the intentional fall-through.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
2019-07-18 14:41:04 -07:00
Sean Christopherson f257d6dcda KVM: Directly return result from kvm_arch_check_processor_compat()
Add a wrapper to invoke kvm_arch_check_processor_compat() so that the
boilerplate ugliness of checking virtualization support on all CPUs is
hidden from the arch specific code.  x86's implementation in particular
is quite heinous, as it unnecessarily propagates the out-param pattern
into kvm_x86_ops.

While the x86 specific issue could be resolved solely by changing
kvm_x86_ops, make the change for all architectures as returning a value
directly is prettier and technically more robust, e.g. s390 doesn't set
the out param, which could lead to subtle breakage in the (highly
unlikely) scenario where the out-param was not pre-initialized by the
caller.

Opportunistically annotate svm_check_processor_compat() with __init.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-04 19:27:32 +02:00
Thomas Huth a86cb413f4 KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID
KVM_CAP_MAX_VCPU_ID is currently always reporting KVM_MAX_VCPU_ID on all
architectures. However, on s390x, the amount of usable CPUs is determined
during runtime - it is depending on the features of the machine the code
is running on. Since we are using the vcpu_id as an index into the SCA
structures that are defined by the hardware (see e.g. the sca_add_vcpu()
function), it is not only the amount of CPUs that is limited by the hard-
ware, but also the range of IDs that we can use.
Thus KVM_CAP_MAX_VCPU_ID must be determined during runtime on s390x, too.
So the handling of KVM_CAP_MAX_VCPU_ID has to be moved from the common
code into the architecture specific code, and on s390x we have to return
the same value here as for KVM_CAP_MAX_VCPUS.
This problem has been discovered with the kvm_create_max_vcpus selftest.
With this change applied, the selftest now passes on s390x, too.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190523164309.13345-9-thuth@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-05-28 15:52:19 +02:00
Linus Torvalds 92fab77b6b Main MIPS changes for v5.2:
- A set of memblock initialization improvements thanks to Serge Semin,
   tidying up after our conversion from bootmem to memblock back in
   v4.20.
 
 - Our eBPF JIT the previously supported only MIPS64r2 through MIPS64r5
   is improved to also support MIPS64r6. Support for MIPS32 systems is
   introduced, with the caveat that it only works for programs that don't
   use 64 bit registers or operations - those will bail out & need to be
   interpreted.
 
 - Improvements to the allocation & configuration of our exception vector
   that should fix issues seen on some platforms using recent versions of
   U-Boot.
 
 - Some minor improvements to code generated for jump labels, along with
   enabling them by default for generic kernels.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXNNB2RUccGF1bC5idXJ0
 b25AbWlwcy5jb20ACgkQPqefrLV1AN1zeAD/U/ScowcQE8ynoY97nA70d3UmbETH
 YETUX5WcOfR65O8A/1hvMX8QJ1x87XUlNTkE6Gdh/itAZJpJWiSo3dnd1GoF
 =L9IJ
 -----END PGP SIGNATURE-----

Merge tag 'mips_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Paul Burton:

 - A set of memblock initialization improvements thanks to Serge Semin,
   tidying up after our conversion from bootmem to memblock back in
   v4.20.

 - Our eBPF JIT the previously supported only MIPS64r2 through MIPS64r5
   is improved to also support MIPS64r6. Support for MIPS32 systems is
   introduced, with the caveat that it only works for programs that
   don't use 64 bit registers or operations - those will bail out & need
   to be interpreted.

 - Improvements to the allocation & configuration of our exception
   vector that should fix issues seen on some platforms using recent
   versions of U-Boot.

 - Some minor improvements to code generated for jump labels, along with
   enabling them by default for generic kernels.

* tag 'mips_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (27 commits)
  mips: Manually call fdt_init_reserved_mem() method
  mips: Make sure dt memory regions are valid
  mips: Perform early low memory test
  mips: Dump memblock regions for debugging
  mips: Add reserve-nomap memory type support
  mips: Use memblock to reserve the __nosave memory range
  mips: Discard post-CMA-init foreach loop
  mips: Reserve memory for the kernel image resources
  MIPS: Remove duplicate EBase configuration
  MIPS: Sync icache for whole exception vector
  MIPS: Always allocate exception vector for MIPSr2+
  MIPS: Use memblock_phys_alloc() for exception vector
  mips: Combine memblock init and memory reservation loops
  mips: Discard rudiments from bootmem_init
  mips: Make sure kernel .bss exists in boot mem pool
  mips: vdso: drop unnecessary cc-ldoption
  Revert "MIPS: ralink: fix cpu clock of mt7621 and add dt clk devices"
  MIPS: generic: Enable CONFIG_JUMP_LABEL
  MIPS: jump_label: Use compact branches for >= r6
  MIPS: jump_label: Remove redundant nops
  ...
2019-05-08 16:41:47 -07:00
David Howells 5dd50aaeb1
Make anon_inodes unconditional
Make the anon_inodes facility unconditional so that it can be used by core
VFS code and pidfd code.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[christian@brauner.io: adapt commit message to mention pidfds]
Signed-off-by: Christian Brauner <christian@brauner.io>
2019-04-19 14:03:11 +02:00
Paul Burton e6331a321a
MIPS: KVM: Use prandom_u32_max() to generate tlbwr index
Emulation of the tlbwr instruction, which writes a TLB entry to a random
index in the TLB, currently uses get_random_bytes() to generate a 4 byte
random number which we then mask to form the index. This is overkill in
a couple of ways:

  - We don't need 4 bytes here since we mask the value to form a 6 bit
    number anyway, so we waste /dev/random entropy generating 3 random
    bytes that are unused.

  - We don't need crypto-grade randomness here - the architecture spec
    allows implementations to use any algorithm & merely encourages that
    some pseudo-randomness be used rather than a simple counter. The
    fast prandom_u32() function fits that criteria well.

So rather than using get_random_bytes() & consuming /dev/random entropy,
switch to using the faster prandom_u32_max() which provides what we need
here whilst also performing the masking/modulo for us.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: George Spelvin <lkml@sdf.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
2019-03-25 14:02:12 -07:00
Paul Burton c8790d657b
MIPS: MemoryMapID (MMID) Support
Introduce support for using MemoryMapIDs (MMIDs) as an alternative to
Address Space IDs (ASIDs). The major difference between the two is that
MMIDs are global - ie. an MMID uniquely identifies an address space
across all coherent CPUs. In contrast ASIDs are non-global per-CPU IDs,
wherein each address space is allocated a separate ASID for each CPU
upon which it is used. This global namespace allows a new GINVT
instruction be used to globally invalidate TLB entries associated with a
particular MMID across all coherent CPUs in the system, removing the
need for IPIs to invalidate entries with separate ASIDs on each CPU.

The allocation scheme used here is largely borrowed from arm64 (see
arch/arm64/mm/context.c). In essence we maintain a bitmap to track
available MMIDs, and MMIDs in active use at the time of a rollover to a
new MMID version are preserved in the new version. The allocation scheme
requires efficient 64 bit atomics in order to perform reasonably, so
this support depends upon CONFIG_GENERIC_ATOMIC64=n (ie. currently it
will only be included in MIPS64 kernels).

The first, and currently only, available CPU with support for MMIDs is
the MIPS I6500. This CPU supports 16 bit MMIDs, and so for now we cap
our MMIDs to 16 bits wide in order to prevent the bitmap growing to
absurd sizes if any future CPU does implement 32 bit MMIDs as the
architecture manuals suggest is recommended.

When MMIDs are in use we also make use of GINVT instruction which is
available due to the global nature of MMIDs. By executing a sequence of
GINVT & SYNC 0x14 instructions we can avoid the overhead of an IPI to
each remote CPU in many cases. One complication is that GINVT will
invalidate wired entries (in all cases apart from type 0, which targets
the entire TLB). In order to avoid GINVT invalidating any wired TLB
entries we set up, we make sure to create those entries using a reserved
MMID (0) that we never associate with any address space.

Also of note is that KVM will require further work in order to support
MMIDs & GINVT, since KVM is involved in allocating IDs for guests & in
configuring the MMU. That work is not part of this patch, so for now
when MMIDs are in use KVM is disabled.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
2019-02-04 10:56:41 -08:00
Paul Burton 0b317c389c
MIPS: mm: Add set_cpu_context() for ASID assignments
When we gain MMID support we'll be storing MMIDs as atomic64_t values
and accessing them via atomic64_* functions. This necessitates that we
don't use cpu_context() as the left hand side of an assignment, ie. as a
modifiable lvalue. In preparation for this introduce a new
set_cpu_context() function & replace all assignments with cpu_context()
on their left hand side with an equivalent call to set_cpu_context().

To enforce that cpu_context() should not be used for assignments, we
rewrite it as a static inline function.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
2019-02-04 10:56:33 -08:00
Paul Burton 42d5b84657
MIPS: mm: Unify ASID version checks
Introduce a new check_mmu_context() function to check an mm's ASID
version & get a new one if it's outdated, and a
check_switch_mmu_context() function which additionally sets up the new
ASID & page directory. Simplify switch_mm() & various
get_new_mmu_context() callsites in MIPS KVM by making use of the new
functions, which will help reduce the amount of code that requires
modification to gain MMID support.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
2019-02-04 10:56:30 -08:00
Paul Burton 4739f7dd99
MIPS: mm: Remove redundant get_new_mmu_context() cpu argument
get_new_mmu_context() accepts a cpu argument, but implicitly assumes
that this is always equal to smp_processor_id() by operating on the
local CPU's TLB & icache.

Remove the cpu argument and have get_new_mmu_context() call
smp_processor_id() instead.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
2019-02-04 10:56:14 -08:00
Linus Torvalds 769e47094d Kconfig updates for v4.21
- support -y option for merge_config.sh to avoid downgrading =y to =m
 
  - remove S_OTHER symbol type, and touch include/config/*.h files correctly
 
  - fix file name and line number in lexer warnings
 
  - fix memory leak when EOF is encountered in quotation
 
  - resolve all shift/reduce conflicts of the parser
 
  - warn no new line at end of file
 
  - make 'source' statement more strict to take only string literal
 
  - rewrite the lexer and remove the keyword lookup table
 
  - convert to SPDX License Identifier
 
  - compile C files independently instead of including them from zconf.y
 
  - fix various warnings of gconfig
 
  - misc cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcJieuAAoJED2LAQed4NsGHlIP/1s0fQ86XD9dIMyHzAO0gh2f
 7rylfe2kEXJgIzJ0DyZdLu4iZtwbkEUqTQrRS1abriNGVemPkfBAnZdM5d92lOQX
 3iREa700AJ2xo7V7gYZ6AbhZoG3p0S9U9Q2qE5S+tFTe8c2Gy4xtjnODF+Vel85r
 S0P8tF5sE1/d00lm+yfMI/CJVfDjyNaMm+aVEnL0kZTPiRkaktjWgo6Fc2p4z1L5
 HFmMMP6/iaXmRZ+tHJGPQ2AT70GFVZw5ePxPcl50EotUP25KHbuUdzs8wDpYm3U/
 rcESVsIFpgqHWmTsdBk6dZk0q8yFZNkMlkaP/aYukVZpUn/N6oAXgTFckYl8dmQL
 fQBkQi6DTfr9EBPVbj18BKm7xI3Y4DdQ2fzTfYkJ2XwNRGFA5r9N3sjd7ZTVGjxC
 aeeMHCwvGdSx1x8PeZAhZfsUHW8xVDMSQiT713+ljBY+6cwzA+2NF0kP7B6OAqwr
 ETFzd4Xu2/lZcL7gQRH8WU3L2S5iedmDG6RnZgJMXI0/9V4qAA+nlsWaCgnl1TgA
 mpxYlLUMrd6AUJevE34FlnyFdk8IMn9iKRFsvF0f3doO5C7QzTVGqFdJu5a0CuWO
 4NBJvZjFT8/4amoWLfnDlfApWXzTfwLbKG+r6V2F30fLuXpYg5LxWhBoGRPYLZSq
 oi4xN1Mpx3TvXz6WcKVZ
 =r3Fl
 -----END PGP SIGNATURE-----

Merge tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kconfig updates from Masahiro Yamada:

 - support -y option for merge_config.sh to avoid downgrading =y to =m

 - remove S_OTHER symbol type, and touch include/config/*.h files correctly

 - fix file name and line number in lexer warnings

 - fix memory leak when EOF is encountered in quotation

 - resolve all shift/reduce conflicts of the parser

 - warn no new line at end of file

 - make 'source' statement more strict to take only string literal

 - rewrite the lexer and remove the keyword lookup table

 - convert to SPDX License Identifier

 - compile C files independently instead of including them from zconf.y

 - fix various warnings of gconfig

 - misc cleanups

* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
  kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
  kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
  kconfig: add static qualifiers to fix gconf warnings
  kconfig: split the lexer out of zconf.y
  kconfig: split some C files out of zconf.y
  kconfig: convert to SPDX License Identifier
  kconfig: remove keyword lookup table entirely
  kconfig: update current_pos in the second lexer
  kconfig: switch to ASSIGN_VAL state in the second lexer
  kconfig: stop associating kconf_id with yylval
  kconfig: refactor end token rules
  kconfig: stop supporting '.' and '/' in unquoted words
  treewide: surround Kconfig file paths with double quotes
  microblaze: surround string default in Kconfig with double quotes
  kconfig: use T_WORD instead of T_VARIABLE for variables
  kconfig: use specific tokens instead of T_ASSIGN for assignments
  kconfig: refactor scanning and parsing "option" properties
  kconfig: use distinct tokens for type and default properties
  kconfig: remove redundant token defines
  kconfig: rename depends_list to comment_option_list
  ...
2018-12-29 13:03:29 -08:00
Linus Torvalds 42b00f122c * ARM: selftests improvements, large PUD support for HugeTLB,
single-stepping fixes, improved tracing, various timer and vGIC
 fixes
 
 * x86: Processor Tracing virtualization, STIBP support, some correctness fixes,
 refactorings and splitting of vmx.c, use the Hyper-V range TLB flush hypercall,
 reduce order of vcpu struct, WBNOINVD support, do not use -ftrace for __noclone
 functions, nested guest support for PAUSE filtering on AMD, more Hyper-V
 enlightenments (direct mode for synthetic timers)
 
 * PPC: nested VFIO
 
 * s390: bugfixes only this time
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJcH0vFAAoJEL/70l94x66Dw/wH/2FZp1YOM5OgiJzgqnXyDbyf
 dNEfWo472MtNiLsuf+ZAfJojVIu9cv7wtBfXNzW+75XZDfh/J88geHWNSiZDm3Fe
 aM4MOnGG0yF3hQrRQyEHe4IFhGFNERax8Ccv+OL44md9CjYrIrsGkRD08qwb+gNh
 P8T/3wJEKwUcVHA/1VHEIM8MlirxNENc78p6JKd/C7zb0emjGavdIpWFUMr3SNfs
 CemabhJUuwOYtwjRInyx1y34FzYwW3Ejuc9a9UoZ+COahUfkuxHE8u+EQS7vLVF6
 2VGVu5SA0PqgmLlGhHthxLqVgQYo+dB22cRnsLtXlUChtVAq8q9uu5sKzvqEzuE=
 =b4Jx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - selftests improvements
   - large PUD support for HugeTLB
   - single-stepping fixes
   - improved tracing
   - various timer and vGIC fixes

  x86:
   - Processor Tracing virtualization
   - STIBP support
   - some correctness fixes
   - refactorings and splitting of vmx.c
   - use the Hyper-V range TLB flush hypercall
   - reduce order of vcpu struct
   - WBNOINVD support
   - do not use -ftrace for __noclone functions
   - nested guest support for PAUSE filtering on AMD
   - more Hyper-V enlightenments (direct mode for synthetic timers)

  PPC:
   -  nested VFIO

  s390:
   - bugfixes only this time"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits)
  KVM: x86: Add CPUID support for new instruction WBNOINVD
  kvm: selftests: ucall: fix exit mmio address guessing
  Revert "compiler-gcc: disable -ftracer for __noclone functions"
  KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines
  KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
  KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup
  MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry
  KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams
  KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()
  KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp()
  KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte()
  KVM: Make kvm_set_spte_hva() return int
  KVM: Replace old tlb flush function with new one to flush a specified range.
  KVM/MMU: Add tlb flush with range helper function
  KVM/VMX: Add hv tlb range flush support
  x86/hyper-v: Add HvFlushGuestAddressList hypercall support
  KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
  KVM: x86: Disable Intel PT when VMXON in L1 guest
  KVM: x86: Set intercept for Intel PT MSRs read/write
  KVM: x86: Implement Intel PT MSRs read/write emulation
  ...
2018-12-26 11:46:28 -08:00
Masahiro Yamada 8636a1f967 treewide: surround Kconfig file paths with double quotes
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.

I do not see a good reason to complicate Kconfig for the room of
ambiguity.

The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.

Make it treewide consistent now.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
2018-12-22 00:25:54 +09:00
Lan Tianyu 748c0e312f KVM: Make kvm_set_spte_hva() return int
The patch is to make kvm_set_spte_hva() return int and caller can
check return value to determine flush tlb or not.

Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-21 11:28:41 +01:00
Paolo Bonzini 2a31b9db15 kvm: introduce manual dirty log reprotect
There are two problems with KVM_GET_DIRTY_LOG.  First, and less important,
it can take kvm->mmu_lock for an extended period of time.  Second, its user
can actually see many false positives in some cases.  The latter is due
to a benign race like this:

  1. KVM_GET_DIRTY_LOG returns a set of dirty pages and write protects
     them.
  2. The guest modifies the pages, causing them to be marked ditry.
  3. Userspace actually copies the pages.
  4. KVM_GET_DIRTY_LOG returns those pages as dirty again, even though
     they were not written to since (3).

This is especially a problem for large guests, where the time between
(1) and (3) can be substantial.  This patch introduces a new
capability which, when enabled, makes KVM_GET_DIRTY_LOG not
write-protect the pages it returns.  Instead, userspace has to
explicitly clear the dirty log bits just before using the content
of the page.  The new KVM_CLEAR_DIRTY_LOG ioctl can also operate on a
64-page granularity rather than requiring to sync a full memslot;
this way, the mmu_lock is taken for small amounts of time, and
only a small amount of time will pass between write protection
of pages and the sending of their content.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-14 12:34:19 +01:00
Paolo Bonzini 8fe65a8299 kvm: rename last argument to kvm_get_dirty_log_protect
When manual dirty log reprotect will be enabled, kvm_get_dirty_log_protect's
pointer argument will always be false on exit, because no TLB flush is needed
until the manual re-protection operation.  Rename it from "is_dirty" to "flush",
which more accurately tells the caller what they have to do with it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-14 12:34:18 +01:00
Paul Burton c92e47e585
MIPS: Introduce CONFIG_MIPS_FP_SUPPORT
Introduce a Kconfig variable that will indicate whether to include
support for floating point in the kernel. For now this is always
enabled, and will be made configurable in a later patch.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/21016/
Cc: linux-mips@linux-mips.org
2018-11-09 10:23:15 -08:00
Mike Rapoport 57c8a661d9 mm: remove include/linux/bootmem.h
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Marc Zyngier a35381e10d KVM: Remove obsolete kvm_unmap_hva notifier backend
kvm_unmap_hva is long gone, and we only have kvm_unmap_hva_range to
deal with. Drop the now obsolete code.

Fixes: fb1522e099 ("KVM: update to new mmu_notifier semantic v2")
Cc: James Hogan <jhogan@kernel.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
2018-09-07 15:06:02 +02:00
Peter Zijlstra b3dae109fa sched/swait: Rename to exclusive
Since swait basically implemented exclusive waits only, make sure
the API reflects that.

  $ git grep -l -e "\<swake_up\>"
		-e "\<swait_event[^ (]*"
		-e "\<prepare_to_swait\>" | while read file;
    do
	sed -i -e 's/\<swake_up\>/&_one/g'
	       -e 's/\<swait_event[^ (]*/&_exclusive/g'
	       -e 's/\<prepare_to_swait\>/&_exclusive/g' $file;
    done

With a few manual touch-ups.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: bigeasy@linutronix.de
Cc: oleg@redhat.com
Cc: paulmck@linux.vnet.ibm.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180612083909.261946548@infradead.org
2018-06-20 11:35:56 +02:00
Linus Torvalds b357bf6023 Small update for KVM.
* ARM: lazy context-switching of FPSIMD registers on arm64, "split"
 regions for vGIC redistributor
 
 * s390: cleanups for nested, clock handling, crypto, storage keys and
 control register bits
 
 * x86: many bugfixes, implement more Hyper-V super powers,
 implement lapic_timer_advance_ns even when the LAPIC timer
 is emulated using the processor's VMX preemption timer.  Two
 security-related bugfixes at the top of the branch.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJbH8Z/AAoJEL/70l94x66DF+UIAJeOuTp6LGasT/9uAb2OovaN
 +5kGmOPGFwkTcmg8BQHI2fXT4vhxMXWPFcQnyig9eXJVxhuwluXDOH4P9IMay0yw
 VDCBsWRdMvZDQad2hn6Z5zR4Jx01XrSaG/KqvXbbDKDCy96mWG7SYAY2m3ZwmeQi
 3Pa3O3BTijr7hBYnMhdXGkSn4ZyU8uPaAgIJ8795YKeOJ2JmioGYk6fj6y2WCxA3
 ztJymBjTmIoZ/F8bjuVouIyP64xH4q9roAyw4rpu7vnbWGqx1fjPYJoB8yddluWF
 JqCPsPzhKDO7mjZJy+lfaxIlzz2BN7tKBNCm88s5GefGXgZwk3ByAq/0GQ2M3rk=
 =H5zI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Small update for KVM:

  ARM:
   - lazy context-switching of FPSIMD registers on arm64
   - "split" regions for vGIC redistributor

  s390:
   - cleanups for nested
   - clock handling
   - crypto
   - storage keys
   - control register bits

  x86:
   - many bugfixes
   - implement more Hyper-V super powers
   - implement lapic_timer_advance_ns even when the LAPIC timer is
     emulated using the processor's VMX preemption timer.
   - two security-related bugfixes at the top of the branch"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (79 commits)
  kvm: fix typo in flag name
  kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access
  KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system
  KVM: x86: introduce linear_{read,write}_system
  kvm: nVMX: Enforce cpl=0 for VMX instructions
  kvm: nVMX: Add support for "VMWRITE to any supported field"
  kvm: nVMX: Restrict VMX capability MSR changes
  KVM: VMX: Optimize tscdeadline timer latency
  KVM: docs: nVMX: Remove known limitations as they do not exist now
  KVM: docs: mmu: KVM support exposing SLAT to guests
  kvm: no need to check return value of debugfs_create functions
  kvm: Make VM ioctl do valloc for some archs
  kvm: Change return type to vm_fault_t
  KVM: docs: mmu: Fix link to NPT presentation from KVM Forum 2008
  kvm: x86: Amend the KVM_GET_SUPPORTED_CPUID API documentation
  KVM: x86: hyperv: declare KVM_CAP_HYPERV_TLBFLUSH capability
  KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}_EX implementation
  KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} implementation
  KVM: introduce kvm_make_vcpus_request_mask() API
  KVM: x86: hyperv: do rep check for each hypercall separately
  ...
2018-06-12 11:34:04 -07:00
Souptick Joarder 1499fa809e kvm: Change return type to vm_fault_t
Use new return type vm_fault_t for fault handler. For
now, this is just documenting that the function returns
a VM_FAULT value rather than an errno. Once all instances
are converted, vm_fault_t will become a distinct type.

commit 1c8f422059 ("mm: change return type to vm_fault_t")

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 19:18:25 +02:00
Colin Ian King ba3696e94d
KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
Trivial fix to spelling mistake in debugfs_entries text.

Fixes: 669e846e6c ("KVM/MIPS32: MIPS arch specific APIs for KVM")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kernel-janitors@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: James Hogan <jhogan@kernel.org>
2018-05-14 23:54:25 +01:00
Paolo Bonzini 5cb0944c0c KVM: introduce kvm_arch_vcpu_async_ioctl
After the vcpu_load/vcpu_put pushdown, the handling of asynchronous VCPU
ioctl is already much clearer in that it is obvious that they bypass
vcpu_load and vcpu_put.

However, it is still not perfect in that the different state of the VCPU
mutex is still hidden in the caller.  Separate those ioctls into a new
function kvm_arch_vcpu_async_ioctl that returns -ENOIOCTLCMD for more
"traditional" synchronous ioctls.

Cc: James Hogan <jhogan@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:26:59 +01:00
Christoffer Dall 9b062471e5 KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl
Move the calls to vcpu_load() and vcpu_put() in to the architecture
specific implementations of kvm_arch_vcpu_ioctl() which dispatches
further architecture-specific ioctls on to other functions.

Some architectures support asynchronous vcpu ioctls which cannot call
vcpu_load() or take the vcpu->mutex, because that would prevent
concurrent execution with a running VCPU, which is the intended purpose
of these ioctls, for example because they inject interrupts.

We repeat the separate checks for these specifics in the architecture
code for MIPS, S390 and PPC, and avoid taking the vcpu->mutex and
calling vcpu_load for these ioctls.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:26:58 +01:00
Christoffer Dall 875656fe0c KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_set_regs
Move vcpu_load() and vcpu_put() into the architecture specific
implementations of kvm_arch_vcpu_ioctl_set_regs().

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:26:52 +01:00
Christoffer Dall 1fc9b76b3d KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_get_regs
Move vcpu_load() and vcpu_put() into the architecture specific
implementations of kvm_arch_vcpu_ioctl_get_regs().

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:26:51 +01:00
Christoffer Dall accb757d79 KVM: Move vcpu_load to arch-specific kvm_arch_vcpu_ioctl_run
Move vcpu_load() and vcpu_put() into the architecture specific
implementations of kvm_arch_vcpu_ioctl_run().

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> # s390 parts
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Rebased. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:26:50 +01:00
Jan H. Schönherr 20b7035c66 KVM: Let KVM_SET_SIGNAL_MASK work as advertised
KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".

This, however, is only true, when the calling task has a signal handler
registered for a signal. If not, signal evaluation is short-circuited for
SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
returning or the whole process is terminated.

Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-11-27 17:53:47 +01:00
Greg Kroah-Hartman b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Davidlohr Bueso 4c0b4bc60f kvm,mips: Fix potential swait_active() races
For example, the following could occur, making us miss a wakeup:

CPU0					CPU1
kvm_vcpu_block				kvm_mips_comparecount_func
					  [L] swait_active(&vcpu->wq)
  [S] prepare_to_swait(&vcpu->wq)
  [L] if (!kvm_vcpu_has_pending_timer(vcpu))
         schedule()                       [S] queue_timer_int(vcpu)

Ensure that the swait_active() check is not hoisted over the interrupt.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15 16:57:13 +02:00
Longpeng(Mike) 199b5763d3 KVM: add spinlock optimization framework
If a vcpu exits due to request a user mode spinlock, then
the spinlock-holder may be preempted in user mode or kernel mode.
(Note that not all architectures trap spin loops in user mode,
only AMD x86 and ARM/ARM64 currently do).

But if a vcpu exits in kernel mode, then the holder must be
preempted in kernel mode, so we should choose a vcpu in kernel mode
as a more likely candidate for the lock holder.

This introduces kvm_arch_vcpu_in_kernel() to decide whether the
vcpu is in kernel-mode when it's preempted.  kvm_vcpu_on_spin's
new argument says the same of the spinning VCPU.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-08 10:57:43 +02:00
Linus Torvalds c136b84393 PPC:
- Better machine check handling for HV KVM
 - Ability to support guests with threads=2, 4 or 8 on POWER9
 - Fix for a race that could cause delayed recognition of signals
 - Fix for a bug where POWER9 guests could sleep with interrupts pending.
 
 ARM:
 - VCPU request overhaul
 - allow timer and PMU to have their interrupt number selected from userspace
 - workaround for Cavium erratum 30115
 - handling of memory poisonning
 - the usual crop of fixes and cleanups
 
 s390:
 - initial machine check forwarding
 - migration support for the CMMA page hinting information
 - cleanups and fixes
 
 x86:
 - nested VMX bugfixes and improvements
 - more reliable NMI window detection on AMD
 - APIC timer optimizations
 
 Generic:
 - VCPU request overhaul + documentation of common code patterns
 - kvm_stat improvements
 
 There is a small conflict in arch/s390 due to an arch-wide field rename.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJZW4XTAAoJEL/70l94x66DkhMH/izpk54KI17PtyQ9VYI2sYeZ
 BWK6Kl886g3ij4pFi3pECqjDJzWaa3ai+vFfzzpJJ8OkCJT5Rv4LxC5ERltVVmR8
 A3T1I/MRktSC0VJLv34daPC2z4Lco/6SPipUpPnL4bE2HATKed4vzoOjQ3tOeGTy
 dwi7TFjKwoVDiM7kPPDRnTHqCe5G5n13sZ49dBe9WeJ7ttJauWqoxhlYosCGNPEj
 g8ZX8+cvcAhVnz5uFL8roqZ8ygNEQq2mgkU18W8ZZKuiuwR0gdsG0gSBFNTdwIMK
 NoreRKMrw0+oLXTIB8SZsoieU6Qi7w3xMAMabe8AJsvYtoersugbOmdxGCr1lsA=
 =OD7H
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "PPC:
   - Better machine check handling for HV KVM
   - Ability to support guests with threads=2, 4 or 8 on POWER9
   - Fix for a race that could cause delayed recognition of signals
   - Fix for a bug where POWER9 guests could sleep with interrupts pending.

  ARM:
   - VCPU request overhaul
   - allow timer and PMU to have their interrupt number selected from userspace
   - workaround for Cavium erratum 30115
   - handling of memory poisonning
   - the usual crop of fixes and cleanups

  s390:
   - initial machine check forwarding
   - migration support for the CMMA page hinting information
   - cleanups and fixes

  x86:
   - nested VMX bugfixes and improvements
   - more reliable NMI window detection on AMD
   - APIC timer optimizations

  Generic:
   - VCPU request overhaul + documentation of common code patterns
   - kvm_stat improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits)
  Update my email address
  kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS
  x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12
  kvm: x86: mmu: allow A/D bits to be disabled in an mmu
  x86: kvm: mmu: make spte mmio mask more explicit
  x86: kvm: mmu: dead code thanks to access tracking
  KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code
  KVM: PPC: Book3S HV: Close race with testing for signals on guest entry
  KVM: PPC: Book3S HV: Simplify dynamic micro-threading code
  KVM: x86: remove ignored type attribute
  KVM: LAPIC: Fix lapic timer injection delay
  KVM: lapic: reorganize restart_apic_timer
  KVM: lapic: reorganize start_hv_timer
  kvm: nVMX: Check memory operand to INVVPID
  KVM: s390: Inject machine check into the nested guest
  KVM: s390: Inject machine check into the guest
  tools/kvm_stat: add new interactive command 'b'
  tools/kvm_stat: add new command line switch '-i'
  tools/kvm_stat: fix error on interactive command 'g'
  KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit
  ...
2017-07-06 18:38:31 -07:00
James Cowgill e27a9eca5d KVM: MIPS: Fix maybe-uninitialized build failure
This commit fixes a "maybe-uninitialized" build failure in
arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all
enabled. The failure is:

In file included from ./include/linux/printk.h:329:0,
                 from ./include/linux/kernel.h:13,
                 from ./include/asm-generic/bug.h:15,
                 from ./arch/mips/include/asm/bug.h:41,
                 from ./include/linux/bug.h:4,
                 from ./include/linux/thread_info.h:11,
                 from ./include/asm-generic/current.h:4,
                 from ./arch/mips/include/generated/asm/current.h:1,
                 from ./include/linux/sched.h:11,
                 from arch/mips/kvm/tlb.c:13:
arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   __dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
   ^~~~~~~~~~~~~~~~~~
arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
  int idx_user, idx_kernel;
                ^~~~~~~~~~

There is a similar error relating to "idx_user". Both errors were
observed with GCC 6.

As far as I can tell, it is impossible for either idx_user or idx_kernel
to be uninitialized when they are later read in the calls to kvm_debug,
but to satisfy the compiler, add zero initializers to both variables.

Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Fixes: 57e3869cfa ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID")
Cc: <stable@vger.kernel.org> # 4.11+
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-06-20 17:02:48 +02:00
Radim Krčmář 2fa6e1e12a KVM: add kvm_request_pending
A first step in vcpu->requests encapsulation.  Additionally, we now
use READ_ONCE() when accessing vcpu->requests, which ensures we
always load vcpu->requests when it's accessed.  This is important as
other threads can change it any time.  Also, READ_ONCE() documents
that vcpu->requests is used with other threads, likely requiring
memory barriers, which it does.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[ Documented the new use of READ_ONCE() and converted another check
  in arch/mips/kvm/vz.c ]
Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-06-04 16:53:00 +02:00
Radim Krčmář 72875d8a4d KVM: add kvm_{test,clear}_request to replace {test,clear}_bit
Users were expected to use kvm_check_request() for testing and clearing,
but request have expanded their use since then and some users want to
only test or do a faster clear.

Make sure that requests are not directly accessed with bit operations.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27 14:12:22 +02:00
Paolo Bonzini 3042255899 kvm: make KVM_CAP_COALESCED_MMIO architecture agnostic
Remove code from architecture files that can be moved to virt/kvm, since there
is already common code for coalesced MMIO.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Removed a pointless 'break' after 'return'.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-04-07 16:49:00 +02:00
James Hogan dc44abd6aa KVM: MIPS/Emulate: Properly implement TLBR for T&E
Properly implement emulation of the TLBR instruction for Trap & Emulate.
This instruction reads the TLB entry pointed at by the CP0_Index
register into the other TLB registers, which may have the side effect of
changing the current ASID. Therefore abstract the CP0_EntryHi and ASID
changing code into a common function in the process.

A comment indicated that Linux doesn't use TLBR, which is true during
normal use, however dumping of the TLB does use it (for example with the
relatively recent 'x' magic sysrq key), as does a wired TLB entries test
case in my KVM tests.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 16:31:37 +01:00
James Hogan 1f48f9be4d KVM: MIPS/VZ: Handle Octeon III guest.PRid register
Octeon III implements a read-only guest CP0_PRid register, so add cases
to the KVM register access API for Octeon to ensure the correct value is
read and writes are ignored.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 15:36:20 +01:00
James Hogan 3ba731daf0 KVM: MIPS/VZ: Emulate hit CACHE ops for Octeon III
Octeon III doesn't implement the optional GuestCtl0.CG bit to allow
guest mode to execute virtual address based CACHE instructions, so
implement emulation of a few important ones specifically for Octeon III
in response to a GPSI exception.

Currently the main reason to perform these operations is for icache
synchronisation, so they are implemented as a simple icache flush with
local_flush_icache_range().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 15:36:19 +01:00
James Hogan 824533ad16 KVM: MIPS/VZ: VZ hardware setup for Octeon III
Set up hardware virtualisation on Octeon III cores, configuring guest
interrupt routing and carving out half of the root TLB for guest use,
restoring it back again afterwards.

We need to be careful to inhibit TLB shutdown machine check exceptions
while invalidating guest TLB entries, since TLB invalidation is not
available so guest entries must be invalidated by setting them to unique
unmapped addresses, which could conflict with mappings set by the guest
or root if recently repartitioned.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 15:36:18 +01:00
James Hogan 867f4da752 KVM: MIPS/T&E: Report correct dcache line size
Octeon CPUs don't report the correct dcache line size in CP0_Config1.DL,
so encode the correct value for the guest CP0_Config1.DL based on
cpu_dcache_line_size().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 15:36:18 +01:00
James Hogan 1c506c9c10 KVM: MIPS/TLB: Handle virtually tagged icaches
When TLB entries are invalidated in the presence of a virtually tagged
icache, such as that found on Octeon CPUs, flush the icache so that we
don't get a reserved instruction exception even though the TLB mapping
is removed.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 15:36:17 +01:00
James Hogan 4fa9de5a64 KVM: MIPS/Emulate: Adapt T&E CACHE emulation for Octeon
Cache management is implemented separately for Cavium Octeon CPUs, so
r4k_blast_[id]cache aren't available. Instead for Octeon perform a local
icache flush using local_flush_icache_range(), and for other platforms
which don't use c-r4k.c use __flush_cache_all() / flush_icache_all().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 15:36:17 +01:00
James Hogan edec9d7bdc KVM: MIPS/VZ: Trace guest mode changes
Create a trace event for guest mode changes, and enable VZ's
GuestCtl0.MC bit after the trace event is enabled to trap all guest mode
changes.

The MC bit causes Guest Hardware Field Change (GHFC) exceptions whenever
a guest mode change occurs (such as an exception entry or return from
exception), so we need to handle this exception now. The MC bit is only
enabled when restoring register state, so enabling the trace event won't
take immediate effect.

Tracing guest mode changes can be particularly handy when trying to work
out what a guest OS gets up to before something goes wrong, especially
if the problem occurs as a result of some previous guest userland
exception which would otherwise be invisible in the trace.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:54:00 +01:00
James Hogan f4474d50c7 KVM: MIPS/VZ: Support hardware guest timer
Transfer timer state to the VZ guest context (CP0_GTOffset & guest
CP0_Count) when entering guest mode, enabling direct guest access to it,
and transfer back to soft timer when saving guest register state.

This usually allows guest code to directly read CP0_Count (via MFC0 and
RDHWR) and read/write CP0_Compare, without trapping to the hypervisor
for it to emulate the guest timer. Writing to CP0_Count or CP0_Cause.DC
is much less common and still triggers a hypervisor GPSI exception, in
which case the timer state is transferred back to an hrtimer before
emulating the write.

We are careful to prevent small amounts of drift from building up due to
undeterministic time intervals between reading of the ktime and reading
of CP0_Count. Some drift is expected however, since the system
clocksource may use a different timer to the local CP0_Count timer used
by VZ. This is permitted to prevent guest CP0_Count from appearing to go
backwards.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:59 +01:00
James Hogan d42a008f86 KVM: MIPS/VZ: Emulate MAARs when necessary
Add emulation of Memory Accessibility Attribute Registers (MAARs) when
necessary. We can't actually do anything with whatever the guest
provides, but it may not be possible to clear Guest.Config5.MRP so we
have to emulate at least a pair of MAARs.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:58 +01:00
James Hogan 273819a6a5 KVM: MIPS/VZ: Support guest load-linked bit
When restoring guest state after another VCPU has run, be sure to clear
CP0_LLAddr.LLB in order to break any interrupted atomic critical
section. Without this SMP guest atomics don't work when LLB is present
as one guest can complete the atomic section started by another guest.

MIPS VZ guest read of CP0_LLAddr causes Guest Privileged Sensitive
Instruction (GPSI) exception due to the address being root physical.
Handle this by reporting only the LLB bit, which contains the bit for
whether a ll/sc atomic is in progress without any reason for failure.

Similarly on P5600 a guest write to CP0_LLAddr also causes a GPSI
exception. Handle this also by clearing the guest LLB bit from root
mode.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:58 +01:00
James Hogan 5a2f352f34 KVM: MIPS/VZ: Support guest hardware page table walker
Add support for VZ guest CP0_PWBase, CP0_PWField, CP0_PWSize, and
CP0_PWCtl registers for controlling the guest hardware page table walker
(HTW) present on P5600 and P6600 cores. These guest registers need
initialising on R6, context switching, and exposing via the KVM ioctl
API when they are present.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:57 +01:00
James Hogan 4b7de028e6 KVM: MIPS/VZ: Support guest segmentation control
Add support for VZ guest CP0_SegCtl0, CP0_SegCtl1, and CP0_SegCtl2
registers, as found on P5600 and P6600 cores. These guest registers need
initialising, context switching, and exposing via the KVM ioctl API when
they are present.

They also require the GVA -> GPA translation code for handling a GVA
root exception to be updated to interpret the segmentation registers and
decode the faulting instruction enough to detect EVA memory access
instructions.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:56 +01:00
James Hogan dffe042fd8 KVM: MIPS/VZ: Support guest CP0_[X]ContextConfig
Add support for VZ guest CP0_ContextConfig and CP0_XContextConfig
(MIPS64 only) registers, as found on P5600 and P6600 cores. These guest
registers need initialising, context switching, and exposing via the KVM
ioctl API when they are present.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:56 +01:00
James Hogan edc89260d4 KVM: MIPS/VZ: Support guest CP0_BadInstr[P]
Add support for VZ guest CP0_BadInstr and CP0_BadInstrP registers, as
found on most VZ capable cores. These guest registers need context
switching, and exposing via the KVM ioctl API when they are present.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:55 +01:00
James Hogan 56f7a44050 KVM: MIPS: Add VZ support to build system
Add support for the MIPS Virtualization (VZ) ASE to the MIPS KVM build
system. For now KVM can only be configured for T&E or VZ and not both,
but the design of the user facing APIs support the possibility of having
both available, so this could change in future.

Note that support for various optional guest features (some of which
can't be turned off) are implemented in immediately following commits,
so although it should now be possible to build VZ support, it may not
work yet on your hardware.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:54 +01:00
James Hogan c992a4f6a9 KVM: MIPS: Implement VZ support
Add the main support for the MIPS Virtualization ASE (A.K.A. VZ) to MIPS
KVM. The bulk of this work is in vz.c, with various new state and
definitions elsewhere.

Enough is implemented to be able to run on a minimal VZ core. Further
patches will fill out support for guest features which are optional or
can be disabled.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:54 +01:00
James Hogan ea1bdbf683 KVM: MIPS: Update exit handler for VZ
The general guest exit handler needs a few tweaks for VZ compared to
trap & emulate, which for now are made directly depending on
CONFIG_KVM_MIPS_VZ:

- There is no need to re-enable the hardware page table walker (HTW), as
  it can be left enabled during guest mode operation with VZ.

- There is no need to perform a privilege check, as any guest privilege
  violations should have already been detected by the hardware and
  triggered the appropriate guest exception.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:53 +01:00
James Hogan 60c7aa3320 KVM: MIPS/Emulate: Drop CACHE emulation for VZ
Ifdef out the trap & emulate CACHE instruction emulation functions for
VZ. We will provide separate CACHE instruction emulation in vz.c, and we
need to avoid linker errors due to the use of T&E specific MMU helpers.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:53 +01:00
James Hogan 5dee99b24e KVM: MIPS/Emulate: Update CP0_Compare emulation for VZ
Update emulation of guest writes to CP0_Compare for VZ. There are two
main differences compared to trap & emulate:

 - Writing to CP0_Compare in the VZ hardware guest context acks any
   pending timer, clearing CP0_Cause.TI. If we don't want an ack to take
   place we must carefully restore the TI bit if it was previously set.

 - Even with guest timer access disabled in CP0_GuestCtl0.GT, if the
   guest CP0_Count reaches the guest CP0_Compare the timer interrupt
   will assert. To prevent this we must set CP0_GTOffset to move the
   guest CP0_Count out of the way of the new guest CP0_Compare, either
   before or after depending on whether it is a forwards or backwards
   change.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:52 +01:00
James Hogan 372582a6c6 KVM: MIPS/TLB: Add VZ TLB management
Add functions for MIPS VZ TLB management to tlb.c.

kvm_vz_host_tlb_inv() will be used for invalidating root TLB entries
after GPA page tables have been modified due to a KVM page fault. It
arranges for a root GPA mapping to be flushed from the TLB, using the
gpa_mm ASID or the current GuestID to do the probe.

kvm_vz_local_flush_roottlb_all_guests() and
kvm_vz_local_flush_guesttlb_all() flush all TLB entries in the
corresponding TLB for guest mappings (GPA->RPA for root TLB with
GuestID, and all entries for guest TLB). They will be used when starting
a new GuestID cycle, when VZ hardware is enabled/disabled, and also when
switching to a guest when the guest TLB contents may be stale or belong
to a different VM.

kvm_vz_guest_tlb_lookup() converts a guest virtual address to a guest
physical address using the guest TLB. This will be used to decode guest
virtual addresses which are sometimes provided by VZ hardware in
CP0_BadVAddr for certain exceptions when the guest physical address is
unavailable.

kvm_vz_save_guesttlb() and kvm_vz_load_guesttlb() will be used to
preserve wired guest VTLB entries while a guest isn't running.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:51 +01:00
James Hogan 1934a3ad09 KVM: MIPS/Entry: Update entry code to support VZ
Update MIPS KVM entry code to support VZ:

 - We need to set GuestCtl0.GM while in guest mode.

 - For cores supporting GuestID, we need to set the root GuestID to
   match the main GuestID while in guest mode so that the root TLB
   refill handler writes the correct GuestID into the TLB.

 - For cores without GuestID where the root ASID dealiases RVA/GPA
   mappings, we need to load that ASID from the gpa_mm rather than the
   per-VCPU guest_kernel_mm or guest_user_mm, since the root TLB maps
   guest physical addresses. We also need to restore the normal process
   ASID on exit.

 - The normal linux process pgd needs restoring on exit, as we can't
   leave the GPA mappings active for kernel code.

 - GuestCtl0 needs saving on exit for the GExcCode field, as it may be
   clobbered if a preemption occurs.

We also need to move the TLB refill handler to the XTLB vector at offset
0x80 on 64-bit VZ kernels, as hardware will use Root.Status.KX to
determine whether a TLB refill or XTLB Refill exception is to be taken
on a root TLB miss from guest mode, and KX needs to be set for kernel
code to be able to access the 64-bit segments.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:51 +01:00
James Hogan a27660f3e4 KVM: MIPS: Abstract guest CP0 register access for VZ
Abstract the MIPS KVM guest CP0 register access macros into inline
functions which are generated by macros. This allows them to be
generated differently for VZ, where they will usually need to access the
hardware guest CP0 context rather than the saved values in RAM.

Accessors for each individual register are generated using these macros:

 - __BUILD_KVM_*_SW() for registers which are not present in the VZ
   hardware guest context, so kvm_{read,write}_c0_guest_##name() will
   access the saved value in RAM regardless of whether VZ is enabled.

 - __BUILD_KVM_*_HW() for registers which are present in the VZ hardware
   guest context, so kvm_{read,write}_c0_guest_##name() will access the
   hardware register when VZ is enabled.

These build the underlying accessors using further macros:

 - __BUILD_KVM_*_SAVED() builds e.g. kvm_{read,write}_sw_gc0_##name()
   functions for accessing the saved versions of the registers in RAM.
   This is used for implementing the common
   kvm_{read,write}_c0_guest_##name() accessors with T&E where registers
   are always stored in RAM, but are also available with VZ HW registers
   to allow them to be accessed while saved.

 - __BUILD_KVM_*_VZ() builds e.g. kvm_{read,write}_vz_gc0_##name()
   functions for accessing the VZ hardware guest context registers
   directly. This is used for implementing the common
   kvm_{read,write}_c0_guest_##name() accessors with VZ.

 - __BUILD_KVM_*_WRAP() builds wrappers with different names, which
   allows the common kvm_{read,write}_c0_guest_##name() functions to be
   implemented using the VZ accessors while still having the SAVED
   accessors available too.

 - __BUILD_KVM_SAVE_VZ() builds functions for saving and restoring VZ
   hardware guest context register state to RAM, improving conciseness
   of VZ context saving and restoring.

Similar macros exist for generating modifiers (set, clear, change),
either with a normal unlocked read/modify/write, or using atomic LL/SC
sequences.

These changes change the types of 32-bit registers to u32 instead of
unsigned long, which requires some changes to printk() functions in MIPS
KVM.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:50 +01:00
James Hogan 28c1e762b0 KVM: MIPS: Add guest exit exception callback
Add a callback for MIPS KVM implementations to handle the VZ guest
exit exception. Currently the trap & emulate implementation contains a
stub which reports an internal error, but the callback will be used
properly by the VZ implementation.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:49 +01:00
James Hogan edab4fe1f6 KVM: MIPS: Add hardware_{enable,disable} callback
Add an implementation callback for the kvm_arch_hardware_enable() and
kvm_arch_hardware_disable() architecture functions, with simple stubs
for trap & emulate. This is in preparation for VZ which will make use of
them.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:49 +01:00
James Hogan 607ef2fdeb KVM: MIPS: Add callback to check extension
Add an implementation callback for checking presence of KVM extensions.
This allows implementation specific extensions to be provided without
ifdefs in mips.c.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:48 +01:00
James Hogan a517c1ad65 KVM: MIPS: Init timer frequency from callback
Currently the software emulated timer is initialised to a frequency of
100MHz by kvm_mips_init_count(), but this isn't suitable for VZ where
the frequency of the guest timer matches that of the host.

Add a count_hz argument so the caller can specify the default frequency,
and move the call from kvm_arch_vcpu_create() to the implementation
specific vcpu_setup() callback, so that VZ can specify a different
frequency.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:48 +01:00
James Hogan a8a3c42677 KVM: MIPS: Add VZ & TE capabilities
Add new KVM_CAP_MIPS_VZ and KVM_CAP_MIPS_TE capabilities, and in order
to allow MIPS KVM to support VZ without confusing old users (which
expect the trap & emulate implementation), define and start checking
KVM_CREATE_VM type codes.

The codes available are:

 - KVM_VM_MIPS_TE = 0

   This is the current value expected from the user, and will create a
   VM using trap & emulate in user mode, confined to the user mode
   address space. This may in future become unavailable if the kernel is
   only configured to support VZ, in which case the EINVAL error will be
   returned and KVM_CAP_MIPS_TE won't be available even though
   KVM_CAP_MIPS_VZ is.

 - KVM_VM_MIPS_VZ = 1

   This can be provided when the KVM_CAP_MIPS_VZ capability is available
   to create a VM using VZ, with a fully virtualized guest virtual
   address space. If VZ support is unavailable in the kernel, the EINVAL
   error will be returned (although old kernels without the
   KVM_CAP_MIPS_VZ capability may well succeed and create a trap &
   emulate VM).

This is designed to allow the desired implementation (T&E vs VZ) to be
potentially chosen at runtime rather than being fixed in the kernel
configuration.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:47 +01:00
James Hogan a7244920d1 KVM: MIPS: Extend counters & events for VZ GExcCodes
Extend MIPS KVM stats counters and kvm_transition trace event codes to
cover hypervisor exceptions, which have their own GExcCode field in
CP0_GuestCtl0 with up to 32 hypervisor exception cause codes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:46 +01:00
James Hogan c58cf74179 KVM: MIPS: Update kvm_lose_fpu() for VZ
Update the implementation of kvm_lose_fpu() for VZ, where there is no
need to enable the FPU/MSA in the root context if the FPU/MSA state is
loaded but disabled in the guest context.

The trap & emulate implementation needs to disable FPU/MSA in the root
context when the guest disables them in order to catch the COP1 unusable
or MSA disabled exception when they're used and pass it on to the guest.

For VZ however as long as the context is loaded and enabled in the root
context, the guest can enable and disable it in the guest context
without the hypervisor having to do much, and will take guest exceptions
without hypervisor intervention if used without being enabled in the
guest context.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:45 +01:00
James Hogan 59d7814ab2 KVM: MIPS/Emulate: Implement 64-bit MMIO emulation
Implement additional MMIO emulation for MIPS64, including 64-bit
loads/stores, and 32-bit unsigned loads. These are only exposed on
64-bit VZ hosts.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:45 +01:00
James Hogan 8b48d5b75d KVM: MIPS/Emulate: De-duplicate MMIO emulation
Refactor MIPS KVM MMIO load/store emulation to reduce code duplication.
Each duplicate differed slightly anyway, and it will simplify adding
64-bit MMIO support for VZ.

kvm_mips_emulate_store() and kvm_mips_emulate_load() can now return
EMULATE_DO_MMIO (as possibly originally intended). We therefore stop
calling either of these from kvm_mips_emulate_inst(), which is now only
used by kvm_trap_emul_handle_cop_unusable() which is picky about return
values.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-03-28 14:53:44 +01:00
James Hogan 955d8dc3ee KVM: MIPS: Implement HYPCALL emulation
Emulate the HYPCALL instruction added in the VZ ASE and used by the MIPS
paravirtualised guest support that is already merged. The new hypcall.c
handles arguments and the return value. No actual hypercalls are yet
supported, but this still allows us to safely step over hypercalls and
set an error code in the return value for forward compatibility.

Non-zero HYPCALL codes are not handled.

We also document the hypercall ABI which asm/kvm_para.h uses.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
2017-03-28 14:53:33 +01:00
Ingo Molnar 174cd4b1e5 sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>
Fix up affected files that include this signal functionality via sched.h.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:32 +01:00
Paolo Bonzini 460df4c1fc KVM: race-free exit from KVM_RUN without POSIX signals
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
a VCPU out of KVM_RUN through a POSIX signal.  A signal is attached
to a dummy signal handler; by blocking the signal outside KVM_RUN and
unblocking it inside, this possible race is closed:

          VCPU thread                     service thread
   --------------------------------------------------------------
        check flag
                                          set flag
                                          raise signal
        (signal handler does nothing)
        KVM_RUN

However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
tsk->sighand->siglock on every KVM_RUN.  This lock is often on a
remote NUMA node, because it is on the node of a thread's creator.
Taking this lock can be very expensive if there are many userspace
exits (as is the case for SMP Windows VMs without Hyper-V reference
time counter).

As an alternative, we can put the flag directly in kvm_run so that
KVM can see it:

          VCPU thread                     service thread
   --------------------------------------------------------------
                                          raise signal
        signal handler
          set run->immediate_exit
        KVM_RUN
          check run->immediate_exit

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-17 12:27:37 +01:00
James Hogan 12ed1faece KVM: MIPS: Allow multiple VCPUs to be created
Increase the maximum number of MIPS KVM VCPUs to 8, and implement the
KVM_CAP_NR_VCPUS and KVM_CAP_MAX_CPUS capabilities which expose the
recommended and maximum number of VCPUs to userland. The previous
maximum of 1 didn't allow for any form of SMP guests.

We calculate the values similarly to ARM, recommending as many VCPUs as
there are CPUs online in the system. This will allow userland to know
how many VCPUs it is possible to create.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:34 +00:00
James Hogan ad58d4d4a2 KVM: MIPS/T&E: Expose read-only CP0_IntCtl register
Expose the CP0_IntCtl register through the KVM register access API,
which is a required register since MIPS32r2. It is currently read-only
since the VS field isn't implemented due to lack of Config3.VInt or
Config3.VEIC.

It is implemented in trap_emul.c so that a VZ implementation can allow
writes.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:33 +00:00
James Hogan 013044cc65 KVM: MIPS/T&E: Expose CP0_EntryLo0/1 registers
Expose the CP0_EntryLo0 and CP0_EntryLo1 registers through the KVM
register access API. This is fairly straightforward for trap & emulate
since we don't support the RI and XI bits. For the sake of future
proofing (particularly for VZ) it is explicitly specified that the API
always exposes the 64-bit version of these registers (i.e. with the RI
and XI bits in bit positions 63 and 62 respectively), and they are
implemented in trap_emul.c rather than mips.c to allow them to be
implemented differently for VZ.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:32 +00:00
James Hogan be67a0be94 KVM: MIPS/T&E: Default to reset vector
Set the default VCPU state closer to the architectural reset state, with
PC pointing at the reset vector (uncached PA 0x1fc00000, which for KVM
T&E is VA 0x5fc00000), and with CP0_Status.BEV and CP0_Status.ERL to 1.

Although QEMU at least will overwrite this state, it makes sense to do
this now that CP0_EBase is properly implemented to check BEV, and now
that we support a sparse GPA layout potentially with a boot ROM at GPA
0x1fc00000.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:31 +00:00
James Hogan 7801bbe1bd KVM: MIPS/T&E: Implement CP0_EBase register
The CP0_EBase register is a standard feature of MIPS32r2, so we should
always have been implementing it properly. However the register value
was ignored and wasn't exposed to userland.

Fix the emulation of exceptions and interrupts to use the value stored
in guest CP0_EBase, and fix the masks so that the top 3 bits (rather
than the standard 2) are fixed, so that it is always in the guest KSeg0
segment.

Also add CP0_EBASE to the KVM one_reg interface so it can be accessed by
userland, also allowing the CPU number field to be written (which isn't
permitted by the guest).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:30 +00:00
James Hogan 654229a024 KVM: MIPS/T&E: Move CP0 register access into T&E
Access to various CP0 registers via the KVM register access API needs to
be implementation specific to allow restrictions to be made on changes,
for example when VZ guest registers aren't present, so move them all
into trap_emul.c in preparation for VZ.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:30 +00:00
James Hogan 230c57244c KVM: MIPS: Claim KVM_CAP_READONLY_MEM support
Now that load/store faults due to read only memory regions are treated
as MMIO accesses it is safe to claim support for read only memory
regions (KVM_CAP_READONLY_MEM).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:29 +00:00
James Hogan 411740f542 KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMU
Implement the SYNC_MMU capability for KVM MIPS, allowing changes in the
underlying user host virtual address (HVA) mappings to be promptly
reflected in the corresponding guest physical address (GPA) mappings.

This allows for several features to work with guest RAM which require
mappings to be altered or protected, such as copy-on-write, KSM (Kernel
Samepage Merging), idle page tracking, memory swapping, and guest memory
ballooning.

There are two main aspects of this change, described below.

The KVM MMU notifier architecture callbacks are implemented so we can be
notified of changes in the HVA mappings. These arrange for the guest
physical address (GPA) page tables to be modified and possibly for
derived mappings (GVA page tables and TLBs) to be flushed.

 - kvm_unmap_hva[_range]() - These deal with HVA mappings being removed,
   for example before a copy-on-write takes place, which requires the
   corresponding GPA page table mappings to be removed too.

 - kvm_set_spte_hva() - These update a GPA page table entry to match the
   new HVA entry, but must be careful to respect KVM specific
   configuration such as not dirtying a clean guest page which is dirty
   to the host, and write protecting writable pages in read only
   memslots (which will soon be supported).

 - kvm[_test]_age_hva() - These update GPA page table entries to be old
   (invalid) so that access can be tracked, making them young again.

The GPA page fault handling (kvm_mips_map_page) is updated to use
gfn_to_pfn_prot() (which may provide read-only pages), to handle
asynchronous page table invalidation from MMU notifier callbacks, and to
handle more cases in the fast path.

 - mmu_notifier_seq is used to detect asynchronous page table
   invalidations while we're holding a pfn from gfn_to_pfn_prot()
   outside of kvm->mmu_lock, retrying if invalidations have taken place,
   e.g. a COW or a KSM page merge.

 - The fast path (_kvm_mips_map_page_fast) now handles marking old pages
   as young / accessed, and disallowing dirtying of clean pages that
   aren't actually writable (e.g. shared pages that should COW, and
   read-only memory regions when they are enabled in a future patch).

 - Due to the use of MMU notifications we no longer need to keep the
   page references after we've updated the GPA page tables.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:28 +00:00
James Hogan f9b11e51f8 KVM: MIPS/MMU: Pass GPA PTE bits to mapped GVA PTEs
Propagate the GPA PTE protection bits on to the GVA PTEs on a mapped
fault (except _PAGE_WRITE, and filtered by the guest TLB entry), rather
than always overriding the protection. This allows dirty page tracking
to work in mapped guest segments as a clear dirty bit in the GPA PTE
will propagate to the GVA PTEs even when the guest TLB has the dirty bit
set.

Since the filtering of protection bits is now abstracted, if the buddy
GVA PTE is also valid, we obtain the corresponding GPA PTE using a
simple non-allocating walk and load that into the GVA PTE similarly
(which may itself be invalid).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:27 +00:00
James Hogan b584f460e6 KVM: MIPS/MMU: Pass GPA PTE bits to KSeg0 GVA PTEs
Propagate the GPA PTE protection bits on to the GVA PTEs on a KSeg0
fault (except _PAGE_WRITE), rather than always overriding the
protection. This allows dirty page tracking to work in KSeg0 as a clear
dirty bit in the GPA PTE will propagate to the GVA PTEs.

This makes it simpler to use a single kvm_mips_map_page() to obtain both
the main GPA PTE and its buddy (which may be invalid), which also allows
memory regions to be fully accessible when they don't start and end on a
2*PAGE_SIZE boundary.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:26 +00:00
James Hogan b5f1dd1ba4 KVM: MIPS/MMU: Handle dirty logging on GPA faults
Update kvm_mips_map_page() to handle logging of dirty guest physical
pages. Upcoming patches will propagate the dirty bit to the GVA page
tables.

A fast path is added for handling protection bits that can be resolved
without calling into KVM, currently just dirtying of clean pages being
written to.

The slow path marks the GPA page table entry writable only on writes,
and at the same time marks the page dirty in the dirty page logging
bitmask.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:25 +00:00
James Hogan a1ac9e17b7 KVM: MIPS: Clean & flush on dirty page logging enable
When an existing memory region has dirty page logging enabled, make the
entire slot clean (read only) so that writes will immediately start
logging dirty pages (once the dirty bit is transferred from GPA to GVA
page tables in an upcoming patch).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:24 +00:00
James Hogan e88643ba1a KVM: MIPS/MMU: Use generic dirty log & protect helper
MIPS hasn't up to this point properly supported dirty page logging, as
pages in slots with dirty logging enabled aren't made clean, and tlbmod
exceptions from writes to clean pages have been assumed to be due to
guest TLB protection and unconditionally passed to the guest.

Use the generic dirty logging helper kvm_get_dirty_log_protect() to
properly implement kvm_vm_ioctl_get_dirty_log(), similar to how ARM
does. This uses xchg to clear the dirty bits when reading them, rather
than wiping them out afterwards with a memset, which would potentially
wipe recently set bits that weren't caught by kvm_get_dirty_log(). It
also makes the pages clean again using the
kvm_arch_mmu_enable_log_dirty_pt_masked() architecture callback so that
further writes after the shadow memslot is flushed will trigger tlbmod
exceptions and dirty handling.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:23 +00:00
James Hogan f0c0c330f7 KVM: MIPS/MMU: Add GPA PT mkclean helper
Add a helper function to make a range of guest physical address (GPA)
mappings in the GPA page table clean so that writes can be caught. This
will be used in a few places to manage dirty page logging.

Note that until the dirty bit is transferred from GPA page table entries
to GVA page table entries in an upcoming patch this won't trigger a TLB
modified exception on write.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:22 +00:00
James Hogan 64ebc9e240 KVM: MIPS/T&E: Handle read only GPA in TLB mod
Rewrite TLB modified exception handling to handle read only GPA memory
regions, instead of unconditionally passing the exception to the guest.

If the guest TLB is not the cause of the exception we call into the
normal TLB fault handling depending on the memory segment, which will
soon attempt to remap the physical page to be writable (handling dirty
page tracking or copy on write in the process).

Failing that we fall back to treating it as MMIO, due to a read only
memory region. Once the capability is enabled, this will allow read only
memory regions (such as the Malta boot flash as emulated by QEMU) to
have writes treated as MMIO, while still allowing reads to run
untrapped.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:22 +00:00
James Hogan b8f79ddb7d KVM: MIPS/T&E: Treat unhandled guest KSeg0 as MMIO
Treat unhandled accesses to guest KSeg0 as MMIO, rather than only host
KSeg0 addresses. This will allow read only memory regions (such as the
Malta boot flash as emulated by QEMU) to have writes (before reads)
treated as MMIO, and unallocated physical addresses to have all accesses
treated as MMIO.

The MMIO emulation uses the gva_to_gpa callback, so this is also updated
for trap & emulate to handle guest KSeg0 addresses.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:21 +00:00
James Hogan 420ea09b64 KVM: MIPS/T&E: Abstract bad access handling
Abstract the handling of bad guest loads and stores which may need to
trigger an MMIO, so that the same code can be used in a later patch for
guest KSeg0 addresses (TLB exception handling) as well as for host KSeg1
addresses (existing address error exception and TLB exception handling).

We now use kvm_mips_emulate_store() and kvm_mips_emulate_load() directly
rather than the more generic kvm_mips_emulate_inst(), as there is no
need to expose emulation of any other instructions.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:20 +00:00
James Hogan 577ed7f71e KVM: MIPS: Pass type of fault down to kvm_mips_map_page()
kvm_mips_map_page() will need to know whether the fault was due to a
read or a write in order to support dirty page tracking,
KVM_CAP_SYNC_MMU, and read only memory regions, so get that information
passed down to it via new bool write_fault arguments to various
functions.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:19 +00:00
James Hogan 89d6ad8a6b KVM: MIPS/T&E: Ignore user writes to CP0_Config7
Ignore userland writes to CP0_Config7 rather than reporting an error,
since we do allow reads of this register and it is claimed to exist in
the ioctl API.

This allows userland to blindly save and restore KVM registers without
having to special case certain registers as not being writable, for
example during live migration once dirty page logging is fixed.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:18 +00:00
James Hogan b620911086 KVM: MIPS: Implement kvm_arch_flush_shadow_all/memslot
Implement the kvm_arch_flush_shadow_all() and
kvm_arch_flush_shadow_memslot() KVM functions for MIPS to allow guest
physical mappings to be safely changed.

The general MIPS KVM code takes care of flushing of GPA page table
entries. kvm_arch_flush_shadow_all() flushes the whole GPA page table,
and is always called on the cleanup path so there is no need to acquire
the kvm->mmu_lock. kvm_arch_flush_shadow_memslot() flushes only the
range of mappings in the GPA page table corresponding to the slot being
flushed, and happens when memory regions are moved or deleted.

MIPS KVM implementation callbacks are added for handling the
implementation specific flushing of mappings derived from the GPA page
tables. These are implemented for trap_emul.c using
kvm_flush_remote_tlbs() which should now be functional, and will flush
the per-VCPU GVA page tables and ASIDS synchronously (before next
entering guest mode or directly accessing GVA space).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:17 +00:00
James Hogan 4cf74c9c83 KVM: MIPS/Emulate: Use lockless GVA helpers for cache emulation
Use the lockless GVA helpers to implement the reading of guest
instructions for emulation. This will allow it to handle asynchronous
TLB flushes when they are implemented.

This is a little more complicated than the other two cases (get_inst()
and dynamic translation) due to the need to emulate the appropriate
guest TLB exception when the address isn't present or isn't valid in the
guest TLB.

Since there are several protected cache ops that may need to be
performed safely, this is abstracted by kvm_mips_guest_cache_op() which
is passed a protected cache op function pointer and takes care of the
lockless operation and fault handling / retry if the op should fail,
taking advantage of the new errors which the protected cache ops can now
return. This allows the existing advance fault handling which relied on
host TLB lookups to be removed, along with the now unused
kvm_mips_host_tlb_lookup(),

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:15 +00:00
James Hogan 5207ce144a KVM: MIPS/MMU: Use lockless GVA helpers for get_inst()
Use the lockless GVA helpers to implement the reading of guest
instructions for emulation. This will allow it to handle asynchronous
TLB flushes when they are implemented.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:13 +00:00
James Hogan 4b21e8abf9 KVM: MIPS/T&E: Use lockless GVA helpers for dyntrans
Use the lockless GVA helpers to implement the dynamic translation of
guest instructions. This will allow it to handle asynchronous TLB
flushes when they are implemented.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:12 +00:00
James Hogan 1880afd605 KVM: MIPS/T&E: Add lockless GVA access helpers
Add helpers to allow for lockless direct access to the GVA space, by
changing the VCPU mode to READING_SHADOW_PAGE_TABLES for the duration of
the access. This allows asynchronous TLB flush requests in future
patches to safely trigger either a TLB flush before the direct GVA space
access, or a delay until the in-progress lockless direct access is
complete.

The kvm_trap_emul_gva_lockless_begin() and
kvm_trap_emul_gva_lockless_end() helpers take care of guarding the
direct GVA accesses, and kvm_trap_emul_gva_fault() tries to handle a
uaccess fault resulting from a flush having taken place.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:11 +00:00
James Hogan 91737ea205 KVM: MIPS/T&E: Reduce stale ASID checks
The stale ASID checks taking place on VCPU load can be reduced:

- Now that we check for a stale ASID on guest re-entry, there is no need
  to do so when loading the VCPU outside of guest context, since it will
  happen before entering the guest. Note that a lot of KVM VCPU ioctls
  will cause the VCPU to be loaded but guest context won't be entered.

- There is no need to check for a stale kernel_mm ASID when the guest is
  in user mode and vice versa. In fact doing so can potentially be
  problematic since the user_mm ASID regeneration may trigger a new ASID
  cycle, which would cause the kern_mm ASID to become stale after it has
  been checked for staleness.

Therefore only check the ASID for the mm corresponding to the current
guest mode, and only if we're already in guest context. We drop some of
the related kvm_debug() calls here too.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:10 +00:00
James Hogan b29e115ae4 KVM: MIPS/T&E: Handle TLB invalidation requests
Add handling of TLB invalidation requests before entering guest mode.
This will allow asynchonous invalidation of the VCPU mappings when
physical memory regions are altered. Should the CPU running the VCPU
already be in guest mode an IPI will be sent to trigger a guest exit.

The reload_asid path will be used in a future patch for when GVA is
about to be directly accessed by KVM.

In the process, the stale user ASID check in the re-entry path (for lazy
user GVA flushing) is generalised to check the ASID for the current
guest mode, in case a TLB invalidation request was handled. This has the
side effect of making the ASID checks on vcpu_load too conservative,
which will be addressed in a later patch.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:09 +00:00
James Hogan 4841e0dd4f KVM: MIPS: Update vcpu->mode and vcpu->cpu
Keep the vcpu->mode and vcpu->cpu variables up to date so that
kvm_make_all_cpus_request() has a chance of functioning correctly. This
will soon need to be used for kvm_flush_remote_tlbs().

We can easily update vcpu->cpu when the VCPU context is loaded or saved,
which will happen when accessing guest context and when the guest is
scheduled in and out.

We need to be a little careful with vcpu->mode though, as we will in
future be checking for outstanding VCPU requests, and this must be done
after the value of IN_GUEST_MODE in vcpu->mode is visible to other CPUs.
Otherwise the other CPU could fail to trigger an IPI to wait for
completion dispite the VCPU request not being seen.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:08 +00:00
James Hogan 06c158c96e KVM: MIPS/MMU: Convert guest physical map to page table
Current guest physical memory is mapped to host physical addresses using
a single linear array (guest_pmap of length guest_pmap_npages). This was
only really meant to be temporary, and isn't sparse, so its wasteful of
memory. A small amount of RAM at GPA 0 and a small boot exception vector
at GPA 0x1fc00000 cannot be represented without a full 128KiB guest_pmap
allocation (MIPS32 with 16KiB pages), which is one reason why QEMU
currently runs its boot code at the top of RAM instead of the usual boot
exception vector address.

Instead use the existing infrastructure for host virtual page table
management to allocate a page table for guest physical memory too. This
should be sufficient for now, assuming the size of physical memory
doesn't exceed the size of virtual memory. It may need extending in
future to handle XPA (eXtended Physical Addressing) in 32-bit guests, as
supported by VZ guests on P5600.

Some of this code is based loosely on Cavium's VZ KVM implementation.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:08 +00:00
James Hogan 6a97c775ff KVM: MIPS: Use CP0_BadInstr[P] for emulation
When exiting from the guest, store the values of the CP0_BadInstr and
CP0_BadInstrP registers if they exist, which contain the encodings of
the instructions which caused the last synchronous exception.

When the instruction is needed for emulation, kvm_get_badinstr() and
kvm_get_badinstrp() are used instead of calling kvm_get_inst() directly,
to decide whether to read the saved CP0_BadInstr/CP0_BadInstrP registers
(if they exist), or read the instruction from memory (if not).

The use of these registers should be more robust than using
kvm_get_inst(), as it actually gives the instruction encoding seen by
the hardware rather than relying on user accessors after the fact, which
can be fooled by incoherent icache or a racing code modification. It
will also work with VZ, where the guest virtual memory isn't directly
accessible by the host with user accessors.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:07 +00:00
James Hogan 122e51d474 KVM: MIPS: Improve kvm_get_inst() error return
Currently kvm_get_inst() returns KVM_INVALID_INST in the event of a
fault reading the guest instruction. This has the rather arbitrary magic
value 0xdeadbeef. This API isn't very robust, and in fact 0xdeadbeef is
a valid MIPS64 instruction encoding, namely "ld t1,-16657(s5)".

Therefore change the kvm_get_inst() API to return 0 or -EFAULT, and to
return the instruction via a u32 *out argument. We can then drop the
KVM_INVALID_INST definition entirely.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:06 +00:00
James Hogan a1ecc54d7e KVM: MIPS/T&E: Don't treat code fetch faults as MMIO
In order to make use of the CP0_BadInstr & CP0_BadInstrP registers we
need to be a bit more careful not to treat code fetch faults as MMIO,
lest we hit an UNPREDICTABLE register value when we try to emulate the
MMIO load instruction but there was no valid instruction word available
to the hardware.

Add a kvm_is_ifetch_fault() helper to try to figure out whether a load
fault was due to a code fetch, and prevent MMIO instruction emulation in
that case.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:05 +00:00
James Hogan a98dd74103 KVM: MIPS/MMU: Drop kvm_get_new_mmu_context()
MIPS KVM uses its own variation of get_new_mmu_context() which takes an
extra vcpu pointer (unused) and does exactly the same thing.

Switch to just using get_new_mmu_context() directly and drop KVM's
version of it as it doesn't really serve any purpose.

The nearby declarations of kvm_mips_alloc_new_mmu_context(),
kvm_mips_vcpu_load() and kvm_mips_vcpu_put() are also removed from
kvm_host.h, as no definitions or users exist.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:04 +00:00
James Hogan 7071a8859b KVM: MIPS/Emulate: Drop redundant TLB flushes on exceptions
When exceptions are injected into the MIPS KVM guest, the whole host TLB
is flushed (except any entries in the guest KSeg0 range). This is
certainly not mandated by the architecture when exceptions are taken
(userland can't directly change TLB mappings anyway), and is a pretty
heavyweight operation:

 - There may be hundreds of TLB entries especially when a 512 entry FTLB
   is present. These are walked and read and conditionally invalidated,
   so the TLBINV feature can't be used either.

 - It'll indiscriminately wipe out entries belonging to other memory
   spaces. A simple ASID regeneration would be much faster to perform,
   although it'd wipe out the guest KSeg0 mappings too.

My suspicion is that this was simply to plaster over the fact that
kvm_mips_host_tlb_inv() incorrectly only invalidated TLB entries in the
ASID for guest usermode, and not the ASID for guest kernelmode.

Now that the recent commit "KVM: MIPS/TLB: Flush host TLB entry in
kernel ASID" fixes kvm_mips_host_tlb_inv() to flush TLB entries in the
kernelmode ASID when the guest TLB changes, lets drop these calls and
the otherwise unused kvm_mips_flush_host_tlb().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:03 +00:00
James Hogan 49ec508e3b KVM: MIPS/TLB: Drop kvm_local_flush_tlb_all()
Now that KVM no longer uses wired entries we can safely use
local_flush_tlb_all() when we need to flush the entire TLB (on the start
of a new ASID cycle). This doesn't flush wired entries, which allows
other code to use them without KVM clobbering them all the time. It also
is more up to date, knowing about the tlbinv architectural feature,
flushing of micro TLB on cores where that is necessary (Loongson I
believe), and knows to stop the HTW while doing so.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:02 +00:00
James Hogan 8af0e3c2e8 KVM: MIPS/Emulate: Fix CACHE emulation for EVA hosts
Use protected_writeback_dcache_line() instead of flush_dcache_line(),
and protected_flush_icache_line() instead of flush_icache_line(), so
that CACHEE (the EVA variant) is used on EVA host kernels.

Without this, guest floating point branch delay slot emulation via a
trampoline on the user stack fails on EVA host kernels due to failure of
the icache sync, resulting in the break instruction getting skipped and
execution from the stack.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:01 +00:00
James Hogan dacc3ed1dd KVM: MIPS: Use uaccess to read/modify guest instructions
Now that we have GVA page tables, use standard user accesses with page
faults disabled to read & modify guest instructions. This should be more
robust (than the rather dodgy method of accessing guest mapped segments
by just directly addressing them) and will also work with Enhanced
Virtual Addressing (EVA) host kernel configurations where dedicated
instructions are needed for accessing user mode memory.

For simplicity and speed we do this regardless of the guest segment the
address resides in, rather than handling guest KSeg0 specially with
kmap_atomic() as before.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:01 +00:00
James Hogan 7a156e9f82 KVM: MIPS: Drop vm_init() callback
Now that the commpage doesn't use wired TLB entries, the per-CPU
vm_init() callback is the only work done by kvm_mips_init_vm_percpu().

The trap & emulate implementation doesn't actually need to do anything
from vm_init(), and the future VZ implementation would be better served
by a kvm_arch_hardware_enable callback anyway.

Therefore drop the vm_init() callback entirely, allowing the
kvm_mips_init_vm_percpu() function to also be dropped, along with the
kvm_mips_instance atomic counter.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:21:00 +00:00
James Hogan 4c86460cbc KVM: MIPS/MMU: Convert commpage fault handling to page tables
Now that we have GVA page tables and an optimised TLB refill handler in
place, convert the handling of commpage faults from the guest kernel to
fill the GVA page table and invalidate the TLB entry, rather than
filling the wired TLB entry directly.

For simplicity we no longer use a wired entry for the commpage (refill
should be much cheaper with the fast-path handler anyway). Since we
don't need to manipulate the TLB directly any longer, move the function
from tlb.c to mmu.c. This puts it closer to the similar functions
handling KSeg0 and TLB mapped page faults from the guest.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:59 +00:00
James Hogan 7e3d2a750b KVM: MIPS/MMU: Convert TLB mapped faults to page tables
Now that we have GVA page tables and an optimised TLB refill handler in
place, convert the handling of page faults in TLB mapped segment from
the guest to fill a single GVA page table entry and invalidate the TLB
entry, rather than filling a TLB entry pair directly.

Also remove the now unused kvm_mips_get_{kernel,user}_asid() functions
in mmu.c and kvm_mips_host_tlb_write() in tlb.c.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:58 +00:00
James Hogan fb99589391 KVM: MIPS/MMU: Convert KSeg0 faults to page tables
Now that we have GVA page tables and an optimised TLB refill handler in
place, convert the handling of KSeg0 page faults from the guest to fill
the GVA page tables and invalidate the TLB entry, rather than filling a
TLB entry directly.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:57 +00:00
James Hogan aba8592950 KVM: MIPS/MMU: Invalidate stale GVA PTEs on TLBW
Implement invalidation of specific pairs of GVA page table entries in
one or both of the GVA page tables. This is used when existing mappings
are replaced in the guest TLB by emulated TLBWI/TLBWR instructions. Due
to the sharing of page tables in the host kernel range, we should be
careful not to allow host pages to be invalidated.

Add a helper kvm_mips_walk_pgd() which can be used when walking of
either GPA (future patches) or GVA page tables is needed, optionally
with allocation of page tables along the way when they don't exist.

GPA page table walking will need to be protected by the kvm->mmu_lock,
so we also add a small MMU page cache in each KVM VCPU, like that found
for other architectures but smaller. This allows enough pages to be
pre-allocated to handle a single fault without holding the lock,
allowing the helper to run with the lock held without having to handle
allocation failures.

Using the same mechanism for GVA allows the same code to be used, and
allows it to use the same cache of allocated pages if the GPA walk
didn't need to allocate any new tables.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:56 +00:00
James Hogan a31b50d741 KVM: MIPS/MMU: Invalidate GVA PTs on ASID changes
Implement invalidation of large ranges of virtual addresses from GVA
page tables in response to a guest ASID change (immediately for guest
kernel page table, lazily for guest user page table).

We iterate through a range of page tables invalidating entries and
freeing fully invalidated tables. To minimise overhead the exact ranges
invalidated depends on the flags argument to kvm_mips_flush_gva_pt(),
which also allows it to be used in future KVM_CAP_SYNC_MMU patches in
response to GPA changes, which unlike guest TLB mapping changes affects
guest KSeg0 mappings.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:55 +00:00
James Hogan 57e3869cfa KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID
Refactor kvm_mips_host_tlb_inv() to also be able to invalidate any
matching TLB entry in the kernel ASID rather than assuming only the TLB
entries in the user ASID can change. Two new bool user/kernel arguments
allow the caller to indicate whether the mapping should affect each of
the ASIDs for guest user/kernel mode.

- kvm_mips_invalidate_guest_tlb() (used by TLBWI/TLBWR emulation) can
  now invalidate any corresponding TLB entry in both the kernel ASID
  (guest kernel may have accessed any guest mapping), and the user ASID
  if the entry being replaced is in guest USeg (where guest user may
  also have accessed it).

- The tlbmod fault handler (and the KSeg0 / TLB mapped / commpage fault
  handlers in later patches) can now invalidate the corresponding TLB
  entry in whichever ASID is currently active, since only a single page
  table will have been updated anyway.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:54 +00:00
James Hogan f3a8603f09 KVM: MIPS/TLB: Fix off-by-one in TLB invalidate
kvm_mips_host_tlb_inv() uses the TLBP instruction to probe the host TLB
for an entry matching the given guest virtual address, and determines
whether a match was found based on whether CP0_Index > 0. This is
technically incorrect as an index of 0 (with the high bit clear) is a
perfectly valid TLB index.

This is harmless at the moment due to the use of at least 1 wired TLB
entry for the KVM commpage, however we will soon be ridding ourselves of
that particular wired entry so lets fix the condition in case the entry
needing invalidation does land at TLB index 0.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:53 +00:00
James Hogan a7cfa7ac12 KVM: MIPS: Add fast path TLB refill handler
Use functions from the general MIPS TLB exception vector generation code
(tlbex.c) to construct a fast path TLB refill handler similar to the
general one, but cut down and capable of preserving K0 and K1.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:53 +00:00
James Hogan 29b500b54e KVM: MIPS: Support NetLogic KScratch registers
tlbex.c uses the implementation dependent $22 CP0 register group on
NetLogic cores, with the help of the c0_kscratch() helper. Allow these
registers to be allocated by the KVM entry code too instead of assuming
KScratch registers are all $31, which will also allow pgd_reg to be
handled since it is allocated that way.

We also drop the masking of kscratch_mask with 0xfc, as it is redundant
for the standard KScratch registers (Config4.KScrExist won't have the
low 2 bits set anyway), and apparently not necessary for NetLogic.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:52 +00:00
James Hogan 7faa6eec69 KVM: MIPS/T&E: Activate GVA page tables in guest context
Activate the GVA page tables when in guest context. This will allow the
normal Linux TLB refill handler to fill from it when guest memory is
read, as well as preventing accidental reading from user memory.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:51 +00:00
James Hogan f7f1427dc0 KVM: MIPS/T&E: Allocate GVA -> HPA page tables
Allocate GVA -> HPA page tables for guest kernel and guest user mode on
each VCPU, to allow for fast path TLB refill handling to be added later.

In the process kvm_arch_vcpu_init() needs updating to pass on any error
from the vcpu_init() callback.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:50 +00:00
James Hogan 630766b369 KVM: MIPS: Wire up vcpu uninit
Wire up a vcpu uninit implementation callback. This will be used for the
clean up of GVA->HPA page tables.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:49 +00:00
James Hogan a7ebb2e410 KVM: MIPS/T&E: active_mm = init_mm in guest context
Set init_mm as the active_mm and update mm_cpumask(current->mm) to
reflect that it isn't active when in guest context. This prevents cache
management code from attempting cache flushes on host virtual addresses
while in guest context, for example due to a cache management IPIs or
later when writing of dynamically translated code hits copy on write.

We do this using helpers in static kernel code to avoid having to export
init_mm to modules.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
2017-02-03 15:20:48 +00:00