Commit Graph

901 Commits

Author SHA1 Message Date
Linus Torvalds 7cbb39d4d4 Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "PPC and ARM do not have much going on this time.  Most of the cool
  stuff, instead, is in s390 and (after a few releases) x86.

  ARM has some caching fixes and PPC has transactional memory support in
  guests.  MIPS has some fixes, with more probably coming in 3.16 as
  QEMU will soon get support for MIPS KVM.

  For x86 there are optimizations for debug registers, which trigger on
  some Windows games, and other important fixes for Windows guests.  We
  now expose to the guest Broadwell instruction set extensions and also
  Intel MPX.  There's also a fix/workaround for OS X guests, nested
  virtualization features (preemption timer), and a couple kvmclock
  refinements.

  For s390, the main news is asynchronous page faults, together with
  improvements to IRQs (floating irqs and adapter irqs) that speed up
  virtio devices"

* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
  KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
  KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
  KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
  KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
  KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
  KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
  KVM: PPC: Book3S HV: Add transactional memory support
  KVM: Specify byte order for KVM_EXIT_MMIO
  KVM: vmx: fix MPX detection
  KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
  KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
  KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
  KVM: s390: clear local interrupts at cpu initial reset
  KVM: s390: Fix possible memory leak in SIGP functions
  KVM: s390: fix calculation of idle_mask array size
  KVM: s390: randomize sca address
  KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
  KVM: Bump KVM_MAX_IRQ_ROUTES for s390
  KVM: s390: irq routing for adapter interrupts.
  KVM: s390: adapter interrupt sources
  ...
2014-04-02 14:50:10 -07:00
Linus Torvalds 235c7b9feb Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull main powerpc updates from Ben Herrenschmidt:
 "This time around, the powerpc merges are going to be a little bit more
  complicated than usual.

  This is the main pull request with most of the work for this merge
  window.  I will describe it a bit more further down.

  There is some additional cpuidle driver work; however, I haven't
  included it in this tree as it depends on some work in tip/timer-core
  which Thomas accidentally forgot to put in a topic branch.  Since I
  didn't want to carry all of that tip timer stuff in powerpc -next, I
  set up a separate branch on top of Thomas' tree with just that cpuidle
  driver in it, and Stephen has been carrying that in next separately
  for a while now.  I'll send a separate pull request for it.

  Additionally, two new pieces in this tree add users for a sysfs API
  that Tejun and Greg have been deprecating in drivers-core-next.
  Thankfully Greg reverted the patch that removes the old API so this
  merge can happen cleanly, but once merged, I will send a patch
  adjusting our new code to the new API so that Greg can send you the
  removal patch.

  Now as for the content of this branch, we have a lot of perf work for
  the new POWER8 counters, including support for our new "nest" counters
  (also called 24x7) under pHyp (not natively yet).

  We have new functionality when running under the OPAL firmware
  (non-virtualized or KVM host), such as access to the firmware error
  logs and service processor dumps, system parameters and sensors, along
  with a hwmon driver for the latter.

  There's also a bunch of bug fixes across the board, some LE fixes,
  and a nice set of selftests for validating our various types of copy
  loops.

  On the Freescale side, we see mostly new chip/board revisions, some
  clock updates, better support for machine checks and debug exceptions,
  etc..."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (70 commits)
  powerpc/book3s: Fix CFAR clobbering issue in machine check handler.
  powerpc/compat: 32-bit little endian machine name is ppcle, not ppc
  powerpc/le: Big endian arguments for ppc_rtas()
  powerpc: Use default set of netfilter modules (CONFIG_NETFILTER_ADVANCED=n)
  powerpc/defconfigs: Enable THP in pseries defconfig
  powerpc/mm: Make sure a local_irq_disable prevent a parallel THP split
  powerpc: Rate-limit users spamming kernel log buffer
  powerpc/perf: Fix handling of L3 events with bank == 1
  powerpc/perf/hv_{gpci, 24x7}: Add documentation of device attributes
  powerpc/perf: Add kconfig option for hypervisor provided counters
  powerpc/perf: Add support for the hv 24x7 interface
  powerpc/perf: Add support for the hv gpci (get performance counter info) interface
  powerpc/perf: Add macros for defining event fields & formats
  powerpc/perf: Add a shared interface to get gpci version and capabilities
  powerpc/perf: Add 24x7 interface headers
  powerpc/perf: Add hv_gpci interface header
  powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info)
  sysfs: create bin_attributes under the requested group
  powerpc/perf: Enable BHRB access for EBB events
  powerpc/perf: Add BHRB constraint and IFM MMCRA handling for EBB
  ...
2014-04-02 13:42:59 -07:00
Paolo Bonzini 7227fc0666 Merge branch 'kvm-ppchv-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-next 2014-03-29 15:44:05 +01:00
Paul Mackerras 72cde5a88d KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
Currently we save the host PMU configuration, counter values, etc.,
when entering a guest, and restore it on return from the guest.
(We have to do this because the guest has control of the PMU while
it is executing.)  However, we missed saving/restoring the SIAR and
SDAR registers, as well as the registers which are new on POWER8,
namely SIER and MMCR2.

This adds code to save the values of these registers when entering
the guest and restore them on exit.  This also works around the bug
in POWER8 where setting PMAE with a counter already negative doesn't
generate an interrupt.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:52 +11:00
Paul Mackerras c5fb80d3b2 KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
Commit c7699822bc21 ("KVM: PPC: Book3S HV: Make physical thread 0 do
the MMU switching") reordered the guest entry/exit code so that most
of the guest register save/restore code happened in guest MMU context.
A side effect of that is that the timebase still contains the guest
timebase value at the point where we compute and use vcpu->arch.dec_expires,
and therefore that is now a guest timebase value rather than a host
timebase value.  That in turn means that the timeouts computed in
kvmppc_set_timer() are wrong if the timebase offset for the guest is
non-zero.  The consequence of that is things such as "sleep 1" in a
guest after migration may sleep for much longer than they should.

This fixes the problem by converting between guest and host timebase
values as necessary, by adding or subtracting the timebase offset.
This also fixes an incorrect comment.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:39 +11:00
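
A minimal standalone C sketch of the guest-to-host timebase conversion described in the commit above (the tb_offset and dec_expires names follow the description; everything else is a stand-in, not the kernel code):

    #include <stdint.h>
    #include <stdio.h>

    /* Stand-in: in the kernel this offset lives in the virtual-core state. */
    static uint64_t tb_offset = 0x1000;   /* guest TB = host TB + tb_offset */

    /* Convert an expiry computed against the guest timebase into host TB. */
    static uint64_t guest_tb_to_host_tb(uint64_t guest_tb)
    {
            return guest_tb - tb_offset;
    }

    int main(void)
    {
            uint64_t dec_expires_guest = 0x12345678ULL;
            printf("guest expiry %#llx -> host expiry %#llx\n",
                   (unsigned long long)dec_expires_guest,
                   (unsigned long long)guest_tb_to_host_tb(dec_expires_guest));
            return 0;
    }

Without that subtraction, a non-zero tb_offset inflates every timeout by the
size of the offset, which is how "sleep 1" in a migrated guest can oversleep.
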
Paul Mackerras 797f9c07eb KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
With HV KVM, some high-frequency hypercalls such as H_ENTER are handled
in real mode, and need to access the memslots array for the guest.
Accessing the memslots array is safe, because we hold the SRCU read
lock for the whole time that a guest vcpu is running.  However, the
checks that kvm_memslots() does when lockdep is enabled are potentially
unsafe in real mode, when only the linear mapping is available.
Furthermore, kvm_memslots() can be called from a secondary CPU thread,
which is an offline CPU from the point of view of the host kernel,
and is not running the task which holds the SRCU read lock.

To avoid false positives in the checks in kvm_memslots(), and to avoid
possible side effects from doing the checks in real mode, this replaces
kvm_memslots() with kvm_memslots_raw() in all the places that execute
in real mode.  kvm_memslots_raw() is a new function that is like
kvm_memslots() but uses rcu_dereference_raw_notrace() instead of
kvm_dereference_check().

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:35 +11:00
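
A kernel-style sketch (not standalone, and not necessarily the exact code that
was merged) of what a "raw" accessor along the lines described above could look
like, assuming the memslots pointer is RCU-protected as stated:

    /* Like kvm_memslots(), but without the lockdep/SRCU checks, so it can
     * be used from real mode and from offline secondary CPU threads. */
    static inline struct kvm_memslots *kvm_memslots_raw(struct kvm *kvm)
    {
            return rcu_dereference_raw_notrace(kvm->memslots);
    }
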
Paul Mackerras 739e2425fe KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
If an attempt is made to load the kvm-hv module on a machine which
doesn't have hypervisor mode available, return an ENODEV error,
which is the conventional thing to return to indicate that this
module is not applicable to the hardware of the current machine,
rather than EIO, which causes a warning to be printed.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:29 +11:00
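
A standalone sketch of the error-code convention the commit describes
(hv_mode_available() and kvmhv_init_sketch() are hypothetical stand-ins for
the real module-init check):

    #include <errno.h>
    #include <stdio.h>

    /* Hypothetical stand-in for the real "does this CPU have hypervisor
     * mode?" check performed when the kvm-hv module is loaded. */
    static int hv_mode_available(void)
    {
            return 0;       /* pretend this machine lacks hypervisor mode */
    }

    static int kvmhv_init_sketch(void)
    {
            if (!hv_mode_available())
                    return -ENODEV; /* "module not applicable", no warning */
            /* ... normal initialisation would go here ... */
            return 0;
    }

    int main(void)
    {
            printf("init returned %d\n", kvmhv_init_sketch());
            return 0;
    }
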
Paul Mackerras b24f36f33e KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
The in-kernel emulation of RTAS functions needs to read the argument
buffer from guest memory in order to find out what function is being
requested.  The guest supplies the guest physical address of the buffer,
and on a real system the code that reads that buffer would run in guest
real mode.  In guest real mode, the processor ignores the top 4 bits
of the address specified in load and store instructions.  In order to
emulate that behaviour correctly, we need to mask off those bits
before calling kvm_read_guest() or kvm_write_guest().  This adds that
masking.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:23 +11:00
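
The masking itself is one line; a standalone C sketch of the idea (the helper
name is an illustration, not the kernel's, and the constant simply clears the
top 4 bits of a 64-bit address):

    #include <stdint.h>
    #include <stdio.h>

    /* Guest real mode ignores the top 4 bits of the effective address,
     * so mask them off before using it as a guest physical address. */
    static uint64_t rtas_args_gpa(uint64_t guest_addr)
    {
            return guest_addr & ~(0xFULL << 60);
    }

    int main(void)
    {
            uint64_t addr = 0xC000000012345678ULL;  /* top bits set */
            printf("%#llx -> %#llx\n",
                   (unsigned long long)addr,
                   (unsigned long long)rtas_args_gpa(addr));
            return 0;
    }
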
Michael Neuling a7d80d01c6 KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
This adds code to get/set_one_reg to read and write the new transactional
memory (TM) state.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:17 +11:00
Michael Neuling e4e3812150 KVM: PPC: Book3S HV: Add transactional memory support
This adds saving of the transactional memory (TM) checkpointed state
on guest entry and exit.  We only do this if we see that the guest has
an active transaction.

It also adds emulation of the TM state changes when delivering IRQs
into the guest.  According to the architecture, if we are
transactional when an IRQ occurs, the TM state is changed to
suspended, otherwise it's left unchanged.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-29 19:58:02 +11:00
Anton Blanchard 7505258c5f KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
I noticed KVM is broken when KVM in-kernel XICS emulation
(CONFIG_KVM_XICS) is disabled.

The problem was introduced in 48eaef05 (KVM: PPC: Book3S HV: use
xics_wake_cpu only when defined). It used CONFIG_KVM_XICS to wrap
xics_wake_cpu, where CONFIG_PPC_ICP_NATIVE should have been
used.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-03-26 23:34:56 +11:00
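
In outline, the fix swaps the guard symbol around the wakeup call; a hedged
sketch of the shape of the change (fragment only, not the literal diff):

    /* Before (sketch): guarded by the wrong symbol, so a CONFIG_KVM_XICS=n
     * build lost the wakeup and the guest could hang. */
    #ifdef CONFIG_KVM_XICS
            xics_wake_cpu(cpu);
    #endif

    /* After (sketch): guard by the option that actually provides
     * xics_wake_cpu(), the native ICP backend. */
    #ifdef CONFIG_PPC_ICP_NATIVE
            xics_wake_cpu(cpu);
    #endif
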
Laurent Dufour 69e9fbb278 KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
This introduces the H_GET_TCE hypervisor call, which is basically the
reverse of H_PUT_TCE, as defined in the Power Architecture Platform
Requirements (PAPR).

The hcall H_GET_TCE is required by the kdump kernel, which uses it to
retrieve TCEs set up by the previous (panicked) kernel.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2014-03-26 23:34:27 +11:00
Greg Kurz e59d24e612 KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
When the guest does an MMIO write which is handled successfully by an
ioeventfd, ioeventfd_write() returns 0 (success) and
kvmppc_handle_store() returns EMULATE_DONE.  Then
kvmppc_emulate_mmio() converts EMULATE_DONE to RESUME_GUEST_NV and
this causes an exit from the loop in kvmppc_vcpu_run_hv(), causing an
exit back to userspace with a bogus exit reason code, typically
causing userspace (e.g. qemu) to crash with a message about an unknown
exit code.

This adds handling of RESUME_GUEST_NV in kvmppc_vcpu_run_hv() in order
to fix that.  For generality, we define a helper to check for either
of the return-to-guest codes we use, RESUME_GUEST and RESUME_GUEST_NV,
to make it easy to check for either and provide one place to update if
any other return-to-guest code gets defined in future.

Since it only affects Book3S HV for now, the helper is added to
the kvm_book3s.h header file.

We use the helper in two places in kvmppc_run_core() as well for
future-proofing, though we don't see RESUME_GUEST_NV in either place
at present.

[paulus@samba.org - combined 4 patches into one, rewrote description]

Suggested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2014-03-26 23:33:44 +11:00
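
A standalone sketch of the kind of helper described above (the helper name and
the numeric values are stand-ins; only the two return-to-guest codes are taken
from the commit message):

    #include <stdbool.h>
    #include <stdio.h>

    /* Stand-in values for the two return-to-guest codes. */
    #define RESUME_GUEST    1
    #define RESUME_GUEST_NV 2
    #define RESUME_HOST     0

    /* One place to ask "does this code mean: go back into the guest?". */
    static bool is_resume_guest(int r)
    {
            return r == RESUME_GUEST || r == RESUME_GUEST_NV;
    }

    int main(void)
    {
            printf("%d %d %d\n", is_resume_guest(RESUME_GUEST),
                   is_resume_guest(RESUME_GUEST_NV), is_resume_guest(RESUME_HOST));
            return 0;
    }
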
Scott Wood a3dc620743 powerpc/booke64: Use SPRG_TLB_EXFRAME on bolted handlers
While bolted handlers (including e6500) do not need to deal with a TLB
miss recursively causing another TLB miss, nested TLB misses can still
happen with crit/mc/debug exceptions -- so we still need to honor
SPRG_TLB_EXFRAME.

We don't need to spend time modifying it in the TLB miss fastpath,
though -- the special level exception will handle that.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Cc: kvm-ppc@vger.kernel.org
2014-03-19 19:57:15 -05:00
Scott Wood 9d378dfac8 powerpc/booke64: Use SPRG7 for VDSO
Previously SPRG3 was marked for use by both VDSO and critical
interrupts (though critical interrupts were not fully implemented).

In commit 8b64a9dfb0 ("powerpc/booke64:
Use SPRG0/3 scratch for bolted TLB miss & crit int"), Mihai Caraman
made an attempt to resolve this conflict by restoring the VDSO value
early in the critical interrupt, but this has some issues:

 - It's incompatible with EXCEPTION_COMMON which restores r13 from the
   by-then-overwritten scratch (this cost me some debugging time).
 - It forces critical exceptions to be a special case handled
   differently from even machine check and debug level exceptions.
 - It didn't occur to me that it was possible to make this work at all
   (by doing a final "ld r13, PACA_EXCRIT+EX_R13(r13)") until after
   I made (most of) this patch. :-)

It might be worth investigating using a load rather than SPRG on return
from all exceptions (except TLB misses where the scratch never leaves
the SPRG) -- it could save a few cycles.  Until then, let's stick with
SPRG for all exceptions.

Since we cannot use SPRG4-7 for scratch without corrupting the state of
a KVM guest, move VDSO to SPRG7 on book3e.  Since neither SPRG4-7 nor
critical interrupts exist on book3s, SPRG3 is still used for VDSO
there.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: kvm-ppc@vger.kernel.org
2014-03-19 19:57:14 -05:00
Linus Torvalds 4907cdca72 A fix for a PowerPC bug that was introduced during the 3.14 merge window.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJTItl6AAoJEBvWZb6bTYby/EMP/A+/l9rU9oji5nns2ei2YVf7
 kmnpshb2rgDP6gTtPet3qPKcSUNq4bG7zPEZ04FG5H70PduDYBWchfT/jTZKxcBB
 rZgtrsMAzGjoxvtsQAc1yQiMtlkG28HD3yn0TC7BvazTQlZnzHVZTmiCZc9r0Qo7
 SjLslo00ITbhRVwkLi1cuSeuaNcim1WDgxeUFhpRd48HoywtkzGLGuU8dWmakwb0
 IDoo6X9srNlmZgrRdMaBS3agQtt1GoOE2T/KaJbnJ6lmnkGzBVKrw8kIZamQlwnK
 GnmJAM//1zu4eoWJquNuc9wqsZK2t0e68LXAL9HLeMQGyl1YBh5hTiInTUnXiCAI
 D/r/rkF2lqPgKotDBY/Q1l0bHABR53MOwa5ooZANLatnTwTv4gSpM8L1CAnekGhq
 dxqVYqeJr1+ZemhalO1pbi/V9CfpJHBBye9JDan8rJmkj25kCH7PF3BxLYgKckQs
 OtfZyOqpsZSCvpzFQa70VmWTGDqOxk686/R9ql88sdGbgyzZOHxHCeMIoAgCah6c
 o5s1CktKKLOmC2M6NNn1z44V1/LmmC2fgKuaSjI/y7rjgGwRoGrAV1giVMdnTc5y
 hGo4fD83/zHJnkafVdBQbmkKP69bC65a0w4iLXSxOKkPtHCI3c2IrZpovAC42aTp
 iREImat0+YzW4hDRI+qa
 =uZp5
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull another kvm fix from Paolo Bonzini:
 "A fix for a PowerPC bug that was introduced during the 3.14 merge
  window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Fix register usage when loading/saving VRSAVE
  KVM: PPC: Book3S HV: Remove bogus duplicate code
2014-03-18 11:32:08 -07:00
Paolo Bonzini 8fbb1daf3e Merge branch 'kvm-ppc-fix' into HEAD 2014-03-14 16:06:30 +01:00
Paul Mackerras e724f080f5 KVM: PPC: Book3S HV: Fix register usage when loading/saving VRSAVE
Commit 595e4f7e69 ("KVM: PPC: Book3S HV: Use load/store_fp_state
functions in HV guest entry/exit") changed the register usage in
kvmppc_save_fp() and kvmppc_load_fp() but omitted changing the
instructions that load and save VRSAVE.  The result is that the
VRSAVE value was loaded from a constant address, and saved to a
location past the end of the vcpu struct, causing host kernel
memory corruption and various kinds of host kernel crashes.

This fixes the problem by using register r31, which contains the
vcpu pointer, instead of r3 and r4.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-13 10:47:01 +01:00
Paul Mackerras a5b0ccb0b5 KVM: PPC: Book3S HV: Remove bogus duplicate code
Commit 7b490411c3 ("KVM: PPC: Book3S HV: Add new state for
transactional memory") incorrectly added some duplicate code to the
guest exit path because I didn't manage to clean up after a rebase
correctly.  This removes the extraneous material.  The presence of
this extraneous code causes host crashes whenever a guest is run.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-13 10:46:52 +01:00
Linus Torvalds e2a0f813e0 Second batch of KVM updates. Some minor x86 fixes,
two s390 guest features that need some handling in the host,
 and all the PPC changes.  The PPC changes include support for
 little-endian guests and enablement for new POWER8 features.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS6UF5AAoJEBvWZb6bTYby55kP/AgTJnyu7avN653/2aSHkjkx
 KgYSMYhZPIFoY5LyZuNetXaoXFRvCykux1VYSZ6V6s35h2PZ+hdJNbHGjFYKPGTq
 FQ92xQVNuWCAPxmFCjDNuDV/0BauG5y08/Orh/jpjz+GAfH43LruUQGbtXUuyJ8u
 vf+yTHniU5gguqsAmodqjHUgbf+GoPJ1j7hmRoWwt8IWm7Ns3v/IK4l0p6G0h26a
 RjE6aK+Tm208Yr5hD/dRAqeTbBNt3c4xub+QPsKoiEMaZBSuAOiux7D3Kx+If1gp
 WsmqEQxoymihVtkZhUFO9ONLJepvmG2QwJVVyMSUW9iqxX9rraXsvVyVMwcQAhog
 JuOAYxBftH07xu6Fs4eym5KvCFghM+EaJvxxt+kgnvdD4htK1+eK5trntc2zygSi
 /qGiIrkqjXpkskW8kujLayF0eAU3CrZvFWveEPBfFgYiOGX/2wzJCtSm/bt9Jo0M
 v60qgNFK3LNqAyeEfnm9VtlwGr6ZgsAB6DHNPX4fM5s2IBjL+qloXk/e/+aVKkW0
 I3yeRdy/ExhLAab6w81JtMeR7G3YS0UNuAEVvcoxzNb5wIBY8qnpfUzTKyMxQR94
 64EVpxWEYO1s55eCCyMujWrSvc+YAwhJcWHGKgC4K7mxxLD3FVyQXX6YZvgRozMX
 HjQju+DToj9CskyrFlRL
 =yd0Z
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "Second batch of KVM updates.  Some minor x86 fixes, two s390 guest
  features that need some handling in the host, and all the PPC changes.

  The PPC changes include support for little-endian guests and
  enablement for new POWER8 features"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (45 commits)
  x86, kvm: correctly access the KVM_CPUID_FEATURES leaf at 0x40000101
  x86, kvm: cache the base of the KVM cpuid leaves
  kvm: x86: move KVM_CAP_HYPERV_TIME outside #ifdef
  KVM: PPC: Book3S PR: Cope with doorbell interrupts
  KVM: PPC: Book3S HV: Add software abort codes for transactional memory
  KVM: PPC: Book3S HV: Add new state for transactional memory
  powerpc/Kconfig: Make TM select VSX and VMX
  KVM: PPC: Book3S HV: Basic little-endian guest support
  KVM: PPC: Book3S HV: Add support for DABRX register on POWER7
  KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells
  KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8
  KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs
  KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap
  KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8
  KVM: PPC: Book3S HV: Add handler for HV facility unavailable
  KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8
  KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs
  KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers
  KVM: PPC: Book3S HV: Don't set DABR on POWER8
  kvm/ppc: IRQ disabling cleanup
  ...
2014-01-31 08:37:32 -08:00
Paolo Bonzini b73117c493 Merge branch 'kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm-queue
Conflicts:
	arch/powerpc/kvm/book3s_hv_rmhandlers.S
	arch/powerpc/kvm/booke.c
2014-01-29 18:29:01 +01:00
Linus Torvalds 1b17366d69 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Ben Herrenschmidt:
 "So here's my next branch for powerpc.  A bit late as I was on vacation
  last week.  It's mostly the same stuff that was in next already, I
  just added two patches today which are the wiring up of lockref for
  powerpc, which for some reason fell through the cracks last time and
  is trivial.

  The highlights are, in addition to a bunch of bug fixes:

   - Reworked Machine Check handling on kernels running without a
     hypervisor (or acting as a hypervisor).  Provides hooks to handle
     some errors in real mode such as TLB errors, handle SLB errors,
     etc...

   - Support for retrieving memory error information from the service
     processor on IBM servers running without a hypervisor and routing
     them to the memory poison infrastructure.

   - _PAGE_NUMA support on server processors

   - 32-bit BookE relocatable kernel support

   - FSL e6500 hardware tablewalk support

   - A bunch of new/revived board support

   - FSL e6500 deeper idle states and altivec powerdown support

  You'll notice a generic mm change here, it has been acked by the
  relevant authorities and is a pre-req for our _PAGE_NUMA support"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (121 commits)
  powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked()
  powerpc: Add support for the optimised lockref implementation
  powerpc/powernv: Call OPAL sync before kexec'ing
  powerpc/eeh: Escalate error on non-existing PE
  powerpc/eeh: Handle multiple EEH errors
  powerpc: Fix transactional FP/VMX/VSX unavailable handlers
  powerpc: Don't corrupt transactional state when using FP/VMX in kernel
  powerpc: Reclaim two unused thread_info flag bits
  powerpc: Fix races with irq_work
  Move processing of MCE queued event out from syscall exit path.
  pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
  powerpc: Make add_system_ram_resources() __init
  powerpc: add SATA_MV to ppc64_defconfig
  powerpc/powernv: Increase candidate fw image size
  powerpc: Add debug checks to catch invalid cpu-to-node mappings
  powerpc: Fix the setup of CPU-to-Node mappings during CPU online
  powerpc/iommu: Don't detach device without IOMMU group
  powerpc/eeh: Hotplug improvement
  powerpc/eeh: Call opal_pci_reinit() on powernv for restoring config space
  powerpc/eeh: Add restore_config operation
  ...
2014-01-27 21:11:26 -08:00
Paul Mackerras 4068890931 KVM: PPC: Book3S PR: Cope with doorbell interrupts
When the PR host is running on a POWER8 machine in POWER8 mode, it
will use doorbell interrupts for IPIs.  If one of them arrives while
we are in the guest, we pop out of the guest with trap number 0xA00,
which isn't handled by kvmppc_handle_exit_pr, leading to the following
BUG_ON:

[  331.436215] exit_nr=0xa00 | pc=0x1d2c | msr=0x800000000000d032
[  331.437522] ------------[ cut here ]------------
[  331.438296] kernel BUG at arch/powerpc/kvm/book3s_pr.c:982!
[  331.439063] Oops: Exception in kernel mode, sig: 5 [#2]
[  331.439819] SMP NR_CPUS=1024 NUMA pSeries
[  331.440552] Modules linked in: tun nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw virtio_net kvm binfmt_misc ibmvscsi scsi_transport_srp scsi_tgt virtio_blk
[  331.447614] CPU: 11 PID: 1296 Comm: qemu-system-ppc Tainted: G      D      3.11.7-200.2.fc19.ppc64p7 #1
[  331.448920] task: c0000003bdc8c000 ti: c0000003bd32c000 task.ti: c0000003bd32c000
[  331.450088] NIP: d0000000025d6b9c LR: d0000000025d6b98 CTR: c0000000004cfdd0
[  331.451042] REGS: c0000003bd32f420 TRAP: 0700   Tainted: G      D       (3.11.7-200.2.fc19.ppc64p7)
[  331.452331] MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI>  CR: 28004824  XER: 20000000
[  331.454616] SOFTE: 1
[  331.455106] CFAR: c000000000848bb8
[  331.455726]
GPR00: d0000000025d6b98 c0000003bd32f6a0 d0000000026017b8 0000000000000032
GPR04: c0000000018627f8 c000000001873208 320d0a3030303030 3030303030643033
GPR08: c000000000c490a8 0000000000000000 0000000000000000 0000000000000002
GPR12: 0000000028004822 c00000000fdc6300 0000000000000000 00000100076ec310
GPR16: 000000002ae343b8 00003ffffd397398 0000000000000000 0000000000000000
GPR20: 00000100076f16f4 00000100076ebe60 0000000000000008 ffffffffffffffff
GPR24: 0000000000000000 0000008001041e60 0000000000000000 0000008001040ce8
GPR28: c0000003a2d80000 0000000000000a00 0000000000000001 c0000003a2681810
[  331.466504] NIP [d0000000025d6b9c] .kvmppc_handle_exit_pr+0x75c/0xa80 [kvm]
[  331.466999] LR [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm]
[  331.467517] Call Trace:
[  331.467909] [c0000003bd32f6a0] [d0000000025d6b98] .kvmppc_handle_exit_pr+0x758/0xa80 [kvm] (unreliable)
[  331.468553] [c0000003bd32f750] [d0000000025d98f0] kvm_start_lightweight+0xb4/0xc4 [kvm]
[  331.469189] [c0000003bd32f920] [d0000000025d7648] .kvmppc_vcpu_run_pr+0xd8/0x270 [kvm]
[  331.469838] [c0000003bd32f9c0] [d0000000025cf748] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
[  331.470790] [c0000003bd32fa50] [d0000000025cc19c] .kvm_arch_vcpu_ioctl_run+0x5c/0x1b0 [kvm]
[  331.471401] [c0000003bd32fae0] [d0000000025c4888] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
[  331.472026] [c0000003bd32fc90] [c00000000026192c] .do_vfs_ioctl+0x4dc/0x7a0
[  331.472561] [c0000003bd32fd80] [c000000000261cc4] .SyS_ioctl+0xd4/0xf0
[  331.473095] [c0000003bd32fe30] [c000000000009ed8] syscall_exit+0x0/0x98
[  331.473633] Instruction dump:
[  331.473766] 4bfff9b4 2b9d0800 419efc18 60000000 60420000 3d220000 e8bf11a0 e8df12a8
[  331.474733] 7fa4eb78 e8698660 48015165 e8410028 <0fe00000> 813f00e4 3ba00000 39290001
[  331.475386] ---[ end trace 49fc47d994c1f8f2 ]---
[  331.479817]

This fixes the problem by making kvmppc_handle_exit_pr() recognize the
interrupt.  We also need to jump to the doorbell interrupt handler in
book3s_segment.S to handle the interrupt on the way out of the guest.
Having done that, there's nothing further to be done in
kvmppc_handle_exit_pr().

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:23 +01:00
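
Conceptually the fix adds a case for the 0xA00 trap to the exit dispatch so it
is treated as a handled interrupt rather than falling through to the BUG_ON; a
hedged fragment (constant name and surrounding code are stand-ins, not the
literal patch):

    #define SKETCH_INTERRUPT_DOORBELL 0xa00   /* trap number from the oops */

    switch (exit_nr) {
    /* ... existing cases ... */
    case SKETCH_INTERRUPT_DOORBELL:
            /* A host doorbell IPI popped us out of the guest; nothing more
             * to do here, just go back into the guest. */
            r = RESUME_GUEST;
            break;
    /* ... */
    }
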
Michael Neuling 7b490411c3 KVM: PPC: Book3S HV: Add new state for transactional memory
Add new state for transactional memory (TM) to kvm_vcpu_arch.  Also add
asm-offset bits that are going to be required.

This also moves the existing TFHAR, TFIAR and TEXASR SPRs into a
CONFIG_PPC_TRANSACTIONAL_MEM section.  This requires some code changes to
ensure we still compile with CONFIG_PPC_TRANSACTIONAL_MEM=N.  Many of the
added #ifdefs are removed in a later patch when the bulk of the TM code is
added.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix merge conflict]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:20 +01:00
Anton Blanchard d682916a38 KVM: PPC: Book3S HV: Basic little-endian guest support
We create a guest MSR from scratch when delivering exceptions in
a few places.  Instead of extracting LPCR[ILE] and inserting it
into MSR_LE each time, we simply create a new variable intr_msr which
contains the entire MSR to use.  For a little-endian guest, userspace
needs to set the ILE (interrupt little-endian) bit in the LPCR for
each vcpu (or at least one vcpu in each virtual core).

[paulus@samba.org - removed H_SET_MODE implementation from original
version of the patch, and made kvmppc_set_lpcr update vcpu->arch.intr_msr.]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:16 +01:00
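
A standalone sketch of the idea of building intr_msr once from LPCR[ILE]
(the bit positions below are placeholders for illustration, not the
architectural ones):

    #include <stdint.h>
    #include <stdio.h>

    /* Placeholder bit definitions, for illustration only. */
    #define SKETCH_LPCR_ILE (1ULL << 1)    /* "deliver interrupts in LE" */
    #define SKETCH_MSR_SF   (1ULL << 63)   /* 64-bit mode */
    #define SKETCH_MSR_LE   (1ULL << 0)    /* little-endian */

    /* Build the exception-delivery MSR once per vcpu instead of
     * extracting LPCR[ILE] on every exception. */
    static uint64_t make_intr_msr(uint64_t lpcr)
    {
            uint64_t intr_msr = SKETCH_MSR_SF;
            if (lpcr & SKETCH_LPCR_ILE)
                    intr_msr |= SKETCH_MSR_LE;
            return intr_msr;
    }

    int main(void)
    {
            printf("BE guest: %#llx\n", (unsigned long long)make_intr_msr(0));
            printf("LE guest: %#llx\n",
                   (unsigned long long)make_intr_msr(SKETCH_LPCR_ILE));
            return 0;
    }
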
Paul Mackerras 8563bf52d5 KVM: PPC: Book3S HV: Add support for DABRX register on POWER7
The DABRX (DABR extension) register on POWER7 processors provides finer
control over which accesses cause a data breakpoint interrupt.  It
contains 3 bits which indicate whether to enable accesses in user,
kernel and hypervisor modes respectively to cause data breakpoint
interrupts, plus one bit that enables both real mode and virtual mode
accesses to cause interrupts.  Currently, KVM sets DABRX to allow
both kernel and user accesses to cause interrupts while in the guest.

This adds support for the guest to specify other values for DABRX.
PAPR defines a H_SET_XDABR hcall to allow the guest to set both DABR
and DABRX with one call.  This adds a real-mode implementation of
H_SET_XDABR, which shares most of its code with the existing H_SET_DABR
implementation.  To support this, we add a per-vcpu field to store the
DABRX value plus code to get and set it via the ONE_REG interface.

For Linux guests to use this new hcall, userspace needs to add
"hcall-xdabr" to the set of strings in the /chosen/hypertas-functions
property in the device tree.  If userspace does this and then migrates
the guest to a host where the kernel doesn't include this patch, then
userspace will need to implement H_SET_XDABR by writing the specified
DABR value to the DABR using the ONE_REG interface.  In that case, the
old kernel will set DABRX to DABRX_USER | DABRX_KERNEL.  That should
still work correctly, at least for Linux guests, since Linux guests
cope with getting data breakpoint interrupts in modes that weren't
requested by just ignoring the interrupt, and Linux guests never set
DABRX_BTI.

The other thing this does is to make H_SET_DABR and H_SET_XDABR work
on POWER8, which has the DAWR and DAWRX instead of DABR/X.  Guests that
know about POWER8 should use H_SET_MODE rather than H_SET_[X]DABR, but
guests running in POWER7 compatibility mode will still use H_SET_[X]DABR.
For them, this adds the logic to convert DABR/X values into DAWR/X values
on POWER8.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:15 +01:00
Paul Mackerras 5d00f66b86 KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells
POWER8 has support for hypervisor doorbell interrupts.  Though the
kernel doesn't use them for IPIs on the powernv platform yet, it
probably will in future, so this makes KVM cope gracefully if a
hypervisor doorbell interrupt arrives while in a guest.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:13 +01:00
Paul Mackerras e0622bd9f2 KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8
POWER8 has a bit in the LPCR to enable or disable the PURR and SPURR
registers to count when in the guest.  Set this bit.

POWER8 has a field in the LPCR called AIL (Alternate Interrupt Location)
which is used to enable relocation-on interrupts.  Allow userspace to
set this field.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:11 +01:00
Paul Mackerras aa31e84322 KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs
* SRR1 wake reason field for system reset interrupt on wakeup from nap
  is now a 4-bit field on P8, compared to 3 bits on P7.

* Set PECEDP in LPCR when napping because of H_CEDE so guest doorbells
  will wake us up.

* Waking up from nap because of a guest doorbell interrupt is not a
  reason to exit the guest.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:10 +01:00
Paul Mackerras e3bbbbfa13 KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap
Currently in book3s_hv_rmhandlers.S we have three places where we
have woken up from nap mode and we check the reason field in SRR1
to see what event woke us up.  This consolidates them into a new
function, kvmppc_check_wake_reason.  It looks at the wake reason
field in SRR1, and if it indicates that an external interrupt caused
the wakeup, calls kvmppc_read_intr to check what sort of interrupt
it was.

This also consolidates the two places where we synthesize an external
interrupt (0x500 vector) for the guest.  Now, if the guest exit code
finds that there was an external interrupt which has been handled
(i.e. it was an IPI indicating that there is now an interrupt pending
for the guest), it jumps to deliver_guest_interrupt, which is in the
last part of the guest entry code, where we synthesize guest external
and decrementer interrupts.  That code has been streamlined a little
and now clears LPCR[MER] when appropriate as well as setting it.

The extra clearing of any pending IPI on a secondary, offline CPU
thread before going back to nap mode has been removed.  It is no longer
necessary now that we have code to read and acknowledge IPIs in the
guest exit path.

This fixes a minor bug in the H_CEDE real-mode handling - previously,
if we found that other threads were already exiting the guest when we
were about to go to nap mode, we would branch to the cede wakeup path
and end up looking in SRR1 for a wakeup reason.  Now we branch to a
point after we have checked the wakeup reason.

This also fixes a minor bug in kvmppc_read_intr - previously it could
return 0xff rather than 1, in the case where we find that a host IPI
is pending after we have cleared the IPI.  Now it returns 1.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:08 +01:00
Paul Mackerras 5557ae0ec7 KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8
This allows us to select architecture 2.05 (POWER6) or 2.06 (POWER7)
compatibility modes on a POWER8 processor.  (Note that transactional
memory is disabled for usermode if either or both of the PCR_TM_DIS
and PCR_ARCH_206 bits are set.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:06 +01:00
Michael Ellerman bd3048b80c KVM: PPC: Book3S HV: Add handler for HV facility unavailable
At present this should never happen, since the host kernel sets
HFSCR to allow access to all facilities.  It's better to be prepared
to handle it cleanly if it does ever happen, though.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:04 +01:00
Paul Mackerras ca25205513 KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8
POWER8 has 512 sets in the TLB, compared to 128 for POWER7, so we need
to do more tlbiel instructions when flushing the TLB on POWER8.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:02 +01:00
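
In outline the flush just iterates over however many congruence classes the
chip has; a standalone sketch (the tlbiel itself is replaced by a stand-in
function, and the set counts are the ones quoted above):

    #include <stdio.h>

    #define POWER7_TLB_SETS 128
    #define POWER8_TLB_SETS 512

    /* Stand-in for issuing one tlbiel targeting a single TLB set. */
    static void flush_one_tlb_set(int set)
    {
            (void)set;  /* real code would issue the instruction here */
    }

    static void flush_guest_tlb(int is_power8)
    {
            int sets = is_power8 ? POWER8_TLB_SETS : POWER7_TLB_SETS;
            for (int i = 0; i < sets; i++)
                    flush_one_tlb_set(i);
    }

    int main(void)
    {
            flush_guest_tlb(1);
            puts("flushed 512 sets");
            return 0;
    }
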
Michael Neuling b005255e12 KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs
This adds fields to the struct kvm_vcpu_arch to store the new
guest-accessible SPRs on POWER8, adds code to the get/set_one_reg
functions to allow userspace to access this state, and adds code to
the guest entry and exit to context-switch these SPRs between host
and guest.

Note that DPDES (Directed Privileged Doorbell Exception State) is
shared between threads on a core; hence we store it in struct
kvmppc_vcore and have the master thread save and restore it.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:01:00 +01:00
Paul Mackerras e0b7ec058c KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers
On a threaded processor such as POWER7, we group VCPUs into virtual
cores and arrange that the VCPUs in a virtual core run on the same
physical core.  Currently we don't enforce any correspondence between
virtual thread numbers within a virtual core and physical thread
numbers.  Physical threads are allocated starting at 0 on a first-come
first-served basis to runnable virtual threads (VCPUs).

POWER8 implements a new "msgsndp" instruction which guest kernels can
use to interrupt other threads in the same core or sub-core.  Since
the instruction takes the destination physical thread ID as a parameter,
it becomes necessary to align the physical thread IDs with the virtual
thread IDs, that is, to make sure virtual thread N within a virtual
core always runs on physical thread N.

This means that it's possible that thread 0, which is where we call
__kvmppc_vcore_entry, may end up running some other vcpu than the
one whose task called kvmppc_run_core(), or it may end up running
no vcpu at all, if for example thread 0 of the virtual core is
currently executing in userspace.  However, we do need thread 0
to be responsible for switching the MMU -- a previous version of
this patch that had other threads switching the MMU was found to
be responsible for occasional memory corruption and machine check
interrupts in the guest on POWER7 machines.

To accommodate this, we no longer pass the vcpu pointer to
__kvmppc_vcore_entry, but instead let the assembly code load it from
the PACA.  Since the assembly code will need to know the kvm pointer
and the thread ID for threads which don't have a vcpu, we move the
thread ID into the PACA and we add a kvm pointer to the virtual core
structure.

In the case where thread 0 has no vcpu to run, it still calls into
kvmppc_hv_entry in order to do the MMU switch, and then naps until
either its vcpu is ready to run in the guest, or some other thread
needs to exit the guest.  In the latter case, thread 0 jumps to the
code that switches the MMU back to the host.  This control flow means
that now we switch the MMU before loading any guest vcpu state.
Similarly, on guest exit we now save all the guest vcpu state before
switching the MMU back to the host.  This has required substantial
code movement, making the diff rather large.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:00:59 +01:00
Michael Neuling eee7ff9d2c KVM: PPC: Book3S HV: Don't set DABR on POWER8
POWER8 doesn't have the DABR and DABRX registers; instead it has
new DAWR/DAWRX registers, which will be handled in a later patch.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:00:57 +01:00
Scott Wood 6c85f52b10 kvm/ppc: IRQ disabling cleanup
Simplify the handling of lazy EE by going directly from fully-enabled
to hard-disabled.  This replaces the lazy_irq_pending() check
(including its misplaced kvm_guest_exit() call).

As suggested by Tiejun Chen, move the interrupt disabling into
kvmppc_prepare_to_enter() rather than have each caller do it.  Also
move the IRQ enabling on heavyweight exit into
kvmppc_prepare_to_enter().

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:00:55 +01:00
Mihai Caraman 70713fe315 KVM: PPC: e500: Fix bad address type in deliver_tlb_miss()
Use gva_t instead of unsigned int for eaddr in deliver_tlb_miss().

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
CC: stable@vger.kernel.org
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:00:54 +01:00
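
A quick standalone demonstration of why the type matters: an unsigned int
silently drops the upper bits of a 64-bit effective address, while a 64-bit
type (as gva_t is on 64-bit builds) keeps them:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint64_t eaddr = 0x00000001deadbeefULL;   /* address above 4 GiB */

            unsigned int truncated = (unsigned int)eaddr;  /* too narrow */
            uint64_t     kept      = eaddr;                /* wide enough */

            printf("original:     %#llx\n", (unsigned long long)eaddr);
            printf("unsigned int: %#x   (upper bits lost)\n", truncated);
            printf("64-bit type:  %#llx\n", (unsigned long long)kept);
            return 0;
    }
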
Andreas Schwab 48eaef0518 KVM: PPC: Book3S HV: use xics_wake_cpu only when defined
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
CC: stable@vger.kernel.org
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:00:52 +01:00
Cédric Le Goater 736017752d KVM: PPC: Book3S: MMIO emulation support for little endian guests
MMIO emulation reads the last instruction executed by the guest
and then emulates it. If the guest is running in Little Endian order,
or more generally in a different endian order from the host, the
instruction needs to be byte-swapped before being emulated.

This patch adds a helper routine which tests the endian order of
the host and the guest in order to decide whether a byteswap is
needed or not. It is then used to byteswap the last instruction
of the guest in the endian order of the host before MMIO emulation
is performed.

Finally, kvmppc_handle_load() and kvmppc_handle_store() are modified
to reverse the endianness of the MMIO if required.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[agraf: add booke handling]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-27 16:00:39 +01:00
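
The core of such a helper can be sketched in standalone C: compare guest and
host byte order and swap the fetched 32-bit instruction only when they differ
(names are illustrative, not the kernel's; the swap uses the GCC
__builtin_bswap32 builtin):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* True when the guest runs with the opposite byte order to the host,
     * so instructions read from guest memory must be swapped. */
    static bool need_byteswap(bool guest_is_le, bool host_is_le)
    {
            return guest_is_le != host_is_le;
    }

    static uint32_t fixup_last_inst(uint32_t raw, bool guest_le, bool host_le)
    {
            return need_byteswap(guest_le, host_le) ? __builtin_bswap32(raw)
                                                    : raw;
    }

    int main(void)
    {
            uint32_t raw = 0x78563412;   /* as fetched on a mismatched host */
            printf("%#x -> %#x\n", raw, fixup_last_inst(raw, true, false));
            return 0;
    }
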
Linus Torvalds 7ebd3faa9b First round of KVM updates for 3.14; PPC parts will come next week.
Nothing major here, just bugfixes all over the place.  The most
 interesting part is the ARM guys' virtualized interrupt controller
 overhaul, which lets userspace get/set the state and thus enables
 migration of ARM VMs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU
 uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH
 5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0
 iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8
 FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd
 Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ
 dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z
 2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR
 l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy
 X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF
 hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu
 x7N8KR5hAj+mLBoM9/Al
 =8sYU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "First round of KVM updates for 3.14; PPC parts will come next week.

  Nothing major here, just bugfixes all over the place.  The most
  interesting part is the ARM guys' virtualized interrupt controller
  overhaul, which lets userspace get/set the state and thus enables
  migration of ARM VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
  kvm: make KVM_MMU_AUDIT help text more readable
  KVM: s390: Fix memory access error detection
  KVM: nVMX: Update guest activity state field on L2 exits
  KVM: nVMX: Fix nested_run_pending on activity state HLT
  KVM: nVMX: Clean up handling of VMX-related MSRs
  KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
  KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
  KVM: nVMX: Leave VMX mode on clearing of feature control MSR
  KVM: VMX: Fix DR6 update on #DB exception
  KVM: SVM: Fix reading of DR6
  KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
  add support for Hyper-V reference time counter
  KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
  KVM: x86: fix tsc catchup issue with tsc scaling
  KVM: x86: limit PIT timer frequency
  KVM: x86: handle invalid root_hpa everywhere
  kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
  kvm: vfio: silence GCC warning
  KVM: ARM: Remove duplicate include
  arm/arm64: KVM: relax the requirements of VMA alignment for THP
  ...
2014-01-22 21:40:43 -08:00
Zhouyi Zhou 47d45d9f53 KVM: PPC: NULL return of kvmppc_mmu_hpte_cache_next should be handled
A NULL return from kvmppc_mmu_hpte_cache_next() should be handled.

Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:11 +01:00
Tiejun Chen 9bd880a2c8 KVM: PPC: Book3E HV: call RECONCILE_IRQ_STATE to sync the software state
Rather than calling hard_irq_disable() when we're back in C code,
we can just call RECONCILE_IRQ_STATE to soft-disable IRQs while
we're already in the hard-disabled state.

This should be functionally equivalent to the code before, but
cleaner and faster.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
[agraf: fix comment, commit message]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:10 +01:00
Bharat Bhushan 08c9a188d0 kvm: powerpc: use caching attributes as per linux pte
KVM uses the same WIM TLB attributes as the corresponding qemu pte.
For this we now search the linux pte for the requested page and
get the caching/coherency attributes from that pte.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Reviewed-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:08 +01:00
Bharat Bhushan 7c85e6b39c kvm: book3s: rename lookup_linux_pte() to lookup_linux_pte_and_update()
lookup_linux_pte() does more than a lookup: it also updates the pte,
so for clarity it is renamed to lookup_linux_pte_and_update().

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Reviewed-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:06 +01:00
Bharat Bhushan 30a91fe24b kvm: booke: clear host tlb reference flag on guest tlb invalidation
On booke, "struct tlbe_ref" contains the host TLB mapping information for a
guest TLB entry ("pfn" maps the guest pfn to a host pfn; "flags" holds the
attributes associated with that mapping). When a guest creates a TLB entry,
"struct tlbe_ref" is set to point to a valid "pfn" and the attributes are
recorded in its "flags" field. When a guest TLB entry is invalidated, the
"flags" field of the corresponding "struct tlbe_ref" is updated to mark it as
no longer valid, and we also selectively clear some other attribute bits: for
example, if E500_TLB_BITMAP was set we clear E500_TLB_BITMAP, and if
E500_TLB_TLB0 was set we clear that.

Ideally we should clear the complete "flags" field, since the entry is invalid
and nothing in it can be reused. The other part of the problem is that when we
reuse the same entry we do not clear it either (we started OR-ing flags in,
etc.).

So far this has worked because the selective clearing mentioned above happens
to clear exactly the "flags" that were set during TLB mapping. The problem
starts when more attributes are added, because then they too would have to be
selectively cleared, which should not be necessary.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Reviewed-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:04 +01:00
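
The point being made is "clear the whole field rather than individual bits"; a
minimal standalone sketch (the struct layout and flag names are stand-ins
patterned on the description above):

    #include <stdint.h>
    #include <stdio.h>

    #define E500_TLB_BITMAP (1u << 0)
    #define E500_TLB_TLB0   (1u << 1)
    #define E500_TLB_VALID  (1u << 31)

    struct tlbe_ref_sketch {
            uint64_t pfn;
            uint32_t flags;
    };

    /* Old approach (sketch): clear only the bits we know about. */
    static void invalidate_selective(struct tlbe_ref_sketch *ref)
    {
            ref->flags &= ~(E500_TLB_VALID | E500_TLB_BITMAP | E500_TLB_TLB0);
    }

    /* Fixed approach: the mapping is gone, nothing in flags is reusable. */
    static void invalidate_full(struct tlbe_ref_sketch *ref)
    {
            ref->flags = 0;
    }

    int main(void)
    {
            struct tlbe_ref_sketch ref = { .pfn = 42,
                                           .flags = E500_TLB_VALID | E500_TLB_BITMAP };
            invalidate_full(&ref);
            printf("flags after invalidate: %#x\n", ref.flags);
            (void)invalidate_selective;
            return 0;
    }
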
Paul Mackerras 595e4f7e69 KVM: PPC: Book3S HV: Use load/store_fp_state functions in HV guest entry/exit
This modifies kvmppc_load_fp and kvmppc_save_fp to use the generic
FP/VSX and VMX load/store functions instead of open-coding the
FP/VSX/VMX load/store instructions.  Since kvmppc_load/save_fp don't
follow C calling conventions, we make them private symbols within
book3s_hv_rmhandlers.S.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:03 +01:00
Paul Mackerras 99dae3bad2 KVM: PPC: Load/save FP/VMX/VSX state directly to/from vcpu struct
Now that we have the vcpu floating-point and vector state stored in
the same type of struct as the main kernel uses, we can load that
state directly from the vcpu struct instead of having extra copies
to/from the thread_struct.  Similarly, when the guest state needs to
be saved, we can have it saved it directly to the vcpu struct by
setting the current->thread.fp_save_area and current->thread.vr_save_area
pointers.  That also means that we don't need to back up and restore
userspace's FP/vector state.  This all makes the code simpler and
faster.

Note that it's not necessary to save or modify current->thread.fpexc_mode,
since nothing in KVM uses or is affected by its value.  Nor is it
necessary to touch used_vr or used_vsr.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:02 +01:00
Paul Mackerras efff191223 KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structures
This uses struct thread_fp_state and struct thread_vr_state to store
the floating-point, VMX/Altivec and VSX state, rather than flat arrays.
This makes transferring the state to/from the thread_struct simpler
and allows us to unify the get/set_one_reg implementations for the
VSX registers.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:00 +01:00
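
For orientation only, the two containers are, roughly, "32 FP/VSX registers
plus FPSCR" and "32 vector registers plus VSCR"; a hedged sketch of that shape
(field names and widths are assumptions here, not copied from the kernel
headers):

    #include <stdint.h>

    /* Rough shape only: one container for FP/VSX state and one for
     * VMX/Altivec state, so the vcpu and the thread_struct can share
     * the same layout and be loaded/saved with the same helpers. */
    struct fp_state_sketch {
            uint64_t fpr[32][2];   /* FP registers, doubled up for VSX */
            uint64_t fpscr;        /* FP status and control register */
    };

    struct vr_state_sketch {
            uint8_t vr[32][16];    /* 32 x 128-bit vector registers */
            uint8_t vscr[16];      /* vector status and control */
    };

    int main(void)
    {
            return 0;
    }
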
Paul Mackerras 09548fdaf3 KVM: PPC: Use load_fp/vr_state rather than load_up_fpu/altivec
The load_up_fpu and load_up_altivec functions were never intended to
be called from C, and do things like modifying the MSR value in their
callers' stack frames, which are assumed to be interrupt frames.  In
addition, on 32-bit Book S they require the MMU to be off.

This makes KVM use the new load_fp_state() and load_vr_state() functions
instead of load_up_fpu/altivec.  This means we can remove the assembler
glue in book3s_rmhandlers.S, and potentially fixes a bug on Book E,
where load_up_fpu was called directly from C.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:14:59 +01:00