This way, we can avoid checking the user space address many times when
we read the guest memory.
Although we could do the same for writes by checking which slots are
writable, we do not care about writes for now: reading the guest memory
happens more often than writing.
[avi: change VERIFY_READ to VERIFY_WRITE]
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
When we optimized walk_addr_generic() by not using the generic guest
memory reader, we replaced copy_from_user() with get_user():
commit e30d2a170506830d5eef5e9d7990c5aedf1b0a51
KVM: MMU: Optimize guest page table walk
commit 15e2ac9a43d4d7d08088e404fddf2533a8e7d52e
KVM: MMU: Fix 64-bit paging breakage on x86_32
But as Andi pointed out later, copy_from_user() does the same as
get_user() as long as we give a constant size to it.
So we use copy_from_user() to clean up the code.
The only noticeable regression introduced by this is in 64-bit gpte
reads on x86_32 hosts, which are needed for PAE guests.
But this can be mitigated by implementing 8-byte get_user() for x86_32,
if needed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Since the emulator now checks segment limits and access rights, it
generates a lot more accesses to the vmcs segment fields. Undo some
of the performance hit by caching those fields in a read-only cache
(the entire cache is invalidated on any write, or on guest exit).
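
The caching scheme can be pictured with a small user-space sketch. This is
not the KVM code: seg_cache, slow_read_field(), cached_read() and
write_field() are hypothetical names, and a plain array stands in for the
vmcs. A bitmask records which fields are cached, and any write (or a guest
exit, via cache_invalidate()) drops the whole cache.

```c
#include <assert.h>
#include <stdint.h>

#define NR_FIELDS 8   /* hypothetical number of cached segment fields */

struct seg_cache {
    uint32_t valid;               /* bit i set => value[i] is current */
    uint32_t value[NR_FIELDS];
};

/* Stand-in for the expensive read (a vmcs read in the real code). */
static uint32_t backing[NR_FIELDS];
static unsigned slow_reads;       /* counts how often the slow path ran */

static uint32_t slow_read_field(unsigned field)
{
    slow_reads++;
    return backing[field];
}

/* Read through the cache, filling the entry on first use. */
static uint32_t cached_read(struct seg_cache *c, unsigned field)
{
    assert(field < NR_FIELDS);
    if (!(c->valid & (1u << field))) {
        c->value[field] = slow_read_field(field);
        c->valid |= 1u << field;
    }
    return c->value[field];
}

/* Called on any write and on guest exit: forget everything. */
static void cache_invalidate(struct seg_cache *c)
{
    c->valid = 0;
}

/* The cache is read-only: a write updates the backing store and invalidates. */
static void write_field(struct seg_cache *c, unsigned field, uint32_t v)
{
    assert(field < NR_FIELDS);
    backing[field] = v;
    cache_invalidate(c);
}
```

Invalidating everything on a write keeps the write path trivial, at the
cost of refilling entries that were still correct.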
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of separate accessors for the segment selector and cached descriptor,
use one accessor for both. This simplifies the code somewhat.
Signed-off-by: Avi Kivity <avi@redhat.com>
dump_vmcb() isn't used outside this module, so make it static.
Shrink text and object by ~1% by standardizing formats.
  $ size arch/x86/kvm/svm.o*
     text    data     bss     dec     hex filename
    52910     580   10072   63562    f84a arch/x86/kvm/svm.o.new
    53563     580   10072   64215    fad7 arch/x86/kvm/svm.o.old
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Fix regression introduced by
commit e30d2a170506830d5eef5e9d7990c5aedf1b0a51
KVM: MMU: Optimize guest page table walk
On x86_32, get_user() does not support 64-bit values and we fail to
build KVM at the point of 64-bit paging.
This patch fixes this by using get_user() twice for that condition.
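
A rough user-space sketch of the idea, not the patch itself: read_u32() is
a hypothetical stand-in for a 32-bit get_user(), and read_u64_split() is an
illustrative name. The low word is read first, matching the x86 memory
layout of a 64-bit entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 32-bit get_user(): returns 0 on success, -1 on fault. */
static int read_u32(uint32_t *dst, const void *src)
{
    assert(dst != NULL);
    if (!src)
        return -1;
    memcpy(dst, src, sizeof(*dst));
    return 0;
}

/* Read a 64-bit entry as two 32-bit halves, low word first. */
static int read_u64_split(uint64_t *dst, const void *src)
{
    uint32_t lo, hi;

    assert(dst != NULL);
    if (!src)
        return -1;
    if (read_u32(&lo, src) || read_u32(&hi, (const char *)src + 4))
        return -1;
    *dst = ((uint64_t)hi << 32) | lo;
    return 0;
}
```

Note that in this sketch the two halves are not read atomically.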
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Add the CPUIDs for Centaur so that the features of the PadLock
hardware engine on VIA CPUs, such as "ace" and "ace_en", can be
passed into the kvm guest.
Signed-off-by: Brilly Wu <brillywu@viatech.com.cn>
Signed-off-by: Kary Jin <karyjin@viatech.com.cn>
Signed-off-by: Avi Kivity <avi@redhat.com>
mmio_index should be taken into account when copying data from
userspace.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Move all groups into a single field and handle them in a single place. This
saves bits when we add more group types (3 bits -> 7 group types).
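
The bit arithmetic can be sketched as follows. The shift, the mask and the
group names are assumptions made for illustration, not the emulator's
actual flag layout: a 3-bit field holds 0 for "no group" and 1..7 for up to
seven group types.

```c
#include <assert.h>
#include <stdint.h>

#define GROUP_SHIFT 20                   /* hypothetical bit position */
#define GROUP_MASK  (7u << GROUP_SHIFT)  /* 3 bits: 0 = none, 1..7 = types */

/* Hypothetical group types; up to seven non-zero values fit. */
enum group_type { GROUP_NONE, GROUP_A, GROUP_B, GROUP_C };

/* Store a group type in the flags word without touching other bits. */
static uint32_t set_group(uint32_t flags, enum group_type g)
{
    assert((unsigned)g <= 7);
    return (flags & ~GROUP_MASK) | ((uint32_t)g << GROUP_SHIFT);
}

/* Extract the group type so it can be handled in a single switch. */
static enum group_type get_group(uint32_t flags)
{
    return (enum group_type)((flags & GROUP_MASK) >> GROUP_SHIFT);
}
```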
Signed-off-by: Avi Kivity <avi@redhat.com>
walk_addr_generic() is a hot path and is also hard for the cpu to predict -
some of the parameters (fetch_fault in particular) vary wildly from
invocation to invocation.
Add unlikely() annotations where appropriate; all walk failures are
considered unlikely, as are cases where we have to mark the accessed or
dirty bit, as they are slow paths both in kvm and on real processors.
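
For illustration only, a user-space sketch of such annotations. The macros
have the same shape as the kernel's and rely on a GCC/Clang builtin;
walk_step() is a hypothetical single step, not walk_addr_generic().

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel macros; relies on a GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define PTE_PRESENT  (1u << 0)
#define PTE_ACCESSED (1u << 5)

/* Hypothetical one-level walk step: returns 0 on success, -1 on fault. */
static int walk_step(unsigned *pte)
{
    assert(pte != NULL);
    if (unlikely(!(*pte & PTE_PRESENT)))
        return -1;                 /* walk failure: slow path */
    if (unlikely(!(*pte & PTE_ACCESSED)))
        *pte |= PTE_ACCESSED;      /* marking accessed: slow path */
    return 0;
}
```

The annotations do not change behaviour; they only hint the compiler to lay
out the common case as the fall-through path.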
Signed-off-by: Avi Kivity <avi@redhat.com>
For this, emulate_pusha/popa() are converted to em_pusha/popa().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
In addition, the RET emulation is changed to call em_pop() to remove
the pop_instruction label.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
The following instructions are changed to use opcode::execute.
Group 1 (80-83)
ADD (00-05), OR (08-0D), ADC (10-15), SBB (18-1D), AND (20-25),
SUB (28-2D), XOR (30-35), CMP (38-3D)
CMPS (A6-A7), SCAS (AE-AF)
The last two do the same as CMP in the emulator, so em_cmp() is used.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch optimizes the guest page table walk by using get_user()
instead of copy_from_user().
With this patch applied, paging64_walk_addr_generic() has become
about 0.5us to 1.0us faster on my Phenom II machine with NPT on.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
By reserving 0 as an invalid x86_intercept_stage, we no longer
need to store a valid flag in x86_intercept_map.
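
A minimal sketch of the technique, with hypothetical names (intercept_stage,
intercept_map, OP_FOO, OP_BAR) rather than the real emulator tables: since 0
is reserved as invalid, any table slot left out of the initializer is
zero-filled and therefore invalid without an extra flag.

```c
#include <assert.h>

enum intercept_stage {
    STAGE_INVALID = 0,     /* reserved: zero-initialised entries are invalid */
    STAGE_PRE_EXCEPT,
    STAGE_POST_EXCEPT,
};

struct intercept_map {
    int exit_code;
    enum intercept_stage stage;   /* no separate 'valid' flag needed */
};

enum { OP_FOO = 1, OP_BAR = 3, NR_OPS = 4 };  /* hypothetical opcodes */

/* Slots 0 and 2 are not listed, so they are zero-filled and invalid. */
static const struct intercept_map map[NR_OPS] = {
    [OP_FOO] = { .exit_code = 0x10, .stage = STAGE_PRE_EXCEPT },
    [OP_BAR] = { .exit_code = 0x20, .stage = STAGE_POST_EXCEPT },
};

static int has_intercept(unsigned op)
{
    assert(op < NR_OPS);
    return map[op].stage != STAGE_INVALID;
}
```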
Signed-off-by: Avi Kivity <avi@redhat.com>
While it isn't defined, there is no need to force a #UD. If it becomes
defined in the future, this can cause weird problems for the guest.
Signed-off-by: Avi Kivity <avi@redhat.com>
arch/x86/kvm/emulate.c:2598: warning: integer constant is too large for 'long' type
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Commit 0b56652e33c72092956c651ab6ceb9f0ad081153 fails to build:
CC [M] arch/x86/kvm/emulate.o
arch/x86/kvm/emulate.c: In function 'x86_emulate_insn':
arch/x86/kvm/emulate.c:4095:25: error: macro "wbinvd" passed 1 arguments, but takes just 0
arch/x86/kvm/emulate.c:4095:3: warning: statement with no effect
make[2]: *** [arch/x86/kvm/emulate.o] Error 1
make[1]: *** [arch/x86/kvm] Error 2
make: *** [arch/x86] Error 2
Work around this for now.
Signed-off-by: Clemens Noss <cnoss@gmx.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch makes the cmpxchg_gpte() function aware of the
difference between l1-gfns and l2-gfns when nested
virtualization is in use. This fixes a potential
data-corruption problem in the l1-guest and makes the code
work correctly (at least as correctly as the hardware which is
emulated in this code) again.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avoid using ctxt->vcpu; we can do everything with ->get_cr() and ->set_cr().
A side effect is that we no longer activate the fpu on emulated CLTS; but that
should be very rare.
Signed-off-by: Avi Kivity <avi@redhat.com>
Making the emulator caller agnostic.
[Takuya Yoshikawa: fix typo leading to LDT failures]
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Since segments need to be handled slightly differently when fetching
instructions, we add a __linearize helper that accepts a new 'fetch' boolean.
[avi: fix oops caused by wrong segmented_address initialization order]
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The last_guest_tsc has been used in vcpu_load to adjust the
tsc_offset since tsc-scaling was merged. So the
last_guest_tsc needs to be updated in vcpu_put instead of
the last_host_tsc. This is fixed with this patch.
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch fixes a bug in the nested-svm path when
decode-assists is available on the machine. After a
selective-cr0 intercept is detected the rip is advanced
unconditionally. This causes the l1-guest to continue
running with an l2-rip.
This bug was found with the sel_cr0 unit-test on decode-assists
capable hardware.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently, setting a large (i.e. negative) base address for %cs does not work on
a 64-bit host. The "JOS" teaching operating system, used by MIT and other
universities, relies on such segments while bootstrapping its way to full
virtual memory management.
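
One way to picture the problem, as a user-space sketch under stated
assumptions (linearize() here is a hypothetical helper and the values are
illustrative): outside long mode, base + offset must wrap modulo 2^32. On a
64-bit host the sum is computed in 64 bits, so the wrap has to be applied
explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Compute a linear address; outside long mode it wraps modulo 2^32. */
static uint64_t linearize(uint64_t base, uint64_t offset, int long_mode)
{
    uint64_t la;

    /* Outside long mode, base and offset are assumed to be 32-bit values. */
    assert(long_mode || (base <= UINT32_MAX && offset <= UINT32_MAX));

    la = base + offset;
    if (!long_mode)
        la &= 0xffffffffULL;   /* wrap as a 32-bit CPU would */
    return la;
}
```

For example, a base of 0x10000000 (that is, -0xF0000000 modulo 2^32) and an
offset of 0xF0100000 give 0x00100000 once the sum is truncated.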
Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Remove the useless definition of kvm_inject_pit_timer_irqs() from
arch/x86/kvm/i8254.h.
Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Remove the useless definitions of kvm_pic_clear_isr_ack() and
pit_has_pending_timer().
Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When doing a soft int, we need to bump eip before pushing it to
the stack. Otherwise we'll do the int a second time.
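
A simplified sketch of the ordering, not the emulator code: struct cpu,
soft_int() and iret() are hypothetical, and only eip is pushed (a real
interrupt also pushes flags and cs).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 8

struct cpu {
    uint32_t eip;
    uint32_t stack[STACK_SLOTS];
    unsigned sp;                   /* number of used stack slots */
};

/* Deliver a software interrupt from an instruction of length insn_len. */
static void soft_int(struct cpu *c, unsigned insn_len, uint32_t handler)
{
    assert(c->sp < STACK_SLOTS);
    c->eip += insn_len;            /* bump eip past the INT instruction first */
    c->stack[c->sp++] = c->eip;    /* return address = next instruction */
    c->eip = handler;
}

/* Return from the handler by popping the saved eip. */
static void iret(struct cpu *c)
{
    assert(c->sp > 0);
    c->eip = c->stack[--c->sp];
}
```

If eip were pushed before being advanced, iret() would return to the INT
instruction itself and the interrupt would be raised a second time.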
[apw@canonical.com: merged eip update as per Jan's recommendation.]
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Avi Kivity <avi@redhat.com>