OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Aneesh Kumar K.V	06da28e76b	KVM: PPC: BOOK3S: PR: Emulate instruction counter Writing to IC is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:10 +02:00
Aneesh Kumar K.V	8f42ab2749	KVM: PPC: BOOK3S: PR: Emulate virtual timebase register virtual time base register is a per VM, per cpu register that needs to be saved and restored on vm exit and entry. Writing to VTB is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix compile error] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:21:50 +02:00
Linus Torvalds	b05d59dfce	At over 200 commits, covering almost all supported architectures, this was a pretty active cycle for KVM. Changes include: - a lot of s390 changes: optimizations, support for migration, GDB support and more - ARM changes are pretty small: support for the PSCI 0.2 hypercall interface on both the guest and the host (the latter acked by Catalin) - initial POWER8 and little-endian host support - support for running u-boot on embedded POWER targets - pretty large changes to MIPS too, completing the userspace interface and improving the handling of virtualized timer hardware - for x86, a larger set of changes is scheduled for 3.17. Still, we have a few emulator bugfixes and support for running nested fully-virtualized Xen guests (para-virtualized Xen guests have always worked). And some optimizations too. The only missing architecture here is ia64. It's not a coincidence that support for KVM on ia64 is scheduled for removal in 3.17. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTjtlBAAoJEBvWZb6bTYbyMOUP/2NAePghE3IjG99ikHFdn+BX BfrURsuR6GD0AhYQnBidBmpFbAmN/LwSJxv/M7sV7OBRWLu3qbt69DrPTU2e/FK1 j9q25peu8jRyHzJ1q9rBroo74nD9lQYuVr3uXNxxcg0DRnw14JHGlM3y8LDEknO8 W+gpWTeAQ+2AuOX98MpRbCRMuzziCSv5bP5FhBVnsWHiZfvMbcUrbeJt+zYSiDAZ 0tHm/5dFKzfj/vVrrnjD4EZcRr688Bs5rztG96hY6aoVJryjZGLtLp92wCWkRRmH CCvZwd245NmNthuKHzcs27/duSWfU0uOlu7AMrD44QYhzeDGyB/2nbCxbGqLLoBA nnOviXH4cC65/CnisZ79zfo979HbZcX+Lzg747EjBgCSxJmLlwgiG8yXtDvk5otB TH6GUeGDiEEPj//JD3XtgSz0sF2NvjREWRyemjDMvhz6JC/bLytXKb3sn+NXSj8m ujzF9eQoa4qKDcBL4IQYGTJ4z5nY3Pd68dHFIPHB7n82OxFLSQUBKxXw8/1fb5og VVb8PL4GOcmakQlAKtTMlFPmuy4bbL2r/2iV5xJiOZKmXIu8Hs1JezBE3SFAltbl 3cAGwSM9/dDkKxUbTFblyOE9bkKbg4WYmq0LkdzsPEomb3IZWntOT25rYnX+LrBz bAknaZpPiOrW11Et1htY =j5Od -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm into next Pull KVM updates from Paolo Bonzini: "At over 200 commits, covering almost all supported architectures, this was a pretty active cycle for KVM. Changes include: - a lot of s390 changes: optimizations, support for migration, GDB support and more - ARM changes are pretty small: support for the PSCI 0.2 hypercall interface on both the guest and the host (the latter acked by Catalin) - initial POWER8 and little-endian host support - support for running u-boot on embedded POWER targets - pretty large changes to MIPS too, completing the userspace interface and improving the handling of virtualized timer hardware - for x86, a larger set of changes is scheduled for 3.17. Still, we have a few emulator bugfixes and support for running nested fully-virtualized Xen guests (para-virtualized Xen guests have always worked). And some optimizations too. The only missing architecture here is ia64. It's not a coincidence that support for KVM on ia64 is scheduled for removal in 3.17" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits) KVM: add missing cleanup_srcu_struct KVM: PPC: Book3S PR: Rework SLB switching code KVM: PPC: Book3S PR: Use SLB entry 0 KVM: PPC: Book3S HV: Fix machine check delivery to guest KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs KVM: PPC: Book3S HV: Make sure we don't miss dirty pages KVM: PPC: Book3S HV: Fix dirty map for hugepages KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates() KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number KVM: PPC: Book3S: Add ONE_REG register names that were missed KVM: PPC: Add CAP to indicate hcall fixes KVM: PPC: MPIC: Reset IRQ source private members KVM: PPC: Graciously fail broken LE hypercalls PPC: ePAPR: Fix hypercall on LE guest KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler KVM: PPC: BOOK3S: Always use the saved DAR value PPC: KVM: Make NX bit available with magic page KVM: PPC: Disable NX for old magic page using guests KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest ...	2014-06-04 08:47:12 -07:00
Alexander Graf	2e23f54413	KVM: PPC: Book3S PR: Expose EBB registers POWER8 introduces a new facility called the "Event Based Branch" facility. It contains of a few registers that indicate where a guest should branch to when a defined event occurs and it's in PR mode. We don't want to really enable EBB as it will create a big mess with !PR guest mode while hardware is in PR and we don't really emulate the PMU anyway. So instead, let's just leave it at emulation of all its registers. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-05-30 14:26:23 +02:00
Alexander Graf	e14e7a1e53	KVM: PPC: Book3S PR: Expose TAR facility to guest POWER8 implements a new register called TAR. This register has to be enabled in FSCR and then from KVM's point of view is mere storage. This patch enables the guest to use TAR. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-05-30 14:26:23 +02:00
Alexander Graf	616dff8602	KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR POWER8 introduced a new interrupt type called "Facility unavailable interrupt" which contains its status message in a new register called FSCR. Handle these exits and try to emulate instructions for unhandled facilities. Follow-on patches enable KVM to expose specific facilities into the guest. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-05-30 14:26:22 +02:00
Alexander Graf	5deb8e7ad8	KVM: PPC: Make shared struct aka magic page guest endian The shared (magic) page is a data structure that contains often used supervisor privileged SPRs accessible via memory to the user to reduce the number of exits we have to take to read/write them. When we actually share this structure with the guest we have to maintain it in guest endianness, because some of the patch tricks only work with native endian load/store operations. Since we only share the structure with either host or guest in little endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv. For booke, the shared struct stays big endian. For book3s_64 hv we maintain the struct in host native endian, since it never gets shared with the guest. For book3s_64 pr we introduce a variable that tells us which endianness the shared struct is in and route every access to it through helper inline functions that evaluate this variable. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-05-30 14:26:21 +02:00
Alexander Graf	ab78475c76	KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit The book3s_32 target can get built as module which means we don't see the config define for it in code. Instead, check on the bool define CONFIG_KVM_BOOK3S_32_HANDLER whenever we want to know whether we're building for a book3s_32 host. This fixes running book3s_32 kvm as a module for me. Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2014-04-28 12:35:42 +02:00
Paul Mackerras	efff191223	KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structures This uses struct thread_fp_state and struct thread_vr_state to store the floating-point, VMX/Altivec and VSX state, rather than flat arrays. This makes transferring the state to/from the thread_struct simpler and allows us to unify the get/set_one_reg implementations for the VSX registers. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-01-09 10:15:00 +01:00
Alexander Graf	398a76c677	KVM: PPC: Add devname:kvm aliases for modules Systems that support automatic loading of kernel modules through device aliases should try and automatically load kvm when /dev/kvm gets opened. Add code to support that magic for all PPC kvm targets, even the ones that don't support modules yet. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-01-09 10:14:00 +01:00
Aneesh Kumar K.V	a78b55d1c0	kvm: powerpc: book3s: drop is_hv_enabled drop is_hv_enabled, because that should not be a callback property Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 18:43:34 +02:00
Aneesh Kumar K.V	cbbc58d4fd	kvm: powerpc: book3s: Allow the HV and PR selection per virtual machine This moves the kvmppc_ops callbacks to be a per VM entity. This enables us to select HV and PR mode when creating a VM. We also allow both kvm-hv and kvm-pr kernel module to be loaded. To achieve this we move /dev/kvm ownership to kvm.ko module. Depending on which KVM mode we select during VM creation we take a reference count on respective module Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix coding style] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 18:42:36 +02:00
Aneesh Kumar K.V	5587027ce9	kvm: Add struct kvm arg to memslot APIs We will use that in the later patch to find the kvm ops handler Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:49:23 +02:00
Aneesh Kumar K.V	2ba9f0d887	kvm: powerpc: book3s: Support building HV and PR KVM as module Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: squash in compile fix] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:45:35 +02:00
Aneesh Kumar K.V	699cc87641	kvm: powerpc: book3s: Add is_hv_enabled to kvmppc_ops This help us to identify whether we are running with hypervisor mode KVM enabled. The change is needed so that we can have both HV and PR kvm enabled in the same kernel. If both HV and PR KVM are included, interrupts come in to the HV version of the kvmppc_interrupt code, which then jumps to the PR handler, renamed to kvmppc_interrupt_pr, if the guest is a PR guest. Allowing both PR and HV in the same kernel required some changes to kvm_dev_ioctl_check_extension(), since the values returned now can't be selected with #ifdefs as much as previously. We look at is_hv_enabled to return the right value when checking for capabilities.For capabilities that are only provided by HV KVM, we return the HV value only if is_hv_enabled is true. For capabilities provided by PR KVM but not HV, we return the PR value only if is_hv_enabled is false. NOTE: in later patch we replace is_hv_enabled with a static inline function comparing kvm_ppc_ops Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:29:09 +02:00
Aneesh Kumar K.V	3a167beac0	kvm: powerpc: Add kvmppc_ops callback This patch add a new callback kvmppc_ops. This will help us in enabling both HV and PR KVM together in the same kernel. The actual change to enable them together is done in the later patch in the series. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: squash in booke changes] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:24:26 +02:00
Paul Mackerras	93b159b466	KVM: PPC: Book3S PR: Better handling of host-side read-only pages Currently we request write access to all pages that get mapped into the guest, even if the guest is only loading from the page. This reduces the effectiveness of KSM because it means that we unshare every page we access. Also, we always set the changed (C) bit in the guest HPTE if it allows writing, even for a guest load. This fixes both these problems. We pass an 'iswrite' flag to the mmu.xlate() functions and to kvmppc_mmu_map_page() to indicate whether the access is a load or a store. The mmu.xlate() functions now only set C for stores. kvmppc_gfn_to_pfn() now calls gfn_to_pfn_prot() instead of gfn_to_pfn() so that it can indicate whether we need write access to the page, and get back a 'writable' flag to indicate whether the page is writable or not. If that 'writable' flag is clear, we then make the host HPTE read-only even if the guest HPTE allowed writing. This means that we can get a protection fault when the guest writes to a page that it has mapped read-write but which is read-only on the host side (perhaps due to KSM having merged the page). Thus we now call kvmppc_handle_pagefault() for protection faults as well as HPTE not found faults. In kvmppc_handle_pagefault(), if the access was allowed by the guest HPTE and we thus need to install a new host HPTE, we then need to remove the old host HPTE if there is one. This is done with a new function, kvmppc_mmu_unmap_page(), which uses kvmppc_mmu_pte_vflush() to find and remove the old host HPTE. Since the memslot-related functions require the KVM SRCU read lock to be held, this adds srcu_read_lock/unlock pairs around the calls to kvmppc_handle_pagefault(). Finally, this changes kvmppc_mmu_book3s_32_xlate_pte() to not ignore guest HPTEs that don't permit access, and to return -EPERM for accesses that are not permitted by the page protections. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:35 +02:00
Paul Mackerras	c0867fd509	KVM: PPC: Book3S: Add GET/SET_ONE_REG interface for VRSAVE The VRSAVE register value for a vcpu is accessible through the GET/SET_SREGS interface for Book E processors, but not for Book 3S processors. In order to make this accessible for Book 3S processors, this adds a new register identifier for GET/SET_ONE_REG, and adds the code to implement it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:44:59 +02:00
Paul Mackerras	8b78645c93	KVM: PPC: Book3S: Facilities to save/restore XICS presentation ctrler state This adds the ability for userspace to save and restore the state of the XICS interrupt presentation controllers (ICPs) via the KVM_GET/SET_ONE_REG interface. Since there is one ICP per vcpu, we simply define a new 64-bit register in the ONE_REG space for the ICP state. The state includes the CPU priority setting, the pending IPI priority, and the priority and source number of any pending external interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-04-26 20:27:34 +02:00
Benjamin Herrenschmidt	bc5ad3f370	KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller This adds in-kernel emulation of the XICS (eXternal Interrupt Controller Specification) interrupt controller specified by PAPR, for both HV and PR KVM guests. The XICS emulation supports up to 1048560 interrupt sources. Interrupt source numbers below 16 are reserved; 0 is used to mean no interrupt and 2 is used for IPIs. Internally these are represented in blocks of 1024, called ICS (interrupt controller source) entities, but that is not visible to userspace. Each vcpu gets one ICP (interrupt controller presentation) entity, used to store the per-vcpu state such as vcpu priority, pending interrupt state, IPI request, etc. This does not include any API or any way to connect vcpus to their ICP state; that will be added in later patches. This is based on an initial implementation by Michael Ellerman <michael@ellerman.id.au> reworked by Benjamin Herrenschmidt and Paul Mackerras. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: fix typo, add dependency on !KVM_MPIC] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-04-26 20:27:30 +02:00
Bharat Bhushan	092d62ee93	KVM: PPC: debug stub interface parameter defined This patch defines the interface parameter for KVM_SET_GUEST_DEBUG ioctl support. Follow up patches will use this for setting up hardware breakpoints, watchpoints and software breakpoints. Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below. This is because I am not sure what is required for book3s. So this ioctl behaviour will not change for book3s. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-04-26 20:27:02 +02:00
Bharat Bhushan	8c32a2ea65	Added ONE_REG interface for debug instruction This patch adds the one_reg interface to get the special instruction to be used for setting software breakpoint from userspace. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-04-17 15:21:14 +02:00
Paul Mackerras	4fe27d2add	KVM: PPC: Remove unused argument to kvmppc_core_dequeue_external Currently kvmppc_core_dequeue_external() takes a struct kvm_interrupt * argument and does nothing with it, in any of its implementations. This removes it in order to make things easier for forthcoming in-kernel interrupt controller emulation code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-03-22 01:21:17 +01:00
Paul Mackerras	a8bd19ef4d	KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface This enables userspace to get and set all the guest floating-point state using the KVM_[GS]ET_ONE_REG ioctls. The floating-point state includes all of the traditional floating-point registers and the FPSCR (floating point status/control register), all the VMX/Altivec vector registers and the VSCR (vector status/control register), and on POWER7, the vector-scalar registers (note that each FP register is the high-order half of the corresponding VSR). Most of these are implemented in common Book 3S code, except for VSX on POWER7. Because HV and PR differ in how they store the FP and VSX registers on POWER7, the code for these cases is not common. On POWER7, the FP registers are the upper halves of the VSX registers vsr0 - vsr31. PR KVM stores vsr0 - vsr31 in two halves, with the upper halves in the arch.fpr[] array and the lower halves in the arch.vsr[] array, whereas HV KVM on POWER7 stores the whole VSX register in arch.vsr[]. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: fix whitespace, vsx compilation] Signed-off-by: Alexander Graf <agraf@suse.de>	2012-10-05 23:38:54 +02:00
Paul Mackerras	a136a8bdc0	KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface This enables userspace to get and set various SPRs (special-purpose registers) using the KVM_[GS]ET_ONE_REG ioctls. With this, userspace can get and set all the SPRs that are part of the guest state, either through the KVM_[GS]ET_REGS ioctls, the KVM_[GS]ET_SREGS ioctls, or the KVM_[GS]ET_ONE_REG ioctls. The SPRs that are added here are: - DABR: Data address breakpoint register - DSCR: Data stream control register - PURR: Processor utilization of resources register - SPURR: Scaled PURR - DAR: Data address register - DSISR: Data storage interrupt status register - AMR: Authority mask register - UAMOR: User authority mask override register - MMCR0, MMCR1, MMCRA: Performance monitor unit control registers - PMC1..PMC8: Performance monitor unit counter registers In order to reduce code duplication between PR and HV KVM code, this moves the kvm_vcpu_ioctl_[gs]et_one_reg functions into book3s.c and centralizes the copying between user and kernel space there. The registers that are handled differently between PR and HV, and those that exist only in one flavor, are handled in kvmppc_[gs]et_one_reg() functions that are specific to each flavor. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: minimal style fixes] Signed-off-by: Alexander Graf <agraf@suse.de>	2012-10-05 23:38:54 +02:00
Bharat Bhushan	f61c94bb99	KVM: PPC: booke: Add watchdog emulation This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> [bharat.bhushan@freescale.com: reworked patch] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> [agraf: adjust to new request framework] Signed-off-by: Alexander Graf <agraf@suse.de>	2012-10-05 23:38:47 +02:00
Benjamin Herrenschmidt	bbcc9c0669	powerpc/kvm: Fix magic page vs. 32-bit RTAS on ppc64 When the kernel calls into RTAS, it switches to 32-bit mode. The magic page was is longer accessible in that case, causing the patched instructions in the RTAS call wrapper to crash. This fixes it by making available a 32-bit mapping of the magic page in that case. This mapping is flushed whenever we switch the kernel back to 64-bit mode. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: add a check if the magic page is mapped] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:02:39 +03:00
Alexander Graf	a8e4ef8414	KVM: PPC: booke: rework rescheduling checks Instead of checking whether we should reschedule only when we exited due to an interrupt, let's always check before entering the guest back again. This gets the target more in line with the other archs. Also while at it, generalize the whole thing so that eventually we could have a single kvmppc_prepare_to_enter function for all ppc targets that does signal and reschedule checking for us. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:05 +03:00
Paul Mackerras	82ed36164c	KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bit This changes the implementation of kvm_vm_ioctl_get_dirty_log() for Book3s HV guests to use the hardware C (changed) bits in the guest hashed page table. Since this makes the implementation quite different from the Book3s PR case, this moves the existing implementation from book3s.c to book3s_pr.c and creates a new implementation in book3s_hv.c. That implementation calls kvmppc_hv_get_dirty_log() to do the actual work by calling kvm_test_clear_dirty on each page. It iterates over the HPTEs, clearing the C bit if set, and returns 1 if any C bit was set (including the saved C bit in the rmap entry). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:39 +02:00
Scott Wood	dfd4d47e9a	KVM: PPC: booke: Improve timer register emulation Decrementers are now properly driven by TCR/TSR, and the guest has full read/write access to these registers. The decrementer keeps ticking (and setting the TSR bit) regardless of whether the interrupts are enabled with TCR. The decrementer stops at zero, rather than going negative. Decrementers (and FITs, once implemented) are delivered as level-triggered interrupts -- dequeued when the TSR bit is cleared, not on delivery. Signed-off-by: Liu Yu <yu.liu@freescale.com> [scottwood@freescale.com: significant changes] Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:27 +02:00
Scott Wood	b59049720d	KVM: PPC: Paravirtualize SPRG4-7, ESR, PIR, MASn This allows additional registers to be accessed by the guest in PR-mode KVM without trapping. SPRG4-7 are readable from userspace. On booke, KVM will sync these registers when it enters the guest, so that accesses from guest userspace will work. The guest kernel, OTOH, must consistently use either the real registers or the shared area between exits. This also applies to the already-paravirted SPRG3. On non-booke, it's not clear to what extent SPRG4-7 are supported (they're not architected for book3s, but exist on at least some classic chips). They are copied in the get/set regs ioctls, but I do not see any non-booke emulation. I also do not see any syncing with real registers (in PR-mode) including the user-readable SPRG3. This patch should not make that situation any worse. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:26 +02:00
Scott Wood	7e28e60ef9	KVM: PPC: Rename deliver_interrupts to prepare_to_enter This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:25 +02:00
Xiao Guangrong	28a37544fb	KVM: introduce id_to_memslot function Introduce id_to_memslot to get memslot by slot id Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-12-27 11:17:39 +02:00
Paul Gortmaker	66b15db69c	powerpc: add export.h to files making use of EXPORT_SYMBOL With module.h being implicitly everywhere via device.h, the absence of explicitly including something for EXPORT_SYMBOL went unnoticed. Since we are heading to fix things up and clean module.h from the device.h file, we need to explicitly include these files now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:37 -04:00
Paul Mackerras	3cf658b605	KVM: PPC: Deliver program interrupts right away instead of queueing them Doing so means that we don't have to save the flags anywhere and gets rid of the last reference to to_book3s(vcpu) in arch/powerpc/kvm/book3s.c. Doing so is OK because a program interrupt won't be generated at the same time as any other synchronous interrupt. If a program interrupt and an asynchronous interrupt (external or decrementer) are generated at the same time, the program interrupt will be delivered, which is correct because it has a higher priority, and then the asynchronous interrupt will be masked. We don't ever generate system reset or machine check interrupts to the guest, but if we did, then we would need to make sure they got delivered rather than the program interrupt. The current code would be wrong in this situation anyway since it would deliver the program interrupt as well as the reset/machine check interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:49 +03:00
Paul Mackerras	f05ed4d56e	KVM: PPC: Split out code from book3s.c into book3s_pr.c In preparation for adding code to enable KVM to use hypervisor mode on 64-bit Book 3S processors, this splits book3s.c into two files, book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is specific to running the guest in problem state (user mode) and book3s.c contains code which should apply to all Book 3S processors. In doing this, we abstract some details, namely the interrupt offset, updating the interrupt pending flag, and detecting if the guest is in a critical section. These are all things that will be different when we use hypervisor mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:47 +03:00
Paul Mackerras	c4befc58a0	KVM: PPC: Move fields between struct kvm_vcpu_arch and kvmppc_vcpu_book3s This moves the slb field, which represents the state of the emulated SLB, from the kvmppc_vcpu_book3s struct to the kvm_vcpu_arch, and the hpte_hash_[v]pte[_long] fields from kvm_vcpu_arch to kvmppc_vcpu_book3s. This is in accord with the principle that the kvm_vcpu_arch struct represents the state of the emulated CPU, and the kvmppc_vcpu_book3s struct holds the auxiliary data structures used in the emulation. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:46 +03:00
Paul Mackerras	149dbdb185	KVM: PPC: Fix machine checks on 32-bit Book3S Commit 69acc0d3ba ("KVM: PPC: Resolve real-mode handlers through function exports") resulted in vcpu->arch.trampoline_lowmem and vcpu->arch.trampoline_enter ending up with kernel virtual addresses rather than physical addresses. This is OK on 64-bit Book3S machines, which ignore the top 4 bits of the effective address in real mode, but on 32-bit Book3S machines, accessing these addresses in real mode causes machine check interrupts, as the hardware uses the whole effective address as the physical address in real mode. This fixes the problem by using __pa() to convert these addresses to physical addresses. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:45 +03:00
Alexander Graf	a22a2daccf	KVM: PPC: Resolve real-mode handlers through function exports Up until now, Book3S KVM had variables stored in the kernel that a kernel module or the kvm code in the kernel could read from to figure out where some real mode helper functions are located. This is all unnecessary. The high bits of the EA get ignore in real mode, so we can just use the pointer as is. Also, it's a lot easier on relocations when we use the normal way of resolving the address to a function, instead of jumping through hoops. This patch fixes compilation with CONFIG_RELOCATABLE=y. Signed-off-by: Alexander Graf <agraf@suse.de>	2011-07-12 13:16:29 +03:00
Paul Mackerras	44075d95e2	powerpc/kvm: Fix kvmppc_core_pending_dec The vcpu->arch.pending_exceptions field is a bitfield indexed by interrupt priority number as returned by kvmppc_book3s_vec2irqprio. However, kvmppc_core_pending_dec was using an interrupt vector shifted by 7 as the bit index. Fix it to use the irqprio value for the decrementer interrupt instead. This problem was found by code inspection. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-05-20 13:43:41 +10:00
Peter Tyser	bc9c1933d9	KVM: PPC: Fix SPRG get/set for Book3S and BookE Previously SPRGs 4-7 were improperly read and written in kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs(); Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Peter Tyser <ptyser@xes-inc.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:25 -03:00
Takuya Yoshikawa	2653503769	KVM: replace vmalloc and memset with vzalloc Let's use newly introduced vzalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:28:55 +02:00
Alexander Graf	17bd158006	KVM: PPC: Implement Level interrupts on Book3S The current interrupt logic is just completely broken. We get a notification from user space, telling us that an interrupt is there. But then user space expects us that we just acknowledge an interrupt once we deliver it to the guest. This is not how real hardware works though. On real hardware, the interrupt controller pulls the external interrupt line until it gets notified that the interrupt was received. So in reality we have two events: pulling and letting go of the interrupt line. To maintain backwards compatibility, I added a new request for the pulling part. The letting go part was implemented earlier already. With this in place, we can now finally start guests that do not randomly stall and stop to work at random times. This patch implements above logic for Book3S. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:19 +02:00
Alexander Graf	296c19d0b4	KVM: PPC: Don't put MSR_POW in MSR On Book3S a mtmsr with the MSR_POW bit set indicates that the OS is in idle and only needs to be waked up on the next interrupt. Now, unfortunately we let that bit slip into the stored MSR value which is not what the real CPU does, so that we ended up executing code like this: r = mfmsr(); /* r containts MSR_POW */ mtmsr(r \| MSR_EE); This obviously breaks, as we're going into idle mode in code sections that don't expect to be idling. This patch masks MSR_POW out of the stored MSR value on wakeup, making guests happy again. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:16 +02:00
Alexander Graf	9ee18b1e08	KVM: PPC: Update int_pending also on dequeue When having a decrementor interrupt pending, the dequeuing happens manually through an mtdec instruction. This instruction simply calls dequeue on that interrupt, so the int_pending hint doesn't get updated. This patch enables updating the int_pending hint also on dequeue, thus correctly enabling guests to stay in guest contexts more often. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:14 +02:00
Alexander Graf	df1bfa25d8	KVM: PPC: Put segment registers in shared page Now that the actual mtsr doesn't do anything anymore, we can move the sr contents over to the shared page, so a guest can directly read and write its sr contents from guest context. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:11 +02:00
Alexander Graf	8e8651783f	KVM: PPC: Interpret SR registers on demand Right now we're examining the contents of Book3s_32's segment registers when the register is written and put the interpreted contents into a struct. There are two reasons this is bad. For starters, the struct has worse real-time performance, as it occupies more ram. But the more important part is that with segment registers being interpreted from their raw values, we can put them in the shared page, allowing guests to mess with them directly. This patch makes the internal representation of SRs be u32s. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:11 +02:00
Alexander Graf	2e602847d9	KVM: PPC: Don't flush PTEs on NX/RO hit When hitting a no-execute or read-only data/inst storage interrupt we were flushing the respective PTE so we're sure it gets properly overwritten next. According to the spec, this is unnecessary though. The guest issues a tlbie anyways, so we're safe to just keep the PTE around and have it manually removed from the guest, saving us a flush. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:06 +02:00
Alexander Graf	4cb6b7ea0c	KVM: PPC: Preload magic page when in kernel mode When the guest jumps into kernel mode and has the magic page mapped, theres a very high chance that it will also use it. So let's detect that scenario and map the segment accordingly. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:05 +02:00
Alexander Graf	bed1ed9860	KVM: PPC: Move EXIT_DEBUG partially to tracepoints We have a debug printk on every exit that is usually #ifdef'ed out. Using tracepoints makes a lot more sense here though, as they can be dynamically enabled. This patch converts the most commonly used debug printks of EXIT_DEBUG to tracepoints. Signed-off-by: Alexander Graf <agraf@suse.de>	2010-10-24 10:52:00 +02:00
Wei Yongjun	646bab55a2	KVM: PPC: fix leakage of error page in kvmppc_patch_dcbz() Add kvm_release_page_clean() after is_error_page() to avoid leakage of error page. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:51:05 +02:00
Alexander Graf	e8508940a8	KVM: PPC: Magic Page Book3s support We need to override EA as well as PA lookups for the magic page. When the guest tells us to project it, the magic page overrides any guest mappings. In order to reflect that, we need to hook into all the MMU layers of KVM to force map the magic page if necessary. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:48 +02:00
Alexander Graf	28e83b4fa7	KVM: PPC: Make PAM a define On PowerPC it's very normal to not support all of the physical RAM in real mode. To check if we're matching on the shared page or not, we need to know the limits so we can restrain ourselves to that range. So let's make it a define instead of open-coding it. And while at it, let's also increase it. Signed-off-by: Alexander Graf <agraf@suse.de> v2 -> v3: - RMO -> PAM (non-magic page) Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:46 +02:00
Alexander Graf	90bba35887	KVM: PPC: Tell guest about pending interrupts When the guest turns on interrupts again, it needs to know if we have an interrupt pending for it. Because if so, it should rather get out of guest context and get the interrupt. So we introduce a new field in the shared page that we use to tell the guest that there's a pending interrupt lying around. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:46 +02:00
Alexander Graf	5c6cedf488	KVM: PPC: Add PV guest critical sections When running in hooked code we need a way to disable interrupts without clobbering any interrupts or exiting out to the hypervisor. To achieve this, we have an additional critical field in the shared page. If that field is equal to the r1 register of the guest, it tells the hypervisor that we're in such a critical section and thus may not receive any interrupts. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:46 +02:00
Alexander Graf	2a342ed577	KVM: PPC: Implement hypervisor interface To communicate with KVM directly we need to plumb some sort of interface between the guest and KVM. Usually those interfaces use hypercalls. This hypercall implementation is described in the last patch of the series in a special documentation file. Please read that for further information. This patch implements stubs to handle KVM PPC hypercalls on the host and guest side alike. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:45 +02:00
Alexander Graf	a73a9599e0	KVM: PPC: Convert SPRG[0-4] to shared page When in kernel mode there are 4 additional registers available that are simple data storage. Instead of exiting to the hypervisor to read and write those, we can just share them with the guest using the page. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:45 +02:00
Alexander Graf	de7906c36c	KVM: PPC: Convert SRR0 and SRR1 to shared page The SRR0 and SRR1 registers contain cached values of the PC and MSR respectively. They get written to by the hypervisor when an interrupt occurs or directly by the kernel. They are also used to tell the rfi(d) instruction where to jump to. Because it only gets touched on defined events that, it's very simple to share with the guest. Hypervisor and guest both have full r/w access. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:45 +02:00
Alexander Graf	5e030186df	KVM: PPC: Convert DAR to shared page. The DAR register contains the address a data page fault occured at. This register behaves pretty much like a simple data storage register that gets written to on data faults. There is no hypervisor interaction required on read or write. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:45 +02:00
Alexander Graf	d562de48de	KVM: PPC: Convert DSISR to shared page The DSISR register contains information about a data page fault. It is fully read/write from inside the guest context and we don't need to worry about interacting based on writes of this register. This patch converts all users of the current field to the shared page. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:44 +02:00
Alexander Graf	666e7252a1	KVM: PPC: Convert MSR to shared page One of the most obvious registers to share with the guest directly is the MSR. The MSR contains the "interrupts enabled" flag which the guest has to toggle in critical sections. So in order to bring the overhead of interrupt en- and disabling down, let's put msr into the shared page. Keep in mind that even though you can fully read its contents, writing to it doesn't always update all state. There are a few safe fields that don't require hypervisor interaction. See the documentation for a list of MSR bits that are safe to be set from inside the guest. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:43 +02:00
Alexander Graf	96bc451a15	KVM: PPC: Introduce shared page For transparent variable sharing between the hypervisor and guest, I introduce a shared page. This shared page will contain all the registers the guest can read and write safely without exiting guest context. This patch only implements the stubs required for the basic structure of the shared page. The actual register moving follows. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:50:42 +02:00
Alexander Graf	fef093bec0	KVM: PPC: Make use of hash based Shadow MMU We just introduced generic functions to handle shadow pages on PPC. This patch makes the respective backends make use of them, getting rid of a lot of duplicate code along the way. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:47:28 +03:00
Andreas Schwab	49f6be8ea1	KVM: PPC: elide struct thread_struct instances from stack Instead of instantiating a whole thread_struct on the stack use only the required parts of it. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Tested-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-08-01 10:39:24 +03:00
Avi Kivity	2122ff5eab	KVM: move vcpu locking to dispatcher for generic vcpu ioctls All vcpu ioctls need to be locked, so instead of locking each one specifically we lock at the generic dispatcher. This patch only updates generic ioctls and leaves arch specific ioctls alone. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-08-01 10:35:47 +03:00
Avi Kivity	98001d8d01	KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:41:10 +03:00
Avi Kivity	0ee75bead8	KVM: Let vcpu structure alignment be determined at runtime vmx and svm vcpus have different contents and therefore may have different alignmment requirements. Let each specify its required alignment. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:29 +03:00
Stephen Rothwell	329d20ba45	KVM: powerpc: use of kzalloc/kfree requires including slab.h Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:24 +03:00
Alexander Graf	b83d4a9cfc	KVM: PPC: Enable native paired singles When we're on a paired single capable host, we can just always enable paired singles and expose them to the guest directly. This approach breaks when multiple VMs run and access PS concurrently, but this should suffice until we get a proper framework for it in Linux. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:08 +03:00
Alexander Graf	f7bc74e1c3	KVM: PPC: Improve split mode When in split mode, instruction relocation and data relocation are not equal. So far we implemented this mode by reserving a special pseudo-VSID for the two cases and flushing all PTEs when going into split mode, which is slow. Unfortunately 32bit Linux and Mac OS X use split mode extensively. So to not slow down things too much, I came up with a different idea: Mark the split mode with a bit in the VSID and then treat it like any other segment. This means we can just flush the shadow segment cache, but keep the PTEs intact. I verified that this works with ppc32 Linux and Mac OS X 10.4 guests and does speed them up. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:58 +03:00
Alexander Graf	7fdaec997c	KVM: PPC: Make Performance Counters work When we get a performance counter interrupt we need to route it on to the Linux handler after we got out of the guest context. We also need to tell our handling code that this particular interrupt doesn't need treatment. So let's add those two bits in, making perf work while having a KVM guest running. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:57 +03:00
Alexander Graf	af7b4d104b	KVM: PPC: Convert u64 -> ulong There are some pieces in the code that I overlooked that still use u64s instead of longs. This slows down 32 bit hosts unnecessarily, so let's just move them to ulong. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:55 +03:00
Alexander Graf	ada7ba17b4	KVM: PPC: Check max IRQ prio We have a define on what the highest bit of IRQ priorities is. So we can just as well use it in the bit checking code and avoid invalid IRQ values to be triggered. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:51 +03:00
Alexander Graf	07b0907db1	KVM: PPC: Add Book3S compatibility code Some code we had so far required defines and had code that was completely Book3S_64 specific. Since we now opened book3s.c to Book3S_32 too, we need to take care of these pieces. So let's add some minor code where it makes sense to not go the Book3S_64 code paths and add compat defines on others. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:46 +03:00
Alexander Graf	61db97cc1e	KVM: PPC: Emulate segment fault Book3S_32 doesn't know about segment faults. It only knows about page faults. So in order to know that we didn't map a segment, we need to fake segment faults. We do this by setting invalid segment registers to an invalid VSID and then check for that VSID on normal page faults. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:45 +03:00
Alexander Graf	9cc5e9538a	KVM: PPC: Extract MMU init The host shadow mmu code needs to get initialized. It needs to fetch a segment it can use to put shadow PTEs into. That initialization code was in generic code, which is icky. Let's move it over to the respective MMU file. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:34 +03:00
Alexander Graf	c7f38f46f2	KVM: PPC: Improve indirect svcpu accessors We already have some inline fuctions we use to access vcpu or svcpu structs, depending on whether we're on booke or book3s. Since we just put a few more registers into the svcpu, we also need to make sure the respective callbacks are available and get used. So this patch moves direct use of the now in the svcpu struct fields to inline function calls. While at it, it also moves the definition of those inline function calls to respective header files for booke and book3s, greatly improving readability. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:26 +03:00
Alexander Graf	05b0ab1c0b	KVM: PPC: Disable MSR_FEx for Cell hosts Cell can't handle MSR_FE0 and MSR_FE1 too well. It gets dog slow. So let's just override the guest whenever we see one of the two and mask them out. See commit `ddf5f75a16` for reference. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:17:21 +03:00
Alexander Graf	9fb244a2c2	KVM: PPC: Fix dcbz emulation On most systems we need to emulate dcbz when running 32 bit guests. So far we've been rather slack, not giving correct DSISR values to the guest. This patch makes the emulation more accurate, introducing a difference between "page not mapped" and "write protection fault". While at it, it also speeds up dcbz emulation by an order of magnitude by using kmap. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:17:14 +03:00
Alexander Graf	a2b07664f6	KVM: PPC: Make build work without CONFIG_VSX/ALTIVEC The FPU/Altivec/VSX enablement also brought access to some structure elements that are only defined when the respective config options are enabled. Unfortuately I forgot to check for the config options at some places, so let's do that now. Unbreaks the build when CONFIG_VSX is not set. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:17:12 +03:00
Alexander Graf	ad0a048b09	KVM: PPC: Add OSI hypercall interface MOL uses its own hypercall interface to call back into userspace when the guest wants to do something. So let's implement that as an exit reason, specify it with a CAP and only really use it when userspace wants us to. The only user of it so far is MOL. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:17:10 +03:00
Alexander Graf	ca7f4203b9	KVM: PPC: Implement alignment interrupt Mac OS X has some applications - namely the Finder - that require alignment interrupts to work properly. So we need to implement them. But the spec for 970 and 750 also looks different. While 750 requires the DSISR and DAR fields to reflect some instruction bits (DSISR) and the fault address (DAR), the 970 declares this as an optional feature. So we need to reconstruct DSISR and DAR manually. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:17:07 +03:00
Alexander Graf	a56cf347c2	KVM: PPC: Load VCPU for register fetching When trying to read or store vcpu register data, we should also make sure the vcpu is actually loaded, so we're 100% sure we get the correct values. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:59 +03:00
Alexander Graf	c2453693d4	KVM: PPC: Don't reload FPU with invalid values When the guest activates the FPU, we load it up. That's fine when it wasn't activated before on the host, but if it was we end up reloading FPU values from last time the FPU was deactivated on the host without writing the proper values back to the vcpu struct. This patch checks if the FPU is enabled already and if so just doesn't bother activating it, making FPU operations survive guest context switches. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:57 +03:00
Alexander Graf	8963221d7d	KVM: PPC: Split instruction reading out The current check_ext function reads the instruction and then does the checking. Let's split the reading out so we can reuse it for different functions. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:56 +03:00
Alexander Graf	18978768d8	KVM: PPC: Allow userspace to unset the IRQ line Userspace can tell us that it wants to trigger an interrupt. But so far it can't tell us that it wants to stop triggering one. So let's interpret the parameter to the ioctl that we have anyways to tell us if we want to raise or lower the interrupt line. Signed-off-by: Alexander Graf <agraf@suse.de> v2 -> v3: - Add CAP for unset irq Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:51 +03:00
Alexander Graf	3eeafd7da2	KVM: PPC: Ensure split mode works On PowerPC we can go into MMU Split Mode. That means that either data relocation is on but instruction relocation is off or vice versa. That mode didn't work properly, as we weren't always flushing entries when going into a new split mode, potentially mapping different code or data that we're supposed to. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:49 +03:00
Alexander Graf	7e821d3920	KVM: PPC: Memset vcpu to zeros While converting the kzalloc we used to allocate our vcpu struct to vmalloc, I forgot to memset the contents to zeros. That broke quite a lot. This patch memsets it to zero again. Signed-off-by: Alexander Graf <alex@csgraf.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:39:21 +03:00
Alexander Graf	032c340731	KVM: PPC: Allocate vcpu struct using vmalloc We used to use get_free_pages to allocate our vcpu struct. Unfortunately that call failed on me several times after my machine had a big enough uptime, as memory became too fragmented by then. Fortunately, we don't need it to be page aligned any more! We can just vmalloc it and everything's great. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:38:04 +03:00
Alexander Graf	e5c29e926c	KVM: PPC: Enable program interrupt to do MMIO When we get a program interrupt we usually don't expect it to perform an MMIO operation. But why not? When we emulate paired singles, we can end up loading or storing to an MMIO address - and the handling of those happens in the program interrupt handler. So let's teach the program interrupt handler how to deal with EMULATE_MMIO. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:35:24 +03:00
Alexander Graf	aba3bd7ffe	KVM: PPC: Make ext giveup non-static We need to call the ext giveup handlers from code outside of book3s.c. So let's make it non-static. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:35:12 +03:00
Alexander Graf	5467a97d0f	KVM: PPC: Make software load/store return eaddr The Book3S KVM implementation contains some helper functions to load and store data from and to virtual addresses. Unfortunately, this helper used to keep the physical address it so nicely found out for us to itself. So let's change that and make it return the physical address it resolved. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:35:09 +03:00
Alexander Graf	d1bab74c51	KVM: PPC: Preload FPU when possible There are some situations when we're pretty sure the guest will use the FPU soon. So we can save the churn of going into the guest, finding out it does want to use the FPU and going out again. This patch adds preloading of the FPU when it's reasonable. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:34:59 +03:00
Alexander Graf	c8c0b6f2f7	KVM: PPC: Combine extension interrupt handlers When we for example get an Altivec interrupt, but our guest doesn't support altivec, we need to inject a program interrupt, not an altivec interrupt. The same goes for paired singles. When an altivec interrupt arrives, we're pretty sure we need to emulate the instruction because it's a paired single operation. So let's make all the ext handlers aware that they need to jump to the program interrupt handler when an extension interrupt arrives that was not supposed to arrive for the guest CPU. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:34:56 +03:00
Alexander Graf	3c402a75ea	KVM: PPC: Add hidden flag for paired singles The Gekko implements an extension called paired singles. When the guest wants to use that extension, we need to make sure we're not running the host FPU, because all FPU instructions need to get emulated to accomodate for additional operations that occur. This patch adds an hflag to track if we're in paired single mode or not. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:34:50 +03:00
Alexander Graf	37f5bca64e	KVM: PPC: Add AGAIN type for emulation return Emulation of an instruction can have different outcomes. It can succeed, fail, require MMIO, do funky BookE stuff - or it can just realize something's odd and will be fixed the next time around. Exactly that is what EMULATE_AGAIN means. Using that flag we can now tell the caller that nothing happened, but we still want to go back to the guest and see what happens next time we come around. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-25 12:34:47 +03:00
Takuya Yoshikawa	87bf6e7de1	KVM: fix the handling of dirty bitmaps to avoid overflows Int is not long enough to store the size of a dirty bitmap. This patch fixes this problem with the introduction of a wrapper function to calculate the sizes of dirty bitmaps. Note: in mark_page_dirty(), we have to consider the fact that __set_bit() takes the offset as int, not long. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-04-20 13:06:55 +03:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Alexander Graf	a76f8497fd	KVM: PPC: Move Shadow MSR calculation to function We keep a copy of the MSR around that we use when we go into the guest context. That copy is basically the normal process MSR flags OR some allowed guest specified MSR flags. We also AND the external providers into this, so we get traps on FPU usage when we haven't activated it on the host yet. Currently this calculation is part of the set_msr function that we use whenever we set the guest MSR value. With the external providers, we also have the case that we don't modify the guest's MSR, but only want to update the shadow MSR. So let's move the shadow MSR parts to a separate function that we then use whenever we only need to update it. That way we don't accidently kvm_vcpu_block within a preempt notifier context. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-03-01 12:35:56 -03:00
Alexander Graf	f7adbba1e5	KVM: PPC: Keep SRR1 flags around in shadow_msr SRR1 stores more information that just the MSR value. It also stores valuable information about the type of interrupt we received, for example whether the storage interrupt we just got was because of a missing htab entry or not. We use that information to speed up the exit path. Now if we get preempted before we can interpret the shadow_msr values, we get into vcpu_put which then calls the MSR handler, which then sets all the SRR1 information bits in shadow_msr to 0. Great. So let's preserve the SRR1 specific bits in shadow_msr whenever we set the MSR. They don't hurt. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-03-01 12:35:56 -03:00

1 2 3 4

165 Commits