OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Hollis Blanchard	c5fbdffbda	KVM: ppc: save and restore guest mappings on context switch Store shadow TLB entries in memory, but only use it on host context switch (instead of every guest entry). This improves performance for most workloads on 440 by reducing the guest TLB miss rate. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:09 +02:00
Hollis Blanchard	7924bd4109	KVM: ppc: directly insert shadow mappings into the hardware TLB Formerly, we used to maintain a per-vcpu shadow TLB and on every entry to the guest would load this array into the hardware TLB. This consumed 1280 bytes of memory (64 entries of 16 bytes plus a struct page pointer each), and also required some assembly to loop over the array on every entry. Instead of saving a copy in memory, we can just store shadow mappings directly into the hardware TLB, accepting that the host kernel will clobber these as part of the normal 440 TLB round robin. When we do that we need less than half the memory, and we have decreased the exit handling time for all guest exits, at the cost of increased number of TLB misses because the host overwrites some guest entries. These savings will be increased on processors with larger TLBs or which implement intelligent flush instructions like tlbivax (which will avoid the need to walk arrays in software). In addition to that and to the code simplification, we have a greater chance of leaving other host userspace mappings in the TLB, instead of forcing all subsequent tasks to re-fault all their mappings. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:09 +02:00
Hollis Blanchard	891686188f	KVM: ppc: support large host pages KVM on 440 has always been able to handle large guest mappings with 4K host pages -- we must, since the guest kernel uses 256MB mappings. This patch makes KVM work when the host has large pages too (tested with 64K). Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:07 +02:00
Hollis Blanchard	fe4e771d5c	KVM: ppc: fix userspace mapping invalidation on context switch We used to defer invalidating userspace TLB entries until jumping out of the kernel. This was causing MMU weirdness most easily triggered by using a pipe in the guest, e.g. "dmesg \| tail". I believe the problem was that after the guest kernel changed the PID (part of context switch), the old process's mappings were still present, and so copy_to_user() on the "return to new process" path ended up using stale mappings. Testing with large pages (64K) exposed the problem, probably because with 4K pages, pressure on the TLB faulted all process A's mappings out before the guest kernel could insert any for process B. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:26 +02:00
Hollis Blanchard	df9b856c45	KVM: ppc: use prefetchable mappings for guest memory Bare metal Linux on 440 can "overmap" RAM in the kernel linear map, so that it can use large (256MB) mappings even if memory isn't a multiple of 256MB. To prevent the hardware prefetcher from loading from an invalid physical address through that mapping, it's marked Guarded. However, KVM must ensure that all guest mappings are backed by real physical RAM (since a deliberate access through a guarded mapping could still cause a machine check). Accordingly, we don't need to make our mappings guarded, so let's allow prefetching as the designers intended. Curiously this patch didn't affect performance at all on the quick test I tried, but it's clearly the right thing to do anyways and may improve other workloads. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:26 +02:00
Hollis Blanchard	bf5d4025c9	KVM: ppc: use MMUCR accessor to obtain TID We have an accessor; might as well use it. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:25 +02:00
Hollis Blanchard	74ef740da6	KVM: ppc: fix Kconfig constraints Make sure that CONFIG_KVM cannot be selected without processor support (currently, 440 is the only processor implementation available). Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:25 +02:00
Hollis Blanchard	fcfdbd266a	KVM: ppc: improve trap emulation set ESR[PTR] when emulating a guest trap. This allows Linux guests to properly handle WARN_ON() (i.e. detect that it's a non-fatal trap). Also remove debugging printk in trap emulation. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:24 +02:00
Hollis Blanchard	d4cf3892e5	KVM: ppc: optimize irq delivery path In kvmppc_deliver_interrupt is just one case left in the switch and it is a rare one (less than 8%) when looking at the exit numbers. Therefore we can at least drop the switch/case and if an if. I inserted an unlikely too, but that's open for discussion. In kvmppc_can_deliver_interrupt all frequent cases are in the default case. I know compilers are smart but we can make it easier for them. By writing down all options and removing the default case combined with the fact that ithe values are constants 0..15 should allow the compiler to write an easy jump table. Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems to be unable to reduce priority_exception[x] to a build time constant. Therefore I changed the usage of the translation arrays in the interrupt delivery path completely. It is now using priority without translation to irq on the full irq delivery path. To be able to do that ivpr regs are stored by their priority now. Additionally the decision made in kvmppc_can_deliver_interrupt is already sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace the 16x4byte array used here with a single 4byte variable (might still be one miss, but the chance to find this in cache should be better than the right entry of the whole array). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:23 +02:00
Hollis Blanchard	9ab80843c0	KVM: ppc: optimize find first bit Since we use a unsigned long here anyway we can use the optimized __ffs. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:23 +02:00
Hollis Blanchard	1b6766c7f3	KVM: ppc: optimize kvm stat handling Currently we use an unnecessary if&switch to detect some cases. To be honest we don't need the ligh_exits counter anyway, because we can calculate it out of others. Sum_exits can also be calculated, so we can remove that too. MMIO, DCR and INTR can be counted on other places without these additional control structures (The INTR case was never hit anyway). The handling of BOOKE_INTERRUPT_EXTERNAL/BOOKE_INTERRUPT_DECREMENTER is similar, but we can avoid the additional if when copying 3 lines of code. I thought about a goto there to prevent duplicate lines, but rewriting three lines should be better style than a goto cross switch/case statements (its also not enough code to justify a new inline function). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:23 +02:00
Hollis Blanchard	b8fd68ac8d	KVM: ppc: fix set regs to take care of msr change When changing some msr bits e.g. problem state we need to take special care of that. We call the function in our mtmsr emulation (not needed for wrtee[i]), but we don't call kvmppc_set_msr if we change msr via set_regs ioctl. It's a corner case we never hit so far, but I assume it should be kvmppc_set_msr in our arch set regs function (I found it because it is also a corner case when using pv support which would miss the update otherwise). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:23 +02:00
Hollis Blanchard	5cf8ca2214	KVM: ppc: adjust vcpu types to support 64-bit cores However, some of these fields could be split into separate per-core structures in the future. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:22 +02:00
Hollis Blanchard	db93f5745d	KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor This patch doesn't yet move all 44x-specific data into the new structure, but is the first step down that path. In the future we may also want to create a struct kvm_vcpu_booke. Based on patch from Liu Yu <yu.liu@freescale.com>. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:22 +02:00
Hollis Blanchard	5cbb5106f5	KVM: ppc: Move the last bits of 44x code out of booke.c Needed to port to other Book E processors. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:22 +02:00
Hollis Blanchard	75f74f0dbe	KVM: ppc: refactor instruction emulation into generic and core-specific pieces Cores provide 3 emulation hooks, implemented for example in the new 4xx_emulate.c: kvmppc_core_emulate_op kvmppc_core_emulate_mtspr kvmppc_core_emulate_mfspr Strictly speaking the last two aren't necessary, but provide for more informative error reporting ("unknown SPR"). Long term I'd like to have instruction decoding autogenerated from tables of opcodes, and that way we could aggregate universal, Book E, and core-specific instructions more easily and without redundant switch statements. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:21 +02:00
Hollis Blanchard	c381a04313	ppc: Create disassemble.h to extract instruction fields This is used in a couple places in KVM, but isn't KVM-specific. However, this patch doesn't modify other in-kernel emulation code: - xmon uses a direct copy of ppc_opc.c from binutils - emulate_instruction() doesn't need it because it can use a series of mask tests. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:21 +02:00
Hollis Blanchard	9dd921cfea	KVM: ppc: Refactor powerpc.c to relocate 440-specific code This introduces a set of core-provided hooks. For 440, some of these are implemented by booke.c, with the rest in (the new) 44x.c. Note that these hooks are link-time, not run-time. Since it is not possible to build a single kernel for both e500 and 440 (for example), using function pointers would only add overhead. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:21 +02:00
Hollis Blanchard	d9fbd03d24	KVM: ppc: combine booke_guest.c and booke_host.c The division was somewhat artificial and cumbersome, and had no functional benefit anyways: we can only guests built for the real host processor. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:51:50 +02:00
Hollis Blanchard	0f55dc481e	KVM: ppc: Rename "struct tlbe" to "struct kvmppc_44x_tlbe" This will ease ports to other cores. Also remove unused "struct kvm_tlb" while we're at it. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:51:50 +02:00
Hollis Blanchard	a0d7b9f246	KVM: ppc: Move 440-specific TLB code into 44x_tlb.c This will make it easier to provide implementations for other cores. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:51:50 +02:00
Paul Mackerras	fad7b9b51e	powerpc: Fix KVM build on ppc440 Commit `2a4aca1144` ("powerpc/mm: Split low level tlb invalidate for nohash processors") changed a call to _tlbia to _tlbil_all but didn't include the header that defines _tlbil_all, leading to a build failure on 440 if KVM is enabled. This fixes it. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-23 14:58:30 +11:00
Benjamin Herrenschmidt	2a4aca1144	powerpc/mm: Split low level tlb invalidate for nohash processors Currently, the various forms of low level TLB invalidations are all implemented in misc_32.S for 32-bit processors, in a fairly scary mess of #ifdef's and with interesting duplication such as a whole bunch of code for FSL _tlbie and _tlbia which are no longer used. This moves things around such that _tlbie is now defined in hash_low_32.S and is only used by the 32-bit hash code, and all nohash CPUs use the various _tlbil_* forms that are now moved to a new file, tlb_nohash_low.S. I moved all the definitions for that stuff out of include/asm/tlbflush.h as they are really internal mm stuff, into mm/mmu_decl.h The code should have no functional changes. I kept some variants inline for trivial forms on things like 40x and 8xx. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:16 +11:00
Hollis Blanchard	c30f8a6c6d	KVM: ppc: stop leaking host memory on VM exit When the VM exits, we must call put_page() for every page referenced in the shadow TLB. Without this patch, we usually leak 30-50 host pages (120 - 200 KiB with 4 KiB pages). The maximum number of pages leaked is the size of our shadow TLB, 64 pages. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-11-25 12:02:48 +02:00
Marcelo Tosatti	4c2155ce81	KVM: switch to get_user_pages_fast Convert gfn_to_pfn to use get_user_pages_fast, which can do lockless pagetable lookups on x86. Kernel compilation on 4-way guest is 3.7% faster on VMX. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-10-15 14:25:06 +02:00
Hollis Blanchard	0bd595fc22	KVM: ppc: kvmppc_44x_shadow_release() does not require mmap_sem to be locked And it gets in the way of get_user_pages_fast(). Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:26 +02:00
Hollis Blanchard	49dd2c4928	KVM: powerpc: Map guest userspace with TID=0 mappings When we use TID=N userspace mappings, we must ensure that kernel mappings have been destroyed when entering userspace. Using TID=1/TID=0 for kernel/user mappings and running userspace with PID=0 means that userspace can't access the kernel mappings, but the kernel can directly access userspace. The net is that we don't need to flush the TLB on privilege switches, but we do on guest context switches (which are far more infrequent). Guest boot time performance improvement: about 30%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Hollis Blanchard	83aae4a809	KVM: ppc: Write only modified shadow entries into the TLB on exit Track which TLB entries need to be written, instead of overwriting everything below the high water mark. Typically only a single guest TLB entry will be modified in a single exit. Guest boot time performance improvement: about 15%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Hollis Blanchard	20754c2495	KVM: ppc: Stop saving host TLB state We're saving the host TLB state to memory on every exit, but never using it. Originally I had thought that we'd want to restore host TLB for heavyweight exits, but that could actually hurt when context switching to an unrelated host process (i.e. not qemu). Since this decreases the performance penalty of all exits, this patch improves guest boot time by about 15%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Hollis Blanchard	6a0ab738ef	KVM: ppc: guest breakpoint support Allow host userspace to program hardware debug registers to set breakpoints inside guests. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Christian Ehrhardt	3b4bd7969f	KVM: ppc: trace powerpc instruction emulation This patch adds a trace point for the instruction emulation on embedded powerpc utilizing the KVM_TRACE interface. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:15 +02:00
Jerone Young	31711f2294	KVM: ppc: adds trace points for ppc tlb activity This patch adds trace points to track powerpc TLB activities using the KVM_TRACE infrastructure. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:15 +02:00
Jerone Young	12f6755602	KVM: ppc: enable KVM_TRACE building for powerpc This patch enables KVM_TRACE to build for PowerPC arch. This means just adding sections to Kconfig and Makefile. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:15 +02:00
Hollis Blanchard	cc04454fa8	KVM: ppc: fix invalidation of large guest pages When guest invalidates a large tlb map, there may be more than one corresponding shadow tlb maps that need to be invalidated. Use eaddr and eend to find these shadow tlb maps. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-27 12:02:05 +03:00
Marcelo Tosatti	34d4cb8fca	KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction Flush the shadow mmu before removing regions to avoid stale entries. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Laurent Vivier	588968b6b7	KVM: Add coalesced MMIO support (powerpc part) This patch enables coalesced MMIO for powerpc architecture. It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO. It enables the compilation of coalesced_mmio.c. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:31 +03:00
Avi Kivity	7cc8883074	KVM: Remove decache_vcpus_on_cpu() and related callbacks Obsoleted by the vmx-specific per-cpu list. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:25 +03:00
Hollis Blanchard	9dcb40e1aa	KVM: ppc: Report bad GFNs This code shouldn't be hit anyways, but when it is, it's useful to have a little more information about the failure. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:22:41 +03:00
Hollis Blanchard	905fa4b9d6	KVM: ppc: Use a read lock around MMU operations, and release it on error gfn_to_page() and kvm_release_page_clean() are called from other contexts with mmap_sem locked only for reading. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:22:33 +03:00
Hollis Blanchard	52435b7c7a	KVM: ppc: Remove unmatched kunmap() call We're not calling kmap() now, so we shouldn't call kunmap() either. This has no practical effect in the non-highmem case, which is why it hasn't caused more obvious problems. Pointed out by Anthony Liguori. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:22:25 +03:00
Hollis Blanchard	ac3cd34e4e	KVM: ppc: add lwzx/stwz emulation Somehow these load/store instructions got missed before, but weren't used by the guest so didn't break anything. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:22:17 +03:00
Hollis Blanchard	ce263d70e5	KVM: ppc: Remove duplicate function This was left behind from some code movement. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:22:09 +03:00
Christian Ehrhardt	de368dceb3	KVM: ppc: deliver INTERRUPT_FP_UNAVAIL to the guest This patch adds the delivery of INTERRUPT_FP_UNAVAIL exceptions to the guest. It's needed if a guest uses ppc binaries using the Floating point instructions. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Acked-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-05-04 14:44:45 +03:00
Hollis Blanchard	45c5eb67da	KVM: ppc: Handle guest idle by emulating MSR[WE] writes This reduces host CPU usage when the guest is idle. However, the guest must set MSR[WE] in its idle loop, which Linux did not do until 2.6.26. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-05-04 14:44:44 +03:00
Hollis Blanchard	bbf45ba57e	KVM: ppc: PowerPC 440 KVM implementation This functionality is definitely experimental, but is capable of running unmodified PowerPC 440 Linux kernels as guests on a PowerPC 440 host. (Only tested with 440EP "Bamboo" guests so far, but with appropriate userspace support other SoC/board combinations should work.) See Documentation/powerpc/kvm_440.txt for technical details. [stephen: build fix] Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 18:21:39 +03:00

45 Commits