OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Marc Zyngier	42428525a9	ARM: KVM: Remove __kvm_hyp_code_start/__kvm_hyp_code_end Now that we've unified the way we refer to the HYP text between arm and arm64, drop __kvm_hyp_code_start/end, and just use the __hyp_text_start/end symbols. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-29 18:34:12 +00:00
Marc Zyngier	1a61ae7af4	ARM: KVM: Move the HYP code to its own section In order to be able to spread the HYP code into multiple compilation units, adopt a layout similar to that of arm64: - the HYP text is emited in its own section (.hyp.text) - two linker generated symbols are use to identify the boundaries of that section No functionnal change. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-29 18:34:12 +00:00
Marc Zyngier	35a2491a62	arm/arm64: KVM: Add hook for C-based stage2 init As we're about to move the stage2 init to C code, introduce some C hooks that will later be populated with arch-specific implementations. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-29 18:34:12 +00:00
Michael S. Tsirkin	4cad67fca3	arm/arm64: KVM: Fix ioctl error handling Calling return copy_to_user(...) in an ioctl will not do the right thing if there's a pagefault: copy_to_user returns the number of bytes not copied in this case. Fix up kvm to do return copy_to_user(...)) ? -EFAULT : 0; everywhere. Cc: stable@vger.kernel.org Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-29 09:56:40 +00:00
Ingo Molnar	6aa447bcbb	Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-29 09:42:07 +01:00
Marcelo Tosatti	8577370fb0	KVM: Use simple waitqueue for vcpu->wq The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-02-25 11:27:16 +01:00
Marc Zyngier	1d6a821277	arm/arm64: KVM: Feed initialized memory to MMIO accesses On an MMIO access, we always copy the on-stack buffer info the shared "run" structure, even if this is a read access. This ends up leaking up to 8 bytes of uninitialized memory into userspace, depending on the size of the access. An obvious fix for this one is to only perform the copy if this is an actual write. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-24 11:53:09 +00:00
Ard Biesheuvel	a0bf9776cd	arm64: kvm: deal with kernel symbols outside of linear mapping KVM on arm64 uses a fixed offset between the linear mapping at EL1 and the HYP mapping at EL2. Before we can move the kernel virtual mapping out of the linear mapping, we have to make sure that references to kernel symbols that are accessed via the HYP mapping are translated to their linear equivalent. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2016-02-18 18:16:40 +00:00
Dan Williams	ba049e93ae	kvm: rename pfn_t to kvm_pfn_t To date, we have implemented two I/O usage models for persistent memory, PMEM (a persistent "ram disk") and DAX (mmap persistent memory into userspace). This series adds a third, DAX-GUP, that allows DAX mappings to be the target of direct-i/o. It allows userspace to coordinate DMA/RDMA from/to persistent memory. The implementation leverages the ZONE_DEVICE mm-zone that went into 4.3-rc1 (also discussed at kernel summit) to flag pages that are owned and dynamically mapped by a device driver. The pmem driver, after mapping a persistent memory range into the system memmap via devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus page-backed pmem-pfns via flags in the new pfn_t type. The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the resulting pte(s) inserted into the process page tables with a new _PAGE_DEVMAP flag. Later, when get_user_pages() is walking ptes it keys off _PAGE_DEVMAP to pin the device hosting the page range active. Finally, get_page() and put_page() are modified to take references against the device driver established page mapping. Finally, this need for "struct page" for persistent memory requires memory capacity to store the memmap array. Given the memmap array for a large pool of persistent may exhaust available DRAM introduce a mechanism to allocate the memmap from persistent memory. The new "struct vmem_altmap *" parameter to devm_memremap_pages() enables arch_add_memory() to use reserved pmem capacity rather than the page allocator. This patch (of 18): The core has developed a need for a "pfn_t" type [1]. Move the existing pfn_t in KVM to kvm_pfn_t [2]. [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Pavel Fedin	c7da6fa43c	arm/arm64: KVM: Detect vGIC presence at runtime Before commit `662d971584` ("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}") is was possible to compile the kernel without vGIC and vTimer support. Commit message says about possibility to detect vGIC support in runtime, but this has never been implemented. This patch introduces runtime check, restoring the lost functionality. It again allows to use KVM on hardware without vGIC. Interrupt controller has to be emulated in userspace in this case. -ENODEV return code from probe function means there's no GIC at all. -ENXIO happens when, for example, there is GIC node in the device tree, but it does not specify vGIC resources. Any other error code is still treated as full stop because it might mean some really serious problems. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-18 12:01:58 +00:00
Vladimir Murzin	20475f784d	arm64: KVM: Add support for 16-bit VMID The ARMv8.1 architecture extension allows to choose between 8-bit and 16-bit of VMID, so use this capability for KVM. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-18 10:15:12 +00:00
Vladimir Murzin	9d4dc68834	arm/arm64: KVM: Remove unreferenced S2_PGD_ORDER Since commit `a987370` ("arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting") there is no reference to S2_PGD_ORDER, so kill it for the good. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-18 10:15:11 +00:00
Marc Zyngier	e078ef8151	ARM: KVM: Cleanup exception injection David Binderman reported that the exception injection code had a couple of unused variables lingering around. Upon examination, it looked like this code could do with an anticipated spring cleaning, which amounts to deduplicating the CPSR/SPSR update, and making it look a bit more like the architecture spec. The spurious variables are removed in the process. Reported-by: David Binderman <dcb314@hotmail.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-18 10:14:54 +00:00
Marc Zyngier	910917bb7d	arm64: KVM: Map the kernel RO section into HYP In order to run C code in HYP, we must make sure that the kernel's RO section is mapped into HYP (otherwise things break badly). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-12-14 11:30:43 +00:00
Amit Tomar	b19e6892a9	KVM: arm/arm64: Count guest exit due to various reasons It would add guest exit statistics to debugfs, this can be helpful while measuring KVM performance. [ Renamed some of the field names - Christoffer ] Signed-off-by: Amit Singh Tomar <amittomer25@gmail.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-12-14 11:30:00 +00:00
Ard Biesheuvel	0de58f8528	ARM/arm64: KVM: correct PTE uncachedness check Commit `e6fab54423` ("ARM/arm64: KVM: test properly for a PTE's uncachedness") modified the logic to test whether a HYP or stage-2 mapping needs flushing, from [incorrectly] interpreting the page table attributes to [incorrectly] checking whether the PFN that backs the mapping is covered by host system RAM. The PFN number is part of the output of the translation, not the input, so we have to use pte_pfn() on the contents of the PTE, not __phys_to_pfn() on the HYP virtual address or stage-2 intermediate physical address. Fixes: `e6fab54423` ("ARM/arm64: KVM: test properly for a PTE's uncachedness") Cc: stable@vger.kernel.org Tested-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-04 16:30:17 +00:00
Pavel Fedin	f6be563abb	arm64: KVM: Get rid of old vcpu_reg() Using oldstyle vcpu_reg() accessor is proven to be inappropriate and unsafe on ARM64. This patch converts the rest of use cases to new accessors and completely removes vcpu_reg() on ARM64. Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-04 16:30:03 +00:00
Pavel Fedin	bc45a516fa	arm64: KVM: Correctly handle zero register during MMIO On ARM64 register index of 31 corresponds to both zero register and SP. However, all memory access instructions, use ZR as transfer register. SP is used only as a base register in indirect memory addressing, or by register-register arithmetics, which cannot be trapped here. Correct emulation is achieved by introducing new register accessor functions, which can do special handling for reg_num == 31. These new accessors intentionally do not rely on old vcpu_reg() on ARM64, because it is to be removed. Since the affected code is shared by both ARM flavours, implementations of these accessors are also added to ARM32 code. This patch fixes setting MMIO register to a random value (actually SP) instead of zero by something like: ((volatile int )reg) = 0; compilers tend to generate "str wzr, [xx]" here [Marc: Fixed 32bit splat] Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-12-04 16:29:37 +00:00
Christoffer Dall	7e16aa81f9	KVM: arm/arm64: Fix preemptible timer active state crazyness We were setting the physical active state on the GIC distributor in a preemptible section, which could cause us to set the active state on different physical CPU from the one we were actually going to run on, hacoc ensues. Since we are no longer descheduling/scheduling soft timers in the flush/sync timer functions, simply moving the timer flush into a non-preemptible section. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 18:04:00 +01:00
Ard Biesheuvel	e6fab54423	ARM/arm64: KVM: test properly for a PTE's uncachedness The open coded tests for checking whether a PTE maps a page as uncached use a flawed '(pte_val(xxx) & CONST) != CONST' pattern, which is not guaranteed to work since the type of a mapping is not a set of mutually exclusive bits For HYP mappings, the type is an index into the MAIR table (i.e, the index itself does not contain any information whatsoever about the type of the mapping), and for stage-2 mappings it is a bit field where normal memory and device types are defined as follows: #define MT_S2_NORMAL 0xf #define MT_S2_DEVICE_nGnRE 0x1 I.e., masking and comparing with the latter matches on the former, and we have been getting lucky merely because the S2 device mappings also have the PTE_UXN bit set, or we would misidentify memory mappings as device mappings. Since the unmap_range() code path (which contains one instance of the flawed test) is used both for HYP mappings and stage-2 mappings, and considering the difference between the two, it is non-trivial to fix this by rewriting the tests in place, as it would involve passing down the type of mapping through all the functions. However, since HYP mappings and stage-2 mappings both deal with host physical addresses, we can simply check whether the mapping is backed by memory that is managed by the host kernel, and only perform the D-cache maintenance if this is the case. Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Pavel Fedin <p.fedin@samsung.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 17:58:00 +01:00
Christoffer Dall	b5905dc12e	arm/arm64: KVM: Improve kvm_exit tracepoint The ARM architecture only saves the exit class to the HSR (ESR_EL2 for arm64) on synchronous exceptions, not on asynchronous exceptions like an IRQ. However, we only report the exception class on kvm_exit, which is confusing because an IRQ looks like it exited at some PC with the same reason as the previous exit. Add a lookup table for the exception index and prepend the kvm_exit tracepoint text with the exception type to clarify this situation. Also resolve the exception class (EC) to a human-friendly text version so the trace output becomes immediately usable for debugging this code. Cc: Wei Huang <wei@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:47 +02:00
Eric Auger	3b92830ad4	KVM: arm/arm64: implement kvm_arm_[halt,resume]_guest We introduce kvm_arm_halt_guest and resume functions. They will be used for IRQ forward state change. Halt is synchronous and prevents the guest from being re-entered. We use the same mechanism put in place for PSCI former pause, now renamed power_off. A new flag is introduced in arch vcpu state, pause, only meant to be used by those functions. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:46 +02:00
Eric Auger	101d3da09c	KVM: arm/arm64: check power_off in critical section before VCPU run In case a vcpu off PSCI call is called just after we executed the vcpu_sleep check, we can enter the guest although power_off is set. Let's check the power_off state in the critical section, just before entering the guest. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reported-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:46 +02:00
Eric Auger	4f5f1dc036	KVM: arm/arm64: check power_off in kvm_arch_vcpu_runnable kvm_arch_vcpu_runnable now also checks whether the power_off flag is set. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:46 +02:00
Eric Auger	3781528e30	KVM: arm/arm64: rename pause into power_off The kvm_vcpu_arch pause field is renamed into power_off to prepare for the introduction of a new pause field. Also vcpu_pause is renamed into vcpu_sleep since we will sleep until both power_off and pause are false. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:45 +02:00
Wei Huang	75755c6d02	arm/arm64: KVM : Enable vhost device selection under KVM config menu vhost drivers provide guest VMs with better I/O performance and lower CPU utilization. This patch allows users to select vhost devices under KVM configuration menu on ARM. This makes vhost support on arm/arm64 on a par with other architectures (e.g. x86, ppc). Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:45 +02:00
Christoffer Dall	4b4b4512da	arm/arm64: KVM: Rework the arch timer to use level-triggered semantics The arch timer currently uses edge-triggered semantics in the sense that the line is never sampled by the vgic and lowering the line from the timer to the vgic doesn't have any effect on the pending state of virtual interrupts in the vgic. This means that we do not support a guest with the otherwise valid behavior of (1) disable interrupts (2) enable the timer (3) disable the timer (4) enable interrupts. Such a guest would validly not expect to see any interrupts on real hardware, but will see interrupts on KVM. This patch fixes this shortcoming through the following series of changes. First, we change the flow of the timer/vgic sync/flush operations. Now the timer is always flushed/synced before the vgic, because the vgic samples the state of the timer output. This has the implication that we move the timer operations in to non-preempible sections, but that is fine after the previous commit getting rid of hrtimer schedules on every entry/exit. Second, we change the internal behavior of the timer, letting the timer keep track of its previous output state, and only lower/raise the line to the vgic when the state changes. Note that in theory this could have been accomplished more simply by signalling the vgic every time the state potentially changed, but we don't want to be hitting the vgic more often than necessary. Third, we get rid of the use of the map->active field in the vgic and instead simply set the interrupt as active on the physical distributor whenever the input to the GIC is asserted and conversely clear the physical active state when the input to the GIC is deasserted. Fourth, and finally, we now initialize the timer PPIs (and all the other unused PPIs for now), to be level-triggered, and modify the sync code to sample the line state on HW sync and re-inject a new interrupt if it is still pending at that time. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:44 +02:00
Christoffer Dall	d35268da66	arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block We currently schedule a soft timer every time we exit the guest if the timer did not expire while running the guest. This is really not necessary, because the only work we do in the timer work function is to kick the vcpu. Kicking the vcpu does two things: (1) If the vpcu thread is on a waitqueue, make it runnable and remove it from the waitqueue. (2) If the vcpu is running on a different physical CPU from the one doing the kick, it sends a reschedule IPI. The second case cannot happen, because the soft timer is only ever scheduled when the vcpu is not running. The first case is only relevant when the vcpu thread is on a waitqueue, which is only the case when the vcpu thread has called kvm_vcpu_block(). Therefore, we only need to make sure a timer is scheduled for kvm_vcpu_block(), which we do by encapsulating all calls to kvm_vcpu_block() with kvm_timer_{un}schedule calls. Additionally, we only schedule a soft timer if the timer is enabled and unmasked, since it is useless otherwise. Note that theoretically userspace can use the SET_ONE_REG interface to change registers that should cause the timer to fire, even if the vcpu is blocked without a scheduled timer, but this case was not supported before this patch and we leave it for future work for now. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-22 23:01:42 +02:00
Arnd Bergmann	4a5d69b739	KVM: arm: use GIC support unconditionally The vgic code on ARM is built for all configurations that enable KVM, but the parent_data field that it references is only present when CONFIG_IRQ_DOMAIN_HIERARCHY is set: virt/kvm/arm/vgic.c: In function 'kvm_vgic_map_phys_irq': virt/kvm/arm/vgic.c:1781:13: error: 'struct irq_data' has no member named 'parent_data' This flag is implied by the GIC driver, and indeed the VGIC code only makes sense if a GIC is present. This changes the CONFIG_KVM symbol to always select GIC, which avoids the issue. Fixes: `662d971584` ("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-20 18:04:49 +02:00
Pavel Fedin	399ea0f6bc	KVM: arm/arm64: Fix memory leak if timer initialization fails Jump to correct label and free kvm_host_cpu_state Reviewed-by: Wei Huang <wei@redhat.com> Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-10-20 18:04:48 +02:00
Ming Lei	ef748917b5	arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' This patch removes config option of KVM_ARM_MAX_VCPUS, and like other ARCHs, just choose the maximum allowed value from hardware, and follows the reasons: 1) from distribution view, the option has to be defined as the max allowed value because it need to meet all kinds of virtulization applications and need to support most of SoCs; 2) using a bigger value doesn't introduce extra memory consumption, and the help text in Kconfig isn't accurate because kvm_vpu structure isn't allocated until request of creating VCPU is sent from QEMU; 3) the main effect is that the field of vcpus[] in 'struct kvm' becomes a bit bigger(sizeof(void *) per vcpu) and need more cache lines to hold the structure, but 'struct kvm' is one generic struct, and it has worked well on other ARCHs already in this way. Also, the world switch frequecy is often low, for example, it is ~2000 when running kernel building load in VM from APM xgene KVM host, so the effect is very small, and the difference can't be observed in my test at all. Cc: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-09-17 13:13:27 +01:00
Marc Zyngier	688bc577ac	arm: KVM: Disable virtual timer even if the guest is not using it When running a guest with the architected timer disabled (with QEMU and the kernel_irqchip=off option, for example), it is important to make sure the timer gets turned off. Otherwise, the guest may try to enable it anyway, leading to a screaming HW interrupt. The fix is to unconditionally turn off the virtual timer on guest exit. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-09-17 13:11:48 +01:00
Pavel Fedin	c2f58514cf	arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources Until `b26e5fdac4` ("arm/arm64: KVM: introduce per-VM ops"), kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(), and vgic_v2_map_resources() still has it. But now vm_ops are not initialized until we call kvm_vgic_create(). Therefore kvm_vgic_map_resources() can being called without a VGIC, and we die because vm_ops.map_resources is NULL. Fixing this restores QEMU's kernel-irqchip=off option to a working state, allowing to use GIC emulation in userspace. Fixes: `b26e5fdac4` ("arm/arm64: KVM: introduce per-VM ops") Cc: stable@vger.kernel.org Signed-off-by: Pavel Fedin <p.fedin@samsung.com> [maz: reworked commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-09-16 18:35:28 +01:00
Marek Majtyka	ca09f02f12	arm: KVM: Fix incorrect device to IPA mapping A critical bug has been found in device memory stage1 translation for VMs with more then 4GB of address space. Once vm_pgoff size is smaller then pa (which is true for LPAE case, u32 and u64 respectively) some more significant bits of pa may be lost as a shift operation is performed on u32 and later cast onto u64. Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12 expected pa(u64): 0x0000002010030000 produced pa(u64): 0x0000000010030000 The fix is to change the order of operations (casting first onto phys_addr_t and then shifting). Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> [maz: fixed changelog and patch formatting] Cc: stable@vger.kernel.org Signed-off-by: Marek Majtyka <marek.majtyka@tieto.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-09-16 14:50:45 +01:00
Alexander Spyridakis	0c0672922d	arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores If a guest requests the affinity info for a non-existing vCPU we need to properly return an error, instead of erroneously reporting an off state. Signed-off-by: Alexander Spyridakis <a.spyridakis@virtualopensystems.com> Signed-off-by: Alvise Rigo <a.rigo@virtualopensystems.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-09-04 17:02:48 +01:00
Mario Smarduch	054167b3d5	arm: KVM: keep arm vfp/simd exit handling consistent with arm64 After enhancing arm64 FP/SIMD exit handling, ARMv7 VFP exit branch is moved to guest trap handling. This allows us to keep exit handling flow between both architectures consistent. Signed-off-by: Mario Smarduch <m.smarduch@samsung.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-08-19 22:27:58 +01:00
Marc Zyngier	f120cd6533	KVM: arm/arm64: timer: Allow the timer to control the active state In order to remove the crude hack where we sneak the masked bit into the timer's control register, make use of the phys_irq_map API control the active state of the interrupt. This causes some limited changes to allow for potential error propagation. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-08-12 11:28:26 +01:00
Marc Zyngier	6c3d63c9a2	KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts In order to be able to feed physical interrupts to a guest, we need to be able to establish the virtual-physical mapping between the two worlds. The mappings are kept in a set of RCU lists, indexed by virtual interrupts. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-08-12 11:28:25 +01:00
Marc Zyngier	abdf584383	arm/arm64: KVM: Move vgic handling to a non-preemptible section As we're about to introduce some serious GIC-poking to the vgic code, it is important to make sure that we're going to poke the part of the GIC that belongs to the CPU we're about to run on (otherwise, we'd end up with some unexpected interrupts firing)... Introducing a non-preemptible section in kvm_arch_vcpu_ioctl_run prevents the problem from occuring. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-08-12 11:28:23 +01:00
Marc Zyngier	9a99d05070	arm/arm64: KVM: Fix ordering of timer/GIC on guest entry As we now inject the timer interrupt when we're about to enter the guest, it makes a lot more sense to make sure this happens before the vgic code queues the pending interrupts. Otherwise, we get the interrupt on the following exit, which is not great for latency (and leads to all kind of bizarre issues when using with active interrupts at the HW level). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-08-12 11:28:23 +01:00
Alex Bennée	84e690bfbe	KVM: arm64: introduce vcpu->arch.debug_ptr This introduces a level of indirection for the debug registers. Instead of using the sys_regs[] directly we store registers in a structure in the vcpu. The new kvm_arm_reset_debug_ptr() sets the debug ptr to the guest context. Because we no longer give the sys_regs offset for the sys_reg_desc->reg field, but instead the index into a debug-specific struct we need to add a number of additional trap functions for each register. Also as the generic generic user-space access code no longer works we have introduced a new pair of function pointers to the sys_reg_desc structure to override the generic code when needed. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-07-21 12:50:25 +01:00
Alex Bennée	56c7f5e77f	KVM: arm: introduce kvm_arm_init/setup/clear_debug This is a precursor for later patches which will need to do more to setup debug state before entering the hyp.S switch code. The existing functionality for setting mdcr_el2 has been moved out of hyp.S and now uses the value kept in vcpu->arch.mdcr_el2. As the assembler used to previously mask and preserve MDCR_EL2.HPMN I've had to add a mechanism to save the value of mdcr_el2 as a per-cpu variable during the initialisation code. The kernel never sets this number so we are assuming the bootcode has set up the correct value here. This also moves the conditional setting of the TDA bit from the hyp code into the C code which is currently used for the lazy debug register context switch code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-07-21 12:47:08 +01:00
Alex Bennée	0e6f07f29c	KVM: arm: guest debug, add stub KVM_SET_GUEST_DEBUG ioctl This commit adds a stub function to support the KVM_SET_GUEST_DEBUG ioctl. Any unsupported flag will return -EINVAL. For now, only KVM_GUESTDBG_ENABLE is supported, although it won't have any effects. Signed-off-by: Alex Bennée <alex.bennee@linaro.org>. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-07-21 12:47:08 +01:00
Linus Torvalds	e8a0b37d28	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: "Bigger items included in this update are: - A series of updates from Arnd for ARM randconfig build failures - Updates from Dmitry for StrongARM SA-1100 to move IRQ handling to drivers/irqchip/ - Move ARMs SP804 timer to drivers/clocksource/ - Perf updates from Mark Rutland in preparation to move the ARM perf code into drivers/ so it can be shared with ARM64. - MCPM updates from Nicolas - Add support for taking platform serial number from DT - Re-implement Keystone2 physical address space switch to conform to architecture requirements - Clean up ARMv7 LPAE code, which goes in hand with the Keystone2 changes. - L2C cleanups to avoid unlocking caches if we're prevented by the secure support to unlock. - Avoid cleaning a potentially dirty cache containing stale data on CPU initialisation - Add ARM-only entry point for secondary startup (for machines that can only call into a Thumb kernel in ARM mode). Same thing is also done for the resume entry point. - Provide arch_irqs_disabled via asm-generic - Enlarge ARMv7M vector table - Always use BFD linker for VDSO, as gold doesn't accept some of the options we need. - Fix an incorrect BSYM (for Thumb symbols) usage, and convert all BSYM compiler macros to a "badr" (for branch address). - Shut up compiler warnings provoked by our cmpxchg() implementation. - Ensure bad xchg sizes fail to link" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (75 commits) ARM: Fix build if CLKDEV_LOOKUP is not configured ARM: fix new BSYM() usage introduced via for-arm-soc branch ARM: 8383/1: nommu: avoid deprecated source register on mov ARM: 8391/1: l2c: add options to overwrite prefetching behavior ARM: 8390/1: irqflags: Get arch_irqs_disabled from asm-generic ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap ARM: 8388/1: tcm: Don't crash when TCM banks are protected by TrustZone ARM: 8384/1: VDSO: force use of BFD linker ARM: 8385/1: VDSO: group link options ARM: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations ARM: remove __bad_xchg definition ARM: 8369/1: ARMv7M: define size of vector table for Vybrid ARM: 8382/1: clocksource: make ARM_TIMER_SP804 depend on GENERIC_SCHED_CLOCK ARM: 8366/1: move Dual-Timer SP804 driver to drivers/clocksource ARM: 8365/1: introduce sp804_timer_disable and remove arm_timer.h inclusion ARM: 8364/1: fix BE32 module loading ARM: 8360/1: add secondary_startup_arm prototype in header file ARM: 8359/1: correct secondary_startup_arm mode ARM: proc-v7: sanitise and document registers around errata ARM: proc-v7: clean up MIDR access ...	2015-06-26 12:20:00 -07:00
Linus Torvalds	e3d8238d7f	arm64 updates for 4.2, mostly refactoring/clean-up: - CPU ops and PSCI (Power State Coordination Interface) refactoring following the merging of the arm64 ACPI support, together with handling of Trusted (secure) OS instances - Using fixmap for permanent FDT mapping, removing the initial dtb placement requirements (within 512MB from the start of the kernel image). This required moving the FDT self reservation out of the memreserve processing - Idmap (1:1 mapping used for MMU on/off) handling clean-up - Removing flush_cache_all() - not safe on ARM unless the MMU is off. Last stages of CPU power down/up are handled by firmware already - "Alternatives" (run-time code patching) refactoring and support for immediate branch patching, GICv3 CPU interface access - User faults handling clean-up And some fixes: - Fix for VDSO building with broken ELF toolchains - Fixing another case of init_mm.pgd usage for user mappings (during ASID roll-over broadcasting) - Fix for FPSIMD reloading after CPU hotplug - Fix for missing syscall trace exit - Workaround for .inst asm bug - Compat fix for switching the user tls tpidr_el0 register -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVitZgAAoJEGvWsS0AyF7x+ToP/0Yci5bNsYVwVay8N1rK6WHh SGzDMzyxcSBjQpz2IhhTJ28eTAEH+a+HWQms+0tBehjqxqkvjuzBN0okDkc/z8NB 7Z0BV2aLkQcMwTbjgIh5jm25ZpGmvmvbWPD5oBwgmgQ4v4i1OLRKgx7+YQ+z9rWZ zC70d0UwyWjs2oxmjd2ZrAkps7v/qozEFhcRHxLzCn8Mbw+3FcTQsqMbfnoWGnH0 YuGfHQQqBY4/HC7uAslMCy7tXeJXqb+NsgrnAovjfEbHGDjsg0KNl06K++LHwE37 A5Noa3M0AQEPYqx/sg0Ec8RNUUEMB4RA2DCaibp7XlVGncXOwFfiyk/M5uVrYXIO ku5QF0ytUfZKzrMq3yQKbEDuCPOFTqjjdVpkeXKFdW66zYTohKVc3vUBV5xHZ5uO 7Kr8H0ZnhAv3OxPyKdEwAuQ5sJdWwQSvZyGClxMUO4eC/UzD0Mwwf1Y8WYtiAXx+ NSTeBKw/m33W3/KhNuNH1+qGEOKhuXuKX7AcYA84Rab8ytxYWcurHCG2bmhzpEse 2DZtNMybrP/HMQPyJlYgGac8B3QbsAIAkkU1f+dJTAv9otuBDhscaDQyZ9Y6WVht /k8zJiZeMEuGAmwgTkzLmWs/8pQq42nW4J4eQdXPZAwp4ghCIypPWfaZASAwee6/ p+es3v8P4k9wkv2TFZMh =YeGl -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Mostly refactoring/clean-up: - CPU ops and PSCI (Power State Coordination Interface) refactoring following the merging of the arm64 ACPI support, together with handling of Trusted (secure) OS instances - Using fixmap for permanent FDT mapping, removing the initial dtb placement requirements (within 512MB from the start of the kernel image). This required moving the FDT self reservation out of the memreserve processing - Idmap (1:1 mapping used for MMU on/off) handling clean-up - Removing flush_cache_all() - not safe on ARM unless the MMU is off. Last stages of CPU power down/up are handled by firmware already - "Alternatives" (run-time code patching) refactoring and support for immediate branch patching, GICv3 CPU interface access - User faults handling clean-up And some fixes: - Fix for VDSO building with broken ELF toolchains - Fix another case of init_mm.pgd usage for user mappings (during ASID roll-over broadcasting) - Fix for FPSIMD reloading after CPU hotplug - Fix for missing syscall trace exit - Workaround for .inst asm bug - Compat fix for switching the user tls tpidr_el0 register" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits) arm64: use private ratelimit state along with show_unhandled_signals arm64: show unhandled SP/PC alignment faults arm64: vdso: work-around broken ELF toolchains in Makefile arm64: kernel: rename __cpu_suspend to keep it aligned with arm arm64: compat: print compat_sp instead of sp arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAP arm64: entry: fix context tracking for el0_sp_pc arm64: defconfig: enable memtest arm64: mm: remove reference to tlb.S from comment block arm64: Do not attempt to use init_mm in reset_context() arm64: KVM: Switch vgic save/restore to alternative_insn arm64: alternative: Introduce feature for GICv3 CPU interface arm64: psci: fix !CONFIG_HOTPLUG_CPU build warning arm64: fix bug for reloading FPSIMD state after CPU hotplug. arm64: kernel thread don't need to save fpsimd context. arm64: fix missing syscall trace exit arm64: alternative: Work around .inst assembler bugs arm64: alternative: Merge alternative-asm.h into alternative.h arm64: alternative: Allow immediate branch as alternative instruction arm64: Rework alternate sequence for ARM erratum 845719 ...	2015-06-24 10:02:15 -07:00
Paolo Bonzini	05fe125fa3	KVM/ARM changes for v4.2: - Proper guest time accounting - FP access fix for 32bit - The usual pile of GIC fixes - PSCI fixes - Random cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVg/c2AAoJECPQ0LrRPXpDjSoQALjxhdY0ROAz2yjVqcYD5Goj 8XHLbwDxlXQIqXulEakpx/2CtQs3tYsnt7gg3HoO3aY0TqxKY+0nh9RuZ/b15KcP hailUK/hbdLo9sDmoPBRBBP26gqjnXVMYTNqSFk4u2KADYUYgl5auhXcTNARH6Gx +c3LJG9HUbbvKkTG+RB+Xgl3f9N0uaIaV15oTX+WsolNIQVmA8Sh5X7nXPnFXV4r klb9pcR3FPWA9m+fDW8VszoLZ39a12kfQsK0yshKy5ep/AA1liPEYDVc6Wp/cEqz ybDfGuO9GPoPLfzhYq1Cw/SeKaAFruu6QD1ugsSlh1DVZ2evme+OsEvzRsfCqEYc VkKkX2Rc8DQTjsJoaTCOmNfU+v0dWwriCXmfLPBj01+55M4oCDcntTasrEsDAc3o YkTk68T9HJxsNesWpkZDo0pYzGeu3GDS1Hg/ElESminCNfK/S/GBTtidLq7Rs7KU VJCaUeob6CoQnRdmNEm+nHInh5pSoOlKzdduFM9IcvOO5B12YlNMcVnr48Y16ea7 ElYGkxR/n/uLt33A3Yvaj8+PeJbwxQ1RcWRDJy8Fw77AFLEaT2VTr2kZgJIqTatQ ENnCLMxNP2uAt1TrGhEempndYZdWK/RiuaA9MjnxSscFkrXPfn4bMJzV58ghF9L+ 62hk2S8I8q69SvqbPTgv =VQWP -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM changes for v4.2: - Proper guest time accounting - FP access fix for 32bit - The usual pile of GIC fixes - PSCI fixes - Random cleanups	2015-06-19 17:15:24 +02:00
Marc Zyngier	4642019dc4	arm/arm64: KVM: vgic: Do not save GICH_HCR / ICH_HCR_EL2 The GIC Hypervisor Configuration Register is used to enable the delivery of virtual interupts to a guest, as well as to define in which conditions maintenance interrupts are delivered to the host. This register doesn't contain any information that we need to read back (the EOIcount is utterly useless for us). So let's save ourselves some cycles, and not save it before writing zero to it. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-06-17 09:59:55 +01:00
Lorenzo Pieralisi	e2d997366d	ARM: kvm: psci: fix handling of unimplemented functions According to the PSCI specification and the SMC/HVC calling convention, PSCI function_ids that are not implemented must return NOT_SUPPORTED as return value. Current KVM implementation takes an unhandled PSCI function_id as an error and injects an undefined instruction into the guest if PSCI implementation is called with a function_id that is not handled by the resident PSCI version (ie it is not implemented), which is not the behaviour expected by a guest when calling a PSCI function_id that is not implemented. This patch fixes this issue by returning NOT_SUPPORTED whenever the kvm PSCI call is executed for a function_id that is not implemented by the PSCI kvm layer. Cc: <stable@vger.kernel.org> # 3.18+ Cc: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-06-17 09:46:29 +01:00
Kim Phillips	8889583c03	KVM: arm/arm64: Enable the KVM-VFIO device The KVM-VFIO device is used by the QEMU VFIO device. It is used to record the list of in-use VFIO groups so that KVM can manipulate them. Signed-off-by: Kim Phillips <kim.phillips@linaro.org> Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-06-17 09:46:29 +01:00
Christoffer Dall	1b3d546daf	arm/arm64: KVM: Properly account for guest CPU time Until now we have been calling kvm_guest_exit after re-enabling interrupts when we come back from the guest, but this has the unfortunate effect that CPU time accounting done in the context of timer interrupts occurring while the guest is running doesn't properly notice that the time since the last tick was spent in the guest. Inspired by the comment in the x86 code, move the kvm_guest_exit() call below the local_irq_enable() call and change __kvm_guest_exit() to kvm_guest_exit(), because we are now calling this function with interrupts enabled. We have to now explicitly disable preemption and not enable preemption before we've called kvm_guest_exit(), since otherwise we could be preempted and everything happening before we eventually get scheduled again would be accounted for as guest time. At the same time, move the trace_kvm_exit() call outside of the atomic section, since there is no reason for us to do that with interrupts disabled. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-06-17 09:46:28 +01:00
Tiejun Chen	ea2c6d9745	kvm: remove one useless check extension We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD, kvm_vm_ioctl_check_extension_generic() \| + switch (arg) { + ... + #ifdef CONFIG_HAVE_KVM_IRQFD + case KVM_CAP_IRQFD: + #endif + ... + return 1; + ... + } \| + kvm_vm_ioctl_check_extension() So its not necessary to check this in arch again, and also fix one typo, s/emlation/emulation. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-06-17 09:46:28 +01:00
Marc Zyngier	85e84ba310	arm: KVM: force execution of HCPTR access on VM exit On VM entry, we disable access to the VFP registers in order to perform a lazy save/restore of these registers. On VM exit, we restore access, test if we did enable them before, and save/restore the guest/host registers if necessary. In this sequence, the FPEXC register is always accessed, irrespective of the trapping configuration. If the guest didn't touch the VFP registers, then the HCPTR access has now enabled such access, but we're missing a barrier to ensure architectural execution of the new HCPTR configuration. If the HCPTR access has been delayed/reordered, the subsequent access to FPEXC will cause a trap, which we aren't prepared to handle at all. The same condition exists when trapping to enable VFP for the guest. The fix is to introduce a barrier after enabling VFP access. In the vmexit case, it can be relaxed to only takes place if the guest hasn't accessed its view of the VFP registers, making the access to FPEXC safe. The set_hcptr macro is modified to deal with both vmenter/vmexit and vmtrap operations, and now takes an optional label that is branched to when the guest hasn't touched the VFP registers. Reported-by: Vikram Sethi <vikrams@codeaurora.org> Cc: stable@kernel.org # v3.9+ Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-06-17 09:40:14 +01:00
Firo Yang	a5f56ba3b4	ARM: KVM: Remove pointless void pointer cast No need to cast the void pointer returned by kmalloc() in arch/arm/kvm/mmu.c::kvm_alloc_stage2_pgd(). Signed-off-by: Firo Yang <firogm@gmail.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-06-09 18:05:17 +01:00
Paolo Bonzini	f36f3f2846	KVM: add "new" argument to kvm_arch_commit_memory_region This lets the function access the new memory slot without going through kvm_memslots and id_to_memslot. It will simplify the code when more than one address space will be supported. Unfortunately, the "const"ness of the new argument must be casted away in two places. Fixing KVM to accept const struct kvm_memory_slot pointers would require modifications in pretty much all architectures, and is left for later. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-05-28 10:42:58 +02:00
Mark Rutland	538b9b25fa	arm/arm64: kvm: add missing PSCI include We make use of the PSCI function IDs, but don't explicitly include the header which defines them. Relying on transitive header includes is fragile and will be broken as headers are refactored. This patch includes the relevant header file directly so as to avoid future breakage. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com>	2015-05-27 13:21:16 +01:00
Paolo Bonzini	15f46015ee	KVM: add memslots argument to kvm_arch_memslots_updated Prepare for the case of multiple address spaces. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-05-26 12:40:17 +02:00
Paolo Bonzini	09170a4942	KVM: const-ify uses of struct kvm_userspace_memory_region Architecture-specific helpers are not supposed to muck with struct kvm_userspace_memory_region contents. Add const to enforce this. In order to eliminate the only write in __kvm_set_memory_region, the cleaning of deleted slots is pulled up from update_memslots to __kvm_set_memory_region. Reviewed-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-05-26 12:40:13 +02:00
Paolo Bonzini	9f6b802978	KVM: use kvm_memslots whenever possible kvm_memslots provides lockdep checking. Use it consistently instead of explicit dereferencing of kvm->memslots. Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-05-26 12:40:08 +02:00
Russell King	5890298a83	ARM: kvm: fix a bad BSYM() usage BSYM() should only be used when refering to local symbols in the same assembly file which are resolved by the assembler, and not for linker-fixed up symbols. The use of BSYM() with panic is incorrect as the linker is involved in fixing up this relocation, and it knows whether panic() is ARM or Thumb. Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2015-05-08 17:33:47 +01:00
Christian Borntraeger	ccf73aaf5a	KVM: arm/mips/x86/power use __kvm_guest_{enter\|exit} Use __kvm_guest_{enter\|exit} instead of kvm_guest_{enter\|exit} where interrupts are disabled. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-05-07 11:28:22 +02:00
Linus Torvalds	eadf16a912	This mostly includes the PPC changes for 4.1, which this time cover Book3S HV only (debugging aids, minor performance improvements and some cleanups). But there are also bug fixes and small cleanups for ARM, x86 and s390. The task_migration_notifier revert and real fix is still pending review, but I'll send it as soon as possible after -rc1. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJVOONLAAoJEL/70l94x66DbsMIAIpZPsaqgXOC1sDEiZuYay+6 rD4n4id7j8hIAzcf3AlZdyf5XgLlr6I1Zyt62s1WcoRq/CCnL7k9EljzSmw31WFX P2y7/J0iBdkn0et+PpoNThfL2GsgTqNRCLOOQlKgEQwMP9Dlw5fnUbtC1UchOzTg eAMeBIpYwufkWkXhdMw4PAD4lJ9WxUZ1eXHEBRzJb0o0ZxAATJ1tPZGrFJzoUOSM WsVNTuBsNd7upT02kQdvA1TUo/OPjseTOEoksHHwfcORt6bc5qvpctL3jYfcr7sk /L6sIhYGVNkjkuredjlKGLfT2DDJjSEdJb1k2pWrDRsY76dmottQubAE9J9cDTk= =OAi2 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull second batch of KVM changes from Paolo Bonzini: "This mostly includes the PPC changes for 4.1, which this time cover Book3S HV only (debugging aids, minor performance improvements and some cleanups). But there are also bug fixes and small cleanups for ARM, x86 and s390. The task_migration_notifier revert and real fix is still pending review, but I'll send it as soon as possible after -rc1" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits) KVM: arm/arm64: check IRQ number on userland injection KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi KVM: VMX: Preserve host CR4.MCE value while in guest mode. KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C KVM: PPC: Book3S HV: Streamline guest entry and exit KVM: PPC: Book3S HV: Use bitmap of active threads rather than count KVM: PPC: Book3S HV: Use decrementer to wake napping threads KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu KVM: PPC: Book3S HV: Minor cleanups KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update KVM: PPC: Book3S HV: Accumulate timing information for real-mode code KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT KVM: PPC: Book3S HV: Add ICP real mode counters KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock KVM: PPC: Book3S HV: Add guest->host real mode completion counters KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte ...	2015-04-26 13:06:22 -07:00
Andre Przywara	fd1d0ddf2a	KVM: arm/arm64: check IRQ number on userland injection When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently only check it against a fixed limit, which historically is set to 127. With the new dynamic IRQ allocation the effective limit may actually be smaller (64). So when now a malicious or buggy userland injects a SPI in that range, we spill over on our VGIC bitmaps and bytemaps memory. I could trigger a host kernel NULL pointer dereference with current mainline by injecting some bogus IRQ number from a hacked kvmtool: ----------------- .... DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1) DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1) DEBUG: IRQ #114 still in the game, writing to bytemap now... Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = ffffffc07652e000 [00000000] pgd=00000000f658b003, pud=00000000f658b003, *pmd=0000000000000000 Internal error: Oops: 96000006 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027 Hardware name: FVP Base (DT) task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000 PC is at kvm_vgic_inject_irq+0x234/0x310 LR is at kvm_vgic_inject_irq+0x30c/0x310 pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145 ..... So this patch fixes this by checking the SPI number against the actual limit. Also we remove the former legacy hard limit of 127 in the ioctl code. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> CC: <stable@vger.kernel.org> # 4.0, 3.19, 3.18 [maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__, as suggested by Christopher Covington] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-04-22 15:42:24 +01:00
Linus Torvalds	714d8e7e27	arm64 updates for 4.1: The main change here is a significant head.S rework that allows us to boot on machines with physical memory at a really high address without having to increase our mapped VA range. Other changes include: - AES performance boost for Cortex-A57 - AArch32 (compat) userspace with 64k pages - Cortex-A53 erratum workaround for #845719 - defconfig updates (new platforms, PCI, ...) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJVLnQpAAoJELescNyEwWM03RIH/iwcDc0MBZgkwfD5cnY+29p4 m89lMDo3SyGQT4NynHSw7P3R7c3zULmI+9hmJMw/yfjjjL6m7X+vVAF3xj1Am4Al OzCqYLHyFnlRktzJ6dWeF1Ese7tWqPpxn+OCXgYNpz/r5MfF/HhlyX/qNzAQPKrw ZpDvnt44DgUfweqjTbwQUg2wkyCRjmz57MQYxDcmJStdpHIu24jWOvDIo3OJGjyS L49I9DU6DGUhkISZmmBE0T7vmKMD1BcgI7OIzX2WIqn521QT+GSLMhRxaHmK1s1V A8gaMTwpo0xFhTAt7sbw/5+2663WmfRdZI+FtduvORsoxX6KdDn7DH1NQixIm8s= =+F0I -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "Here are the core arm64 updates for 4.1. Highlights include a significant rework to head.S (allowing us to boot on machines with physical memory at a really high address), an AES performance boost on Cortex-A57 and the ability to run a 32-bit userspace with 64k pages (although this requires said userspace to be built with a recent binutils). The head.S rework spilt over into KVM, so there are some changes under arch/arm/ which have been acked by Marc Zyngier (KVM co-maintainer). In particular, the linker script changes caused us some issues in -next, so there are a few merge commits where we had to apply fixes on top of a stable branch. Other changes include: - AES performance boost for Cortex-A57 - AArch32 (compat) userspace with 64k pages - Cortex-A53 erratum workaround for #845719 - defconfig updates (new platforms, PCI, ...)" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (39 commits) arm64: fix midr range for Cortex-A57 erratum 832075 arm64: errata: add workaround for cortex-a53 erratum #845719 arm64: Use bool function return values of true/false not 1/0 arm64: defconfig: updates for 4.1 arm64: Extract feature parsing code from cpu_errata.c arm64: alternative: Allow immediate branch as alternative instruction arm64: insn: Add aarch64_insn_decode_immediate ARM: kvm: round HYP section to page size instead of log2 upper bound ARM: kvm: assert on HYP section boundaries not actual code size arm64: head.S: ensure idmap_t0sz is visible arm64: pmu: add support for interrupt-affinity property dt: pmu: extend ARM PMU binding to allow for explicit interrupt affinity arm64: head.S: ensure visibility of page tables arm64: KVM: use ID map with increased VA range if required arm64: mm: increase VA range of identity map ARM: kvm: implement replacement for ld's LOG2CEIL() arm64: proc: remove unused cpu_get_pgd macro arm64: enforce x1\|x2\|x3 == 0 upon kernel entry as per boot protocol arm64: remove __calc_phys_offset arm64: merge __enable_mmu and __turn_mmu_on ...	2015-04-16 13:58:29 -05:00
Paolo Bonzini	bf0fb67cf9	KVM/ARM changes for v4.1: - fixes for live migration - irqfd support - kvm-io-bus & vgic rework to enable ioeventfd - page ageing for stage-2 translation - various cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVHQ0kAAoJECPQ0LrRPXpDHKQQALjw6STaZd7n20OFopNgHd4P qVeWYEKBxnsiSvL4p3IOSlZlEul+7x08aZqtyxWQRQcDT4ggTI+3FKKfc+8yeRpH WV6YJP0bGqz7039PyMLuIgs48xkSZtntePw69hPJfHZh4C1RBlP5T2SfE8mU8VZX fWToiU3W12QfKnmN7JFgxZopnGhrYCrG0EexdTDziAZu0GEMlDrO4wnyTR60WCvT 4TEF73R0kpAz4yplKuhcDHuxIG7VFhQ4z7b09M1JtR0gQ3wUvfbD3Wqqi49SwHkv NQOStcyLsIlDosSRcLXNCwb3IxjObXTBcAxnzgm2Aoc1xMMZX1ZPQNNs6zFZzycb 2c6QMiQ35zm7ellbvrG+bT+BP86JYWcAtHjWcaUFgqSJjb8MtqcMtsCea/DURhqx /kictqbPYBBwKW6SKbkNkisz59hPkuQnv35fuf992MRCbT9LAXLPRLbcirucCzkE p1MOotsWoO3ldJMZaVn0KYk3sQf6mCIfbYPEdOcw3fhJlvyy3NdjVkLOFbA5UUg1 rQ7Ru2rTemBc0ExVrymngNTMpMB4XcEeJzXfhcgMl3DWbDj60Ku/O26sDtZ6bsFv JuDYn8FVDHz9gpEQHgiUi1YMsBKXLhnILa1ppaa6AflykU3BRfYjAk1SXmX84nQK mJUJEdFuxi6pHN0UKxUI =avA4 -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next' KVM/ARM changes for v4.1: - fixes for live migration - irqfd support - kvm-io-bus & vgic rework to enable ioeventfd - page ageing for stage-2 translation - various cleanups	2015-04-07 18:09:20 +02:00
Nikolay Nikolaev	d44758c0df	KVM: arm/arm64: enable KVM_CAP_IOEVENTFD As the infrastructure for eventfd has now been merged, report the ioeventfd capability as being supported. Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com> [maz: grouped the case entry with the others, fixed commit log] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-03-30 17:07:24 +01:00
Andre Przywara	950324ab81	KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus Currently we have struct kvm_exit_mmio for encapsulating MMIO abort data to be passed on from syndrome decoding all the way down to the VGIC register handlers. Now as we switch the MMIO handling to be routed through the KVM MMIO bus, it does not make sense anymore to use that structure already from the beginning. So we keep the data in local variables until we put them into the kvm_io_bus framework. Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private structure. On that way we replace the data buffer in that structure with a pointer pointing to a single location in a local variable, so we get rid of some copying on the way. With all of the virtual GIC emulation code now being registered with the kvm_io_bus, we can remove all of the old MMIO handling code and its dispatching functionality. I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something), because that touches a lot of code lines without any good reason. This is based on an original patch by Nikolay. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-03-30 17:07:19 +01:00
Will Deacon	849176c96d	Merge branch 'aarch64/kvm-bounce-page' into aarch64/for-next/core Just as we thought we'd fixed this, another old linker reared its ugly head trying to build linux-next. Unfortunately, it's the linker binary provided on kernel.org, so give up trying to be clever and align the hyp page to 4k.	2015-03-27 12:22:50 +00:00
Ard Biesheuvel	a9fea8b388	ARM: kvm: round HYP section to page size instead of log2 upper bound Older binutils do not support expressions involving the values of external symbols so just round up the HYP region to the page size. Tested-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: when will this ever end?!] Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-03-27 12:21:27 +00:00
Andre Przywara	5d9d15af1c	KVM: arm/arm64: remove now unneeded include directory from Makefile virt/kvm was never really a good include directory for anything else than locally included headers. With the move of iodev.h there is no need anymore to add this directory the compiler's include path, so remove it from the arm and arm64 kvm Makefile. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2015-03-26 21:43:13 +00:00
Ard Biesheuvel	e4c5a68510	arm64: KVM: use ID map with increased VA range if required This patch modifies the HYP init code so it can deal with system RAM residing at an offset which exceeds the reach of VA_BITS. Like for EL1, this involves configuring an additional level of translation for the ID map. However, in case of EL2, this implies that all translations use the extra level, as we cannot seamlessly switch between translation tables with different numbers of translation levels. So add an extra translation table at the root level. Since the ID map and the runtime HYP map are guaranteed not to overlap, they can share this root level, and we can essentially merge these two tables into one. Tested-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-03-23 11:35:29 +00:00
Ard Biesheuvel	06f75a1f62	ARM, arm64: kvm: get rid of the bounce page The HYP init bounce page is a runtime construct that ensures that the HYP init code does not cross a page boundary. However, this is something we can do perfectly well at build time, by aligning the code appropriately. For arm64, we just align to 4 KB, and enforce that the code size is less than 4 KB, regardless of the chosen page size. For ARM, the whole code is less than 256 bytes, so we tweak the linker script to align at a power of 2 upper bound of the code size Note that this also fixes a benign off-by-one error in the original bounce page code, where a bounce page would be allocated unnecessarily if the code was exactly 1 page in size. On ARM, it also fixes an issue with very large kernels reported by Arnd Bergmann, where stub sections with linker emitted veneers could erroneously trigger the size/alignment ASSERT() in the linker script. Tested-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-03-19 19:21:56 +00:00
Christoffer Dall	1a74847885	arm/arm64: KVM: Fix migration race in the arch timer When a VCPU is no longer running, we currently check to see if it has a timer scheduled in the future, and if it does, we schedule a host hrtimer to notify is in case the timer expires while the VCPU is still not running. When the hrtimer fires, we mask the guest's timer and inject the timer IRQ (still relying on the guest unmasking the time when it receives the IRQ). This is all good and fine, but when migration a VM (checkpoint/restore) this introduces a race. It is unlikely, but possible, for the following sequence of events to happen: 1. Userspace stops the VM 2. Hrtimer for VCPU is scheduled 3. Userspace checkpoints the VGIC state (no pending timer interrupts) 4. The hrtimer fires, schedules work in a workqueue 5. Workqueue function runs, masks the timer and injects timer interrupt 6. Userspace checkpoints the timer state (timer masked) At restore time, you end up with a masked timer without any timer interrupts and your guest halts never receiving timer interrupts. Fix this by only kicking the VCPU in the workqueue function, and sample the expired state of the timer when entering the guest again and inject the interrupt and mask the timer only then. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-14 13:48:00 +01:00
Alex Bennée	ecccf0cc72	arm/arm64: KVM: export VCPU power state via MP_STATE ioctl To cleanly restore an SMP VM we need to ensure that the current pause state of each vcpu is correctly recorded. Things could get confused if the CPU starts running after migration restore completes when it was paused before it state was captured. We use the existing KVM_GET/SET_MP_STATE ioctl to do this. The arm/arm64 interface is a lot simpler as the only valid states are KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_STOPPED. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-14 13:44:52 +01:00
Marc Zyngier	aeda9130c3	arm/arm64: KVM: Optimize handling of Access Flag faults Now that we have page aging in Stage-2, it becomes obvious that we're doing way too much work handling the fault. The page is not going anywhere (it is still mapped), the page tables are already allocated, and all we want is to flip a bit in the PMD or PTE. Also, we can avoid any form of TLB invalidation, since a page with the AF bit off is not allowed to be cached. An obvious solution is to have a separate handler for FSC_ACCESS, where we pride ourselves to only do the very minimum amount of work. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-12 22:34:49 +01:00
Marc Zyngier	35307b9a5f	arm/arm64: KVM: Implement Stage-2 page aging Until now, KVM/arm didn't care much for page aging (who was swapping anyway?), and simply provided empty hooks to the core KVM code. With server-type systems now being available, things are quite different. This patch implements very simple support for page aging, by clearing the Access flag in the Stage-2 page tables. On access fault, the current fault handling will write the PTE or PMD again, putting the Access flag back on. It should be possible to implement a much faster handling for Access faults, but that's left for a later patch. With this in place, performance in VMs is degraded much more gracefully. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-12 22:34:43 +01:00
Marc Zyngier	1d2ebaccc7	arm/arm64: KVM: Allow handle_hva_to_gpa to return a value So far, handle_hva_to_gpa was never required to return a value. As we prepare to age pages at Stage-2, we need to be able to return a value from the iterator (kvm_test_age_hva). Adapt the code to handle this situation. No semantic change. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-12 22:34:30 +01:00
Eric Auger	174178fed3	KVM: arm/arm64: add irqfd support This patch enables irqfd on arm/arm64. Both irqfd and resamplefd are supported. Injection is implemented in vgic.c without routing. This patch enables CONFIG_HAVE_KVM_EVENTFD and CONFIG_HAVE_KVM_IRQFD. KVM_CAP_IRQFD is now advertised. KVM_CAP_IRQFD_RESAMPLE capability automatically is advertised as soon as CONFIG_HAVE_KVM_IRQFD is set. Irqfd injection is restricted to SPI. The rationale behind not supporting PPI irqfd injection is that any device using a PPI would be a private-to-the-CPU device (timer for instance), so its state would have to be context-switched along with the VCPU and would require in-kernel wiring anyhow. It is not a relevant use case for irqfds. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-12 15:15:34 +01:00
Eric Auger	c1426e4c5a	KVM: arm/arm64: implement kvm_arch_intc_initialized On arm/arm64 the VGIC is dynamically instantiated and it is useful to expose its state, especially for irqfd setup. This patch defines __KVM_HAVE_ARCH_INTC_INITIALIZED and implements kvm_arch_intc_initialized. Signed-off-by: Eric Auger <eric.auger@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-12 15:15:33 +01:00
Eric Auger	df2bd1ac03	KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP CONFIG_HAVE_KVM_IRQCHIP is needed to support IRQ routing (along with irq_comm.c and irqchip.c usage). This is not the case for arm/arm64 currently. This patch unsets the flag for both arm and arm64. Signed-off-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-12 15:15:32 +01:00
Christoffer Dall	662d971584	arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER} We can definitely decide at run-time whether to use the GIC and timers or not, and the extra code and data structures that we allocate space for is really negligable with this config option, so I don't think it's worth the extra complexity of always having to define stub static inlines. The !CONFIG_KVM_ARM_VGIC/TIMER case is pretty much an untested code path anyway, so we're better off just getting rid of it. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com>	2015-03-12 15:15:27 +01:00
Paolo Bonzini	69ff5c619c	KVM: arm/arm64: prefer IS_ENABLED to a static variable IS_ENABLED gives compile-time checking and keeps the code clearer. The one exception is inside kvm_vm_ioctl_check_extension, where the established idiom is to wrap the case labels with an #ifdef. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-03-11 15:27:55 +01:00
Marc Zyngier	04b8dc85bf	arm64: KVM: Do not use pgd_index to index stage-2 pgd The kernel's pgd_index macro is designed to index a normal, page sized array. KVM is a bit diffferent, as we can use concatenated pages to have a bigger address space (for example 40bit IPA with 4kB pages gives us an 8kB PGD. In the above case, the use of pgd_index will always return an index inside the first 4kB, which makes a guest that has memory above 0x8000000000 rather unhappy, as it spins forever in a page fault, whist the host happilly corrupts the lower pgd. The obvious fix is to get our own kvm_pgd_index that does the right thing(tm). Tested on X-Gene with a hacked kvmtool that put memory at a stupidly high address. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-11 14:24:36 +01:00
Marc Zyngier	a987370f8e	arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting We're using __get_free_pages with to allocate the guest's stage-2 PGD. The standard behaviour of this function is to return a set of pages where only the head page has a valid refcount. This behaviour gets us into trouble when we're trying to increment the refcount on a non-head page: page:ffff7c00cfb693c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x4000000000000000() page dumped because: VM_BUG_ON_PAGE((({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) )&((&page->_count)->counter); })) <= 0) BUG: failure at include/linux/mm.h:548/get_page()! Kernel panic - not syncing: BUG! CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825 Hardware name: APM X-Gene Mustang board (DT) Call trace: [<ffff80000008a09c>] dump_backtrace+0x0/0x13c [<ffff80000008a1e8>] show_stack+0x10/0x1c [<ffff800000691da8>] dump_stack+0x74/0x94 [<ffff800000690d78>] panic+0x100/0x240 [<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc [<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0 [<ffff8000000a420c>] handle_exit+0x58/0x180 [<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c [<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754 [<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8 [<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78 CPU0: stopping A possible approach for this is to split the compound page using split_page() at allocation time, and change the teardown path to free one page at a time. It turns out that alloc_pages_exact() and free_pages_exact() does exactly that. While we're at it, the PGD allocation code is reworked to reduce duplication. This has been tested on an X-Gene platform with a 4kB/48bit-VA host kernel, and kvmtool hacked to place memory in the second page of the hardware PGD (PUD for the host kernel). Also regression-tested on a Cubietruck (Cortex-A7). [ Reworked to use alloc_pages_exact() and free_pages_exact() and to return pointers directly instead of by reference as arguments - Christoffer ] Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-03-11 14:23:20 +01:00
Wei Huang	91314cb005	arm/arm64: KVM: Add exit reaons to kvm_exit event tracing This patch extends trace_kvm_exit() to include KVM exit reasons (i.e. EC of HSR). The tracing function then dumps both exit reason and PC of vCPU, shown as the following. Tracing tools can use this new exit_reason field to better understand the behavior of guest VMs. 886.301252: kvm_exit: HSR_EC: 0x0024, PC: 0xfffffe0000506b28 Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-02-23 22:28:48 +01:00
Linus Torvalds	b9085bcbf5	Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: the highlights are support for GICv3 emulation and dirty page tracking s390: several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. ARM has other conflicts where functions are added in the same place by 3.19-rc and 3.20 patches. These are not large though, and entirely within KVM. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJU28rkAAoJEL/70l94x66DXqQH/1TDOfJIjW7P2kb0Sw7Fy1wi cEX1KO/VFxAqc8R0E/0Wb55CXyPjQJM6xBXuFr5cUDaIjQ8ULSktL4pEwXyyv/s5 DBDkN65mriry2w5VuEaRLVcuX9Wy+tqLQXWNkEySfyb4uhZChWWHvKEcgw5SqCyg NlpeHurYESIoNyov3jWqvBjr4OmaQENyv7t2c6q5ErIgG02V+iCux5QGbphM2IC9 LFtPKxoqhfeB2xFxTOIt8HJiXrZNwflsTejIlCl/NSEiDVLLxxHCxK2tWK/tUXMn JfLD9ytXBWtNMwInvtFm4fPmDouv2VDyR0xnK2db+/axsJZnbxqjGu1um4Dqbak= =7gdx -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM update from Paolo Bonzini: "Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: The highlights are support for GICv3 emulation and dirty page tracking s390: Several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. Powerpc: Nothing yet. The KVM/PPC changes will come in through the PPC maintainers, because I haven't received them yet and I might end up being offline for some part of next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: ia64: drop kvm.h from installed user headers KVM: x86: fix build with !CONFIG_SMP KVM: x86: emulate: correct page fault error code for NoWrite instructions KVM: Disable compat ioctl for s390 KVM: s390: add cpu model support KVM: s390: use facilities and cpu_id per KVM KVM: s390/CPACF: Choose crypto control block format s390/kernel: Update /proc/sysinfo file with Extended Name and UUID KVM: s390: reenable LPP facility KVM: s390: floating irqs: fix user triggerable endless loop kvm: add halt_poll_ns module parameter kvm: remove KVM_MMIO_SIZE KVM: MIPS: Don't leak FPU/DSP to guest KVM: MIPS: Disable HTW while in guest KVM: nVMX: Enable nested posted interrupt processing KVM: nVMX: Enable nested virtual interrupt delivery KVM: nVMX: Enable nested apic register virtualization KVM: nVMX: Make nested control MSRs per-cpu KVM: nVMX: Enable nested virtualize x2apic mode KVM: nVMX: Prepare for using hardware MSR bitmap ...	2015-02-13 09:55:09 -08:00
Linus Torvalds	23e8fe2e16	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU changes in this cycle are: - Documentation updates. - Miscellaneous fixes. - Preemptible-RCU fixes, including fixing an old bug in the interaction of RCU priority boosting and CPU hotplug. - SRCU updates. - RCU CPU stall-warning updates. - RCU torture-test updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) rcu: Initialize tiny RCU stall-warning timeouts at boot rcu: Fix RCU CPU stall detection in tiny implementation rcu: Add GP-kthread-starvation checks to CPU stall warnings rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors rcu: Optionally run grace-period kthreads at real-time priority ksoftirqd: Use new cond_resched_rcu_qs() function ksoftirqd: Enable IRQs and call cond_resched() before poking RCU rcutorture: Add more diagnostics in rcu_barrier() test failure case torture: Flag console.log file to prevent holdovers from earlier runs torture: Add "-enable-kvm -soundhw pcspk" to qemu command line rcutorture: Handle different mpstat versions rcutorture: Check from beginning to end of grace period rcu: Remove redundant rcu_batches_completed() declaration rcutorture: Drop rcu_torture_completed() and friends rcu: Provide rcu_batches_completed_sched() for TINY_RCU rcutorture: Use unsigned for Reader Batch computations rcutorture: Make build-output parsing correctly flag RCU's warnings rcu: Make _batches_completed() functions return unsigned long rcutorture: Issue warnings on close calls due to Reader Batch blows documentation: Fix smp typo in memory-barriers.txt ...	2015-02-09 14:28:42 -08:00
Marc Zyngier	0d3e4d4fad	arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault When handling a fault in stage-2, we need to resync I$ and D$, just to be sure we don't leave any old cache line behind. That's very good, except that we do so using the user address. Under heavy load (swapping like crazy), we may end up in a situation where the page gets mapped in stage-2 while being unmapped from userspace by another CPU. At that point, the DC/IC instructions can generate a fault, which we handle with kvm->mmu_lock held. The box quickly deadlocks, user is unhappy. Instead, perform this invalidation through the kernel mapping, which is guaranteed to be present. The box is much happier, and so am I. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-29 23:24:57 +01:00
Marc Zyngier	363ef89f8e	arm/arm64: KVM: Invalidate data cache on unmap Let's assume a guest has created an uncached mapping, and written to that page. Let's also assume that the host uses a cache-coherent IO subsystem. Let's finally assume that the host is under memory pressure and starts to swap things out. Before this "uncached" page is evicted, we need to make sure we invalidate potential speculated, clean cache lines that are sitting there, or the IO subsystem is going to swap out the cached view, loosing the data that has been written directly into memory. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-29 23:24:56 +01:00
Marc Zyngier	3c1e716508	arm/arm64: KVM: Use set/way op trapping to track the state of the caches Trying to emulate the behaviour of set/way cache ops is fairly pointless, as there are too many ways we can end-up missing stuff. Also, there is some system caches out there that simply ignore set/way operations. So instead of trying to implement them, let's convert it to VA ops, and use them as a way to re-enable the trapping of VM ops. That way, we can detect the point when the MMU/caches are turned off, and do a full VM flush (which is what the guest was trying to do anyway). This allows a 32bit zImage to boot on the APM thingy, and will probably help bootloaders in general. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-29 23:24:56 +01:00
Kai Huang	3b0f1d01e5	KVM: Rename kvm_arch_mmu_write_protect_pt_masked to be more generic for log dirty We don't have to write protect guest memory for dirty logging if architecture supports hardware dirty logging, such as PML on VMX, so rename it to be more generic. Signed-off-by: Kai Huang <kai.huang@linux.intel.com> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-29 15:30:38 +01:00
Paolo Bonzini	8fff5e374a	KVM: s390: fixes and features for kvm/next (3.20) 1. Generic - sparse warning (make function static) - optimize locking - bugfixes for interrupt injection - fix MVPG addressing modes 2. hrtimer/wakeup fun A recent change can cause KVM hangs if adjtime is used in the host. The hrtimer might wake up too early or too late. Too early is fatal as vcpu_block will see that the wakeup condition is not met and sleep again. This CPU might never wake up again. This series addresses this problem. adjclock slowing down the host clock will result in too late wakeups. This will require more work. In addition to that we also change the hrtimer from REALTIME to MONOTONIC to avoid similar problems with timedatectl set-time. 3. sigp rework We will move all "slow" sigps to QEMU (protected with a capability that can be enabled) to avoid several races between concurrent SIGP orders. 4. Optimize the shadow page table Provide an interface to announce the maximum guest size. The kernel will use that to make the pagetable 2,3,4 (or theoretically) 5 levels. 5. Provide an interface to set the guest TOD We now use two vm attributes instead of two oneregs, as oneregs are vcpu ioctl and we don't want to call them from other threads. 6. Protected key functions The real HMC allows to enable/disable protected key CPACF functions. Lets provide an implementation + an interface for QEMU to activate this the protected key instructions. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJUwj60AAoJEBF7vIC1phx8iV0QAKq1LZRTmgTLS2fd0oyWKZeN ShWUIUiB+7IUiuogYXZMfqOm61oogxwc95Ti+3tpSWYwkzUWagpS/RJQze7E1HOc 3pHpXwrR01ueUT6uVV4xc/vmVIlQAIl/ScRDDPahlAT2crCleWcKVC9l0zBs/Kut IrfzN9pJcrkmXD178CDP8/VwXsn02ptLQEpidGibGHCd03YVFjp3X0wfwNdQxMbU qOwNYCz3SLfDm5gsybO2DG+aVY3AbM2ZOJt/qLv2j4Phz4XB4t4W9iJnAefSz7JA W4677wbMQpfZlUQYhI78H/Cl9SfWAuLug1xk83O/+lbEiR5u+8zLxB69dkFTiBaH 442OY957T6TQZ/V9d0jDo2XxFrcaU9OONbVLsfBQ56Vwv5cAg9/7zqG8eqH7Nq9R gU3fQesgD4N0Kpa77T9k45TT/hBRnUEtsGixAPT6QYKyE6cK4AJATHKSjMSLbdfj ELbt0p2mVtKhuCcANfEx54U2CxOrg5ElBmPz8hRw0OkXdwpqh1sGKmt0govcHP1I BGSzE9G4mswwI1bQ7cqcyTk/lwL8g3+KQmRJoOcgCveQlnY12X5zGD5DhuPMPiIT VENqbcTzjlxdu+4t7Enml+rXl7ySsewT9L231SSrbLsTQVgCudD1B9m72WLu5ZUT 9/Z6znv6tkeKV5rM9DYE =zLjR -----END PGP SIGNATURE----- Merge tag 'kvm-s390-next-20150122' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next KVM: s390: fixes and features for kvm/next (3.20) 1. Generic - sparse warning (make function static) - optimize locking - bugfixes for interrupt injection - fix MVPG addressing modes 2. hrtimer/wakeup fun A recent change can cause KVM hangs if adjtime is used in the host. The hrtimer might wake up too early or too late. Too early is fatal as vcpu_block will see that the wakeup condition is not met and sleep again. This CPU might never wake up again. This series addresses this problem. adjclock slowing down the host clock will result in too late wakeups. This will require more work. In addition to that we also change the hrtimer from REALTIME to MONOTONIC to avoid similar problems with timedatectl set-time. 3. sigp rework We will move all "slow" sigps to QEMU (protected with a capability that can be enabled) to avoid several races between concurrent SIGP orders. 4. Optimize the shadow page table Provide an interface to announce the maximum guest size. The kernel will use that to make the pagetable 2,3,4 (or theoretically) 5 levels. 5. Provide an interface to set the guest TOD We now use two vm attributes instead of two oneregs, as oneregs are vcpu ioctl and we don't want to call them from other threads. 6. Protected key functions The real HMC allows to enable/disable protected key CPACF functions. Lets provide an implementation + an interface for QEMU to activate this the protected key instructions.	2015-01-23 14:33:36 +01:00
Dominik Dingel	31928aa586	KVM: remove unneeded return value of vcpu_postcreate The return value of kvm_arch_vcpu_postcreate is not checked in its caller. This is okay, because only x86 provides vcpu_postcreate right now and it could only fail if vcpu_load failed. But that is not possible during KVM_CREATE_VCPU (kvm_arch_vcpu_load is void, too), so just get rid of the unchecked return value. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:24:52 +01:00
Christoffer Dall	227ea818f2	arm/arm64: KVM: Fixup incorrect config symbol in comment A comment in the dirty page logging patch series mentioned incorrectly spelled config symbols, just fix them up to match the real thing. Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-23 10:51:58 +01:00
Andre Przywara	1d916229e3	arm/arm64: KVM: split GICv2 specific emulation code from vgic.c vgic.c is currently a mixture of generic vGIC emulation code and functions specific to emulating a GICv2. To ease the addition of GICv3, split off strictly v2 specific parts into a new file vgic-v2-emul.c. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> ------- As the diff isn't always obvious here (and to aid eventual rebases), here is a list of high-level changes done to the code: * added new file to respective arm/arm64 Makefiles * moved GICv2 specific functions to vgic-v2-emul.c: - handle_mmio_misc() - handle_mmio_set_enable_reg() - handle_mmio_clear_enable_reg() - handle_mmio_set_pending_reg() - handle_mmio_clear_pending_reg() - handle_mmio_priority_reg() - vgic_get_target_reg() - vgic_set_target_reg() - handle_mmio_target_reg() - handle_mmio_cfg_reg() - handle_mmio_sgi_reg() - vgic_v2_unqueue_sgi() - read_set_clear_sgi_pend_reg() - write_set_clear_sgi_pend_reg() - handle_mmio_sgi_set() - handle_mmio_sgi_clear() - vgic_v2_handle_mmio() - vgic_get_sgi_sources() - vgic_dispatch_sgi() - vgic_v2_queue_sgi() - vgic_v2_map_resources() - vgic_v2_init() - vgic_v2_add_sgi_source() - vgic_v2_init_model() - vgic_v2_init_emulation() - handle_cpu_mmio_misc() - handle_mmio_abpr() - handle_cpu_mmio_ident() - vgic_attr_regs_access() - vgic_create() (renamed to vgic_v2_create()) - vgic_destroy() (renamed to vgic_v2_destroy()) - vgic_has_attr() (renamed to vgic_v2_has_attr()) - vgic_set_attr() (renamed to vgic_v2_set_attr()) - vgic_get_attr() (renamed to vgic_v2_get_attr()) - struct kvm_mmio_range vgic_dist_ranges[] - struct kvm_mmio_range vgic_cpu_ranges[] - struct kvm_device_ops kvm_arm_vgic_v2_ops {} Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-20 18:25:30 +01:00
Andre Przywara	3caa2d8c3b	arm/arm64: KVM: make the maximum number of vCPUs a per-VM value Currently the maximum number of vCPUs supported is a global value limited by the used GIC model. GICv3 will lift this limit, but we still need to observe it for guests using GICv2. So the maximum number of vCPUs is per-VM value, depending on the GIC model the guest uses. Store and check the value in struct kvm_arch, but keep it down to 8 for now. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-20 18:25:28 +01:00
Andre Przywara	59892136c4	arm/arm64: KVM: pass down user space provided GIC type into vGIC code With the introduction of a second emulated GIC model we need to let userspace specify the GIC model to use for each VM. Pass the userspace provided value down into the vGIC code and store it there to differentiate later. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-20 18:25:25 +01:00
Andre Przywara	4429fc64b9	arm/arm64: KVM: rework MPIDR assignment and add accessors The virtual MPIDR registers (containing topology information) for the guest are currently mapped linearily to the vcpu_id. Improve this mapping for arm64 by using three levels to not artificially limit the number of vCPUs. To help this, change and rename the kvm_vcpu_get_mpidr() function to mask off the non-affinity bits in the MPIDR register. Also add an accessor to later allow easier access to a vCPU with a given MPIDR. Use this new accessor in the PSCI emulation. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-20 18:25:17 +01:00
Mario Smarduch	7276030a08	KVM: arm/arm64: Enable Dirty Page logging for ARMv8 This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic layer through Kconfig symbol, and drops earlier ARM64 constraints to enable logging at architecture layer. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>	2015-01-16 14:42:49 +01:00
Mario Smarduch	15a49a44fc	KVM: arm: page logging 2nd stage fault handling This patch adds support for 2nd stage page fault handling while dirty page logging. On huge page faults, huge pages are dissolved to normal pages, and rebuilding of 2nd stage huge pages is blocked. In case migration is canceled this restriction is removed and huge pages may be rebuilt again. Signed-off-by: Mario Smarduch <m.smarduch@samsung.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-16 14:42:44 +01:00
Mario Smarduch	53c810c364	KVM: arm: dirty logging write protect support Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl calls. We call kvm_get_dirty_log_protect() function to do most of the work. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>	2015-01-16 14:40:15 +01:00
Mario Smarduch	c64735554c	KVM: arm: Add initial dirty page locking support Add support for initial write protection of VM memslots. This patch series assumes that huge PUDs will not be used in 2nd stage tables, which is always valid on ARMv7 Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>	2015-01-16 14:40:14 +01:00
Mario Smarduch	72fc36b600	KVM: arm: Add ARMv7 API to flush TLBs This patch adds ARMv7 architecture TLB Flush function. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>	2015-01-16 14:40:14 +01:00
Andre Przywara	924de80db9	ARM: KVM: extend WFI tracepoint to differentiate between wfi and wfe Currently the trace printk talks about "wfi" only, though the trace point triggers both on wfi and wfe traps. Add a parameter to differentiate between the two. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Wei Huang <wei@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-01-15 13:12:27 +01:00
Pranith Kumar	83fe27ea53	rcu: Make SRCU optional by using CONFIG_SRCU SRCU is not necessary to be compiled by default in all cases. For tinification efforts not compiling SRCU unless necessary is desirable. The current patch tries to make compiling SRCU optional by introducing a new Kconfig option CONFIG_SRCU which is selected when any of the components making use of SRCU are selected. If we do not select CONFIG_SRCU, srcu.o will not be compiled at all. text data bss dec hex filename 2007 0 0 2007 7d7 kernel/rcu/srcu.o Size of arch/powerpc/boot/zImage changes from text data bss dec hex filename 831552 64180 23944 919676 e087c arch/powerpc/boot/zImage : before 829504 64180 23952 917636 e0084 arch/powerpc/boot/zImage : after so the savings are about ~2000 bytes. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Josh Triplett <josh@joshtriplett.org> CC: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]	2015-01-06 11:04:29 -08:00
Linus Torvalds	66dcff86ba	3.19 changes for KVM: - spring cleaning: removed support for IA64, and for hardware-assisted virtualization on the PPC970 - ARM, PPC, s390 all had only small fixes For x86: - small performance improvements (though only on weird guests) - usual round of hardware-compliancy fixes from Nadav - APICv fixes - XSAVES support for hosts and guests. XSAVES hosts were broken because the (non-KVM) XSAVES patches inadvertently changed the KVM userspace ABI whenever XSAVES was enabled; hence, this part is going to stable. Guest support is just a matter of exposing the feature and CPUID leaves support. Right now KVM is broken for PPC BookE in your tree (doesn't compile). I'll reply to the pull request with a patch, please apply it either before the pull request or in the merge commit, in order to preserve bisectability somewhat. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJUkpg+AAoJEL/70l94x66DUmoH/jzXYkptSW9NGgm79KqxGJlD lzLnLBkitVvx++Mz5YBhdJEhKKLUlCtifFT1zPJQ/pthQhIRSaaAwZyNGgUs5w5x yMGKHiPQFyZRbmQtZhCInW0BftJoYHHciO3nUfHCZnp34My9MP2D55W7/z+fYFfQ DuqBSE9ThyZJtZ4zh8NRA9fCOeuqwVYRyoBs820Wbsh4cpIBoIK63Dg7k+CLE+ZV MZa/mRL6bAfsn9W5bnOUAgHJ3SPznnWbO3/g0aV+roL/5pffblprJx9lKNR08xUM 6hDFLop2gDehDJesDkY/o8Ckp1hEouvfsVpSShry4vcgtn0hgh2O5/6Orbmj6vE= =Zwq1 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM update from Paolo Bonzini: "3.19 changes for KVM: - spring cleaning: removed support for IA64, and for hardware- assisted virtualization on the PPC970 - ARM, PPC, s390 all had only small fixes For x86: - small performance improvements (though only on weird guests) - usual round of hardware-compliancy fixes from Nadav - APICv fixes - XSAVES support for hosts and guests. XSAVES hosts were broken because the (non-KVM) XSAVES patches inadvertently changed the KVM userspace ABI whenever XSAVES was enabled; hence, this part is going to stable. Guest support is just a matter of exposing the feature and CPUID leaves support" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (179 commits) KVM: move APIC types to arch/x86/ KVM: PPC: Book3S: Enable in-kernel XICS emulation by default KVM: PPC: Book3S HV: Improve H_CONFER implementation KVM: PPC: Book3S HV: Fix endianness of instruction obtained from HEIR register KVM: PPC: Book3S HV: Remove code for PPC970 processors KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions KVM: PPC: Book3S HV: Simplify locking around stolen time calculations arch: powerpc: kvm: book3s_paired_singles.c: Remove unused function arch: powerpc: kvm: book3s_pr.c: Remove unused function arch: powerpc: kvm: book3s.c: Remove some unused functions arch: powerpc: kvm: book3s_32_mmu.c: Remove unused function KVM: PPC: Book3S HV: Check wait conditions before sleeping in kvmppc_vcore_blocked KVM: PPC: Book3S HV: ptes are big endian KVM: PPC: Book3S HV: Fix inaccuracies in ICP emulation for H_IPI KVM: PPC: Book3S HV: Fix KSM memory corruption KVM: PPC: Book3S HV: Fix an issue where guest is paused on receiving HMI KVM: PPC: Book3S HV: Fix computation of tlbie operand KVM: PPC: Book3S HV: Add missing HPTE unlock KVM: PPC: BookE: Improve irq inject tracepoint arm/arm64: KVM: Require in-kernel vgic for the arch timers ...	2014-12-18 16:05:28 -08:00
Christoffer Dall	05971120fc	arm/arm64: KVM: Require in-kernel vgic for the arch timers It is curently possible to run a VM with architected timers support without creating an in-kernel VGIC, which will result in interrupts from the virtual timer going nowhere. To address this issue, move the architected timers initialization to the time when we run a VCPU for the first time, and then only initialize (and enable) the architected timers if we have a properly created and initialized in-kernel VGIC. When injecting interrupts from the virtual timer to the vgic, the current setup should ensure that this never calls an on-demand init of the VGIC, which is the only call path that could return an error from kvm_vgic_inject_irq(), so capture the return value and raise a warning if there's an error there. We also change the kvm_timer_init() function from returning an int to be a void function, since the function always succeeds. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-15 11:50:42 +01:00
Christoffer Dall	716139df25	arm/arm64: KVM: Don't allow creating VCPUs after vgic_initialized When the vgic initializes its internal state it does so based on the number of VCPUs available at the time. If we allow KVM to create more VCPUs after the VGIC has been initialized, we are likely to error out in unfortunate ways later, perform buffer overflows etc. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:17:10 +01:00
Christoffer Dall	c52edf5f8c	arm/arm64: KVM: Rename vgic_initialized to vgic_ready The vgic_initialized() macro currently returns the state of the vgic->ready flag, which indicates if the vgic is ready to be used when running a VM, not specifically if its internal state has been initialized. Rename the macro accordingly in preparation for a more nuanced initialization flow. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:17:05 +01:00
Peter Maydell	6d3cfbe21b	arm/arm64: KVM: vgic: move reset initialization into vgic_init_maps() VGIC initialization currently happens in three phases: (1) kvm_vgic_create() (triggered by userspace GIC creation) (2) vgic_init_maps() (triggered by userspace GIC register read/write requests, or from kvm_vgic_init() if not already run) (3) kvm_vgic_init() (triggered by first VM run) We were doing initialization of some state to correspond with the state of a freshly-reset GIC in kvm_vgic_init(); this is too late, since it will overwrite changes made by userspace using the register access APIs before the VM is run. Move this initialization earlier, into the vgic_init_maps() phase. This fixes a bug where QEMU could successfully restore a saved VM state snapshot into a VM that had already been run, but could not restore it "from cold" using the -loadvm command line option (the symptoms being that the restored VM would run but interrupts were ignored). Finally rename vgic_init_maps to vgic_init and renamed kvm_vgic_init to kvm_vgic_map_resources. [ This patch is originally written by Peter Maydell, but I have modified it somewhat heavily, renaming various bits and moving code around. If something is broken, I am to be blamed. - Christoffer ] Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:52 +01:00
Christoffer Dall	957db105c9	arm/arm64: KVM: Introduce stage2_unmap_vm Introduce a new function to unmap user RAM regions in the stage2 page tables. This is needed on reboot (or when the guest turns off the MMU) to ensure we fault in pages again and make the dcache, RAM, and icache coherent. Using unmap_stage2_range for the whole guest physical range does not work, because that unmaps IO regions (such as the GIC) which will not be recreated or in the best case faulted in on a page-by-page basis. Call this function on secondary and subsequent calls to the KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest Stage-1 MMU is off when faulting in pages and make the caches coherent. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:27 +01:00
Christoffer Dall	cf5d318865	arm/arm64: KVM: Turn off vcpus on PSCI shutdown/reboot When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus should really be turned off for the VM adhering to the suggestions in the PSCI spec, and it's the sane thing to do. Also, clarify the behavior and expectations for exits to user space with the KVM_EXIT_SYSTEM_EVENT case. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:27 +01:00
Christoffer Dall	f7fa034dc8	arm/arm64: KVM: Clarify KVM_ARM_VCPU_INIT ABI It is not clear that this ioctl can be called multiple times for a given vcpu. Userspace already does this, so clarify the ABI. Also specify that userspace is expected to always make secondary and subsequent calls to the ioctl with the same parameters for the VCPU as the initial call (which userspace also already does). Add code to check that userspace doesn't violate that ABI in the future, and move the kvm_vcpu_set_target() function which is currently duplicated between the 32-bit and 64-bit versions in guest.c to a common static function in arm.c, shared between both architectures. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:26 +01:00
Christoffer Dall	b856a59141	arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also reset the HCR, because we now modify the HCR dynamically to enable/disable trapping of guest accesses to the VM registers. This is crucial for reboot of VMs working since otherwise we will not be doing the necessary cache maintenance operations when faulting in pages with the guest MMU off. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:26 +01:00
Christoffer Dall	3ad8b3de52	arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option The implementation of KVM_ARM_VCPU_INIT is currently not doing what userspace expects, namely making sure that a vcpu which may have been turned off using PSCI is returned to its initial state, which would be powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag. Implement the expected functionality and clarify the ABI. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:25 +01:00
Christoffer Dall	03f1d4c17e	arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag If a VCPU was originally started with power off (typically to be brought up by PSCI in SMP configurations), there is no need to clear the POWER_OFF flag in the kernel, as this flag is only tested during the init ioctl itself. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-12-13 14:15:25 +01:00
Ard Biesheuvel	bb55e9b131	arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn() Instead of using kvm_is_mmio_pfn() to decide whether a host region should be stage 2 mapped with device attributes, add a new static function kvm_is_device_pfn() that disregards RAM pages with the reserved bit set, as those should usually not be mapped as device memory. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-11-26 14:40:45 +01:00
Mark Rutland	7cbb87d67e	arm64: KVM: fix unmapping with 48-bit VAs Currently if using a 48-bit VA, tearing down the hyp page tables (which can happen in the absence of a GICH or GICV resource) results in the rather nasty splat below, evidently becasue we access a table that doesn't actually exist. Commit `38f791a4e4` (arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2) added a pgd_none check to __create_hyp_mappings to account for the additional level of tables, but didn't add a corresponding check to unmap_range, and this seems to be the source of the problem. This patch adds the missing pgd_none check, ensuring we don't try to access tables that don't exist. Original splat below: kvm [1]: Using HYP init bounce page @83fe94a000 kvm [1]: Cannot obtain GICH resource Unable to handle kernel paging request at virtual address ffff7f7fff000000 pgd = ffff800000770000 [ffff7f7fff000000] *pgd=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89 task: ffff8003eb500000 ti: ffff8003eb45c000 task.ti: ffff8003eb45c000 PC is at unmap_range+0x120/0x580 LR is at free_hyp_pgds+0xac/0xe4 pc : [<ffff80000009b768>] lr : [<ffff80000009cad8>] pstate: 80000045 sp : ffff8003eb45fbf0 x29: ffff8003eb45fbf0 x28: ffff800000736000 x27: ffff800000735000 x26: ffff7f7fff000000 x25: 0000000040000000 x24: ffff8000006f5000 x23: 0000000000000000 x22: 0000007fffffffff x21: 0000800000000000 x20: 0000008000000000 x19: 0000000000000000 x18: ffff800000648000 x17: ffff800000537228 x16: 0000000000000000 x15: 000000000000001f x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000020 x11: 0000000000000062 x10: 0000000000000006 x9 : 0000000000000000 x8 : 0000000000000063 x7 : 0000000000000018 x6 : 00000003ff000000 x5 : ffff800000744188 x4 : 0000000000000001 x3 : 0000000040000000 x2 : ffff800000000000 x1 : 0000007fffffffff x0 : 000000003fffffff Process swapper/0 (pid: 1, stack limit = 0xffff8003eb45c058) Stack: (0xffff8003eb45fbf0 to 0xffff8003eb460000) fbe0: eb45fcb0 ffff8003 0009cad8 ffff8000 fc00: 00000000 00000080 00736140 ffff8000 00736000 ffff8000 00000000 00007c80 fc20: 00000000 00000080 006f5000 ffff8000 00000000 00000080 00743000 ffff8000 fc40: 00735000 ffff8000 006d3030 ffff8000 006fe7b8 ffff8000 00000000 00000080 fc60: ffffffff 0000007f fdac1000 ffff8003 fd94b000 ffff8003 fda47000 ffff8003 fc80: 00502b40 ffff8000 ff000000 ffff7f7f fdec6000 00008003 fdac1630 ffff8003 fca0: eb45fcb0 ffff8003 ffffffff 0000007f eb45fd00 ffff8003 0009b378 ffff8000 fcc0: ffffffea 00000000 006fe000 ffff8000 00736728 ffff8000 00736120 ffff8000 fce0: 00000040 00000000 00743000 ffff8000 006fe7b8 ffff8000 0050cd48 00000000 fd00: eb45fd60 ffff8003 00096070 ffff8000 006f06e0 ffff8000 006f06e0 ffff8000 fd20: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00000000 00000000 fd40: 00000ae0 00000000 006aa25c ffff8000 eb45fd60 ffff8003 0017ca44 00000002 fd60: eb45fdc0 ffff8003 0009a33c ffff8000 006f06e0 ffff8000 006f06e0 ffff8000 fd80: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00735000 ffff8000 fda0: 006d3090 ffff8000 006aa25c ffff8000 00735000 ffff8000 006d3030 ffff8000 fdc0: eb45fdd0 ffff8003 000814c0 ffff8000 eb45fe50 ffff8003 006aaac4 ffff8000 fde0: 006ddd90 ffff8000 00000006 00000000 006d3000 ffff8000 00000095 00000000 fe00: 006a1e90 ffff8000 00735000 ffff8000 006d3000 ffff8000 006aa25c ffff8000 fe20: 00735000 ffff8000 006d3030 ffff8000 eb45fe50 ffff8003 006fac68 ffff8000 fe40: 00000006 00000006 fe293ee6 ffff8003 eb45feb0 ffff8003 004f8ee8 ffff8000 fe60: 004f8ed4 ffff8000 00735000 ffff8000 00000000 00000000 00000000 00000000 fe80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 fea0: 00000000 00000000 00000000 00000000 00000000 00000000 000843d0 ffff8000 fec0: 004f8ed4 ffff8000 00000000 00000000 00000000 00000000 00000000 00000000 fee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000 ffe0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 Call trace: [<ffff80000009b768>] unmap_range+0x120/0x580 [<ffff80000009cad4>] free_hyp_pgds+0xa8/0xe4 [<ffff80000009b374>] kvm_arch_init+0x268/0x44c [<ffff80000009606c>] kvm_init+0x24/0x260 [<ffff80000009a338>] arm_init+0x18/0x24 [<ffff8000000814bc>] do_one_initcall+0x88/0x1a0 [<ffff8000006aaac0>] kernel_init_freeable+0x148/0x1e8 [<ffff8000004f8ee4>] kernel_init+0x10/0xd4 Code: 8b000263 92628479 d1000720 eb01001f (f9400340) ---[ end trace 3bc230562e926fa4 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jungseok Lee <jungseoklee85@gmail.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-11-26 14:40:42 +01:00
Andre Przywara	5100f9833e	arm/arm64: KVM: avoid unnecessary guest register mangling on MMIO read Currently we mangle the endianness of the guest's register even on an MMIO _read_, where it is completely useless, because we will not use the value of that register. Rework the io_mem_abort() function to clearly separate between reads and writes and only do the endianness mangling on MMIO writes. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-11-25 13:57:28 +00:00
Ard Biesheuvel	849260c72c	arm, arm64: KVM: handle potential incoherency of readonly memslots Readonly memslots are often used to implement emulation of ROMs and NOR flashes, in which case the guest may legally map these regions as uncached. To deal with the incoherency associated with uncached guest mappings, treat all readonly memslots as incoherent, and ensure that pages that belong to regions tagged as such are flushed to DRAM before being passed to the guest. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-11-25 13:57:27 +00:00
Laszlo Ersek	840f4bfbe0	arm, arm64: KVM: allow forced dcache flush on page faults To allow handling of incoherent memslots in a subsequent patch, this patch adds a paramater 'ipa_uncached' to cache_coherent_guest_page() so that we can instruct it to flush the page's contents to DRAM even if the guest has caching globally enabled. Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-11-25 13:57:27 +00:00
Ard Biesheuvel	07a9748c78	arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn() Instead of using kvm_is_mmio_pfn() to decide whether a host region should be stage 2 mapped with device attributes, add a new static function kvm_is_device_pfn() that disregards RAM pages with the reserved bit set, as those should usually not be mapped as device memory. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-11-25 13:57:26 +00:00
Linus Torvalds	8a5de18239	Second batch of changes for KVM/{arm,arm64} for 3.18 - Support for 48bit IPA and VA (EL2) - A number of fixes for devices mapped into guests - Yet another VGIC fix for BE - A fix for CPU hotplug - A few compile fixes (disabled VGIC, strict mm checks) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUQkyTAAoJECPQ0LrRPXpDwrUP/1WbELgB74W35CJ1zIc4KuBi unP1muW3QAr9Vmp/KovRKyLKFiRaTRlQsszaI78f4ZQ++0vzivU8dZwV81Gn1y/v 0qF63OB0UYsOgXMRrh5JTEqzUyNyNBLUH+FAQiEO/srDoH5WLp3Zq7ThjzjwGn7Q K2ArxFiml+p2BGIGKWe3XIrxNgpW4oWhfe1kW4WU7sshuJlut3Nee+q2lSIg9mZx 2VXYnLNzSsHizgQHuVEyXIqn8HA5FSCvjBYIUcLERlWB0I66WvzOqg9rH/BmNNR2 H+cBDY+9D8KBUBG9zZSG7hZ0mAONKcOnxGZWGzte3Oi7FMZkB3Y/zrIs0na4iB5Y FxE8j+2qclZk9fkHQ7wn9Ws8hpGR2OrFlc2O5ZoBJJ2KJ4wMRHMeEbYRBCRQbTCN +81SUW7mh2j/La0JBqZ6DhhTiymUdIB+6v78im9WGlHsFRAIHBt0Q0u/pIyY+GJs OH7FoswI3vF5iODlHeRO1yjaO3rkj+IJwqTuUuhAIGu9+qnof3ge+eM1cOqrudNa u2kDz+BC21+Q8dflOF99Ryz7cMWqMiwtR+N+OUYpxc7RL7mCeHVANJxdWIFHKa2Z XJaHmbKjmw8AoR0fbS6YWOl2xmIhqU+FAngI+mow/Hz4pJDpR2K3w17ASXIVVQVX go2bvGHONkdn8Ji3Asap =3fl5 -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm Pull second batch of changes for KVM/{arm,arm64} from Marc Zyngier: "The most obvious thing is the sizeable MMU changes to support 48bit VAs on arm64. Summary: - support for 48bit IPA and VA (EL2) - a number of fixes for devices mapped into guests - yet another VGIC fix for BE - a fix for CPU hotplug - a few compile fixes (disabled VGIC, strict mm checks)" [ I'm pulling directly from Marc at the request of Paolo Bonzini, whose backpack was stolen at Düsseldorf airport and will do new keys and rebuild his web of trust. - Linus ] * tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm: arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 arm/arm64: KVM: map MMIO regions at creation time arm64: kvm: define PAGE_S2_DEVICE as read-only by default ARM: kvm: define PAGE_S2_DEVICE as read-only by default arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremap arm/arm64: KVM: fix potential NULL dereference in user_mem_abort() arm/arm64: KVM: use __GFP_ZERO not memset() to get zeroed pages ARM: KVM: fix vgic-disabled build arm: kvm: fix CPU hotplug	2014-10-18 14:32:31 -07:00
Christoffer Dall	2df36a5dd6	arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs The EIRSR and ELRSR registers are 32-bit registers on GICv2, and we store these as an array of two such registers on the vgic vcpu struct. However, we access them as a single 64-bit value or as a bitmap pointer in the generic vgic code, which breaks BE support. Instead, store them as u64 values on the vgic structure and do the word-swapping in the assembly code, which already handles the byte order for BE systems. Tested-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-16 10:57:41 +02:00
Steve Capper	3d08c62924	arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort Commit: `b886576` ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping introduced some code in user_mem_abort that failed to compile if STRICT_MM_TYPECHECKS was enabled. This patch fixes up the failing comparison. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Kim Phillips <kim.phillips@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-15 11:25:22 +02:00
Christoffer Dall	c3058d5da2	arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE When creating or moving a memslot, make sure the IPA space is within the addressable range of the guest. Otherwise, user space can create too large a memslot and KVM would try to access potentially unallocated page table entries when inserting entries in the Stage-2 page tables. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-14 05:48:25 -07:00
Christoffer Dall	38f791a4e4	arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 This patch adds the necessary support for all host kernel PGSIZE and VA_SPACE configuration options for both EL2 and the Stage-2 page tables. However, for 40bit and 42bit PARange systems, the architecture mandates that VTCR_EL2.SL0 is maximum 1, resulting in fewer levels of stage-2 pagge tables than levels of host kernel page tables. At the same time, systems with a PARange > 42bit, we limit the IPA range by always setting VTCR_EL2.T0SZ to 24. To solve the situation with different levels of page tables for Stage-2 translation than the host kernel page tables, we allocate a dummy PGD with pointers to our actual inital level Stage-2 page table, in order for us to reuse the kernel pgtable manipulation primitives. Reproducing all these in KVM does not look pretty and unnecessarily complicates the 32-bit side. Systems with a PARange < 40bits are not yet supported. [ I have reworked this patch from its original form submitted by Jungseok to take the architecture constraints into consideration. There were too many changes from the original patch for me to preserve the authorship. Thanks to Catalin Marinas for his help in figuring out a good solution to this challenge. I have also fixed various bugs and missing error code handling from the original patch. - Christoffer ] Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-14 05:48:19 -07:00
Ard Biesheuvel	8eef91239e	arm/arm64: KVM: map MMIO regions at creation time There is really no point in faulting in memory regions page by page if they are not backed by demand paged system RAM but by a linear passthrough mapping of a host MMIO region. So instead, detect such regions at setup time and install the mappings for the backing all at once. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-13 03:36:53 -07:00
Ard Biesheuvel	c40f2f8ff8	arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremap Add support for read-only MMIO passthrough mappings by adding a 'writable' parameter to kvm_phys_addr_ioremap. For the moment, mappings will be read-write even if 'writable' is false, but once the definition of PAGE_S2_DEVICE gets changed, those mappings will be created read-only. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-10 13:07:37 +02:00
Ard Biesheuvel	37b544087e	arm/arm64: KVM: fix potential NULL dereference in user_mem_abort() Handle the potential NULL return value of find_vma_intersection() before dereferencing it. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-10 13:07:37 +02:00
Ard Biesheuvel	e9e8578b6c	arm/arm64: KVM: use __GFP_ZERO not memset() to get zeroed pages Pass __GFP_ZERO to __get_free_pages() instead of calling memset() explicitly. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-10-10 13:07:37 +02:00
Linus Torvalds	e4e65676f2	Fixes and features for 3.18. Apart from the usual cleanups, here is the summary of new features: - s390 moves closer towards host large page support - PowerPC has improved support for debugging (both inside the guest and via gdbstub) and support for e6500 processors - ARM/ARM64 support read-only memory (which is necessary to put firmware in emulated NOR flash) - x86 has the usual emulator fixes and nested virtualization improvements (including improved Windows support on Intel and Jailhouse hypervisor support on AMD), adaptive PLE which helps overcommitting of huge guests. Also included are some patches that make KVM more friendly to memory hot-unplug, and fixes for rare caching bugs. Two patches have trivial mm/ parts that were acked by Rik and Andrew. Note: I will soon switch to a subkey for signing purposes. To verify future signed pull requests from me, please update my key with "gpg --recv-keys 9B4D86F2". You should see 3 new subkeys---the one for signing will be a 2048-bit RSA key, 4E6B09D7. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJUL5sPAAoJEBvWZb6bTYbyfkEP/3MNhSyn6HCjPjtjLNPAl9KL WpExZSUFL2+4CztpdGIsek1BeJYHmqv3+c5S+WvaWVA1aqh2R7FT1D1ErBLjgLQq lq23IOr+XxmC3dXQUEEk+TlD+283UzypzEG4l4UD3JYg79fE3UrXAz82SeyewJDY x7aPYhkZG3RHu+wAyMPasG6E3zS5LySdUtGWbiPwz5BejrhBJoJdeb2WIL/RwnUK 7ppSLB5EoFj/uMkuyeAAdAbdfSrhHA6faDZxNdxS9k9wGutrhhfUoQ49ONrKG4dV sFo1tSPTVgRs8QFYUZ2fJUPBAmUVddsgqh2K9d0NftGTq7b8YszaCsfFrs2/Y4MU YxssWEhxsfszerCu12bbAJrv6JBZYQ7TwGvI9L7P0iFU6IVw/djmukU4AkM9/e91 YS/cue/PN+9Pn2ccXzL9J7xRtZb8FsOuRsCXTCmbOwDkLmrKPDBN2t3RUbeF+Eam ABrpWnLKX13kZSo4LKU+/niarzmPMp7odQfHVdr8ea0fiYLp4iN8puA20WaSPIgd CLvm+RAvXe5Lm91L4mpFotJ2uFyK6QlIYJV4FsgeWv/0D0qppWQi0Utb/aCNHCgy z8MyUMD48y7EpoQrFYr/7cddXIu0/NegnM8I1coVjIPEk4NfeebGUlCJ/V3D8wMG BgEfS2x6jRc5zB3hjwDr =iEVi -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "Fixes and features for 3.18. Apart from the usual cleanups, here is the summary of new features: - s390 moves closer towards host large page support - PowerPC has improved support for debugging (both inside the guest and via gdbstub) and support for e6500 processors - ARM/ARM64 support read-only memory (which is necessary to put firmware in emulated NOR flash) - x86 has the usual emulator fixes and nested virtualization improvements (including improved Windows support on Intel and Jailhouse hypervisor support on AMD), adaptive PLE which helps overcommitting of huge guests. Also included are some patches that make KVM more friendly to memory hot-unplug, and fixes for rare caching bugs. Two patches have trivial mm/ parts that were acked by Rik and Andrew. Note: I will soon switch to a subkey for signing purposes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (157 commits) kvm: do not handle APIC access page if in-kernel irqchip is not in use KVM: s390: count vcpu wakeups in stat.halt_wakeup KVM: s390/facilities: allow TOD-CLOCK steering facility bit KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode arm/arm64: KVM: Report correct FSC for unsupported fault types arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc kvm: Fix kvm_get_page_retry_io __gup retval check arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset kvm: x86: Unpin and remove kvm_arch->apic_access_page kvm: vmx: Implement set_apic_access_page_addr kvm: x86: Add request bit to reload APIC access page address kvm: Add arch specific mmu notifier for page invalidation kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-static kvm: Fix page ageing bugs kvm/x86/mmu: Pass gfn and level to rmapp callback. x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only kvm: x86: use macros to compute bank MSRs KVM: x86: Remove debug assertion of non-PAE reserved bits kvm: don't take vcpu mutex for obviously invalid vcpu ioctls kvm: Faults which trigger IO release the mmap_sem ...	2014-10-08 05:27:39 -04:00
Vladimir Murzin	37a34ac1d4	arm: kvm: fix CPU hotplug On some platforms with no power management capabilities, the hotplug implementation is allowed to return from a smp_ops.cpu_die() call as a function return. Upon a CPU onlining event, the KVM CPU notifier tries to reinstall the hyp stub, which fails on platform where no reset took place following a hotplug event, with the message: CPU1: smp_ops.cpu_die() returned, trying to resuscitate CPU1: Booted secondary processor Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x80409540 unexpected data abort in Hyp mode at: 0x80401fe8 unexpected HVC/SVC trap in Hyp mode at: 0x805c6170 since KVM code is trying to reinstall the stub on a system where it is already configured. To prevent this issue, this patch adds a check in the KVM hotplug notifier that detects if the HYP stub really needs re-installing when a CPU is onlined and skips the installation call if the stub is already in place, which means that the CPU has not been reset. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-09-29 16:02:34 +02:00
Christoffer Dall	0496daa5cf	arm/arm64: KVM: Report correct FSC for unsupported fault types When we catch something that's not a permission fault or a translation fault, we log the unsupported FSC in the kernel log, but we were masking off the bottom bits of the FSC which was not very helpful. Also correctly report the FSC for data and instruction faults rather than telling people it was a DFCS, which doesn't exist in the ARM ARM. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-09-26 14:39:45 +02:00
Joel Schopp	dbff124e29	arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc The current aarch64 calculation for VTTBR_BADDR_MASK masks only 39 bits and not all the bits in the PA range. This is clearly a bug that manifests itself on systems that allocate memory in the higher address space range. [ Modified from Joel's original patch to be based on PHYS_MASK_SHIFT instead of a hard-coded value and to move the alignment check of the allocation to mmu.c. Also added a comment explaining why we hardcode the IPA range and changed the stage-2 pgd allocation to be based on the 40 bit IPA range instead of the maximum possible 48 bit PA range. - Christoffer ] Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Joel Schopp <joel.schopp@amd.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-09-26 14:39:13 +02:00
Marc Zyngier	4956f2bc1f	arm/arm64: KVM: vgic: delay vgic allocation until init time It is now quite easy to delay the allocation of the vgic tables until we actually require it to be up and running (when the first vcpu is kicking around, or someones tries to access the GIC registers). This allow us to allocate memory for the exact number of CPUs we have. As nobody configures the number of interrupts just yet, use a fallback to VGIC_NR_IRQS_LEGACY. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-09-18 18:48:58 -07:00
Marc Zyngier	c1bfb577ad	arm/arm64: KVM: vgic: switch to dynamic allocation So far, all the VGIC data structures are statically defined by the maximum number of vcpus and interrupts it supports. It means that we always have to oversize it to cater for the worse case. Start by changing the data structures to be dynamically sizeable, and allocate them at runtime. The sizes are still very static though. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-09-18 18:48:52 -07:00
Christoffer Dall	a875dafcf9	Merge remote-tracking branch 'kvm/next' into queue Conflicts: arch/arm64/include/asm/kvm_host.h virt/kvm/arm/vgic.c	2014-09-18 18:15:32 -07:00
Ard Biesheuvel	a7d079cea2	ARM/arm64: KVM: fix use of WnR bit in kvm_is_write_fault() The ISS encoding for an exception from a Data Abort has a WnR bit[6] that indicates whether the Data Abort was caused by a read or a write instruction. While there are several fields in the encoding that are only valid if the ISV bit[24] is set, WnR is not one of them, so we can read it unconditionally. Instead of fixing both implementations of kvm_is_write_fault() in place, reimplement it just once using kvm_vcpu_dabt_iswrite(), which already does the right thing with respect to the WnR bit. Also fix up the callers to pass 'vcpu' Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-09-11 11:31:13 +01:00
Radim Krčmář	13a34e067e	KVM: remove garbage arg to *hardware_{en,dis}able In the beggining was on_each_cpu(), which required an unused argument to kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten. Remove unnecessary arguments that stem from this. Signed-off-by: Radim KrÄmÃ¡Å™ <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-08-29 16:35:55 +02:00
Radim Krčmář	0865e636ae	KVM: static inline empty kvm_arch functions Using static inline is going to save few bytes and cycles. For example on powerpc, the difference is 700 B after stripping. (5 kB before) This patch also deals with two overlooked empty functions: kvm_arch_flush_shadow was not removed from arch/mips/kvm/mips.c `2df72e9bc` KVM: split kvm_arch_flush_shadow and kvm_arch_sched_in never made it into arch/ia64/kvm/kvm-ia64.c. `e790d9ef6` KVM: add kvm_arch_sched_in Signed-off-by: Radim KrÄmÃ¡Å™ <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-08-29 16:35:55 +02:00
Christoffer Dall	05e0127f9e	arm/arm64: KVM: Complete WFI/WFE instructions The architecture specifies that when the processor wakes up from a WFE or WFI instruction, the instruction is considered complete, however we currrently return to EL1 (or EL0) at the WFI/WFE instruction itself. While most guests may not be affected by this because their local exception handler performs an exception returning setting the event bit or with an interrupt pending, some guests like UEFI will get wedged due this little mishap. Simply skip the instruction when we have completed the emulation. Cc: <stable@vger.kernel.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-08-29 11:53:53 +02:00
Pranavkumar Sawargaonkar	f6edbbf36d	ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU X-Gene u-boot runs in EL2 mode with MMU enabled hence we might have stale EL2 tlb enteris when we enable EL2 MMU on each host CPU. This can happen on any ARM/ARM64 board running bootloader in Hyp-mode (or EL2-mode) with MMU enabled. This patch ensures that we flush all Hyp-mode (or EL2-mode) TLBs on each host CPU before enabling Hyp-mode (or EL2-mode) MMU. Cc: <stable@vger.kernel.org> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-08-29 11:53:26 +02:00
Will Deacon	bd218bce92	KVM: ARM/arm64: return -EFAULT if copy_from_user fails in set_timer_reg We currently return the number of bytes not copied if set_timer_reg fails, which is almost certainly not what userspace would like. This patch returns -EFAULT instead. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-08-27 22:49:45 +02:00
Will Deacon	18d457661f	KVM: ARM/arm64: avoid returning negative error code as bool is_valid_cache returns true if the specified cache is valid. Unfortunately, if the parameter passed it out of range, we return -ENOENT, which ends up as true leading to potential hilarity. This patch returns false on the failure path instead. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-08-27 22:49:45 +02:00
Will Deacon	4000be423c	KVM: ARM/arm64: fix broken __percpu annotation Running sparse results in a bunch of noisy address space mismatches thanks to the broken __percpu annotation on kvm_get_running_vcpus. This function returns a pcpu pointer to a pointer, not a pointer to a pcpu pointer. This patch fixes the annotation, which kills the warnings from sparse. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-08-27 22:49:45 +02:00
Christoffer Dall	98047888bb	arm/arm64: KVM: Support KVM_CAP_READONLY_MEM When userspace loads code and data in a read-only memory regions, KVM needs to be able to handle this on arm and arm64. Specifically this is used when running code directly from a read-only flash device; the common scenario is a UEFI blob loaded with the -bios option in QEMU. Note that the MMIO exit on writes to a read-only memory is ABI and can be used to emulate block-erase style flash devices. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-08-27 22:46:09 +02:00
Radim Krčmář	e790d9ef64	KVM: add kvm_arch_sched_in Introduce preempt notifiers for architecture specific code. Advantage over creating a new notifier in every arch is slightly simpler code and guaranteed call order with respect to kvm_sched_in. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-08-21 18:45:21 +02:00
Linus Torvalds	66bb0aa077	Here are the PPC and ARM changes for KVM, which I separated because they had small conflicts (respectively within KVM documentation, and with 3.16-rc changes). Since they were all within the subsystem, I took care of them. Stephen Rothwell reported some snags in PPC builds, but they are all fixed now; the latest linux-next report was clean. New features for ARM include: - KVM VGIC v2 emulation on GICv3 hardware - Big-Endian support for arm/arm64 (guest and host) - Debug Architecture support for arm64 (arm32 is on Christoffer's todo list) And for PPC: - Book3S: Good number of LE host fixes, enable HV on LE - Book3S HV: Add in-guest debug support This release drops support for KVM on the PPC440. As a result, the PPC merge removes more lines than it adds. :) I also included an x86 change, since Davidlohr tied it to an independent bug report and the reporter quickly provided a Tested-by; there was no reason to wait for -rc2. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJT4iIJAAoJEBvWZb6bTYbyZqoP/3Wxy8NWPFJ8HGt81NHlGnDS a9UbL7EibcOEG+aaKqmtBglTD5YDiGBDNCxxiSJaDHt+grLN4fsWIliJob1nJFoO 90f89EWN2XjeCrJXA5nUoeg5tpc5OoYKsiP6pTgzIwkP8vvs/H1+zpcTS/UmYsr/ qipVMMsM+zZeHWZcSbqjW88z7YqIn1sr5282wJ85cbyv4KGizb/G4dyPuDqLb6np hkAD8Ah6VV2suQ2FSy7G2fg20R0vglUi60hkEHLoCBPVqJCl7SmC8MvxNbjBnP8S J36R0R0u1wHYKzAGooLJGVOZ/o/gSiVqKX+++L2EvJBN+kuA6u/7fxLyBT+LwDAE IF/Aln5rpg1fe+eywvhz86WljTVEQ8bO1zVsIQUPY+/ZOPedZHMwyvXft8ogbjSp 2m9OJ/3e8Aggh0OeHpCDoeow+QDUXvX0YdCw+2Yh0p+7VMXqkyp0QEiBu38jrusC rB3VNifJbDSWLKdG9LfCAPHnxZD2XYEwv2WFBo6KQOGMGHfx0GXpCOL/jQihrhA6 HtEG5Bs3lvnHQemdpUZ58xojiABbMaUPdcnPXQQEp23WhZzrfLMLzqVG0VYnhSsC 9pi7MJj8c31rqx5WU2oRM28i/BvNxN0NCtkDpineO5s3f89Ws1xnwxqlm38AKP0J irJQTYFEqec+GM9JK1rG =hyQP -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull second round of KVM changes from Paolo Bonzini: "Here are the PPC and ARM changes for KVM, which I separated because they had small conflicts (respectively within KVM documentation, and with 3.16-rc changes). Since they were all within the subsystem, I took care of them. Stephen Rothwell reported some snags in PPC builds, but they are all fixed now; the latest linux-next report was clean. New features for ARM include: - KVM VGIC v2 emulation on GICv3 hardware - Big-Endian support for arm/arm64 (guest and host) - Debug Architecture support for arm64 (arm32 is on Christoffer's todo list) And for PPC: - Book3S: Good number of LE host fixes, enable HV on LE - Book3S HV: Add in-guest debug support This release drops support for KVM on the PPC440. As a result, the PPC merge removes more lines than it adds. :) I also included an x86 change, since Davidlohr tied it to an independent bug report and the reporter quickly provided a Tested-by; there was no reason to wait for -rc2" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (122 commits) KVM: Move more code under CONFIG_HAVE_KVM_IRQFD KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use KVM: nVMX: Fix nested vmexit ack intr before load vmcs01 KVM: PPC: Enable IRQFD support for the XICS interrupt controller KVM: Give IRQFD its own separate enabling Kconfig option KVM: Move irq notifier implementation into eventfd.c KVM: Move all accesses to kvm::irq_routing into irqchip.c KVM: irqchip: Provide and use accessors for irq routing table KVM: Don't keep reference to irq routing table in irqfd struct KVM: PPC: drop duplicate tracepoint arm64: KVM: fix 64bit CP15 VM access for 32bit guests KVM: arm64: GICv3: mandate page-aligned GICV region arm64: KVM: GICv3: move system register access to msr_s/mrs_s KVM: PPC: PR: Handle FSCR feature deselects KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr KVM: PPC: Remove DCR handling KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Handle magic page in kvmppc_ld/st ...	2014-08-07 11:35:30 -07:00
Paolo Bonzini	cc568ead3c	Patch queue for ppc - 2014-08-01 Highlights in this release include: - BookE: Rework instruction fetch, not racy anymore now - BookE HV: Fix ONE_REG accessors for some in-hardware registers - Book3S: Good number of LE host fixes, enable HV on LE - Book3S: Some misc bug fixes - Book3S HV: Add in-guest debug support - Book3S HV: Preload cache lines on context switch - Remove 440 support Alexander Graf (31): KVM: PPC: Book3s PR: Disable AIL mode with OPAL KVM: PPC: Book3s HV: Fix tlbie compile error KVM: PPC: Book3S PR: Handle hyp doorbell exits KVM: PPC: Book3S PR: Fix ABIv2 on LE KVM: PPC: Book3S PR: Fix sparse endian checks PPC: Add asm helpers for BE 32bit load/store KVM: PPC: Book3S HV: Make HTAB code LE host aware KVM: PPC: Book3S HV: Access guest VPA in BE KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE KVM: PPC: Book3S HV: Access XICS in BE KVM: PPC: Book3S HV: Fix ABIv2 on LE KVM: PPC: Book3S HV: Enable for little endian hosts KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct KVM: PPC: Deflect page write faults properly in kvmppc_st KVM: PPC: Book3S: Stop PTE lookup on write errors KVM: PPC: Book3S: Add hack for split real mode KVM: PPC: Book3S: Make magic page properly 4k mappable KVM: PPC: Remove 440 support KVM: Rename and add argument to check_extension KVM: Allow KVM_CHECK_EXTENSION on the vm fd KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode KVM: PPC: Implement kvmppc_xlate for all targets KVM: PPC: Move kvmppc_ld/st to common code KVM: PPC: Remove kvmppc_bad_hva() KVM: PPC: Use kvm_read_guest in kvmppc_ld KVM: PPC: Handle magic page in kvmppc_ld/st KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Remove DCR handling KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: PR: Handle FSCR feature deselects Alexey Kardashevskiy (1): KVM: PPC: Book3S: Fix LPCR one_reg interface Aneesh Kumar K.V (4): KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation KVM: PPC: BOOK3S: PR: Emulate virtual timebase register KVM: PPC: BOOK3S: PR: Emulate instruction counter KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page Anton Blanchard (2): KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Bharat Bhushan (10): kvm: ppc: bookehv: Added wrapper macros for shadow registers kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR kvm: ppc: booke: Add shared struct helpers of SPRN_ESR kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 kvm: ppc: Add SPRN_EPR get helper function kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit KVM: PPC: Booke-hv: Add one reg interface for SPRG9 KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr Michael Neuling (1): KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling Mihai Caraman (8): KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule KVM: PPC: e500: Fix default tlb for victim hint KVM: PPC: e500: Emulate power management control SPR KVM: PPC: e500mc: Revert "add load inst fixup" KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 KVM: PPC: Book3s: Remove kvmppc_read_inst() function KVM: PPC: Allow kvmppc_get_last_inst() to fail KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Paul Mackerras (4): KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication Stewart Smith (2): Split out struct kvmppc_vcore creation to separate function Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJT21skAAoJECszeR4D/txgeFEP/AzJopN7s//W33CfyBqURHXp XALCyAw+S67gtcaTZbxomcG1xuT8Lj9WEw28iz3rCtAnJwIxsY63xrI1nXMzTaI2 p1rC0ai5Qy+nlEbd6L78spZy/Nzh8DFYGWx78iUSO1mYD8xywJwtoiBA539pwp8j 8N+mgn61Hwhv31bKtsZlmzXymVr/jbTp5LVuxsBLJwD2lgT49g+4uBnX2cG/iXkg Rzbh7LxoNNXrSPI8sYmTWu/81aeXteeX70ja6DHuV5dWLNTuAXJrh5EUfeAZqBrV aYcLWUYmIyB87txNmt6ZGVar2p3jr2Xhb9mKx+EN4dbehblanLc1PUqlHd0q3dKc Nt60ByqpZn+qDAK86dShSZLEe+GT3lovvE76CqVXD4Er+OUEkc9JoxhN1cof/Gb0 o6uwZ2isXHRdGoZx5vb4s3UTOlwZGtoL/CyY/HD/ujYDSURkCGbxLj3kkecSY8ut QdDAWsC15BwsHtKLr5Zwjp2w+0eGq2QJgfvO0zqWFiz9k33SCBCUpwluFeqh27Hi aR5Wir3j+MIw9G8XlYlDJWYfi0h/SZ4G7hh7jSu26NBNBzQsDa8ow/cLzdMhdUwH OYSaeqVk5wiRb9to1uq1NQWPA0uRAx3BSjjvr9MCGRqmvn+FV5nj637YWUT+53Hi aSvg/U2npghLPPG2cihu =JuLr -----END PGP SIGNATURE----- Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm Patch queue for ppc - 2014-08-01 Highlights in this release include: - BookE: Rework instruction fetch, not racy anymore now - BookE HV: Fix ONE_REG accessors for some in-hardware registers - Book3S: Good number of LE host fixes, enable HV on LE - Book3S: Some misc bug fixes - Book3S HV: Add in-guest debug support - Book3S HV: Preload cache lines on context switch - Remove 440 support Alexander Graf (31): KVM: PPC: Book3s PR: Disable AIL mode with OPAL KVM: PPC: Book3s HV: Fix tlbie compile error KVM: PPC: Book3S PR: Handle hyp doorbell exits KVM: PPC: Book3S PR: Fix ABIv2 on LE KVM: PPC: Book3S PR: Fix sparse endian checks PPC: Add asm helpers for BE 32bit load/store KVM: PPC: Book3S HV: Make HTAB code LE host aware KVM: PPC: Book3S HV: Access guest VPA in BE KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE KVM: PPC: Book3S HV: Access XICS in BE KVM: PPC: Book3S HV: Fix ABIv2 on LE KVM: PPC: Book3S HV: Enable for little endian hosts KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct KVM: PPC: Deflect page write faults properly in kvmppc_st KVM: PPC: Book3S: Stop PTE lookup on write errors KVM: PPC: Book3S: Add hack for split real mode KVM: PPC: Book3S: Make magic page properly 4k mappable KVM: PPC: Remove 440 support KVM: Rename and add argument to check_extension KVM: Allow KVM_CHECK_EXTENSION on the vm fd KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode KVM: PPC: Implement kvmppc_xlate for all targets KVM: PPC: Move kvmppc_ld/st to common code KVM: PPC: Remove kvmppc_bad_hva() KVM: PPC: Use kvm_read_guest in kvmppc_ld KVM: PPC: Handle magic page in kvmppc_ld/st KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Remove DCR handling KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: PR: Handle FSCR feature deselects Alexey Kardashevskiy (1): KVM: PPC: Book3S: Fix LPCR one_reg interface Aneesh Kumar K.V (4): KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation KVM: PPC: BOOK3S: PR: Emulate virtual timebase register KVM: PPC: BOOK3S: PR: Emulate instruction counter KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page Anton Blanchard (2): KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Bharat Bhushan (10): kvm: ppc: bookehv: Added wrapper macros for shadow registers kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR kvm: ppc: booke: Add shared struct helpers of SPRN_ESR kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 kvm: ppc: Add SPRN_EPR get helper function kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit KVM: PPC: Booke-hv: Add one reg interface for SPRG9 KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr Michael Neuling (1): KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling Mihai Caraman (8): KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule KVM: PPC: e500: Fix default tlb for victim hint KVM: PPC: e500: Emulate power management control SPR KVM: PPC: e500mc: Revert "add load inst fixup" KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 KVM: PPC: Book3s: Remove kvmppc_read_inst() function KVM: PPC: Allow kvmppc_get_last_inst() to fail KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Paul Mackerras (4): KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication Stewart Smith (2): Split out struct kvmppc_vcore creation to separate function Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 Conflicts: Documentation/virtual/kvm/api.txt	2014-08-05 09:58:11 +02:00
Alexander Graf	784aa3d7fb	KVM: Rename and add argument to check_extension In preparation to make the check_extension function available to VM scope we add a struct kvm * argument to the function header and rename the function accordingly. It will still be called from the /dev/kvm fd, but with a NULL argument for struct kvm *. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com>	2014-07-28 15:23:17 +02:00
Russell King	6ebbf2ce43	ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-18 12:29:04 +01:00
Russell King	af040ffc9b	ARM: make it easier to check the CPU part number correctly Ensure that platform maintainers check the CPU part number in the right manner: the CPU part number is meaningless without also checking the CPU implement(e\|o)r (choose your preferred spelling!) Provide an interface which returns both the implementer and part number together, and update the definitions to include the implementer. Mark the old function as being deprecated... indeed, using the old function with the definitions will now always evaluate as false, so people must update their un-merged code to the new function. While this could be avoided by adding new definitions, we'd also have to create new names for them which would be awkward. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-18 12:29:02 +01:00
Victor Kamensky	f5aa462147	ARM: KVM: enable KVM in Kconfig on big-endian systems Previous patches addresses ARMV7 big-endian virtualiztion, kvm related issues, so enable ARM_VIRT_EXT for big-endian now. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:41 -07:00
Victor Kamensky	73891f72c4	ARM: KVM: one_reg coproc set and get BE fixes Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers. In order to work in endian agnostic way reg_from_user to reg_to_user functions should copy register value only to kernel variable with size that matches register size. In few place where size mismatch existed fix issue on macro caller side. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:40 -07:00
Victor Kamensky	6d7311b520	ARM: KVM: __kvm_vcpu_run function return result fix in BE case The __kvm_vcpu_run function returns a 64-bit result in two registers, which has to be adjusted for BE case. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:39 -07:00
Victor Kamensky	19b0e60a63	ARM: KVM: handle 64bit values passed to mrcc or from mcrr instructions in BE case In some cases the mcrr and mrrc instructions in combination with the ldrd and strd instructions need to deal with 64bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries but to get effect of the whole 64bit value represented correctly, rr_lo_hi macro is introduced and is used to swap registers positions when the mcrr and mrrc instructions are used. That has the effect of swapping two words. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:38 -07:00
Victor Kamensky	64054c25cf	ARM: KVM: fix vgic V7 assembler code to work in BE image The vgic h/w registers are little endian; when BE asm code reads/writes from/to them, it needs to do byteswap after/before. Byteswap code uses ARM_BE8 wrapper to add swap only if CONFIG_CPU_BIG_ENDIAN is configured. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:38 -07:00
Marc Zyngier	8f186d522c	KVM: ARM: vgic: split GICv2 backend from the main vgic code Brutally hack the innocent vgic code, and move the GICv2 specific code to its own file, using vgic_ops and vgic_params as a way to pass information between the two blocks. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:34 -07:00
Marc Zyngier	eede821dbf	KVM: arm/arm64: vgic: move GICv2 registers to their own structure In order to make way for the GICv3 registers, move the v2-specific registers to their own structure. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:57:31 -07:00
Alex Bennée	1df08ba0aa	arm64: KVM: allow export and import of generic timer regs For correct guest suspend/resume behaviour we need to ensure we include the generic timer registers for 64 bit guests. As CONFIG_KVM_ARM_TIMER is always set for arm64 we don't need to worry about null implementations. However I have re-jigged the kvm_arm_timer_set/get_reg declarations to be in the common include/kvm/arm_arch_timer.h headers. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-07-11 04:46:55 -07:00
Kim Phillips	b88657674d	ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping A userspace process can map device MMIO memory via VFIO or /dev/mem, e.g., for platform device passthrough support in QEMU. During early development, we found the PAGE_S2 memory type being used for MMIO mappings. This patch corrects that by using the more strongly ordered memory type for device MMIO mappings: PAGE_S2_DEVICE. Signed-off-by: Kim Phillips <kim.phillips@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2014-07-11 04:46:53 -07:00
Eric Auger	df6ce24f2e	ARM: KVM: Unmap IPA on memslot delete/move Currently when a KVM region is deleted or moved after KVM_SET_USER_MEMORY_REGION ioctl, the corresponding intermediate physical memory is not unmapped. This patch corrects this and unmaps the region's IPA range in kvm_arch_commit_memory_region using unmap_stage2_range. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-07-11 04:46:52 -07:00
Christoffer Dall	4f853a714b	arm/arm64: KVM: Fix and refactor unmap_range unmap_range() was utterly broken, to quote Marc, and broke in all sorts of situations. It was also quite complicated to follow and didn't follow the usual scheme of having a separate iterating function for each level of page tables. Address this by refactoring the code and introduce a pgd_clear() function. Reviewed-by: Jungseok Lee <jays.lee@samsung.com> Reviewed-by: Mario Smarduch <m.smarduch@samsung.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-07-11 04:46:51 -07:00
Linus Torvalds	b05d59dfce	At over 200 commits, covering almost all supported architectures, this was a pretty active cycle for KVM. Changes include: - a lot of s390 changes: optimizations, support for migration, GDB support and more - ARM changes are pretty small: support for the PSCI 0.2 hypercall interface on both the guest and the host (the latter acked by Catalin) - initial POWER8 and little-endian host support - support for running u-boot on embedded POWER targets - pretty large changes to MIPS too, completing the userspace interface and improving the handling of virtualized timer hardware - for x86, a larger set of changes is scheduled for 3.17. Still, we have a few emulator bugfixes and support for running nested fully-virtualized Xen guests (para-virtualized Xen guests have always worked). And some optimizations too. The only missing architecture here is ia64. It's not a coincidence that support for KVM on ia64 is scheduled for removal in 3.17. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTjtlBAAoJEBvWZb6bTYbyMOUP/2NAePghE3IjG99ikHFdn+BX BfrURsuR6GD0AhYQnBidBmpFbAmN/LwSJxv/M7sV7OBRWLu3qbt69DrPTU2e/FK1 j9q25peu8jRyHzJ1q9rBroo74nD9lQYuVr3uXNxxcg0DRnw14JHGlM3y8LDEknO8 W+gpWTeAQ+2AuOX98MpRbCRMuzziCSv5bP5FhBVnsWHiZfvMbcUrbeJt+zYSiDAZ 0tHm/5dFKzfj/vVrrnjD4EZcRr688Bs5rztG96hY6aoVJryjZGLtLp92wCWkRRmH CCvZwd245NmNthuKHzcs27/duSWfU0uOlu7AMrD44QYhzeDGyB/2nbCxbGqLLoBA nnOviXH4cC65/CnisZ79zfo979HbZcX+Lzg747EjBgCSxJmLlwgiG8yXtDvk5otB TH6GUeGDiEEPj//JD3XtgSz0sF2NvjREWRyemjDMvhz6JC/bLytXKb3sn+NXSj8m ujzF9eQoa4qKDcBL4IQYGTJ4z5nY3Pd68dHFIPHB7n82OxFLSQUBKxXw8/1fb5og VVb8PL4GOcmakQlAKtTMlFPmuy4bbL2r/2iV5xJiOZKmXIu8Hs1JezBE3SFAltbl 3cAGwSM9/dDkKxUbTFblyOE9bkKbg4WYmq0LkdzsPEomb3IZWntOT25rYnX+LrBz bAknaZpPiOrW11Et1htY =j5Od -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm into next Pull KVM updates from Paolo Bonzini: "At over 200 commits, covering almost all supported architectures, this was a pretty active cycle for KVM. Changes include: - a lot of s390 changes: optimizations, support for migration, GDB support and more - ARM changes are pretty small: support for the PSCI 0.2 hypercall interface on both the guest and the host (the latter acked by Catalin) - initial POWER8 and little-endian host support - support for running u-boot on embedded POWER targets - pretty large changes to MIPS too, completing the userspace interface and improving the handling of virtualized timer hardware - for x86, a larger set of changes is scheduled for 3.17. Still, we have a few emulator bugfixes and support for running nested fully-virtualized Xen guests (para-virtualized Xen guests have always worked). And some optimizations too. The only missing architecture here is ia64. It's not a coincidence that support for KVM on ia64 is scheduled for removal in 3.17" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits) KVM: add missing cleanup_srcu_struct KVM: PPC: Book3S PR: Rework SLB switching code KVM: PPC: Book3S PR: Use SLB entry 0 KVM: PPC: Book3S HV: Fix machine check delivery to guest KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs KVM: PPC: Book3S HV: Make sure we don't miss dirty pages KVM: PPC: Book3S HV: Fix dirty map for hugepages KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates() KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number KVM: PPC: Book3S: Add ONE_REG register names that were missed KVM: PPC: Add CAP to indicate hcall fixes KVM: PPC: MPIC: Reset IRQ source private members KVM: PPC: Graciously fail broken LE hypercalls PPC: ePAPR: Fix hypercall on LE guest KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler KVM: PPC: BOOK3S: Always use the saved DAR value PPC: KVM: Make NX bit available with magic page KVM: PPC: Disable NX for old magic page using guests KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest ...	2014-06-04 08:47:12 -07:00
Anup Patel	4447a208f7	ARM/ARM64: KVM: Advertise KVM_CAP_ARM_PSCI_0_2 to user space We have PSCI v0.2 emulation available in KVM ARM/ARM64 hence advertise this to user space (i.e. QEMU or KVMTOOL) via KVM_CHECK_EXTENSION ioctl. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:58 -07:00
Anup Patel	b376d02b53	ARM/ARM64: KVM: Emulate PSCI v0.2 CPU_SUSPEND This patch adds emulation of PSCI v0.2 CPU_SUSPEND function call for KVM ARM/ARM64. This is a CPU-level function call which can suspend current CPU or current CPU cluster. We don't have VCPU clusters in KVM so we only suspend the current VCPU. The CPU_SUSPEND emulation is not tested much because currently there is no CPUIDLE driver in Linux kernel that uses PSCI CPU_SUSPEND. The PSCI CPU_SUSPEND implementation in ARM64 kernel was tested using a Simple CPUIDLE driver which is not published due to unstable DT-bindings for PSCI. (For more info, http://lwn.net/Articles/574950/) For simplicity, we implement CPU_SUSPEND emulation similar to WFI (Wait-for-interrupt) emulation and we also treat power-down request to be same as stand-by request. This is consistent with section 5.4.1 and section 5.4.2 of PSCI v0.2 specification. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:58 -07:00
Anup Patel	aa8aeefe5e	ARM/ARM64: KVM: Fix CPU_ON emulation for PSCI v0.2 As-per PSCI v0.2, the source CPU provides physical address of "entry point" and "context id" for starting a target CPU. Also, if target CPU is already running then we should return ALREADY_ON. Current emulation of CPU_ON function does not consider physical address of "context id" and returns INVALID_PARAMETERS if target CPU is already running. This patch updates kvm_psci_vcpu_on() such that it works for both PSCI v0.1 and PSCI v0.2. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:58 -07:00
Anup Patel	bab0b43012	ARM/ARM64: KVM: Emulate PSCI v0.2 MIGRATE_INFO_TYPE and related functions This patch adds emulation of PSCI v0.2 MIGRATE, MIGRATE_INFO_TYPE, and MIGRATE_INFO_UP_CPU function calls for KVM ARM/ARM64. KVM ARM/ARM64 being a hypervisor (and not a Trusted OS), we cannot provide this functions hence we emulate these functions in following way: 1. MIGRATE - Returns "Not Supported" 2. MIGRATE_INFO_TYPE - Return 2 i.e. Trusted OS is not present 3. MIGRATE_INFO_UP_CPU - Returns "Not Supported" Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:58 -07:00
Anup Patel	e6bc13c8a7	ARM/ARM64: KVM: Emulate PSCI v0.2 AFFINITY_INFO This patch adds emulation of PSCI v0.2 AFFINITY_INFO function call for KVM ARM/ARM64. This is a VCPU-level function call which will be used to determine current state of given affinity level. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:58 -07:00
Anup Patel	4b1238269e	ARM/ARM64: KVM: Emulate PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET The PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET functions are system-level functions hence cannot be fully emulated by in-kernel PSCI emulation code. To tackle this, we forward PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET function calls from vcpu to user space (i.e. QEMU or KVMTOOL) via kvm_run structure using KVM_EXIT_SYSTEM_EVENT exit reasons. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:58 -07:00
Anup Patel	e8e7fcc5e2	ARM/ARM64: KVM: Make kvm_psci_call() return convention more flexible Currently, the kvm_psci_call() returns 'true' or 'false' based on whether the PSCI function call was handled successfully or not. This does not help us emulate system-level PSCI functions where the actual emulation work will be done by user space (QEMU or KVMTOOL). Examples of such system-level PSCI functions are: PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET. This patch updates kvm_psci_call() to return three types of values: 1) > 0 (success) 2) = 0 (success but exit to user space) 3) < 0 (errors) Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:57 -07:00
Anup Patel	7d0f84aae9	ARM/ARM64: KVM: Add base for PSCI v0.2 emulation Currently, the in-kernel PSCI emulation provides PSCI v0.1 interface to VCPUs. This patch extends current in-kernel PSCI emulation to provide PSCI v0.2 interface to VCPUs. By default, ARM/ARM64 KVM will always provide PSCI v0.1 interface for keeping the ABI backward-compatible. To select PSCI v0.2 interface for VCPUs, the user space (i.e. QEMU or KVMTOOL) will have to set KVM_ARM_VCPU_PSCI_0_2 feature when doing VCPU init using KVM_ARM_VCPU_INIT ioctl. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-30 04:18:57 -07:00
Mark Salter	5d4e08c45a	arm: KVM: fix possible misalignment of PGDs and bounce page The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate a bounce page (if hypervisor init code crosses page boundary) and hypervisor PGDs. The problem is that kalloc() does not guarantee the proper alignment. In the case of the bounce page, the page sized buffer allocated may also cross a page boundary negating the purpose and leading to a hang during kvm initialization. Likewise the PGDs allocated may not meet the minimum alignment requirements of the underlying MMU. This patch uses __get_free_page() to guarantee the worst case alignment needs of the bounce page and PGDs on both arm and arm64. Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-28 03:21:48 -07:00
Will Deacon	4e4468fac4	ARM: KVM: disable KVM in Kconfig on big-endian systems KVM currently crashes and burns on big-endian hosts, so don't allow it to be selected until we've got that fixed. Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-04-26 04:20:03 -07:00
Linus Torvalds	467a9e1633	CPU hotplug notifiers registration fixes for 3.15-rc1 The purpose of this single series of commits from Srivatsa S Bhat (with a small piece from Gautham R Shenoy) touching multiple subsystems that use CPU hotplug notifiers is to provide a way to register them that will not lead to deadlocks with CPU online/offline operations as described in the changelog of commit `93ae4f978c` (CPU hotplug: Provide lockless versions of callback registration functions). The first three commits in the series introduce the API and document it and the rest simply goes through the users of CPU hotplug notifiers and converts them to using the new method. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+ 4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0 BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ sWl//rg76nb13dFjvF+q =SW7Q -----END PGP SIGNATURE----- Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull CPU hotplug notifiers registration fixes from Rafael Wysocki: "The purpose of this single series of commits from Srivatsa S Bhat (with a small piece from Gautham R Shenoy) touching multiple subsystems that use CPU hotplug notifiers is to provide a way to register them that will not lead to deadlocks with CPU online/offline operations as described in the changelog of commit `93ae4f978c` ("CPU hotplug: Provide lockless versions of callback registration functions"). The first three commits in the series introduce the API and document it and the rest simply goes through the users of CPU hotplug notifiers and converts them to using the new method" * tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits) net/iucv/iucv.c: Fix CPU hotplug callback registration net/core/flow.c: Fix CPU hotplug callback registration mm, zswap: Fix CPU hotplug callback registration mm, vmstat: Fix CPU hotplug callback registration profile: Fix CPU hotplug callback registration trace, ring-buffer: Fix CPU hotplug callback registration xen, balloon: Fix CPU hotplug callback registration hwmon, via-cputemp: Fix CPU hotplug callback registration hwmon, coretemp: Fix CPU hotplug callback registration thermal, x86-pkg-temp: Fix CPU hotplug callback registration octeon, watchdog: Fix CPU hotplug callback registration oprofile, nmi-timer: Fix CPU hotplug callback registration intel-idle: Fix CPU hotplug callback registration clocksource, dummy-timer: Fix CPU hotplug callback registration drivers/base/topology.c: Fix CPU hotplug callback registration acpi-cpufreq: Fix CPU hotplug callback registration zsmalloc: Fix CPU hotplug callback registration scsi, fcoe: Fix CPU hotplug callback registration scsi, bnx2fc: Fix CPU hotplug callback registration scsi, bnx2i: Fix CPU hotplug callback registration ...	2014-04-07 14:55:46 -07:00
Srivatsa S. Bhat	8146875de7	arm, kvm: Fix CPU hotplug callback registration On 03/15/2014 12:40 AM, Christoffer Dall wrote: > On Fri, Mar 14, 2014 at 11:13:29AM +0530, Srivatsa S. Bhat wrote: >> On 03/13/2014 04:51 AM, Christoffer Dall wrote: >>> On Tue, Mar 11, 2014 at 02:05:38AM +0530, Srivatsa S. Bhat wrote: >>>> Subsystems that want to register CPU hotplug callbacks, as well as perform >>>> initialization for the CPUs that are already online, often do it as shown >>>> below: >>>> [...] >>> Just so we're clear, the existing code was simply racy as not prone to >>> deadlocks, right? >>> >>> This makes it clear that the test above for compatible CPUs can be quite >>> easily evaded by using CPU hotplug, but we don't really have a good >>> solution for handling that yet... Hmmm, grumble grumble, I guess if you >>> hotplug unsupported CPUs on a KVM/ARM system for now, stuff will break. >>> >> >> In this particular case, there was no deadlock possibility, rather the >> existing code had insufficient synchronization against CPU hotplug. >> >> init_hyp_mode() would invoke cpu_init_hyp_mode() on currently online CPUs >> using on_each_cpu(). If a CPU came online after this point and before calling >> register_cpu_notifier(), that CPU would remain uninitialized because this >> subsystem would miss the hot-online event. This patch fixes this bug and >> also uses the new synchronization method (instead of get/put_online_cpus()) >> to ensure that we don't deadlock with CPU hotplug. >> > > Yes, that was my conclusion as well. Thanks for clarifying. (It could > be noted in the commit message as well if you should feel so inclined). > Please find the patch with updated changelog (and your Ack) below. (No changes in code). From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Subject: [PATCH] arm, kvm: Fix CPU hotplug callback registration Subsystems that want to register CPU hotplug callbacks, as well as perform initialization for the CPUs that are already online, often do it as shown below: get_online_cpus(); for_each_online_cpu(cpu) init_cpu(cpu); register_cpu_notifier(&foobar_cpu_notifier); put_online_cpus(); This is wrong, since it is prone to ABBA deadlocks involving the cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently with CPU hotplug operations). Instead, the correct and race-free way of performing the callback registration is: cpu_notifier_register_begin(); for_each_online_cpu(cpu) init_cpu(cpu); /* Note the use of the double underscored version of the API */ __register_cpu_notifier(&foobar_cpu_notifier); cpu_notifier_register_done(); In the existing arm kvm code, there is no synchronization with CPU hotplug to avoid missing the hotplug events that might occur after invoking init_hyp_mode() and before calling register_cpu_notifier(). Fix this bug and also use the new synchronization method (instead of get/put_online_cpus()) to ensure that we don't deadlock with CPU hotplug. Cc: Gleb Natapov <gleb@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-03-20 13:43:41 +01:00
Marc Zyngier	56041bf920	ARM: KVM: fix warning in mmu.c Compiling with THP enabled leads to the following warning: arch/arm/kvm/mmu.c: In function ‘unmap_range’: arch/arm/kvm/mmu.c:177:39: warning: ‘pte’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (kvm_pmd_huge(*pmd) \|\| page_empty(pte)) { ^ Code inspection reveals that these two cases are mutually exclusive, so GCC is a bit overzealous here. Silence it anyway by initializing pte to NULL and testing it later on. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-03-03 01:15:25 +00:00
Marc Zyngier	8034699a42	ARM: KVM: trap VM system registers until MMU and caches are ON In order to be able to detect the point where the guest enables its MMU and caches, trap all the VM related system registers. Once we see the guest enabling both the MMU and the caches, we can go back to a saner mode of operation, which is to leave these registers in complete control of the guest. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-03-03 01:15:24 +00:00
Marc Zyngier	af20814ee9	ARM: KVM: add world-switch for AMAIR{0,1} HCR.TVM traps (among other things) accesses to AMAIR0 and AMAIR1. In order to minimise the amount of surprise a guest could generate by trying to access these registers with caches off, add them to the list of registers we switch/handle. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-03-03 01:15:24 +00:00
Marc Zyngier	ac30a11e8e	ARM: KVM: introduce per-vcpu HYP Configuration Register So far, KVM/ARM used a fixed HCR configuration per guest, except for the VI/VF/VA bits to control the interrupt in absence of VGIC. With the upcoming need to dynamically reconfigure trapping, it becomes necessary to allow the HCR to be changed on a per-vcpu basis. The fix here is to mimic what KVM/arm64 already does: a per vcpu HCR field, initialized at setup time. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-03-03 01:15:23 +00:00
Marc Zyngier	547f781378	ARM: KVM: fix ordering of 64bit coprocessor accesses Commit `240e99cbd0` (ARM: KVM: Fix 64-bit coprocessor handling) added an ordering dependency for the 64bit registers. The order described is: CRn, CRm, Op1, Op2, 64bit-first. Unfortunately, the implementation is: CRn, 64bit-first, CRm... Move the 64bit test to be last in order to match the documentation. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-03-03 01:15:23 +00:00
Marc Zyngier	46c214dd59	ARM: KVM: fix handling of trapped 64bit coprocessor accesses Commit `240e99cbd0` (ARM: KVM: Fix 64-bit coprocessor handling) changed the way we match the 64bit coprocessor access from user space, but didn't update the trap handler for the same set of registers. The effect is that a trapped 64bit access is never matched, leading to a fault being injected into the guest. This went unnoticed as we didn't really trap any 64bit register so far. Placing the CRm field of the access into the CRn field of the matching structure fixes the problem. Also update the debug feature to emit the expected string in case of failing match. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-03-03 01:15:23 +00:00
Marc Zyngier	9d218a1fcf	arm64: KVM: flush VM pages before letting the guest enable caches When the guest runs with caches disabled (like in an early boot sequence, for example), all the writes are diectly going to RAM, bypassing the caches altogether. Once the MMU and caches are enabled, whatever sits in the cache becomes suddenly visible, which isn't what the guest expects. A way to avoid this potential disaster is to invalidate the cache when the MMU is being turned on. For this, we hook into the SCTLR_EL1 trapping code, and scan the stage-2 page tables, invalidating the pages/sections that have already been mapped in. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-03-03 01:15:22 +00:00
Marc Zyngier	a3c8bd31af	ARM: KVM: introduce kvm_pd_addr_end The use of pd_addr_end with stage-2 translation is slightly dodgy, as the IPA is 40bits, while all the p*d_addr_end helpers are taking an unsigned long (arm64 is fine with that as unligned long is 64bit). The fix is to introduce 64bit clean versions of the same helpers, and use them in the stage-2 page table code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-03-03 01:15:22 +00:00
Marc Zyngier	2d58b733c8	arm64: KVM: force cache clean on page fault when caches are off In order for the guest with caches off to observe data written contained in a given page, we need to make sure that page is committed to memory, and not just hanging in the cache (as guest accesses are completely bypassing the cache until it decides to enable it). For this purpose, hook into the coherent_icache_guest_page function and flush the region if the guest SCTLR_EL1 register doesn't show the MMU and caches as being enabled. The function also get renamed to coherent_cache_guest_page. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-03-03 01:15:20 +00:00
Marc Zyngier	b20c9f29c5	arm/arm64: KVM: detect CPU reset on CPU_PM_EXIT Commit `1fcf7ce0c6` (arm: kvm: implement CPU PM notifier) added support for CPU power-management, using a cpu_notifier to re-init KVM on a CPU that entered CPU idle. The code assumed that a CPU entering idle would actually be powered off, loosing its state entierely, and would then need to be reinitialized. It turns out that this is not always the case, and some HW performs CPU PM without actually killing the core. In this case, we try to reinitialize KVM while it is still live. It ends up badly, as reported by Andre Przywara (using a Calxeda Midway): [ 3.663897] Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x685760 [ 3.663897] unexpected data abort in Hyp mode at: 0xc067d150 [ 3.663897] unexpected HVC/SVC trap in Hyp mode at: 0xc0901dd0 The trick here is to detect if we've been through a full re-init or not by looking at HVBAR (VBAR_EL2 on arm64). This involves implementing the backend for __hyp_get_vectors in the main KVM HYP code (rather small), and checking the return value against the default one when the CPU notifier is called on CPU_PM_EXIT. Reported-by: Andre Przywara <osp@andrep.de> Tested-by: Andre Przywara <osp@andrep.de> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Rob Herring <rob.herring@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-02-27 19:27:10 +01:00
Linus Torvalds	7ebd3faa9b	First round of KVM updates for 3.14; PPC parts will come next week. Nothing major here, just bugfixes all over the place. The most interesting part is the ARM guys' virtualized interrupt controller overhaul, which lets userspace get/set the state and thus enables migration of ARM VMs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH 5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0 iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8 FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z 2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu x7N8KR5hAj+mLBoM9/Al =8sYU -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "First round of KVM updates for 3.14; PPC parts will come next week. Nothing major here, just bugfixes all over the place. The most interesting part is the ARM guys' virtualized interrupt controller overhaul, which lets userspace get/set the state and thus enables migration of ARM VMs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits) kvm: make KVM_MMU_AUDIT help text more readable KVM: s390: Fix memory access error detection KVM: nVMX: Update guest activity state field on L2 exits KVM: nVMX: Fix nested_run_pending on activity state HLT KVM: nVMX: Clean up handling of VMX-related MSRs KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit KVM: nVMX: Leave VMX mode on clearing of feature control MSR KVM: VMX: Fix DR6 update on #DB exception KVM: SVM: Fix reading of DR6 KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS add support for Hyper-V reference time counter KVM: remove useless write to vcpu->hv_clock.tsc_timestamp KVM: x86: fix tsc catchup issue with tsc scaling KVM: x86: limit PIT timer frequency KVM: x86: handle invalid root_hpa everywhere kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub kvm: vfio: silence GCC warning KVM: ARM: Remove duplicate include arm/arm64: KVM: relax the requirements of VMA alignment for THP ...	2014-01-22 21:40:43 -08:00
Paolo Bonzini	ab53f22e2e	Merge tag 'kvm-arm-for-3.14' of git://git.linaro.org/people/christoffer.dall/linux-kvm-arm into kvm-queue	2014-01-15 12:14:29 +01:00
Sachin Kamat	61466710de	KVM: ARM: Remove duplicate include trace.h was included twice. Remove duplicate inclusion. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-01-08 13:54:00 -08:00
Marc Zyngier	136d737fd2	arm/arm64: KVM: relax the requirements of VMA alignment for THP The THP code in KVM/ARM is a bit restrictive in not allowing a THP to be used if the VMA is not 2MB aligned. Actually, it is not so much the VMA that matters, but the associated memslot: A process can perfectly mmap a region with no particular alignment restriction, and then pass a 2MB aligned address to KVM. In this case, KVM will only use this 2MB aligned region, and will ignore the range between vma->vm_start and memslot->userspace_addr. It can also choose to place this memslot at whatever alignment it wants in the IPA space. In the end, what matters is the relative alignment of the user space and IPA mappings with respect to a 2M page. They absolutely must be the same if you want to use THP. Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-01-08 13:49:03 -08:00
Christoffer Dall	e9b152cb95	arm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put The arch-generic KVM code expects the cpu field of a vcpu to be -1 if the vcpu is no longer assigned to a cpu. This is used for the optimized make_all_cpus_request path and will be used by the vgic code to check that no vcpus are running. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 10:01:34 -08:00
Christoffer Dall	ce01e4e887	KVM: arm-vgic: Set base addr through device API Support setting the distributor and cpu interface base addresses in the VM physical address space through the KVM_{SET,GET}_DEVICE_ATTR API in addition to the ARM specific API. This has the added benefit of being able to share more code in user space and do things in a uniform manner. Also deprecate the older API at the same time, but backwards compatibility will be maintained. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 10:01:22 -08:00
Christoffer Dall	7330672bef	KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC Support creating the ARM VGIC device through the KVM_CREATE_DEVICE ioctl, which can then later be leveraged to use the KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in a more generic API than the ARM-specific one and is useful for save/restore of VGIC state. Adds KVM_CAP_DEVICE_CTRL to ARM capabilities. Note that we change the check for creating a VGIC from bailing out if any VCPUs were created, to bailing out if any VCPUs were ever run. This is an important distinction that shouldn't break anything, but allows creating the VGIC after the VCPUs have been created. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 10:01:16 -08:00
Christoffer Dall	e1ba0207a1	ARM: KVM: Allow creating the VGIC after VCPUs Rework the VGIC initialization slightly to allow initialization of the vgic cpu-specific state even if the irqchip (the VGIC) hasn't been created by user space yet. This is safe, because the vgic data structures are already allocated when the CPU is allocated if VGIC support is compiled into the kernel. Further, the init process does not depend on any other information and the sacrifice is a slight performance degradation for creating VMs in the no-VGIC case. The reason is that the new device control API doesn't mandate creating the VGIC before creating the VCPU and it is unreasonable to require user space to create the VGIC before creating the VCPUs. At the same time move the irqchip_in_kernel check out of kvm_vcpu_first_run_init and into the init function to make the per-vcpu and global init functions symmetric and add comments on the exported functions making it a bit easier to understand the init flow by only looking at vgic.c. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 10:01:06 -08:00
Andre Przywara	39735a3a39	ARM/KVM: save and restore generic timer registers For migration to work we need to save (and later restore) the state of each core's virtual generic timer. Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export the three needed registers (control, counter, compare value). Though they live in cp15 space, we don't use the existing list, since they need special accessor functions and the arch timer is optional. Acked-by: Marc Zynger <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 10:00:15 -08:00
Christoffer Dall	a1a64387ad	arm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init Initialize the cntvoff at kvm_init_vm time, not before running the VCPUs at the first time because that will overwrite any potentially restored values from user space. Cc: Andre Przywara <andre.przywara@linaro.org> Acked-by: Marc Zynger <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 09:58:57 -08:00
Christoffer Dall	478a8237f6	arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive The current KVM implementation of PSCI returns INVALID_PARAMETERS if the waitqueue for the corresponding CPU is not active. This does not seem correct, since KVM should not care what the specific thread is doing, for example, user space may not have called KVM_RUN on this VCPU yet or the thread may be busy looping to user space because it received a signal; this is really up to the user space implementation. Instead we should check specifically that the CPU is marked as being turned off, regardless of the VCPU thread state, and if it is, we shall simply clear the pause flag on the CPU and wake up the thread if it happens to be blocked for us. Further, the implementation seems to be racy when executing multiple VCPU threads. There really isn't a reasonable user space programming scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init before turning on the boot CPU. Therefore, set the pause flag on the vcpu at VCPU init time (which can reasonably be expected to be completed for all CPUs by user space before running any VCPUs) and clear both this flag and the feature (in case the feature can somehow get set again in the future) and ping the waitqueue on turning on a VCPU using PSCI. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-21 09:55:17 -08:00
Lorenzo Pieralisi	1fcf7ce0c6	arm: kvm: implement CPU PM notifier Upon CPU shutdown and consequent warm-reboot, the hypervisor CPU state must be re-initialized. This patch implements a CPU PM notifier that upon warm-boot calls a KVM hook to reinitialize properly the hypervisor state so that the CPU can be safely resumed. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2013-12-16 17:17:33 +00:00
Santosh Shilimkar	4fda342cc7	arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings KVM initialisation fails on architectures implementing virt_to_idmap() because virt_to_phys() on such architectures won't fetch you the correct idmap page. So update the KVM ARM code to use the virt_to_idmap() to fix the issue. Since the KVM code is shared between arm and arm64, we create kvm_virt_to_phys() and handle the redirection in respective headers. Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-12-11 09:49:31 -08:00
Gleb Natapov	2ecd1aba59	Fix percpu vmalloc allocations -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJSiDFHAAoJEEtpOizt6ddyCWQH/jmTxPGBK3Bx1qTr25izgW8Z 69SYTfkRoCV/3O3VWh9LiOSz9zrnCm0ODopQmMUDbEYF8KnzMiLASZ5mf3xmvoo1 3KB/upM6DXEd2u2ZEQKnwp0cQZmcTzXM0PfGZ86xlmLFH6zjMt0iPa/FwflQMWsM QJyRJOMrB/WyLbY77FdsyK8RD4fJy7mlLS6LSXnnDb5gMeiPu8d9PWXdlIwCsEe3 Zk0AAyJBXYn0yjzrZ/5HbUlmTBMLbEIomzCqCVAglpiXUb7E0w7YoyTc3dagyMyK zZ+6OqJAZpIi5tliGuhadLsioinM/L9xfnOFtd8jtvPqZbYcetkphTbBg9Bj4mw= =VmDY -----END PGP SIGNATURE----- Merge tag 'kvm-arm-fixes-3.13-1' of git://git.linaro.org/people/cdall/linux-kvm-arm into next Fix percpu vmalloc allocations	2013-11-19 10:43:05 +02:00
Christoffer Dall	40c2729bab	arm/arm64: KVM: Fix hyp mappings of vmalloc regions Using virt_to_phys on percpu mappings is horribly wrong as it may be backed by vmalloc. Introduce kvm_kaddr_to_phys which translates both types of valid kernel addresses to the corresponding physical address. At the same time resolves a typing issue where we were storing the physical address as a 32 bit unsigned long (on arm), truncating the physical address for addresses above the 4GB limit. This caused breakage on Keystone. Cc: <stable@vger.kernel.org> [3.10+] Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-11-16 18:54:45 -08:00
Linus Torvalds	f080480488	Here are the 3.13 KVM changes. There was a lot of work on the PPC side: the HV and emulation flavors can now coexist in a single kernel is probably the most interesting change from a user point of view. On the x86 side there are nested virtualization improvements and a few bugfixes. ARM got transparent huge page support, improved overcommit, and support for big endian guests. Finally, there is a new interface to connect KVM with VFIO. This helps with devices that use NoSnoop PCI transactions, letting the driver in the guest execute WBINVD instructions. This includes some nVidia cards on Windows, that fail to start without these patches and the corresponding userspace changes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc 0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1 7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH NTmT/cfmYp2BRkiCfCiS =rWNf -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM changes from Paolo Bonzini: "Here are the 3.13 KVM changes. There was a lot of work on the PPC side: the HV and emulation flavors can now coexist in a single kernel is probably the most interesting change from a user point of view. On the x86 side there are nested virtualization improvements and a few bugfixes. ARM got transparent huge page support, improved overcommit, and support for big endian guests. Finally, there is a new interface to connect KVM with VFIO. This helps with devices that use NoSnoop PCI transactions, letting the driver in the guest execute WBINVD instructions. This includes some nVidia cards on Windows, that fail to start without these patches and the corresponding userspace changes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits) kvm, vmx: Fix lazy FPU on nested guest arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu arm/arm64: KVM: MMIO support for BE guest kvm, cpuid: Fix sparse warning kvm: Delete prototype for non-existent function kvm_check_iopl kvm: Delete prototype for non-existent function complete_pio hung_task: add method to reset detector pvclock: detect watchdog reset at pvclock read kvm: optimize out smp_mb after srcu_read_unlock srcu: API for barrier after srcu read unlock KVM: remove vm mmap method KVM: IOMMU: hva align mapping page size KVM: x86: trace cpuid emulation when called from emulator KVM: emulator: cleanup decode_register_operand() a bit KVM: emulator: check rex prefix inside decode_register() KVM: x86: fix emulation of "movzbl %bpl, %eax" kvm_host: typo fix KVM: x86: emulate SAHF instruction MAINTAINERS: add tree for kvm.git Documentation/kvm: add a 00-INDEX file ...	2013-11-15 13:51:36 +09:00
Paolo Bonzini	ede5822242	A handful of fixes for KVM/arm64: - A couple a basic fixes for running BE guests on a LE host - A performance improvement for overcommitted VMs (same as the equivalent patch for ARM) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJSfLSVAAoJECPQ0LrRPXpDMG0QAJrTocErN2BQMDoT9DpcQhh6 yoD6KjbS3O4lWz60wJ0BgJ6gcQFg7JiFrPk6JcyT+ykXYf1UuLymUhAkU7Sw+0lP GVt7sr2SaaQd6ZjGphWyWPXuDbvN1CxyIi7TD4CNe0tTYwSI6Vaf19h2Bkjd+VfJ o2Sf2zHz4mutTCmPJuqnI255MLveTyQr/VZT1xNS79FiJM3/j3+UxCEi1fwgTkkb 4l3AyW9RN2mmTsS4VE6an2iosCi9pqoAC3y88vnaeBpUHIf/O2O57sT0+8o7jM6z 6uZVesMsKNmDtkMUFRQyj4Mps3yIVcccDpFJr4UcZH+ipbM+5nPY8AlYcKqk+4KY T7Zys1hITq7xSEK4HiQt+AJnXDXZF5YZnzqUqVQHZFBn5P1GfB4/Bo9E3+QG68oq AO1ry6SiRRPmTAZVeYqV82DX1YSjbvghvvPXhtPvolNhyzJJooBvhpWfGthVeZds tuazKvvwDv0pFEWSwiFvWyqGW4FHKz3vWSUfuR1MF2P86fIfT5buJ9/StMxiRRH8 tSwoV3Ksut2kX9l5o8MZqmv1UkH88hwxxxd2J2aHGU+XPLlbQaYSWSK0d6hKTLMZ ErZmCK/BARdll9UiTdXZ8h7UjfVDfuhTVeW1PvKQJ7sLGCDeT0VfhcrXnTtr41Je iLWeDea0NJBzQxRZoJHG =s0ny -----END PGP SIGNATURE----- Merge tag 'kvm-arm64/for-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into kvm-next A handful of fixes for KVM/arm64: - A couple a basic fixes for running BE guests on a LE host - A performance improvement for overcommitted VMs (same as the equivalent patch for ARM) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Conflicts: arch/arm/include/asm/kvm_emulate.h arch/arm64/include/asm/kvm_emulate.h	2013-11-11 12:05:20 +01:00
Paolo Bonzini	6da8ae556c	Updates for KVM/ARM, take 3 supporting more than 4 CPUs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJSfD2NAAoJEEtpOizt6ddyK0IH+wQf6Hwe2nLAhj86knqDHsPt LgJ6ZSY2bhWTKhCVDkH4HQt6ZqWEV7P8HsLNLc9FxjxCIgGO6Lp6Obv6sYscZrvh OdzsZ/+j1t035qmeLwBJnB2x+j21ACd5LYVaWkfJPmGJ40KrcgP/t3fj3r1le91Z jUM5BZY8fbEpJ1JaLoYZIEm1nYQJ4cabkqb9dFieqbVB1OoYw5W7KCV0wazOpg+f T2WL8Dy+lP8DfGRrEjsIM299DlMAFfUCj7mgfO3yTDUK0Q6Q3ZN7f394c+04OfZv /V1fM4y4X1i/2wlL3oPaf1WtYhwd2VxJwo1n3SLg1b9gWghYnqUR/vBHrUmzX/A= =2ESp -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-3.13-3' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-next Updates for KVM/ARM, take 3 supporting more than 4 CPUs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Conflicts: arch/arm/kvm/reset.c [cpu_reset->reset_regs change; context only]	2013-11-11 12:02:27 +01:00
Marc Zyngier	ce94fe93d5	arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu When booting a vcpu using PSCI, make sure we start it with the endianness of the caller. Otherwise, secondaries can be pretty unhappy to execute a BE kernel in LE mode... This conforms to PSCI spec Rev B, 5.13.3. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2013-11-07 19:09:08 +00:00
Marc Zyngier	6d89d2d9b5	arm/arm64: KVM: MMIO support for BE guest Do the necessary byteswap when host and guest have different views of the universe. Actually, the only case we need to take care of is when the guest is BE. All the other cases are naturally handled. Also be careful about endianness when the data is being memcopy-ed from/to the run buffer. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2013-11-07 19:09:04 +00:00
Gleb Natapov	95f328d3ad	Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue Conflicts: arch/powerpc/include/asm/processor.h	2013-11-04 10:20:57 +02:00
Christoph Lameter	1436c1aa62	ARM: 7862/1: pcpu: replace __get_cpu_var_uses This is the ARM part of Christoph's patchset cleaning up the various uses of __get_cpu_var across the tree. The idea is to convert __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and fewer registers are used when code is generated. [will: fixed debug ref counting checks and pcpu array accesses] Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-10-29 11:06:27 +00:00
Paolo Bonzini	5bb3398dd2	Updates for KVM/ARM, take 2 including: - Transparent Huge Pages and hugetlbfs support for KVM/ARM - Yield CPU when guest executes WFE to speed up CPU overcommit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJSZ5NVAAoJEEtpOizt6ddyEJgH+wWw6KWqHlParb+rf04cqCQV Fj3euz+SpYr2U2u0RimgkmeahUiGUhnlBSSH+tkLmt1if6nLawBJbUcIhaZMVdv+ cvS6k+NtK7ibwPOyFeoZCS8taEbVDut2YgrtRKbne6QDLRYBEXFtpY8o6ptLoSu4 ifQCF0FZyElCGLylSxFt9GsK+LjNjQWatVrzoHap9d58u2bma6GYwr4mEzVMHms7 REtTvpwWgsDR5C/69aG8wE4cpJZALH3OeCgy6AccdzTLaQWWpK2YLWz8AFOvoYx6 EsFmBFHZYcuwN+fv2jILgA3Is1oWwqI6k5lL+N3g/oTNNALDSWnfiJkXypJsfow= =2Ijm -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-3.13-2' of git://git.linaro.org/people/cdall/linux-kvm-arm into kvm-queue Updates for KVM/ARM, take 2 including: - Transparent Huge Pages and hugetlbfs support for KVM/ARM - Yield CPU when guest executes WFE to speed up CPU overcommit	2013-10-28 13:15:55 +01:00
Marc Zyngier	79c648806f	arm/arm64: KVM: PSCI: use MPIDR to identify a target CPU The KVM PSCI code blindly assumes that vcpu_id and MPIDR are the same thing. This is true when vcpus are organized as a flat topology, but is wrong when trying to emulate any other topology (such as A15 clusters). Change the KVM PSCI CPU_ON code to look at the MPIDR instead of the vcpu_id to pick a target CPU. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-22 08:00:06 -07:00
Marc Zyngier	7999b4d182	ARM: KVM: drop limitation to 4 CPU VMs Now that the KVM/arm code knows about affinity, remove the hard limit of 4 vcpus per VM. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-22 08:00:06 -07:00
Marc Zyngier	9cbb6d969c	ARM: KVM: fix L2CTLR to be per-cluster The L2CTLR register contains the number of CPUs in this cluster. Make sure the register content is actually relevant to the vcpu that is being configured by computing the number of cores that are part of its cluster. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-22 08:00:06 -07:00
Marc Zyngier	2d1d841bd4	ARM: KVM: Fix MPIDR computing to support virtual clusters In order to be able to support more than 4 A7 or A15 CPUs, we need to fix the MPIDR computing to reflect the fact that both A15 and A7 can only exist in clusters of at most 4 CPUs. Fix the MPIDR computing to allow virtual clusters to be exposed to the guest. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-22 08:00:06 -07:00
Christoffer Dall	9b5fdb9781	KVM: ARM: Transparent huge page (THP) support Support transparent huge pages in KVM/ARM and KVM/ARM64. The transparent_hugepage_adjust is not very pretty, but this is also how it's solved on x86 and seems to be simply an artifact on how THPs behave. This should eventually be shared across architectures if possible, but that can always be changed down the road. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-17 17:06:30 -07:00
Christoffer Dall	ad361f093c	KVM: ARM: Support hugetlbfs backed huge pages Support huge pages in KVM/ARM and KVM/ARM64. The pud_huge checking on the unmap path may feel a bit silly as the pud_huge check is always defined to false, but the compiler should be smart about this. Note: This deals only with VMAs marked as huge which are allocated by users through hugetlbfs only. Transparent huge pages can only be detected by looking at the underlying pages (or the page tables themselves) and this patch so far simply maps these on a page-by-page level in the Stage-2 page tables. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-17 17:06:20 -07:00
Christoffer Dall	86ed81aa2e	KVM: ARM: Update comments for kvm_handle_wfi Update comments to reflect what is really going on and add the TWE bit to the comments in kvm_arm.h. Also renames the function to kvm_handle_wfx like is done on arm64 for consistency and uber-correctness. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-17 15:26:50 -07:00
Marc Zyngier	58d5ec8f8e	ARM: KVM: Yield CPU when vcpu executes a WFE On an (even slightly) oversubscribed system, spinlocks are quickly becoming a bottleneck, as some vcpus are spinning, waiting for a lock to be released, while the vcpu holding the lock may not be running at all. This creates contention, and the observed slowdown is 40x for hackbench. No, this isn't a typo. The solution is to trap blocking WFEs and tell KVM that we're now spinning. This ensures that other vpus will get a scheduling boost, allowing the lock to be released more quickly. Also, using CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance when the VM is severely overcommited. Quick test to estimate the performance: hackbench 1 process 1000 2xA15 host (baseline): 1.843s 2xA15 guest w/o patch: 2.083s 4xA15 guest w/o patch: 80.212s 8xA15 guest w/o patch: Could not be bothered to find out 2xA15 guest w/ patch: 2.102s 4xA15 guest w/ patch: 3.205s 8xA15 guest w/ patch: 6.887s So we go from a 40x degradation to 1.5x in the 2x overcommit case, which is vaguely more acceptable. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-17 15:26:50 -07:00
Gleb Natapov	13acfd5715	Powerpc KVM work is based on a commit after rc4. Merging master into next to satisfy the dependencies. Conflicts: arch/arm/kvm/reset.c	2013-10-17 17:41:49 +03:00
Aneesh Kumar K.V	5587027ce9	kvm: Add struct kvm arg to memslot APIs We will use that in the later patch to find the kvm ops handler Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 15:49:23 +02:00
Jonathan Austin	e8c2d99f82	KVM: ARM: Add support for Cortex-A7 This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts. As Cortex-A7 is architecturally compatible with A15, this patch is largely just generalising existing code. Areas where 'implementation defined' behaviour is identical for A7 and A15 is moved to allow it to be used by both cores. The check to ensure that coprocessor register tables are sorted correctly is also moved in to 'common' code to avoid each new cpu doing its own check (and possibly forgetting to do so!) Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-12 17:45:30 -07:00
Jonathan Austin	1158fca401	KVM: ARM: Fix calculation of virtual CPU ID KVM does not have a notion of multiple clusters for CPUs, just a linear array of CPUs. When using a system with cores in more than one cluster, the current method for calculating the virtual MPIDR will leak the (physical) cluster information into the virtual MPIDR. One effect of this is that Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster. This patch does away with exposing the real MPIDR fields in favour of simply using the virtual CPU number (but preserving the U bit, as before). Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-12 17:44:39 -07:00
Anup Patel	42c4e0c77a	ARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl For implementing CPU=host, we need a mechanism for querying preferred VCPU target type on underlying Host. This patch implements KVM_ARM_PREFERRED_TARGET vm ioctl which returns struct kvm_vcpu_init instance containing information about preferred VCPU target type and target specific features available for it. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-02 11:29:48 -07:00
Anup Patel	4a6fee805d	ARM: KVM: Implement kvm_vcpu_preferred_target() function This patch implements kvm_vcpu_preferred_target() function for KVM ARM which will help us implement KVM_ARM_PREFERRED_TARGET ioctl for user space. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-02 11:29:10 -07:00
Anup Patel	b373e492f3	KVM: ARM: Fix typo in comments of inject_abt() Very minor typo in comments of inject_abt() when we update fault status register for injecting prefetch abort. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-02 17:29:19 +01:00
Olof Johansson	ac570e0493	ARM: kvm: rename cpu_reset to avoid name clash cpu_reset is already #defined in <asm/proc-fns.h> as processor.reset, so it expands here and causes problems. Cc: <stable@vger.kernel.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-09-24 11:15:05 -07:00
Linus Torvalds	2e03285224	Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm Pull ARM updates from Russell King: "This set includes adding support for Neon acceleration of RAID6 XOR code from Ard Biesheuvel, cache flushing and barrier updates from Will Deacon, and a cleanup to the ARM debug code which reduces the amount of code by about 500 lines. A few other cleanups, such as constifying the machine descriptors which already shouldn't be written to, cleaning up the printing of the L2 cache size" * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (55 commits) ARM: 7826/1: debug: support debug ll on hisilicon soc ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables ARM: 7828/1: ARMv7-M: implement restart routine common to all v7-M machines ARM: 7827/1: highbank: fix debug uart virtual address for LPAE ARM: 7823/1: errata: workaround Cortex-A15 erratum 773022 ARM: 7806/1: allow DEBUG_UNCOMPRESS for Tegra ARM: 7793/1: debug: use generic option for ep93xx PL10x debug port ARM: debug: move SPEAr debug to generic PL01x code ARM: debug: move davinci debug to generic 8250 code ARM: debug: move keystone debug to generic 8250 code ARM: debug: remove DEBUG_ROCKCHIP_UART ARM: debug: provide generic option choices for 8250 and PL01x ports ARM: debug: move PL01X debug include into arch/arm/include/debug/ ARM: debug: provide PL01x debug uart phys/virt address configuration options ARM: debug: add support for word accesses to debug/8250.S ARM: debug: move 8250 debug include into arch/arm/include/debug/ ARM: debug: provide 8250 debug uart phys/virt address configuration options ARM: debug: provide 8250 debug uart register shift configuration option ARM: debug: provide 8250 debug uart flow control configuration option ...	2013-09-05 18:07:32 -07:00
Russell King	141b97433d	Merge branches 'debug-choice', 'devel-stable' and 'misc' into for-linus	2013-09-05 10:34:15 +01:00
Linus Torvalds	ae7a835cc5	Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Gleb Natapov: "The highlights of the release are nested EPT and pv-ticketlocks support (hypervisor part, guest part, which is most of the code, goes through tip tree). Apart of that there are many fixes for all arches" Fix up semantic conflicts as discussed in the pull request thread.. * 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits) ARM: KVM: Add newlines to panic strings ARM: KVM: Work around older compiler bug ARM: KVM: Simplify tracepoint text ARM: KVM: Fix kvm_set_pte assignment ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256 ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs ARM: KVM: vgic: fix GICD_ICFGRn access ARM: KVM: vgic: simplify vgic_get_target_reg KVM: MMU: remove unused parameter KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate() KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX KVM: x86: update masterclock when kvmclock_offset is calculated (v2) KVM: PPC: Book3S: Fix compile error in XICS emulation KVM: PPC: Book3S PR: return appropriate error when allocation fails arch: powerpc: kvm: add signed type cast for comparation KVM: x86: add comments where MMIO does not return to the emulator KVM: vmx: count exits to userspace during invalid guest emulation KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp kvm: optimize away THP checks in kvm_is_mmio_pfn() ...	2013-09-04 18:15:06 -07:00
Christoffer Dall	1fe40f6d39	ARM: KVM: Add newlines to panic strings The panic strings are hard to read and on narrow terminals some characters are simply truncated off the panic message. Make is slightly prettier with a newline in the Hyp panic strings. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-30 15:48:02 -07:00
Christoffer Dall	6833d83891	ARM: KVM: Work around older compiler bug Compilers before 4.6 do not behave well with unnamed fields in structure initializers and therefore produces build errors: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676 By refering to the unnamed union using braces, both older and newer compilers produce the same result. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reported-by: Russell King <linux@arm.linux.org.uk> Tested-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-30 15:47:58 -07:00
Christoffer Dall	6e72cc5700	ARM: KVM: Simplify tracepoint text The tracepoint for kvm_guest_fault was extremely long, make it a slightly bit shorter. Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-30 15:47:53 -07:00
Christoffer Dall	8947c09d05	ARM: 7808/1: KVM: mm: Get rid of L_PTE_USER ref from PAGE_S2_DEVICE THe L_PTE_USER actually has nothing to do with stage 2 mappings and the L_PTE_S2_RDWR value sets the readable bit, which was what L_PTE_USER was used for before proper handling of stage 2 memory defines. Changelog: [v3]: Drop call to kvm_set_s2pte_writable in mmu.c [v2]: Change default mappings to be r/w instead of r/o, as per Marc Zyngier's suggestion. Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-08-13 20:25:06 +01:00
Will Deacon	e3ab547f57	ARM: kvm: use inner-shareable barriers after TLB flushing When flushing the TLB at PL2 in response to remapping at stage-2 or VMID rollover, we have a dsb instruction to ensure completion of the command before continuing. Since we only care about other processors for TLB invalidation, use the inner-shareable variant of the dsb instruction instead. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2013-08-12 12:25:45 +01:00
Christoffer Dall	2184a60de2	KVM: ARM: Squash len warning The 'len' variable was declared an unsigned and then checked for less than 0, which results in warnings on some compilers. Since len is assigned an int, make it an int. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-11 21:03:39 -07:00
Marc Zyngier	979acd5e18	arm64: KVM: fix 2-level page tables unmapping When using 64kB pages, we only have two levels of page tables, meaning that PGD, PUD and PMD are fused. In this case, trying to refcount PUDs and PMDs independently is a a complete disaster, as they are the same. We manage to get it right for the allocation (stage2_set_pte uses {pmd,pud}_none), but the unmapping path clears both pud and pmd refcounts, which fails spectacularly with 2-level page tables. The fix is to avoid calling clear_pud_entry when both the pmd and pud pages are empty. For this, and instead of introducing another pud_empty function, consolidate both pte_empty and pmd_empty into page_empty (the code is actually identical) and use that to also test the validity of the pud. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-07 18:17:39 -07:00
Christoffer Dall	d3840b2661	ARM: KVM: Fix unaligned unmap_range leak The unmap_range function did not properly cover the case when the start address was not aligned to PMD_SIZE or PUD_SIZE and an entire pte table or pmd table was cleared, causing us to leak memory when incrementing the addr. The fix is to always move onto the next page table entry boundary instead of adding the full size of the VA range covered by the corresponding table level entry. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-07 18:17:28 -07:00
Christoffer Dall	240e99cbd0	ARM: KVM: Fix 64-bit coprocessor handling The PAR was exported as CRn == 7 and CRm == 0, but in fact the primary coprocessor register number was determined by CRm for 64-bit coprocessor registers as the user space API was modeled after the coprocessor access instructions (see the ARM ARM rev. C - B3-1445). However, just changing the CRn to CRm breaks the sorting check when booting the kernel, because the internal kernel logic always treats CRn as the primary register number, and it makes the table sorting impossible to understand for humans. Alternatively we could change the logic to always have CRn == CRm, but that becomes unclear in the number of ways we do look up of a coprocessor register. We could also have a separate 64-bit table but that feels somewhat over-engineered. Instead, keep CRn the primary representation of the primary coproc. register number in-kernel and always export the primary number as CRm as per the existing user space ABI. Note: The TTBR registers just magically worked because they happened to follow the CRn(0) regs and were considered CRn(0) in the in-kernel representation. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-08-06 11:32:30 -07:00
Takuya Yoshikawa	e59dbe09f8	KVM: Introduce kvm_arch_memslots_updated() This is called right after the memslots is updated, i.e. when the result of update_memslots() gets installed in install_new_memslots(). Since the memslots needs to be updated twice when we delete or move a memslot, kvm_arch_commit_memory_region() does not correspond to this exactly. In the following patch, x86 will use this new API to check if the mmio generation has reached its maximum value, in which case mmio sptes need to be flushed out. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Acked-by: Alexander Graf <agraf@suse.de> Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-07-18 12:29:25 +02:00
Linus Torvalds	fe489bf450	KVM fixes for 3.11 On the x86 side, there are some optimizations and documentation updates. The big ARM/KVM change for 3.11, support for AArch64, will come through Catalin Marinas's tree. s390 and PPC have misc cleanups and bugfixes. There is a conflict due to "s390/pgtable: fix ipte notify bit" having entered 3.10 through Martin Schwidefsky's s390 tree. This pull request has additional changes on top, so this tree's version is the correct one. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR0oU6AAoJEBvWZb6bTYbynnsP/RSUrrHrA8Wu1tqVfAKu+1y5 6OIihqZ9x11/YMaNofAfv86jqxFu0/j7CzMGphNdjzujqKI+Q1tGe7oiVCmKzoG+ UvSctWsz0lpllgBtnnrm5tcfmG6rrddhLtpA7m320+xCVx8KV5P4VfyHZEU+Ho8h ziPmb2mAQ65gBNX6nLHEJ3ITTgad6gt4NNbrKIYpyXuWZQJypzaRqT/vpc4md+Ed dCebMXsL1xgyb98EcnOdrWH1wV30MfucR7IpObOhXnnMKeeltqAQPvaOlKzZh4dK +QfxJfdRZVS0cepcxzx1Q2X3dgjoKQsHq1nlIyz3qu1vhtfaqBlixLZk0SguZ/R9 1S1YqucZiLRO57RD4q0Ak5oxwobu18ZoqJZ6nledNdWwDe8bz/W2wGAeVty19ky0 qstBdM9jnwXrc0qrVgZp3+s5dsx3NAm/KKZBoq4sXiDLd/yBzdEdWIVkIrU3X9wU 3X26wOmBxtsB7so/JR7ciTsQHelmLicnVeXohAEP9CjIJffB81xVXnXs0P0SYuiQ RzbSCwjPzET4JBOaHWT0Dhv0DTS/EaI97KzlN32US3Bn3WiLlS1oDCoPFoaLqd2K LxQMsXS8anAWxFvexfSuUpbJGPnKSidSQoQmJeMGBa9QhmZCht3IL16/Fb641ToN xBohzi49L9FDbpOnTYfz =1zpG -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "On the x86 side, there are some optimizations and documentation updates. The big ARM/KVM change for 3.11, support for AArch64, will come through Catalin Marinas's tree. s390 and PPC have misc cleanups and bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits) KVM: PPC: Ignore PIR writes KVM: PPC: Book3S PR: Invalidate SLB entries properly KVM: PPC: Book3S PR: Allow guest to use 1TB segments KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry KVM: PPC: Book3S PR: Fix proto-VSID calculations KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL KVM: Fix RTC interrupt coalescing tracking kvm: Add a tracepoint write_tsc_offset KVM: MMU: Inform users of mmio generation wraparound KVM: MMU: document fast invalidate all mmio sptes KVM: MMU: document fast invalidate all pages KVM: MMU: document fast page fault KVM: MMU: document mmio page fault KVM: MMU: document write_flooding_count KVM: MMU: document clear_spte_count KVM: MMU: drop kvm_mmu_zap_mmio_sptes KVM: MMU: init kvm generation close to mmio wrap-around value KVM: MMU: add tracepoint for check_mmio_spte KVM: MMU: fast invalidate all mmio sptes ...	2013-07-03 13:21:40 -07:00
Linus Torvalds	1873e50028	Main features: - KVM and Xen ports to AArch64 - Hugetlbfs and transparent huge pages support for arm64 - Applied Micro X-Gene Kconfig entry and dts file - Cache flushing improvements -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iQIcBAABAgAGBQJR0bZAAAoJEGvWsS0AyF7xTEEP/R/aRoqWwbVAMlwAhujq616O t4RzIyBXZXqxS9I+raokCX4mgYxdeisJlzN2hoq73VEX2BQlXZoYh8vmfY9WeNSM 2pdfif2HF7oo9ymCRyqfuhbumPrTyJhpbguzOYrxPqpp2f1hv2D8hbUJEFj429yL UjqTFoONngfouZmAlwrPGZQKhBI95vvN53yvDMH0PWfvpm07DKGIQMYp20y0pj8j slhLH3bh2kfpS1cf23JtH6IICwWD2pXW0POo569CfZry6bI74xve+Trcsm7iPnsO PSI1P046ME1mu3SBbKwiPIdN/FQqWwTHW07fvMmH/xuXu3Zs/mxgzi7vDzDrVvTg PJSbKWD6N/IPPwKS/gCUmWWDASO0bXx3KlDuRZqAjbRojs0UPUOTUhzJM/BHUms1 vY2QS9lAm02LmZZrk1LeKKP85gB+qKQvHuOVhIOldWeLGKtsNufz1kynz6YTqsLq uUB55KwbhQ7q8+aoY6lWujqiTXMoLkBgGdjHs2I407PAv7ZjlhRWk2fIry7xJifp rKu2cIlWsRe4CGvGI410NvIJFrGvJAV4wA43sgBDjPumyILgT/5jw9r3RpJEBZZs akw/Bl1CbL+gMjyoPUWgcWZdRkUCE0eLrgyMOmaYfst8cOTaWw4dWLvUG/bBZg+Y mGnuEQUQtAPadk8P/Sv3 =PZ/e -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull ARM64 updates from Catalin Marinas: "Main features: - KVM and Xen ports to AArch64 - Hugetlbfs and transparent huge pages support for arm64 - Applied Micro X-Gene Kconfig entry and dts file - Cache flushing improvements For arm64 huge pages support, there are x86 changes moving part of arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (66 commits) arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board arm64: Add defines for APM ARMv8 implementation arm64: Enable APM X-Gene SOC family in the defconfig arm64: Add Kconfig option for APM X-Gene SOC family arm64/Makefile: provide vdso_install target ARM64: mm: THP support. ARM64: mm: Raise MAX_ORDER for 64KB pages and THP. ARM64: mm: HugeTLB support. ARM64: mm: Move PTE_PROT_NONE bit. ARM64: mm: Make PAGE_NONE pages read only and no-execute. ARM64: mm: Restore memblock limit when map_mem finished. mm: thp: Correct the HPAGE_PMD_ORDER check. x86: mm: Remove general hugetlb code from x86. mm: hugetlb: Copy general hugetlb code from x86 to mm. x86: mm: Remove x86 version of huge_pmd_share. mm: hugetlb: Copy huge_pmd_share from x86 to mm. arm64: KVM: document kernel object mappings in HYP arm64: KVM: MAINTAINERS update arm64: KVM: userspace API documentation arm64: KVM: enable initialization of a 32bit vcpu ...	2013-07-03 10:31:38 -07:00
Russell King	3c0c01ab74	Merge branch 'devel-stable' into for-next Conflicts: arch/arm/Makefile arch/arm/include/asm/glue-proc.h	2013-06-29 11:44:43 +01:00
Arnd Bergmann	8bd4ffd6b3	ARM: kvm: don't include drivers/virtio/Kconfig The virtio configuration has recently moved and is now visible everywhere. Including the file again from KVM as we used to need earlier now causes dependency problems: warning: (CAIF_VIRTIO && VIRTIO_PCI && VIRTIO_MMIO && REMOTEPROC && RPMSG) selects VIRTIO which has unmet direct dependencies (VIRTUALIZATION) Cc: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-06-26 10:50:06 -07:00
Geoff Levand	f2dda9d829	arm/kvm: Cleanup KVM_ARM_MAX_VCPUS logic Commit `d21a1c83c7` (ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally) changed the Kconfig logic for KVM_ARM_MAX_VCPUS to work around a build error arising from the use of KVM_ARM_MAX_VCPUS when CONFIG_KVM=n. The resulting Kconfig logic is a bit awkward and leaves a KVM_ARM_MAX_VCPUS always defined in the kernel config file. This change reverts the Kconfig logic back and adds a simple preprocessor conditional in kvm_host.h to handle when CONFIG_KVM_ARM_MAX_VCPUS is undefined. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-06-26 10:50:05 -07:00
Marc Zyngier	22cfbb6d73	ARM: KVM: clear exclusive monitor on all exception returns Make sure we clear the exclusive monitor on all exception returns, which otherwise could lead to lock corruptions. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-06-26 10:50:05 -07:00
Marc Zyngier	479c5ae2f8	ARM: KVM: add missing dsb before invalidating Stage-2 TLBs When performing a Stage-2 TLB invalidation, it is necessary to make sure the write to the page tables is observable by all CPUs. For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa before doing the TLB invalidation itself. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-06-26 10:50:04 -07:00
Marc Zyngier	6a077e4ab9	ARM: KVM: perform save/restore of PAR Not saving PAR is an unfortunate oversight. If the guest performs an AT* operation and gets scheduled out before reading the result of the translation from PAR, it could become corrupted by another guest or the host. Saving this register is made slightly more complicated as KVM also uses it on the permission fault handling path, leading to an ugly "stash and restore" sequence. Fortunately, this is already a slow path so we don't really care. Also, Linux doesn't do any AT* operation, so Linux guests are not impacted by this bug. [ Slightly tweaked to use an even register as first operand to ldrd and strd operations in interrupts_head.S - Christoffer ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-06-26 10:50:04 -07:00
Marc Zyngier	4db845c3d8	ARM: KVM: get rid of S2_PGD_SIZE S2_PGD_SIZE defines the number of pages used by a stage-2 PGD and is unused, except for a VM_BUG_ON check that missuses the define. As the check is very unlikely to ever triggered except in circumstances where KVM is the least of our worries, just kill both the define and the VM_BUG_ON check. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>	2013-06-26 10:50:04 -07:00
Marc Zyngier	8734f16fb2	ARM: KVM: don't special case PC when doing an MMIO Admitedly, reading a MMIO register to load PC is very weird. Writing PC to a MMIO register is probably even worse. But the architecture doesn't forbid any of these, and injecting a Prefetch Abort is the wrong thing to do anyway. Remove this check altogether, and let the adventurous guest wander into LaLaLand if they feel compelled to do so. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>	2013-06-26 10:50:03 -07:00
Marc Zyngier	dac288f7b3	ARM: KVM: use phys_addr_t instead of unsigned long long for HYP PGDs HYP PGDs are passed around as phys_addr_t, except just before calling into the hypervisor init code, where they are cast to a rather weird unsigned long long. Just keep them around as phys_addr_t, which is what makes the most sense. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>	2013-06-26 10:50:03 -07:00
Dave P Martin	24a7f67575	ARM: KVM: Don't handle PSCI calls via SMC Currently, kvmtool unconditionally declares that HVC should be used to call PSCI, so the function numbers in the DT tell the guest nothing about the function ID namespace or calling convention for SMC. We already assume that the guest will examine and honour the DT, since there is no way it could possibly guess the KVM-specific PSCI function IDs otherwise. So let's not encourage guests to violate what's specified in the DT by using SMC to make the call. [ Modified to apply to top of kvm/arm tree - Christoffer ] Signed-off-by: Dave P Martin <Dave.Martin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>	2013-06-26 10:50:02 -07:00

... 3 4 5 6 7 ...

544 Commits