OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Marc Zyngier	333a53ff7f	KVM: arm64: vgic-its: Validate the device table L1 entry Checking that the device_id fits if the table, and we must make sure that the associated memory is also accessible. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:15:17 +01:00
Marc Zyngier	b90338b7cb	KVM: arm64: vgic-its: Fix misleading nr_entries in vgic_its_check_device_id The nr_entries variable in vgic_its_check_device_id actually describe the size of the L1 table, and not the number of entries in this table. Rename it to l1_tbl_size, so that we can now change the code with a better understanding of what is what. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:15:17 +01:00
Marc Zyngier	7e3963a515	KVM: arm64: vgic-its: Fix vgic_its_check_device_id BE handling The ITS tables are stored in LE format. If the host is reading a L1 table entry to check its validity, it must convert it to the CPU endianness. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:15:16 +01:00
Marc Zyngier	c0091073dd	KVM: arm64: vgic-its: Fix handling of indirect tables The current code will fail on valid indirect tables, and happily use the ones that are pointing out of the guest RAM. Funny what a small "!" can do for you... Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:15:16 +01:00
Marc Zyngier	d97594e6bc	KVM: arm64: vgic-its: Generalize use of vgic_get_irq_kref Instead of sprinkling raw kref_get() calls everytime we cannot do a normal vgic_get_irq(), use the existing vgic_get_irq_kref(), which does the same thing and is paired with a vgic_put_irq(). vgic_get_irq_kref is moved to vgic.h in order to be easily shared. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:15:16 +01:00
Eric Auger	9d5fcb9dd7	KVM: arm/arm64: Fix vGICv2 KVM_DEV_ARM_VGIC_GRP_CPU/DIST_REGS For VGICv2 save and restore the CPU interface registers are accessed. Restore the modality which has been altered. Also explicitly set the iodev_type for both the DIST and CPU interface. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:15:15 +01:00
Andre Przywara	0e4e82f154	KVM: arm64: vgic-its: Enable ITS emulation as a virtual MSI controller Now that all ITS emulation functionality is in place, we advertise MSI functionality to userland and also the ITS device to the guest - if userland has configured that. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:38 +01:00
Andre Przywara	2891a7dfb6	KVM: arm64: vgic-its: Implement MSI injection in ITS emulation When userland wants to inject an MSI into the guest, it uses the KVM_SIGNAL_MSI ioctl, which carries the doorbell address along with the payload and the device ID. With the help of the KVM IO bus framework we learn the corresponding ITS from the doorbell address. We then use our wrapper functions to iterate the linked lists and find the proper Interrupt Translation Table Entry (ITTE) and thus the corresponding struct vgic_irq to finally set the pending bit. We also provide the handler for the ITS "INT" command, which allows a guest to trigger an MSI via the ITS command queue. Since this one knows about the right ITS already, we directly call the MMIO handler function without using the kvm_io_bus framework. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:38 +01:00
Andre Przywara	df9f58fbea	KVM: arm64: vgic-its: Implement ITS command queue command handlers The connection between a device, an event ID, the LPI number and the associated CPU is stored in in-memory tables in a GICv3, but their format is not specified by the spec. Instead software uses a command queue in a ring buffer to let an ITS implementation use its own format. Implement handlers for the various ITS commands and let them store the requested relation into our own data structures. Those data structures are protected by the its_lock mutex. Our internal ring buffer read and write pointers are protected by the its_cmd mutex, so that only one VCPU per ITS can handle commands at any given time. Error handling is very basic at the moment, as we don't have a good way of communicating errors to the guest (usually an SError). The INT command handler is missing from this patch, as we gain the capability of actually injecting MSIs into the guest only later on. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:38 +01:00
Andre Przywara	f9f77af9e2	KVM: arm64: vgic-its: Allow updates of LPI configuration table The (system-wide) LPI configuration table is held in a table in (guest) memory. To achieve reasonable performance, we cache this data in our struct vgic_irq. If the guest updates the configuration data (which consists of the enable bit and the priority value), it issues an INV or INVALL command to allow us to update our information. Provide functions that update that information for one LPI or all LPIs mapped to a specific collection. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:37 +01:00
Andre Przywara	33d3bc9556	KVM: arm64: vgic-its: Read initial LPI pending table The LPI pending status for a GICv3 redistributor is held in a table in (guest) memory. To achieve reasonable performance, we cache the pending bit in our struct vgic_irq. The initial pending state must be read from guest memory upon enabling LPIs for this redistributor. As we can't access the guest memory while we hold the lpi_list spinlock, we create a snapshot of the LPI list and iterate over that. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:37 +01:00
Andre Przywara	3802411d01	KVM: arm64: vgic-its: Connect LPIs to the VGIC emulation LPIs are dynamically created (mapped) at guest runtime and their actual number can be quite high, but is mostly assigned using a very sparse allocation scheme. So arrays are not an ideal data structure to hold the information. We use a spin-lock protected linked list to hold all mapped LPIs, represented by their struct vgic_irq. This lock is grouped between the ap_list_lock and the vgic_irq lock in our locking order. Also we store a pointer to that struct vgic_irq in our struct its_itte, so we can easily access it. Eventually we call our new vgic_get_lpi() from vgic_get_irq(), so the VGIC code gets transparently access to LPIs. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:36 +01:00
Andre Przywara	424c33830f	KVM: arm64: vgic-its: Implement basic ITS register handlers Add emulation for some basic MMIO registers used in the ITS emulation. This includes: - GITS_{CTLR,TYPER,IIDR} - ID registers - GITS_{CBASER,CREADR,CWRITER} (which implement the ITS command buffer handling) - GITS_BASER<n> Most of the handlers are pretty straight forward, only the CWRITER handler is a bit more involved by taking the new its_cmd mutex and then iterating over the command buffer. The registers holding base addresses and attributes are sanitised before storing them. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:36 +01:00
Andre Przywara	1085fdc68c	KVM: arm64: vgic-its: Introduce new KVM ITS device Introduce a new KVM device that represents an ARM Interrupt Translation Service (ITS) controller. Since there can be multiple of this per guest, we can't piggy back on the existing GICv3 distributor device, but create a new type of KVM device. On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data structure and store the pointer in the kvm_device data. Upon an explicit init ioctl from userland (after having setup the MMIO address) we register the handlers with the kvm_io_bus framework. Any reference to an ITS thus has to go via this interface. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:35 +01:00
Andre Przywara	59c5ab4098	KVM: arm64: vgic-its: Introduce ITS emulation file with MMIO framework The ARM GICv3 ITS emulation code goes into a separate file, but needs to be connected to the GICv3 emulation, of which it is an option. The ITS MMIO handlers require the respective ITS pointer to be passed in, so we amend the existing VGIC MMIO framework to let it cope with that. Also we introduce the basic ITS data structure and initialize it, but don't return any success yet, as we are not yet ready for the show. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:35 +01:00
Andre Przywara	0aa1de5731	KVM: arm64: vgic: Handle ITS related GICv3 redistributor registers In the GICv3 redistributor there are the PENDBASER and PROPBASER registers which we did not emulate so far, as they only make sense when having an ITS. In preparation for that emulate those MMIO accesses by storing the 64-bit data written into it into a variable which we later read in the ITS emulation. We also sanitise the registers, making sure RES0 regions are respected and checking for valid memory attributes. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:14:35 +01:00
Andre Przywara	5dd4b924e3	KVM: arm/arm64: vgic: Add refcounting for IRQs In the moment our struct vgic_irq's are statically allocated at guest creation time. So getting a pointer to an IRQ structure is trivial and safe. LPIs are more dynamic, they can be mapped and unmapped at any time during the guest's _runtime_. In preparation for supporting LPIs we introduce reference counting for those structures using the kernel's kref infrastructure. Since private IRQs and SPIs are statically allocated, we avoid actually refcounting them, since they would never be released anyway. But we take provisions to increase the refcount when an IRQ gets onto a VCPU list and decrease it when it gets removed. Also this introduces vgic_put_irq(), which wraps kref_put and hides the release function from the callers. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:10:48 +01:00
Andre Przywara	8a39d00670	KVM: kvm_io_bus: Add kvm_io_bus_get_dev() call The kvm_io_bus framework is a nice place of holding information about various MMIO regions for kernel emulated devices. Add a call to retrieve the kvm_io_device structure which is associated with a certain MMIO address. This avoids to duplicate kvm_io_bus' knowledge of MMIO regions without having to fake MMIO calls if a user needs the device a certain MMIO address belongs to. This will be used by the ITS emulation to get the associated ITS device when someone triggers an MSI via an ioctl from userspace. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:10:40 +01:00
Andre Przywara	42c8870f90	KVM: arm/arm64: vgic: Check return value for kvm_register_vgic_device kvm_register_device_ops() can return an error, so lets check its return value and propagate this up the call chain. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:10:08 +01:00
Andre Przywara	8f6cdc1c2e	KVM: arm/arm64: vgic: Move redistributor kvm_io_devices Logically a GICv3 redistributor is assigned to a (v)CPU, so we should aim to keep redistributor related variables out of our struct vgic_dist. Let's start by replacing the redistributor related kvm_io_device array with two members in our existing struct vgic_cpu, which are naturally per-VCPU and thus don't require any allocation / freeing. So apart from the better fit with the redistributor design this saves some code as well. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-07-18 18:09:40 +01:00
Anna-Maria Gleixner	15d7e3d349	KVM/arm/arm64/vgic-new: Convert to hotplug state machine Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Eric Auger <eric.auger@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153337.900484868@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:41:43 +02:00
Richard Cochran	b3c9950a5c	arm/kvm/arch_timer: Convert to hotplug state machine Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153336.634155707@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:40:27 +02:00
Richard Cochran	42ec50b5f2	arm/kvm/vgic: Convert to hotplug state machine Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. The VGIC callback is run after KVM's main callback since it reflects the makefile order. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-arm-kernel@lists.infradead.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153336.546953286@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:40:26 +02:00
Thomas Gleixner	8c18b2d2d0	virt: Convert kvm hotplug to state machine Install the callbacks via the state machine. The core won't invoke the callbacks on already online CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160713153335.886159080@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:40:23 +02:00
Al Viro	506cfba9e7	KVM: don't use anon_inode_getfd() before possible failures Once anon_inode_getfd() has succeeded, it's impossible to undo in a clean way and no, sys_close() is not usable in such cases. Use anon_inode_getfile() and get_unused_fd_flags() to get struct file and descriptor and do not install the file into the descriptor table until after the last possible failure exit. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-14 19:11:22 +02:00
Paolo Bonzini	7964218c7d	Revert "KVM: release anon file in failure path of vm creation" This reverts commit 77ecc085fed1af1000ca719522977b960aa6da52. Al Viro colorfully says: "You should NEVER use sys_close() on failure exit paths like that. Moreover, this kvm_put_kvm() becomes a double-put, since closing the damn file will drop that reference to kvm. Please, revert. anon_inode_getfd() should be used only when there's no possible failures past its call". Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-14 19:11:21 +02:00
Liu Shuo	2be5b3f6dc	KVM: release anon file in failure path of vm creation The failure of create debugfs of VM will return directly without release the anon file. It will leak memory and file descriptors, even through be not serious. Signed-off-by: Liu Shuo <shuo.a.liu@intel.com> Fixes: `536a6f88c4` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-14 19:11:21 +02:00
Jim Mattson	2f1fe81123	KVM: nVMX: Fix memory corruption when using VMCS shadowing When freeing the nested resources of a vcpu, there is an assumption that the vcpu's vmcs01 is the current VMCS on the CPU that executes nested_release_vmcs12(). If this assumption is violated, the vcpu's vmcs01 may be made active on multiple CPUs at the same time, in violation of Intel's specification. Moreover, since the vcpu's vmcs01 is not VMCLEARed on every CPU on which it is active, it can linger in a CPU's VMCS cache after it has been freed and potentially repurposed. Subsequent eviction from the CPU's VMCS cache on a capacity miss can result in memory corruption. It is not sufficient for vmx_free_vcpu() to call vmx_load_vmcs01(). If the vcpu in question was last loaded on a different CPU, it must be migrated to the current CPU before calling vmx_load_vmcs01(). Signed-off-by: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-14 19:11:20 +02:00
Radim Krčmář	c63cf538eb	KVM: pass struct kvm to kvm_set_routing_entry Arch-specific code will use it. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-14 09:03:56 +02:00
Paolo Bonzini	add6a0cd1c	KVM: MMU: try to fix up page faults before giving up The vGPU folks would like to trap the first access to a BAR by setting vm_ops on the VMAs produced by mmap-ing a VFIO device. The fault handler then can use remap_pfn_range to place some non-reserved pages in the VMA. This kind of VM_PFNMAP mapping is not handled by KVM, but follow_pfn and fixup_user_fault together help supporting it. The patch also supports VM_MIXEDMAP vmas where the pfns are not reserved and thus subject to reference counting. Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Neo Jia <cjia@nvidia.com> Reported-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-05 14:41:26 +02:00
Paolo Bonzini	92176a8ede	KVM: MMU: prepare to support mapping of VM_IO and VM_PFNMAP frames Handle VM_IO like VM_PFNMAP, as is common in the rest of Linux; extract the formula to convert hva->pfn into a new function, which will soon gain more capabilities. Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-05 14:41:03 +02:00
Marc Zyngier	50926d82fa	KVM: arm/arm64: The GIC is dead, long live the GIC I don't think any single piece of the KVM/ARM code ever generated as much hatred as the GIC emulation. It was written by someone who had zero experience in modeling hardware (me), was riddled with design flaws, should have been scrapped and rewritten from scratch long before having a remote chance of reaching mainline, and yet we supported it for a good three years. No need to mention the names of those who suffered, the git log is singing their praises. Thankfully, we now have a much more maintainable implementation, and we can safely put the grumpy old GIC to rest. Fellow hackers, please raise your glass in memory of the GIC: The GIC is dead, long live the GIC! Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-07-03 23:09:37 +02:00
Xiubo Li	caf1ff26e1	kvm: Fix irq route entries exceeding KVM_MAX_IRQ_ROUTES These days, we experienced one guest crash with 8 cores and 3 disks, with qemu error logs as bellow: qemu-system-x86_64: /build/qemu-2.0.0/kvm-all.c:984: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. And then we found one patch(bdf026317d) in qemu tree, which said could fix this bug. Execute the following script will reproduce the BUG quickly: irq_affinity.sh ======================================================================== vda_irq_num=25 vdb_irq_num=27 while [ 1 ] do for irq in {1,2,4,8,10,20,40,80} do echo $irq > /proc/irq/$vda_irq_num/smp_affinity echo $irq > /proc/irq/$vdb_irq_num/smp_affinity dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct done done ======================================================================== The following qemu log is added in the qemu code and is displayed when this bug reproduced: kvm_irqchip_commit_routes: max gsi: 1008, nr_allocated_irq_routes: 1024, irq_routes->nr: 1024, gsi_count: 1024. That's to say when irq_routes->nr == 1024, there are 1024 routing entries, but in the kernel code when routes->nr >= 1024, will just return -EINVAL; The nr is the number of the routing entries which is in of [1 ~ KVM_MAX_IRQ_ROUTES], not the index in [0 ~ KVM_MAX_IRQ_ROUTES - 1]. This patch fix the BUG above. Cc: stable@vger.kernel.org Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Signed-off-by: Wei Tang <tangwei@cmss.chinamobile.com> Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-16 09:38:15 +02:00
Paolo Bonzini	557abc40d1	KVM: remove kvm_vcpu_compatible The new created_vcpus field makes it possible to avoid the race between irqchip and VCPU creation in a much nicer way; just check under kvm->lock whether a VCPU has already been created. We can then remove KVM_APIC_ARCHITECTURE too, because at this point the symbol is only governing the default definition of kvm_vcpu_compatible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-16 00:05:00 +02:00
Paolo Bonzini	6c7caebc26	KVM: introduce kvm->created_vcpus The race between creating the irqchip and the first VCPU is currently fixed by checking the presence of an irqchip before updating kvm->online_vcpus, and undoing the whole VCPU creation if someone created the irqchip in the meanwhile. Instead, introduce a new field in struct kvm that will count VCPUs under a mutex, without the atomic access and memory ordering that we need elsewhere to protect the vcpus array. This also plugs the race and is more easily applicable in all similar circumstances. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-06-16 00:05:00 +02:00
Paolo Bonzini	f8c1b85b25	KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID This causes an ugly dmesg splat. Beautified syzkaller testcase: #include <unistd.h> #include <sys/syscall.h> #include <sys/ioctl.h> #include <fcntl.h> #include <linux/kvm.h> long r[8]; int main() { struct kvm_irq_routing ir = { 0 }; r[2] = open("/dev/kvm", O_RDWR); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_SET_GSI_ROUTING, &ir); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>	2016-06-02 17:38:50 +02:00
Paolo Bonzini	c622a3c21e	KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi Found by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 0000000000000120 IP: [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm] PGD 6f80b067 PUD b6535067 PMD 0 Oops: 0000 [#1] SMP CPU: 3 PID: 4988 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1 [...] Call Trace: [<ffffffffa0795f62>] irqfd_update+0x32/0xc0 [kvm] [<ffffffffa0796c7c>] kvm_irqfd+0x3dc/0x5b0 [kvm] [<ffffffffa07943f4>] kvm_vm_ioctl+0x164/0x6f0 [kvm] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480 [<ffffffff812418a9>] SyS_ioctl+0x79/0x90 [<ffffffff817a1062>] tracesys_phase2+0x84/0x89 Code: b5 71 a7 e0 5b 41 5c 41 5d 5d f3 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 8b 8f 10 2e 00 00 31 c0 48 89 e5 <39> 91 20 01 00 00 76 6a 48 63 d2 48 8b 94 d1 28 01 00 00 48 85 RIP [<ffffffffa0797202>] kvm_irq_map_gsi+0x12/0x90 [kvm] RSP <ffff8800926cbca8> CR2: 0000000000000120 Testcase: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[26]; int main() { memset(r, -1, sizeof(r)); r[2] = open("/dev/kvm", 0); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); struct kvm_irqfd ifd; ifd.fd = syscall(SYS_eventfd2, 5, 0); ifd.gsi = 3; ifd.flags = 2; ifd.resamplefd = ifd.fd; r[25] = ioctl(r[3], KVM_IRQFD, &ifd); return 0; } Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>	2016-06-02 17:38:50 +02:00
Marc Zyngier	05fb05a6ca	KVM: arm/arm64: vgic-new: Removel harmful BUG_ON When changing the active bit from an MMIO trap, we decide to explode if the intid is that of a private interrupt. This flawed logic comes from the fact that we were assuming that kvm_vcpu_kick() as called by kvm_arm_halt_vcpu() would not return before the called vcpu responded, but this is not the case, so we need to perform this wait even for private interrupts. Dropping the BUG_ON seems like the right thing to do. [ Commit message tweaked by Christoffer ] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-06-02 11:52:21 +02:00
Marc Zyngier	637d122baa	KVM: arm/arm64: vgic-v3: Always resample level interrupts When reading back from the list registers, we need to perform two actions for level interrupts: 1) clear the soft-pending bit if the interrupt is not pending anymore in the list register 2) resample the line level and propagate it to the pending state But these two actions shouldn't be linked, and we should always resample the line level, no matter what state is in the list register. Otherwise, we may end-up injecting spurious interrupts that have been already retired. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-31 16:12:16 +02:00
Marc Zyngier	df7942d17e	KVM: arm/arm64: vgic-v2: Always resample level interrupts When reading back from the list registers, we need to perform two actions for level interrupts: 1) clear the soft-pending bit if the interrupt is not pending anymore in the list register 2) resample the line level and propagate it to the pending state But these two actions shouldn't be linked, and we should always resample the line level, no matter what state is in the list register. Otherwise, we may end-up injecting spurious interrupts that have been already retired. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-31 16:12:15 +02:00
Christoffer Dall	4d3afc9bad	KVM: arm/arm64: vgic-v2: Clear all dirty LRs When saving the state of the list registers, it is critical to reset them zero, as we could otherwise leave unexpected EOI interrupts pending for virtual level interrupts. Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-05-31 16:09:28 +02:00
Janosch Frank	536a6f88c4	KVM: Create debugfs dir and stat files for each VM This patch adds a kvm debugfs subdirectory for each VM, which is named after its pid and file descriptor. The directories contain the same kind of files that are already in the kvm debugfs directory, but the data exported through them is now VM specific. This makes the debugfs kvm data a convenient alternative to the tracepoints which already have per VM data. The debugfs data is easy to read and low overhead. CC: Dan Carpenter <dan.carpenter@oracle.com> [includes fixes by Dan Carpenter] Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-25 16:12:05 +02:00
Paolo Bonzini	44bcc92238	KVM/ARM Changes for v4.7 take 2 "The GIC is dead; Long live the GIC" This set of changes include the new vgic, which is a reimplementation of our horribly broken legacy vgic implementation. The two implementations will live side-by-side (with the new being the configured default) for one kernel release and then we'll remove it. Also fixes a non-critical issue with virtual abort injection to guests. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXRA7PAAoJEEtpOizt6ddy9a0H+wQ5LTC4rrgcWOsLjfa7res7 SqB3HEqrzmN3zMbv4dgYeMSBDqr6F3b1+8DJ6n1WkG7afXOKrpMWsl9SXgwmRc1q 8H54DYLcu/CDakzi/FKTeZEIp/u+tpMC7xQLFk8PFx/NUPfspPt1NU6Qi0fumZj6 5AbXwC6FKojplgO6wxV7oHRRiEnfrN5F+1whD3QlaDmI+rlNUSYNp2Ljhp8k9m9E NSeGzZgeMdHOeZpv60uhjotuxegG1zmHeHI59ltNytdfjuFL3LvNm18yG8u1yjKm 9ZDjIu1m7dfuPw39DYC99cxP6tvEh03/N2zw86ZFgj7QfdW756WrV+UkSFfeIKE= =xy+z -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4-7-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next KVM/ARM Changes for v4.7 take 2 "The GIC is dead; Long live the GIC" This set of changes include the new vgic, which is a reimplementation of our horribly broken legacy vgic implementation. The two implementations will live side-by-side (with the new being the configured default) for one kernel release and then we'll remove it. Also fixes a non-critical issue with virtual abort injection to guests.	2016-05-24 12:10:51 +02:00
Christoffer Dall	35a2d58588	KVM: arm/arm64: vgic-new: Synchronize changes to active state When modifying the active state of an interrupt via the MMIO interface, we should ensure that the write has the intended effect. If a guest sets an interrupt to active, but that interrupt is already flushed into a list register on a running VCPU, then that VCPU will write the active state back into the struct vgic_irq upon returning from the guest and syncing its state. This is a non-benign race, because the guest can observe that an interrupt is not active, and it can have a reasonable expectations that other VCPUs will not ack any IRQs, and then set the state to active, and expect it to stay that way. Currently we are not honoring this case. Thefore, change both the SACTIVE and CACTIVE mmio handlers to stop the world, change the irq state, potentially queue the irq if we're setting it to active, and then continue. We take this chance to slightly optimize these functions by not stopping the world when touching private interrupts where there is inherently no possible race. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 16:26:38 +02:00
Andre Przywara	efffe55af5	KVM: arm/arm64: vgic-new: enable build Now that the new VGIC implementation has reached feature parity with the old one, add the new files to the build system and add a Kconfig option to switch between the two versions. We set the default to the new version to get maximum test coverage, in case people experience problems they can switch back to the old behaviour if needed. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:09 +02:00
Andre Przywara	568e8c901e	KVM: arm/arm64: vgic-new: implement mapped IRQ handling We now store the mapped hardware IRQ number in our struct, so we don't need the irq_phys_map for the new VGIC. Implement the hardware IRQ mapping on top of the reworked arch timer interface. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:09 +02:00
Andre Przywara	03f0c94c73	KVM: arm/arm64: vgic-new: Wire up irqfd injection Connect to the new VGIC to the irqfd framework, so that we can inject IRQs. GSI routing and MSI routing is not yet implemented. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:08 +02:00
Eric Auger	f7b6985cc3	KVM: arm/arm64: vgic-new: Add vgic_v2/v3_enable Enable the VGIC operation by properly initialising the registers in the hypervisor GIC interface. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:07 +02:00
Eric Auger	b0442ee227	KVM: arm/arm64: vgic-new: vgic_init: implement map_resources map_resources is the last initialization step. It is executed on first VCPU run. At that stage the code checks that userspace has provided the base addresses for the relevant VGIC regions, which depend on the type of VGIC that is exposed to the guest. Also we check if the two regions overlap. If the checks succeeded, we register the respective register frames with the kvm_io_bus framework. If we emulate a GICv2, the function also forces vgic_init execution if it has not been executed yet. Also we map the virtual GIC CPU interface onto the guest's CPU interface. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:07 +02:00
Eric Auger	ad275b8bb1	KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init This patch allocates and initializes the data structures used to model the vgic distributor and virtual cpu interfaces. At that stage the number of IRQs and number of virtual CPUs is frozen. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:06 +02:00
Eric Auger	5e6431da8f	KVM: arm/arm64: vgic-new: vgic_init: implement vgic_create This patch implements the vgic_creation function which is called on CREATE_IRQCHIP VM IOCTL (v2 only) or KVM_CREATE_DEVICE Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:06 +02:00
Eric Auger	9097773245	KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init Implements kvm_vgic_hyp_init and vgic_probe function. This uses the new firmware independent VGIC probing to support both ACPI and DT based systems (code from Marc Zyngier). The vgic_global struct is enriched with new fields populated by those functions. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:05 +02:00
Andre Przywara	878c569e45	KVM: arm/arm64: vgic-new: Add userland GIC CPU interface access Using the VMCR accessors we provide access to GIC CPU interface state to userland by wiring it up to the existing userland interface. [Marc: move and make VMCR accessors static, streamline MMIO handlers] Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:05 +02:00
Andre Przywara	e4823a7a1b	KVM: arm/arm64: vgic-new: Add GICH_VMCR accessors Since the GIC CPU interface is always virtualized by the hardware, we don't have CPU interface state information readily available in our emulation if userland wants to save or restore it. Fortunately the GIC hypervisor interface provides the VMCR register to access the required virtual CPU interface bits. Provide wrappers for GICv2 and GICv3 hosts to have access to this register. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:04 +02:00
Andre Przywara	7d450e2821	KVM: arm/arm64: vgic-new: Add userland access to VGIC dist registers Userland may want to save and restore the state of the in-kernel VGIC, so we provide the code which takes a userland request and translate that into calls to our MMIO framework. From Christoffer: When accessing the VGIC state from userspace we really don't want a VCPU to be messing with the state at the same time, and the API specifies that we should return -EBUSY if any VCPUs are running. Check and prevent VCPUs from running by grabbing their mutexes, one by one, and error out if we fail. (Note: This could potentially be simplified to just do a simple check and see if any VCPUs are running, and return -EBUSY then, without enforcing the locking throughout the duration of the uaccess, if we think that taking/releasing all these mutexes for every single GIC register access is too heavyweight.) Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:03 +02:00
Christoffer Dall	c3199f28e0	KVM: arm/arm64: vgic-new: Export register access interface Userland can access the emulated GIC to save and restore its state for initialization or migration purposes. The kvm_io_bus API requires an absolute gpa, which does not fit the KVM_DEV_ARM_VGIC_GRP_DIST_REGS user API, that only provides relative offsets. So we provide a wrapper to plug into our MMIO framework and find the respective register handler. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:40:03 +02:00
Eric Auger	f94591e2e6	KVM: arm/arm64: vgic-new: vgic_kvm_device: access to VGIC registers This patch implements the switches for KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS API which allows the userspace to access VGIC registers. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:02 +02:00
Eric Auger	e5c3029467	KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM_DEV_ARM_VGIC_GRP_ADDR This patch implements the KVM_DEV_ARM_VGIC_GRP_ADDR group which enables to set the base address of GIC regions as seen by the guest. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:02 +02:00
Eric Auger	e2c1f9abff	KVM: arm/arm64: vgic-new: vgic_kvm_device: implement kvm_vgic_addr kvm_vgic_addr is used by the userspace to set the base address of the following register regions, as seen by the guest: - distributor(v2 and v3), - re-distributors (v3), - CPU interface (v2). Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:01 +02:00
Eric Auger	afcc7c50ce	KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM_DEV_ARM_VGIC_GRP_CTRL This patch implements the KVM_DEV_ARM_VGIC_GRP_CTRL group API featuring KVM_DEV_ARM_VGIC_CTRL_INIT attribute. The vgic_init function is not yet implemented though. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:00 +02:00
Eric Auger	fca256026b	KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM_DEV_ARM_VGIC_GRP_NR_IRQS This patch implements the KVM_DEV_ARM_VGIC_GRP_NR_IRQS group. This modality is supported by both VGIC V2 and V3 KVM device as will be other groups, hence the introduction of common helpers. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:40:00 +02:00
Eric Auger	c86c772191	KVM: arm/arm64: vgic-new: vgic_kvm_device: KVM device ops registration This patch introduces the skeleton for the KVM device operations associated to KVM_DEV_TYPE_ARM_VGIC_V2 and KVM_DEV_TYPE_ARM_VGIC_V3. At that stage kvm_vgic_create is stubbed. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:59 +02:00
Andre Przywara	621ecd8d21	KVM: arm/arm64: vgic-new: Add GICv3 SGI system register trap handler In contrast to GICv2 SGIs in a GICv3 implementation are not triggered by a MMIO write, but with a system register write. KVM knows about that register already, we just need to implement the handler and wire it up to the core KVM/ARM code. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:59 +02:00
Andre Przywara	78a714aba0	KVM: arm/arm64: vgic-new: Add GICv3 IROUTER register handlers Since GICv3 supports much more than the 8 CPUs the GICv2 ITARGETSR register can handle, the new IROUTER register covers the whole range of possible target (V)CPUs by using the same MPIDR that the cores report themselves. In addition to translating this MPIDR into a vcpu pointer we store the originally written value as well. The architecture allows to write any values into the register, which must be read back as written. Since we don't support affinity level 3, we don't need to take care about the upper word of this 64-bit register, which simplifies the handling a bit. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:58 +02:00
Andre Przywara	54f59d2b3a	KVM: arm/arm64: vgic-new: Add GICv3 IDREGS register handler We implement the only one ID register that is required by the architecture, also this is the one that Linux actually checks. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:57 +02:00
Andre Przywara	741972d8a6	KVM: arm/arm64: vgic-new: Add GICv3 redistributor IIDR and TYPER handler The redistributor TYPER tells the OS about the associated MPIDR, also the LAST bit is crucial to determine the number of redistributors. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:57 +02:00
Andre Przywara	fd59ed3be1	KVM: arm/arm64: vgic-new: Add GICv3 CTLR, IIDR, TYPER handlers As in the GICv2 emulation we handle those three registers in one function. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:56 +02:00
Andre Przywara	ed9b8cefa9	KVM: arm/arm64: vgic-new: Add GICv3 MMIO handling framework Create a new file called vgic-mmio-v3.c and describe the GICv3 distributor and redistributor registers there. This adds a special macro to deal with the split of SGI/PPI in the redistributor and SPIs in the distributor, which allows us to reuse the existing GICv2 handlers for those registers which are compatible. Also we provide a function to deal with the registration of the two separate redistributor frames per VCPU. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:56 +02:00
Andre Przywara	ed40213ef9	KVM: arm/arm64: vgic-new: Add SGIPENDR register handlers As this register is v2 specific, its implementation lives entirely in vgic-mmio-v2.c. This register allows setting the source mask of an IPI. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:55 +02:00
Andre Przywara	55cc01fb90	KVM: arm/arm64: vgic-new: Add SGIR register handler Triggering an IPI via this register is v2 specific, so the implementation lives entirely in vgic-mmio-v2.c. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:55 +02:00
Andre Przywara	2c234d6f18	KVM: arm/arm64: vgic-new: Add TARGET registers handlers The target register handlers are v2 emulation specific, so their implementation lives entirely in vgic-mmio-v2.c. We copy the old VGIC behaviour of assigning an IRQ to the first VCPU set in the target mask instead of making it possibly pending on multiple VCPUs. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:54 +02:00
Andre Przywara	79717e4ac0	KVM: arm/arm64: vgic-new: Add CONFIG registers handlers The config register handlers are shared between the v2 and v3 emulation, so their implementation goes into vgic-mmio.c, to be easily referenced from the v3 emulation as well later. Signed-off-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:53 +02:00
Andre Przywara	055658bf48	KVM: arm/arm64: vgic-new: Add PRIORITY registers handlers The priority register handlers are shared between the v2 and v3 emulation, so their implementation goes into vgic-mmio.c, to be easily referenced from the v3 emulation as well later. There is a corner case when we change the priority of a pending interrupt which we don't handle at the moment. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:53 +02:00
Andre Przywara	69b6fe0c6e	KVM: arm/arm64: vgic-new: Add ACTIVE registers handlers The active register handlers are shared between the v2 and v3 emulation, so their implementation goes into vgic-mmio.c, to be easily referenced from the v3 emulation as well later. Since activation/deactivation of an interrupt may happen entirely in the guest without it ever exiting, we need some extra logic to properly track the active state. For clearing the active state, we basically have to halt the guest to make sure this is properly propagated into the respective VCPUs. Signed-off-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:52 +02:00
Andre Przywara	96b298000d	KVM: arm/arm64: vgic-new: Add PENDING registers handlers The pending register handlers are shared between the v2 and v3 emulation, so their implementation goes into vgic-mmio.c, to be easily referenced from the v3 emulation as well later. For level triggered interrupts the real line level is unaffected by this write, so we keep this state separate and combine it with the device's level to get the actual pending state. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:52 +02:00
Andre Przywara	fd122e6209	KVM: arm/arm64: vgic-new: Add ENABLE registers handlers As the enable register handlers are shared between the v2 and v3 emulation, their implementation goes into vgic-mmio.c, to be easily referenced from the v3 emulation as well later. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:51 +02:00
Marc Zyngier	2b0cda8789	KVM: arm/arm64: vgic-new: Add CTLR, TYPER and IIDR handlers Those three registers are v2 emulation specific, so their implementation lives entirely in vgic-mmio-v2.c. Also they are handled in one function, as their implementation is pretty simple. When the guest enables the distributor, we kick all VCPUs to get potentially pending interrupts serviced. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:50 +02:00
Andre Przywara	fb848db396	KVM: arm/arm64: vgic-new: Add GICv2 MMIO handling framework Create vgic-mmio-v2.c to describe GICv2 emulation specific handlers using the initializer macros provided by the VGIC MMIO framework. Provide a function to register the GICv2 distributor registers to the kvm_io_bus framework. The actual handler functions are still stubs in this patch. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:50 +02:00
Marc Zyngier	4493b1c486	KVM: arm/arm64: vgic-new: Add MMIO handling framework Add an MMIO handling framework to the VGIC emulation: Each register is described by its offset, size (or number of bits per IRQ, if applicable) and the read/write handler functions. We provide initialization macros to describe each GIC register later easily. Separate dispatch functions for read and write accesses are connected to the kvm_io_bus framework and binary-search for the responsible register handler based on the offset address within the region. We convert the incoming data (referenced by a pointer) to the host's endianess and use pass-by-value to hand the data over to the actual handler functions. The register handler prototype and the endianess conversion are courtesy of Christoffer Dall. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:49 +02:00
Eric Auger	90eee56c5f	KVM: arm/arm64: vgic-new: Implement kvm_vgic_vcpu_pending_irq Tell KVM whether a particular VCPU has an IRQ that needs handling in the guest. This is used to decide whether a VCPU is runnable. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>	2016-05-20 15:39:49 +02:00
Marc Zyngier	59529f69f5	KVM: arm/arm64: vgic-new: Add GICv3 world switch backend As the GICv3 virtual interface registers differ from their GICv2 siblings, we need different handlers for processing maintenance interrupts and reading/writing to the LRs. Implement the respective handler functions and connect them to existing code to be called if the host is using a GICv3. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:48 +02:00
Marc Zyngier	140b086dd1	KVM: arm/arm64: vgic-new: Add GICv2 world switch backend Processing maintenance interrupts and accessing the list registers are dependent on the host's GIC version. Introduce vgic-v2.c to contain GICv2 specific functions. Implement the GICv2 specific code for syncing the emulation state into the VGIC registers. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:48 +02:00
Marc Zyngier	0919e84c0f	KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework Implement the framework for syncing IRQs between our emulation and the list registers, which represent the guest's view of IRQs. This is done in kvm_vgic_flush_hwstate and kvm_vgic_sync_hwstate, which gets called on guest entry and exit. The code talking to the actual GICv2/v3 hardware is added in the following patches. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:47 +02:00
Christoffer Dall	8e44474579	KVM: arm/arm64: vgic-new: Add IRQ sorting Adds the sorting function to cover the case where you have more IRQs to consider than you have LRs. We now consider priorities. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>	2016-05-20 15:39:46 +02:00
Christoffer Dall	81eeb95ddb	KVM: arm/arm64: vgic-new: Implement virtual IRQ injection Provide a vgic_queue_irq_unlock() function which decides whether a given IRQ needs to be queued to a VCPU's ap_list. This should be called whenever an IRQ becomes pending or enabled, either as a result of userspace injection, from in-kernel emulated devices like the architected timer or from MMIO accesses to the distributor emulation. Also provides the necessary functions to allow userland to inject an IRQ to a guest. Since this is the first code that starts using our locking mechanism, we add some (hopefully) clear documentation of our locking strategy and requirements along with this patch. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:46 +02:00
Christoffer Dall	64a959d66e	KVM: arm/arm64: vgic-new: Add acccessor to new struct vgic_irq instance The new VGIC implementation centers around a struct vgic_irq instance per virtual IRQ. Provide a function to retrieve the right instance for a given IRQ number and (in case of private interrupts) the right VCPU. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com>	2016-05-20 15:39:45 +02:00
Andre Przywara	44bfc42e94	KVM: arm/arm64: move GICv2 emulation defines into arm-gic-v3.h As (some) GICv3 hosts can emulate a GICv2, some GICv2 specific masks for the list register definition also apply to GICv3 LRs. At the moment we have those definitions in the KVM VGICv3 implementation, so let's move them into the GICv3 header file to have them automatically defined. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com>	2016-05-20 15:39:44 +02:00
Andre Przywara	2defaff48a	KVM: arm/arm64: pmu: abstract access to number of SPIs Currently the PMU uses a member of the struct vgic_dist directly, which not only breaks abstraction, but will fail with the new VGIC. Abstract this access in the VGIC header file and refactor the validity check in the PMU code. Signed-off-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:43 +02:00
Christoffer Dall	83091db981	KVM: arm/arm64: Fix MMIO emulation data handling When the kernel was handling a guest MMIO read access internally, we need to copy the emulation result into the run->mmio structure in order for the kvm_handle_mmio_return() function to pick it up and inject the result back into the guest. Currently the only user of kvm_io_bus for ARM is the VGIC, which did this copying itself, so this was not causing issues so far. But with the upcoming new vgic implementation we need this done properly. Update the kvm_handle_mmio_return description and cleanup the code to only perform a single copying when needed. Code and commit message inspired by Andre Przywara. Reported-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:42 +02:00
Christoffer Dall	2db4c104fa	KVM: arm/arm64: Get rid of vgic_cpu->nr_lr The number of list registers is a property of the underlying system, not of emulated VGIC CPU interface. As we are about to move this variable to global state in the new vgic for clarity, move it from the legacy implementation as well to make the merge of the new code easier. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:41 +02:00
Christoffer Dall	41a54482c0	KVM: arm/arm64: Move timer IRQ map to latest possible time We are about to modify the VGIC to allocate all data structures dynamically and store mapped IRQ information on a per-IRQ struct, which is indeed allocated dynamically at init time. Therefore, we cannot record the mapped IRQ info from the timer at timer reset time like it's done now, because VCPU reset happens before timer init. A possible later time to do this is on the first run of a per VCPU, it just requires us to move the enable state to be a per-VCPU state and do the lookup of the physical IRQ number when we are about to run the VCPU. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com>	2016-05-20 15:39:41 +02:00
Andre Przywara	c8eb3f6b9b	KVM: arm/arm64: vgic: Remove irq_phys_map from interface Now that the virtual arch timer does not care about the irq_phys_map anymore, let's rework kvm_vgic_map_phys_irq() to return an error value instead. Any reference to that mapping can later be done by passing the correct combination of VCPU and virtual IRQ number. This makes the irq_phys_map handling completely private to the VGIC code. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:40 +02:00
Andre Przywara	a7e33ad9b2	KVM: arm/arm64: arch_timer: Remove irq_phys_map Now that the interface between the arch timer and the VGIC does not require passing the irq_phys_map entry pointer anymore, let's remove it from the virtual arch timer and use the virtual IRQ number instead directly. The remaining pointer returned by kvm_vgic_map_phys_irq() will be removed in the following patch. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:39 +02:00
Christoffer Dall	b452cb5207	KVM: arm/arm64: Remove the IRQ field from struct irq_phys_map The communication of a Linux IRQ number from outside the VGIC to the vgic was a leftover from the day when the vgic code cared about how a particular device injects virtual interrupts mapped to a physical interrupt. We can safely remove this notion, leaving all physical IRQ handling to be done in the device driver (the arch timer in this case), which makes room for a saner API for the new VGIC. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org>	2016-05-20 15:39:39 +02:00
Andre Przywara	63306c28ac	KVM: arm/arm64: vgic: avoid map in kvm_vgic_unmap_phys_irq() kvm_vgic_unmap_phys_irq() only needs the virtual IRQ number, so let's just pass that between the arch timer and the VGIC to get rid of the irq_phys_map pointer. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:38 +02:00
Andre Przywara	e262f41936	KVM: arm/arm64: vgic: avoid map in kvm_vgic_map_is_active() For getting the active state of a mapped IRQ, we actually only need the virtual IRQ number, not the pointer to the mapping entry. Pass the virtual IRQ number from the arch timer to the VGIC directly. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:38 +02:00
Andre Przywara	4f551a3d96	KVM: arm/arm64: vgic: avoid map in kvm_vgic_inject_mapped_irq() When we want to inject a hardware mapped IRQ into a guest, we actually only need the virtual IRQ number from the irq_phys_map. So let's pass this number directly from the arch timer to the VGIC to avoid using the map as a parameter. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:37 +02:00
Andre Przywara	7cbc084dc2	KVM: arm/arm64: vgic: streamline vgic_update_irq_pending() interface We actually don't use the irq_phys_map parameter in vgic_update_irq_pending(), so let's just remove it. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Eric Auger <eric.auger@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>	2016-05-20 15:39:37 +02:00
Radim Krčmář	dd1a4cc1fb	KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick AVIC has a use for kvm_vcpu_wake_up. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-18 18:04:27 +02:00
Christian Borntraeger	2086d3200d	KVM: shrink halt polling even more for invalid wakeups commit `3491caf275` ("KVM: halt_polling: provide a way to qualify wakeups during poll") added more aggressive shrinking of the polling interval if the wakeup did not match some criteria. This still allows to keep polling enabled if the polling time was smaller that the current max poll time (block_ns <= vcpu->halt_poll_ns). Performance measurement shows that even more aggressive shrinking (shrink polling on any invalid wakeup) reduces absolute and relative (to the workload) CPU usage even further. Cc: David Matlack <dmatlack@google.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Radim Krčmář <rkrcmar@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-18 18:04:22 +02:00

1 2 3 4 5 ...

1152 Commits