linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Jeremy Fitzhardinge	667c78afae	xen: Provide a variant of __RING_SIZE() that is an integer constant expression Without this, gcc 4.5 won't compile xen-netfront and xen-blkfront, where this is being used to specify array sizes. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Miller <davem@davemloft.net> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-12-15 12:34:28 -08:00
Stefano Stabellini	af42b8d12f	xen: fix MSI setup and teardown for PV on HVM guests When remapping MSIs into pirqs for PV on HVM guests, qemu is responsible for doing the actual mapping and unmapping. We only give qemu the desired pirq number when we ask to do the mapping the first time, after that we should be reading back the pirq number from qemu every time we want to re-enable the MSI. This fixes a bug in xen_hvm_setup_msi_irqs that manifests itself when trying to enable the same MSI for the second time: the old MSI to pirq mapping is still valid at this point but xen_hvm_setup_msi_irqs would try to assign a new pirq anyway. A simple way to reproduce this bug is to assign an MSI capable network card to a PV on HVM guest, if the user brings down the corresponding ethernet interface and up again, Linux would fail to enable MSIs on the device. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-12-02 14:34:25 +00:00
Stefano Stabellini	e5fc734541	xen: use PHYSDEVOP_get_free_pirq to implement find_unbound_pirq Use the new hypercall PHYSDEVOP_get_free_pirq to ask Xen to allocate a pirq. Remove the unsupported PHYSDEVOP_get_nr_pirqs hypercall to get the amount of pirq available. This fixes find_unbound_pirq that otherwise would return a number starting from nr_irqs that might very well be out of range in Xen. The symptom of this bug is that when you passthrough an MSI capable pci device to a PV on HVM guest, Linux would fail to enable MSIs on the device. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-12-02 14:28:22 +00:00
Jeremy Fitzhardinge	9b8321531a	Merge branches 'upstream/core', 'upstream/xenfs' and 'upstream/evtchn' into upstream/for-linus * upstream/core: xen/events: Use PIRQ instead of GSI value when unmapping MSI/MSI-X irqs. xen: set IO permission early (before early_cpu_init()) xen: re-enable boot-time ballooning xen/balloon: make sure we only include remaining extra ram xen/balloon: the balloon_lock is useless xen: add extra pages to balloon xen/events: use locked set\|clear_bit() for cpu_evtchn_mask xen/evtchn: clear secondary CPUs' cpu_evtchn_mask[] after restore xen: implement XENMEM_machphys_mapping * upstream/xenfs: Revert "xen/privcmd: create address space to allow writable mmaps" xen/xenfs: update xenfs_mount for new prototype xen: fix header export to userspace xen: set vma flag VM_PFNMAP in the privcmd mmap file_op xen: xenfs: privcmd: check put_user() return code * upstream/evtchn: xen: make evtchn's name less generic xen/evtchn: the evtchn device is non-seekable xen/evtchn: add missing static xen/evtchn: Fix name of Xen event-channel device xen/evtchn: don't do unbind_from_irqhandler under spinlock xen/evtchn: remove spurious barrier xen/evtchn: ports start enabled xen/evtchn: dynamically allocate port_user array xen/evtchn: track enabled state for each port	2010-11-22 12:22:42 -08:00
Jeremy Fitzhardinge	9be4d45759	xen: add extra pages to balloon Add extra pages in the pseudo-physical address space to the balloon so we can extend into them later. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-19 22:15:59 -08:00
Jeremy Fitzhardinge	20b4755e4f	Merge commit 'v2.6.37-rc2' into upstream/xenfs * commit 'v2.6.37-rc2': (10093 commits) Linux 2.6.37-rc2 capabilities/syslog: open code cap_syslog logic to fix build failure i2c: Sanity checks on adapter registration i2c: Mark i2c_adapter.id as deprecated i2c: Drivers shouldn't include <linux/i2c-id.h> i2c: Delete unused adapter IDs i2c: Remove obsolete cleanup for clientdata include/linux/kernel.h: Move logging bits to include/linux/printk.h Fix gcc 4.5.1 miscompiling drivers/char/i8k.c (again) hwmon: (w83795) Check for BEEP pin availability hwmon: (w83795) Clear intrusion alarm immediately hwmon: (w83795) Read the intrusion state properly hwmon: (w83795) Print the actual temperature channels as sources hwmon: (w83795) List all usable temperature sources hwmon: (w83795) Expose fan control method hwmon: (w83795) Fix fan control mode attributes hwmon: (lm95241) Check validity of input values hwmon: Change mail address of Hans J. Koch PCI: sysfs: fix printk warnings GFS2: Fix inode deallocation race ...	2010-11-16 11:06:22 -08:00
Randy Dunlap	744f9f104e	xen: fix header export to userspace scripts/headers_install.pl prevents "__user" from being exported to userspace headers, so just use compiler.h to make sure that __user is defined and avoid the error. unifdef: linux-next-20101112/xx64/usr/include/xen/privcmd.h.tmp: 79: Premature EOF (#if line 33 depth 1) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xensource.com (moderated for non-subscribers) Cc: virtualization@lists.osdl.org Cc: Tony Finch <dot@dotat.at> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-11-16 10:57:05 -08:00
Ian Campbell	7e77506a59	xen: implement XENMEM_machphys_mapping This hypercall allows Xen to specify a non-default location for the machine to physical mapping. This capability is used when running a 32 bit domain 0 on a 64 bit hypervisor to shrink the hypervisor hole to exactly the size required. [ Impact: add Xen hypercall definitions ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-11-12 15:00:06 -08:00
Linus Torvalds	18cb657ca1	Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm * 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: xen: register xen pci notifier xen: initialize cpu masks for pv guests in xen_smp_init xen: add a missing #include to arch/x86/pci/xen.c xen: mask the MTRR feature from the cpuid xen: make hvc_xen console work for dom0. xen: add the direct mapping area for ISA bus access xen: Initialize xenbus for dom0. xen: use vcpu_ops to setup cpu masks xen: map a dummy page for local apic and ioapic in xen_set_fixmap xen: remap MSIs into pirqs when running as initial domain xen: remap GSIs as pirqs when running as initial domain xen: introduce XEN_DOM0 as a silent option xen: map MSIs into pirqs xen: support GSI -> pirq remapping in PV on HVM guests xen: add xen hvm acpi_register_gsi variant acpi: use indirect call to register gsi in different modes xen: implement xen_hvm_register_pirq xen: get the maximum number of pirqs from xen xen: support pirq != irq * 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits) X86/PCI: Remove the dependency on isapnp_disable. xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright. x86: xen: Sanitse irq handling (part two) swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it. MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer. xen/pci: Request ACS when Xen-SWIOTLB is activated. xen-pcifront: Xen PCI frontend driver. xenbus: prevent warnings on unhandled enumeration values xenbus: Xen paravirtualised PCI hotplug support. xen/x86/PCI: Add support for the Xen PCI subsystem x86: Introduce x86_msi_ops msi: Introduce default_[teardown\|setup]_msi_irqs with fallback. x86/PCI: Export pci_walk_bus function. x86/PCI: make sure _PAGE_IOMAP it set on pci mappings x86/PCI: Clean up pci_cache_line_size xen: fix shared irq device passthrough xen: Provide a variant of xen_poll_irq with timeout. xen: Find an unbound irq number in reverse order (high to low). xen: statically initialize cpu_evtchn_mask_p ... Fix up trivial conflicts in drivers/pci/Makefile	2010-10-28 17:11:17 -07:00
Weidong Han	e28c31a96b	xen: register xen pci notifier Register a pci notifier to add (or remove) pci devices to Xen via hypercalls. Xen needs to know the pci devices present in the system to handle pci passthrough and even MSI remapping in the initial domain. Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-10-27 18:56:07 +01:00
Linus Torvalds	520045db94	Merge branches 'upstream/xenfs' and 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/xenfs' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xen/privcmd: make privcmd visible in domU xen/privcmd: move remap_domain_mfn_range() to core xen code and export. privcmd: MMAPBATCH: Fix error handling/reporting xenbus: export xen_store_interface for xenfs xen/privcmd: make sure vma is ours before doing anything to it xen/privcmd: print SIGBUS faults xen/xenfs: set_page_dirty is supposed to return true if it dirties xen/privcmd: create address space to allow writable mmaps xen: add privcmd driver xen: add variable hypercall caller xen: add xen_set_domain_pte() xen: add /proc/xen/xsd_{kva,port} to xenfs * 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: (29 commits) xen: include xen/xen.h for definition of xen_initial_domain() xen: use host E820 map for dom0 xen: correctly rebuild mfn list list after migration. xen: improvements to VIRQ_DEBUG output xen: set up IRQ before binding virq to evtchn xen: ensure that all event channels start off bound to VCPU 0 xen/hvc: only notify if we actually sent something xen: don't add extra_pages for RAM after mem_end xen: add support for PAT xen: make sure xen_max_p2m_pfn is up to date xen: limit extra memory to a certain ratio of base xen: add extra pages for E820 RAM regions, even if beyond mem_end xen: make sure xen_extra_mem_start is beyond all non-RAM e820 xen: implement "extra" memory to reserve space for pages not present at boot xen: Use host-provided E820 map xen: don't map missing memory xen: defer building p2m mfn structures until kernel is mapped xen: add return value to set_phys_to_machine() xen: convert p2m to a 3 level tree xen: make install_p2mtop_page() static ... Fix up trivial conflict in arch/x86/xen/mmu.c, and fix the use of 'reserve_early()' - in the new memblock world order it is now 'memblock_x86_reserve_range()' instead. Pointed out by Jeremy.	2010-10-26 18:20:19 -07:00
Jeremy Fitzhardinge	4fe7d5a708	xen: make hvc_xen console work for dom0. Use the console hypercalls for dom0 console. [ Impact: Add Xen dom0 console ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:26:01 +01:00
Qing He	f731e3ef02	xen: remap MSIs into pirqs when running as initial domain Implement xen_create_msi_irq to create an msi and remap it as pirq. Use xen_create_msi_irq to implement an initial domain specific version of setup_msi_irqs. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:44 +01:00
Jeremy Fitzhardinge	38aa66fcb7	xen: remap GSIs as pirqs when running as initial domain Implement xen_register_gsi to setup the correct triggering and polarity properties of a gsi. Implement xen_register_pirq to register a particular gsi as pirq and receive interrupts as events. Call xen_setup_pirqs to register all the legacy ISA irqs as pirqs. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:43 +01:00
Stefano Stabellini	809f9267bb	xen: map MSIs into pirqs Map MSIs into pirqs, writing 0 in the MSI vector data field and the pirq number in the MSI destination id field. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:43 +01:00
Stefano Stabellini	3942b740e5	xen: support GSI -> pirq remapping in PV on HVM guests Disable pcifront when running on HVM: it is meant to be used with pv guests that don't have PCI bus. Use acpi_register_gsi_xen_hvm to remap GSIs into pirqs. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:42 +01:00
Stefano Stabellini	42a1de56f3	xen: implement xen_hvm_register_pirq xen_hvm_register_pirq allows the kernel to map a GSI into a Xen pirq and receive the interrupt as an event channel from that point on. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:41 +01:00
Stefano Stabellini	01557baff6	xen: get the maximum number of pirqs from xen Use PHYSDEVOP_get_nr_pirqs to get the maximum number of pirqs from xen. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:40 +01:00
Stefano Stabellini	7a043f119c	xen: support pirq != irq PHYSDEVOP_map_pirq might return a pirq different from what we asked if we are running as an HVM guest, so we need to be able to support pirqs that are different from linux irqs. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-22 21:25:40 +01:00
Ian Campbell	35ae11fd14	xen: Use host-provided E820 map Rather than simply using a flat memory map from Xen, use its provided E820 map. This allows the domain builder to tell the domain to reserve space for more pages than those initially provided at domain-build time. It also allows the host to specify holes in the address space (for PCI-passthrough, for example). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-10-22 12:57:27 -07:00
Ian Campbell	de1ef2065c	xen/privcmd: move remap_domain_mfn_range() to core xen code and export. This allows xenfs to be built as a module, previously it required flush_tlb_all and arbitrary_virt_to_machine to be exported. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-10-20 16:22:34 -07:00
Jeremy Fitzhardinge	1c5de1939c	xen: add privcmd driver The privcmd interface in xenfs allows the tool stack in the privileged domain to get fairly direct access to the hypervisor in order to do various management things such as domain construction. [ Impact: new xenfs interface for privileged operations ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-10-20 16:22:29 -07:00
Ryan Wilson	956a9202cd	xen-pcifront: Xen PCI frontend driver. This is a port of the 2.6.18 Xen PCI front driver with fixes to make it build under 2.6.34 and later (for the full list of changes: git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git historic/xen-pcifront-0.1). It also includes the fixes to make it work properly. [v2: Updated Kconfig, removed crud, added Reviewed-by] [v3: Added 'static', fixed grant table leak, redid Kconfig] [v4: Added one more 'static' and removed comments] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Jan Beulich <JBeulich@novell.com>	2010-10-18 10:49:37 -04:00
Yosuke Iwamatsu	89afb6e46a	xenbus: Xen paravirtualised PCI hotplug support. The Xen PCI front driver adds two new states that are utilizez for PCI hotplug support. This is a patch pulled from the linux-2.6-xen-sparse tree. Signed-off-by: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@ab.jp.nec.com>	2010-10-18 10:49:35 -04:00
Alex Nixon	b5401a96b5	xen/x86/PCI: Add support for the Xen PCI subsystem The frontend stub lives in arch/x86/pci/xen.c, alongside other sub-arch PCI init code (e.g. olpc.c). It provides a mechanism for Xen PCI frontend to setup/destroy legacy interrupts, MSI/MSI-X, and PCI configuration operations. [ Impact: add core of Xen PCI support ] [ v2: Removed the IOMMU code and only focusing on PCI.] [ v3: removed usage of pci_scan_all_fns as that does not exist] [ v4: introduced pci_xen value to fix compile warnings] [ v5: squished fixes+features in one patch, changed Reviewed-by to Ccs] [ v7: added Acked-by] Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Qing He <qing.he@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org	2010-10-18 10:49:35 -04:00
Konrad Rzeszutek Wilk	15ebbb82ba	xen: fix shared irq device passthrough In driver/xen/events.c, whether bind_pirq is shareable or not is determined by desc->action is NULL or not. But in __setup_irq, startup(irq) is invoked before desc->action is assigned with new action. So desc->action in startup_irq is always NULL, and bind_pirq is always not shareable. This results in pt_irq_create_bind failure when passthrough a device which shares irq to other devices. This patch doesn't use probing_irq to determine if pirq is shareable or not, instead set shareable flag in irq_info according to trigger mode in xen_allocate_pirq. Set level triggered interrupts shareable. Thus use this flag to set bind_pirq flag accordingly. [v2: arch/x86/xen/pci.c no more, so file skipped] Signed-off-by: Weidong Han <weidong.han@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-18 10:49:29 -04:00
Konrad Rzeszutek Wilk	d9a8814f27	xen: Provide a variant of xen_poll_irq with timeout. The 'xen_poll_irq_timeout' provides a method to pass in the poll timeout for IRQs if requested. We also export those two poll functions as Xen PCI fronted uses them. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-10-18 10:49:28 -04:00
Gerd Hoffmann	1a60d05f40	xen: set pirq name to something useful. Impact: cleanup Make pirq show useful information in /proc/interrupts [v2: Removed the parts for arch/x86/xen/pci.c ] Signed-off-by: Gerd Hoffmann <kraxel@xeni.home.kraxel.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-18 10:41:43 -04:00
Jeremy Fitzhardinge	d46a78b05c	xen: implement pirq type event channels A privileged PV Xen domain can get direct access to hardware. In order for this to be useful, it must be able to get hardware interrupts. Being a PV Xen domain, all interrupts are delivered as event channels. PIRQ event channels are bound to a pirq number and an interrupt vector. When a IO APIC raises a hardware interrupt on that vector, it is delivered as an event channel, which we can deliver to the appropriate device driver(s). This patch simply implements the infrastructure for dealing with pirq event channels. [ Impact: integrate hardware interrupts into Xen's event scheme ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-10-18 10:40:29 -04:00
Ian Campbell	9c35e90c6f	xen: pvhvm: make it clearer that XEN_UNPLUG_* define bits in a bitfield by defining in terms of (1<<N). XEN_UNPLUG_UNNECESSARY and XEN_UNPLUG_NEVER are only used within the kernel and are not defined as a bit on the unplug IO port. Therefore use a bit which is outside the potentially valid range of the 16 bit IO port. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>	2010-08-23 12:01:35 +01:00
Ian Campbell	1dc7ce99b0	xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary It is not immediately clear what this option causes to become ignored. The actual meaning is that it is not necessary to unplug the emulated devices to safely use the PV ones, even if the platform does not support the unplug protocol. (pressumably the user will only add this option if they have ensured that their domain configuration is safe). I think xen_emul_unplug=unnecessary better captures this. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>	2010-08-23 11:59:29 +01:00
Ian Campbell	c93a4dfb31	xen: pvhvm: allow user to request no emulated device unplug this allows the user to disable pvhvm and revert to emulated devices in case of a system misconfiguration (e.g. initramfs with only emulated drivers in it). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>	2010-08-23 11:59:28 +01:00
Linus Torvalds	26f0cf9181	Merge branch 'stable/xen-swiotlb-0.8.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/xen-swiotlb-0.8.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: x86: Detect whether we should use Xen SWIOTLB. pci-swiotlb-xen: Add glue code to setup dma_ops utilizing xen_swiotlb_* functions. swiotlb-xen: SWIOTLB library for Xen PV guest with PCI passthrough. xen/mmu: inhibit vmap aliases rather than trying to clear them out vmap: add flag to allow lazy unmap to be disabled at runtime xen: Add xen_create_contiguous_region xen: Rename the balloon lock xen: Allow unprivileged Xen domains to create iomap pages xen: use _PAGE_IOMAP in ioremap to do machine mappings Fix up trivial conflicts (adding both xen swiotlb and xen pci platform driver setup close to each other) in drivers/xen/{Kconfig,Makefile} and include/xen/xen-ops.h	2010-08-12 09:09:41 -07:00
Konrad Rzeszutek Wilk	b097186fd2	swiotlb-xen: SWIOTLB library for Xen PV guest with PCI passthrough. This patchset: PV guests under Xen are running in an non-contiguous memory architecture. When PCI pass-through is utilized, this necessitates an IOMMU for translating bus (DMA) to virtual and vice-versa and also providing a mechanism to have contiguous pages for device drivers operations (say DMA operations). Specifically, under Xen the Linux idea of pages is an illusion. It assumes that pages start at zero and go up to the available memory. To help with that, the Linux Xen MMU provides a lookup mechanism to translate the page frame numbers (PFN) to machine frame numbers (MFN) and vice-versa. The MFN are the "real" frame numbers. Furthermore memory is not contiguous. Xen hypervisor stitches memory for guests from different pools, which means there is no guarantee that PFN==MFN and PFN+1==MFN+1. Lastly with Xen 4.0, pages (in debug mode) are allocated in descending order (high to low), meaning the guest might never get any MFN's under the 4GB mark. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Albert Herranz <albert_herranz@yahoo.es> Cc: Ian Campbell <Ian.Campbell@citrix.com>	2010-07-27 11:51:00 -04:00
Stefano Stabellini	5915100106	x86: Call HVMOP_pagetable_dying on exit_mmap. When a pagetable is about to be destroyed, we notify Xen so that the hypervisor can clear the related shadow pagetable. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:26 -07:00
Stefano Stabellini	c1c5413ad5	x86: Unplug emulated disks and nics. Add a xen_emul_unplug command line option to the kernel to unplug xen emulated disks and nics. Set the default value of xen_emul_unplug depending on whether or not the Xen PV frontends and the Xen platform PCI driver have been compiled for this kernel (modules or built-in are both OK). The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM even if the host platform doesn't support unplug. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:25 -07:00
Stefano Stabellini	409771d258	x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock. Use xen_vcpuop_clockevent instead of hpet and APIC timers as main clockevent device on all vcpus, use the xen wallclock time as wallclock instead of rtc and use xen_clocksource as clocksource. The pv clock algorithm needs to work correctly for the xen_clocksource and xen wallclock to be usable, only modern Xen versions offer a reliable pv clock in HVM guests (XENFEAT_hvm_safe_pvclock). Using the hpet as clocksource means a VMEXIT every time we read/write to the hpet mmio addresses, pvclock give us a better rating without VMEXITs. Same goes for the xen wallclock and xen_vcpuop_clockevent Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-26 23:13:25 -07:00
Stefano Stabellini	016b6f5fe8	xen: Add suspend/resume support for PV on HVM guests. Suspend/resume requires few different things on HVM: the suspend hypercall is different; we don't need to save/restore memory related settings; except the shared info page and the callback mechanism. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-22 16:46:21 -07:00
Stefano Stabellini	183d03cc4f	xen: Xen PCI platform device driver. Add the xen pci platform device driver that is responsible for initializing the grant table and xenbus in PV on HVM mode. Few changes to xenbus and grant table are necessary to allow the delayed initialization in HVM mode. Grant table needs few additional modifications to work in HVM mode. The Xen PCI platform device raises an irq every time an event has been delivered to us. However these interrupts are only delivered to vcpu 0. The Xen PCI platform interrupt handler calls xen_hvm_evtchn_do_upcall that is a little wrapper around __xen_evtchn_do_upcall, the traditional Xen upcall handler, the very same used with traditional PV guests. When running on HVM the event channel upcall is never called while in progress because it is a normal Linux irq handler (and we cannot switch the irq chip wholesale to the Xen PV ones as we are running QEMU and might have passed in PCI devices), therefore we cannot be sure that evtchn_upcall_pending is 0 when returning. For this reason if evtchn_upcall_pending is set by Xen we need to loop again on the event channels set pending otherwise we might loose some event channel deliveries. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-22 16:46:09 -07:00
Sheng Yang	38e20b07ef	x86/xen: event channels delivery on HVM. Set the callback to receive evtchns from Xen, using the callback vector delivery mechanism. The traditional way for receiving event channel notifications from Xen is via the interrupts from the platform PCI device. The callback vector is a newer alternative that allow us to receive notifications on any vcpu and doesn't need any PCI support: we allocate a vector exclusively to receive events, in the vector handler we don't need to interact with the vlapic, therefore we avoid a VMEXIT. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2010-07-22 16:45:59 -07:00
Jeremy Fitzhardinge	18f19aa62a	xen: Add support for HVM hypercalls. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2010-07-22 16:45:31 -07:00
Alex Nixon	08bbc9da92	xen: Add xen_create_contiguous_region A memory region must be physically contiguous in order to be accessed through DMA. This patch adds xen_create_contiguous_region, which ensures a region of contiguous virtual memory is also physically contiguous. Based on Stephen Tweedie's port of the 2.6.18-xen version. Remove contiguous_bitmap[] as it's no longer needed. Ported from linux-2.6.18-xen.hg 707:e410857fd83c [ Impact: add Xen-internal API to make pages phys-contig ] Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-06-07 15:37:53 -04:00
Alex Nixon	19001c8c5b	xen: Rename the balloon lock * xen_create_contiguous_region needs access to the balloon lock to ensure memory doesn't change under its feet, so expose the balloon lock * Change the name of the lock to xen_reservation_lock, to imply it's now less-specific usage. [ Impact: cleanup ] Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2010-06-07 14:34:07 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jeremy Fitzhardinge	1ccbf5344c	xen: move Xen-testing predicates to common header Move xen_domain and related tests out of asm-x86 to xen/xen.h so they can be included whenever they are necessary. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:24 -08:00
Jeremy Fitzhardinge	1943689c47	Merge branches 'for-linus/xen/dev-evtchn', 'for-linus/xen/xenbus', 'for-linus/xen/xenfs' and 'for-linus/xen/sys-hypervisor' into for-linus/xen/master * for-linus/xen/dev-evtchn: xen/dev-evtchn: clean up locking in evtchn xen: export ioctl headers to userspace xen: add /dev/xen/evtchn driver xen: add irq_from_evtchn * for-linus/xen/xenbus: xen/xenbus: export xenbus_dev_changed xen: use device model for suspending xenbus devices xen: remove suspend_cancel hook * for-linus/xen/xenfs: xen: add "capabilities" file * for-linus/xen/sys-hypervisor: xen: drop kexec bits from /sys/hypervisor since kexec isn't implemented yet xen/sys/hypervisor: change writable_pt to features xen: add /sys/hypervisor support Conflicts: drivers/xen/Makefile	2009-03-30 10:00:45 -07:00
Jeremy Fitzhardinge	cff7e81b3d	xen: add /sys/hypervisor support Adds support for Xen info under /sys/hypervisor. Taken from Novell 2.6.27 backport tree. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-03-30 09:27:06 -07:00
Ian Campbell	de5b31bd47	xen: use device model for suspending xenbus devices Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-03-30 09:26:56 -07:00
Ian Campbell	a1ce1be578	xen: remove suspend_cancel hook Remove suspend_cancel hook from xenbus_driver, in preparation for using the device model for suspending. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-03-30 09:26:55 -07:00
Ian Campbell	c5cfef0f79	xen: export ioctl headers to userspace Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-03-30 09:26:49 -07:00
Ian Campbell	f7116284c7	xen: add /dev/xen/evtchn driver This driver is used by application which wish to receive notifications from the hypervisor or other guests via Xen's event channel mechanism. In particular it is used by the xenstore daemon in domain 0. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-03-30 09:26:49 -07:00
Ian Campbell	d4c045364d	xen: add irq_from_evtchn Given an evtchn, return the corresponding irq. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>	2009-03-30 09:26:49 -07:00
Alex Zeffertt	1107ba885e	xen: add xenfs to allow usermode <-> Xen interaction The xenfs filesystem exports various interfaces to usermode. Initially this exports a file to allow usermode to interact with xenbus/xenstore. Traditionally this appeared in /proc/xen. Rather than extending procfs, this patch adds a backward-compat mountpoint on /proc/xen, and provides a xenfs filesystem which can be mounted there. Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-08 08:30:59 -08:00
Jeremy Fitzhardinge	ecbf29cdb3	xen: clean up asm/xen/hypervisor.h Impact: cleanup hypervisor.h had accumulated a lot of crud, including lots of spurious #includes. Clean it all up, and go around fixing up everything else accordingly. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-16 21:50:31 +01:00
Jeremy Fitzhardinge	d19c8e516e	xen: remove unused balloon.h The balloon driver doesn't have any externally callable functions at the moment, so remove the (effectively empty) header. We can add it back if we need to. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-03 10:04:10 +02:00
Jeremy Fitzhardinge	168d2f464a	xen: save previous spinlock when blocking A spinlock can be interrupted while spinning, so make sure we preserve the previous lock of interest if we're taking a lock from within an interrupt handler. We also need to deal with the case where the blocking path gets interrupted between testing to see if the lock is free and actually blocking. If we get interrupted there and end up in the state where the lock is free but the irq isn't pending, then we'll block indefinitely in the hypervisor. This fix is to make sure that any nested lock-takers will always leave the irq pending if there's any chance the outer lock became free. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-21 13:52:57 +02:00
Jeremy Fitzhardinge	2d9e1e2f58	xen: implement Xen-specific spinlocks The standard ticket spinlocks are very expensive in a virtual environment, because their performance depends on Xen's scheduler giving vcpus time in the order that they're supposed to take the spinlock. This implements a Xen-specific spinlock, which should be much more efficient. The fast-path is essentially the old Linux-x86 locks, using a single lock byte. The locker decrements the byte; if the result is 0, then they have the lock. If the lock is negative, then locker must spin until the lock is positive again. When there's contention, the locker spin for 2^16[] iterations waiting to get the lock. If it fails to get the lock in that time, it adds itself to the contention count in the lock and blocks on a per-cpu event channel. When unlocking the spinlock, the locker looks to see if there's anyone blocked waiting for the lock by checking for a non-zero waiter count. If there's a waiter, it traverses the per-cpu "lock_spinners" variable, which contains which lock each CPU is waiting on. It picks one CPU waiting on the lock and sends it an event to wake it up. This allows efficient fast-path spinlock operation, while allowing spinning vcpus to give up their processor time while waiting for a contended lock. [] 2^16 iterations is threshold at which 98% locks have been taken according to Thomas Friebel's Xen Summit talk "Preventing Guests from Spinning Around". Therefore, we'd expect the lock and unlock slow paths will only be entered 2% of the time. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <clameter@linux-foundation.org> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Virtualization <virtualization@lists.linux-foundation.org> Cc: Xen devel <xen-devel@lists.xensource.com> Cc: Thomas Friebel <thomas.friebel@amd.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:15:53 +02:00
Ingo Molnar	9c8a442044	xen64: fix !HVC_XEN build dependency fix: arch/x86/xen/built-in.o: In function `set_page_prot': enlighten.c:(.text+0x111d): undefined reference to `xen_raw_printk' arch/x86/xen/built-in.o: In function `xen_start_kernel': : undefined reference to `xen_raw_console_write' arch/x86/xen/built-in.o: In function `xen_start_kernel': : undefined reference to `xen_raw_console_write' Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 11:06:48 +02:00
Jeremy Fitzhardinge	48b5db2062	xen64: define asm/xen/interface for 64-bit Copy 64-bit definitions of various interface structures into place. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:56:18 +02:00
Isaku Yamahata	ad55db9fed	xen: add xen_arch_resume()/xen_timer_resume hook for ia64 support add xen_timer_resume() hook. Timer resume should be done after event channel is resumed. add xen_arch_resume() hook when ipi becomes usable after resume. After resume, some cpu specific resource must be reinitialized on ia64 that can't be set by another cpu. However available hooks is run once on only one cpu so that ipi has to be used. During stop_machine_run() ipi can't be used because interrupt is masked. So add another hook after stop_machine_run(). Another approach might be use resume hook which is run by device_resume(). However device_resume() may be executed on suspend error recovery path. So it is necessary to determine whether it is executed on real resume path or error recovery path. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 10:55:50 +02:00
Jeremy Fitzhardinge	e57778a1e3	xen: implement ptep_modify_prot_start/commit Xen has a pte update function which will update a pte while preserving its accessed and dirty bits. This means that ptep_modify_prot_start() can be implemented as a simple read of the pte value. The hardware may update the pte in the meantime, but ptep_modify_prot_commit() updates it while preserving any changes that may have happened in the meantime. The updates in ptep_modify_prot_commit() are batched if we're currently in lazy mmu mode. The mmu_update hypercall can take a batch of updates to perform, but this code doesn't make particular use of that feature, in favour of using generic multicall batching to get them all into the hypervisor. The net effect of this is that each mprotect pte update turns from two expensive trap-and-emulate faults into they hypervisor into a single hypercall whose cost is amortized in a batched multicall. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 15:17:23 +02:00
Ingo Molnar	d02859ecb3	Merge commit 'v2.6.26-rc8' into x86/xen Conflicts: arch/x86/xen/enlighten.c arch/x86/xen/mmu.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-25 12:16:51 +02:00
Gerd Hoffmann	1c7b67f757	x86: Make xen use the paravirt clocksource structs and functions This patch updates the xen guest to use the pvclock structs and helper functions. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-24 21:02:32 +03:00
Jeremy Fitzhardinge	7e0edc1bc3	xen: add new Xen elfnote types and use them appropriately Define recently added XEN_ELFNOTEs, and use them appropriately. Most significantly, this enables domain checkpointing (xm save -c). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-02 13:25:51 +02:00
Ingo Molnar	0261ac5f2f	xen: fix "xen: implement save/restore" -tip testing found the following build breakage: drivers/built-in.o: In function `xen_suspend': manage.c:(.text+0x4390f): undefined reference to `xen_console_resume' with this config: http://redhat.com/~mingo/misc/config-Thu_May_29_09_23_16_CEST_2008.bad i have bisected it down to: \| commit `0e91398f2a` \| Author: Jeremy Fitzhardinge <jeremy@goop.org> \| Date: Mon May 26 23:31:27 2008 +0100 \| \| xen: implement save/restore the problem is that drivers/xen/manage.c is built unconditionally if CONFIG_XEN is enabled and makes use of xen_suspend(), but drivers/char/hvc_xen.c, where the xen_suspend() method is implemented, is only build if CONFIG_HVC_XEN=y as well. i have solved this by providing a NOP implementation for xen_suspend() in the !CONFIG_HVC_XEN case. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-29 09:31:57 +02:00
Jeremy Fitzhardinge	359cdd3f86	xen: maintain clock offset over save/restore Hook into the device model to make sure that timekeeping's resume handler is called. This deals with our clocksource's non-monotonicity over the save/restore. Explicitly call clock_has_changed() to make sure that all the timers get retriggered properly. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:38 +02:00
Jeremy Fitzhardinge	0e91398f2a	xen: implement save/restore This patch implements Xen save/restore and migration. Saving is triggered via xenbus, which is polled in drivers/xen/manage.c. When a suspend request comes in, the kernel prepares itself for saving by: 1 - Freeze all processes. This is primarily to prevent any partially-completed pagetable updates from confusing the suspend process. If CONFIG_PREEMPT isn't defined, then this isn't necessary. 2 - Suspend xenbus and other devices 3 - Stop_machine, to make sure all the other vcpus are quiescent. The Xen tools require the domain to run its save off vcpu0. 4 - Within the stop_machine state, it pins any unpinned pgds (under construction or destruction), performs canonicalizes various other pieces of state (mostly converting mfns to pfns), and finally 5 - Suspend the domain Restore reverses the steps used to save the domain, ending when all the frozen processes are thawed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:38 +02:00
Jeremy Fitzhardinge	6b9b732d0e	xen-console: add save/restore Add code to: 1. Deal with the console page being canonicalized. During save, the console's mfn in the start_info structure is canonicalized to a pfn. In order to deal with that, we always use a copy of the pfn and indirect off that all the time. However, we fall back to using the mfn if the pfn hasn't been initialized yet. 2. Restore the console event channel, and rebind it to the existing irq. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:37 +02:00
Jeremy Fitzhardinge	eb1e305f4e	xen: add rebind_evtchn_irq Add rebind_evtchn_irq(), which will rebind an device driver's existing irq to a new event channel on restore. Since the new event channel will be masked and bound to vcpu0, we update the state accordingly and unmask the irq once everything is set up. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:37 +02:00
Isaku Yamahata	bfdab126cf	xen: add missing definitions in include/xen/interface/memory.h which ia64/xen needs Add xen handles realted definitions for xen memory which ia64/xen needs. Pointer argumsnts for ia64/xen hypercall are passed in pseudo physical address (guest physical address) so that it is required to convert guest kernel virtual address into pseudo physical address. The xen guest handle represents such arguments. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:36 +02:00
Markus Armbruster	e4dcff1f6e	xen pvfb: Dynamic mode support (screen resizing) The pvfb backend indicates dynamic mode support by creating node feature_resize with a non-zero value in its xenstore directory. xen-fbfront sends a resize notification event on mode change. Fully backwards compatible both ways. Framebuffer size and initial resolution can be controlled through kernel parameter xen_fbfront.video. The backend enforces a separate size limit, which it advertises in node videoram in its xenstore directory. xen-kbdfront gets the maximum screen resolution from nodes width and height in the backend's xenstore directory instead of hardcoding it. Additional goodie: support for larger framebuffers (512M on a 64-bit system with 4K pages). Changing the number of bits per pixels dynamically is not supported, yet. Ported from http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/92f7b3144f41 http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/bfc040135633 Signed-off-by: Pat Campbell <plc@novell.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:36 +02:00
Markus Armbruster	6ba0e7b36c	xen pvfb: Pointer z-axis (mouse wheel) support Add z-axis motion to pointer events. Backward compatible, because there's space for the z-axis in union xenkbd_in_event, and old backends zero it. Derived from http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/57dfe0098000 http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/1edfea26a2a9 http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/c3ff0b26f664 Signed-off-by: Pat Campbell <plc@novell.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:36 +02:00
Jeremy Fitzhardinge	0acf10d8fb	xen: add raw console write functions for debug Add a couple of functions which can write directly to the Xen console for debugging. This output ends up on the host's dom0 console (assuming it allows the domain to write there). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-27 10:11:35 +02:00
Jeremy Fitzhardinge	1775826cee	xen: add balloon driver The balloon driver allows memory to be dynamically added or removed from the domain, in order to allow host memory to be balanced between multiple domains. This patch introduces the Xen balloon driver, though it currently only allows a domain to be shrunk from its initial size (and re-grown back to that size). A later patch will add the ability to grow a domain beyond its initial size. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Markus Armbruster	4ee36dc08e	xen pvfb: Para-virtual framebuffer, keyboard and pointer driver This is a pair of Xen para-virtual frontend device drivers: drivers/video/xen-fbfront.c provides a framebuffer, and drivers/input/xen-kbdfront provides keyboard and mouse. The backends run in dom0 user space. The two drivers are not in two separate patches, because the intermediate step (one driver, not the other) is somewhat problematic: the backend in dom0 needs both drivers, and will refuse to complete device initialization unless they're both present. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Christian Limpach	1d78d70556	xen blkfront: Delay wait for block devices until after the disk is added When the xen block frontend driver is built as a module the module load is only synchronous up to the point where the frontend and the backend become connected rather than when the disk is added. This means that there can be a race on boot between loading the module and loading the dm-* modules and doing the scan for LVM physical volumes (all in the initrd). In the failure case the disk is not present until after the scan for physical volumes is complete. Taken from: http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/11483a00c017 Signed-off-by: Christian Limpach <Christian.Limpach@xensource.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:33 +02:00
Markus Armbruster	3e334239d8	xen: Make xen-blkfront write its protocol ABI to xenstore Frontends are expected to write their protocol ABI to xenstore. Since the protocol ABI defaults to the backend's native ABI, things work fine without that as long as the frontend's native ABI is identical to the backend's native ABI. This is not the case for xen-blkfront running 32-on-64, because its ABI differs between 32 and 64 bit, and thus needs this fix. Based on http://xenbits.xensource.com/xen-unstable.hg?rev/c545932a18f3 and http://xenbits.xensource.com/xen-unstable.hg?rev/ffe52263b430 by Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	b15993fcc1	xen: import arch generic part of xencomm On xen/ia64 and xen/powerpc hypercall arguments are passed by pseudo physical address (guest physical address) so that it's necessary to convert from virtual address into pseudo physical address. The frame work is called xencomm. Import arch generic part of xencomm. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	8d3d2106c1	xen: make grant table arch portable split out x86 specific part from grant-table.c and allow ia64/xen specific initialization. ia64/xen grant table is based on pseudo physical address (guest physical address) unlike x86/xen. On ia64 init_mm doesn't map identity straight mapped area. ia64/xen specific grant table initialization is necessary. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	5f0ababbf4	xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed one Don't use alloc_vm_area()/free_vm_area() directly, instead define xen_alloc_vm_area()/xen_free_vm_area() and use them. alloc_vm_area()/free_vm_area() are used to allocate/free area which are for grant table mapping. Xen/x86 grant table is based on virtual address so that alloc_vm_area()/free_vm_area() are suitable. On the other hand Xen/ia64 (and Xen/powerpc) grant table is based on pseudo physical address (guest physical address) so that allocation should be done differently. The original version of xenified Linux/IA64 have its own allocate_vm_area()/free_vm_area() definitions which don't allocate vm area contradictory to those names. Now vanilla Linux already has its definitions so that it's impossible to have IA64 definitions of allocate_vm_area()/free_vm_area(). Instead introduce xen_allocate_vm_area()/xen_free_vm_area() and use them. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	20e71f2edb	xen: make include/xen/page.h portable moving those definitions under asm dir The definitions in include/asm/xen/page.h are arch specific. ia64/xen wants to define its own version. So move them to arch specific directory and keep include/xen/page.h in order not to break compilation. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	642e0c882c	xen: add resend_irq_on_evtchn() definition into events.c Define resend_irq_on_evtchn() which ia64/xen uses. Although it isn't used by current x86/xen code, it's arch generic so that put it into common code. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	e849c3e9e0	Xen: make events.c portable for ia64/xen support Remove x86 dependency in drivers/xen/events.c for ia64/xen support introducing include/asm/xen/events.h. Introduce xen_irqs_disabled() to hide regs->flags Introduce xen_do_IRQ() to hide regs->orig_ax. make enum ipi_vector definition arch specific. ia64/xen needs four vectors. Add one rmb() because on ia64 xchg() isn't barrier. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	e04d0d0767	xen: move events.c to drivers/xen for IA64/Xen support move arch/x86/xen/events.c undedr drivers/xen to share codes with x86 and ia64. And minor adjustment to compile. ia64/xen also uses events.c Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	2724426924	xen: add missing definitions in include/xen/interface/vcpu.h which ia64/xen needs Add xen handles realted definitions for xen vcpu which ia64/xen needs. Pointer argumsnts for ia64/xen hypercall are passed in pseudo physical address (guest physical address) so that it is required to convert guest kernel virtual address into pseudo physical address. The xen guest handle represents such arguments. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	87e27cf628	xen: add missing definitions for xen grant table which ia64/xen needs Add xen handles realted definitions for grant table which ia64/xen needs. Pointer argumsnts for ia64/xen hypercall are passed in pseudo physical address (guest physical address) so that it is required to convert guest kernel virtual address into pseudo physical address right before issuing hypercall. The xen guest handle represents such arguments. Define necessary handles and helper functions. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	2eb6d5eb48	xen: definitions which ia64/xen needs Add xen VIRQ numbers defined for arch specific use. ia64/xen domU uses VIRQ_ARCH_0 for virtual itc timer. Although all those constants aren't used yet by ia64 at this moment, add all arch specific VIRQ numbers. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Isaku Yamahata	9a9db275b0	xen: definisions which ia64 needs Add xen hypercall numbers defined for arch specific use. ia64/xen domU uses __HYPERVISOR_arch_1 to manipulate paravirtualized IOSAPIC. Although all those constants aren't used yet by IA64 at this moment, add all arch specific hypercall numbers. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:32 +02:00
Jeremy Fitzhardinge	aa380c82b8	xen: add support for callbackops hypercall Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	9666e9d44b	xen: unify pte operations on machine frames Xen's pte operations on mfns can be unified like the kernel's pfn operations. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	3b4724b0e6	xen: use phys_addr_t when referring to physical addresses Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:57:31 +02:00
Jeremy Fitzhardinge	c8e5393ab3	x86: page.h: make pte_t a union to always include Make sure pte_t, whatever its definition, has a pte element with type pteval_t. This allows common code to access it without needing to be specifically parameterised on what pagetable mode we're compiling for. For 32-bit, this means that pte_t becomes a union with "pte" and "{ pte_low, pte_high }" (PAE) or just "pte_low" (non-PAE). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:32:57 +01:00
Jeremy Fitzhardinge	e3d2697669	xen: fix incorrect vcpu_register_vcpu_info hypercall argument The kernel's copy of struct vcpu_register_vcpu_info was out of date, at best causing the hypercall to fail and the guest kernel to fall back to the old mechanism, or worse, causing random memory corruption. [ Stable folks: applies to 2.6.23 ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Stable Kernel <stable@kernel.org> Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk> Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>	2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge	dfb68689bf	xen: xen/page.h compile fix Fix: linux/include/xen/page.h: In function mfn_pte: linux/include/xen/page.h:149: error: __supported_pte_mask undeclared (first use in this function) linux/include/xen/page.h:149: error: (Each undeclared identifier is reported only once linux/include/xen/page.h:149: error: for each function it appears in.) Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:35:16 -07:00
Jeremy Fitzhardinge	60223a326f	xen: Place vcpu_info structure into per-cpu memory An experimental patch for Xen allows guests to place their vcpu_info structs anywhere. We try to use this to place the vcpu_info into the PDA, which allows direct access. If this works, then switch to using direct access operations for irq_enable, disable, save_fl and restore_fl. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Keir Fraser <keir@xensource.com>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	4bac07c993	xen: add the Xenbus sysfs and virtual device hotplug driver This communicates with the machine control software via a registry residing in a controlling virtual machine. This allows dynamic creation, destruction and modification of virtual device configurations (network devices, block devices and CPUS, to name some examples). [ Greg, would you mind giving this a review? Thanks -J ] Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Greg KH <greg@kroah.com>	2007-07-18 08:47:45 -07:00
Jeremy Fitzhardinge	ad9a86121f	xen: Add grant table support Add Xen 'grant table' driver which allows granting of access to selected local memory pages by other virtual machines and, symmetrically, the mapping of remote memory pages which other virtual machines have granted access to. This driver is a prerequisite for many of the Xen virtual device drivers, which grant the 'device driver domain' restricted and temporary access to only those memory pages that are currently involved in I/O operations. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	b536b4b962	xen: use the hvc console infrastructure for Xen console Implement a Xen back-end for hvc console. * * * Add early printk support via hvc console, enable using "earlyprintk=xen" on the kernel command line. From: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Olof Johansson <olof@lixom.net>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	f87e4cac4f	xen: SMP guest support This is a fairly straightforward Xen implementation of smp_ops. Xen has its own IPI mechanisms, and has no dependency on any APIC-based IPI. The smp_ops hooks and the flush_tlb_others pv_op allow a Xen guest to avoid all APIC code in arch/i386 (the only apic operation is a single apic_read for the apic version number). One subtle point which needs to be addressed is unpinning pagetables when another cpu may have a lazy tlb reference to the pagetable. Xen will not allow an in-use pagetable to be unpinned, so we must find any other cpus with a reference to the pagetable and get them to shoot down their references. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andi Kleen <ak@suse.de>	2007-07-18 08:47:44 -07:00
Jeremy Fitzhardinge	e46cdb66c8	xen: event channels Xen implements interrupts in terms of event channels. Each guest domain gets 1024 event channels which can be used for a variety of purposes, such as Xen timer events, inter-domain events, inter-processor events (IPI) or for real hardware IRQs. Within the kernel, we map the event channels to IRQs, and implement the whole interrupt handling using a Xen irq_chip. Rather than setting NR_IRQ to 1024 under PARAVIRT in order to accomodate Xen, we create a dynamic mapping between event channels and IRQs. Ideally, Linux will eventually move towards dynamically allocating per-irq structures, and we can use a 1:1 mapping between event channels and irqs. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Eric W. Biederman <ebiederm@xmission.com>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	5ead97c84f	xen: Core Xen implementation This patch is a rollup of all the core pieces of the Xen implementation, including: - booting and setup - pagetable setup - privileged instructions - segmentation - interrupt flags - upcalls - multicall batching BOOTING AND SETUP The vmlinux image is decorated with ELF notes which tell the Xen domain builder what the kernel's requirements are; the domain builder then constructs the address space accordingly and starts the kernel. Xen has its own entrypoint for the kernel (contained in an ELF note). The ELF notes are set up by xen-head.S, which is included into head.S. In principle it could be linked separately, but it seems to provoke lots of binutils bugs. Because the domain builder starts the kernel in a fairly sane state (32-bit protected mode, paging enabled, flat segments set up), there's not a lot of setup needed before starting the kernel proper. The main steps are: 1. Install the Xen paravirt_ops, which is simply a matter of a structure assignment. 2. Set init_mm to use the Xen-supplied pagetables (analogous to the head.S generated pagetables in a native boot). 3. Reserve address space for Xen, since it takes a chunk at the top of the address space for its own use. 4. Call start_kernel() PAGETABLE SETUP Once we hit the main kernel boot sequence, it will end up calling back via paravirt_ops to set up various pieces of Xen specific state. One of the critical things which requires a bit of extra care is the construction of the initial init_mm pagetable. Because Xen places tight constraints on pagetables (an active pagetable must always be valid, and must always be mapped read-only to the guest domain), we need to be careful when constructing the new pagetable to keep these constraints in mind. It turns out that the easiest way to do this is use the initial Xen-provided pagetable as a template, and then just insert new mappings for memory where a mapping doesn't already exist. This means that during pagetable setup, it uses a special version of xen_set_pte which ignores any attempt to remap a read-only page as read-write (since Xen will map its own initial pagetable as RO), but lets other changes to the ptes happen, so that things like NX are set properly. PRIVILEGED INSTRUCTIONS AND SEGMENTATION When the kernel runs under Xen, it runs in ring 1 rather than ring 0. This means that it is more privileged than user-mode in ring 3, but it still can't run privileged instructions directly. Non-performance critical instructions are dealt with by taking a privilege exception and trapping into the hypervisor and emulating the instruction, but more performance-critical instructions have their own specific paravirt_ops. In many cases we can avoid having to do any hypercalls for these instructions, or the Xen implementation is quite different from the normal native version. The privileged instructions fall into the broad classes of: Segmentation: setting up the GDT and the GDT entries, LDT, TLS and so on. Xen doesn't allow the GDT to be directly modified; all GDT updates are done via hypercalls where the new entries can be validated. This is important because Xen uses segment limits to prevent the guest kernel from damaging the hypervisor itself. Traps and exceptions: Xen uses a special format for trap entrypoints, so when the kernel wants to set an IDT entry, it needs to be converted to the form Xen expects. Xen sets int 0x80 up specially so that the trap goes straight from userspace into the guest kernel without going via the hypervisor. sysenter isn't supported. Kernel stack: The esp0 entry is extracted from the tss and provided to Xen. TLB operations: the various TLB calls are mapped into corresponding Xen hypercalls. Control registers: all the control registers are privileged. The most important is cr3, which points to the base of the current pagetable, and we handle it specially. Another instruction we treat specially is CPUID, even though its not privileged. We want to control what CPU features are visible to the rest of the kernel, and so CPUID ends up going into a paravirt_op. Xen implements this mainly to disable the ACPI and APIC subsystems. INTERRUPT FLAGS Xen maintains its own separate flag for masking events, which is contained within the per-cpu vcpu_info structure. Because the guest kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely ignored (and must be, because even if a guest domain disables interrupts for itself, it can't disable them overall). (A note on terminology: "events" and interrupts are effectively synonymous. However, rather than using an "enable flag", Xen uses a "mask flag", which blocks event delivery when it is non-zero.) There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which are implemented to manage the Xen event mask state. The only thing worth noting is that when events are unmasked, we need to explicitly see if there's a pending event and call into the hypervisor to make sure it gets delivered. UPCALLS Xen needs a couple of upcall (or callback) functions to be implemented by each guest. One is the event upcalls, which is how events (interrupts, effectively) are delivered to the guests. The other is the failsafe callback, which is used to report errors in either reloading a segment register, or caused by iret. These are implemented in i386/kernel/entry.S so they can jump into the normal iret_exc path when necessary. MULTICALL BATCHING Xen provides a multicall mechanism, which allows multiple hypercalls to be issued at once in order to mitigate the cost of trapping into the hypervisor. This is particularly useful for context switches, since the 4-5 hypercalls they would normally need (reload cr3, update TLS, maybe update LDT) can be reduced to one. This patch implements a generic batching mechanism for hypercalls, which gets used in many places in the Xen code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Adrian Bunk <bunk@stusta.de>	2007-07-18 08:47:42 -07:00
Jeremy Fitzhardinge	a42089dd35	xen: Add Xen interface header files Add Xen interface header files. These are taken fairly directly from the Xen tree, but somewhat rearranged to suit the kernel's conventions. Define macros and inline functions for doing hypercalls into the hypervisor. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>	2007-07-18 08:47:42 -07:00

1 2 3 4 5

202 Commits