Commit Graph

197 Commits

Author SHA1 Message Date
Stefano Stabellini 3942b740e5 xen: support GSI -> pirq remapping in PV on HVM guests
Disable pcifront when running on HVM: it is meant to be used with pv
guests that don't have PCI bus.

Use acpi_register_gsi_xen_hvm to remap GSIs into pirqs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-22 21:25:42 +01:00
Stefano Stabellini 42a1de56f3 xen: implement xen_hvm_register_pirq
xen_hvm_register_pirq allows the kernel to map a GSI into a Xen pirq and
receive the interrupt as an event channel from that point on.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-22 21:25:41 +01:00
Stefano Stabellini 01557baff6 xen: get the maximum number of pirqs from xen
Use PHYSDEVOP_get_nr_pirqs to get the maximum number of pirqs from xen.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-22 21:25:40 +01:00
Stefano Stabellini 7a043f119c xen: support pirq != irq
PHYSDEVOP_map_pirq might return a pirq different from what we asked if
we are running as an HVM guest, so we need to be able to support pirqs
that are different from linux irqs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-22 21:25:40 +01:00
Stefano Stabellini 67ba37293e Merge commit 'konrad/stable/xen-pcifront-0.8.2' into 2.6.36-rc8-initial-domain-v6 2010-10-22 21:24:06 +01:00
Konrad Rzeszutek Wilk 2d7d06dd8f xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
Without this dependency we get these compile errors:

linux-next-20101020/drivers/xen/biomerge.c: In function 'xen_biovec_phys_mergeable':
linux-next-20101020/drivers/xen/biomerge.c:8: error: dereferencing pointer to incomplete type
linux-next-20101020/drivers/xen/biomerge.c:9: error: dereferencing pointer to incomplete type
linux-next-20101020/drivers/xen/biomerge.c:11: error: implicit declaration of function '__BIOVEC_PHYS_MERGEABLE'

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
2010-10-20 13:04:13 -04:00
Konrad Rzeszutek Wilk 2c52f8d3f7 x86: xen: Sanitse irq handling (part two)
Thomas Gleixner cleaned up event handling to use the
sparse_irq handling, but the xen-pcifront patches utilized the
old mechanism. This fixes them to work with sparse_irq handling.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 17:12:38 -04:00
Konrad Rzeszutek Wilk 2775609c5d swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
We used to depend on CONFIG_SWIOTLB, but that is disabled by default.
So when compiling we get this compile error:

arch/x86/xen/pci-swiotlb-xen.c: In function 'pci_xen_swiotlb_detect':
arch/x86/xen/pci-swiotlb-xen.c:48: error: lvalue required as left operand of assignment

Fix it by actually activating the SWIOTLB library.

Reported-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:49:40 -04:00
Konrad Rzeszutek Wilk 74226b8c8a xen/pci: Request ACS when Xen-SWIOTLB is activated.
It used to done in the Xen startup code but that is not really
appropiate.

[v2: Update Kconfig with PCI requirement]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:49:38 -04:00
Yosuke Iwamatsu 89afb6e46a xenbus: Xen paravirtualised PCI hotplug support.
The Xen PCI front driver adds two new states that are utilizez
for PCI hotplug support. This is a patch pulled from the
linux-2.6-xen-sparse tree.

Signed-off-by: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Yosuke Iwamatsu <y-iwamatsu@ab.jp.nec.com>
2010-10-18 10:49:35 -04:00
Alex Nixon b5401a96b5 xen/x86/PCI: Add support for the Xen PCI subsystem
The frontend stub lives in arch/x86/pci/xen.c, alongside other
sub-arch PCI init code (e.g. olpc.c).

It provides a mechanism for Xen PCI frontend to setup/destroy
legacy interrupts, MSI/MSI-X, and PCI configuration operations.

[ Impact: add core of Xen PCI support ]
[ v2: Removed the IOMMU code and only focusing on PCI.]
[ v3: removed usage of pci_scan_all_fns as that does not exist]
[ v4: introduced pci_xen value to fix compile warnings]
[ v5: squished fixes+features in one patch, changed Reviewed-by to Ccs]
[ v7: added Acked-by]
Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Qing He <qing.he@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
2010-10-18 10:49:35 -04:00
Konrad Rzeszutek Wilk 15ebbb82ba xen: fix shared irq device passthrough
In driver/xen/events.c, whether bind_pirq is shareable or not is
determined by desc->action is NULL or not. But in __setup_irq,
startup(irq) is invoked before desc->action is assigned with
new action. So desc->action in startup_irq is always NULL, and
bind_pirq is always not shareable. This results in pt_irq_create_bind
failure when passthrough a device which shares irq to other devices.

This patch doesn't use probing_irq to determine if pirq is shareable
or not, instead set shareable flag in irq_info according to trigger
mode in xen_allocate_pirq. Set level triggered interrupts shareable.
Thus use this flag to set bind_pirq flag accordingly.

[v2: arch/x86/xen/pci.c no more, so file skipped]

Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:49:29 -04:00
Konrad Rzeszutek Wilk d9a8814f27 xen: Provide a variant of xen_poll_irq with timeout.
The 'xen_poll_irq_timeout' provides a method to pass in
the poll timeout for IRQs if requested. We also export
those two poll functions as Xen PCI fronted uses them.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-10-18 10:49:28 -04:00
Konrad Rzeszutek Wilk 3a69e9165a xen: Find an unbound irq number in reverse order (high to low).
In earlier Xen Linux kernels, the IRQ mapping was a straight 1:1 and the
find_unbound_irq started looking around 256 for open IRQs and up. IRQs
from 0 to 255 were reserved for PCI devices.  Previous to this patch,
the 'find_unbound_irq'  started looking at get_nr_hw_irqs() number.
For privileged  domain where the ACPI information is available that
returns the upper-bound of what the GSIs. For non-privileged PV domains,
where ACPI is no-existent the get_nr_hw_irqs() reports the IRQ_LEGACY (16).
With PCI passthrough enabled, and with PCI cards that have IRQs pinned
to a higher number than 16 we collide with previously allocated IRQs.
Specifically the PCI IRQs collide with the IPI's for Xen functions
(as they are allocated earlier).
For example:

00:00.11 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller (prog-if 10 [OHCI])
	...
	Interrupt: pin A routed to IRQ 18

[root@localhost ~]# cat /proc/interrupts | head
           CPU0       CPU1       CPU2
 16:      38186          0          0   xen-dyn-virq      timer0
 17:        149          0          0   xen-dyn-ipi       spinlock0
 18:        962          0          0   xen-dyn-ipi       resched0

and when the USB controller is loaded, the kernel reports:
IRQ handler type mismatch for IRQ 18
current handler: resched0

One way to fix this is to reverse the logic when looking for un-used
IRQ numbers and start with the highest available number. With that,
we would get:

           CPU0       CPU1       CPU2
... snip ..
292:         35          0          0   xen-dyn-ipi       callfunc0
293:       3992          0          0   xen-dyn-ipi       resched0
294:        224          0          0   xen-dyn-ipi       spinlock0
295:      57183          0          0   xen-dyn-virq      timer0
NMI:          0          0          0   Non-maskable interrupts
.. snip ..

And interrupts for PCI cards are now accessible.

This patch also includes the fix, found by Ian Campbell, titled
"xen: fix off-by-one error in find_unbound_irq."

[v2: Added an explanation in the code]
[v3: Rebased on top of tip/irq/core]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-10-18 10:49:10 -04:00
Jeremy Fitzhardinge 3b32f574a0 xen: statically initialize cpu_evtchn_mask_p
Sometimes cpu_evtchn_mask_p can get used early, before it has been
allocated.  Statically initialize it with an initdata version to catch
any early references.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:41:44 -04:00
Gerd Hoffmann 1a60d05f40 xen: set pirq name to something useful.
Impact: cleanup

Make pirq show useful information in /proc/interrupts

[v2: Removed the parts for arch/x86/xen/pci.c ]

Signed-off-by: Gerd Hoffmann <kraxel@xeni.home.kraxel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:41:43 -04:00
Jeremy Fitzhardinge b21ddbf503 xen: dynamically allocate irq & event structures
Dynamically allocate the irq_info and evtchn_to_irq arrays, so that
1) the irq_info array scales to the actual number of possible irqs,
and 2) we don't needlessly increase the static size of the kernel
when we aren't running under Xen.

Derived on patch from Mike Travis <travis@sgi.com>.

[Impact: reduce memory usage ]
[v2: Conflict in drivers/xen/events.c: Replaced alloc_bootmen with kcalloc ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:41:42 -04:00
Konrad Rzeszutek Wilk 0794bfc743 xen: identity map gsi->irqs
Impact: preserve compat with native

Reserve the lower irq range for use for hardware interrupts so we
can identity-map them.

[v2: Rebased on top tip/irq/core]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:41:08 -04:00
Jeremy Fitzhardinge d46a78b05c xen: implement pirq type event channels
A privileged PV Xen domain can get direct access to hardware.  In
order for this to be useful, it must be able to get hardware
interrupts.

Being a PV Xen domain, all interrupts are delivered as event channels.
PIRQ event channels are bound to a pirq number and an interrupt
vector.  When a IO APIC raises a hardware interrupt on that vector, it
is delivered as an event channel, which we can deliver to the
appropriate device driver(s).

This patch simply implements the infrastructure for dealing with pirq
event channels.

[ Impact: integrate hardware interrupts into Xen's event scheme ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:40:29 -04:00
Jeremy Fitzhardinge d8e0420603 xen: define BIOVEC_PHYS_MERGEABLE()
Impact: allow Xen control of bio merging

When running in Xen domain with device access, we need to make sure
the block subsystem doesn't merge requests across pages which aren't
machine physically contiguous.  To do this, we define our own
BIOVEC_PHYS_MERGEABLE.  When CONFIG_XEN isn't enabled, or we're not
running in a Xen domain, this has identical behaviour to the normal
implementation.  When running under Xen, we also make sure the
underlying machine pages are the same or adjacent.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-10-18 10:40:28 -04:00
Thomas Gleixner 77dff1c755 x86: xen: Sanitise sparse_irq handling
There seems to be more cleanups possible, but that's left to the xen
experts :)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
2010-10-12 16:53:44 +02:00
Stefano Stabellini a947f0f8f7 xen: do not set xenstored_ready before xenbus_probe on hvm
Register_xenstore_notifier should guarantee that the caller gets
notified even if xenstore is already up.
Therefore we revert "do not notify callers from
register_xenstore_notifier" and set xenstored_read at the right time for
PV on HVM guests too.
In fact in case of PV on HVM guests xenstored is ready only after the
platform pci driver has completed the initialization, so do not set
xenstored_ready before the call to xenbus_probe().

This patch fixes a shutdown_event watcher registration bug that causes
"xm shutdown" not to work properly.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
2010-10-05 13:37:28 +01:00
Linus Torvalds 2637d139fb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: pxa27x_keypad - remove input_free_device() in pxa27x_keypad_remove()
  Input: mousedev - fix regression of inverting axes
  Input: uinput - add devname alias to allow module on-demand load
  Input: hil_kbd - fix compile error
  USB: drop tty argument from usb_serial_handle_sysrq_char()
  Input: sysrq - drop tty argument form handle_sysrq()
  Input: sysrq - drop tty argument from sysrq ops handlers
2010-08-28 13:55:31 -07:00
Jeremy Fitzhardinge dffe2e1e1a xen: handle events as edge-triggered
Xen events are logically edge triggered, as Xen only calls the event
upcall when an event is newly set, but not continuously as it remains set.
As a result, use handle_edge_irq rather than handle_level_irq.

This has the important side-effect of fixing a long-standing bug of
events getting lost if:
 - an event's interrupt handler is running
 - the event is migrated to a different vcpu
 - the event is re-triggered

The most noticable symptom of these lost events is occasional lockups
of blkfront.

Many thanks to Tom Kopec and Daniel Stodden in tracking this down.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Tom Kopec <tek@acm.org>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2010-08-24 11:14:12 -07:00
Jeremy Fitzhardinge aaca49642b xen: use percpu interrupts for IPIs and VIRQs
IPIs and VIRQs are inherently per-cpu event types, so treat them as such:
 - use a specific percpu irq_chip implementation, and
 - handle them with handle_percpu_irq

This makes the path for delivering these interrupts more efficient
(no masking/unmasking, no locks), and it avoid problems with attempts
to migrate them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2010-08-24 11:13:28 -07:00
Dmitry Torokhov f335397d17 Input: sysrq - drop tty argument form handle_sysrq()
Sysrq operations do not accept tty argument anymore so no need to pass
it to us.

[Stephen Rothwell <sfr@canb.auug.org.au>: fix build breakage in drm code
 caused by sysrq using bool but not including linux/types.h]

[Sachin Sant <sachinp@in.ibm.com>: fix build breakage in s390 keyboadr
 driver]

Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-08-21 00:34:45 -07:00
Linus Torvalds 26f0cf9181 Merge branch 'stable/xen-swiotlb-0.8.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/xen-swiotlb-0.8.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  x86: Detect whether we should use Xen SWIOTLB.
  pci-swiotlb-xen: Add glue code to setup dma_ops utilizing xen_swiotlb_* functions.
  swiotlb-xen: SWIOTLB library for Xen PV guest with PCI passthrough.
  xen/mmu: inhibit vmap aliases rather than trying to clear them out
  vmap: add flag to allow lazy unmap to be disabled at runtime
  xen: Add xen_create_contiguous_region
  xen: Rename the balloon lock
  xen: Allow unprivileged Xen domains to create iomap pages
  xen: use _PAGE_IOMAP in ioremap to do machine mappings

Fix up trivial conflicts (adding both xen swiotlb and xen pci platform
driver setup close to each other) in drivers/xen/{Kconfig,Makefile} and
include/xen/xen-ops.h
2010-08-12 09:09:41 -07:00
Linus Torvalds 2f9e825d3e Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block: (149 commits)
  block: make sure that REQ_* types are seen even with CONFIG_BLOCK=n
  xen-blkfront: fix missing out label
  blkdev: fix blkdev_issue_zeroout return value
  block: update request stacking methods to support discards
  block: fix missing export of blk_types.h
  writeback: fix bad _bh spinlock nesting
  drbd: revert "delay probes", feature is being re-implemented differently
  drbd: Initialize all members of sync_conf to their defaults [Bugz 315]
  drbd: Disable delay probes for the upcomming release
  writeback: cleanup bdi_register
  writeback: add new tracepoints
  writeback: remove unnecessary init_timer call
  writeback: optimize periodic bdi thread wakeups
  writeback: prevent unnecessary bdi threads wakeups
  writeback: move bdi threads exiting logic to the forker thread
  writeback: restructure bdi forker loop a little
  writeback: move last_active to bdi
  writeback: do not remove bdi from bdi_list
  writeback: simplify bdi code a little
  writeback: do not lose wake-ups in bdi threads
  ...

Fixed up pretty trivial conflicts in drivers/block/virtio_blk.c and
drivers/scsi/scsi_error.c as per Jens.
2010-08-10 15:22:42 -07:00
Daniel Stodden 5b61cb90c2 xenbus: Make xenbus_switch_state transactional
According to the comments, this was how it's been done years ago, but
apparently took an xbt pointer from elsewhere back then. The code was
removed because of consistency issues: cancellation wont't roll back
the saved xbdev->state.

Still, unsolicited writes to the state field remain an issue,
especially if device shutdown takes thread synchronization, and subtle
races cause accidental recreation of the device node.

Fixed by reintroducing the transaction. An internal one is sufficient,
so the xbdev->state value remains consistent.

Also fixes the original hack to prevent infinite recursion. Instead of
bailing out on the first attempt to switch to Closing, checks call
depth now.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-07 18:31:34 +02:00
Linus Torvalds 1787985782 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  xen: Do not suspend IPI IRQs.
  powerpc: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupts
  ixp4xx-beeper: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupt
  irq: Add new IRQ flag IRQF_NO_SUSPEND
2010-08-06 13:25:43 -07:00
Jeremy Fitzhardinge 7cc88fdcff Merge branch 'xen/xenbus' into upstream/xen
* xen/xenbus:
  implement O_NONBLOCK for /proc/xen/xenbus
  xenbus: do not hold transaction_mutex when returning to userspace
2010-08-04 14:49:24 -07:00
Jeremy Fitzhardinge ca50a5f390 Merge branch 'upstream/pvhvm' into upstream/xen
* upstream/pvhvm:
  Introduce CONFIG_XEN_PVHVM compile option
  blkfront: do not create a PV cdrom device if xen_hvm_guest
  support multiple .discard.* sections to avoid section type conflicts
  xen/pvhvm: fix build problem when !CONFIG_XEN
  xenfs: enable for HVM domains too
  x86: Call HVMOP_pagetable_dying on exit_mmap.
  x86: Unplug emulated disks and nics.
  x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock.
  xen: Fix find_unbound_irq in presence of ioapic irqs.
  xen: Add suspend/resume support for PV on HVM guests.
  xen: Xen PCI platform device driver.
  x86/xen: event channels delivery on HVM.
  x86: early PV on HVM features initialization.
  xen: Add support for HVM hypercalls.

Conflicts:
	arch/x86/xen/enlighten.c
	arch/x86/xen/time.c
2010-08-04 14:49:16 -07:00
Jeremy Fitzhardinge a70ce4b606 Merge branch 'upstream/core' into upstream/xen
* upstream/core:
  xen/panic: use xen_reboot and fix smp_send_stop
  Xen: register panic notifier to take crashes of xen guests on panic
  xen: support large numbers of CPUs with vcpu info placement
  xen: drop xen_sched_clock in favour of using plain wallclock time
  pvops: do not notify callers from register_xenstore_notifier
  xen: make sure pages are really part of domain before freeing
  xen: release unused free memory
2010-08-04 14:49:05 -07:00
Stefano Stabellini 31de189f7d pvops: do not notify callers from register_xenstore_notifier
Currently register_xenstore_notifier notifies the caller during the
registration itself if xenstore is believed to be ready. This behaviour
causes problems to PV on HVM guests, in which case callers should be
notified by xenbus_probe only after the platform pci driver is loaded.
We already make sure xenbus_probe is called at the right time, calling
it either from device_initcall (PV case) or from the platform pci
driver initialization (HVM case) so we don't need this additional
notification.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-08-04 14:47:28 -07:00
Stefano Stabellini ca65f9fc0c Introduce CONFIG_XEN_PVHVM compile option
This patch introduce a CONFIG_XEN_PVHVM compile time option to
enable/disable Xen PV on HVM support.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-07-29 11:11:33 -07:00
Ian Campbell 4877c73728 xen: Do not suspend IPI IRQs.
In general the semantics of IPIs are that they are are expected to
continue functioning after dpm_suspend_noirq().

Specifically I have seen a deadlock between the callfunc IPI and the
stop machine used by xen's do_suspend() routine. If one CPU has already
called dpm_suspend_noirq() then there is a window where it can be sent
a callfunc IPI before all the other CPUs have entered stop_cpu().

If this happens then the first CPU ends up spinning in stop_cpu()
waiting for the other to rendezvous in state STOPMACHINE_PREPARE while
the other is spinning in csd_lock_wait().

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com
LKML-Reference: <1280398595-29708-4-git-send-email-ian.campbell@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-29 13:24:58 +02:00
Konrad Rzeszutek Wilk b097186fd2 swiotlb-xen: SWIOTLB library for Xen PV guest with PCI passthrough.
This patchset:

PV guests under Xen are running in an non-contiguous memory architecture.

When PCI pass-through is utilized, this necessitates an IOMMU for
translating bus (DMA) to virtual and vice-versa and also providing a
mechanism to have contiguous pages for device drivers operations (say DMA
operations).

Specifically, under Xen the Linux idea of pages is an illusion. It
assumes that pages start at zero and go up to the available memory. To
help with that, the Linux Xen MMU provides a lookup mechanism to
translate the page frame numbers (PFN) to machine frame numbers (MFN)
and vice-versa. The MFN are the "real" frame numbers. Furthermore
memory is not contiguous. Xen hypervisor stitches memory for guests
from different pools, which means there is no guarantee that PFN==MFN
and PFN+1==MFN+1. Lastly with Xen 4.0, pages (in debug mode) are
allocated in descending order (high to low), meaning the guest might
never get any MFN's under the 4GB mark.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Albert Herranz <albert_herranz@yahoo.es>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
2010-07-27 11:51:00 -04:00
Jeremy Fitzhardinge 43df95c44e xenfs: enable for HVM domains too
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-26 23:13:27 -07:00
Stefano Stabellini c1c5413ad5 x86: Unplug emulated disks and nics.
Add a xen_emul_unplug command line option to the kernel to unplug
xen emulated disks and nics.

Set the default value of xen_emul_unplug depending on whether or
not the Xen PV frontends and the Xen platform PCI driver have
been compiled for this kernel (modules or built-in are both OK).

The user can specify xen_emul_unplug=ignore to enable PV drivers on HVM
even if the host platform doesn't support unplug.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-26 23:13:25 -07:00
Paolo Bonzini 6280f190da implement O_NONBLOCK for /proc/xen/xenbus
This patch implements O_NONBLOCK for /proc/xen/xenbus.  It is a simple
matter of returning -EAGAIN instead of waiting on a queue.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-26 10:05:05 -07:00
Stefano Stabellini 99ad198c49 xen: Fix find_unbound_irq in presence of ioapic irqs.
Don't break the assumption that the first 16 irqs are ISA irqs;
make sure that the irq is actually free before using it.

Use dynamic_irq_init_keep_chip_data instead of
dynamic_irq_init so that chip_data is not NULL (a NULL chip_data breaks
setup_vector_irq).

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22 16:46:30 -07:00
Stefano Stabellini 016b6f5fe8 xen: Add suspend/resume support for PV on HVM guests.
Suspend/resume requires few different things on HVM: the suspend
hypercall is different; we don't need to save/restore memory related
settings; except the shared info page and the callback mechanism.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22 16:46:21 -07:00
Stefano Stabellini 183d03cc4f xen: Xen PCI platform device driver.
Add the xen pci platform device driver that is responsible
for initializing the grant table and xenbus in PV on HVM mode.
Few changes to xenbus and grant table are necessary to allow the delayed
initialization in HVM mode.
Grant table needs few additional modifications to work in HVM mode.

The Xen PCI platform device raises an irq every time an event has been
delivered to us. However these interrupts are only delivered to vcpu 0.
The Xen PCI platform interrupt handler calls xen_hvm_evtchn_do_upcall
that is a little wrapper around __xen_evtchn_do_upcall, the traditional
Xen upcall handler, the very same used with traditional PV guests.

When running on HVM the event channel upcall is never called while in
progress because it is a normal Linux irq handler (and we cannot switch
the irq chip wholesale to the Xen PV ones as we are running QEMU and
might have passed in PCI devices), therefore we cannot be sure that
evtchn_upcall_pending is 0 when returning.
For this reason if evtchn_upcall_pending is set by Xen we need to loop
again on the event channels set pending otherwise we might loose some
event channel deliveries.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22 16:46:09 -07:00
Sheng Yang 38e20b07ef x86/xen: event channels delivery on HVM.
Set the callback to receive evtchns from Xen, using the
callback vector delivery mechanism.

The traditional way for receiving event channel notifications from Xen
is via the interrupts from the platform PCI device.
The callback vector is a newer alternative that allow us to receive
notifications on any vcpu and doesn't need any PCI support: we allocate
a vector exclusively to receive events, in the vector handler we don't
need to interact with the vlapic, therefore we avoid a VMEXIT.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22 16:45:59 -07:00
Sheng Yang bee6ab53e6 x86: early PV on HVM features initialization.
Initialize basic pv on hvm features adding a new Xen HVM specific
hypervisor_x86 structure.

Don't try to initialize xen-kbdfront and xen-fbfront when running on HVM
because the backends are not available.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-22 16:45:35 -07:00
Alex Nixon 19001c8c5b xen: Rename the balloon lock
* xen_create_contiguous_region needs access to the balloon lock to
  ensure memory doesn't change under its feet, so expose the balloon
  lock
* Change the name of the lock to xen_reservation_lock, to imply it's
  now less-specific usage.

[ Impact: cleanup ]

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-06-07 14:34:07 -04:00
Ian Campbell b3831cb55d xen: avoid allocation causing potential swap activity on the resume path
Since the device we are resuming could be the device containing the
swap device we should ensure that the allocation cannot cause
IO.

On resume, this path is triggered when the running system tries to
continue using its devices.  If it cannot then the resume will fail;
to try to avoid this we let it dip into the emergency pools.

The majority of these changes were made when linux-2.6.18-xen.hg
changeset e8b49cfbdac0 was ported upstream in
a144ff09bc but somehow this hunk was
dropped.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Stable Kernel <stable@kernel.org> # .32.x
2010-06-03 09:34:45 +01:00
Randy Dunlap f3bc3189a0 xen: fix build when SYSRQ is disabled
Fix build error when CONFIG_MAGIC_SYSRQ is not enabled:

drivers/xen/manage.c:223: error: implicit declaration of function 'handle_sysrq'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:07 -07:00
Tejun Heo 3fc1f1e27a stop_machine: reimplement using cpu_stop
Reimplement stop_machine using cpu_stop.  As cpu stoppers are
guaranteed to be available for all online cpus,
stop_machine_create/destroy() are no longer necessary and removed.

With resource management and synchronization handled by cpu_stop, the
new implementation is much simpler.  Asking the cpu_stop to execute
the stop_cpu() state machine on all online cpus with cpu hotplug
disabled is enough.

stop_machine itself doesn't need to manage any global resources
anymore, so all per-instance information is rolled into struct
stop_machine_data and the mutex and all static data variables are
removed.

The previous implementation created and destroyed RT workqueues as
necessary which made stop_machine() calls highly expensive on very
large machines.  According to Dimitri Sivanich, preventing the dynamic
creation/destruction makes booting faster more than twice on very
large machines.  cpu_stop resources are preallocated for all online
cpus and should have the same effect.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06 18:49:20 +02:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00