linux-sg2042/arch/x86/xen
Zhang, Fengzhe 2f14ddc3a7 xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps.
With the hypervisor argument of dom0_mem=X we iterate over the physical
(only for the initial domain) E820 and subtract the the size from each
E820_RAM region the delta so that the cumulative size of all E820_RAM regions
is equal to 'X'. This sometimes ends up with E820_RAM regions with zero size
(which are removed by e820_sanitize) and E820_RAM that are smaller
than physically.

Later on the PCI API looks at the E820 and attempts to set up an
resource region for the "PCI mem". The E820 (assume dom0_mem=1GB is
set) compared to the physical looks as so:

 [    0.000000] BIOS-provided physical RAM map:
 [    0.000000]  Xen: 0000000000000000 - 0000000000097c00 (usable)
 [    0.000000]  Xen: 0000000000097c00 - 0000000000100000 (reserved)
-[    0.000000]  Xen: 0000000000100000 - 00000000defafe00 (usable)
+[    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
 [    0.000000]  Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
 [    0.000000]  Xen: 00000000defb1ea0 - 00000000e0000000 (reserved)
 [    0.000000]  Xen: 00000000f4000000 - 00000000f8000000 (reserved)
..
And we get
[    0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:9efafe00)

while it should have started at e0000000 (a nice big gap up to
f4000000 exists). The "Allocating PCI" is part of the resource API.

The users that end up using those PCI I/O regions usually supply their
own BARs when calling the resource API (request_resource, or allocate_resource),
but there are exceptions which provide an empty 'struct resource' and
expect the API to provide the 'struct resource' to be populated with valid values.
The one that triggered this bug was the intel AGP driver that requested
a region for the flush page (intel_i9xx_setup_flush).

Before this patch, when running under Xen hypervisor, the 'struct resource'
returned could have (depending on the dom0_mem size) physical ranges of a 'System RAM'
instead of 'I/O' regions. This ended up with the Hypervisor failing a request
to populate PTE's with those PFNs as the domain did not have access to those
'System RAM' regions (rightly so).

After this patch, the left-over E820_RAM region from the truncation, will be
labeled as E820_UNUSABLE. The E820 will look as so:

 [    0.000000] BIOS-provided physical RAM map:
 [    0.000000]  Xen: 0000000000000000 - 0000000000097c00 (usable)
 [    0.000000]  Xen: 0000000000097c00 - 0000000000100000 (reserved)
-[    0.000000]  Xen: 0000000000100000 - 00000000defafe00 (usable)
+[    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
+[    0.000000]  Xen: 0000000040000000 - 00000000defafe00 (unusable)
 [    0.000000]  Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
 [    0.000000]  Xen: 00000000defb1ea0 - 00000000e0000000 (reserved)
 [    0.000000]  Xen: 00000000f4000000 - 00000000f8000000 (reserved)

For more information:
http://mid.gmane.org/1A42CE6F5F474C41B63392A5F80372B2335E978C@shsmsx501.ccr.corp.intel.com

BugLink: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1726

Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-22 12:48:50 -05:00
..
Kconfig Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen 2010-10-28 17:11:17 -07:00
Makefile xen: move p2m handling to separate file 2011-01-11 14:31:07 -05:00
debugfs.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
debugfs.h xen: add debugfs support 2008-08-21 13:52:58 +02:00
enlighten.c lockdep: Move early boot local IRQ enable/disable status to init/main.c 2011-01-20 13:32:33 +01:00
grant-table.c xen: make grant table arch portable 2008-04-24 23:57:32 +02:00
irq.c xen: fix non-ANSI function warning in irq.c 2011-01-20 14:52:13 -05:00
mmu.c xen: export arbitrary_virt_to_machine 2011-01-14 16:11:12 -08:00
mmu.h xen: make install_p2mtop_page() static 2010-10-22 12:57:23 -07:00
multicalls.c x86, xen: do multicall callbacks with interrupts disabled 2009-02-16 08:56:41 +01:00
multicalls.h xen: Use this_cpu_ops 2010-12-17 15:07:19 +01:00
p2m.c xen/p2m: Mark INVALID_P2M_ENTRY the mfn_list past max_pfn. 2011-01-27 10:49:34 -05:00
pci-swiotlb-xen.c Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen 2010-10-28 17:11:17 -07:00
platform-pci-unplug.c xen: unplug the emulated devices at resume time 2010-12-02 14:40:53 +00:00
setup.c xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps. 2011-02-22 12:48:50 -05:00
smp.c Merge branch 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen 2010-10-28 17:11:17 -07:00
spinlock.c xen: Use this_cpu_ops 2010-12-17 15:07:19 +01:00
suspend.c xen: unplug the emulated devices at resume time 2010-12-02 14:40:53 +00:00
time.c Merge branch 'this_cpu_ops' into for-2.6.38 2010-12-17 15:16:46 +01:00
vdso.h i386: move xen 2007-10-11 11:16:51 +02:00
xen-asm.S x86: style cleanups for xen assemblies 2009-02-05 20:25:41 +01:00
xen-asm.h xen: make direct versions of irq_enable/disable/save/restore to common code 2009-02-04 16:59:04 -08:00
xen-asm_32.S percpu: remove per_cpu__ prefix. 2009-10-29 22:34:15 +09:00
xen-asm_64.S xen: use iret for return from 64b kernel to 32b usermode 2009-12-03 11:14:54 -08:00
xen-head.S x86: use _types.h headers in asm where available 2009-02-13 11:35:01 -08:00
xen-ops.h xen: unplug the emulated devices at resume time 2010-12-02 14:40:53 +00:00