Commit Graph

2992 Commits

Author SHA1 Message Date
Lin Ming 1cc0c998fd ACPI: Fix D3hot v D3cold confusion
Before this patch, ACPI_STATE_D3 incorrectly referenced D3hot
in some places, but D3cold in other places.

After this patch, ACPI_STATE_D3 always means ACPI_STATE_D3_COLD;
and all references to D3hot use ACPI_STATE_D3_HOT.

ACPI's _PR3 method is used to enter both D3hot and D3cold states.
What distinguishes D3hot from D3cold is the presence _PR3
(Power Resources for D3hot)  If these resources are all ON,
then the state is D3hot.  If _PR3 is not present,
or all _PR0 resources for the devices are OFF,
then the state is D3cold.

This patch applies after Linux-3.4-rc1.
A future syntax cleanup may remove ACPI_STATE_D3
to emphasize that it always means ACPI_STATE_D3_COLD.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Aaron Lu <aaron.lu@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-05-05 01:19:52 -04:00
Khalid Aziz b566a22c23 PCI: disable Bus Master on PCI device shutdown
Disable Bus Master bit on the device in pci_device_shutdown() to ensure PCI
devices do not continue to DMA data after shutdown.  This can cause memory
corruption in case of a kexec where the current kernel shuts down and
transfers control to a new kernel while a PCI device continues to DMA to
memory that does not belong to it any more in the new kernel.

I have tested this code on two laptops, two workstations and a 16-socket
server.  kexec worked correctly on all of them.

Signed-off-by: Khalid Aziz <khalid.aziz@hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-05-04 11:07:18 -06:00
Xudong Hao df558de16c PCI: work around IvyBridge internal graphics FLR erratum
For IvyBridge Mobile platform, a system hang may occur if a FLR (Function
Level Reset) is asserted to internal graphics.

This quirk is a workaround for the IVB FLR errata issue.  We are
disabling the FLR reset handshake between the PCH and CPU display, then
manually powering down the panel power sequencing and resetting the PCH
display.

Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-05-02 10:02:32 -06:00
Dave Airlie 5bc69bf9ae Merge tag 'drm-intel-next-2012-04-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next
Daniel Vetter writes:

A new drm-intel-next pull. Highlights:
- More gmbus patches from Daniel Kurtz, I think gmbus is now ready, all
 known issues fixed.
- Fencing cleanup and pipelined fencing removal from Chris.
- rc6 residency interface from Ben, useful for powertop.
- Cleanups and code reorg around the ringbuffer code (Ben&me).
- Use hw semaphores in the pageflip code from Ben.
- More vlv stuff from Jesse, unfortunately his vlv cpu is doa, so less
 merged than I've hoped for - we still have the unused function warning :(
- More hsw patches from Eugeni, again, not yet enabled fully.
- intel_pm.c refactoring from Eugeni.
- Ironlake sprite support from Chris.
- And various smaller improvements/fixes all over the place.

Note that this pull request also contains a backmerge of -rc3 to sort out
a few things in -next. I've also had to frob the shortlog a bit to exclude
anything that -rc3 brings in with this pull.

Regression wise we have a few strange bugs going on, but for all of them
closer inspection revealed that they've been pre-existing, just now
slightly more likely to be hit. And for most of them we have a patch
already. Otherwise QA has not reported any regressions, and I'm also not
aware of anything bad happening in 3.4.

* tag 'drm-intel-next-2012-04-23' of git://people.freedesktop.org/~danvet/drm-intel: (420 commits)
  drm/i915: rc6 residency (fix the fix)
  drm/i915/tv: fix open-coded ARRAY_SIZE.
  drm/i915: invalidate render cache on gen2
  drm/i915: Silence the change of LVDS sync polarity
  drm/i915: add generic power management initialization
  drm/i915: move clock gating functionality into intel_pm module
  drm/i915: move emon functionality into intel_pm module
  drm/i915: move drps, rps and rc6-related functions to intel_pm
  drm/i915: fix line breaks in intel_pm
  drm/i915: move watermarks settings into intel_pm module
  drm/i915: move fbc-related functionality into intel_pm module
  drm/i915: Refactor get_fence() to use the common fence writing routine
  drm/i915: Refactor fence clearing to use the common fence writing routine
  drm/i915: Refactor put_fence() to use the common fence writing routine
  drm/i915: Prepare to consolidate fence writing
  drm/i915: Remove the unsightly "optimisation" from flush_fence()
  drm/i915: Simplify fence finding
  drm/i915: Discard the unused obj->last_fenced_ring
  drm/i915: Remove unused ring->setup_seqno
  drm/i915: Remove fence pipelining
  ...
2012-05-02 09:22:29 +01:00
Huang, Xiong 7cb6a291ef atl1c: add workaround for issue of bit INTX-disable for MSI interrupt
All supported devices have one issue that msi interrupt doesn't assert
if pci command register bit (PCI_COMMAND_INTX_DISABLE) is set.
Add workaround in drivers/pci/quirks.c

Signed-off-by: xiong <xiong@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30 21:44:12 -04:00
Bjorn Helgaas 29473ec23a Merge branch 'topic/yinghai-hostbridge-cleanup' into next 2012-04-30 16:55:58 -06:00
Konrad Rzeszutek Wilk 977f857ca5 PCI: move mutex locking out of pci_dev_reset function
The intent of git commit 6fbf9e7a90
"PCI: Introduce __pci_reset_function_locked to be used when holding
device_lock." was to have a non-locking function that would call
pci_dev_reset function.

But it fell short of that by just probing and not actually reseting
the device. To make that work we need a way to move the lock
around device_lock to not be in pci_dev_reset (as the caller of
__pci_reset_function_locked already holds said lock). We do this by
renaming pci_dev_reset to __pci_dev_reset and bubbling said mutex out
of __pci_dev_reset to pci_dev_reset (a wrapper around __pci_dev_reset).
The __pci_reset_function_locked  can now call __pci_dev_reset without
having to worry about the dead-lock.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-30 16:47:26 -06:00
Bjorn Helgaas 284f5f9dba PCI: work around Stratus ftServer broken PCIe hierarchy
A PCIe downstream port is a P2P bridge.  Its secondary interface is
a link that should lead only to device 0 (unless ARI is enabled)[1], so
we don't probe for non-zero device numbers.

Some Stratus ftServer systems have a PCIe downstream port (02:00.0) that
leads to both an upstream port (03:00.0) and a downstream port (03:01.0),
and 03:01.0 has important devices below it:

  [0000:02]-+-00.0-[03-3c]--+-00.0-[04-09]--...
                            \-01.0-[0a-0d]--+-[USB]
                                            +-[NIC]
                                            +-...

Previously, we didn't enumerate device 03:01.0, so USB and the network
didn't work.  This patch adds a DMI quirk to scan all device numbers,
not just 0, below a downstream port.

Based on a patch by Prarit Bhargava.

[1] PCIe spec r3.0, sec 7.3.1

CC: Myron Stowe <mstowe@redhat.com>
CC: Don Dutile <ddutile@redhat.com>
CC: James Paradis <james.paradis@stratus.com>
CC: Matthew Wilcox <matthew.r.wilcox@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-30 15:21:02 -06:00
Yinghai Lu 4fa2649a01 PCI: add host bridge release support
We need a hook to release host bridge resources allocated when creating
root bus.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-30 14:52:43 -06:00
Yinghai Lu 7b54366358 PCI: add generic device into pci_host_bridge struct
Use that device for pci_root_bus bridge pointer.

Use pci_release_bus_bridge_dev() to release allocated pci_host_bridge in
remove path.

Use root bus bridge pointer to get host bridge pointer instead of searching
host bridge list.  That leaves the host bridge list unused, so remove it.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-30 13:53:42 -06:00
Yinghai Lu 459f58ce51 PCI: rename pci_host_bridge() to find_pci_root_bridge()
pci_host_bridge() looks like a C++ constructor.  Also separate
find_pci_root_bus() out.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-30 13:53:08 -06:00
Yinghai Lu 610929e119 PCI: move host bridge-related code to host-bridge.c
Move host bridge-related code from probe.c to a new host-bridge.c.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-04-30 13:36:14 -06:00
Linus Torvalds f6072452c9 Merge branch 'for-v3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull build fixes for less mainstream architectures from Paul Gortmaker:
 "These are fixes for frv(1), blackfin(2), powerpc(1) and xtensa(4).

  Fortunately the touches are nearly all specific to files just used by
  the arch in question.  The two touches to shared/common files
  [kernel/irq/debug.h and drivers/pci/Makefile] are trivial to assess as
  no risk to anyone.

  Half of them relate to xtensa directly.  It was only when I fixed the
  last xtensa issue that I realized that the arch has been broken for a
  significant time, and isn't a specific v3.4 regression.  So if you
  wanted, we could leave xtensa lying bleeding in the street for a
  couple more weeks and queue those for 3.5.  But given they are no risk
  to anyone outside of xtensa, I figured to just leave them in.

  If you are OK with taking the xtensa fixes, then please pull to get:

   - one last implicit include uncovered by system.h that is in a file
     specific to just one powerpc defconfig.  (I'd sync'd with BenH).

   - fix an oversight in the PCI makefile where shared code wasn't being
     compiled for ARCH=frv

   - fix a missing include for GPIO in blackfin framebuffer.

   - audit and tag endif in blackfin ezkit board file, in order to find
     and fix the misplaced endif masking a block of code.

   - fix irq/debug.h choice of temporary macro names to be more internal
     so they don't conflict with names used by xtensa.

   - fix a reference to an undeclared local var in xtensa's signal.c

   - fix an implicit bug.h usage in xtensa's asm/io.h uncovered by my
     removing bug.h from kernel.h

   - fix xtensa to properly indicate it is using asm-generic/hardirq.h
     in order to resolve the link error - undefined ack_bad_irq

  The xtensa still fails final link as my latest binutils does something
  evil when ld forward-relocates unlikely() blocks, but in theory people
  who have older/valid toolchains could now use the thing."

* 'for-v3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  xtensa: fix build fail on undefined ack_bad_irq
  blackfin: fix ifdef fustercluck in mach-bf538/boards/ezkit.c
  blackfin: fix compile error in bfin-lq035q1-fb.c
  pci: frv architecture needs generic setup-bus infrastructure
  irq: hide debug macros so they don't collide with others.
  xtensa: fix build error in xtensa/include/asm/io.h
  xtensa: fix build failure in xtensa/kernel/signal.c
  powerpc: fix system.h fallout in sysdev/scom.c [chroma_defconfig]
2012-04-27 19:32:37 -07:00
Paul Gortmaker cd0a2bfb77 pci: frv architecture needs generic setup-bus infrastructure
Otherwise we get this link failure for frv's defconfig:

   LD      .tmp_vmlinux1
 drivers/built-in.o: In function `pci_assign_resource':
 (.text+0xbf0c): undefined reference to `pci_cardbus_resource_alignment'
 drivers/built-in.o: In function `pci_setup':
 pci.c:(.init.text+0x174): undefined reference to `pci_realloc_get_opt'
 pci.c:(.init.text+0x1a0): undefined reference to `pci_realloc_get_opt'
 make[1]: *** [.tmp_vmlinux1] Error 1

Cc: David Howells <dhowells@redhat.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-26 14:46:17 -04:00
Matthew Garrett 1a39b310e9 vgaarb: Add support for setting the default video device (v2)
The default VGA device is a somewhat fluid concept on platforms with
multiple GPUs. Add support for setting it so switching code can update
things appropriately, and make sure that the sysfs code returns the right
device if it's changed.

v2: Updated to fix builds when __ARCH_HAS_VGA_DEFAULT_DEVICE is false.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: benh@kernel.crashing.org
Cc: airlied@redhat.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-04-24 09:50:15 +01:00
Rafael J. Wysocki a6cb9ee7ca PCI: Retry BARs restoration for Type 0 headers only
Some shortcomings introduced into pci_restore_state() by commit
26f41062f2 ("PCI: check for pci bar restore completion and retry")
have been fixed by recent commit ebfc5b802f ("PCI: Fix regression in
pci_restore_state(), v3"), but that commit treats all PCI devices as
those with Type 0 configuration headers.

That is not entirely correct, because Type 1 and Type 2 headers have
different layouts.  In particular, the area occupied by BARs in Type 0
config headers contains the secondary status register in Type 1 ones and
it doesn't make sense to retry the restoration of that register even if
the value read back from it after a write is not the same as the written
one (it very well may be different).

For this reason, make pci_restore_state() only retry the restoration
of BARs for Type 0 config headers.  This effectively makes it behave
as before commit 26f41062f2 for all header types except for Type 0.

Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-16 18:33:35 -07:00
Rafael J. Wysocki ebfc5b802f PCI: Fix regression in pci_restore_state(), v3
Commit 26f41062f2 ("PCI: check for pci bar restore completion and
retry") attempted to address problems with PCI BAR restoration on
systems where FLR had not been completed before pci_restore_state() was
called, but it did that in an utterly wrong way.

First off, instead of retrying the writes for the BAR registers only, it
did that for all of the PCI config space of the device, including the
status register (whose value after the write quite obviously need not be
the same as the written one).  Second, it added arbitrary delay to
pci_restore_state() even for systems where the PCI config space
restoration was successful at first attempt.  Finally, the mdelay(10) it
added to every iteration of the writing loop was way too much of a delay
for any reasonable device.

All of this actually caused resume failures for some devices on Mikko's
system.

To fix the regression, make pci_restore_state() only retry the writes
for BAR registers and only wait if the first read from the register
doesn't return the written value.  Additionaly, make it wait for 1 ms,
instead of 10 ms, after every failing attempt to write into config
space.

Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-15 13:06:29 -07:00
Linus Torvalds 9479f0f801 Two fixes for regressions:
* one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree,
  * the other is to fix drivers to load on PV (a previous patch made them only
    load in PVonHVM mode).
 
 The rest are just minor fixes in the various drivers and some cleanup in the
 core code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJPfyVUAAoJEFjIrFwIi8fJUjUH/jbY5JavRqSlNELZW2A4Ta76
 8p00LqLHw/C56iHZcWKke8mqtWNb+ZfcQt7ZYcxDIYa4QWBL28x0OLAO2tOBIt37
 ZjYESWSdFJaJvmpADluWtFyGyZ9TYJllDTBm/jWj1ZtKSZvR1YkhuMXCS0f4AmGQ
 xFzSWJZUDdiOAqpN+VQD8wP00gfR8knQLg16XE2fvFdQo4XwpCtqLfHV/5pMMGdy
 Cs/ep6rq/7cdv/nshKOcBnw7RW8l3Xoi/28ht8k3DvAQ2VtFq1Tugv2G9pcCHwQG
 DIBkB3SOU6/v6P5at5+egKS5xR1fJetCWlkMd8kkbcdz2NPI4UDMkvOW6Q8yQls=
 =6Ve+
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull xen fixes from Konrad Rzeszutek Wilk:
 "Two fixes for regressions:
   * one is a workaround that will be removed in v3.5 with proper fix in
     the tip/x86 tree,
   * the other is to fix drivers to load on PV (a previous patch made
     them only load in PVonHVM mode).

  The rest are just minor fixes in the various drivers and some cleanup
  in the core code."

* tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
  xen/pciback: fix XEN_PCI_OP_enable_msix result
  xen/smp: Remove unnecessary call to smp_processor_id()
  xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'
  xen: only check xen_platform_pci_unplug if hvm
2012-04-06 17:54:53 -07:00
Jan Beulich f09d8432e3 xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
The original XenoLinux code has always had things this way, and for
compatibility reasons (in particular with a subsequent pciback
adjustment) upstream Linux should behave the same way (allowing for two
distinct error indications to be returned by the backend).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-06 12:16:02 -04:00
Matthew Garrett c9651e70ad ASPM: Fix pcie devices with non-pcie children
Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
Some other systems using the pata_jmicron driver fail to boot because no
disks are detected.  Passing pcie_aspm=force on the kernel command line
works around it.

The cause: commit 4949be1682 ("PCI: ignore pre-1.1 ASPM quirking when
ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
always return 0 if aspm is disabled, in order to avoid cases where we
changed ASPM state on pre-PCIe 1.1 devices.

This skipped the secondary function of pcie_aspm_sanity_check which was
to avoid us enabling ASPM on devices that had non-PCIe children, causing
trouble later on.  Move the aspm_disabled check so we continue to honour
that scenario.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
          http://bugs.debian.org/665420

Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
[jn: with more symptoms in log message]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-31 12:49:56 -07:00
Linus Torvalds a335750b9a Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI & Power Management changes from Len Brown:
 - ACPI 5.0 after-ripples, ACPICA/Linux divergence cleanup
 - cpuidle evolving, more ARM use
 - thermal sub-system evolving, ditto
 - assorted other PM bits

Fix up conflicts in various cpuidle implementations due to ARM cpuidle
cleanups (ARM at91 self-refresh and cpu idle code rewritten into
"standby" in asm conflicting with the consolidation of cpuidle time
keeping), trivial SH include file context conflict and RCU tracing fixes
in generic code.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (77 commits)
  ACPI throttling: fix endian bug in acpi_read_throttling_status()
  Disable MCP limit exceeded messages from Intel IPS driver
  ACPI video: Don't start video device until its associated input device has been allocated
  ACPI video: Harden video bus adding.
  ACPI: Add support for exposing BGRT data
  ACPI: export acpi_kobj
  ACPI: Fix logic for removing mappings in 'acpi_unmap'
  CPER failed to handle generic error records with multiple sections
  ACPI: Clean redundant codes in scan.c
  ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
  ACPI: consistently use should_use_kmap()
  PNPACPI: Fix device ref leaking in acpi_pnp_match
  ACPI: Fix use-after-free in acpi_map_lsapic
  ACPI: processor_driver: add missing kfree
  ACPI, APEI: Fix incorrect APEI register bit width check and usage
  Update documentation for parameter *notrigger* in einj.txt
  ACPI, APEI, EINJ, new parameter to control trigger action
  ACPI, APEI, EINJ, limit the range of einj_param
  ACPI, APEI, Fix ERST header length check
  cpuidle: power_usage should be declared signed integer
  ...
2012-03-30 16:45:39 -07:00
Lin Ming b24e509885 ACPI, PCI: Move acpi_dev_run_wake() to ACPI core
acpi_dev_run_wake() is a generic function which can be used by
other subsystem too. Rename it to acpi_pm_device_run_wake, to be
consistent with acpi_pm_device_sleep_wake.

Then move it to ACPI core.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 02:21:18 -04:00
Linus Torvalds 475c77edf8 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
Pull PCI changes (including maintainer change) from Jesse Barnes:
 "This pull has some good cleanups from Bjorn and Yinghai, as well as
  some more code from Yinghai to better handle resource re-allocation
  when enabled.

  There's also a new initcall_debug feature from Arjan which will print
  out quirk timing information to help identify slow quirks for fixing
  or refinement (Yinghai sent in a few patches to do just that once the
  new debug code landed).

  Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas.
  He's been a core PCI and Linux contributor for some time now, and has
  kindly volunteered to take over.  I just don't feel I have the time
  for PCI review and work that it deserves lately (I've taken on some
  other projects), and haven't been as responsive lately as I'd like, so
  I approached Bjorn asking if he'd like to manage things.  He's going
  to give it a try, and I'm confident he'll do at least as well as I
  have in keeping the tree managed, patches flowing, and keeping things
  stable."

Fix up some fairly trivial conflicts due to other cleanups (mips device
resource fixup cleanups clashing with list handling cleanup, ppc iseries
removal clashing with pci_probe_only cleanup etc)

* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits)
  PCI: Bjorn gets PCI hotplug too
  PCI: hand PCI maintenance over to Bjorn Helgaas
  unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h
  sparc/PCI: convert devtree and arch-probed bus addresses to resource
  powerpc/PCI: allow reallocation on PA Semi
  powerpc/PCI: convert devtree bus addresses to resource
  powerpc/PCI: compute I/O space bus-to-resource offset consistently
  arm/PCI: don't export pci_flags
  PCI: fix bridge I/O window bus-to-resource conversion
  x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()'
  PCI / PCIe: Introduce command line option to disable ARI
  PCI: make acpihp use __pci_remove_bus_device instead
  PCI: export __pci_remove_bus_device
  PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
  PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
  PCI: print out PCI device info along with duration
  PCI: Move "pci reassigndev resource alignment" out of quirks.c
  PCI: Use class for quirk for usb host controller fixup
  PCI: Use class for quirk for ti816x class fixup
  PCI: Use class for quirk for intel e100 interrupt fixup
  ...
2012-03-23 14:02:12 -07:00
Linus Torvalds d4c6fa73fe Features:
- PV multiconsole support, so that there can be hvc1, hvc2, etc;
  - P-state and C-state power management driver that uploads said
    power management data to the hypervisor. It also inhibits cpufreq
    scaling drivers to load so that only the hypervisor can make power
    management decisions - fixing a weird perf bug.
  - Function Level Reset (FLR) support in the Xen PCI backend.
 Fixes:
  - Kconfig dependencies for Xen PV keyboard and video
  - Compile warnings and constify fixes
  - Change over to use percpu_xxx instead of this_cpu_xxx
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJPZ0qkAAoJEFjIrFwIi8fJjCgH/jeJ39E8ML8DP9tCS2HQnMqM
 uTEjLcqvoJ7sEhHvtBLPeG2p0jyBvOWjLbSc7P8nESBAMPvSYol8L6WqfWrdSU4r
 lHrma2sg9UYzRog5NyxAgkp7bBsBBFOnhVL3Cxb5Ig78cPWzeSWGpqGZ8M/d51Wf
 1iE0tHuU4DpN+fg1SZqPqEm8ecEJ/eSrVTnyTx/Qo2Ak+Zw98SqzX7SV5lo8mudd
 WFL1F2K9FyTNk79ndGhqFt36x6nEbFgMLbmCDWumLuWN6bMd1Uq0wNkCqW4F1h28
 3yqnY+rfQh4y3eXK1B9nttCUTs+/66U5ZWrT6B1IJumGTAIqcWfgeUX/Vn/HVC4=
 =tfMc
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull xen updates from Konrad Rzeszutek Wilk:
 "which has three neat features:

   - PV multiconsole support, so that there can be hvc1, hvc2, etc; This
     can be used in HVM and in PV mode.

   - P-state and C-state power management driver that uploads said power
     management data to the hypervisor.  It also inhibits cpufreq
     scaling drivers to load so that only the hypervisor can make power
     management decisions - fixing a weird perf bug.

     There is one thing in the Kconfig that you won't like: "default y
     if (X86_ACPI_CPUFREQ = y || X86_POWERNOW_K8 = y)" (note, that it
     all depends on CONFIG_XEN which depends on CONFIG_PARAVIRT which by
     default is off).  I've a fix to convert that boolean expression
     into "default m" which I am going to post after the cpufreq git
     pull - as the two patches to make this work depend on a fix in Dave
     Jones's tree.

   - Function Level Reset (FLR) support in the Xen PCI backend.

  Fixes:

   - Kconfig dependencies for Xen PV keyboard and video
   - Compile warnings and constify fixes
   - Change over to use percpu_xxx instead of this_cpu_xxx"

Fix up trivial conflicts in drivers/tty/hvc/hvc_xen.c due to changes to
a removed commit.

* tag 'stable/for-linus-3.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND deps
  xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.
  xen: constify all instances of "struct attribute_group"
  xen/xenbus: ignore console/0
  hvc_xen: introduce HVC_XEN_FRONTEND
  hvc_xen: implement multiconsole support
  hvc_xen: support PV on HVM consoles
  xenbus: don't free other end details too early
  xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
  xen/setup/pm/acpi: Remove the call to boot_option_idle_override.
  xenbus: address compiler warnings
  xen: use this_cpu_xxx replace percpu_xxx funcs
  xen/pciback: Support pci_reset_function, aka FLR or D3 support.
  pci: Introduce __pci_reset_function_locked to be used when holding device_lock.
  xen: Utilize the restore_msi_irqs hook.
2012-03-22 20:16:14 -07:00
Linus Torvalds 3b59bf0816 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking merge from David Miller:
 "1) Move ixgbe driver over to purely page based buffering on receive.
     From Alexander Duyck.

  2) Add receive packet steering support to e1000e, from Bruce Allan.

  3) Convert TCP MD5 support over to RCU, from Eric Dumazet.

  4) Reduce cpu usage in handling out-of-order TCP packets on modern
     systems, also from Eric Dumazet.

  5) Support the IP{,V6}_UNICAST_IF socket options, making the wine
     folks happy, from Erich Hoover.

  6) Support VLAN trunking from guests in hyperv driver, from Haiyang
     Zhang.

  7) Support byte-queue-limtis in r8169, from Igor Maravic.

  8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but
     was never properly implemented, Jiri Benc fixed that.

  9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang.

  10) Support kernel side dump filtering by ctmark in netfilter
      ctnetlink, from Pablo Neira Ayuso.

  11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker.

  12) Add new peek socket options to assist with socket migration, from
      Pavel Emelyanov.

  13) Add sch_plug packet scheduler whose queue is controlled by
      userland daemons using explicit freeze and release commands.  From
      Shriram Rajagopalan.

  14) Fix FCOE checksum offload handling on transmit, from Yi Zou."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits)
  Fix pppol2tp getsockname()
  Remove printk from rds_sendmsg
  ipv6: fix incorrent ipv6 ipsec packet fragment
  cpsw: Hook up default ndo_change_mtu.
  net: qmi_wwan: fix build error due to cdc-wdm dependecy
  netdev: driver: ethernet: Add TI CPSW driver
  netdev: driver: ethernet: add cpsw address lookup engine support
  phy: add am79c874 PHY support
  mlx4_core: fix race on comm channel
  bonding: send igmp report for its master
  fs_enet: Add MPC5125 FEC support and PHY interface selection
  net: bpf_jit: fix BPF_S_LDX_B_MSH compilation
  net: update the usage of CHECKSUM_UNNECESSARY
  fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx
  net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso
  ixgbe: Fix issues with SR-IOV loopback when flow control is disabled
  net/hyperv: Fix the code handling tx busy
  ixgbe: fix namespace issues when FCoE/DCB is not enabled
  rtlwifi: Remove unused ETH_ADDR_LEN defines
  igbvf: Use ETH_ALEN
  ...

Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and
drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
2012-03-20 21:04:47 -07:00
Linus Torvalds 4a52246302 driver core merge for 3.4-rc1
Here's the big driver core merge for 3.4-rc1.
 
 Lots of various things here, sysfs fixes/tweaks (with the nlink breakage
 reverted), dynamic debugging updates, w1 drivers, hyperv driver updates,
 and a variety of other bits and pieces, full information in the
 shortlog.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iEYEABECAAYFAk9neCsACgkQMUfUDdst+ylyQwCfY2eizvzw5HhjQs8gOiBRDADe
 yrgAnj1Zan2QkoCnQIFJNAoxqNX9yAhd
 =biH6
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches for 3.4-rc1 from Greg KH:
 "Here's the big driver core merge for 3.4-rc1.

  Lots of various things here, sysfs fixes/tweaks (with the nlink
  breakage reverted), dynamic debugging updates, w1 drivers, hyperv
  driver updates, and a variety of other bits and pieces, full
  information in the shortlog."

* tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (78 commits)
  Tools: hv: Support enumeration from all the pools
  Tools: hv: Fully support the new KVP verbs in the user level daemon
  Drivers: hv: Support the newly introduced KVP messages in the driver
  Drivers: hv: Add new message types to enhance KVP
  regulator: Support driver probe deferral
  Revert "sysfs: Kill nlink counting."
  uevent: send events in correct order according to seqnum (v3)
  driver core: minor comment formatting cleanups
  driver core: move the deferred probe pointer into the private area
  drivercore: Add driver probe deferral mechanism
  DS2781 Maxim Stand-Alone Fuel Gauge battery and w1 slave drivers
  w1_bq27000: Only one thread can access the bq27000 at a time.
  w1_bq27000 - remove w1_bq27000_write
  w1_bq27000: remove unnecessary NULL test.
  sysfs: Fix memory leak in sysfs_sd_setsecdata().
  intel_idle: Revert change of auto_demotion_disable_flags for Nehalem
  w1: Fix w1_bq27000
  driver-core: documentation: fix up Greg's email address
  powernow-k6: Really enable auto-loading
  powernow-k7: Fix CPU family number
  ...
2012-03-20 11:16:20 -07:00
Bjorn Helgaas cf48fb6a2b PCI: fix bridge I/O window bus-to-resource conversion
In 5bfa14ed9f, I forgot to initialize res2.flags before calling
pcibios_bus_to_resource(), which depends on the resource type to locate the
correct aperture.  This bug won't hurt x86, which currently never has an
offset between bus and CPU addresses, but will affect other architectures.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-03-20 10:41:27 -07:00
David S. Miller 4da0bd7365 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-03-18 23:29:41 -04:00
Greg Kroah-Hartman 263a5c8e16 Merge 3.3-rc6 into driver-core-next
This was done to resolve a conflict in the drivers/base/cpu.c file.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-09 12:35:53 -08:00
Matthew Garrett 4949be1682 PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled
Right now we won't touch ASPM state if ASPM is disabled, except in the case
where we find a device that appears to be too old to reliably support ASPM.
Right now we'll clear it in that case, which is almost certainly the wrong
thing to do. The easiest way around this is just to disable the blacklisting
when ASPM is disabled.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-03-07 20:26:47 -08:00
Matt Carlson 0b47150671 tg3: Recode PCI MRRS adjustment as a PCI quirk
This patch recodes the MRRS cap for 5719 A0 devices as a PCI quirk.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-04 20:54:01 -05:00
Rafael J. Wysocki 6748dcc269 PCI / PCIe: Introduce command line option to disable ARI
There are PCIe devices on the market that report ARI support but
then fail to initialize correctly when ARI is actually used.  This
leads to situations in which kernels 2.6.34 and newer fail to handle
systems where the previous kernels worked without any apparent
problems.  Unfortunately, it is currently unknown how many such
devices are there.

For this reason, introduce a new kernel command line option,
pci=noari, allowing users to disable PCIe ARI altogether if they
see problems with PCIe device initialization.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-03-01 13:36:04 -08:00
Yinghai Lu f6330c3178 PCI: make acpihp use __pci_remove_bus_device instead
pci_stop_bus_device gets called before in the same loop.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-27 12:17:16 -08:00
Yinghai Lu 6b22cf3f35 PCI: export __pci_remove_bus_device
Don't switch to pci_remove_bus_device yet, keep the __ prefix for now
(the behavior is still the same: remove without stopping first).

This allows other out of tree users or pending patches to get notified
from compiler warning.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-27 12:16:55 -08:00
Yinghai Lu 6754b9e9c3 PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
The old pci_remove_behind_bridge actually do stop and remove.

Make the name reflect that to reduce confusion.

Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-27 12:14:55 -08:00
Yinghai Lu 210647af89 PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
The old pci_remove_bus_device actually did stop and remove.

Make the name reflect that to reduce confusion.

This patch is done by sed scripts and changes back some incorrect
__pci_remove_bus_device changes.

Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-27 12:12:18 -08:00
Yinghai Lu 3cf8b64380 PCI: print out PCI device info along with duration
Makes it a little easier to figure out which device may have caused a
slow quirk.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:38:55 -08:00
Yinghai Lu 2069ecfbe1 PCI: Move "pci reassigndev resource alignment" out of quirks.c
This isn't really a quirk; calling it directly from pci_add_device makes
more sense.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:37:26 -08:00
Yinghai Lu 40c96236bd PCI: Use class for quirk for ti816x class fixup
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:35:16 -08:00
Yinghai Lu 4c5b28e26d PCI: Use class for quirk for intel e100 interrupt fixup
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:35:15 -08:00
Yinghai Lu 08803efe84 PCI: Use class for quirk for netmos class fixup
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:35:14 -08:00
Yinghai Lu faa738bba5 PCI: Use class for quirk for legacy ATA NO_D3
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:35:14 -08:00
Yinghai Lu ae9de56bdd PCI: Use class for quirk for cardbus_legacy
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:35:13 -08:00
Yinghai Lu 52d21b5ef4 PCI: Use class for quirk for host bridge mmio_always_on
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:34:48 -08:00
Yinghai Lu f4ca5c6a56 PCI: Add class support in quirk handling
Recently added support to allow quirks to report duration also make the
boot log very crowded when initcall_debug is specified.

One thing we can to do mitigate this is to not call quirks unnecessarily
by adding a new quirk declaration macro that takes a class argument.

The new macro takes a class value and a class shift value (since it can
vary) so that quirks will be limited to certain device classes, greatly
reducing the number we call on every PCI device addition.

-v2: fix v1 that left over of sparated patch.
-v3: according to Jesse, change cls to class, cls_shift, to class_shift.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 14:34:40 -08:00
Jesse Barnes ecd58d667a Merge branch 'pci-next+probe_only+bus2res-fb127cb' of git://github.com/bjorn-helgaas/linux into linux-next 2012-02-24 14:25:33 -08:00
Yinghai Lu b07f2ebc10 PCI: add a PCI resource reallocation config option
Add a new config option, PCI_REALLOC_ENABLE_AUTO, which will
automatically try to re-allocate PCI resources if PCI_IOV support is
enabled and the SR-IOV resources are unassigned.  Behavior can still be
controlled using the pci=realloc= parameter.

-v2: According to Jesse, adding one CONFIG option for distribution to
     disable it or enable it.
-v3: update Kconfig text (jbarnes)

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 09:38:59 -08:00
Yinghai Lu eb572e7c76 PCI: print out suggestion about using pci=realloc
let user know they could try if pci=realloc could help.

-v2: update suggestion text.

Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 08:47:53 -08:00
Yinghai Lu b55438fdd5 PCI: prepare pci=realloc for multiple options
Let the user could enable and disable with pci=realloc=on or pci=realloc=off

Also
1. move variable and functions near the place they are used.
2. change macro to function
3. change related functions and variable to static and _init
4. update parameter description accordingly.

This will let us add a config option to control default behavior, and
still allow the user to turn off automatic reallocation if it fails on
their platform until a permanent solution is found.

-v2: still honor pci=realloc, and treat it as pci=realloc=on
     also use enum instead of ...
-v3: update kernel-paramenters.txt according to Jesse.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 08:47:42 -08:00
Yinghai Lu 0c5be0cb0e PCI: Retry on IORESOURCE_IO type allocations
When enabling pci reallocation for a pci bridge, we clear the small size
in in bridge and re-assign with requested + optional size for first
several tries, but Ram mention could have problem with one case:
	https://bugzilla.kernel.org/show_bug.cgi?id=15960

After checking the booting log in
	https://lkml.org/lkml/2010/4/19/44
	[regression, bisected] Xonar DX invalid PCI I/O range since 977d17bb17

We should not stop too early for io ports.
	Apr 19 10:19:38 [kernel] pci 0000:04:00.0: BAR 7: can't assign io (size 0x4000)
	Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 8: assigned [mem 0x80400000-0x805fffff]
	Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 7: can't assign io (size 0x2000)
	Apr 19 10:19:38 [kernel] pci 0000:05:02.0: BAR 7: can't assign io (size 0x1000)
	Apr 19 10:19:38 [kernel] pci 0000:05:03.0: BAR 7: can't assign io (size 0x1000)
	Apr 19 10:19:38 [kernel] pci 0000:08:00.0: BAR 7: can't assign io (size 0x1000)
	Apr 19 10:19:38 [kernel] pci 0000:09:04.0: BAR 0: can't assign io (size 0x100)
and clear 00:1c.0 to retry again.

This patch removes IORESOUCE_IO checking, and tries one more time.  It
gives us a chance to get an allocation for the 00:1c.0 io port range
because the range from 0x4000 to 0x8000 will be freed and we can use it.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-24 08:44:38 -08:00
Bjorn Helgaas fb127cb9de PCI: collapse pcibios_resource_to_bus
Everybody uses the generic pcibios_resource_to_bus() supplied by the core
now, so remove the ARCH_HAS_GENERIC_PCI_OFFSETS used during conversion.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:19:04 -07:00
Bjorn Helgaas 36a66cd6fd PCI: add generic pcibios_resource_to_bus()
This replaces the generic versions of pcibios_resource_to_bus() and
pcibios_bus_to_resource() in asm-generic/pci.h with versions that use
pci_resource_to_bus() and pci_bus_to_resource().

The replacements are equivalent except that they can apply host
bridge window offsets when the arch has supplied them by using
pci_add_resource_offset().

Each arch can convert to using pci_add_resource_offset() individually by
removing its device resource fixups from pcibios_fixup_bus() and supplying
ARCH_HAS_GENERIC_PCI_OFFSETS.  ARCH_HAS_GENERIC_PCI_OFFSETS can be removed
after all have converted.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:19:00 -07:00
Bjorn Helgaas 5bfa14ed9f PCI: convert bus addresses to resource when reading BARs
Some PCI host bridges translate CPU addresses to PCI bus addresses.
Previously, we initialized pci_dev resources with PCI bus addresses,
then converted them to CPU addresses later in arch-specific code
(pcibios_fixup_resources()), which leaves a window of time where the
pci_dev resources are incorrect.

This patch adds support in the core for this address translation.
When the arch creates the root bus, it can supply the host bridge
address translation information, and the core can use it to set the
pci_dev resources correctly from the beginning.

This gives us a way to fix the problem that quirks that run between device
discovery and pcibios_fixup_resources() fail because they use pci_dev
resources that haven't been converted.  The reference below is to one
such problem that affected ARM and ia64.

Note that this patch has no effect until an arch starts using
pci_add_resource_offset() with a non-zero offset: before that, all
all host bridge windows have a zero offset and pci_bus_to_resource()
copies the pci_bus_region directly to the struct resource.

Reference: https://lkml.org/lkml/2009/10/12/405
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:19:00 -07:00
Bjorn Helgaas 0efd5aab41 PCI: add struct pci_host_bridge_window with CPU/bus address offset
Some PCI host bridges apply an address offset, so bus addresses on PCI are
different from CPU addresses.  This patch adds a way for architectures to
tell the PCI core about this offset.  For example:

    LIST_HEAD(resources);
    pci_add_resource_offset(&resources, host->io_space, host->io_offset);
    pci_add_resource_offset(&resources, host->mem_space, host->mem_offset);
    pci_scan_root_bus(parent, bus, ops, sysdata, &resources);

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:19:00 -07:00
Bjorn Helgaas 5a21d70dbd PCI: add struct pci_host_bridge and a list of all bridges found
This adds a list of all PCI host bridges we find and a way to look up
the host bridge from a pci_dev.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:18:59 -07:00
Bjorn Helgaas a5390aa6dc PCI: don't publish new root bus until it's fully initialized
When pci_create_root_bus() adds the new struct pci_bus to the global
pci_root_buses list, the bus becomes visible to other parts of the
kernel, so it should be fully initialized.

This patch delays adding the bus to the pci_root_buses list until after
all the struct pci_bus initialization is finished.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:18:59 -07:00
Bjorn Helgaas 844393f4c5 PCI: make pci_flags non-weak
No architecture defines its own pci_flags, so the core symbol does not
need to be weak.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:18:59 -07:00
Bjorn Helgaas 47087700ce PCI: make pci_flags always available
If we move resource assignment functions into the core, we'll still
need a way for architectures to prevent reassignment, e.g., the
"pci_probe_only" functionality, and we'll need a generic, always
available way the core can test for that.  The "pci_flags"
arrangement used by several architectures seems like a convenient
way to do this.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-02-23 20:18:55 -07:00
MUNEDA Takahiro 7570a333d8 PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp driver
Add a parameter to avoid using MSI/MSI-X for PCIe native hotplug; it's
known to be buggy on some platforms.

In my environment, while shutting down, following stack trace is shown
sometimes.

  irq 16: nobody cared (try booting with the "irqpoll" option)
  Pid: 1081, comm: reboot Not tainted 3.2.0 #1
  Call Trace:
   <IRQ>  [<ffffffff810cec1d>] __report_bad_irq+0x3d/0xe0
   [<ffffffff810cee1c>] note_interrupt+0x15c/0x210
   [<ffffffff810cc485>] handle_irq_event_percpu+0xb5/0x210
   [<ffffffff810cc621>] handle_irq_event+0x41/0x70
   [<ffffffff810cf675>] handle_fasteoi_irq+0x55/0xc0
   [<ffffffff81015356>] handle_irq+0x46/0xb0
   [<ffffffff814fbe9d>] do_IRQ+0x5d/0xe0
   [<ffffffff814f146e>] common_interrupt+0x6e/0x6e
   [<ffffffff8106b040>] ? __do_softirq+0x60/0x210
   [<ffffffff8108aeb1>] ? hrtimer_interrupt+0x151/0x240
   [<ffffffff814fb5ec>] call_softirq+0x1c/0x30
   [<ffffffff810152d5>] do_softirq+0x65/0xa0
   [<ffffffff8106ae9d>] irq_exit+0xbd/0xe0
   [<ffffffff814fbf8e>] smp_apic_timer_interrupt+0x6e/0x99
   [<ffffffff814f9e5e>] apic_timer_interrupt+0x6e/0x80
   <EOI>  [<ffffffff814f0fb1>] ? _raw_spin_unlock_irqrestore+0x11/0x20
   [<ffffffff812629fc>] pci_bus_write_config_word+0x6c/0x80
   [<ffffffff81266fc2>] pci_intx+0x52/0xa0
   [<ffffffff8127de3d>] pci_intx_for_msi+0x1d/0x30
  [<ffffffff8127e4fb>] pci_msi_shutdown+0x7b/0x110
   [<ffffffff81269d34>] pci_device_shutdown+0x34/0x50
   [<ffffffff81326c4f>] device_shutdown+0x2f/0x140
   [<ffffffff8107b981>] kernel_restart_prepare+0x31/0x40
   [<ffffffff8107b9e6>] kernel_restart+0x16/0x60
   [<ffffffff8107bbfd>] sys_reboot+0x1ad/0x220
   [<ffffffff814f4b90>] ? do_page_fault+0x1e0/0x460
   [<ffffffff811942d0>] ? __sync_filesystem+0x90/0x90
   [<ffffffff8105c9aa>] ? __cond_resched+0x2a/0x40
   [<ffffffff814ef090>] ? _cond_resched+0x30/0x40
   [<ffffffff81169e17>] ? iterate_supers+0xb7/0xd0
   [<ffffffff814f9382>] system_call_fastpath+0x16/0x1b
  handlers:
  [<ffffffff8138a0f0>] usb_hcd_irq
  [<ffffffff8138a0f0>] usb_hcd_irq
  [<ffffffff8138a0f0>] usb_hcd_irq
  Disabling IRQ #16

An un-wanted interrupt is generated when PCI driver switches from
MSI/MSI-X to INTx while shutting down the device.  The interrupt does
not happen if MSI/MSI-X is not used on the device.
I confirmed that this problem does not happen if pcie_hp=nomsi was
specified and hotplug operation worked fine as usual.

v2: Automatically disable MSI/MSI-X against following device:
    PCI bridge: Integrated Device Technology, Inc. Device 807f (rev 02)
v3: Based on the review comment, combile the if statements.
v4: Removed module parameter.
    Move some code to build pciehp as a module.
    Move device specific code to driver/pci/quirks.c.
v5: Drop a device specific code until getting a vendor statement.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 12:29:35 -08:00
Yinghai Lu 34a4876e30 PCI: move pci_find_saved_cap out of linux/pci.h
Only one user in driver/pci/pci.c, so we don't need to put it in global
pci.h

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 12:27:11 -08:00
Yinghai Lu f796841e49 PCI: fix memleak for pci dev removing during hotplug
unreferenced object 0xffff880276d17700 (size 64):
  comm "swapper/0", pid 1, jiffies 4294897182 (age 3976.028s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 18 f9 de 76 02 88 ff ff  ...........v....
    10 00 00 00 0e 00 00 00 0f 28 40 00 00 00 00 00  .........(@.....
  backtrace:
    [<ffffffff81c8aede>] kmemleak_alloc+0x26/0x43
    [<ffffffff811385f0>] __kmalloc+0x121/0x183
    [<ffffffff813cf821>] pci_add_cap_save_buffer+0x35/0x7c
    [<ffffffff813d12b7>] pci_allocate_cap_save_buffers+0x1d/0x65
    [<ffffffff813cdb52>] pci_device_add+0x92/0xf1
    [<ffffffff81c8afe6>] pci_scan_single_device+0x9f/0xa1
    [<ffffffff813cdbd2>] pci_scan_slot.part.20+0x21/0x106
    [<ffffffff813cdce2>] pci_scan_slot+0x2b/0x35
    [<ffffffff81c8dae4>] __pci_scan_child_bus+0x51/0x107
    [<ffffffff81c8d75b>] pci_scan_bridge+0x376/0x6ae
    [<ffffffff81c8db60>] __pci_scan_child_bus+0xcd/0x107
    [<ffffffff81c8dbab>] pci_scan_child_bus+0x11/0x2a
    [<ffffffff81cca58c>] pci_acpi_scan_root+0x18b/0x21c
    [<ffffffff81c916be>] acpi_pci_root_add+0x1e1/0x42a
    [<ffffffff81406210>] acpi_device_probe+0x50/0x190
    [<ffffffff814a0227>] really_probe+0x99/0x126

Need to free saved_buffer for capabilities.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 12:08:53 -08:00
Yinghai Lu 2dd8ba921d PCI: Fix device class print out
Found debug print of class is shifted.

| pci 0000:f8:15.2: [8086:2b56] type 0 class 0x000600

Code is trying to print class with 6 digits, but use shifted class with
4 digits valid value as variable.

Change to original dev->class directly.

Also remove not needed calculating of local variable class, because it
will be updated after pci_fixup_device(pci_fixup_early...)

Also unify type print out when class and header is not matched.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 12:05:59 -08:00
Yinghai Lu 3796f1e2ca PCI: Skip cardbus assigned resource reset during pci bus rescan
Otherwise when rescan is used for cardbus, assigned resources will get
cleared.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 12:00:04 -08:00
Yinghai Lu 1184893439 PCI: Fix "cardbus bridge resources as optional" size handling
We should not set the requested size to -2; that will confuse the
resource list sorting with align when SIZEALIGN is used.

Change to STARTALIGN and pass align from start;  we are safe to do that
just as we do that regular pci bridge.  In the long run, we should just
treat cardbus like a regular pci bridge.

Also fix the case when realloc_head is not passed: we should keep the
requested size.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 11:59:56 -08:00
Yinghai Lu dcef0d06b3 PCI: Disable cardbus bridge MEM1 prefetchable bit
Some BIOSes enable prefetch on both MEM0 and MEM1.  But the cardbus code
assumes MEM1 is non-pref...

Discussion could be found at:
	https://lkml.org/lkml/2012/1/12/1
	https://bugzilla.kernel.org/show_bug.cgi?id=41622#c23

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-23 11:59:45 -08:00
Linus Torvalds 7bcd5b4671 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
One regression fix for SR-IOV on PPC and a couple of misc fixes from
Yinghai.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
  PCI: Fix pci cardbus removal
  PCI: set pci sriov page size before reading SRIOV BAR
  PCI: workaround hard-wired bus number V2
2012-02-18 15:26:11 -08:00
Rafael J. Wysocki 5b415f1e79 PCI / PM: Disable wakeup during shutdown for devices not enabled to wake up
If a PCI device is enabled to generate wakeup signals (PME) when put
into a low-power state by runtime PM, it will be still enabled to
generate those signals after the system shutdown, unless its driver's
.shutdown() callback takes care of the wakeup signals generation
setting.  Moreover, there are devices that are not enabled to wake
up the system and that are configured by runtime PM to generate
wakeup signals so that (runtime) remote wakeup works with them.
Those devices should be reconfigured during system shutdown so that
they don't generate wakeup signals, but at least some drivers don't
do that.  However, that very well may be done by the PCI core so
that drivers don't have to worry about it.  For this reason, modify
pci_device_shutdown() to disable the generation of wakeup events for
devices not supposed to wake up the system.

References: https://bugzilla.kernel.org/show_bug.cgi?id=37952
Reported-and-tested-by: Kamil Iskra <kamil.54002@iskra.name>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-17 09:22:04 -08:00
Yinghai Lu 09cedbef44 PCI: Fix /sys warning when sriov enabled and card is hot removed
sysfs is a bit stricter now and emits warnings in more cases.

For SRIOV hotplug, we are calling pci_stop_dev() for each VF first
(after we update pci_stop_bus_devices) which remove each VF subdir.  So
double check the VF dir in /sys before trying to remove the physfn link.

Signed-of-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-17 09:22:03 -08:00
Matthew Garrett ad71c96213 PCI: pcie: Add support for setting default ASPM policy
Distributions may wish to provide different defaults for PCIE ASPM
depending on their target audience. Provide a configuration option for
choosing the default policy.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-17 09:22:03 -08:00
Linus Torvalds 694ce18ec3 Two fixes for VCPU offlining; One to fix the string format exposed
by the xen-pci[front|back] to conform to the one used in majority of
 PCI drivers; Two fixes to make the code more resilient to invalid
 configurations.
 
 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJPOeReAAoJEFjIrFwIi8fJn9QIANP48kzrGg0uO4bjSf2h/z7G
 pp3ISdtVLk7pwMov2POBqskoXSq8E0yQAfNN8se183wqNXo3Dm4rU1DIG7HQFBk9
 sdcyfHI8x7pat9JClRhGxpQ23Ig9f1iWkShweCcZCO782vfxZyNd65i6t87X7uLq
 7SPtG1XH2RixTX7tHtKKBqdzZ0OMXOEkJ33dgCmyrn+wzohbKrFj5mg+NdOgmzEo
 VgsHPVtuq7orDROe+F9d91eAg0TILQ13th8xfWZ59lQATXu/zAlaueYt87tpy1pb
 oVQvumsn8Xev+7hct9My9Tw45D4m8YOSFLG2HcekkW2WtNmGhTTbIyMh9PsLugk=
 =NDYK
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Two fixes for VCPU offlining; One to fix the string format exposed
by the xen-pci[front|back] to conform to the one used in majority of
PCI drivers; Two fixes to make the code more resilient to invalid
configurations.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

* tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus_dev: add missing error check to watch handling
  xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.
  xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
  xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.
  xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
2012-02-14 15:20:11 -08:00
Thomas Jarosch f67fd55fa9 PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge GPUs
Some BIOS implementations leave the Intel GPU interrupts enabled,
even though no one is handling them (f.e. i915 driver is never loaded).
Additionally the interrupt destination is not set up properly
and the interrupt ends up -somewhere-.

These spurious interrupts are "sticky" and the kernel disables
the (shared) interrupt line after 100.000+ generated interrupts.

Fix it by disabling the still enabled interrupts.
This resolves crashes often seen on monitor unplug.

Tested on the following boards:
- Intel DH61CR: Affected
- Intel DH67BL: Affected
- Intel S1200KP server board: Affected
- Asus P8H61-M LE: Affected, but system does not crash.
  Probably the IRQ ends up somewhere unnoticed.

According to reports on the net, the Intel DH61WW board is also affected.

Many thanks to Jesse Barnes from Intel for helping
with the register configuration and to Intel in general
for providing public hardware documentation.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Charlie Suffin <charlie.suffin@stratus.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:26 -08:00
Arjan van de Ven 3209874a1d PCI: Annotate PCI quirks in initcall_debug style
While diagnosing some boot time issues on a platform, all that I
could see in the bootgraph/dmesg was that the system was spending
a lot of time in applying one or more PCI quirks... which
was virtually undebuggable.

This patch adds printk's in "initcall_debug" style to the dmesg,
which are added when the user asks for the initcall_debug
(the nr one tool to use when debugging boot hangs or boot time issues)
kernel command line option.

v2: add #includes so quirks can build on non-x86

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:04 -08:00
Danny Kukawka 309c665110 PCI hotplug: cpcihp: fix debug module parameter to be bool
Fix debug variable from module parameter to be really bool to
fix 'warning: return from incompatible pointer type'.

Acked-by: Scott Murray <scott@spiteful.org>
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:03 -08:00
Kay, Allen M 26f41062f2 PCI: check for pci bar restore completion and retry
On some OEM systems, pci_restore_state() is called while FLR has not yet
completed.  As a result, PCI BAR register restore is not successful.  This fix
reads back the restored value and compares it with saved value and re-tries 10
times before giving up.

Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:02 -08:00
Yinghai Lu 2debd92899 PCI: pciehp: Disable/enable link during slot power off/on
On a system with a repeater on the system board to support gen2 hotplug,
we found that when an ExpressModule is removed from some slots,
/var/log/messages will be full of "card present/not present" warnings.

It turns out the root complex is continually trying to train the link to
the repeater because the repeater has not been reset.

This patch will disable the link at removal time to allow the repeater
to be reset properly.  This also prevents a potential AER message at
removal time.

Also, when testing hotplug on a system under development, we found if we
boot the system without an EM installed, and later hot-add an EM, it
does not work with Linux, but another OS is ok.  The root cause is that
BIOS left link disabled when slot was empty at boot time, and other OS
is modifying the link disable bit in link ctrl during power on/off.

So we should do the same thing to disable/enable link during power off/on.

-v2: check link DLLA bit instead of 100ms waiting.
     Separate link disable/enable functions to another patch.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:02 -08:00
Yinghai Lu 7f822999e1 PCI: pciehp: Add Disable/enable link functions
Will use it during power off/on of slots

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:01 -08:00
Yinghai Lu bffe4f72fc PCI: pciehp: Add pcie_wait_link_not_active()
Will use it for link disable status checking.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:01 -08:00
Yinghai Lu 4e2ce405b2 PCI: pciehp: make check_link_active more helpful
A few changes:
  - remove the 'inline' and let the complier decide
  - return a bool to indicate whether the link was active
  - add a debug message to indicate link state when it beocmes active

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:00 -08:00
Yinghai Lu 2f5d8e4ff9 PCI: pciehp: replace unconditional sleep with config space access check
During reviewing
|	PCI: pciehp: wait 1000 ms before Link Training check
Linus said:
>...
> That's a *long* time, and it's irritating to the user. It makes the
> user think "the machine is slow".
>...
> And quite frankly, an unconditional one-second delay here seems bad.
>Two seconds was unacceptable, one second is just bad.

Try to access the pci conf of a pci device that is supposed to show up
in 1s.  If we can read back a valid vendor/device id, we can return
early.

Related discussion could be found:
	https://lkml.org/lkml/2011/12/6/339

-v2: seperate code to pci_bus_read_dev_vendor_id() from pci_scan_device()
    and reuse it from pciehp code. Suggested by Matthew Wilcox.
-v3: According to Kenj, don't use array in stack, and don't wait too long
    for crs, also return fail status if not found.
    Also separate pci_bus_dev_read_vendor_id() change to another patch.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:45:00 -08:00
Yinghai Lu efdc87dab1 PCI: Separate pci_bus_read_dev_vendor_id from pci_scan_device
We can reuse it for pciehp probing.

-v2: according to Kenji, fix crs timeout checking, and export the function
     for later use when pciehp is compiled as a module.

Suggested-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:59 -08:00
Yinghai Lu ac205b7bb7 PCI: make sriov work with hotplug remove
When hot removing a pci express module that has a pcie switch and supports
SRIOV, we got:

[ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
[ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
[ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
[ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
[ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
[ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
[ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain🚌device=0000:b0:00
[ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
[ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain🚌dev = 0000:b0:00
[ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
[ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
[ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
[ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
[ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
[ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
[ 5926.133377] PGD 0
[ 5926.135402] Oops: 0000 [#1] SMP
[ 5926.138659] CPU 2
[ 5926.140499] Modules linked in:
...
[ 5926.143754]
[ 5926.275823] Call Trace:
[ 5926.278267]  [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
[ 5926.284180]  [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
[ 5926.290264]  [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
[ 5926.296866]  [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
[ 5926.303123]  [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
[ 5926.309206]  [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
...

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]--
 |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
 |           |                               |            +-00.1 Device
 |           |                               |            +-00.2 Device
 |           |                               |            \-00.3 Device
 |           |                               \-03.0-[b3]--+-00.0 Device
 |           |                                            +-00.1 Device
 |           |                                            +-00.2 Device
 |           |                                            \-00.3 Device

root complex: 80:02.2
pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
end devices  are b2:00.0 and b3.00.0.
VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3

Root cause: when doing pci_stop_bus_device() with phys fn, it will stop
virt fn and remove the fn, so
	list_for_each_safe(l, n, &bus->devices)
will have problem to refer freed n that is pointed to vf entry.

Solution is just replacing list_for_each_safe() with
list_for_each_prev_safe().  This will make sure we can get valid n pointer
to PF instead of the freed VF pointer (because newly added devices are
inserted to the bus->devices list tail).

During reviewing the patch, Bjorn said:
|   The PCI hot-remove path calls pci_stop_bus_devices() via
|   pci_remove_bus_device().
|
|   pci_stop_bus_devices() traverses the bus->devices list (point A below),
|   stopping each device in turn, which calls the driver remove() method.  When
|   the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
|   also uses pci_remove_bus_device() to remove the VF devices from the
|   bus->devices list (point B).
|
|       pci_remove_bus_device
|         pci_stop_bus_device
|           pci_stop_bus_devices(subordinate)
|             list_for_each(bus->devices)             <-- A
|               pci_stop_bus_device(PF)
|                 ...
|                   driver->remove
|                     pci_disable_sriov
|                       ...
|                         pci_remove_bus_device(VF)
|                             <remove from bus_list>  <-- B
|
|   At B, we're changing the same list we're iterating through at A, so when
|   the driver remove() method returns, the pci_stop_bus_devices() iterator has
|   a pointer to a list entry that has already been freed.

Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141
				 https://lkml.org/lkml/2012/1/23/360

-v5: According to Linus to make remove more robust, Change to
     list_for_each_prev_safe instead. That is more reasonable, because
     those devices are added to tail of the list before.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:59 -08:00
Yinghai Lu 67cc7e26a5 PCI: remove add_to_failed_list()
Only one user; just use add_to_list instead.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:58 -08:00
Yinghai Lu b592443d90 PCI: add debug print out for add_size
For use in debugging resource reallocation.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:58 -08:00
Yinghai Lu bffc56d411 PCI: make free_list() into a function
After merging struct pci_dev_resource_x and pci_dev_resource,
We can use a function instead of macro now.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:57 -08:00
Yinghai Lu b9b0bba96c PCI: Rename dev_res_x to add_res or fail_res
Linus says don't use dev_res_x because it doesn't communicate anything
about usage.  Rename them to add_res or fail_res etc according to
context.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:57 -08:00
Yinghai Lu 764242a0ae PCI: Merge pci_dev_resource_x and pci_dev_resource
pci_dev_resource_x is a superset of pci_dev_resource and they're just
temp structs used during resource reallocation.

pci_dev_resource usage is quite limted.

So just use pci_dev_resource_x, and rename it as new pci_dev_resource.

-v2: According to Linus, Separate free_list change to another patch

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:56 -08:00
Yinghai Lu bdc4abecae PCI: Replace resource_list with generic list
So we can use helper functions for generic list.  This makes the
resource re-allocation code much more readable.

-v2: Use list_add_tail instead of adding list_insert_before, Pointed out
     by Linus.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:55 -08:00
Yinghai Lu 2934a0de09 PCI: Move struct resource_list to setup-bus.c
No user outside of setup-bus.c now.  Later patches will convert
resource_list to a regular list.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:55 -08:00
Yinghai Lu 78c3b329b9 PCI: Move pdev_sort_resources() to setup-bus.c
This allows us to move the definition of struct resource_list to
setup_bus.c and later convert resource_list to a regular list.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:54 -08:00
Yinghai Lu 19aa7ee432 PCI: make re-allocation try harder by reassigning ranges higher in the heirarchy
On a system with devices that support SRIOV connected to a pcie switch
to pcie root port:

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a
 |           |                               \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a
 |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a
 |           |                               \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a

When the BIOS does not assign resources for SRIOV BARs, kernel pci
reallocation only goes up one bridge and then gives up, failing to to
get resources for all sSRIOV BARs, even though the range is large enough
in the peer root bus.

Specifically, only the bridge at the a1:02.0 level has its resources
cleared and reallocated.  The kernel does not go up to clear the bridge
at the 80:02.0 level.

To make it go to upper levels, during retry, we need to treat "good to have"
resources as "must have".

Only on the last try will we treat good to have resources as optional.
At that time, parent bridge resources will already have been released so
we'll have a chance to get everything assigned with must_have plus
good_to_have for all child devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:54 -08:00
Yinghai Lu 9b03088f95 PCI: Make pci_rescan_bus handle add_list
This allows us to allocate resources to hotplug bridges during
remove/rescan.

We need to move the function to setup-bus.c so it can use
__pci_bus_size_bridges and __pci_bus_assign_resources directly to take
the add_list resource tracking list.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:53 -08:00
Yinghai Lu 2f320521a0 PCI: Make rescan bus increase bridge resource size if needed
Current rescan will not touch bridge MMIO and IO.

Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.

Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:53 -08:00
Yinghai Lu 8424d7592e PCI: Use add_list in pcie hotplug path.
We need add size for hot plug path when pluging in hotplug chassis
without cards.

-v2: change descriptions. make it applicable after "pci: Check bridge
     resources after resource allocation."

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:52 -08:00
Yinghai Lu 3e6e0d8094 PCI: try to assign required+option size first
We found reassignment can not find a range for one resource, even if the
total available range is large enough.

bridge b1:02.0 will need 2M+3M
bridge b1:03.0 will need 2M+3M

so bridge b0:00.0 will get assigned: 4M : [f8000000-f83fffff]
   later is reassigned to 10M : [f8000000-f9ffffff]

b1:02.0 is assigned to 2M : [f8000000-f81fffff]
b1:03.0 is assigned to 2M : [f8200000-f83fffff]

After that b1:03.0 get chance to be reassigned to [f8200000-f86fffff],
but b1:02.0 will not have chance to expand, because b1:03.0 is using in
middle one.

[  187.911401] pci 0000:b1:02.0: bridge window [mem 0x00100000-0x002fffff] to [bus b2-b2] add_size 300000
[  187.920764] pci 0000:b1:03.0: bridge window [mem 0x00100000-0x002fffff] to [bus b3-b3] add_size 300000
[  187.930129] pci 0000:b1:02.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
[  187.938500] pci 0000:b1:03.0: [mem 0x00100000-0x002fffff] get_res_add_size  add_size 300000
[  187.946857] pci 0000:b0:00.0: bridge window [mem 0x00100000-0x004fffff] to [bus b1-b3] add_size 600000
[  187.956206] pci 0000:b0:00.0: BAR 14: assigned [mem 0xf8000000-0xf83fffff]
[  187.963102] pci 0000:b0:00.0: BAR 15: assigned [mem 0xf5000000-0xf51fffff pref]
[  187.970434] pci 0000:b0:00.0: BAR 14: reassigned [mem 0xf8000000-0xf89fffff]
[  187.977497] pci 0000:b1:02.0: BAR 14: assigned [mem 0xf8000000-0xf81fffff]
[  187.984383] pci 0000:b1:02.0: BAR 15: assigned [mem 0xf5000000-0xf50fffff pref]
[  187.991695] pci 0000:b1:03.0: BAR 14: assigned [mem 0xf8200000-0xf83fffff]
[  187.998576] pci 0000:b1:03.0: BAR 15: assigned [mem 0xf5100000-0xf51fffff pref]
[  188.005888] pci 0000:b1:03.0: BAR 14: reassigned [mem 0xf8200000-0xf86fffff]
[  188.012939] pci 0000:b1:02.0: BAR 14: can't assign mem (size 0x200000)
[  188.019471] pci 0000:b1:02.0: failed to add 300000 to res=[mem 0xf8000000-0xf81fffff]
[  188.027326] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.034071] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.040795] pci 0000:b2:00.0: BAR 2: assigned [mem 0xf8000000-0xf80fffff 64bit]
[  188.048119] pci 0000:b2:00.0: BAR 2: set to [mem 0xf8000000-0xf80fffff 64bit] (PCI address [0xf8000000-0xf80fffff])
[  188.058550] pci 0000:b2:00.0: BAR 6: assigned [mem 0xf5000000-0xf50fffff pref]
[  188.065802] pci 0000:b2:00.0: BAR 0: assigned [mem 0xf8100000-0xf8103fff 64bit]
[  188.073125] pci 0000:b2:00.0: BAR 0: set to [mem 0xf8100000-0xf8103fff 64bit] (PCI address [0xf8100000-0xf8103fff])
[  188.083596] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.090310] pci 0000:b2:00.0: BAR 9: can't assign mem (size 0x300000)
[  188.096773] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.103479] pci 0000:b2:00.0: BAR 7: assigned [mem 0xf8104000-0xf810ffff 64bit]
[  188.110801] pci 0000:b2:00.0: BAR 7: set to [mem 0xf8104000-0xf810ffff 64bit] (PCI address [0xf8104000-0xf810ffff])
[  188.121256] pci 0000:b1:02.0: PCI bridge to [bus b2-b2]
[  188.126512] pci 0000:b1:02.0:   bridge window [mem 0xf8000000-0xf81fffff]
[  188.133328] pci 0000:b1:02.0:   bridge window [mem 0xf5000000-0xf50fffff pref]
[  188.140608] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.147341] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.154076] pci 0000:b3:00.0: BAR 2: assigned [mem 0xf8200000-0xf82fffff 64bit]
[  188.161417] pci 0000:b3:00.0: BAR 2: set to [mem 0xf8200000-0xf82fffff 64bit] (PCI address [0xf8200000-0xf82fffff])
[  188.171865] pci 0000:b3:00.0: BAR 6: assigned [mem 0xf5100000-0xf51fffff pref]
[  188.179090] pci 0000:b3:00.0: BAR 0: assigned [mem 0xf8300000-0xf8303fff 64bit]
[  188.186431] pci 0000:b3:00.0: BAR 0: set to [mem 0xf8300000-0xf8303fff 64bit] (PCI address [0xf8300000-0xf8303fff])
[  188.196884] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[  188.203591] pci 0000:b3:00.0: BAR 9: assigned [mem 0xf8400000-0xf86fffff 64bit]
[  188.210909] pci 0000:b3:00.0: BAR 9: set to [mem 0xf8400000-0xf86fffff 64bit] (PCI address [0xf8400000-0xf86fffff])
[  188.221379] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[  188.228089] pci 0000:b3:00.0: BAR 7: assigned [mem 0xf8304000-0xf830ffff 64bit]
[  188.235407] pci 0000:b3:00.0: BAR 7: set to [mem 0xf8304000-0xf830ffff 64bit] (PCI address [0xf8304000-0xf830ffff])
[  188.245843] pci 0000:b1:03.0: PCI bridge to [bus b3-b3]
[  188.251107] pci 0000:b1:03.0:   bridge window [mem 0xf8200000-0xf86fffff]
[  188.257922] pci 0000:b1:03.0:   bridge window [mem 0xf5100000-0xf51fffff pref]
[  188.265180] pci 0000:b0:00.0: PCI bridge to [bus b1-b3]
[  188.270443] pci 0000:b0:00.0:   bridge window [mem 0xf8000000-0xf89fffff]
[  188.277250] pci 0000:b0:00.0:   bridge window [mem 0xf5000000-0xf51fffff pref]
[  188.284512] pcieport 0000:80:02.2: PCI bridge to [bus b0-bf]
[  188.290184] pcieport 0000:80:02.2:   bridge window [io  0xa000-0xbfff]
[  188.296735] pcieport 0000:80:02.2:   bridge window [mem 0xf8000000-0xf8ffffff]
[  188.303963] pcieport 0000:80:02.2:   bridge window [mem 0xf5000000-0xf5ffffff 64bit pref]

Thus b2:00.0 BAR 9 does not get assigned...

root cause:
b1:02.0 can not be added more range, because b1:03.0 is just after it;
no space between the required ranges.

Solution:
Try to assign required + optional all together at first, and if that
fails, try again with just the required resources.

-v2: seperate add_to_list change() to another patch according to Jesse.
     seperate get_res_add_size() moving to another patch according to Jesse.
     add !realloc_head->next check if the list is empty to bail early
     according to Jesse.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:52 -08:00
Yinghai Lu 1c372353e9 PCI: Move get_res_add_size() function
Need to call it from __assign_resources_sorted() later and we'd like to
avoid a forward declaraion.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:51 -08:00
Yinghai Lu ef62dfefa9 PCI: Make add_to_list() return status
Will be used for resource_list_x duplication when trying
requested+optional at first.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:51 -08:00
Yinghai Lu a4ac9fea01 PCI : Calculate right add_size
During debug of one SRIOV enabled hotplug device, we found found that
add_size is not passed properly.

The device has devices under two level bridges:

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0  Oracle Corporation Device
 |           |                               \-03.0-[a3]--+-00.0  Oracle Corporation Device

Which means later the parent bridge will not try to add a big enough range:

[  557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
[  557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
[  557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
[  557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
[  557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
[  557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
[  557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
[  557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
[  557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
[  557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]

It turns out we did not calculate size1 properly.

static resource_size_t calculate_memsize(resource_size_t size,
                resource_size_t min_size,
                resource_size_t size1,
                resource_size_t old_size,
                resource_size_t align)
{
        if (size < min_size)
                size = min_size;
        if (old_size == 1 )
                old_size = 0;
        if (size < old_size)
                size = old_size;
        size = ALIGN(size + size1, align);
        return size;
}

We should not pass add_size with min_size in calculate_memsize since
that will make add_size not contribute final add_size.

So just pass add_size with size1 to calculate_memsize().

With this change, we should have chance to remove extra addon in
pci_reassign_resource.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:50 -08:00
Masanari Iida 0dea210b17 PCI: Fix typo in setup-res.c
Correct spelling "resouce" to "resource" in
dricers/pci/setup-res.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:50 -08:00
Konrad Rzeszutek Wilk 6fbf9e7a90 PCI: Introduce __pci_reset_function_locked to be used when holding device_lock.
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.

The call chain when a user does "bind" looks as so:

 echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind

and ends up calling:
  driver_bind:
    device_lock(dev);  <=== TAKES LOCK
    XXXX_probe:
         .. pci_enable_device()
         ...__pci_reset_function(), which calls
                 pci_dev_reset(dev, 0):
                        if (!0) {
                                device_lock(dev) <==== DEADLOCK

The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:48 -08:00
Julia Lawall 8f0cdddcd3 PCI: drivers/pci/hotplug/ibmphp_ebda.c: add missing iounmap
Add missing iounmap in error handling code, in a case where the function
already preforms iounmap on some other execution path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
statement S,S1;
int ret;
@@
e = \(ioremap\|ioremap_nocache\)(...)
... when != iounmap(e)
if (<+...e...+>) S
... when any
    when != iounmap(e)
*if (...)
   { ... when != iounmap(e)
     return ...; }
... when any
iounmap(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:47 -08:00
Amos Kong f382a086f3 PCI: Can continually add funcs after adding func0
Boot up a KVM guest, and hotplug multifunction
devices(func1,func2,func0,func3) to guest.

for i in 1 2 0 3;do
qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
(qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
(qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done

In linux kernel, when func0 of the slot is hot-added, the whole
slot will be marked as 'enabled', then driver will ignore other new
hotadded funcs.
But in Win7 & WinXP, we can continaully add other funcs after adding
func0, all funcs will be added in guest.

drivers/pci/hotplug/acpiphp_glue.c:
static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
....
        for (slot = bridge->slots; slot; slot = slot->next) {
                if (slot->flags & SLOT_ENABLED) {
                        acpiphp_disable_slot()
                else
                        acpiphp_enable_slot()
....                              |
}                                 v
                            enable_device()
                                  |
                                  v
        //only don't enable slot if func0 is not added
	list_for_each_entry(func, &slot->funcs, sibling) {
               ...
        }
       slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'

This patch just make pci driver can continaully add funcs after adding
func 0. Only mark slot to 'enabled' when all funcs are added.

For pci multifunction hotplug, we can add functions one by one(func 0 is
necessary), and all functions will be removed in one time.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:47 -08:00
Myron Stowe 6535943fbf x86/PCI: Convert maintaining FW-assigned BIOS BAR values to use a list
This patch converts the underlying maintenance aspects of FW-assigned
BIOS BAR values from a statically allocated array within struct pci_dev
to a list of temporary, stand alone, entries.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:46 -08:00
Myron Stowe 351fc6d1a5 PCI: Fix starting basis for resource requests
pci_revert_fw_address() is used to reinstate a PCI device's original
FW-assigned BIOS BAR value(s) if normal resource assignment fails.

When attempting to reinstate an address, the point within the resource
tree from which to attempt the new resource request should be the parent
resource corresponding to the device, not the base of the resource tree
(ioport_resource or iomem_resource).  For PCI devices this would
typically be the resource corresponding to the upstream PCI host bridge
or P2P bridge aperture.

This patch sets the point within the resource tree to attempt a new
resource assignment request to the PCI device's parent resource and only
if that fails does it fall back to the base ioport_resource or
iomem_resource.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:45 -08:00
Yinghai Lu 3682a3946d PCI: Fix pci cardbus removal
During test busn_res allocation with cardbus, found pci card removal is not
working anymore, and it turns out it is broken by:

|commit 79cc9601c3
|Date:   Tue Nov 22 21:06:53 2011 -0800
|
|    PCI: Only call pci_stop_bus_device() one time for child devices at remove

The above changed the behavior of pci_remove_behind_bridge that
yenta_cardbus depended on.  So restore the old behavoir of
pci_remove_behind_bridge (which requires stopping and removing of all
devices) by:

1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
   __pci_remove_bus_device() call it instead.
2. add pci_stop_behind_bridge that will stop devices behind a bridge
3. add back pci_remove_behind_bridge that will stop and remove devices
   under bridge.

-v2: update commit description a little bit.

Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 12:19:31 -08:00
Vaidyanathan Srinivasan 8161fe91d8 PCI: set pci sriov page size before reading SRIOV BAR
For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
the PCI_SRIOV_BAR are queried.  The sys pagesize defaults to 4k,
so this change is required on powerpc box with 64k base page size.
    
This is a regression caused due to moving SRIOV init to sriov_enable().
    
| commit afd24ece5c
| Author: Ram Pai <linuxram@us.ibm.com>
    
| PCI: delay configuration of SRIOV capability
| The SRIOV capability, namely page size and total_vfs of a device are
| configured during enumeration phase of the device.  This can potentially
| interfere with the PCI operations of the platform, if the IOV capability
| of the device is not enabled.
    
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 12:01:56 -08:00
Yinghai Lu 71f6bd4a23 PCI: workaround hard-wired bus number V2
Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
when using ACPI resources (_CRS).
This is default, a manual workaround (without this patch)
would be pci=nocrs boot param.

V2: Add dev_warn if the workaround is hit. This should reveal
how common such setups are (via google) and point to possible
problems if things are still not working as expected.
-> Suggested by Jan Beulich.

Cc: stable@vger.kernel.org
Tested-by: garyhade@us.ibm.com
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 11:34:42 -08:00
Konrad Rzeszutek Wilk e4de866a83 xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.
.. as the rest of the kernel is using that format.

Suggested-by: Марк Коренберг <socketpair@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-02-03 16:06:57 -05:00
Greg Kroah-Hartman bd1d462e13 Merge 3.3-rc2 into the driver-core-next branch.
This was done to resolve a merge and build problem with the
drivers/acpi/processor_driver.c file.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-02 11:24:44 -08:00
Alan Stern 07d251460b PCI/XEN: Fix bug introduced by a recent change
This patch (as1516) fixes a bug introduced during the removal of
put_driver() and get_driver() from drivers/pci/xen-pcifront.c.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-27 14:35:30 -08:00
Alan Stern ed283e9f0a USB/PCI/PCMCIA: Clean up new_id and remove_id sysfs attribute routines
This patch (as1514) cleans up some places where new_id and remove_id
sysfs attributes are created and deleted.  Handling both attributes in
a single routine rather than a pair of routines makes the code
smaller.  It also prevents certain kinds of errors, like one we
currently have in the USB subsystem: The removeid attribute is often
created even when newid isn't (because the driver's no_dynamid_id flag
is set).

In the case of the PCMCIA subsystem, the newid attribute is created
but never explicitly deleted.  The patch adds a deletion routine.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-26 17:04:39 -08:00
Alan Stern f3ff924708 Remove useless get_driver()/put_driver() calls
As part of the removal of get_driver()/put_driver(), this patch
(as1512) gets rid of various useless and unnecessary calls in several
drivers.  In some cases it may be desirable to pin the driver by
calling try_module_get(), but that can be done later.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: "David S. Miller" <davem@davemloft.net>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Michael Buesch <m@bues.ch>
CC: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24 16:00:35 -08:00
Alan Stern cef9bc56e1 Dynamic ID addition doesn't need get_driver()
As part of the removal of get_driver()/put_driver(), this patch
(as1511) changes all the places that add dynamic IDs for drivers.
Since these additions are done by writing to the drivers' sysfs
attribute files, and the attributes are removed when the drivers are
unregistered, there is no reason to take an extra reference to the
drivers.

The one exception is the pci-stub driver, which calls pci_add_dynid()
as part of its registration.  But again, there's no reason to take an
extra reference here, because the driver can't be unloaded while it is
being registered.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Dmitry Torokhov <dmitry.torokhov@gmail.com>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-24 16:00:35 -08:00
Randy Dunlap 6e9292c588 kernel-doc: fix new warnings in pci
Fix new kernel-doc warnings:

Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-23 08:44:53 -08:00
Linus Torvalds c49c41a413 Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
  capabilities: remove __cap_full_set definition
  security: remove the security_netlink_recv hook as it is equivalent to capable()
  ptrace: do not audit capability check when outputing /proc/pid/stat
  capabilities: remove task_ns_* functions
  capabitlies: ns_capable can use the cap helpers rather than lsm call
  capabilities: style only - move capable below ns_capable
  capabilites: introduce new has_ns_capabilities_noaudit
  capabilities: call has_ns_capability from has_capability
  capabilities: remove all _real_ interfaces
  capabilities: introduce security_capable_noaudit
  capabilities: reverse arguments to security_capable
  capabilities: remove the task from capable LSM hook entirely
  selinux: sparse fix: fix several warnings in the security server cod
  selinux: sparse fix: fix warnings in netlink code
  selinux: sparse fix: eliminate warnings for selinuxfs
  selinux: sparse fix: declare selinux_disable() in security.h
  selinux: sparse fix: move selinux_complete_init
  selinux: sparse fix: make selinux_secmark_refcount static
  SELinux: Fix RCU deref check warning in sel_netport_insert()

Manually fix up a semantic mis-merge wrt security_netlink_recv():

 - the interface was removed in commit fd77846152 ("security: remove
   the security_netlink_recv hook as it is equivalent to capable()")

 - a new user of it appeared in commit a38f7907b9 ("crypto: Add
   userspace configuration API")

causing no automatic merge conflict, but Eric Paris pointed out the
issue.
2012-01-14 18:36:33 -08:00
Rusty Russell 90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Konrad Rzeszutek Wilk a96d627aba pci: Introduce __pci_reset_function_locked to be used when holding device_lock.
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.

The call chain when a user does "bind" looks as so:

 echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind

and ends up calling:
  driver_bind:
    device_lock(dev);  <=== TAKES LOCK
    XXXX_probe:
         .. pci_enable_device()
         ...__pci_reset_function(), which calls
                 pci_dev_reset(dev, 0):
                        if (!0) {
                                device_lock(dev) <==== DEADLOCK

The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-01-12 12:00:07 -05:00
Linus Torvalds 7b67e75147 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
  x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
  PCI: Increase resource array mask bit size in pcim_iomap_regions()
  PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
  PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
  PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
  x86/PCI: amd: factor out MMCONFIG discovery
  PCI: Enable ATS at the device state restore
  PCI: msi: fix imbalanced refcount of msi irq sysfs objects
  PCI: kconfig: English typo in pci/pcie/Kconfig
  PCI/PM/Runtime: make PCI traces quieter
  PCI: remove pci_create_bus()
  xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
  x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
  x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
  x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
  sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
  sparc/PCI: convert to pci_create_root_bus()
  sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
  powerpc/PCI: convert to pci_create_root_bus()
  powerpc/PCI: split PHB part out of pcibios_map_io_space()
  ...

Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
to the same patches being applied in other branches.
2012-01-11 18:50:26 -08:00
Linus Torvalds 1c8106528a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
  iommu/amd: Set IOTLB invalidation timeout
  iommu/amd: Init stats for iommu=pt
  iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume
  iommu/amd: Add invalidate-context call-back
  iommu/amd: Add amd_iommu_device_info() function
  iommu/amd: Adapt IOMMU driver to PCI register name changes
  iommu/amd: Add invalid_ppr callback
  iommu/amd: Implement notifiers for IOMMUv2
  iommu/amd: Implement IO page-fault handler
  iommu/amd: Add routines to bind/unbind a pasid
  iommu/amd: Implement device aquisition code for IOMMUv2
  iommu/amd: Add driver stub for AMD IOMMUv2 support
  iommu/amd: Add stat counter for IOMMUv2 events
  iommu/amd: Add device errata handling
  iommu/amd: Add function to get IOMMUv2 domain for pdev
  iommu/amd: Implement function to send PPR completions
  iommu/amd: Implement functions to manage GCR3 table
  iommu/amd: Implement IOMMUv2 TLB flushing routines
  iommu/amd: Add support for IOMMUv2 domain mode
  iommu/amd: Add amd_iommu_domain_direct_map function
  ...
2012-01-10 11:08:21 -08:00
Linus Torvalds 90160371b3 Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
  xen/pciback: Expand the warning message to include domain id.
  xen/pciback: Fix "device has been assigned to X domain!" warning
  xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
  xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
  xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
  xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
  Xen: consolidate and simplify struct xenbus_driver instantiation
  xen-gntalloc: introduce missing kfree
  xen/xenbus: Fix compile error - missing header for xen_initial_domain()
  xen/netback: Enable netback on HVM guests
  xen/grant-table: Support mappings required by blkback
  xenbus: Use grant-table wrapper functions
  xenbus: Support HVM backends
  xen/xenbus-frontend: Fix compile error with randconfig
  xen/xenbus-frontend: Make error message more clear
  xen/privcmd: Remove unused support for arch specific privcmp mmap
  xen: Add xenbus_backend device
  xen: Add xenbus device driver
  xen: Add privcmd device driver
  xen/gntalloc: fix reference counts on multi-page mappings
  ...
2012-01-10 10:09:59 -08:00
Joerg Roedel 00fb5430f5 Merge branches 'iommu/fixes', 'arm/omap' and 'x86/amd' into next
Conflicts:
	drivers/pci/hotplug/acpiphp_glue.c
2012-01-09 13:04:05 +01:00
Linus Torvalds 972b2c7199 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
  reiserfs: Properly display mount options in /proc/mounts
  vfs: prevent remount read-only if pending removes
  vfs: count unlinked inodes
  vfs: protect remounting superblock read-only
  vfs: keep list of mounts for each superblock
  vfs: switch ->show_options() to struct dentry *
  vfs: switch ->show_path() to struct dentry *
  vfs: switch ->show_devname() to struct dentry *
  vfs: switch ->show_stats to struct dentry *
  switch security_path_chmod() to struct path *
  vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
  vfs: trim includes a bit
  switch mnt_namespace ->root to struct mount
  vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
  vfs: opencode mntget() mnt_set_mountpoint()
  vfs: spread struct mount - remaining argument of next_mnt()
  vfs: move fsnotify junk to struct mount
  vfs: move mnt_devname
  vfs: move mnt_list to struct mount
  vfs: switch pnode.h macros to struct mount *
  ...
2012-01-08 12:19:57 -08:00
Konrad Rzeszutek Wilk 76ccc29701 x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 14:02:26 -08:00
Linus Torvalds 67b0243131 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
  x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
  x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
  x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
  x86, apic: Add probe() for apic_flat
  x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
  x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
  x86: Add per-cpu stat counter for APIC ICR read tries
  pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
  x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
  x86: Add NumaChip support
  x86: Add x86_init platform override to fix up NUMA core numbering
  x86: Make flat_init_apic_ldr() available
2012-01-06 13:58:21 -08:00
Hao, Xudong 1900ca132f PCI: Enable ATS at the device state restore
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.

Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:18 -08:00
Neil Horman 424eb39159 PCI: msi: fix imbalanced refcount of msi irq sysfs objects
This warning was recently reported to me:

------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
Hardware name: VMware Virtual Platform
kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
being called.
Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
vmw_pvscsi
Pid: 630, comm: modprobe Tainted: G        W   3.1.6-1.fc16.x86_64 #1
Call Trace:
 [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff810da293>] ? free_desc+0x63/0x70
 [<ffffffff812a9aa0>] kobject_put+0x50/0x60
 [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
 [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
 [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
 [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
 [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
 [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
 [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
 [<ffffffff8138bceb>] __driver_attach+0xab/0xb0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
 [<ffffffff8138b63e>] driver_attach+0x1e/0x20
 [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff8138c246>] driver_register+0x76/0x140
 [<ffffffff815ca414>] ? printk+0x51/0x53
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
 [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
 [<ffffffff81002042>] do_one_initcall+0x42/0x180
 [<ffffffff810aad71>] sys_init_module+0x91/0x200
 [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
---[ end trace 44593438a59a9558 ]---
Using INTx interrupt, #Rx queues: 1.

It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
be called.  Because populate_msi_sysfs fails, we never registered any of the
msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
on each of them, which gets flagged in the above stack trace.

The fix is pretty straightforward.  We can key of the parent pointer in the
kobject.  It is only set if the kobject_init_and_add succededs in
populate_msi_sysfs.  If anything fails there, each kobject has its parent reset
to NULL

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
P. Christeas d56641c772 PCI: kconfig: English typo in pci/pcie/Kconfig
Just fix this help text.

Signed-off-by: P. Christeas <xrg@linux.gr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
Vincent Palatin 85b8582d7c PCI/PM/Runtime: make PCI traces quieter
When the runtime PM is activated on PCI, if a device switches state
frequently (e.g. an EHCI controller with autosuspending USB devices
connected) the PCI configuration traces might be very verbose in the
kernel log.  Let's guard those traces with DEBUG condition.

Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:16 -08:00
Bjorn Helgaas 118faafaf9 PCI: remove pci_create_bus()
All users of pci_create_bus() have been converted to pci_create_root_bus(),
so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:15 -08:00
Bjorn Helgaas 7e00fe2e53 PCI: deprecate pci_scan_bus_parented()
Users of pci_scan_bus_parented() should be converted to use either
    pci_scan_root_bus() (preferred, but also calls pci_bus_add_devices)
or
    pci_create_root_bus()
    pci_scan_child_bus()

Since pci_scan_bus_parented(), I'm marking it deprecated now and will
actually remove it later.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:55 -08:00
Bjorn Helgaas 1e39ae9f90 PCI: convert pci_scan_bus_parented() to use pci_create_root_bus()
This converts pci_scan_bus_parented() to use pci_create_root_bus()
instead of pci_create_bus().  The new bus still has the default (incorrect)
resources, so this patch doesn't help fix that problem, but it does remove
one more use of pci_create_bus().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:54 -08:00
Bjorn Helgaas de4b2f76d6 PCI: convert pci_scan_bus() to use pci_create_root_bus()
I plan to deprecate pci_scan_bus_parented(), so use pci_create_root_bus()
directly instead.  pci_scan_bus() itself will be removed as soon as all
callers are gone, so this is just an interim step.

v2: export pci_scan_bus

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:53 -08:00
Bjorn Helgaas a2ebb82795 PCI: add pci_scan_root_bus() that accepts resource list
"Early" and "header" quirks often use incorrect bus resources because they
see the default resources assigned by pci_create_bus(), before the
architecture fixes them up (typically in pcibios_fixup_bus()).  Regions
reserved by these quirks end up with the wrong parents.

Here's the standard path for scanning a PCI root bus:

  pci_scan_bus or pci_scan_bus_parented
    pci_create_bus                     <-- A create with default resources
    pci_scan_child_bus
      pci_scan_slot
        pci_scan_single_device
          pci_scan_device
            pci_setup_device
              pci_fixup_device(early)  <-- B
          pci_device_add
            pci_fixup_device(header)   <-- C
      pcibios_fixup_bus                <-- D fill in correct resources

Early and header quirks at B and C use the default (incorrect) root bus
resources rather than those filled in at D.

This patch adds a new pci_scan_root_bus() function that sets the bus
resources correctly from a supplied list of resources.

I intend to remove pci_scan_bus() and pci_scan_bus_parented() after
fixing all callers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:52 -08:00
Bjorn Helgaas 166c637075 PCI: add pci_create_root_bus() that accepts resource list
pci_create_bus() assigns ioport_resource and iomem_resource as the default
bus resources, i.e., the entire address space.  Architectures fix these
later, typically in pcibios_fixup_bus() or after pci_scan_bus_parented()
returns, but code that runs in the interim sees incorrect resource
information.

This patch adds a new pci_create_root_bus() that sets the bus resources
correctly from a supplied list of resources.

I intend to remove pci_create_bus() after changing all callers.

Based on original patch by Deng-Cheng Zhu.

Reference: http://www.spinics.net/lists/mips/msg41654.html
Reference: https://lkml.org/lkml/2011/8/26/88
Signed-off-by: Deng-Cheng Zhu <dczhu@mips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:51 -08:00
Bjorn Helgaas a9d9f5276c PCI: show host bridges and root bus resources
Show the bus number and resources for every root bus we create.  This
will become more interesting when we supply the correct resources
instead of using the defaults (ioport_resource and iomem_resource).

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:51 -08:00
Bjorn Helgaas 45ca9e9730 PCI: add helpers for building PCI bus resource lists
We'd like to supply a list of resources when we create a new PCI bus,
e.g., the root bus under a PCI host bridge.  These are helpers for
constructing that list.

These are exported because the plan is to replace this exported interface:
    pci_scan_bus_parented()
with this one:
    pci_add_resource(resources, ...)
    pci_scan_root_bus(..., resources)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:50 -08:00
Ram Pai afd24ece5c PCI: delay configuration of SRIOV capability
The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device.  This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.

The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.

The patch is tested on x86 and power platform.

Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:49 -08:00
Yinghai Lu 79cc9601c3 PCI: Only call pci_stop_bus_device() one time for child devices at remove
During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.

So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:48 -08:00
Myron Stowe f676678f89 PCI: latency timer doesn't apply to PCIe
The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:47 -08:00
Myron Stowe 96c5590058 PCI: Pull PCI 'latency timer' setup up into the core
The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).

There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:42 -08:00
Jan Kiszka a2e27787f8 PCI: Introduce INTx check & mask API
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.

This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:34 -08:00
Jan Kiszka fb51ccbf21 PCI: Rework config space blocking services
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.

This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.

Adaptions of the ipr driver were originally written by Brian King.

Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:33 -08:00
Zac Storer 68e35c9b0b PCI: fix a brace coding style issue in probe.c
Fixed a brace coding style issue.

Signed-off-by: Zac Storer <zac.3.14159@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:31 -08:00
David Fries 82440a8253 PCI: pci_has_legacy_pm_support add driver and device to WARN
Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.

Signed-off-by: David Fries <David@Fries.net>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:30 -08:00
Eric W. Biederman d5dea7d95c PCI: msi: Disable msi interrupts when we initialize a pci device
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup.  The booting kernel had not enabled those interupts so
was not prepared to handle them.

I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts.  The pci
spec specifies that msi interrupts should be off by default.  Drivers
are expected to enable the msi interrupts if they want to use them.  Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.

This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:29 -08:00
Rafael J. Wysocki 4716a450eb PCI/ACPI/PM: Avoid resuming devices that don't signal PME
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:29 -08:00
Kenji Kaneshige ab4ca7821f PCI: pciehp: Handle push button event asynchronously
Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:28 -08:00
Kenji Kaneshige 863b7eb583 PCI: pciehp: Fix wrong workqueue cleanup
Fix improper workqueue cleanup.

In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:27 -08:00
Matthew Garrett 10f6dc7eed PCI: Rework ASPM disable code
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.

This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.

It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.

Measured to save around 5W on an idle Thinkpad X220.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:26 -08:00
Alex Williamson cfa4d8cc56 PCI: Fix PRI and PASID consistency
These are extended capabilities, rename and move to proper
group for consistency.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:26 -08:00
Neil Horman da8d1c8ba4 PCI/sysfs: add per pci device msi[x] irq listing (v5)
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs.  For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:25 -08:00
Eric Paris b7e724d303 capabilities: reverse arguments to security_capable
security_capable takes ns, cred, cap.  But the LSM capable() hook takes
cred, ns, cap.  The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!

Signed-off-by: Eric Paris <eparis@redhat.com>
2012-01-05 18:52:53 -05:00
Jan Beulich 73db144b58 Xen: consolidate and simplify struct xenbus_driver instantiation
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.

Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).

Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.

v2: Restore DRV_NAME for the driver name in xen-pciback.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-01-04 17:01:17 -05:00
Al Viro 587a1f1659 switch ->is_visible() to returning umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:55 -05:00
Yinghai Lu 497f16f21a pci: Fix hotplug of Express Module with pci bridges
I noticed that hotplug of one setup does not work with recent change in
pci tree.

After checking the bridge conf setup, I noticed that the bridges get
assigned but do not get enabled.

The reason is the following commit, while simply ignores bridge
resources when enabling a pci device:

| commit bbef98ab0f
| Author: Ram Pai <linuxram@us.ibm.com>
| Date:   Sun Nov 6 10:33:10 2011 +0800
|
|    PCI: defer enablement of SRIOV BARS
|...
|    NOTE: Note, there is subtle change in the pci_enable_device() API.  Any
|    driver that depends on SRIOV BARS to be enabled in pci_enable_device()
|    can fail.

Put back bridge resource and ROM resource checking to fix the problem.

That should fix regression like BIOS does not assign correct resource to
bridge.

Discussion can be found at:
	http://www.spinics.net/lists/linux-pci/msg12874.html

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-18 14:10:16 -08:00
Ajaykumar Hotchandani b51306c634 PCI: Set device power state to PCI_D0 for device without native PM support
During test of one IB card with guest VM, found that, msi is not
initialized properly.

It turns out __write_msi_msg will do nothing if device current_state is
not PCI_D0.  And, that pci device does not have pm_cap in guest VM.

There is an error in setting of power state to PCI_D0 in
pci_enable_device(), but error is not returned for this.  Following is
code flow:

pci_enable_device() -->   __pci_enable_device_flags() -->
do_pci_enable_device() -->   pci_set_power_state() -->
__pci_start_power_transition()

We have following condition inside __pci_start_power_transition():
         if (platform_pci_power_manageable(dev)) {
                 error = platform_pci_set_power_state(dev, state);
                 if (!error)
                         pci_update_current_state(dev, state);
         } else {
                 error = -ENODEV;
                 /* Fall back to PCI_D0 if native PM is not supported */
                 if (!dev->pm_cap)
                         dev->current_state = PCI_D0;
         }

Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is
getting called and that is failing with ENODEV because of following
condition:

         if (!handle || ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp)))
                 return -ENODEV;

Because of that, pci_update_current_state() is not getting called.

With this patch, if device power state can not be set via
platform_pci_set_power_state and that device does not have native pm
support, then PCI device power state will be set to PCI_D0.

-v2: This also reverts 47e9037ac1, as it's
     not needed after this change.

Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-14 08:26:42 -08:00
Rafael J. Wysocki 619a5182d1 PCI hotplug: Always allow acpiphp to handle non-PCIe bridges
Commit 0d52f54e2e (PCI / ACPI: Make
acpiphp ignore root bridges using PCIe native hotplug) added code
that made the acpiphp driver completely ignore PCIe root complexes
for which the kernel had been granted control of the native PCIe
hotplug feature by the BIOS through _OSC.  Unfortunately, however,
this was a mistake, because on some systems there were PCI bridges
supporting PCI (non-PCIe) hotplug under such root complexes and
those bridges should have been handled by acpiphp.

For this reason, revert the changes made by the commit mentioned
above and make register_slot() in drivers/pci/hotplug/acpiphp_glue.c
avoid registering hotplug slots for PCIe ports that belong to
root complexes with native PCIe hotplug enabled (which means that
the BIOS has granted the kernel control of this feature for the
given root complex).  This is reported to address the original
issue fixed by commit 0d52f54e2e and
to work on the system where that commit broke things.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-13 10:41:23 -08:00
Jan Beulich b95a7bd700 pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
This adjusts PCI_IOAPIC to be user configurable (possibly as a
module) on x86, since the base architecture code for adding
IO-APICs dynamically isn't there yet (and hence having the code
present everywhere is pretty pointless).

To make this consistent, a MODULE_DEVICE_TABLE() declaration
gets added, the class specifications get corrected (by properly
using PCI_DEVICE_CLASS() intended for purposes like this), and
the probe and remove functions get their sections adjusted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/4EDDD71A02000078000659F1@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06 09:21:05 +01:00
James Bottomley 8c45194567 PCI: fix ats compile failure
I get this compile failure on parisc:

drivers/pci/ats.c: In function 'ats_alloc_one':
drivers/pci/ats.c:29: error: implicit declaration of function 'kzalloc'
drivers/pci/ats.c:29: warning: assignment makes pointer from integer without a cast
drivers/pci/ats.c: In function 'ats_free_one':
drivers/pci/ats.c:45: error: implicit declaration of function 'kfree'

Because ats.c is missing linux/slab.h as an include.  This patch fixes it

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:31:25 -08:00
Ram Pai bbef98ab0f PCI: defer enablement of SRIOV BARS
All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device().  This unnecessarily enables SRIOV BARs of the
device.

On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.

The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().

NOTE: Note, there is subtle change in the pci_enable_device() API.  Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.

The patch has been touch tested on power and x86 platform.

Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:30:22 -08:00
Alex Williamson 91f57d5e1b PCI: More PRI/PASID cleanup
More consistency cleanups.  Drop the _OFF, separate and indent
CTRL/CAP/STATUS bit definitions.  This helped find the previous
mis-use of bit 0 in the PASID capability register.

Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:22:15 -08:00
Alex Williamson 60fe823837 PCI: Enable is not exposed as a PASID capability
The PASID ECN indicates bit 0 is reserved in the capability register.
Switch pci_enable_pasid() to error if PASID is already enabled and
don't expose enable as a feature in pci_pasid_features().

Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:22:03 -08:00
Eric W. Biederman a776c491ca PCI: msi: Disable msi interrupts when we initialize a pci device
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup.  The booting kernel had not enabled those interupts so
was not prepared to handle them.

I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts.  The pci
spec specifies that msi interrupts should be off by default.  Drivers
are expected to enable the msi interrupts if they want to use them.  Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.

This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:50 -08:00
Rafael J. Wysocki a424948dde PCI/ACPI/PM: Avoid resuming devices that don't signal PME
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:49 -08:00
Rafael J. Wysocki d90116ea38 PCI/ACPI: Make acpiphp ignore root bridges using SHPC native hotplug
If the kernel has requested control of the SHPC native hotplug
feature for a given root bridge, the acpiphp driver should not try
to handle that root bridge and it should leave it to shpchp.
Failing to do so causes problems to happen if shpchp is loaded
and unloaded before loading acpiphp (ACPI-based hotplug won't work
in that case anyway).

To address this issue make find_root_bridges() ignore PCI root
bridges with SHPC native hotplug enabled and make add_bridge()
return error code if SHPC native hotplug is enabled for the given
root bridge.  This causes acpiphp to refuse to load if SHPC native
hotplug is enabled for all root bridges and to refuse binding to
the root bridges with SHPC native hotplug enabled.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:48 -08:00
Kenji Kaneshige 486b10b9f4 PCI: pciehp: Handle push button event asynchronously
Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:47 -08:00
Kenji Kaneshige 027e8d52ab PCI: pciehp: Fix wrong workqueue cleanup
Fix improper workqueue cleanup.

In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:46 -08:00
Matthew Garrett 3c076351c4 PCI: Rework ASPM disable code
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.

This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.

It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.

Measured to save around 5W on an idle Thinkpad X220.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:45 -08:00
Alex Williamson 69166fbf02 PCI: Fix PRI and PASID consistency
These are extended capabilities, rename and move to proper
group for consistency.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:45 -08:00
Neil Horman b50cac55bf PCI/sysfs: add per pci device msi[x] irq listing (v5)
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs.  For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:44 -08:00
Linus Torvalds 09521577ca Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
  PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
  PCI: pciehp: wait 100 ms after Link Training check
  PCI: pciehp: wait 1000 ms before Link Training check
  PCI: pciehp: Retrieve link speed after link is trained
  PCI: Let PCI_PRI depend on PCI
  PCI: Fix compile errors with PCI_ATS and !PCI_IOV
  PCI / ACPI: Make acpiphp ignore root bridges using PCIe native hotplug
2011-11-23 14:58:46 -08:00
Bjorn Helgaas 4cac2eb158 PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
Previously we claimed device ID 0x7450, regardless of the vendor, which is
clearly wrong.  Now we'll claim that device ID only for AMD.

I suspect this was just a typo in the original code, but it's possible this
change will break shpchp on non-7450 AMD bridges.  If so, we'll have to fix
them as we find them.

Reference: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638863
Reported-by: Ralf Jung <ralfjung-e@gmx.de>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-14 09:43:14 -08:00
Kenji Kaneshige b3c0045422 PCI: pciehp: wait 100 ms after Link Training check
If the port supports Link speeds greater than 5.0 GT/s, we must wait
for 100 ms after Link training completes before sending configuration
request.

Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-11 09:31:43 -08:00
Kenji Kaneshige 0027cb3e19 PCI: pciehp: wait 1000 ms before Link Training check
We need to wait for 1000 ms after Data Link Layer Link Active (DLLLA)
bit reads 1b before sending configuration request. Currently pciehp
does this wait after checking Link Training (LT) bit. But we need it
before checking LT bit because LT is still set even after DLLLA bit is
set on some platforms.

Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-11 09:31:34 -08:00
Yinghai Lu fdbd3ce9ef PCI: pciehp: Retrieve link speed after link is trained
During hot plug, board_added will call pciehp_power_on_slot().
But link speed is updated in pciehp_power_on_slot().

We should not update link speed there, because that is too early.

So move the link speed update to pciehp_check_link_status() after making
sure the link has been trained.

-v2: fix compile warning that Kenji found.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-07 08:07:34 -08:00
Linus Torvalds 32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Linus Torvalds 02ebbbd481 Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  scsi: drop unused Kconfig symbol
  pci: drop unused Kconfig symbol
  stmmac: drop unused Kconfig symbol
  x86: drop unused Kconfig symbol
  powerpc: drop unused Kconfig symbols
  powerpc: 40x: drop unused Kconfig symbol
  mips: drop unused Kconfig symbols
  openrisc: drop unused Kconfig symbols
  arm: at91: drop unused Kconfig symbol
  samples: drop unused Kconfig symbol
  m32r: drop unused Kconfig symbol
  score: drop unused Kconfig symbols
  sh: drop unused Kconfig symbol
  um: drop unused Kconfig symbol
  sparc: drop unused Kconfig symbol
  alpha: drop unused Kconfig symbol

Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig
as per Michal: the STMMAC_DUAL_MAC config variable is still unused and
should be deleted.
2011-11-06 18:54:53 -08:00
Paul Gortmaker eefa9cfc89 pci: add module.h to files implicitly relying on its presence.
These were getting module.h implicitly from device.h but we want
to clean that up, so we fix it here to avoid things like:

pci/slot.c: In function ‘pci_hp_create_module_link’:
pci/slot.c:383: error: ‘module_kset’ undeclared (first use in this function)

Similarly, rpadlpar_core.c is modular, so add module.h to its includes.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:23 -04:00
Paul Gortmaker 363c75db1d pci: Fix files needing export.h for EXPORT_SYMBOL/THIS_MODULE
They were implicitly getting it from device.h --> module.h but
we want to clean that up.  So add the minimal header for these
macros.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:22 -04:00
Paul Bolle a8d2de5e55 pci: drop unused Kconfig symbol
There's no other Kconfig symbol that depends on XEN_PCIDEV_FE_DEBUG.
Neither is there anything that uses CONFIG_XEN_PCIDEV_FE_DEBUG.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-10-31 23:40:16 +01:00
Joerg Roedel c54420d330 PCI: Let PCI_PRI depend on PCI
This avoids the PCI_PRI question in 'make config' when PCI
is not selected.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-31 10:23:58 -07:00
Rafael J. Wysocki 0d52f54e2e PCI / ACPI: Make acpiphp ignore root bridges using PCIe native hotplug
If the kernel has requested control of the PCIe native hotplug
feature for a given root complex, the acpiphp driver should not try
to handle that root complex and it should leave it to pciehp.
Failing to do so causes problems to happen if acpiphp is loaded
before pciehp on such systems.

To address this issue make find_root_bridges() ignore PCIe root
complexes with PCIe native hotplug enabled and make add_bridge()
return error code if PCIe native hotplug is enabled for the given
root port.  This causes acpiphp to refuse to load if PCIe native
hotplug is enabled for all complexes and to refuse binding to
the root complexes with PCIe native hotplug is enabled.

Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-31 10:17:43 -07:00
Linus Torvalds 0e59e7e7fe Merge branch 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
  PCI: Clean-up MPS debug output
  pci: Clamp pcie_set_readrq() when using "performance" settings
  PCI: enable MPS "performance" setting to properly handle bridge MPS
  PCI: Workaround for Intel MPS errata
  PCI: Add support for PASID capability
  PCI: Add implementation for PRI capability
  PCI: Export ATS functions to modules
  PCI: Move ATS implementation into own file
  PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
  PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
  PCI / PM: Extend PME polling to all PCI devices
  PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
  PCI: Make pci_setup_bridge() non-static for use by arch code
  x86: constify PCI raw ops structures
  PCI: Add quirk for known incorrect MPSS
  PCI: Add Solarflare vendor ID and SFC4000 device IDs
2011-10-28 14:20:44 -07:00
Jon Mason a513a99a7c PCI: Clean-up MPS debug output
Clean-up MPS debug output to make it a single line and aligned, thus
making it more readable for a large number of buses and devices in a
single system.

Suggested by Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:44 -07:00
Benjamin Herrenschmidt a1c473aa11 pci: Clamp pcie_set_readrq() when using "performance" settings
When configuring the PCIe settings for "performance", we allow parents
to have a larger Max Payload Size than children and rely on children
Max Read Request Size to not be larger than their own MPS to avoid
having the host bridge generate responses they can't cope with.

However, various drivers in Linux call pci_set_readrq() with arbitrary
values, assuming this to be a simple performance tweak. This breaks
under our "performance" configuration.

Fix that by making sure the value programmed by pcie_set_readrq() is
never larger than the configured MPS for that device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:44 -07:00
Jon Mason 62f392ea5b PCI: enable MPS "performance" setting to properly handle bridge MPS
Rework the "performance" MPS option to configure the device MPS with the
smaller of the device MPSS or the bridge MPS (which is assumed to be
properly configured at this point to the largest allowable MPS based on
its parent bus).

Also, rework the MRRS setting to report an inability to set the MRRS to
a valid setting.

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:43 -07:00
Jon Mason d387a8d666 PCI: Workaround for Intel MPS errata
Intel 5000 and 5100 series memory controllers have a known issue if read
completion coalescing is enabled and the PCI-E Maximum Payload Size is
set to 256B.  To work around this issue, disable read completion
coalescing in the memory controller and root complexes.  Unfortunately,
it must always be disabled, even if no 256B MPS devices are present, due
to the possibility of one being hotplugged.

Links to erratas:
http://www.intel.com/content/dam/doc/specification-update/5000-chipset-memory-controller-hub-specification-update.pdf
http://www.intel.com/content/dam/doc/specification-update/5100-memory-controller-hub-chipset-specification-update.pdf

Thanks to Jesse Brandeburg and Ben Hutchings for providing insight into
the problem.

Tested-and-Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:42 -07:00
Linus Torvalds 982653009b Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, ioapic: Consolidate the explicit EOI code
  x86, ioapic: Restore the mask bit correctly in eoi_ioapic_irq()
  x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC
  iommu: Rename the DMAR and INTR_REMAP config options
  x86, ioapic: Define irq_remap_modify_chip_defaults()
  x86, msi, intr-remap: Use the ioapic set affinity routine
  iommu: Cleanup ifdefs in detect_intel_iommu()
  iommu: No need to set dmar_disabled in check_zero_address()
  iommu: Move IOMMU specific code to intel-iommu.c
  intr_remap: Call dmar_dev_scope_init() explicitly
  x86, x2apic: Enable the bios request for x2apic optout
2011-10-26 16:11:53 +02:00
Linus Torvalds 04a8752485 Merge branches 'stable/drivers-3.2', 'stable/drivers.bugfixes-3.2' and 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/drivers-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus: don't rely on xen_initial_domain to detect local xenstore
  xenbus: Fix loopback event channel assuming domain 0
  xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
  xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
  xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
  xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
  xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
  xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive

* 'stable/drivers.bugfixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Check if the device is found instead of blindly assuming so.
  xen/pciback: Do not dereference psdev during printk when it is NULL.
  xen: remove XEN_PLATFORM_PCI config option
  xen: XEN_PVHVM depends on PCI
  xen/pciback: double lock typo
  xen/pciback: use mutex rather than spinlock in vpci backend
  xen/pciback: Use mutexes when working with Xenbus state transitions.
  xen/pciback: miscellaneous adjustments
  xen/pciback: use mutex rather than spinlock in passthrough backend
  xen/pciback: use resource_size()

* 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pci: support multi-segment systems
  xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.
  xen/pci: make bus notifier handler return sane values
  xen-swiotlb: fix printk and panic args
  xen-swiotlb: Fix wrong panic.
  xen-swiotlb: Retry up three times to allocate Xen-SWIOTLB
  xen-pcifront: Update warning comment to use 'e820_host' option.
2011-10-25 09:19:36 +02:00
Joerg Roedel 086ac11f64 PCI: Add support for PASID capability
Devices supporting Process Address Space Identifiers
(PASIDs) can use an IOMMU to access multiple IO address
spaces at the same time. A PCIe device indicates support for
this feature by implementing the PASID capability. This
patch adds support for the capability to the Linux kernel.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:35 -07:00
Joerg Roedel c320b976d7 PCI: Add implementation for PRI capability
Implement the necessary functions to handle PRI capabilities
on PCIe devices. With PRI devices behind an IOMMU can signal
page fault conditions to software and recover from such
faults.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:34 -07:00
Joerg Roedel d4c0636c21 PCI: Export ATS functions to modules
This patch makes the ATS functions usable for modules.
They will be used by a module implementing some advanced
AMD IOMMU features.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:34 -07:00
Joerg Roedel db3c33c6d3 PCI: Move ATS implementation into own file
ATS does not depend on IOV support, so move the code into
its own file. This file will also include support for the
PRI and PASID capabilities later.
Also give ATS its own Kconfig variable to allow selecting it
without IOV support.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:33 -07:00
Rafael J. Wysocki 78d090b0be PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
The result returned by acpi_dev_run_wake() is always either -EINVAL
or -ENODEV, while obviously it should return 0 on success.  The
problem is that the leftover error variable, that's not really used
in the function, is initialized with -ENODEV and then returned
without modification.

To fix this issue remove the error variable from acpi_dev_run_wake()
and make the function return 0 on success as appropriate.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:32 -07:00
Prarit Bhargava 6af8bef14d PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
I originally submitted a patch to workaround this by pushing all Ejection
Requests and Device Checks onto the kacpi_hotplug queue.

http://marc.info/?l=linux-acpi&m=131678270930105&w=2

The patch is still insufficient in that Bus Checks also need to be added.

Rather than add all events, including non-PCI-hotplug events, to the
hotplug queue, mjg suggested that a better approach would be to modify
the acpiphp driver so only acpiphp events would be added to the
kacpi_hotplug queue.

It's a longer patch, but at least we maintain the benefit of having separate
queues in ACPI.  This, of course, is still only a workaround the problem.
As Bjorn and mjg pointed out, we have to refactor a lot of this code to do
the right thing but at this point it is a better to have this code working.

The acpi core places all events on the kacpi_notify queue.  When the acpiphp
driver is loaded and a PCI card with a PCI-to-PCI bridge is removed the
following call sequence occurs:

cleanup_p2p_bridge()
	    -> cleanup_bridge()
		    -> acpi_remove_notify_handler()
			    -> acpi_os_wait_events_complete()
				    -> flush_workqueue(kacpi_notify_wq)

which is the queue we are currently executing on and the process will hang.

Move all hotplug acpiphp events onto the kacpi_hotplug workqueue.  In
handle_hotplug_event_bridge() and handle_hotplug_event_func() we can simply
push the rest of the work onto the kacpi_hotplug queue and then avoid the
deadlock.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: mjg@redhat.com
Cc: bhelgaas@google.com
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:31 -07:00
Rafael J. Wysocki 379021d5c0 PCI / PM: Extend PME polling to all PCI devices
The land of PCI power management is a land of sorrow and ugliness,
especially in the area of signaling events by devices.  There are
devices that set their PME Status bits, but don't really bother
to send a PME message or assert PME#.  There are hardware vendors
who don't connect PME# lines to the system core logic (they know
who they are).  There are PCI Express Root Ports that don't bother
to trigger interrupts when they receive PME messages from the devices
below.  There are ACPI BIOSes that forget to provide _PRW methods for
devices capable of signaling wakeup.  Finally, there are BIOSes that
do provide _PRW methods for such devices, but then don't bother to
call Notify() for those devices from the corresponding _Lxx/_Exx
GPE-handling methods.  In all of these cases the kernel doesn't have
a chance to receive a proper notification that it should wake up a
device, so devices stay in low-power states forever.  Worse yet, in
some cases they continuously send PME Messages that are silently
ignored, because the kernel simply doesn't know that it should clear
the device's PME Status bit.

This problem was first observed for "parallel" (non-Express) PCI
devices on add-on cards and Matthew Garrett addressed it by adding
code that polls PME Status bits of such devices, if they are enabled
to signal PME, to the kernel.  Recently, however, it has turned out
that PCI Express devices are also affected by this issue and that it
is not limited to add-on devices, so it seems necessary to extend
the PME polling to all PCI devices, including PCI Express and planar
ones.  Still, it would be wasteful to poll the PME Status bits of
devices that are known to receive proper PME notifications, so make
the kernel (1) poll the PME Status bits of all PCI and PCIe devices
enabled to signal PME and (2) disable the PME Status polling for
devices for which correct PME notifications are received.

Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:31 -07:00
Josh Boyer 3e309cdf07 PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
Commit 15bed0f2f added a quirk for the e823 Ricoh card reader to lower the
base frequency.  However, the quirk first checks to see if the proprietary
MMC controller is disabled, and returns if so.  On some devices, such as the
Lenovo X220, the MMC controller is already disabled by firmware it seems,
but the frequency change is still needed so sdhci-pci can talk to the cards.
Since the MMC controller is disabled, the frequency fixup was never being run
on these machines.

This moves the e823 check above the MMC controller check so that it always
gets run.

This fixes https://bugzilla.redhat.com/show_bug.cgi?id=722509

Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:30 -07:00
Benjamin Herrenschmidt e24442733e PCI: Make pci_setup_bridge() non-static for use by arch code
The "powernv" platform of the powerpc architecture needs to assign PCI
resources using a specific algorithm to fit some HW constraints of
the IBM "IODA" architecture (related to the ability to create error
handling domains that encompass specific segments of MMIO space).

For doing so, it wants to call pci_setup_bridge() from architecture
specific resource management in order to configure bridges after all
resources have been assigned. So make it non-static.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:29 -07:00
Ben Hutchings a94d072b20 PCI: Add quirk for known incorrect MPSS
Using legacy interrupts and TLPs > 256 bytes on the SFC4000 (all
revisions) may cause interrupt messages to be replayed.  In some
systems this results in a non-recoverable MCE.  Early boards using the
SFC4000 set the maximum payload size supported (MPSS) to 1024 bytes
and we should override that.

There are probably other devices with similar issues, so give this
quirk a generic name.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:27 -07:00
Jon Mason 5f39e6705f PCI: Disable MPS configuration by default
Add the ability to disable PCI-E MPS turning and using the BIOS
configured MPS defaults.  Due to the number of issues recently
discovered on some x86 chipsets, make this the default behavior.

Also, add the option for peer to peer DMA MPS configuration.  Peer to
peer DMA is outside the scope of this patch, but MPS configuration could
prevent it from working by having the MPS on one root port different
than the MPS on another.  To work around this, simply make the system
wide MPS the smallest possible value (128B).

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-04 09:52:28 -07:00
Suresh Siddha d3f138106b iommu: Rename the DMAR and INTR_REMAP config options
Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent
with the other IOMMU options.

Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the
irq subsystem name.

And define the CONFIG_DMAR_TABLE for the common ACPI DMAR
routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:22:03 +02:00
Benjamin Herrenschmidt 1a4b1a41b8 pci: Don't crash when reading mpss from root complex
In pcie_find_smpss(), we have the following statement:

 	if (dev->is_hotplug_bridge && (!list_is_singular(&dev->bus->devices) ||
	    dev->bus->self->pcie_type != PCI_EXP_TYPE_ROOT_PORT))

The problem is that at least on my machine, this gets called for the
root complex (virtual P2P bridge), and dev->bus->self is NULL since
the parent bus for this is not itself anchor to a PCI device.

This adds the necessary NULL check.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jon Mason <mason@myri.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-13 16:08:31 -07:00
Jon Mason ed2888e906 PCI: Remove MRRS modification from MPS setting code
Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
massive negative ramifications on some devices.  Without knowing which
devices have this issue, do not modify from the default value when
walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
the default procedure.

Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Simon Kirby <sim@hostway.ca>
Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-09 19:49:58 -07:00
Shyam Iyer 5307f6d5fb Fix pointer dereference before call to pcie_bus_configure_settings
Commit b03e7495a8 ("PCI: Set PCI-E Max Payload Size on fabric")
introduced a potential NULL pointer dereference in calls to
pcie_bus_configure_settings due to attempts to access pci_bus self
variables when the self pointer is NULL.

To correct this, verify that the self pointer in pci_bus is non-NULL
before dereferencing it.

Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-09 19:49:58 -07:00
Konrad Rzeszutek Wilk 917e3e65c3 xen-pcifront: Update warning comment to use 'e820_host' option.
With Xen changeset 23428 "libxl: Add 'e820_host' option to config file"
the E820 as seen from the host can now be passed into the guest.
This means that a PV guest can now:
 - Use the correct PCI I/O gap. Before these patches, Linux guest would
   boot up and would tell:
   [    0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:c0000000)
   while in actuality the PCI I/O gap should have been:
   [    0.000000] Allocating PCI resources starting at b0000000 (gap: b0000000:4c000000)

 - The PV domain with PCI devices was limited to 3GB. It now can be booted
   with 4GB, 8GB, or whatever number you want. The PCI devices will now _not_ conflict
   with System RAM. Meaning the drivers can load.

CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: linux-pci@vger.kernel.org
CC: stable@kernel.org
[v2: Made the string less broken up. Suggested by Joe Perches]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-08-26 12:09:34 -04:00
Randy Dunlap 47c08f3107 pci: fix new kernel-doc warning in pci.c
Fix new kernel-doc warning in pci.c:

  Warning(drivers/pci/pci.c:3259): No description found for parameter 'mps'
  Warning(drivers/pci/pci.c:3259): Excess function parameter 'rq' description in 'pcie_set_mps'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-20 18:02:32 -07:00
Linus Torvalds 0c3bef6128 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: OF: Don't crash when bridge parent is NULL.
  PCI: export pcie_bus_configure_settings symbol
  PCI: code and comments cleanup
  PCI: make cardbus-bridge resources optional
  PCI: make SRIOV resources optional
  PCI : ability to relocate assigned pci-resources
  PCI: honor child buses add_size in hot plug configuration
  PCI: Set PCI-E Max Payload Size on fabric
2011-08-19 10:02:37 -07:00
David Daney 69566dd8be PCI: OF: Don't crash when bridge parent is NULL.
In pcibios_get_phb_of_node(), we will crash while booting if
bus->bridge->parent is NULL.

Check for this case and avoid dereferencing the NULL pointer.

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-19 08:51:37 -07:00
Linus Torvalds c299eba3c5 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (28 commits)
  ACPI:  delete stale reference in kernel-parameters.txt
  ACPI: add missing _OSI strings
  ACPI: remove NID_INVAL
  thermal: make THERMAL_HWMON implementation fully internal
  thermal: split hwmon lookup to a separate function
  thermal: hide CONFIG_THERMAL_HWMON
  ACPI print OSI(Linux) warning only once
  ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX
  ACPI / Battery: propagate sysfs error in acpi_battery_add()
  ACPI / Battery: avoid acpi_battery_add() use-after-free
  ACPI: introduce "acpi_rsdp=" parameter for kdump
  ACPI: constify ops structs
  ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
  ACPI: fix 80 char overflow
  ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
  ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
  ACPI / Battery: Add the hibernation process in the battery_notify()
  ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
  ACPI / Battery: Change 16-bit signed negative battery current into correct value
  ACPI / Battery: Add the power unit macro
  ...
2011-08-02 21:17:02 -10:00
Jon Mason debc3b7785 PCI: export pcie_bus_configure_settings symbol
pcie_bus_configure_settings needs to be exported if the PCI hotplug
driver is being compiled as a module.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-02 08:53:00 -07:00
Ram Pai 9e8bf93a7f PCI: code and comments cleanup
a) adjust_resource_sorted() is now called reassign_resource_sorted()
b) nice-to-have is now called optional
c) add_list is now called realloc_list.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:50:50 -07:00
Ram Pai 0a2daa1cf3 PCI: make cardbus-bridge resources optional
Allocate resources to cardbus bridge only after all other genuine
resources requests are satisfied. Dont retry if resource allocation
for cardbus-bridges fail.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:50:40 -07:00
Yinghai Lu 2aceefcbd5 PCI: make SRIOV resources optional
From: Yinghai Lu <yinghai@kernel.org>

Allocate resources to SRIOV BARs only after all other required
resource-requests are satisfied. Dont retry if resource allocation for SRIOV
BARs fail.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:50:31 -07:00
Ram Pai 2bbc694227 PCI : ability to relocate assigned pci-resources
Currently pci-bridges are allocated enough resources to satisfy their immediate
requirements.  Any additional resource-requests fail if additional free space,
contiguous to the one already allocated, is not available. This behavior is not
reasonable since sufficient contiguous resources, that can satisfy the request,
are available at a different location.

This patch provides the ability to expand and relocate a allocated resource.

	v2: Changelog: Fixed size calculation in pci_reassign_resource()
	v3: Changelog : Split this patch. The resource.c changes are already
			upstream. All the pci driver changes are in here.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:50:15 -07:00
Yinghai Lu be768912a4 PCI: honor child buses add_size in hot plug configuration
git commit c8adf9a3e8
    "PCI: pre-allocate additional resources to devices only after
	successful allocation of essential resources."

fails to take into consideration the optional-resources needed by children
devices while calculating the optional-resource needed by the bridge.

This can be a problem on some setup. For example, if a hotplug bridge has 8
children hotplug bridges, the bridge should have enough resources to accomodate
the hotplug requirements for each of its children hotplug bridges.  Currently
this is not the case.

This patch fixes the problem.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:49:39 -07:00
Jon Mason b03e7495a8 PCI: Set PCI-E Max Payload Size on fabric
On a given PCI-E fabric, each device, bridge, and root port can have a
different PCI-E maximum payload size.  There is a sizable performance
boost for having the largest possible maximum payload size on each PCI-E
device.  However, if improperly configured, fatal bus errors can occur.
Thus, it is important to ensure that PCI-E payloads sends by a device
are never larger than the MPS setting of all devices on the way to the
destination.

This can be achieved two ways:

- A conservative approach is to use the smallest common denominator of
  the entire tree below a root complex for every device on that fabric.

This means for example that having a 128 bytes MPS USB controller on one
leg of a switch will dramatically reduce performances of a video card or
10GE adapter on another leg of that same switch.

It also means that any hierarchy supporting hotplug slots (including
expresscard or thunderbolt I suppose, dbl check that) will have to be
entirely clamped to 128 bytes since we cannot predict what will be
plugged into those slots, and we cannot change the MPS on a "live"
system.

- A more optimal way is possible, if it falls within a couple of
  constraints:
* The top-level host bridge will never generate packets larger than the
  smallest TLP (or if it can be controlled independently from its MPS at
  least)
* The device will never generate packets larger than MPS (which can be
  configured via MRRS)
* No support of direct PCI-E <-> PCI-E transfers between devices without
  some additional code to specifically deal with that case

Then we can use an approach that basically ignores downstream requests
and focuses exclusively on upstream requests. In that case, all we need
to care about is that a device MPS is no larger than its parent MPS,
which allows us to keep all switches/bridges to the max MPS supported by
their parent and eventually the PHB.

In this case, your USB controller would no longer "starve" your 10GE
Ethernet and your hotplug slots won't affect your global MPS.
Additionally, the hotplugged devices themselves can be configured to a
larger MPS up to the value configured in the hotplug bridge.

To choose between the two available options, two PCI kernel boot args
have been added to the PCI calls.  "pcie_bus_safe" will provide the
former behavior, while "pcie_bus_perf" will perform the latter behavior.
By default, the latter behavior is used.

NOTE: due to the location of the enablement, each arch will need to add
calls to this function.  This patch only enables x86.

This patch includes a number of changes recommended by Benjamin
Herrenschmidt.

Tested-by: Jordan_Hargrave@dell.com
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-08-01 11:49:16 -07:00
Linus Torvalds f85f19de90 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: remove printks about disabled bridge windows
  PCI: fold pci_calc_resource_flags() into decode_bar()
  PCI: treat mem BAR type "11" (reserved) as 32-bit, not 64-bit, BAR
  PCI: correct pcie_set_readrq write size
  PCI: pciehp: change wait time for valid configuration access
  x86/PCI: Preserve existing pci=bfsort whitelist for Dell systems
  PCI: ARI is a PCIe v2 feature
  x86/PCI: quirks: Use pci_dev->revision
  PCI: Make the struct pci_dev * argument of pci_fixup_irqs const.
  PCI hotplug: cpqphp: use pci_dev->vendor
  PCI hotplug: cpqphp: use pci_dev->subsystem_{vendor|device}
  x86/PCI: config space accessor functions should not ignore the segment argument
  PCI: Assign values to 'pci_obff_signal_type' enumeration constants
  x86/PCI: reduce severity of host bridge window conflict warnings
  PCI: enumerate the PCI device only removed out PCI hieratchy of OS when re-scanning PCI
  PCI: PCIe AER: add aer_recover_queue
  x86/PCI: select direct access mode for mmconfig option
  PCI hotplug: Rename is_ejectable which also exists in dock.c
2011-07-29 23:35:05 -07:00
Linus Torvalds e371d46ae4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  merge fchmod() and fchmodat() guts, kill ancient broken kludge
  xfs: fix misspelled S_IS...()
  xfs: get rid of open-coded S_ISREG(), etc.
  vfs: document locking requirements for d_move, __d_move and d_materialise_unique
  omfs: fix (mode & S_IFDIR) abuse
  btrfs: S_ISREG(mode) is not mode & S_IFREG...
  ima: fmode_t misspelled as mode_t...
  pci-label.c: size_t misspelled as mode_t
  jffs2: S_ISLNK(mode & S_IFMT) is pointless
  snd_msnd ->mode is fmode_t, not mode_t
  v9fs_iop_get_acl: get rid of unused variable
  vfs: dont chain pipe/anon/socket on superblock s_inodes list
  Documentation: Exporting: update description of d_splice_alias
  fs: add missing unlock in default_llseek()
2011-07-26 18:30:20 -07:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Al Viro ed47641839 pci-label.c: size_t misspelled as mode_t
no, really, strlen() and snprintf() do not return mode_t values...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-26 13:01:09 -04:00
Linus Torvalds d3ec4844d4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  fs: Merge split strings
  treewide: fix potentially dangerous trailing ';' in #defined values/expressions
  uwb: Fix misspelling of neighbourhood in comment
  net, netfilter: Remove redundant goto in ebt_ulog_packet
  trivial: don't touch files that are removed in the staging tree
  lib/vsprintf: replace link to Draft by final RFC number
  doc: Kconfig: `to be' -> `be'
  doc: Kconfig: Typo: square -> squared
  doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
  drivers/net: static should be at beginning of declaration
  drivers/media: static should be at beginning of declaration
  drivers/i2c: static should be at beginning of declaration
  XTENSA: static should be at beginning of declaration
  SH: static should be at beginning of declaration
  MIPS: static should be at beginning of declaration
  ARM: static should be at beginning of declaration
  rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
  Update my e-mail address
  PCIe ASPM: forcedly -> forcibly
  gma500: push through device driver tree
  ...

Fix up trivial conflicts:
 - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
 - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
 - drivers/net/r8169.c (just context changes)
2011-07-25 13:56:39 -07:00
Linus Torvalds 6d16d6d9bb Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  iommu/core: Fix build with INTR_REMAP=y && CONFIG_DMAR=n
  iommu/amd: Don't use MSI address range for DMA addresses
  iommu/amd: Move missing parts to drivers/iommu
  iommu: Move iommu Kconfig entries to submenu
  x86/ia64: intel-iommu: move to drivers/iommu/
  x86: amd_iommu: move to drivers/iommu/
  msm: iommu: move to drivers/iommu/
  drivers: iommu: move to a dedicated folder
  x86/amd-iommu: Store device alias as dev_data pointer
  x86/amd-iommu: Search for existind dev_data before allocting a new one
  x86/amd-iommu: Allow dev_data->alias to be NULL
  x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions
  x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines
  x86/amd-iommu: Store ATS state in dev_data
  x86/amd-iommu: Store devid in dev_data
  x86/amd-iommu: Introduce global dev_data_list
  x86/amd-iommu: Remove redundant device_flush_dte() calls
  iommu-api: Add missing header file

Fix up trivial conflicts (independent additions close to each other) in
drivers/Makefile and include/linux/pci.h
2011-07-22 16:39:42 -07:00
Linus Torvalds 431bf99d26 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (51 commits)
  PM: Improve error code of pm_notifier_call_chain()
  PM: Add "RTC" to PM trace time stamps to avoid confusion
  PM / Suspend: Export suspend_set_ops, suspend_valid_only_mem
  PM / Suspend: Add .suspend_again() callback to suspend_ops
  PM / OPP: Introduce function to free cpufreq table
  ARM / shmobile: Return -EBUSY from A4LC power off if A3RV is active
  PM / Domains: Take .power_off() error code into account
  ARM / shmobile: Use genpd_queue_power_off_work()
  ARM / shmobile: Use pm_genpd_poweroff_unused()
  PM / Domains: Introduce function to power off all unused PM domains
  OMAP: PM: disable idle on suspend for GPIO and UART
  OMAP: PM: omap_device: add API to disable idle on suspend
  OMAP: PM: omap_device: add system PM methods for PM domain handling
  OMAP: PM: omap_device: conditionally use PM domain runtime helpers
  PM / Runtime: Add new helper function: pm_runtime_status_suspended()
  PM / Domains: Queue up power off work only if it is not pending
  PM / Domains: Improve handling of wakeup devices during system suspend
  PM / Domains: Do not restore all devices on power off error
  PM / Domains: Allow callbacks to execute all runtime PM helpers
  PM / Domains: Do not execute device callbacks under locks
  ...
2011-07-22 16:01:57 -07:00
Linus Torvalds acb41c0f92 Merge branch 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  pci/of: Consolidate pci_bus_to_OF_node()
  pci/of: Consolidate pci_device_to_OF_node()
  x86/devicetree: Use generic PCI <-> OF matching
  microblaze/pci: Move the remains of pci_32.c to pci-common.c
  microblaze/pci: Remove powermac originated cruft
  pci/of: Match PCI devices to OF nodes dynamically
2011-07-22 14:54:02 -07:00
Bjorn Helgaas 7b87c9df56 PCI: remove printks about disabled bridge windows
I don't think there's enough value in the fact of a bridge window
being disabled to justify cluttering the dmesg log with it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 09:08:07 -07:00
Bjorn Helgaas 28c6821a0f PCI: fold pci_calc_resource_flags() into decode_bar()
decode_bar() and pci_calc_resource_flags() both looked at the PCI BAR
type information, and it's simpler to just do it all in one place.

decode_bar() sets IORESOURCE_IO, IORESOURCE_MEM, and IORESOURCE_MEM_64
as appropriate, so res->flags contains all the information pci_bar_type
does, so we don't need to test the pci_bar_type return value.

decode_bar() used to return pci_bar_type, which we no longer need.  We
can simplify it a bit by returning the struct resource flags rather than
updating them internally.

In pci_update_resource(), there's no need to decode the BAR type bits
again; we can just test for IORESOURCE_MEM_64 directly.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 09:08:01 -07:00
Bjorn Helgaas 8d6a6a4763 PCI: treat mem BAR type "11" (reserved) as 32-bit, not 64-bit, BAR
This fixes a minor regression where broken PCI devices that use the
reserved "11" memory BAR type worked before e354597cce but not after.

The low four bits of a memory BAR are "PTT0" where P=1 for prefetchable
BARs, and TT is as follows:

  00  32-bit BAR, anywhere in lower 4GB
  01  anywhere below 1MB (reserved as of PCI 2.2)
  10  64-bit BAR
  11  reserved

Prior to e354597cce, we treated "0100" as a 64-bit BAR and all others,
including prefetchable 64-bit BARs ("1100") as 32-bit BARs.  The e354597cce
fix, which appeared in 2.6.28, treats "x1x0" as 64-bit BARs, so the
reserved "x110" types are treated as 64-bit instead of 32-bit.

This patch returns to treating the reserved "11" type as a 32-bit BAR and
adds a warning if we see it.

It also logs a note if we see a 1M BAR.  This is not a warning, because
such hardware conforms to pre-PCI 2.2 spec, but I think it's worth noting
because Linux ignores the 1M restriction if it ever has to assign the BAR.

CC: Peter Chubb <peterc@gelato.unsw.edu.au>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=35952
Reported-by: Jan Zwiegers <jan@radicalsystems.co.za>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 09:06:58 -07:00
Jon Mason c9b378c7cb PCI: correct pcie_set_readrq write size
When setting the PCI-E MRRS, pcie_set_readrq queries the current
settings via a pci_read_config_word call but writes the modified result
via a pci_write_config_dword.  This results in writing 16 more bits than
were queried.

Also, the function description comment is slightly incorrect.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 09:06:51 -07:00
Kenji Kaneshige 0cab0841dc PCI: pciehp: change wait time for valid configuration access
Naoki Yanagimoto reported that configuration read on some hot-added
PCIe device returns invalid value. This patch fixes this problem.

According to the PCIe spec, software must wait for at least 1 second
to judge if the hot-added device is broken after Data Link Layer State
Changed Event. This patch changes pciehp driver to wait for 1 second
after the Data Link Layer State Changed Event is detected before
initiating a configuration access instead of 100 ms.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Naoki Yanagimoto <yanagimoto@np.css.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 09:06:41 -07:00
Chris Wright 864d296cf9 PCI: ARI is a PCIe v2 feature
The function pci_enable_ari() may mistakenly set the downstream port
of a v1 PCIe switch in ARI Forwarding mode.  This is a PCIe v2 feature,
and with an SR-IOV device on that switch port believing the switch above
is ARI capable it may attempt to use functions 8-255, translating into
invalid (non-zero) device numbers for that bus.  This has been seen
to cause Completion Timeouts and general misbehaviour including hangs
and panics.

Cc: stable@kernel.org
Acked-by: Don Dutile <ddutile@redhat.com>
Tested-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:41:51 -07:00
Ralf Baechle d5341942d7 PCI: Make the struct pci_dev * argument of pci_fixup_irqs const.
Aside of the usual motivation for constification,  this function has a
history of being abused a hook for interrupt and other fixups so I turned
this function const ages ago in the MIPS code but it should be done
treewide.

Due to function pointer passing in varous places a few other functions
had to be constified as well.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
To: Anton Vorontsov <avorontsov@mvista.com>
To: Chris Metcalf <cmetcalf@tilera.com>
To: Colin Cross <ccross@android.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
To: Eric Miao <eric.y.miao@gmail.com>
To: Erik Gilling <konkers@android.com>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
To: "H. Peter Anvin" <hpa@zytor.com>
To: Imre Kaloz <kaloz@openwrt.org>
To: Ingo Molnar <mingo@redhat.com>
To: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
To: Jesse Barnes <jbarnes@virtuousgeek.org>
To: Krzysztof Halasa <khc@pm.waw.pl>
To: Lennert Buytenhek <kernel@wantstofly.org>
To: Matt Turner <mattst88@gmail.com>
To: Nicolas Pitre <nico@fluxnic.net>
To: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mundt <lethal@linux-sh.org>
To: Richard Henderson <rth@twiddle.net>
To: Russell King <linux@arm.linux.org.uk>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-pci@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:26:06 -07:00
Sergei Shtylyov 05d3ac267a PCI hotplug: cpqphp: use pci_dev->vendor
The driver reads PCI vendor ID from the PCI configuration register while it is
already stored by the PCI subsystem in the 'vendor' field of 'struct pci_dev'...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:25:43 -07:00
Sergei Shtylyov 69b3e6199a PCI hotplug: cpqphp: use pci_dev->subsystem_{vendor|device}
The driver reads PCI subsystem IDs from the PCI configuration registers while
they are already stored by the PCI subsystem in the 'subsystem_{vendor|device}'
fields of 'struct pci_dev'...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:25:42 -07:00
Tiejun Chen b1a98b695b PCI: enumerate the PCI device only removed out PCI hieratchy of OS when re-scanning PCI
When hot-plugging a root bridge, we always prevent assigning a bus number
that already exists. This makes sure we don't step over an existing bus.
But sometimes we only remove PCI device in PCI hieratchy of OS, i,e.

echo 1 > /sys/bus/pci/devices/.../remove

but actually don't hotplug this device out the platform, so in this case
we still should re-scan this bus to enumerate this device when re-scanning
PCI again.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:25:38 -07:00
Huang Ying 0918472cee PCI: PCIe AER: add aer_recover_queue
In addition to native PCIe AER, now APEI (ACPI Platform Error
Interface) GHES (Generic Hardware Error Source) can be used to report
PCIe AER errors too.  To add support to APEI GHES PCIe AER recovery,
aer_recover_queue is added to export the recovery function in native
PCIe AER driver.

Recoverable PCIe AER errors are reported via NMI in APEI GHES.  Then
APEI GHES uses irq_work to delay the error processing into an IRQ
handler.  But PCIe AER recovery can be very time-consuming, so
aer_recover_queue, which can be used in IRQ handler, delays the real
recovery action into the process context, that is, work queue.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:25:37 -07:00
Thomas Renninger efe6d7272b PCI hotplug: Rename is_ejectable which also exists in dock.c
While it's declared static, etags points you to the wrong function
in drivers/acpi/dock.c and acpiphp_glue.c for example also makes
use of some (exported..) functions from this file.

If you trust etags and oversee the static declaration (what happened
to me) one gets totally confused...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22 08:25:35 -07:00
Manoj Iyer 15bed0f2fa mmc: Added quirks for Ricoh 1180:e823 lower base clock frequency
Ricoh 1180:e823 does not recognize certain types of SD/MMC cards,
as reported at http://launchpad.net/bugs/773524.  Lowering the SD
base clock frequency from 200Mhz to 50Mhz fixes this issue. This
solution was suggest by Koji Matsumuro, Ricoh Company, Ltd.

This change has no negative performance effect on standard SD
cards, though it's quite possible that there will be one on
UHS-1 cards.

Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Cc: Koji Matsumuro <matsumur@nts.ricoh.co.jp>
Cc: <stable@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-07-21 10:35:04 -04:00
Vasiliy Kulikov 9c8b04be44 ACPI: constify ops structs
Structs battery_file, acpi_dock_ops, file_operations,
thermal_cooling_device_ops, thermal_zone_device_ops, kernel_param_ops
are not changed in runtime.  It is safe to make them const.
register_hotplug_dock_device() was altered to take const "ops" argument
to respect acpi_dock_ops' const notion.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-16 18:36:17 -04:00
Rafael J. Wysocki 7ae033cc0d Merge branch 'pm-runtime' into for-linus
* pm-runtime:
  OMAP: PM: disable idle on suspend for GPIO and UART
  OMAP: PM: omap_device: add API to disable idle on suspend
  OMAP: PM: omap_device: add system PM methods for PM domain handling
  OMAP: PM: omap_device: conditionally use PM domain runtime helpers
  PM / Runtime: Add new helper function: pm_runtime_status_suspended()
  PM / Runtime: Consistent utilization of deferred_resume
  PM / Runtime: Prevent runtime_resume from racing with probe
  PM / Runtime: Replace "run-time" with "runtime" in documentation
  PM / Runtime: Improve documentation of enable, disable and barrier
  PM: Limit race conditions between runtime PM and system sleep (v2)
  PCI / PM: Detect early wakeup in pci_pm_prepare()
  PM / Runtime: Return special error code if runtime PM is disabled
  PM / Runtime: Update documentation of interactions with system sleep
2011-07-15 23:59:25 +02:00
Jiri Kosina b7e9c223be Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
2011-07-11 14:15:55 +02:00
Ram Pai f483d3923d PCI: conditional resource-reallocation through kernel parameter pci=realloc
Multiple attempts to dynamically reallocate pci resources have
unfortunately lead to regressions. Though we continue to fix the
regressions and fine tune the dynamic-reallocation behavior, we have not
reached a acceptable state yet.
    
This patch provides a interim solution. It disables dynamic reallocation
by default, but adds the ability to enable it through pci=realloc kernel
command line parameter.
    
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-08 15:49:20 -07:00
Ingo Molnar b395fb36d5 Merge branch 'iommu-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into core/iommu 2011-07-07 12:58:28 +02:00
Rafael J. Wysocki eea3fc0357 PCI / PM: Detect early wakeup in pci_pm_prepare()
A subsequent patch is going to move the invocation of
pm_runtime_barrier() from dpm_prepare() to __device_suspend().
Consequently, early wakeup events resulting from runtime resume
requests for wakeup devices queued up right before system suspend
will only be detected after all of the subsystem-level .prepare()
callbacks have run.  However, the PCI bus type calls
pm_runtime_get_sync() from its pci_pm_prepare() callback routine,
so it would destroy the early wakeup events information regarding PCI
devices.  To prevent this from happening add an early wakeup
detection mechanism, analogous to the one currently in dpm_prepare(),
to pci_pm_prepare().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-06 10:51:40 +02:00
Michael Witten 8072ba1ba7 PCIe ASPM: forcedly -> forcibly
Merriam-Webster tells us that the word exists. However ...

  * Google suggests `forcibly' because it doesn't recognize `forcedly'.
  * Google lists 494 thousand results for `forcedly'.
  * Google lists 13.7 million results for `forcibly'.
  * Linus's repo contains  1 occurrence  of `forcedly' ( 0 after my change).
  * Linus's repo contains 60 occurrences of `forcibly' (61 after my change).

Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-29 14:24:14 +02:00
Linus Torvalds a64227b085 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: queue: bring discard_granularity/alignment into line with SCSI
  mmc: queue: append partition subname to queue thread name
  mmc: core: make erase timeout calculation allow for gated clock
  mmc: block: switch card to User Data Area when removing the block driver
  mmc: sdio: reset card during power_restore
  mmc: cb710: fix #ifdef HAVE_EFFICIENT_UNALIGNED_ACCESS
  mmc: sdhi: DMA slave ID 0 is invalid
  mmc: tmio: fix regression in TMIO_MMC_WRPROTECT_DISABLE handling
  mmc: omap_hsmmc: use original sg_len for dma_unmap_sg
  mmc: omap_hsmmc: fix ocr mask usage
  mmc: sdio: fix runtime PM path during driver removal
  mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
  mmc: sdhi: fix module unloading
  mmc: of_mmc_spi: add NO_IRQ define to of_mmc_spi.c
  mmc: vub300: fix null dereferences in error handling
2011-06-27 14:55:43 -07:00
Linus Torvalds 12f1ba5a7d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI/ACPI: fix type mismatch
  PCI: fix new kernel-doc warning
  PCI: Fix warning in drivers/pci/probe.c on sparc64
2011-06-24 08:36:16 -07:00
Rafael J. Wysocki a5f76d5eba PCI / PM: Block races between runtime PM and system sleep
After commit e866500247
(PM: Allow pm_runtime_suspend() to succeed during system suspend) it
is possible that a device resumed by the pm_runtime_resume(dev) in
pci_pm_prepare() will be suspended immediately from a work item,
timer function or otherwise, defeating the very purpose of calling
pm_runtime_resume(dev) from there.  To prevent that from happening
it is necessary to increment the runtime PM usage counter of the
device by replacing pm_runtime_resume() with pm_runtime_get_sync().
Moreover, the incremented runtime PM usage counter has to be
decremented by the corresponding pci_pm_complete(), via
pm_runtime_put_sync().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-21 23:47:15 +02:00
Joerg Roedel 801019d59d Merge branches 'amd/transparent-bridge' and 'core'
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 11:14:10 +02:00
Ohad Ben-Cohen 166e9278a3 x86/ia64: intel-iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.

As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:30 +02:00
Manoj Iyer be98ca652f mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-06-18 22:18:18 -04:00
Linus Torvalds f39e840995 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm: Compare only lower 32 bits of framebuffer map offsets
  drm/i915: Don't leak in i915_gem_shmem_pread_slow()
  drm/radeon/kms: do bounds checking for 3D_LOAD_VBPNTR and bump array limit
  drm/radeon/kms: fix mac g5 quirk
  x86/uv/x2apic: update for change in pci bridge handling.
  alpha, drm: Remove obsolete Alpha support in MGA DRM code
  alpha/drm: Cleanup Alpha support in DRM generic code
  savage: remove unnecessary if statement
  drm/radeon: fix GUI idle IH debug statements
  drm/radeon/kms: check modes against max pixel clock
  drm: fix fbs in DRM_IOCTL_MODE_GETRESOURCES ioctl
2011-06-14 11:25:32 -07:00
Dave Airlie 7ad35cf288 x86/uv/x2apic: update for change in pci bridge handling.
When I added 3448a19da4
I forgot about the special uv handling code for this, so this
patch fixes it up.

Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ingo Molnar
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-06-14 09:50:12 +10:00
Joe Perches 28f65c11f2 treewide: Convert uses of struct resource to resource_size(ptr)
Several fixes as well where the +1 was missing.

Done via coccinelle scripts like:

@@
struct resource *ptr;
@@

- ptr->end - ptr->start + 1
+ resource_size(ptr)

and some grep and typing.

Mostly uncompiled, no cross-compilers.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-10 14:55:36 +02:00
Linus Torvalds 7f45e5cd17 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc32, leon: bugfix in LEON SMP interrupt init
  sparc32, sun4m: bugfix in SMP IPI traphandler
  sparc: Remove unnecessary semicolons
  Add support for allocating irqs for bootbus devices
  Do not skip interrupt sources in sun4d interrupt handler and acknowledge interrupts correctly
  Restructure sun4d_build_device_irq so that timer interrupts can be allocated
  sparc: PCIC_PCI needs SPARC32 dependency
  sparc: Do not select GENERIC_HARDIRQS_NO_DEPRECATED
  sparc32,leon: add GRPCI2 PCI Host driver
  sparc32,leon: added LEON-common low-level PCI routines
  sparc32: added CONFIG_PCIC_PCI Kconfig setting
2011-06-09 16:33:01 -07:00
Benjamin Herrenschmidt 98d9f30c82 pci/of: Match PCI devices to OF nodes dynamically
powerpc has two different ways of matching PCI devices to their
corresponding OF node (if any) for historical reasons. The ppc64 one
does a scan looking for matching bus/dev/fn, while the ppc32 one does a
scan looking only for matching dev/fn on each level in order to be
agnostic to busses being renumbered (which Linux does on some
platforms).

This removes both and instead moves the matching code to the PCI core
itself. It's the most logical place to do it: when a pci_dev is created,
we know the parent and thus can do a single level scan for the matching
device_node (if any).

The benefit is that all archs now get the matching for free. There's one
hook the arch might want to provide to match a PHB bus to its device
node. A default weak implementation is provided that looks for the
parent device device node, but it's not entirely reliable on powerpc for
various reasons so powerpc provides its own.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:17 +10:00
Rafael J. Wysocki 99592ba4a8 PM / Intel IOMMU: Fix init_iommu_pm_ops() for CONFIG_PM unset
If CONFIG_PM is not set, init_iommu_pm_ops() introduced by commit
134fac3f45 (PCI / Intel IOMMU: Use
syscore_ops instead of sysdev class and sysdev) is not defined
appropriately.  Fix this issue.

Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-06-07 21:32:31 +02:00
Daniel Hellstrom 26893c1368 sparc32,leon: added LEON-common low-level PCI routines
The LEON architecture does not have a BIOS or bootloader that
initializes PCI for us, instead Linux generic PCI layer is used
to set up resources and IRQ.

Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-02 14:32:37 -07:00
Linus Torvalds f0f52a9463 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: Fix off-by-one in RMRR setup
  intel-iommu: Add domain check in domain_remove_one_dev_info
  intel-iommu: Remove Host Bridge devices from identity mapping
  intel-iommu: Use coherent DMA mask when requested
  intel-iommu: Dont cache iova above 32bit
  intel-iommu: Speed up processing of the identity_mapping function
  intel-iommu: Check for identity mapping candidate using system dma mask
  intel-iommu: Only unlink device domains from iommu
  intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
  intel-iommu: Flush unmaps at domain_exit
  intel-iommu: Remove obsolete comment from detect_intel_iommu
  intel-iommu: fix VT-d PMR disable for TXT on S3 resume
2011-06-02 05:48:50 +09:00
Randy Dunlap 3f37d6229c PCI: fix new kernel-doc warning
Fix pci.c kernel-doc warnings:

Warning(drivers/pci/pci.c:3292): No description found for parameter 'flags'
Warning(drivers/pci/pci.c:3292): Excess function parameter 'change_bridge_flags' description in 'pci_set_vga_state'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-01 11:43:29 -07:00
David Woodhouse 70e535d1e5 intel-iommu: Fix off-by-one in RMRR setup
We were mapping an extra byte (and hence usually an extra page):
iommu_prepare_identity_map() expects to be given an 'end' argument which
is the last byte to be mapped; not the first byte *not* to be mapped.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:48:21 +01:00
Mike Habeck 8519dc4401 intel-iommu: Add domain check in domain_remove_one_dev_info
The comment in domain_remove_one_dev_info() states "No need to compare
PCI domain; it has to be the same". But for the si_domain that isn't
going to be true, as it consists of all the PCI devices that are
identity mapped thus multiple PCI domains can be in si_domain.  The
code needs to validate the PCI domain too.

Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:48 +01:00
Mike Travis 825507d6d0 intel-iommu: Remove Host Bridge devices from identity mapping
When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices
that do not use the IOMMU causes a kernel panic.  Fix that by not
inserting those devices into the si_domain.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:46 +01:00
Mike Travis c681d0ba12 intel-iommu: Use coherent DMA mask when requested
The __intel_map_single function is not honoring the passed in DMA mask.
This results in not using the coherent DMA mask when called from
intel_alloc_coherent().

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:45 +01:00
Chris Wright 1c9fc3d11b intel-iommu: Dont cache iova above 32bit
Mike Travis and Mike Habeck reported an issue where iova allocation
would return a range that was larger than a device's dma mask.

https://lkml.org/lkml/2011/3/29/423

The dmar initialization code will reserve all PCI MMIO regions and copy
those reservations into a domain specific iova tree.  It is possible for
one of those regions to be above the dma mask of a device.  It is typical
to allocate iovas with a 32bit mask (despite device's dma mask possibly
being larger) and cache the result until it exhausts the lower 32bit
address space.  Freeing the iova range that is >= the last iova in the
lower 32bit range when there is still an iova above the 32bit range will
corrupt the cached iova by pointing it to a region that is above 32bit.
If that region is also larger than the device's dma mask, a subsequent
allocation will return an unusable iova and cause dma failure.

Simply don't cache an iova that is above the 32bit caching boundary.

Reported-by: Mike Travis <travis@sgi.com>
Reported-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Acked-by: Mike Travis <travis@sgi.com>
Tested-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:40 +01:00
Mike Travis cb452a4040 intel-iommu: Speed up processing of the identity_mapping function
When there are a large count of PCI devices, and the pass through
option for iommu is set, much time is spent in the identity_mapping
function hunting though the iommu domains to check if a specific
device is "identity mapped".

Speed up the function by checking the cached info to see if
it's mapped to the static identity domain.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:36 +01:00
Chris Wright 8fcc5372fb intel-iommu: Check for identity mapping candidate using system dma mask
The identity mapping code appears to make the assumption that if the
devices dma_mask is greater than 32bits the device can use identity
mapping.  But that is not true: take the case where we have a 40bit
device in a 44bit architecture. The device can potentially receive a
physical address that it will truncate and cause incorrect addresses
to be used.

Instead check to see if the device's dma_mask is large enough
to address the system's dma_mask.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:34 +01:00
Alex Williamson 9b4554b21e intel-iommu: Only unlink device domains from iommu
Commit a97590e5 added unlinking domains from iommus to reciprocate the
iommu from domains unlinking that was already done.  We actually want
to only do this for device domains and never for the static
identity map domain or VM domains.  The SI domain is special and
never freed, while VM domain->id lives in their own special address
space, separate from iommu->domain_ids.

In the current code, a VM can get domain->id zero, then mark that
domain unused when unbound from pci-stub.  This leads to DMAR
write faults when the device is re-bound to the host driver.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:47:29 +01:00
Youquan Song 6dd9a7c737 intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
There are no externally-visible changes with this. In the loop in the
internal __domain_mapping() function, we simply detect if we are mapping:
  - size >= 2MiB, and
  - virtual address aligned to 2MiB, and
  - physical address aligned to 2MiB, and
  - on hardware that supports superpages.

(and likewise for larger superpages).

We automatically use a superpage for such mappings. We never have to
worry about *breaking* superpages, since we trust that we will always
*unmap* the same range that was mapped. So all we need to do is ensure
that dma_pte_clear_range() will also cope with superpages.

Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
it can return a PTE at the appropriate level rather than always
extending the page tables all the way down to level 1. Again, this is
simplified by the fact that we should never encounter existing small
pages when we're creating a mapping; any old mapping that used the same
virtual range will have been entirely removed and its obsolete page
tables freed.

Provide an 'intel_iommu=sp_off' argument on the command line as a
chicken bit. Not that it should ever be required.

==

The original commit seen in the iommu-2.6.git was Youquan's
implementation (and completion) of my own half-baked code which I'd
typed into an email. Followed by half a dozen subsequent 'fixes'.

I've taken the unusual step of rewriting history and collapsing the
original commits in order to keep the main history simpler, and make
life easier for the people who are going to have to backport this to
older kernels. And also so I can give it a more coherent commit comment
which (hopefully) gives a better explanation of what's going on.

The original sequence of commits leading to identical code was:

Youquan Song (3):
      intel-iommu: super page support
      intel-iommu: Fix superpage alignment calculation error
      intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()

David Woodhouse (4):
      intel-iommu: Precalculate superpage support for dmar_domain
      intel-iommu: Fix hardware_largepage_caps()
      intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
      intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:26:35 +01:00
David S. Miller 5aceca9d3c PCI: Fix warning in drivers/pci/probe.c on sparc64
IO_SPACE_LIMIT is currently used in two ways:

1) As a way to mask I/O port values read out of PCI base address
   registers.  This value should be 64-bit.

2) As a value which is the upper limit for all I/O "ports" in the
   system.

On sparc64 we store the full 64-bit physical I/O address in the
resources.  For this reason we define IO_SPACE_LIMIT at a 64-bit
"all 1's".

This is the right value to use for ioport_resource.end and for the
check made in drivers/pcmcia/rsrc_nonstatic.c:adjust_io().

But in driver/pci/probe.c:__pci_read_base() we mask this against
a "u32" variable and thus get the following warning:

drivers/pci/probe.c: In function ¡__pci_read_base¢:
drivers/pci/probe.c:207: warning: large integer implicitly truncated to unsigned type

Fix this by using an explicit "u32" cast.

I considered changing sparc64 to define a 32-bit "all 1's" like
most other systems do, but this wouldn't work because the checks
in PCMCIA's rsrc_nonstatic.c would no longer be right since they
are testing against fully formed 64-bit resources.  As described
above, on sparc64 such resources will hold full 64-bit physical
I/O addresses, not bus-centric 32-bit ones.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-31 14:29:26 -07:00
Linus Torvalds daa94222b6 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI EC: remove redundant code
  ACPI: Add D3 cold state
  ACPI: processor: fix processor_physically_present in UP kernel
  ACPI: Split out custom_method functionality into an own driver
  ACPI: Cleanup custom_method debug stuff
  ACPI EC: enable MSI workaround for Quanta laptops
  ACPICA: Update to version 20110413
  ACPICA: Execute an orphan _REG method under the EC device
  ACPICA: Move ACPI_NUM_PREDEFINED_REGIONS to a more appropriate place
  ACPICA: Update internal address SpaceID for DataTable regions
  ACPICA: Add more methods eligible for NULL package element removal
  ACPICA: Split all internal Global Lock functions to new file - evglock
  ACPI: EC: add another DMI check for ASUS hardware
  ACPI EC: remove dead code
  ACPICA: Fix code divergence of global lock handling
  ACPICA: Use acpi_os_create_lock interface
  ACPI: osl, add acpi_os_create_lock interface
  ACPI:Fix goto flows in thermal-sys
2011-05-29 11:19:16 -07:00
Lin Ming 28c2103dad ACPI: Add D3 cold state
_SxW returns an Integer containing the lowest D-state supported in state
Sx. If OSPM has not indicated that it supports _PR3, then the value “3”
corresponds to D3.  If it has indicated _PR3 support, the value “3”
represents D3hot and the value “4” represents D3cold.

Linux does set _OSC._PR3, so we should fix it to expect that _SxW can
return 4.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 02:21:08 -04:00
Linus Torvalds 98b98d3163 Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (169 commits)
  drivers/gpu/drm/radeon/atom.c: fix warning
  drm/radeon/kms: bump kms version number
  drm/radeon/kms: properly set num banks for fusion asics
  drm/radeon/kms/atom: move dig phy init out of modesetting
  drm/radeon/kms/cayman: fix typo in register mask
  drm/radeon/kms: fix typo in spread spectrum code
  drm/radeon/kms: fix tile_config value reported to userspace on cayman.
  drm/radeon/kms: fix incorrect comparison in cayman setup code.
  drm/radeon/kms: add wait idle ioctl for eg->cayman
  drm/radeon/cayman: setup hdp to invalidate and flush when asked
  drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked
  agp/uninorth: Fix lockups with radeon KMS and >1x.
  drm/radeon/kms: the SS_Id field in the LCD table if for LVDS only
  drm/radeon/kms: properly set the CLK_REF bit for DCE3 devices
  drm/radeon/kms: fixup eDP connector handling
  drm/radeon/kms: bail early for eDP in hotplug callback
  drm/radeon/kms: simplify hotplug handler logic
  drm/radeon/kms: rewrite DP handling
  drm/radeon/kms/atom: add support for setting DP panel mode
  drm/radeon/kms: atombios.h updates for DP panel mode
  ...
2011-05-24 12:06:40 -07:00
Alex Williamson 7b66835781 intel-iommu: Flush unmaps at domain_exit
We typically batch unmaps to be lazily flushed out at
regular intervals.  When we destroy a domain, we need
to force a flush of these lazy unmaps to be sure none
reference the domain we're about to free.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=35062
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
2011-05-24 13:08:34 +01:00
Jan Kiszka b3a530e4e7 intel-iommu: Remove obsolete comment from detect_intel_iommu
Since cacd4213d8, this comment no longer applies.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-24 13:08:31 +01:00
Joseph Cihula b779260b09 intel-iommu: fix VT-d PMR disable for TXT on S3 resume
This patch is a follow on to https://lkml.org/lkml/2011/3/21/239, which
was merged as commit 51a63e67da.

This patch adds support for S3, as pointed out by Chris Wright.

Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-24 13:07:56 +01:00
Linus Torvalds 5e152b4c9e Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
  PCI: Don't use dmi_name_in_vendors in quirk
  PCI: remove unused AER functions
  PCI/sysfs: move bus cpuaffinity to class dev_attrs
  PCI: add rescan to /sys/.../pci_bus/.../
  PCI: update bridge resources to get more big ranges when allocating space (again)
  KVM: Use pci_store/load_saved_state() around VM device usage
  PCI: Add interfaces to store and load the device saved state
  PCI: Track the size of each saved capability data area
  PCI/e1000e: Add and use pci_disable_link_state_locked()
  x86/PCI: derive pcibios_last_bus from ACPI MCFG
  PCI: add latency tolerance reporting enable/disable support
  PCI: add OBFF enable/disable support
  PCI: add ID-based ordering enable/disable support
  PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
  PCI: Set PCIE maxpayload for card during hotplug insertion
  PCI/ACPI: Report _OSC control mask returned on failure to get control
  x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs
  PCI: handle positive error codes
  PCI: check pci_vpd_pci22_wait() return
  PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
  ...

Fix up trivial conflicts in include/linux/pci_ids.h: commit a6e5e2be44
moved the intel SMBUS ID definitons to the i2c-i801.c driver.
2011-05-23 15:39:34 -07:00
Jean Delvare 9251bac97d PCI: Don't use dmi_name_in_vendors in quirk
Don't use the costly dmi_name_in_vendors() when we know the string we
are looking for can only be in the DMI board name field. This is more
robust and, more importantly, much faster.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:15 -07:00
Chen Gong cbfddd2093 PCI: remove unused AER functions
In the commit 28eb5f2, aer_osc_setup is removed but corresponding
definiton information in the aerdrv.h is missed.

Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:14 -07:00
Yinghai Lu dc2c2c9dd5 PCI/sysfs: move bus cpuaffinity to class dev_attrs
Requested by Greg KH to fix a race condition in the creating of PCI bus
cpuaffinity files.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:13 -07:00
Yinghai Lu b9d320fcb6 PCI: add rescan to /sys/.../pci_bus/.../
After remove the device from /sys, we have to rescan all or
find out the bridge and access /sys../device/rescan there.

this patch add /sys/.../pci_bus/.../rescan. So user can rescan more easy.
that is more clean and easy to understand.

like after remove 0000:c4:00.0, you can rescan 0000:c4 directly.

-v2: According to Jesse, use function instead of exposing attr, so could hide
	#ifdef in header file.
     also add code to remove rescan file in remove path.
-v3: GregKH pointed out that we should use dev_attrs to avoid racing.
     So add pcibus_attrs and make it to be member of pcibus_attrs.
-v4: Change name to pcibus_dev_attrs according to GregKH

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:12 -07:00
Yinghai Lu da7822e5ad PCI: update bridge resources to get more big ranges when allocating space (again)
With Ram's fixes, this should be safe to do again.  So let's give it
another try.

BIOS separates IO ranges between several IOHs, and on some slots, BIOS
assigns resources to a bridge, but stops assigning resources to the
device under that bridge, because the device needs a big resource.

So:
1. allocate resources and record the failed device resources
2. clear the BIOS assigned resources of the parent bridge of failing device
3. go back and call pci assign unassigned
4. if it still fails, go up the tree, clear more bridges. and try again

Now Ram's allocate requested resource already got into mainline. could
put this one again.

Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:11 -07:00
Alex Williamson ffbdd3f793 PCI: Add interfaces to store and load the device saved state
For KVM device assignment, we'd like to save off the state of a device
prior to passing it to the guest and restore it later.  We also want
to allow pci_reset_funciton() to be called while the device is owned
by the guest.  This however overwrites and invalidates the struct pci_dev
buffers, so we can't just manually call save and restore.  Add generic
interfaces for the saved state to be stored and reloaded back into
struct pci_dev at a later time.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:09 -07:00
Alex Williamson 24a4742f0b PCI: Track the size of each saved capability data area
This will allow us to store and load it later.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:08 -07:00
Yinghai Lu 9f728f53dd PCI/e1000e: Add and use pci_disable_link_state_locked()
Need to use it in _e1000e_disable_aspm.  This routine is used for error
recovery, where the pci_bus_sem is already held, and we don't want
pci_disable_link_state to try to take it again.  So add a locked variant
for use in cases like this.

Found lock up:

[ 2374.654557] kworker/32:1    D ffff881027f6b0f0     0  6075      2 0x00000000
[ 2374.654816]  ffff88503f099a68 0000000000000046 ffff88503f098000 0000000000004000
[ 2374.654837]  00000000001d1ec0 ffff88503f099fd8 00000000001d1ec0 ffff88503f099fd8
[ 2374.654860]  0000000000004000 00000000001d1ec0 ffff88503dcc8000 ffff88503f090000
[ 2374.654880] Call Trace:
[ 2374.654898]  [<ffffffff810b1302>] ? __lock_acquired+0x3a/0x224
[ 2374.654914]  [<ffffffff81c2b59c>] ? _raw_spin_unlock_irq+0x30/0x36
[ 2374.654925]  [<ffffffff810b069d>] ? trace_hardirqs_on_caller+0x1f/0x178
[ 2374.654936]  [<ffffffff81c2ab24>] rwsem_down_failed_common+0xd3/0x103
[ 2374.654945]  [<ffffffff810b158f>] ? __lock_contended+0x3a/0x2a2
[ 2374.654955]  [<ffffffff81c2ab7b>] rwsem_down_read_failed+0x12/0x14
[ 2374.654967]  [<ffffffff813371e4>] call_rwsem_down_read_failed+0x14/0x30
[ 2374.654981]  [<ffffffff8135df20>] ? pci_disable_link_state+0x5f/0xf5
[ 2374.654990]  [<ffffffff81c2a0e6>] ? down_read+0x7e/0x91
[ 2374.654999]  [<ffffffff8135df20>] ? pci_disable_link_state+0x5f/0xf5
[ 2374.655008]  [<ffffffff8135df20>] pci_disable_link_state+0x5f/0xf5
[ 2374.655024]  [<ffffffff81661796>] e1000e_disable_aspm+0x55/0x5a
[ 2374.655037]  [<ffffffff816677eb>] e1000_io_slot_reset+0x59/0xea
[ 2374.655048]  [<ffffffff8135fe0d>] ? report_mmio_enabled+0x5d/0x5d
[ 2374.655057]  [<ffffffff8135fe3b>] report_slot_reset+0x2e/0x5d
[ 2374.655072]  [<ffffffff8135369e>] pci_walk_bus+0x8a/0xb7
[ 2374.655081]  [<ffffffff8135fe0d>] ? report_mmio_enabled+0x5d/0x5d
[ 2374.655091]  [<ffffffff813603be>] broadcast_error_message+0xa4/0xb2
[ 2374.655101]  [<ffffffff81352c71>] ? pci_bus_read_config_dword+0x72/0x80
[ 2374.655110]  [<ffffffff813606df>] do_recovery+0x9e/0xf9
[ 2374.655120]  [<ffffffff81360786>] handle_error_source+0x4c/0x51
[ 2374.655129]  [<ffffffff81360974>] aer_isr_one_error+0x1e9/0x21a
[ 2374.655138]  [<ffffffff81360a6c>] aer_isr+0xc7/0xcc
[ 2374.655147]  [<ffffffff813609a5>] ? aer_isr_one_error+0x21a/0x21a
[ 2374.655159]  [<ffffffff81096d9f>] process_one_work+0x237/0x3ec
[ 2374.655168]  [<ffffffff81096d10>] ? process_one_work+0x1a8/0x3ec
[ 2374.655178]  [<ffffffff8109728d>] worker_thread+0x17c/0x240
[ 2374.655186]  [<ffffffff810b0803>] ? trace_hardirqs_on+0xd/0xf
[ 2374.655196]  [<ffffffff81097111>] ? manage_workers+0xab/0xab
[ 2374.655209]  [<ffffffff8109c8ed>] kthread+0xa0/0xa8
[ 2374.655223]  [<ffffffff81c332d4>] kernel_thread_helper+0x4/0x10
[ 2374.655232]  [<ffffffff81c2b880>] ? retint_restore_args+0xe/0xe
[ 2374.655243]  [<ffffffff8109c84d>] ? __init_kthread_worker+0x5b/0x5b
[ 2374.655252]  [<ffffffff81c332d0>] ? gs_change+0xb/0xb

when aer happens,
pci_walk_bus already have down_read(&pci_bus_sem)...
then report_slot_reset
        ==> e1000_io_slot_reset
                ==> e1000e_disable_aspm
                        ==> pci_disable_link_state...

We can not use pci_disable_link_state, and it will try to hold pci_bus_sem again.

Try to have __pci_disable_link_state that will not need to hold pci_bus_sem.

-v2: change name to pci_disable_link_state_locked() according to Jesse.

[jbarnes: make sure new function is exported for modules]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:16:44 -07:00
Linus Torvalds cbdad8dc18 Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, gart: Rename pci-gart_64.c to amd_gart_64.c
  x86/amd-iommu: Use threaded interupt handler
  arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS
  x86/amd-iommu: Add support for invalidate_all command
  x86/amd-iommu: Add extended feature detection
  x86/amd-iommu: Add ATS enable/disable code
  x86/amd-iommu: Add flag to indicate IOTLB support
  x86/amd-iommu: Flush device IOTLB if ATS is enabled
  x86/amd-iommu: Select PCI_IOV with AMD IOMMU driver
  PCI: Move ATS declarations in seperate header file
  dma-debug: print information about leaked entry
  x86/amd-iommu: Flush all internal TLBs when IOMMUs are enabled
  x86/amd-iommu: Rename iommu_flush_device
  x86/amd-iommu: Improve handling of full command buffer
  x86/amd-iommu: Rename iommu_flush* to domain_flush*
  x86/amd-iommu: Remove command buffer resetting logic
  x86/amd-iommu: Cleanup completion-wait handling
  x86/amd-iommu: Cleanup inv_pages command handling
  x86/amd-iommu: Move inv-dte command building to own function
  x86/amd-iommu: Move compl-wait command building to own function
2011-05-19 17:28:58 -07:00
Yinghai Lu 93d2175d3d PCI: Clear bridge resource flags if requested size is 0
During pci remove/rescan testing found:

  pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
  pci 0000:c0:03.0:   bridge window [io  0x1000-0x0fff]
  pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
  pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
  pci 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
  pci 0000:c0:03.0: Error enabling bridge (-22), continuing
  pci 0000:c0:03.0: enabling bus mastering
  pci 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
  pcieport: probe of 0000:c0:03.0 failed with error -22

This bug was caused by commit c8adf9a3e8 ("PCI: pre-allocate
additional resources to devices only after successful allocation of
essential resources.")

After that commit, pci_hotplug_io_size is changed to additional_io_size
from minium size.  So it will not go through resource_size(res) != 0
path, and will not be reset.

The root cause is: pci_bridge_check_ranges will set RESOURCE_IO flag for
pci bridge, and later if children do not need IO resource.  those bridge
resources will not need to be allocated.  but flags is still there.
that will confuse the the pci_enable_bridges later.

related code:

   static void assign_requested_resources_sorted(struct resource_list *head,
                                    struct resource_list_x *fail_head)
   {
           struct resource *res;
           struct resource_list *list;
           int idx;

           for (list = head->next; list; list = list->next) {
                   res = list->res;
                   idx = res - &list->dev->resource[0];
                   if (resource_size(res) && pci_assign_resource(list->dev, idx)) {
   ...
                           reset_resource(res);
                   }
           }
   }

At last, We have to clear the flags in pbus_size_mem/io when requested
size == 0 and !add_head.  becasue this case it will not go through
adjust_resources_sorted().

Just make size1 = size0 when !add_head. it will make flags get cleared.

At the same time when requested size == 0, add_size != 0, will still
have in head and add_list.  because we do not clear the flags for it.

After this, we will get right result:

  pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
  pci 0000:c0:03.0:   bridge window [io  disabled]
  pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
  pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
  pci 0000:c0:03.0: enabling bus mastering
  pci 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: setting latency timer to 64
  pcieport 0000:c0:03.0: irq 160 for MSI/MSI-X
  pcieport 0000:c0:03.0: Signaling PME through PCIe PME interrupt
  pci 0000:c4:00.0: Signaling PME through PCIe PME interrupt
  pcie_pme 0000:c0:03.0:pcie01: service driver pcie_pme loaded
  aer 0000:c0:03.0:pcie02: service driver aer loaded
  pciehp 0000:c0:03.0:pcie04: Hotplug Controller:

v3: more simple fix. also fix one typo in pbus_size_mem

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-16 18:33:35 -07:00
Jesse Barnes 51c2e0a7e5 PCI: add latency tolerance reporting enable/disable support
Latency tolerance reporting allows devices to send messages to the root
complex indicating their latency tolerance for snooped & unsnooped
memory transactions.  Add support for enabling & disabling this
feature, along with a routine to set the max latencies a device should
send upstream.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-11 15:18:53 -07:00
Jesse Barnes 48a92a8179 PCI: add OBFF enable/disable support
OBFF (optimized buffer flush/fill), where supported, can help improve
energy efficiency by giving devices information about when interrupts
and other activity will have a reduced power impact.  It requires
support from both the device and system (i.e. not only does the device
need to respond to OBFF messages, but the platform must be capable of
generating and routing them to the end point).

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-11 15:18:48 -07:00
Jesse Barnes b48d4425b6 PCI: add ID-based ordering enable/disable support
Add support to allow drivers to enable/disable ID-based ordering.  Where
supported, ID-based ordering can significantly improve the latency of
individual requests by preventing them from queueing up behind unrelated
traffic.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-11 15:18:40 -07:00
Ian Campbell 69643e4829 PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
Devices which do not support PCI configuration space based power
management may not otherwise be enabled.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-11 09:13:14 -07:00
Jordan_Hargrave@Dell.com e522a7126c PCI: Set PCIE maxpayload for card during hotplug insertion
The following patch sets the MaxPayload setting to match the parent
reading when inserting a PCIE card into a hotplug slot.  On our system,
the upstream bridge is set to 256, but when inserting a card, the card
setting defaults to 128.  As soon as I/O is performed to the card it
starts receiving errors since the payload size is too small.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:41 -07:00
Greg Thelen 34e3207205 PCI: handle positive error codes
Callers expect pci_user_{read,write}_config_*() to indicate errors by
returning negative values.  Prior to this change, the indicated routines
could return positive error codes (e.g. PCIBIOS_BAD_REGISTER_NUMBER)
which callers would mistakenly interpret as success.

This change converts any non-zero return from the mentioned routines
into unambiguous negative value return codes.

Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:36 -07:00
Greg Thelen d97ecd8191 PCI: check pci_vpd_pci22_wait() return
pci_vpd_pci22_write() calls pci_vpd_pci22_wait() after writing
PCI_VPD_DATA and PCI_VPD_ADDR to wait for the VPD operation to complete.
The result pci_vpd_pci22_wait() was not checked for error.

This change checks for error.

Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:35 -07:00
Jean Delvare b6d95bb63c PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
We were just lucky that ICH4_GPIO_EN and ICH6_GPIO_EN happen to have
the same value.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:34 -07:00
Jean Delvare 5d9c0a795f PCI: Fix typo in ich7 quirk comment
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:33 -07:00
Hemant Pedanekar 63c4408074 PCI: Add quirk for setting valid class for TI816X Endpoint
TI816X (common name for DM816x/C6A816x/AM389x family) devices configured
to boot as PCIe Endpoint have class code = 0. This makes kernel PCI bus
code to skip allocating BARs to these devices resulting into following
type of error when trying to enable them:

"Device 0000:01:00.0 not available because of resource collisions"

The device cannot be operated because of the above issue.

This patch adds a ID specific (TI VENDOR ID and 816X DEVICE ID based)
'early' fixup quirk to replace class code with
PCI_CLASS_MULTIMEDIA_VIDEO as class.

Signed-off-by: Hemant Pedanekar <hemantp@ti.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:31 -07:00
Wanlong Gao 40294d8f14 PCI: Fix uninitialized variable bug in AER injection code
If it was preempted, and the variable aer_mask_override is changed
after the spin_unlock_irqrestore it will write an uninitialized
variable by the pci_write_config_dword() function.

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:30 -07:00
Rafael J. Wysocki 83d74e036b PCI/PM: Add kerneldoc description of pci_pm_reset()
The pci_pm_reset() function is not a very nice interface due to its
limitations and conditional behavior (e.g. it doesn't affect devices
in low-power states), but it cannot be simply dropped, because
existing device drivers may depend on it.  However, its behavior and
limitations should be well documented, so add an appropriate
kerneldoc comment to it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:29 -07:00
Alex Williamson 3504e47ffc PCI: Enable ASPM state clearing regardless of policy
Commit 2f671e2d allowed us to clear ASPM state when the FADT
tells us it isn't supported, but we don't put this into effect
if the aspm_policy is set to POLICY_POWERSAVE.  Enable the
state to be cleared regardless of policy.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-10 15:43:28 -07:00
Joerg Roedel 604c307bf4 Merge branches 'dma-debug/next', 'amd-iommu/command-cleanups', 'amd-iommu/ats' and 'amd-iommu/extended-features' into iommu/2.6.40
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c
	arch/x86/kernel/amd_iommu_init.c
2011-05-10 10:25:23 +02:00
Dave Airlie 3448a19da4 vgaarb: use bridges to control VGA routing where possible.
So in a lot of modern systems, a GPU will always be below a parent bridge that won't share with any other GPUs. This means VGA arbitration on those GPUs can be controlled by using the bridge routing instead of io/mem decodes.

The problem is locating which GPUs share which upstream bridges. This patch attempts to identify all the GPUs which can be controlled via bridges, and ones that can't. This patch endeavours to work out the bridge sharing semantics.

When disabling GPUs via a bridge, it doesn't do irq callbacks or touch the io/mem decodes for the gpu.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-04 13:38:46 +10:00