9d265124d0 and 15a260d53f added quirks for P2P bridges that support
I/O windows that start/end at 1K boundaries, not just the 4K boundaries
defined by the PCI spec. For details, see the IOBL_ADR register and the
EN1K bit in the CNF register in the Intel 82870P2 (P64H2).
These quirks complicate the code that reads P2P bridge windows
(pci_read_bridge_io() and pci_cfg_fake_ranges()) because the bridge
I/O resource is updated in the HEADER quirk, in pci_read_bridge_io(),
in pci_setup_bridge(), and again in the FINAL quirk. This is confusing
and makes it impossible to reassign the bridge windows after FINAL
quirks are run.
This patch adds support for 1K windows in the generic paths, so the
HEADER quirk only has to enable this support. The FINAL quirk, which
used to undo damage done by pci_setup_bridge(), is no longer needed.
This removes "if (!res->start) res->start = ..." from pci_read_bridge_io();
that was part of 9d265124d0 to avoid overwriting the resource filled in
by the quirk. Since pci_read_bridge_io() itself now knows about
granularity, the quirk no longer updates the resource and this test is no
longer needed.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When we update 64-bit BARs, we have to perform two config writes. Between
the writes, the half-written BAR value could match a MEM access intended
for another device. This could result in corruption of this device (for
writes) or an unexpected response machine check (for reads).
To prevent this, disable MEM decoding while updating such BARs. This uses
the same safety test as 253d2e5498, which disables both MEM and IO while
sizing BARs, namely, we don't disable decoding for host bridge devices.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
After 253d2e5498, we disable MEM and IO decoding for most devices while we
size 32-bit BARs. However, we restore the original COMMAND register before
we size the upper 32 bits of 64-bit BARs, so we can still cause a conflict.
This patch waits to restore the original COMMAND register until we're
completely finished sizing the BAR.
Reference: https://lkml.org/lkml/2007/8/25/154
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The enable/suspend/resume_early/resume fixups can be called at any time, so
they can't be __init or __devinit.
[bhelgaas: changelog]
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
__nv_msi_ht_cap_quirk() acquires a temporary reference via
'pci_get_bus_and_slot()' that is never released.
This patch releases the temporary reference.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch restructures pci_do_fixups()'s quirk invocations in the style
of initcall_debug_start() and initcall_debug_report(), so we have only
one call site for the quirk.
[bhelgaas: changelog]
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
cd81e1ea1a added checks that prevent us from using P2P bridge windows
that start at PCI bus address zero. The reason was to "prevent us from
overwriting resources that are unassigned."
But generic code should allow address zero in both BARs and bridge
windows, so I think that commit was a mistake.
Windows at bus address zero are legal and likely to exist on machines with
an offset between bus addresses and CPU addresses. For example, in the
following hypothetical scenario, the bridge at 00:01.0 has a window at bus
address zero and the device at 01:00.0 has a BAR at bus address zero, and
I think both are perfectly valid:
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x100000000-0x1ffffffff] (bus address [0x00000000-0xffffffff])
pci 0000:00:01.0: PCI bridge to [bus 01]
pci 0000:00:01.0: bridge window [mem 0x100000000-0x100ffffff]
pci 0000:01:00.0: reg 10: [mem 0x100000000-0x100ffffff]
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/myron-pcibios_setup:
xtensa/PCI: factor out pcibios_setup()
x86/PCI: adjust section annotations for pcibios_setup()
unicore32/PCI: adjust section annotations for pcibios_setup()
tile/PCI: factor out pcibios_setup()
sparc/PCI: factor out pcibios_setup()
sh/PCI: adjust section annotations for pcibios_setup()
sh/PCI: factor out pcibios_setup()
powerpc/PCI: factor out pcibios_setup()
parisc/PCI: factor out pcibios_setup()
MIPS/PCI: adjust section annotations for pcibios_setup()
MIPS/PCI: factor out pcibios_setup()
microblaze/PCI: factor out pcibios_setup()
ia64/PCI: factor out pcibios_setup()
cris/PCI: factor out pcibios_setup()
alpha/PCI: factor out pcibios_setup()
PCI: pull pcibios_setup() up into core
Commit cc2893b6 (PCI: Ensure we re-enable devices on resume)
addressed the problem with USB not being powered after resume on
recent Lenovo machines, but it did that in a suboptimal way.
Namely, it should have changed the relevant code paths only,
which are pci_pm_resume_noirq() and pci_pm_restore_noirq() supposed
to restore the device's power and standard configuration registers
after system resume from suspend or hibernation. Instead, however,
it modified pci_set_power_state() which is executed in several
other situations too. That resulted in some undesirable effects,
like attempting to change a device's power state in the same way
multiple times in a row (up to as many as 4 times in a row in the
snd_hda_intel driver).
Fix the bug addressed by commit cc2893b6 in an alternative way,
by forcibly powering up all devices in pci_pm_default_resume_early(),
which is called by pci_pm_resume_noirq() and pci_pm_restore_noirq()
to restore the device's power and standard configuration registers,
and modifying pci_pm_runtime_resume() to avoid the forcible power-up
if not necessary. Then, revert the changes made by commit cc2893b6
to make the confusion introduced by it go away.
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, all of the architectures implement their own pcibios_setup()
routine. Most of the implementations do nothing so this patch introduces
a generic (__weak) routine in the core that can be used by all
architectures as a default. If necessary, it can be overridden by
architecture-specific code.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* topic/huang-d3cold-v7:
PCI/PM: add PCIe runtime D3cold support
PCI: do not call pci_set_power_state with PCI_D3cold
PCI/PM: add runtime PM support to PCIe port
ACPI/PM: specify lowest allowed state for device sleep state
This patch adds runtime D3cold support and corresponding ACPI platform
support. This patch only enables runtime D3cold support; it does not
enable D3cold support during system suspend/hibernate.
D3cold is the deepest power saving state for a PCIe device, where its main
power is removed. While it is in D3cold, you can't access the device at
all, not even its configuration space (which is still accessible in D3hot).
Therefore the PCI PM registers can not be used to transition into/out of
the D3cold state; that must be done by platform logic such as ACPI _PR3.
To support wakeup from D3cold, a system may provide auxiliary power, which
allows a device to request wakeup using a Beacon or the sideband WAKE#
signal. WAKE# is usually connected to platform logic such as ACPI GPE.
This is quite different from other power saving states, where devices
request wakeup via a PME message on the PCIe link.
Some devices, such as those in plug-in slots, have no direct platform
logic. For example, there is usually no ACPI _PR3 for them. D3cold
support for these devices can be done via the PCIe Downstream Port leading
to the device. When the PCIe port is powered on/off, the device is powered
on/off too. Wakeup events from the device will be notified to the
corresponding PCIe port.
For more information about PCIe D3cold and corresponding ACPI support,
please refer to:
- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0
[bhelgaas: changelog]
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch adds runtime PM support to PCIe port. This is needed by
PCIe D3cold support, where PCIe device without ACPI node may be
powered on/off by PCIe port.
Because runtime suspend is broken for some chipsets, a black list is
used to disable runtime PM support for these chipsets.
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Lower device sleep state can save more power, but has more exit
latency too. Sometimes, to satisfy some power QoS and other
requirement, we need to constrain the lowest device sleep state.
In this patch, a parameter to specify lowest allowed state for
acpi_pm_device_sleep_state is added. So that the caller can enforce
the constraint via the parameter.
This is needed by PCIe D3cold support, where the lowest power state
allowed may be D3_HOT instead of default D3_COLD.
CC: Len Brown <lenb@kernel.org>
CC: linux-acpi@vger.kernel.org
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* topic/jiang-mmconfig-v10:
ACPI: mark acpi_sfi_table_parse() as __init
x86/PCI: use pr_level() to replace printk(KERN_LEVEL)
x86/PCI: refine __pci_mmcfg_init() for better code readability
x86/PCI: get rid of redundant log messages
x86/PCI: simplify pci_mmcfg_late_insert_resources()
x86/PCI: update MMCONFIG information when hot-plugging PCI host bridges
PCI/ACPI: provide MMCONFIG address for PCI host bridges
x86/PCI: add pci_mmconfig_insert()/delete() for PCI root bridge hotplug
x86/PCI: prepare pci_mmcfg_check_reserved() to be called at runtime
x86/PCI: introduce pci_mmcfg_arch_map()/pci_mmcfg_arch_unmap()
x86/PCI: use RCU list to protect mmconfig list
x86/PCI: split out pci_mmconfig_alloc() for code reuse
x86/PCI: split out pci_mmcfg_check_reserved() for code reuse
This patch provide MMCONFIG address for PCI host bridges, which will
be used to support host bridge hotplug. It gets MMCONFIG address
by evaluating _CBA method if available.
Reviewed-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
For a valid pci_dev, dev->bus != NULL always, so remove this
unnecessary test.
Found by Coverity (CID 101680).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Check whether we evaluated _ADR successfully. Previously we ignored
failure, so we would have used garbage data from the stack as the device
and function number.
We return AE_OK so that we ignore only this slot and continue looking
for other slots.
Found by Coverity (CID 113981).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
"slots_not_empty" is initialized to zero and can't be set again before
reaching this point, so this return statement is dead. Remove it.
Found by Coverity (CID 114324).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
On P2P bridges with 32-bit I/O decoding, we incorrectly sign-extended
windows starting at 0x80000000 or above. In "base |= (io_base_hi << 16)",
"io_base_hi" is promoted to a signed int before being extended to an
unsigned long.
This would cause a window starting at I/O address 0x80000000 to be
treated as though it started at 0xffffffff80008000 instead, which
should cause "no compatible bridge window" errors when we enumerate
devices using that I/O space.
The mmio and mmio_pref casts are not strictly necessary, but without
them, correctness depends on the types of the PCI_MEMORY_RANGE_MASK and
PCI_PREF_RANGE_MASK constants, which are not obvious from reading the
local code.
Found by Coverity (CID 138747 and CID 138748).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_enable_obff() and pci_enable_ltr() incorrectly check "dev->bus" instead
of "dev->bus->self" to determine whether the upstream device is a P2P
bridge or a host bridge. For devices on the root bus, the upstream device
is a host bridge, "dev->bus != NULL" and "dev->bus->self == NULL", and we
panic with a null pointer dereference.
These functions should previously have panicked when called on devices
supporting OBFF or LTR, so they should be regarded as untested.
Found by Coverity (CID 143038 and CID 143039).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Warning(drivers/pci/setup-bus.c:277): No description found for parameter 'fail_head'
Warning(drivers/pci/setup-bus.c:277): Excess function parameter 'failed_list' description in 'assign_requested_resources_sorted'
Signed-off-by: Wanpeng Li <liwp.linux@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* topic/sebastian-devinit-fixups:
scripts/modpost: check for bad references in .pci.fixups area
sh/PCI: move fixup hooks from __init to __devinit
powerpc/PCI: move fixup hooks from __init to __devinit
frv/PCI: move fixup hooks from __init to __devinit
arm/PCI: move fixup hooks from __init to __devinit
alpha/PCI: move fixup hooks from __init to __devinit
PCI: move fixup hooks from __init to __devinit
x86/PCI: move fixup hooks from __init to __devinit
Passes pci_intx_mask_supported test but continues to send interrupts
as discovered through VFIO-based device assignment.
http://www.spinics.net/lists/kvm/msg73738.html
[bhelgaas: use HEADER, not FINAL, which is currently broken for hotplug]
Tested-by: Andreas Hartmann <andihartmann@01019freenet.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
According to
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/91388
the T310 does not properly support INTx masking as it fails to keep the
PCI_STATUS_INTERRUPT bit updated once the interrupt is masked. Mark this
adapter as broken so that pci_intx_mask_supported won't report it as
compatible.
[bhelgaas: use HEADER, not FINAL, which is currently broken for hotplug]
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_intx_mask_supported() assumes INTx masking is supported if the
PCI_COMMAND_INTX_DISABLE bit is writable. But when that bit is set,
some devices don't actually mask INTx or update PCI_STATUS_INTERRUPT
as we expect.
This patch adds a way for quirks to identify these broken devices.
[bhelgaas: split out from Chelsio quirk addition]
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When we add a device with acpiphp, we enumerate all functions in the
slot with pci_scan_slot(), regardless of whether they have associated
ACPI methods such as _EJ0.
When removing the device, we previously removed only the functions
with those ACPI methods. This patch makes the remove symmetric with the
add: we remove all functions in the slot, whether they have associated
ACPI methods or not.
With qemu-kvm and SeaBIOS, we can build a multi-function device where
only function 0 has _EJ0 and _ADR (see bugzilla below). Removing and
re-adding that slot (including all functions of the device) works correctly
with Windows guests. This patch makes it also work in Linux guests.
[bhelgaas: restructure loop iteration, pull out of slot->funcs loop]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=43219
Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously, we acquired two references to function 0, but only released
one.
[bhelgaas: split this out from "remove all functions" fix]
Signed-off-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
All callers of pci_do_scan_bus() are gone, so remove it.
Note that pci_do_scan_bus() was exported, so out-of-tree modules could
depend on it.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the new generic pci_hp_add_bridge() interface.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the new generic pci_hp_add_bridge() interface.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the new generic pci_hp_add_bridge() interface.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the new generic pci_hp_add_bridge() interface.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the new generic pci_hp_add_bridge() interface.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the new generic pci_hp_add_bridge() interface.
[bhelgaas: split "add generic pci_hp_add_bridge()" into a separate patch]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This creates a generic pci_hp_add_bridge() that can be used by several
hotplug drivers.
[bhelgaas: split out from pciehp patch]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now we can insert busn_res now, after all root bus's get inserted.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We need to put into the resources list for legacy system.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some callers do not supply the bus number aperture, usually because they do
not know the end. In this case, we assume the aperture extends from the
root bus number to bus 255, scan the bus, and shrink the bus number
resource so it ends at the largest bus number we found.
This is obviously not correct because the actual end of the aperture may
well be larger than the largest bus number we found. But I guess it's all
we have for now.
Also print out one info about that, so we could find out which path
does not have busn_res in resources list.
[bhelgaas: changelog, _safe iterator unnecessary, use %pR format for bus]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Will use them insert/update busn res in pci_bus struct.
[bhelgaas: print conflicting entry if insertion fails]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This adds get_pci_domain_busn_res(), which returns the root of the
bus number resource tree for a domain, creating it if necessary.
We will later populate the tree with the bus numbers used by host
bridges and P2P bridges in the domain.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Replace the struct pci_bus secondary/subordinate members with the
struct resource busn_res. Later we'll build a resource tree of these
bus numbers.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch (as1558) fixes a problem affecting several ASUS computers:
The machine crashes or corrupts memory when going into suspend if the
ehci-hcd driver is bound to any controllers. Users have been forced
to unbind or unload ehci-hcd before putting their systems to sleep.
After extensive testing, it was determined that the machines don't
like going into suspend when any EHCI controllers are in the PCI D3
power state. Presumably this is a firmware bug, but there's nothing
we can do about it except to avoid putting the controllers in D3
during system sleep.
The patch adds a new flag to indicate whether the problem is present,
and avoids changing the controller's power state if the flag is set.
Runtime suspend is unaffected; this matters only for system suspend.
However as a side effect, the controller will not respond to remote
wakeup requests while the system is asleep. Hence USB wakeup is not
functional -- but of course, this is already true in the current state
of affairs.
A similar patch has already been applied as commit
151b612847 (USB: EHCI: fix crash during
suspend on ASUS computers). The patch supersedes that one and reverts
it. There are two differences:
The old patch added the flag at the USB level; this patch
adds it at the PCI level.
The old patch applied to all chipsets with the same vendor,
subsystem vendor, and product IDs; this patch makes an
exception for a known-good system (based on DMI information).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
VFIO PCI support will make use of these for user-initiated
PCI config accesses.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In a PCI environment, transactions aren't always required to reach
the root bus before being re-routed. Intermediate switches between
an endpoint and the root bus can redirect DMA back downstream before
things like IOMMUs have a chance to intervene. Legacy PCI is always
susceptible to this as it operates on a shared bus. PCIe added a
new capability to describe and control this behavior, Access Control
Services, or ACS.
The utility function pci_acs_enabled() allows us to test the ACS
capabilities of an individual devices against a set of flags while
pci_acs_path_enabled() tests a complete path from a given downstream
device up to the specified upstream device. We also include the
ability to add device specific tests as it's likely we'll see
devices that do not implement ACS, but want to indicate support
for various capabilities in this space.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The fixups are executed once the pci-device is found which is during
boot process so __init seems fine as long as the platform does not
support hotplug.
However it is possible to remove the PCI bus at run time and have it
rediscovered again via "echo 1 > /sys/bus/pci/rescan" and this will call
the fixups again.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Unlike PCI Express v1's Capabilities Structure, v2's requires the entire
structure to be implemented. In v2 structures, register fields that
are not implemented are present but hardwired to 0x0. These may
include: Link Capabilities, Status, and Control; Slot Capabilities,
Status, and Control; Root Capabilities, Status, and Control; and all of
the '2' (Device, Link, and Slot) Capabilities, Status, and Control
registers.
This patch removes the redundant capability checks corresponding to the
Link 2's and Slot 2's, Capabilities, Status, and Control registers as they
will be present if Device Capabilities 2's registers are (which explains
why the macros for each of the three are identical).
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch resolves potential issues when accessing PCI Express
Capability structures. The makeup of the capability varies
substantially between v1 and v2:
Version 1 of the PCI Express Capability (defined by PCI Express
1.0 and 1.1 base) neither requires the endpoint to implement the
entire PCIe capability structure nor specifies default values of
registers that are not implemented by the device.
Version 2 of the PCI Express Capability (defined by PCIe 1.1
Capability Structure Expansion ECN, PCIe 2.0, 2.1, and 3.0) added
additional registers to the structure and requires all registers
to be either implemented or hardwired to 0.
Due to the differences in the capability structures, code dealing with
capability features must be careful not to access the additional
registers introduced with v2 unless the device is specifically known to
be a v2 capable device. Otherwise, attempts to access non-existant
registers will occur. This is a subtle issue that is hard to track down
when it occurs (and it has - see commit 864d296cf9).
To try and help mitigate such occurrences, this patch introduces
pci_pcie_cap2() which is similar to pci_pcie_cap() but also checks
that the PCIe capability version is >= 2. pci_pcie_cap2() should be
used for qualifying PCIe capability features introduced after v1.
Suggested by Don Dutile.
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There are a number of redundant pci_is_pcie() checks in various PCI
Express capabilities related routines like the following:
if (!pci_is_pcie(dev))
return false;
pos = pci_pcie_cap(dev);
if (!pos)
return false;
The current pci_is_pcie() implementation is merely:
static inline bool pci_is_pcie(struct pci_dev *dev)
{
return !!pci_pcie_cap(dev);
}
so we can just drop the pci_is_pcie() test in such cases.
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI Express Latency Tolerance Reporting (LTR) feature's
pci_ltr_supported() routine is currently only used within
drivers/pci/pci.c so make it static.
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
DMA transactions are tagged with the source ID of the device making
the request. Occasionally hardware screws this up and uses the
source ID of a different device (often the wrong function number of
a multifunction device). A specific Ricoh multifunction device is
a prime example of this problem and included in this patch.
Given a pci_dev, this function returns the pci_dev to use as the
source ID for DMA. When hardware works correctly, this returns
the input device. For the components of the Ricoh multifunction
device, it returns the pci_dev for function 0.
This will be used by IOMMU drivers for determining the boundaries
of IOMMU groups as multiple devices using the same source ID must
be contained within the same group. This can also be used by
existing streaming DMA paths for the same purpose.
[bhelgaas: fold in pci_dev_get() for !CONFIG_PCI]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use pci_is_pcie() instead of looking at obsolete is_pcie field in
struct pci_dev.
CC: Huang Ying <ying.huang@intel.com>
CC: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_max_busnr() has been commented out for years (since 54c762fe62), and
this patch removes it completely.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull MIPS updates from Ralf Baechle:
"The whole series has been sitting in -next for quite a while with no
complaints. The last change to the series was before the weekend the
removal of an SPI patch which Grant - even though previously acked by
himself - appeared to raise objections. So I removed it until the
situation is clarified. Other than that all the patches have the acks
from their respective maintainers, all MIPS and x86 defconfigs are
building fine and I'm not aware of any problems introduced by this
series.
Among the key features for this patch series is a sizable patchset for
Lantiq which among other things introduces support for Lantiq's
flagship product, the FALCON SOC. It also means that the opensource
developers behind this patchset have overtaken Lantiq's competing
inhouse development team that was working behind closed doors.
Less noteworthy the ath79 patchset which adds support for a few more
chip variants, cleanups and fixes. Finally the usual dose of tweaking
of generic code."
Fix up trivial conflicts in arch/mips/lantiq/xway/gpio_{ebu,stp}.c where
printk spelling fixes clashed with file move and eventual removal of the
printk.
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (81 commits)
MIPS: lantiq: remove orphaned code
MIPS: Remove all -Wall and almost all -Werror usage from arch/mips.
MIPS: lantiq: implement support for FALCON soc
MTD: MIPS: lantiq: verify that the NOR interface is available on falcon soc
MTD: MIPS: lantiq: implement OF support
watchdog: MIPS: lantiq: implement OF support and minor fixes
SERIAL: MIPS: lantiq: implement OF support
GPIO: MIPS: lantiq: convert gpio-stp-xway to OF
GPIO: MIPS: lantiq: convert gpio-mm-lantiq to OF and of_mm_gpio
GPIO: MIPS: lantiq: move gpio-stp and gpio-ebu to the subsystem folder
MIPS: pci: convert lantiq driver to OF
MIPS: lantiq: convert dma to platform driver
MIPS: lantiq: implement support for clkdev api
MIPS: lantiq: drop ltq_gpio_request() and gpio_to_irq()
OF: MIPS: lantiq: implement irq_domain support
OF: MIPS: lantiq: implement OF support
MIPS: lantiq: drop mips_machine support
OF: PCI: const usage needed by MIPS
MIPS: Cavium: Remove smp_reserve_lock.
MIPS: Move cache setup to setup_arch().
...
Pull main drm updates from Dave Airlie:
"This is the main merge window request for the drm.
It's big, but jam packed will lots of features and of course 0
regressions. (okay maybe there'll be one).
Highlights:
- new KMS drivers for server GPU chipsets: ast, mgag200 and cirrus
(qemu only). These drivers use the generic modesetting drivers.
- initial prime/dma-buf support for i915, nouveau, radeon, udl and
exynos
- switcheroo audio support: so GPUs with HDMI can turn off the sound
driver without crashing stuff.
- There are some patches drifting outside drivers/gpu into x86 and
EFI for better handling of multiple video adapters in Apple Macs,
they've got correct acks except one trivial fixup.
- Core:
edid parser has better DMT and reduced blanking support,
crtc properties,
plane properties,
- Drivers:
exynos: add 2D core accel support, prime support, hdmi features
intel: more Haswell support, initial Valleyview support, more
hdmi infoframe fixes, update MAINTAINERS for Daniel, lots of
cleanups and fixes
radeon: more HDMI audio support, improved GPU lockup recovery
support, remove nested mutexes, less memory copying on PCIE, fix
bus master enable race (kexec), improved fence handling
gma500: cleanups, 1080p support, acpi fixes
nouveau: better nva3 memory reclocking, kepler accel (needs
external firmware rip), async buffer moves on nv84+ hw.
I've some more dma-buf patches that rely on the dma-buf merge for vmap
stuff, and I've a few fixes building up, but I'd decided I'd better
get rid of the main pull sooner rather than later, so the audio guys
are also unblocked."
Fix up trivial conflict due to some duplicated changes in
drivers/gpu/drm/i915/intel_ringbuffer.c
* 'drm-core-next' of git://people.freedesktop.org/~airlied/linux: (605 commits)
drm/nouveau/nvd9: Fix GPIO initialisation sequence.
drm/nouveau: Unregister switcheroo client on exit
drm/nouveau: Check dsm on switcheroo unregister
drm/nouveau: fix a minor annoyance in an output string
drm/nouveau: turn a BUG into a WARN
drm/nv50: decode PGRAPH DATA_ERROR = 0x24
drm/nouveau/disp: fix dithering not being enabled on some eDP macbooks
drm/nvd9/copy: initialise copy engine, seems to work like nvc0
drm/nvc0/ttm: use copy engines for async buffer moves
drm/nva3/ttm: use copy engine for async buffer moves
drm/nv98/ttm: add in a (disabled) crypto engine buffer copy method
drm/nv84/ttm: use crypto engine for async buffer copies
drm/nouveau/ttm: untangle code to support accelerated buffer moves
drm/nouveau/fbcon: use fence for sync, rather than notifier
drm/nv98/crypt: non-stub implementation of the engine hooks
drm/nouveau/fifo: turn all fifo modules into engine modules
drm/nv50/graph: remove ability to do interrupt-driven context switching
drm/nv50: remove manual context unload on context destruction
drm/nv50: remove execution engine context saves on suspend
drm/nv50/fifo: use hardware channel kickoff functionality
...
On MIPS we want to call of_irq_map_pci from inside
arch/mips/include/asm/pci.h:extern int pcibios_map_irq(
const struct pci_dev *dev, u8 slot, u8 pin);
For this to work we need to change several functions to const usage.
Signed-off-by: John Crispin <blogic@openwrt.org>
Cc: linux-pci@vger.kernel.org
Cc: devicetree-discuss@lists.ozlabs.org
Cc: linux-mips@linux-mips.org
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Patchwork: https://patchwork.linux-mips.org/patch/3710/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 1cc0c998fd ("ACPI: Fix D3hot v D3cold confusion") introduced a
bug in __acpi_bus_set_power() and changed the behavior of
acpi_pci_set_power_state() in such a way that it generally doesn't work
as expected if PCI_D3hot is passed to it as the second argument.
First off, if ACPI_STATE_D3 (equal to ACPI_STATE_D3_COLD) is passed to
__acpi_bus_set_power() and the explicit_set flag is set for the D3cold
state, the function will try to execute AML method called "_PS4", which
doesn't exist.
Fix this by adding a check to ensure that the name of the AML method
to execute for transitions to ACPI_STATE_D3_COLD is correct in
__acpi_bus_set_power(). Also make sure that the explicit_set flag
for ACPI_STATE_D3_COLD will be set if _PS3 is present and modify
acpi_power_transition() to avoid accessing power resources for
ACPI_STATE_D3_COLD, because they don't exist.
Second, if PCI_D3hot is passed to acpi_pci_set_power_state() as the
target state, the function will request a transition to
ACPI_STATE_D3_HOT instead of ACPI_STATE_D3. However,
ACPI_STATE_D3_HOT is now only marked as supported if the _PR3 AML
method is defined for the given device, which is rare. This causes
problems to happen on systems where devices were successfully put
into ACPI D3 by pci_set_power_state(PCI_D3hot) which doesn't work
now. In particular, some unused graphics adapters are not turned
off as a result.
To fix this issue restore the old behavior of
acpi_pci_set_power_state(), which is to request a transition to
ACPI_STATE_D3 (equal to ACPI_STATE_D3_COLD) if either PCI_D3hot or
PCI_D3cold is passed to it as the argument.
This approach is not ideal, because generally power should not
be removed from devices if PCI_D3hot is the target power state,
but since this behavior is relied on, we have no choice but to
restore it at the moment and spend more time on designing a
better solution in the future.
References: https://bugzilla.kernel.org/show_bug.cgi?id=43228
Reported-by: rocko <rockorequin@hotmail.com>
Reported-by: Cristian Rodríguez <crrodriguez@opensuse.org>
Reported-and-tested-by: Peter <lekensteyn@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h
Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.
In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of these:
drivers/pci/pcie/portdrv_core.c: In function 'pcie_port_device_register':
drivers/pci/pcie/portdrv_core.c:275:16: warning: 'cap_mask' may be used
uninitialized in this function [-Wuninitialized]
drivers/pci/pcie/portdrv_core.c:240:6: note: 'cap_mask' was declared here
In some cases, 'cap_mask' may be not set in pcie_port_platform_notify,
holding a garbage value.
Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJPpvY9AAoJEHm+PkMAQRiGpEoIAJgbu+Y8gITnBK/wh9O6zy3S
5jie5KK4YWdbJsvO58WbNr3CyVIwGIqQ2dUZLiU59aBVLarlGw8xor0MmW+cZwhp
6fBHaf0qDYAV0MZjD+mnnExOiCRyISa2lPmsfu9dAWywh5KGe6/oAP6/qcXIyok3
KZyl3qQf4ENpaZPHwZPXCEkUvtuyHgNiszN+QXEadA3s19Ot4VGe9A3VGw+GNrSm
JqFIq3acQAbKa5BYaqf7TQC02v2FI7//eqt6QHxTqbE6a7LGbTvLfX3HlJ2mnfqa
1R6QHhM4y4OZDHbaMT2raHZ8WuLXzhehJzhP8Co7AHFOKwVKOb5XbcUr2RrukMU=
=HkMd
-----END PGP SIGNATURE-----
Merge tag 'v3.4-rc6' into drm-intel-next
Conflicts:
drivers/gpu/drm/i915/intel_display.c
Ok, this is a fun story of git totally messing things up. There
/shouldn't/ be any conflict in here, because the fixes in -rc6 do only
touch functions that have not been changed in -next.
The offending commits in drm-next are 14415745b2..1fa611065 which
simply move a few functions from intel_display.c to intel_pm.c. The
problem seems to be that git diff gets completely confused:
$ git diff 14415745b2..1fa611065
is a nice mess in intel_display.c, and the diff leaks into totally
unrelated functions, whereas
$git diff --minimal 14415745b2..1fa611065
is exactly what we want.
Unfortunately there seems to be no way to teach similar smarts to the
merge diff and conflict generation code, because with the minimal diff
there really shouldn't be any conflicts. For added hilarity, every
time something in that area changes the + and - lines in the diff move
around like crazy, again resulting in new conflicts. So I fear this
mess will stay with us for a little longer (and might result in
another backmerge down the road).
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pull an ACPI patch from Len Brown:
"It fixes a D3 issue new in 3.4-rc1."
By Lin Ming via Len Brown:
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
ACPI: Fix D3hot v D3cold confusion
Before this patch, ACPI_STATE_D3 incorrectly referenced D3hot
in some places, but D3cold in other places.
After this patch, ACPI_STATE_D3 always means ACPI_STATE_D3_COLD;
and all references to D3hot use ACPI_STATE_D3_HOT.
ACPI's _PR3 method is used to enter both D3hot and D3cold states.
What distinguishes D3hot from D3cold is the presence _PR3
(Power Resources for D3hot) If these resources are all ON,
then the state is D3hot. If _PR3 is not present,
or all _PR0 resources for the devices are OFF,
then the state is D3cold.
This patch applies after Linux-3.4-rc1.
A future syntax cleanup may remove ACPI_STATE_D3
to emphasize that it always means ACPI_STATE_D3_COLD.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Aaron Lu <aaron.lu@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Disable Bus Master bit on the device in pci_device_shutdown() to ensure PCI
devices do not continue to DMA data after shutdown. This can cause memory
corruption in case of a kexec where the current kernel shuts down and
transfers control to a new kernel while a PCI device continues to DMA to
memory that does not belong to it any more in the new kernel.
I have tested this code on two laptops, two workstations and a 16-socket
server. kexec worked correctly on all of them.
Signed-off-by: Khalid Aziz <khalid.aziz@hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
For IvyBridge Mobile platform, a system hang may occur if a FLR (Function
Level Reset) is asserted to internal graphics.
This quirk is a workaround for the IVB FLR errata issue. We are
disabling the FLR reset handshake between the PCH and CPU display, then
manually powering down the panel power sequencing and resetting the PCH
display.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Daniel Vetter writes:
A new drm-intel-next pull. Highlights:
- More gmbus patches from Daniel Kurtz, I think gmbus is now ready, all
known issues fixed.
- Fencing cleanup and pipelined fencing removal from Chris.
- rc6 residency interface from Ben, useful for powertop.
- Cleanups and code reorg around the ringbuffer code (Ben&me).
- Use hw semaphores in the pageflip code from Ben.
- More vlv stuff from Jesse, unfortunately his vlv cpu is doa, so less
merged than I've hoped for - we still have the unused function warning :(
- More hsw patches from Eugeni, again, not yet enabled fully.
- intel_pm.c refactoring from Eugeni.
- Ironlake sprite support from Chris.
- And various smaller improvements/fixes all over the place.
Note that this pull request also contains a backmerge of -rc3 to sort out
a few things in -next. I've also had to frob the shortlog a bit to exclude
anything that -rc3 brings in with this pull.
Regression wise we have a few strange bugs going on, but for all of them
closer inspection revealed that they've been pre-existing, just now
slightly more likely to be hit. And for most of them we have a patch
already. Otherwise QA has not reported any regressions, and I'm also not
aware of anything bad happening in 3.4.
* tag 'drm-intel-next-2012-04-23' of git://people.freedesktop.org/~danvet/drm-intel: (420 commits)
drm/i915: rc6 residency (fix the fix)
drm/i915/tv: fix open-coded ARRAY_SIZE.
drm/i915: invalidate render cache on gen2
drm/i915: Silence the change of LVDS sync polarity
drm/i915: add generic power management initialization
drm/i915: move clock gating functionality into intel_pm module
drm/i915: move emon functionality into intel_pm module
drm/i915: move drps, rps and rc6-related functions to intel_pm
drm/i915: fix line breaks in intel_pm
drm/i915: move watermarks settings into intel_pm module
drm/i915: move fbc-related functionality into intel_pm module
drm/i915: Refactor get_fence() to use the common fence writing routine
drm/i915: Refactor fence clearing to use the common fence writing routine
drm/i915: Refactor put_fence() to use the common fence writing routine
drm/i915: Prepare to consolidate fence writing
drm/i915: Remove the unsightly "optimisation" from flush_fence()
drm/i915: Simplify fence finding
drm/i915: Discard the unused obj->last_fenced_ring
drm/i915: Remove unused ring->setup_seqno
drm/i915: Remove fence pipelining
...
All supported devices have one issue that msi interrupt doesn't assert
if pci command register bit (PCI_COMMAND_INTX_DISABLE) is set.
Add workaround in drivers/pci/quirks.c
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The intent of git commit 6fbf9e7a90
"PCI: Introduce __pci_reset_function_locked to be used when holding
device_lock." was to have a non-locking function that would call
pci_dev_reset function.
But it fell short of that by just probing and not actually reseting
the device. To make that work we need a way to move the lock
around device_lock to not be in pci_dev_reset (as the caller of
__pci_reset_function_locked already holds said lock). We do this by
renaming pci_dev_reset to __pci_dev_reset and bubbling said mutex out
of __pci_dev_reset to pci_dev_reset (a wrapper around __pci_dev_reset).
The __pci_reset_function_locked can now call __pci_dev_reset without
having to worry about the dead-lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A PCIe downstream port is a P2P bridge. Its secondary interface is
a link that should lead only to device 0 (unless ARI is enabled)[1], so
we don't probe for non-zero device numbers.
Some Stratus ftServer systems have a PCIe downstream port (02:00.0) that
leads to both an upstream port (03:00.0) and a downstream port (03:01.0),
and 03:01.0 has important devices below it:
[0000:02]-+-00.0-[03-3c]--+-00.0-[04-09]--...
\-01.0-[0a-0d]--+-[USB]
+-[NIC]
+-...
Previously, we didn't enumerate device 03:01.0, so USB and the network
didn't work. This patch adds a DMI quirk to scan all device numbers,
not just 0, below a downstream port.
Based on a patch by Prarit Bhargava.
[1] PCIe spec r3.0, sec 7.3.1
CC: Myron Stowe <mstowe@redhat.com>
CC: Don Dutile <ddutile@redhat.com>
CC: James Paradis <james.paradis@stratus.com>
CC: Matthew Wilcox <matthew.r.wilcox@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We need a hook to release host bridge resources allocated when creating
root bus.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use that device for pci_root_bus bridge pointer.
Use pci_release_bus_bridge_dev() to release allocated pci_host_bridge in
remove path.
Use root bus bridge pointer to get host bridge pointer instead of searching
host bridge list. That leaves the host bridge list unused, so remove it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_host_bridge() looks like a C++ constructor. Also separate
find_pci_root_bus() out.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move host bridge-related code from probe.c to a new host-bridge.c.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull build fixes for less mainstream architectures from Paul Gortmaker:
"These are fixes for frv(1), blackfin(2), powerpc(1) and xtensa(4).
Fortunately the touches are nearly all specific to files just used by
the arch in question. The two touches to shared/common files
[kernel/irq/debug.h and drivers/pci/Makefile] are trivial to assess as
no risk to anyone.
Half of them relate to xtensa directly. It was only when I fixed the
last xtensa issue that I realized that the arch has been broken for a
significant time, and isn't a specific v3.4 regression. So if you
wanted, we could leave xtensa lying bleeding in the street for a
couple more weeks and queue those for 3.5. But given they are no risk
to anyone outside of xtensa, I figured to just leave them in.
If you are OK with taking the xtensa fixes, then please pull to get:
- one last implicit include uncovered by system.h that is in a file
specific to just one powerpc defconfig. (I'd sync'd with BenH).
- fix an oversight in the PCI makefile where shared code wasn't being
compiled for ARCH=frv
- fix a missing include for GPIO in blackfin framebuffer.
- audit and tag endif in blackfin ezkit board file, in order to find
and fix the misplaced endif masking a block of code.
- fix irq/debug.h choice of temporary macro names to be more internal
so they don't conflict with names used by xtensa.
- fix a reference to an undeclared local var in xtensa's signal.c
- fix an implicit bug.h usage in xtensa's asm/io.h uncovered by my
removing bug.h from kernel.h
- fix xtensa to properly indicate it is using asm-generic/hardirq.h
in order to resolve the link error - undefined ack_bad_irq
The xtensa still fails final link as my latest binutils does something
evil when ld forward-relocates unlikely() blocks, but in theory people
who have older/valid toolchains could now use the thing."
* 'for-v3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
xtensa: fix build fail on undefined ack_bad_irq
blackfin: fix ifdef fustercluck in mach-bf538/boards/ezkit.c
blackfin: fix compile error in bfin-lq035q1-fb.c
pci: frv architecture needs generic setup-bus infrastructure
irq: hide debug macros so they don't collide with others.
xtensa: fix build error in xtensa/include/asm/io.h
xtensa: fix build failure in xtensa/kernel/signal.c
powerpc: fix system.h fallout in sysdev/scom.c [chroma_defconfig]
Otherwise we get this link failure for frv's defconfig:
LD .tmp_vmlinux1
drivers/built-in.o: In function `pci_assign_resource':
(.text+0xbf0c): undefined reference to `pci_cardbus_resource_alignment'
drivers/built-in.o: In function `pci_setup':
pci.c:(.init.text+0x174): undefined reference to `pci_realloc_get_opt'
pci.c:(.init.text+0x1a0): undefined reference to `pci_realloc_get_opt'
make[1]: *** [.tmp_vmlinux1] Error 1
Cc: David Howells <dhowells@redhat.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
The default VGA device is a somewhat fluid concept on platforms with
multiple GPUs. Add support for setting it so switching code can update
things appropriately, and make sure that the sysfs code returns the right
device if it's changed.
v2: Updated to fix builds when __ARCH_HAS_VGA_DEFAULT_DEVICE is false.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: benh@kernel.crashing.org
Cc: airlied@redhat.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some shortcomings introduced into pci_restore_state() by commit
26f41062f2 ("PCI: check for pci bar restore completion and retry")
have been fixed by recent commit ebfc5b802f ("PCI: Fix regression in
pci_restore_state(), v3"), but that commit treats all PCI devices as
those with Type 0 configuration headers.
That is not entirely correct, because Type 1 and Type 2 headers have
different layouts. In particular, the area occupied by BARs in Type 0
config headers contains the secondary status register in Type 1 ones and
it doesn't make sense to retry the restoration of that register even if
the value read back from it after a write is not the same as the written
one (it very well may be different).
For this reason, make pci_restore_state() only retry the restoration
of BARs for Type 0 config headers. This effectively makes it behave
as before commit 26f41062f2 for all header types except for Type 0.
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 26f41062f2 ("PCI: check for pci bar restore completion and
retry") attempted to address problems with PCI BAR restoration on
systems where FLR had not been completed before pci_restore_state() was
called, but it did that in an utterly wrong way.
First off, instead of retrying the writes for the BAR registers only, it
did that for all of the PCI config space of the device, including the
status register (whose value after the write quite obviously need not be
the same as the written one). Second, it added arbitrary delay to
pci_restore_state() even for systems where the PCI config space
restoration was successful at first attempt. Finally, the mdelay(10) it
added to every iteration of the writing loop was way too much of a delay
for any reasonable device.
All of this actually caused resume failures for some devices on Mikko's
system.
To fix the regression, make pci_restore_state() only retry the writes
for BAR registers and only wait if the first read from the register
doesn't return the written value. Additionaly, make it wait for 1 ms,
instead of 10 ms, after every failing attempt to write into config
space.
Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree,
* the other is to fix drivers to load on PV (a previous patch made them only
load in PVonHVM mode).
The rest are just minor fixes in the various drivers and some cleanup in the
core code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJPfyVUAAoJEFjIrFwIi8fJUjUH/jbY5JavRqSlNELZW2A4Ta76
8p00LqLHw/C56iHZcWKke8mqtWNb+ZfcQt7ZYcxDIYa4QWBL28x0OLAO2tOBIt37
ZjYESWSdFJaJvmpADluWtFyGyZ9TYJllDTBm/jWj1ZtKSZvR1YkhuMXCS0f4AmGQ
xFzSWJZUDdiOAqpN+VQD8wP00gfR8knQLg16XE2fvFdQo4XwpCtqLfHV/5pMMGdy
Cs/ep6rq/7cdv/nshKOcBnw7RW8l3Xoi/28ht8k3DvAQ2VtFq1Tugv2G9pcCHwQG
DIBkB3SOU6/v6P5at5+egKS5xR1fJetCWlkMd8kkbcdz2NPI4UDMkvOW6Q8yQls=
=6Ve+
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen fixes from Konrad Rzeszutek Wilk:
"Two fixes for regressions:
* one is a workaround that will be removed in v3.5 with proper fix in
the tip/x86 tree,
* the other is to fix drivers to load on PV (a previous patch made
them only load in PVonHVM mode).
The rest are just minor fixes in the various drivers and some cleanup
in the core code."
* tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
xen/pciback: fix XEN_PCI_OP_enable_msix result
xen/smp: Remove unnecessary call to smp_processor_id()
xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'
xen: only check xen_platform_pci_unplug if hvm
The original XenoLinux code has always had things this way, and for
compatibility reasons (in particular with a subsequent pciback
adjustment) upstream Linux should behave the same way (allowing for two
distinct error indications to be returned by the backend).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
Some other systems using the pata_jmicron driver fail to boot because no
disks are detected. Passing pcie_aspm=force on the kernel command line
works around it.
The cause: commit 4949be1682 ("PCI: ignore pre-1.1 ASPM quirking when
ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
always return 0 if aspm is disabled, in order to avoid cases where we
changed ASPM state on pre-PCIe 1.1 devices.
This skipped the secondary function of pcie_aspm_sanity_check which was
to avoid us enabling ASPM on devices that had non-PCIe children, causing
trouble later on. Move the aspm_disabled check so we continue to honour
that scenario.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
http://bugs.debian.org/665420
Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
[jn: with more symptoms in log message]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ACPI & Power Management changes from Len Brown:
- ACPI 5.0 after-ripples, ACPICA/Linux divergence cleanup
- cpuidle evolving, more ARM use
- thermal sub-system evolving, ditto
- assorted other PM bits
Fix up conflicts in various cpuidle implementations due to ARM cpuidle
cleanups (ARM at91 self-refresh and cpu idle code rewritten into
"standby" in asm conflicting with the consolidation of cpuidle time
keeping), trivial SH include file context conflict and RCU tracing fixes
in generic code.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (77 commits)
ACPI throttling: fix endian bug in acpi_read_throttling_status()
Disable MCP limit exceeded messages from Intel IPS driver
ACPI video: Don't start video device until its associated input device has been allocated
ACPI video: Harden video bus adding.
ACPI: Add support for exposing BGRT data
ACPI: export acpi_kobj
ACPI: Fix logic for removing mappings in 'acpi_unmap'
CPER failed to handle generic error records with multiple sections
ACPI: Clean redundant codes in scan.c
ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
ACPI: consistently use should_use_kmap()
PNPACPI: Fix device ref leaking in acpi_pnp_match
ACPI: Fix use-after-free in acpi_map_lsapic
ACPI: processor_driver: add missing kfree
ACPI, APEI: Fix incorrect APEI register bit width check and usage
Update documentation for parameter *notrigger* in einj.txt
ACPI, APEI, EINJ, new parameter to control trigger action
ACPI, APEI, EINJ, limit the range of einj_param
ACPI, APEI, Fix ERST header length check
cpuidle: power_usage should be declared signed integer
...
acpi_dev_run_wake() is a generic function which can be used by
other subsystem too. Rename it to acpi_pm_device_run_wake, to be
consistent with acpi_pm_device_sleep_wake.
Then move it to ACPI core.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Pull PCI changes (including maintainer change) from Jesse Barnes:
"This pull has some good cleanups from Bjorn and Yinghai, as well as
some more code from Yinghai to better handle resource re-allocation
when enabled.
There's also a new initcall_debug feature from Arjan which will print
out quirk timing information to help identify slow quirks for fixing
or refinement (Yinghai sent in a few patches to do just that once the
new debug code landed).
Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas.
He's been a core PCI and Linux contributor for some time now, and has
kindly volunteered to take over. I just don't feel I have the time
for PCI review and work that it deserves lately (I've taken on some
other projects), and haven't been as responsive lately as I'd like, so
I approached Bjorn asking if he'd like to manage things. He's going
to give it a try, and I'm confident he'll do at least as well as I
have in keeping the tree managed, patches flowing, and keeping things
stable."
Fix up some fairly trivial conflicts due to other cleanups (mips device
resource fixup cleanups clashing with list handling cleanup, ppc iseries
removal clashing with pci_probe_only cleanup etc)
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits)
PCI: Bjorn gets PCI hotplug too
PCI: hand PCI maintenance over to Bjorn Helgaas
unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h
sparc/PCI: convert devtree and arch-probed bus addresses to resource
powerpc/PCI: allow reallocation on PA Semi
powerpc/PCI: convert devtree bus addresses to resource
powerpc/PCI: compute I/O space bus-to-resource offset consistently
arm/PCI: don't export pci_flags
PCI: fix bridge I/O window bus-to-resource conversion
x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()'
PCI / PCIe: Introduce command line option to disable ARI
PCI: make acpihp use __pci_remove_bus_device instead
PCI: export __pci_remove_bus_device
PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
PCI: print out PCI device info along with duration
PCI: Move "pci reassigndev resource alignment" out of quirks.c
PCI: Use class for quirk for usb host controller fixup
PCI: Use class for quirk for ti816x class fixup
PCI: Use class for quirk for intel e100 interrupt fixup
...
- PV multiconsole support, so that there can be hvc1, hvc2, etc;
- P-state and C-state power management driver that uploads said
power management data to the hypervisor. It also inhibits cpufreq
scaling drivers to load so that only the hypervisor can make power
management decisions - fixing a weird perf bug.
- Function Level Reset (FLR) support in the Xen PCI backend.
Fixes:
- Kconfig dependencies for Xen PV keyboard and video
- Compile warnings and constify fixes
- Change over to use percpu_xxx instead of this_cpu_xxx
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJPZ0qkAAoJEFjIrFwIi8fJjCgH/jeJ39E8ML8DP9tCS2HQnMqM
uTEjLcqvoJ7sEhHvtBLPeG2p0jyBvOWjLbSc7P8nESBAMPvSYol8L6WqfWrdSU4r
lHrma2sg9UYzRog5NyxAgkp7bBsBBFOnhVL3Cxb5Ig78cPWzeSWGpqGZ8M/d51Wf
1iE0tHuU4DpN+fg1SZqPqEm8ecEJ/eSrVTnyTx/Qo2Ak+Zw98SqzX7SV5lo8mudd
WFL1F2K9FyTNk79ndGhqFt36x6nEbFgMLbmCDWumLuWN6bMd1Uq0wNkCqW4F1h28
3yqnY+rfQh4y3eXK1B9nttCUTs+/66U5ZWrT6B1IJumGTAIqcWfgeUX/Vn/HVC4=
=tfMc
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen updates from Konrad Rzeszutek Wilk:
"which has three neat features:
- PV multiconsole support, so that there can be hvc1, hvc2, etc; This
can be used in HVM and in PV mode.
- P-state and C-state power management driver that uploads said power
management data to the hypervisor. It also inhibits cpufreq
scaling drivers to load so that only the hypervisor can make power
management decisions - fixing a weird perf bug.
There is one thing in the Kconfig that you won't like: "default y
if (X86_ACPI_CPUFREQ = y || X86_POWERNOW_K8 = y)" (note, that it
all depends on CONFIG_XEN which depends on CONFIG_PARAVIRT which by
default is off). I've a fix to convert that boolean expression
into "default m" which I am going to post after the cpufreq git
pull - as the two patches to make this work depend on a fix in Dave
Jones's tree.
- Function Level Reset (FLR) support in the Xen PCI backend.
Fixes:
- Kconfig dependencies for Xen PV keyboard and video
- Compile warnings and constify fixes
- Change over to use percpu_xxx instead of this_cpu_xxx"
Fix up trivial conflicts in drivers/tty/hvc/hvc_xen.c due to changes to
a removed commit.
* tag 'stable/for-linus-3.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND deps
xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.
xen: constify all instances of "struct attribute_group"
xen/xenbus: ignore console/0
hvc_xen: introduce HVC_XEN_FRONTEND
hvc_xen: implement multiconsole support
hvc_xen: support PV on HVM consoles
xenbus: don't free other end details too early
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.
xen/setup/pm/acpi: Remove the call to boot_option_idle_override.
xenbus: address compiler warnings
xen: use this_cpu_xxx replace percpu_xxx funcs
xen/pciback: Support pci_reset_function, aka FLR or D3 support.
pci: Introduce __pci_reset_function_locked to be used when holding device_lock.
xen: Utilize the restore_msi_irqs hook.
Pull networking merge from David Miller:
"1) Move ixgbe driver over to purely page based buffering on receive.
From Alexander Duyck.
2) Add receive packet steering support to e1000e, from Bruce Allan.
3) Convert TCP MD5 support over to RCU, from Eric Dumazet.
4) Reduce cpu usage in handling out-of-order TCP packets on modern
systems, also from Eric Dumazet.
5) Support the IP{,V6}_UNICAST_IF socket options, making the wine
folks happy, from Erich Hoover.
6) Support VLAN trunking from guests in hyperv driver, from Haiyang
Zhang.
7) Support byte-queue-limtis in r8169, from Igor Maravic.
8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but
was never properly implemented, Jiri Benc fixed that.
9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang.
10) Support kernel side dump filtering by ctmark in netfilter
ctnetlink, from Pablo Neira Ayuso.
11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker.
12) Add new peek socket options to assist with socket migration, from
Pavel Emelyanov.
13) Add sch_plug packet scheduler whose queue is controlled by
userland daemons using explicit freeze and release commands. From
Shriram Rajagopalan.
14) Fix FCOE checksum offload handling on transmit, from Yi Zou."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits)
Fix pppol2tp getsockname()
Remove printk from rds_sendmsg
ipv6: fix incorrent ipv6 ipsec packet fragment
cpsw: Hook up default ndo_change_mtu.
net: qmi_wwan: fix build error due to cdc-wdm dependecy
netdev: driver: ethernet: Add TI CPSW driver
netdev: driver: ethernet: add cpsw address lookup engine support
phy: add am79c874 PHY support
mlx4_core: fix race on comm channel
bonding: send igmp report for its master
fs_enet: Add MPC5125 FEC support and PHY interface selection
net: bpf_jit: fix BPF_S_LDX_B_MSH compilation
net: update the usage of CHECKSUM_UNNECESSARY
fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx
net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso
ixgbe: Fix issues with SR-IOV loopback when flow control is disabled
net/hyperv: Fix the code handling tx busy
ixgbe: fix namespace issues when FCoE/DCB is not enabled
rtlwifi: Remove unused ETH_ADDR_LEN defines
igbvf: Use ETH_ALEN
...
Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and
drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
Here's the big driver core merge for 3.4-rc1.
Lots of various things here, sysfs fixes/tweaks (with the nlink breakage
reverted), dynamic debugging updates, w1 drivers, hyperv driver updates,
and a variety of other bits and pieces, full information in the
shortlog.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk9neCsACgkQMUfUDdst+ylyQwCfY2eizvzw5HhjQs8gOiBRDADe
yrgAnj1Zan2QkoCnQIFJNAoxqNX9yAhd
=biH6
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches for 3.4-rc1 from Greg KH:
"Here's the big driver core merge for 3.4-rc1.
Lots of various things here, sysfs fixes/tweaks (with the nlink
breakage reverted), dynamic debugging updates, w1 drivers, hyperv
driver updates, and a variety of other bits and pieces, full
information in the shortlog."
* tag 'driver-core-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (78 commits)
Tools: hv: Support enumeration from all the pools
Tools: hv: Fully support the new KVP verbs in the user level daemon
Drivers: hv: Support the newly introduced KVP messages in the driver
Drivers: hv: Add new message types to enhance KVP
regulator: Support driver probe deferral
Revert "sysfs: Kill nlink counting."
uevent: send events in correct order according to seqnum (v3)
driver core: minor comment formatting cleanups
driver core: move the deferred probe pointer into the private area
drivercore: Add driver probe deferral mechanism
DS2781 Maxim Stand-Alone Fuel Gauge battery and w1 slave drivers
w1_bq27000: Only one thread can access the bq27000 at a time.
w1_bq27000 - remove w1_bq27000_write
w1_bq27000: remove unnecessary NULL test.
sysfs: Fix memory leak in sysfs_sd_setsecdata().
intel_idle: Revert change of auto_demotion_disable_flags for Nehalem
w1: Fix w1_bq27000
driver-core: documentation: fix up Greg's email address
powernow-k6: Really enable auto-loading
powernow-k7: Fix CPU family number
...
In 5bfa14ed9f, I forgot to initialize res2.flags before calling
pcibios_bus_to_resource(), which depends on the resource type to locate the
correct aperture. This bug won't hurt x86, which currently never has an
offset between bus and CPU addresses, but will affect other architectures.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Right now we won't touch ASPM state if ASPM is disabled, except in the case
where we find a device that appears to be too old to reliably support ASPM.
Right now we'll clear it in that case, which is almost certainly the wrong
thing to do. The easiest way around this is just to disable the blacklisting
when ASPM is disabled.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch recodes the MRRS cap for 5719 A0 devices as a PCI quirk.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are PCIe devices on the market that report ARI support but
then fail to initialize correctly when ARI is actually used. This
leads to situations in which kernels 2.6.34 and newer fail to handle
systems where the previous kernels worked without any apparent
problems. Unfortunately, it is currently unknown how many such
devices are there.
For this reason, introduce a new kernel command line option,
pci=noari, allowing users to disable PCIe ARI altogether if they
see problems with PCIe device initialization.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_stop_bus_device gets called before in the same loop.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Don't switch to pci_remove_bus_device yet, keep the __ prefix for now
(the behavior is still the same: remove without stopping first).
This allows other out of tree users or pending patches to get notified
from compiler warning.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The old pci_remove_behind_bridge actually do stop and remove.
Make the name reflect that to reduce confusion.
Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The old pci_remove_bus_device actually did stop and remove.
Make the name reflect that to reduce confusion.
This patch is done by sed scripts and changes back some incorrect
__pci_remove_bus_device changes.
Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Makes it a little easier to figure out which device may have caused a
slow quirk.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This isn't really a quirk; calling it directly from pci_add_device makes
more sense.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Recently added support to allow quirks to report duration also make the
boot log very crowded when initcall_debug is specified.
One thing we can to do mitigate this is to not call quirks unnecessarily
by adding a new quirk declaration macro that takes a class argument.
The new macro takes a class value and a class shift value (since it can
vary) so that quirks will be limited to certain device classes, greatly
reducing the number we call on every PCI device addition.
-v2: fix v1 that left over of sparated patch.
-v3: according to Jesse, change cls to class, cls_shift, to class_shift.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add a new config option, PCI_REALLOC_ENABLE_AUTO, which will
automatically try to re-allocate PCI resources if PCI_IOV support is
enabled and the SR-IOV resources are unassigned. Behavior can still be
controlled using the pci=realloc= parameter.
-v2: According to Jesse, adding one CONFIG option for distribution to
disable it or enable it.
-v3: update Kconfig text (jbarnes)
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
let user know they could try if pci=realloc could help.
-v2: update suggestion text.
Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Let the user could enable and disable with pci=realloc=on or pci=realloc=off
Also
1. move variable and functions near the place they are used.
2. change macro to function
3. change related functions and variable to static and _init
4. update parameter description accordingly.
This will let us add a config option to control default behavior, and
still allow the user to turn off automatic reallocation if it fails on
their platform until a permanent solution is found.
-v2: still honor pci=realloc, and treat it as pci=realloc=on
also use enum instead of ...
-v3: update kernel-paramenters.txt according to Jesse.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When enabling pci reallocation for a pci bridge, we clear the small size
in in bridge and re-assign with requested + optional size for first
several tries, but Ram mention could have problem with one case:
https://bugzilla.kernel.org/show_bug.cgi?id=15960
After checking the booting log in
https://lkml.org/lkml/2010/4/19/44
[regression, bisected] Xonar DX invalid PCI I/O range since 977d17bb17
We should not stop too early for io ports.
Apr 19 10:19:38 [kernel] pci 0000:04:00.0: BAR 7: can't assign io (size 0x4000)
Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 8: assigned [mem 0x80400000-0x805fffff]
Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 7: can't assign io (size 0x2000)
Apr 19 10:19:38 [kernel] pci 0000:05:02.0: BAR 7: can't assign io (size 0x1000)
Apr 19 10:19:38 [kernel] pci 0000:05:03.0: BAR 7: can't assign io (size 0x1000)
Apr 19 10:19:38 [kernel] pci 0000:08:00.0: BAR 7: can't assign io (size 0x1000)
Apr 19 10:19:38 [kernel] pci 0000:09:04.0: BAR 0: can't assign io (size 0x100)
and clear 00:1c.0 to retry again.
This patch removes IORESOUCE_IO checking, and tries one more time. It
gives us a chance to get an allocation for the 00:1c.0 io port range
because the range from 0x4000 to 0x8000 will be freed and we can use it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Everybody uses the generic pcibios_resource_to_bus() supplied by the core
now, so remove the ARCH_HAS_GENERIC_PCI_OFFSETS used during conversion.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This replaces the generic versions of pcibios_resource_to_bus() and
pcibios_bus_to_resource() in asm-generic/pci.h with versions that use
pci_resource_to_bus() and pci_bus_to_resource().
The replacements are equivalent except that they can apply host
bridge window offsets when the arch has supplied them by using
pci_add_resource_offset().
Each arch can convert to using pci_add_resource_offset() individually by
removing its device resource fixups from pcibios_fixup_bus() and supplying
ARCH_HAS_GENERIC_PCI_OFFSETS. ARCH_HAS_GENERIC_PCI_OFFSETS can be removed
after all have converted.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some PCI host bridges translate CPU addresses to PCI bus addresses.
Previously, we initialized pci_dev resources with PCI bus addresses,
then converted them to CPU addresses later in arch-specific code
(pcibios_fixup_resources()), which leaves a window of time where the
pci_dev resources are incorrect.
This patch adds support in the core for this address translation.
When the arch creates the root bus, it can supply the host bridge
address translation information, and the core can use it to set the
pci_dev resources correctly from the beginning.
This gives us a way to fix the problem that quirks that run between device
discovery and pcibios_fixup_resources() fail because they use pci_dev
resources that haven't been converted. The reference below is to one
such problem that affected ARM and ia64.
Note that this patch has no effect until an arch starts using
pci_add_resource_offset() with a non-zero offset: before that, all
all host bridge windows have a zero offset and pci_bus_to_resource()
copies the pci_bus_region directly to the struct resource.
Reference: https://lkml.org/lkml/2009/10/12/405
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some PCI host bridges apply an address offset, so bus addresses on PCI are
different from CPU addresses. This patch adds a way for architectures to
tell the PCI core about this offset. For example:
LIST_HEAD(resources);
pci_add_resource_offset(&resources, host->io_space, host->io_offset);
pci_add_resource_offset(&resources, host->mem_space, host->mem_offset);
pci_scan_root_bus(parent, bus, ops, sysdata, &resources);
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This adds a list of all PCI host bridges we find and a way to look up
the host bridge from a pci_dev.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When pci_create_root_bus() adds the new struct pci_bus to the global
pci_root_buses list, the bus becomes visible to other parts of the
kernel, so it should be fully initialized.
This patch delays adding the bus to the pci_root_buses list until after
all the struct pci_bus initialization is finished.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If we move resource assignment functions into the core, we'll still
need a way for architectures to prevent reassignment, e.g., the
"pci_probe_only" functionality, and we'll need a generic, always
available way the core can test for that. The "pci_flags"
arrangement used by several architectures seems like a convenient
way to do this.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a parameter to avoid using MSI/MSI-X for PCIe native hotplug; it's
known to be buggy on some platforms.
In my environment, while shutting down, following stack trace is shown
sometimes.
irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 1081, comm: reboot Not tainted 3.2.0 #1
Call Trace:
<IRQ> [<ffffffff810cec1d>] __report_bad_irq+0x3d/0xe0
[<ffffffff810cee1c>] note_interrupt+0x15c/0x210
[<ffffffff810cc485>] handle_irq_event_percpu+0xb5/0x210
[<ffffffff810cc621>] handle_irq_event+0x41/0x70
[<ffffffff810cf675>] handle_fasteoi_irq+0x55/0xc0
[<ffffffff81015356>] handle_irq+0x46/0xb0
[<ffffffff814fbe9d>] do_IRQ+0x5d/0xe0
[<ffffffff814f146e>] common_interrupt+0x6e/0x6e
[<ffffffff8106b040>] ? __do_softirq+0x60/0x210
[<ffffffff8108aeb1>] ? hrtimer_interrupt+0x151/0x240
[<ffffffff814fb5ec>] call_softirq+0x1c/0x30
[<ffffffff810152d5>] do_softirq+0x65/0xa0
[<ffffffff8106ae9d>] irq_exit+0xbd/0xe0
[<ffffffff814fbf8e>] smp_apic_timer_interrupt+0x6e/0x99
[<ffffffff814f9e5e>] apic_timer_interrupt+0x6e/0x80
<EOI> [<ffffffff814f0fb1>] ? _raw_spin_unlock_irqrestore+0x11/0x20
[<ffffffff812629fc>] pci_bus_write_config_word+0x6c/0x80
[<ffffffff81266fc2>] pci_intx+0x52/0xa0
[<ffffffff8127de3d>] pci_intx_for_msi+0x1d/0x30
[<ffffffff8127e4fb>] pci_msi_shutdown+0x7b/0x110
[<ffffffff81269d34>] pci_device_shutdown+0x34/0x50
[<ffffffff81326c4f>] device_shutdown+0x2f/0x140
[<ffffffff8107b981>] kernel_restart_prepare+0x31/0x40
[<ffffffff8107b9e6>] kernel_restart+0x16/0x60
[<ffffffff8107bbfd>] sys_reboot+0x1ad/0x220
[<ffffffff814f4b90>] ? do_page_fault+0x1e0/0x460
[<ffffffff811942d0>] ? __sync_filesystem+0x90/0x90
[<ffffffff8105c9aa>] ? __cond_resched+0x2a/0x40
[<ffffffff814ef090>] ? _cond_resched+0x30/0x40
[<ffffffff81169e17>] ? iterate_supers+0xb7/0xd0
[<ffffffff814f9382>] system_call_fastpath+0x16/0x1b
handlers:
[<ffffffff8138a0f0>] usb_hcd_irq
[<ffffffff8138a0f0>] usb_hcd_irq
[<ffffffff8138a0f0>] usb_hcd_irq
Disabling IRQ #16
An un-wanted interrupt is generated when PCI driver switches from
MSI/MSI-X to INTx while shutting down the device. The interrupt does
not happen if MSI/MSI-X is not used on the device.
I confirmed that this problem does not happen if pcie_hp=nomsi was
specified and hotplug operation worked fine as usual.
v2: Automatically disable MSI/MSI-X against following device:
PCI bridge: Integrated Device Technology, Inc. Device 807f (rev 02)
v3: Based on the review comment, combile the if statements.
v4: Removed module parameter.
Move some code to build pciehp as a module.
Move device specific code to driver/pci/quirks.c.
v5: Drop a device specific code until getting a vendor statement.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Only one user in driver/pci/pci.c, so we don't need to put it in global
pci.h
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Found debug print of class is shifted.
| pci 0000:f8:15.2: [8086:2b56] type 0 class 0x000600
Code is trying to print class with 6 digits, but use shifted class with
4 digits valid value as variable.
Change to original dev->class directly.
Also remove not needed calculating of local variable class, because it
will be updated after pci_fixup_device(pci_fixup_early...)
Also unify type print out when class and header is not matched.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Otherwise when rescan is used for cardbus, assigned resources will get
cleared.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We should not set the requested size to -2; that will confuse the
resource list sorting with align when SIZEALIGN is used.
Change to STARTALIGN and pass align from start; we are safe to do that
just as we do that regular pci bridge. In the long run, we should just
treat cardbus like a regular pci bridge.
Also fix the case when realloc_head is not passed: we should keep the
requested size.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some BIOSes enable prefetch on both MEM0 and MEM1. But the cardbus code
assumes MEM1 is non-pref...
Discussion could be found at:
https://lkml.org/lkml/2012/1/12/1https://bugzilla.kernel.org/show_bug.cgi?id=41622#c23
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
One regression fix for SR-IOV on PPC and a couple of misc fixes from
Yinghai.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI: Fix pci cardbus removal
PCI: set pci sriov page size before reading SRIOV BAR
PCI: workaround hard-wired bus number V2
If a PCI device is enabled to generate wakeup signals (PME) when put
into a low-power state by runtime PM, it will be still enabled to
generate those signals after the system shutdown, unless its driver's
.shutdown() callback takes care of the wakeup signals generation
setting. Moreover, there are devices that are not enabled to wake
up the system and that are configured by runtime PM to generate
wakeup signals so that (runtime) remote wakeup works with them.
Those devices should be reconfigured during system shutdown so that
they don't generate wakeup signals, but at least some drivers don't
do that. However, that very well may be done by the PCI core so
that drivers don't have to worry about it. For this reason, modify
pci_device_shutdown() to disable the generation of wakeup events for
devices not supposed to wake up the system.
References: https://bugzilla.kernel.org/show_bug.cgi?id=37952
Reported-and-tested-by: Kamil Iskra <kamil.54002@iskra.name>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
sysfs is a bit stricter now and emits warnings in more cases.
For SRIOV hotplug, we are calling pci_stop_dev() for each VF first
(after we update pci_stop_bus_devices) which remove each VF subdir. So
double check the VF dir in /sys before trying to remove the physfn link.
Signed-of-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Distributions may wish to provide different defaults for PCIE ASPM
depending on their target audience. Provide a configuration option for
choosing the default policy.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
by the xen-pci[front|back] to conform to the one used in majority of
PCI drivers; Two fixes to make the code more resilient to invalid
configurations.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJPOeReAAoJEFjIrFwIi8fJn9QIANP48kzrGg0uO4bjSf2h/z7G
pp3ISdtVLk7pwMov2POBqskoXSq8E0yQAfNN8se183wqNXo3Dm4rU1DIG7HQFBk9
sdcyfHI8x7pat9JClRhGxpQ23Ig9f1iWkShweCcZCO782vfxZyNd65i6t87X7uLq
7SPtG1XH2RixTX7tHtKKBqdzZ0OMXOEkJ33dgCmyrn+wzohbKrFj5mg+NdOgmzEo
VgsHPVtuq7orDROe+F9d91eAg0TILQ13th8xfWZ59lQATXu/zAlaueYt87tpy1pb
oVQvumsn8Xev+7hct9My9Tw45D4m8YOSFLG2HcekkW2WtNmGhTTbIyMh9PsLugk=
=NDYK
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Two fixes for VCPU offlining; One to fix the string format exposed
by the xen-pci[front|back] to conform to the one used in majority of
PCI drivers; Two fixes to make the code more resilient to invalid
configurations.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus_dev: add missing error check to watch handling
xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.
xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.
xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
Some BIOS implementations leave the Intel GPU interrupts enabled,
even though no one is handling them (f.e. i915 driver is never loaded).
Additionally the interrupt destination is not set up properly
and the interrupt ends up -somewhere-.
These spurious interrupts are "sticky" and the kernel disables
the (shared) interrupt line after 100.000+ generated interrupts.
Fix it by disabling the still enabled interrupts.
This resolves crashes often seen on monitor unplug.
Tested on the following boards:
- Intel DH61CR: Affected
- Intel DH67BL: Affected
- Intel S1200KP server board: Affected
- Asus P8H61-M LE: Affected, but system does not crash.
Probably the IRQ ends up somewhere unnoticed.
According to reports on the net, the Intel DH61WW board is also affected.
Many thanks to Jesse Barnes from Intel for helping
with the register configuration and to Intel in general
for providing public hardware documentation.
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Charlie Suffin <charlie.suffin@stratus.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
While diagnosing some boot time issues on a platform, all that I
could see in the bootgraph/dmesg was that the system was spending
a lot of time in applying one or more PCI quirks... which
was virtually undebuggable.
This patch adds printk's in "initcall_debug" style to the dmesg,
which are added when the user asks for the initcall_debug
(the nr one tool to use when debugging boot hangs or boot time issues)
kernel command line option.
v2: add #includes so quirks can build on non-x86
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix debug variable from module parameter to be really bool to
fix 'warning: return from incompatible pointer type'.
Acked-by: Scott Murray <scott@spiteful.org>
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On some OEM systems, pci_restore_state() is called while FLR has not yet
completed. As a result, PCI BAR register restore is not successful. This fix
reads back the restored value and compares it with saved value and re-tries 10
times before giving up.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On a system with a repeater on the system board to support gen2 hotplug,
we found that when an ExpressModule is removed from some slots,
/var/log/messages will be full of "card present/not present" warnings.
It turns out the root complex is continually trying to train the link to
the repeater because the repeater has not been reset.
This patch will disable the link at removal time to allow the repeater
to be reset properly. This also prevents a potential AER message at
removal time.
Also, when testing hotplug on a system under development, we found if we
boot the system without an EM installed, and later hot-add an EM, it
does not work with Linux, but another OS is ok. The root cause is that
BIOS left link disabled when slot was empty at boot time, and other OS
is modifying the link disable bit in link ctrl during power on/off.
So we should do the same thing to disable/enable link during power off/on.
-v2: check link DLLA bit instead of 100ms waiting.
Separate link disable/enable functions to another patch.
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A few changes:
- remove the 'inline' and let the complier decide
- return a bool to indicate whether the link was active
- add a debug message to indicate link state when it beocmes active
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During reviewing
| PCI: pciehp: wait 1000 ms before Link Training check
Linus said:
>...
> That's a *long* time, and it's irritating to the user. It makes the
> user think "the machine is slow".
>...
> And quite frankly, an unconditional one-second delay here seems bad.
>Two seconds was unacceptable, one second is just bad.
Try to access the pci conf of a pci device that is supposed to show up
in 1s. If we can read back a valid vendor/device id, we can return
early.
Related discussion could be found:
https://lkml.org/lkml/2011/12/6/339
-v2: seperate code to pci_bus_read_dev_vendor_id() from pci_scan_device()
and reuse it from pciehp code. Suggested by Matthew Wilcox.
-v3: According to Kenj, don't use array in stack, and don't wait too long
for crs, also return fail status if not found.
Also separate pci_bus_dev_read_vendor_id() change to another patch.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We can reuse it for pciehp probing.
-v2: according to Kenji, fix crs timeout checking, and export the function
for later use when pciehp is compiled as a module.
Suggested-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When hot removing a pci express module that has a pcie switch and supports
SRIOV, we got:
[ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
[ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
[ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
[ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
[ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
[ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
[ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain🚌device=0000:b0:00
[ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
[ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain🚌dev = 0000:b0:00
[ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
[ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
[ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
[ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
[ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
[ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
[ 5926.133377] PGD 0
[ 5926.135402] Oops: 0000 [#1] SMP
[ 5926.138659] CPU 2
[ 5926.140499] Modules linked in:
...
[ 5926.143754]
[ 5926.275823] Call Trace:
[ 5926.278267] [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
[ 5926.284180] [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
[ 5926.290264] [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
[ 5926.296866] [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
[ 5926.303123] [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
[ 5926.309206] [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
...
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]--
| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
| | | +-00.1 Device
| | | +-00.2 Device
| | | \-00.3 Device
| | \-03.0-[b3]--+-00.0 Device
| | +-00.1 Device
| | +-00.2 Device
| | \-00.3 Device
root complex: 80:02.2
pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
end devices are b2:00.0 and b3.00.0.
VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3
Root cause: when doing pci_stop_bus_device() with phys fn, it will stop
virt fn and remove the fn, so
list_for_each_safe(l, n, &bus->devices)
will have problem to refer freed n that is pointed to vf entry.
Solution is just replacing list_for_each_safe() with
list_for_each_prev_safe(). This will make sure we can get valid n pointer
to PF instead of the freed VF pointer (because newly added devices are
inserted to the bus->devices list tail).
During reviewing the patch, Bjorn said:
| The PCI hot-remove path calls pci_stop_bus_devices() via
| pci_remove_bus_device().
|
| pci_stop_bus_devices() traverses the bus->devices list (point A below),
| stopping each device in turn, which calls the driver remove() method. When
| the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
| also uses pci_remove_bus_device() to remove the VF devices from the
| bus->devices list (point B).
|
| pci_remove_bus_device
| pci_stop_bus_device
| pci_stop_bus_devices(subordinate)
| list_for_each(bus->devices) <-- A
| pci_stop_bus_device(PF)
| ...
| driver->remove
| pci_disable_sriov
| ...
| pci_remove_bus_device(VF)
| <remove from bus_list> <-- B
|
| At B, we're changing the same list we're iterating through at A, so when
| the driver remove() method returns, the pci_stop_bus_devices() iterator has
| a pointer to a list entry that has already been freed.
Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141https://lkml.org/lkml/2012/1/23/360
-v5: According to Linus to make remove more robust, Change to
list_for_each_prev_safe instead. That is more reasonable, because
those devices are added to tail of the list before.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After merging struct pci_dev_resource_x and pci_dev_resource,
We can use a function instead of macro now.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Linus says don't use dev_res_x because it doesn't communicate anything
about usage. Rename them to add_res or fail_res etc according to
context.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_dev_resource_x is a superset of pci_dev_resource and they're just
temp structs used during resource reallocation.
pci_dev_resource usage is quite limted.
So just use pci_dev_resource_x, and rename it as new pci_dev_resource.
-v2: According to Linus, Separate free_list change to another patch
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
So we can use helper functions for generic list. This makes the
resource re-allocation code much more readable.
-v2: Use list_add_tail instead of adding list_insert_before, Pointed out
by Linus.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No user outside of setup-bus.c now. Later patches will convert
resource_list to a regular list.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This allows us to move the definition of struct resource_list to
setup_bus.c and later convert resource_list to a regular list.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On a system with devices that support SRIOV connected to a pcie switch
to pcie root port:
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a
| | \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a
| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a
| | \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a
When the BIOS does not assign resources for SRIOV BARs, kernel pci
reallocation only goes up one bridge and then gives up, failing to to
get resources for all sSRIOV BARs, even though the range is large enough
in the peer root bus.
Specifically, only the bridge at the a1:02.0 level has its resources
cleared and reallocated. The kernel does not go up to clear the bridge
at the 80:02.0 level.
To make it go to upper levels, during retry, we need to treat "good to have"
resources as "must have".
Only on the last try will we treat good to have resources as optional.
At that time, parent bridge resources will already have been released so
we'll have a chance to get everything assigned with must_have plus
good_to_have for all child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This allows us to allocate resources to hotplug bridges during
remove/rescan.
We need to move the function to setup-bus.c so it can use
__pci_bus_size_bridges and __pci_bus_assign_resources directly to take
the add_list resource tracking list.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current rescan will not touch bridge MMIO and IO.
Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.
Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We need add size for hot plug path when pluging in hotplug chassis
without cards.
-v2: change descriptions. make it applicable after "pci: Check bridge
resources after resource allocation."
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Need to call it from __assign_resources_sorted() later and we'd like to
avoid a forward declaraion.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Will be used for resource_list_x duplication when trying
requested+optional at first.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During debug of one SRIOV enabled hotplug device, we found found that
add_size is not passed properly.
The device has devices under two level bridges:
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device
| | \-03.0-[a3]--+-00.0 Oracle Corporation Device
Which means later the parent bridge will not try to add a big enough range:
[ 557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
[ 557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
[ 557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
[ 557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
[ 557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
[ 557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
[ 557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
[ 557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
[ 557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
[ 557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]
It turns out we did not calculate size1 properly.
static resource_size_t calculate_memsize(resource_size_t size,
resource_size_t min_size,
resource_size_t size1,
resource_size_t old_size,
resource_size_t align)
{
if (size < min_size)
size = min_size;
if (old_size == 1 )
old_size = 0;
if (size < old_size)
size = old_size;
size = ALIGN(size + size1, align);
return size;
}
We should not pass add_size with min_size in calculate_memsize since
that will make add_size not contribute final add_size.
So just pass add_size with size1 to calculate_memsize().
With this change, we should have chance to remove extra addon in
pci_reassign_resource.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.
The call chain when a user does "bind" looks as so:
echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
and ends up calling:
driver_bind:
device_lock(dev); <=== TAKES LOCK
XXXX_probe:
.. pci_enable_device()
...__pci_reset_function(), which calls
pci_dev_reset(dev, 0):
if (!0) {
device_lock(dev) <==== DEADLOCK
The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add missing iounmap in error handling code, in a case where the function
already preforms iounmap on some other execution path.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e;
statement S,S1;
int ret;
@@
e = \(ioremap\|ioremap_nocache\)(...)
... when != iounmap(e)
if (<+...e...+>) S
... when any
when != iounmap(e)
*if (...)
{ ... when != iounmap(e)
return ...; }
... when any
iounmap(e);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Boot up a KVM guest, and hotplug multifunction
devices(func1,func2,func0,func3) to guest.
for i in 1 2 0 3;do
qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
(qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
(qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done
In linux kernel, when func0 of the slot is hot-added, the whole
slot will be marked as 'enabled', then driver will ignore other new
hotadded funcs.
But in Win7 & WinXP, we can continaully add other funcs after adding
func0, all funcs will be added in guest.
drivers/pci/hotplug/acpiphp_glue.c:
static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
....
for (slot = bridge->slots; slot; slot = slot->next) {
if (slot->flags & SLOT_ENABLED) {
acpiphp_disable_slot()
else
acpiphp_enable_slot()
.... |
} v
enable_device()
|
v
//only don't enable slot if func0 is not added
list_for_each_entry(func, &slot->funcs, sibling) {
...
}
slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'
This patch just make pci driver can continaully add funcs after adding
func 0. Only mark slot to 'enabled' when all funcs are added.
For pci multifunction hotplug, we can add functions one by one(func 0 is
necessary), and all functions will be removed in one time.
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch converts the underlying maintenance aspects of FW-assigned
BIOS BAR values from a statically allocated array within struct pci_dev
to a list of temporary, stand alone, entries.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_revert_fw_address() is used to reinstate a PCI device's original
FW-assigned BIOS BAR value(s) if normal resource assignment fails.
When attempting to reinstate an address, the point within the resource
tree from which to attempt the new resource request should be the parent
resource corresponding to the device, not the base of the resource tree
(ioport_resource or iomem_resource). For PCI devices this would
typically be the resource corresponding to the upstream PCI host bridge
or P2P bridge aperture.
This patch sets the point within the resource tree to attempt a new
resource assignment request to the PCI device's parent resource and only
if that fails does it fall back to the base ioport_resource or
iomem_resource.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During test busn_res allocation with cardbus, found pci card removal is not
working anymore, and it turns out it is broken by:
|commit 79cc9601c3
|Date: Tue Nov 22 21:06:53 2011 -0800
|
| PCI: Only call pci_stop_bus_device() one time for child devices at remove
The above changed the behavior of pci_remove_behind_bridge that
yenta_cardbus depended on. So restore the old behavoir of
pci_remove_behind_bridge (which requires stopping and removing of all
devices) by:
1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
__pci_remove_bus_device() call it instead.
2. add pci_stop_behind_bridge that will stop devices behind a bridge
3. add back pci_remove_behind_bridge that will stop and remove devices
under bridge.
-v2: update commit description a little bit.
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
the PCI_SRIOV_BAR are queried. The sys pagesize defaults to 4k,
so this change is required on powerpc box with 64k base page size.
This is a regression caused due to moving SRIOV init to sriov_enable().
| commit afd24ece5c
| Author: Ram Pai <linuxram@us.ibm.com>
| PCI: delay configuration of SRIOV capability
| The SRIOV capability, namely page size and total_vfs of a device are
| configured during enumeration phase of the device. This can potentially
| interfere with the PCI operations of the platform, if the IOV capability
| of the device is not enabled.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
when using ACPI resources (_CRS).
This is default, a manual workaround (without this patch)
would be pci=nocrs boot param.
V2: Add dev_warn if the workaround is hit. This should reveal
how common such setups are (via google) and point to possible
problems if things are still not working as expected.
-> Suggested by Jan Beulich.
Cc: stable@vger.kernel.org
Tested-by: garyhade@us.ibm.com
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
.. as the rest of the kernel is using that format.
Suggested-by: Марк Коренберг <socketpair@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This was done to resolve a merge and build problem with the
drivers/acpi/processor_driver.c file.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch (as1516) fixes a bug introduced during the removal of
put_driver() and get_driver() from drivers/pci/xen-pcifront.c.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1514) cleans up some places where new_id and remove_id
sysfs attributes are created and deleted. Handling both attributes in
a single routine rather than a pair of routines makes the code
smaller. It also prevents certain kinds of errors, like one we
currently have in the USB subsystem: The removeid attribute is often
created even when newid isn't (because the driver's no_dynamid_id flag
is set).
In the case of the PCMCIA subsystem, the newid attribute is created
but never explicitly deleted. The patch adds a deletion routine.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As part of the removal of get_driver()/put_driver(), this patch
(as1512) gets rid of various useless and unnecessary calls in several
drivers. In some cases it may be desirable to pin the driver by
calling try_module_get(), but that can be done later.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: "David S. Miller" <davem@davemloft.net>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Michael Buesch <m@bues.ch>
CC: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As part of the removal of get_driver()/put_driver(), this patch
(as1511) changes all the places that add dynamic IDs for drivers.
Since these additions are done by writing to the drivers' sysfs
attribute files, and the attributes are removed when the drivers are
unregistered, there is no reason to take an extra reference to the
drivers.
The one exception is the pci-stub driver, which calls pci_add_dynid()
as part of its registration. But again, there's no reason to take an
extra reference here, because the driver can't be unloaded while it is
being registered.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Dmitry Torokhov <dmitry.torokhov@gmail.com>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix new kernel-doc warnings:
Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()
Manually fix up a semantic mis-merge wrt security_netlink_recv():
- the interface was removed in commit fd77846152 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")
- a new user of it appeared in commit a38f7907b9 ("crypto: Add
userspace configuration API")
causing no automatic merge conflict, but Eric Paris pointed out the
issue.
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.
The call chain when a user does "bind" looks as so:
echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
and ends up calling:
driver_bind:
device_lock(dev); <=== TAKES LOCK
XXXX_probe:
.. pci_enable_device()
...__pci_reset_function(), which calls
pci_dev_reset(dev, 0):
if (!0) {
device_lock(dev) <==== DEADLOCK
The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
PCI: Increase resource array mask bit size in pcim_iomap_regions()
PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
x86/PCI: amd: factor out MMCONFIG discovery
PCI: Enable ATS at the device state restore
PCI: msi: fix imbalanced refcount of msi irq sysfs objects
PCI: kconfig: English typo in pci/pcie/Kconfig
PCI/PM/Runtime: make PCI traces quieter
PCI: remove pci_create_bus()
xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
sparc/PCI: convert to pci_create_root_bus()
sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
powerpc/PCI: convert to pci_create_root_bus()
powerpc/PCI: split PHB part out of pcibios_map_io_space()
...
Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
to the same patches being applied in other branches.
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
xen/pciback: Expand the warning message to include domain id.
xen/pciback: Fix "device has been assigned to X domain!" warning
xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
Xen: consolidate and simplify struct xenbus_driver instantiation
xen-gntalloc: introduce missing kfree
xen/xenbus: Fix compile error - missing header for xen_initial_domain()
xen/netback: Enable netback on HVM guests
xen/grant-table: Support mappings required by blkback
xenbus: Use grant-table wrapper functions
xenbus: Support HVM backends
xen/xenbus-frontend: Fix compile error with randconfig
xen/xenbus-frontend: Make error message more clear
xen/privcmd: Remove unused support for arch specific privcmp mmap
xen: Add xenbus_backend device
xen: Add xenbus device driver
xen: Add privcmd device driver
xen/gntalloc: fix reference counts on multi-page mappings
...
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...
The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".
Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
x86, apic: Add probe() for apic_flat
x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
x86: Add per-cpu stat counter for APIC ICR read tries
pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
x86: Add NumaChip support
x86: Add x86_init platform override to fix up NUMA core numbering
x86: Make flat_init_apic_ldr() available
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This warning was recently reported to me:
------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
Hardware name: VMware Virtual Platform
kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
being called.
Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
vmw_pvscsi
Pid: 630, comm: modprobe Tainted: G W 3.1.6-1.fc16.x86_64 #1
Call Trace:
[<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
[<ffffffff810da293>] ? free_desc+0x63/0x70
[<ffffffff812a9aa0>] kobject_put+0x50/0x60
[<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
[<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
[<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
[<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
[<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
[<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
[<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
[<ffffffff8138bceb>] __driver_attach+0xab/0xb0
[<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
[<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
[<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
[<ffffffff8138b63e>] driver_attach+0x1e/0x20
[<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
[<ffffffffa0188000>] ? 0xffffffffa0187fff
[<ffffffff8138c246>] driver_register+0x76/0x140
[<ffffffff815ca414>] ? printk+0x51/0x53
[<ffffffffa0188000>] ? 0xffffffffa0187fff
[<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
[<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
[<ffffffff81002042>] do_one_initcall+0x42/0x180
[<ffffffff810aad71>] sys_init_module+0x91/0x200
[<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
---[ end trace 44593438a59a9558 ]---
Using INTx interrupt, #Rx queues: 1.
It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
be called. Because populate_msi_sysfs fails, we never registered any of the
msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
on each of them, which gets flagged in the above stack trace.
The fix is pretty straightforward. We can key of the parent pointer in the
kobject. It is only set if the kobject_init_and_add succededs in
populate_msi_sysfs. If anything fails there, each kobject has its parent reset
to NULL
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When the runtime PM is activated on PCI, if a device switches state
frequently (e.g. an EHCI controller with autosuspending USB devices
connected) the PCI configuration traces might be very verbose in the
kernel log. Let's guard those traces with DEBUG condition.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All users of pci_create_bus() have been converted to pci_create_root_bus(),
so remove it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Users of pci_scan_bus_parented() should be converted to use either
pci_scan_root_bus() (preferred, but also calls pci_bus_add_devices)
or
pci_create_root_bus()
pci_scan_child_bus()
Since pci_scan_bus_parented(), I'm marking it deprecated now and will
actually remove it later.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This converts pci_scan_bus_parented() to use pci_create_root_bus()
instead of pci_create_bus(). The new bus still has the default (incorrect)
resources, so this patch doesn't help fix that problem, but it does remove
one more use of pci_create_bus().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I plan to deprecate pci_scan_bus_parented(), so use pci_create_root_bus()
directly instead. pci_scan_bus() itself will be removed as soon as all
callers are gone, so this is just an interim step.
v2: export pci_scan_bus
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
"Early" and "header" quirks often use incorrect bus resources because they
see the default resources assigned by pci_create_bus(), before the
architecture fixes them up (typically in pcibios_fixup_bus()). Regions
reserved by these quirks end up with the wrong parents.
Here's the standard path for scanning a PCI root bus:
pci_scan_bus or pci_scan_bus_parented
pci_create_bus <-- A create with default resources
pci_scan_child_bus
pci_scan_slot
pci_scan_single_device
pci_scan_device
pci_setup_device
pci_fixup_device(early) <-- B
pci_device_add
pci_fixup_device(header) <-- C
pcibios_fixup_bus <-- D fill in correct resources
Early and header quirks at B and C use the default (incorrect) root bus
resources rather than those filled in at D.
This patch adds a new pci_scan_root_bus() function that sets the bus
resources correctly from a supplied list of resources.
I intend to remove pci_scan_bus() and pci_scan_bus_parented() after
fixing all callers.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_create_bus() assigns ioport_resource and iomem_resource as the default
bus resources, i.e., the entire address space. Architectures fix these
later, typically in pcibios_fixup_bus() or after pci_scan_bus_parented()
returns, but code that runs in the interim sees incorrect resource
information.
This patch adds a new pci_create_root_bus() that sets the bus resources
correctly from a supplied list of resources.
I intend to remove pci_create_bus() after changing all callers.
Based on original patch by Deng-Cheng Zhu.
Reference: http://www.spinics.net/lists/mips/msg41654.html
Reference: https://lkml.org/lkml/2011/8/26/88
Signed-off-by: Deng-Cheng Zhu <dczhu@mips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Show the bus number and resources for every root bus we create. This
will become more interesting when we supply the correct resources
instead of using the defaults (ioport_resource and iomem_resource).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We'd like to supply a list of resources when we create a new PCI bus,
e.g., the root bus under a PCI host bridge. These are helpers for
constructing that list.
These are exported because the plan is to replace this exported interface:
pci_scan_bus_parented()
with this one:
pci_add_resource(resources, ...)
pci_scan_root_bus(..., resources)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device. This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.
The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.
The patch is tested on x86 and power platform.
Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.
So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).
There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.
No functional change.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.
This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.
This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.
Adaptions of the ipr driver were originally written by Brian King.
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.
Signed-off-by: David Fries <David@Fries.net>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup. The booting kernel had not enabled those interupts so
was not prepared to handle them.
I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts. The pci
spec specifies that msi interrupts should be off by default. Drivers
are expected to enable the msi interrupts if they want to use them. Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.
This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.
Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use non-ordered workqueue for attention button events.
Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix improper workqueue cleanup.
In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.
This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.
It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.
Measured to save around 5W on an idle Thinkpad X220.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These are extended capabilities, rename and move to proper
group for consistency.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs
This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs. For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
security_capable takes ns, cred, cap. But the LSM capable() hook takes
cred, ns, cap. The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!
Signed-off-by: Eric Paris <eparis@redhat.com>
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.
Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).
Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.
v2: Restore DRV_NAME for the driver name in xen-pciback.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
I noticed that hotplug of one setup does not work with recent change in
pci tree.
After checking the bridge conf setup, I noticed that the bridges get
assigned but do not get enabled.
The reason is the following commit, while simply ignores bridge
resources when enabling a pci device:
| commit bbef98ab0f
| Author: Ram Pai <linuxram@us.ibm.com>
| Date: Sun Nov 6 10:33:10 2011 +0800
|
| PCI: defer enablement of SRIOV BARS
|...
| NOTE: Note, there is subtle change in the pci_enable_device() API. Any
| driver that depends on SRIOV BARS to be enabled in pci_enable_device()
| can fail.
Put back bridge resource and ROM resource checking to fix the problem.
That should fix regression like BIOS does not assign correct resource to
bridge.
Discussion can be found at:
http://www.spinics.net/lists/linux-pci/msg12874.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During test of one IB card with guest VM, found that, msi is not
initialized properly.
It turns out __write_msi_msg will do nothing if device current_state is
not PCI_D0. And, that pci device does not have pm_cap in guest VM.
There is an error in setting of power state to PCI_D0 in
pci_enable_device(), but error is not returned for this. Following is
code flow:
pci_enable_device() --> __pci_enable_device_flags() -->
do_pci_enable_device() --> pci_set_power_state() -->
__pci_start_power_transition()
We have following condition inside __pci_start_power_transition():
if (platform_pci_power_manageable(dev)) {
error = platform_pci_set_power_state(dev, state);
if (!error)
pci_update_current_state(dev, state);
} else {
error = -ENODEV;
/* Fall back to PCI_D0 if native PM is not supported */
if (!dev->pm_cap)
dev->current_state = PCI_D0;
}
Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is
getting called and that is failing with ENODEV because of following
condition:
if (!handle || ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp)))
return -ENODEV;
Because of that, pci_update_current_state() is not getting called.
With this patch, if device power state can not be set via
platform_pci_set_power_state and that device does not have native pm
support, then PCI device power state will be set to PCI_D0.
-v2: This also reverts 47e9037ac1, as it's
not needed after this change.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 0d52f54e2e (PCI / ACPI: Make
acpiphp ignore root bridges using PCIe native hotplug) added code
that made the acpiphp driver completely ignore PCIe root complexes
for which the kernel had been granted control of the native PCIe
hotplug feature by the BIOS through _OSC. Unfortunately, however,
this was a mistake, because on some systems there were PCI bridges
supporting PCI (non-PCIe) hotplug under such root complexes and
those bridges should have been handled by acpiphp.
For this reason, revert the changes made by the commit mentioned
above and make register_slot() in drivers/pci/hotplug/acpiphp_glue.c
avoid registering hotplug slots for PCIe ports that belong to
root complexes with native PCIe hotplug enabled (which means that
the BIOS has granted the kernel control of this feature for the
given root complex). This is reported to address the original
issue fixed by commit 0d52f54e2e and
to work on the system where that commit broke things.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This adjusts PCI_IOAPIC to be user configurable (possibly as a
module) on x86, since the base architecture code for adding
IO-APICs dynamically isn't there yet (and hence having the code
present everywhere is pretty pointless).
To make this consistent, a MODULE_DEVICE_TABLE() declaration
gets added, the class specifications get corrected (by properly
using PCI_DEVICE_CLASS() intended for purposes like this), and
the probe and remove functions get their sections adjusted.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/4EDDD71A02000078000659F1@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I get this compile failure on parisc:
drivers/pci/ats.c: In function 'ats_alloc_one':
drivers/pci/ats.c:29: error: implicit declaration of function 'kzalloc'
drivers/pci/ats.c:29: warning: assignment makes pointer from integer without a cast
drivers/pci/ats.c: In function 'ats_free_one':
drivers/pci/ats.c:45: error: implicit declaration of function 'kfree'
Because ats.c is missing linux/slab.h as an include. This patch fixes it
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device(). This unnecessarily enables SRIOV BARs of the
device.
On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.
The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().
NOTE: Note, there is subtle change in the pci_enable_device() API. Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.
The patch has been touch tested on power and x86 platform.
Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
More consistency cleanups. Drop the _OFF, separate and indent
CTRL/CAP/STATUS bit definitions. This helped find the previous
mis-use of bit 0 in the PASID capability register.
Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PASID ECN indicates bit 0 is reserved in the capability register.
Switch pci_enable_pasid() to error if PASID is already enabled and
don't expose enable as a feature in pci_pasid_features().
Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup. The booting kernel had not enabled those interupts so
was not prepared to handle them.
I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts. The pci
spec specifies that msi interrupts should be off by default. Drivers
are expected to enable the msi interrupts if they want to use them. Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.
This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.
Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If the kernel has requested control of the SHPC native hotplug
feature for a given root bridge, the acpiphp driver should not try
to handle that root bridge and it should leave it to shpchp.
Failing to do so causes problems to happen if shpchp is loaded
and unloaded before loading acpiphp (ACPI-based hotplug won't work
in that case anyway).
To address this issue make find_root_bridges() ignore PCI root
bridges with SHPC native hotplug enabled and make add_bridge()
return error code if SHPC native hotplug is enabled for the given
root bridge. This causes acpiphp to refuse to load if SHPC native
hotplug is enabled for all root bridges and to refuse binding to
the root bridges with SHPC native hotplug enabled.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use non-ordered workqueue for attention button events.
Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix improper workqueue cleanup.
In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.
This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.
It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.
Measured to save around 5W on an idle Thinkpad X220.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These are extended capabilities, rename and move to proper
group for consistency.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>