* pci/misc:
PCI: Remove pcie_cap_has_devctl()
PCI: Support PCIe Capability Slot registers only for ports with slots
PCI: Remove PCIe Capability version checks
PCI: Allow PCIe Capability link-related register access for switches
PCI: Add offsets of PCIe capability registers
PCI: Tidy bitmasks and spacing of PCIe capability definitions
PCI: Remove obsolete comment reference to pci_pcie_cap2()
PCI: Clarify PCI_EXP_TYPE_PCI_BRIDGE comment
PCI: Rename PCIe capability definitions to follow convention
PCI: Disable decoding for BAR sizing only when it was actually enabled
PCI: Add comment about needing pci_msi_off() even when CONFIG_PCI_MSI=n
PCI: Add pcibios_pm_ops for optional arch-specific hibernate functionality
pcie_cap_has_devctl() does nothing, so remove it. Simplicity over
consistency in this case. No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>
Previously we allowed callers to access Slot Capabilities, Status, and
Control for Root Ports even if the Root Port did not implement a slot.
This seems dubious because the spec only requires these registers if a
slot is implemented.
It's true that even Root Ports without slots must have *space* for these
slot registers, because the Root Capabilities, Status, and Control
registers are after the slot registers in the capability. However,
for a v1 PCIe Capability, the *semantics* of the slot registers are
undefined unless a slot is implemented.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>
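The slot-register rule described above can be sketched in userspace C. This is only an illustration: cap_has_slot_regs() is a hypothetical helper name, and the bit values are the PCIe Capabilities register fields as I understand them from PCIe r3.0, sec 7.8.2, not a copy of the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCIe Capabilities register fields (assumed from PCIe r3.0, sec 7.8.2) */
#define PCI_EXP_FLAGS_TYPE	0x00f0	/* Device/Port type */
#define PCI_EXP_FLAGS_SLOT	0x0100	/* Slot Implemented */
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_DOWNSTREAM	0x6

/*
 * Hypothetical helper: given the raw 16-bit PCIe Capabilities register,
 * report whether the Slot Capabilities/Control/Status registers are
 * meaningful.  Only ports that can have a slot AND advertise one qualify.
 */
bool cap_has_slot_regs(uint16_t flags)
{
	unsigned int type = (flags & PCI_EXP_FLAGS_TYPE) >> 4;

	return (type == PCI_EXP_TYPE_ROOT_PORT ||
		type == PCI_EXP_TYPE_DOWNSTREAM) &&
	       (flags & PCI_EXP_FLAGS_SLOT) != 0;
}
```

A Root Port whose Slot Implemented bit is clear still has space for the slot registers, but this check treats them as absent.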
Previously we relied on the PCIe r3.0, sec 7.8, spec language that says
"For Functions that do not implement the [Link, Slot, Root] registers,
these spaces must be hardwired to 0b," which means that for v2 PCIe
capabilities, we don't need to check the device type at all.
But it's simpler if we don't need to check the capability version at all,
and I think the spec is explicit enough about which registers are required
for which types that we can remove the version checks.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>
Every PCIe device has a link, except Root Complex Integrated Endpoints
and Root Complex Event Collectors. Previously we didn't give access
to PCIe capability link-related registers for Upstream Ports, Downstream
Ports, and Bridges, so attempts to read PCI_EXP_LNKCTL incorrectly
returned zero. See PCIe spec r3.0, sec 7.8 and 1.3.2.3.
Reference: http://lkml.kernel.org/r/979A8436335E3744ADCD3A9F2A2B68A52AD136BE@SJEXCHMB10.corp.ad.broadcom.com
Reported-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-By: Jiang Liu <jiang.liu@huawei.com>
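As a rough illustration of the rule above (every type has a link except Root Complex Integrated Endpoints and Root Complex Event Collectors), the check can be written as an explicit list of device types. type_has_link_regs() is a hypothetical name, and the type encodings are assumed from the Device/Port Type field in PCIe r3.0, sec 7.8.2.

```c
#include <stdbool.h>

/* Device/Port Type encodings (assumed from PCIe r3.0, sec 7.8.2) */
#define PCI_EXP_TYPE_ENDPOINT	 0x0	/* Express Endpoint */
#define PCI_EXP_TYPE_LEG_END	 0x1	/* Legacy Endpoint */
#define PCI_EXP_TYPE_ROOT_PORT	 0x4	/* Root Port */
#define PCI_EXP_TYPE_UPSTREAM	 0x5	/* Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM	 0x6	/* Downstream Port */
#define PCI_EXP_TYPE_PCI_BRIDGE	 0x7	/* PCIe to PCI/PCI-X Bridge */
#define PCI_EXP_TYPE_PCIE_BRIDGE 0x8	/* PCI/PCI-X to PCIe Bridge */
#define PCI_EXP_TYPE_RC_END	 0x9	/* Root Complex Integrated Endpoint */
#define PCI_EXP_TYPE_RC_EC	 0xa	/* Root Complex Event Collector */

/* Hypothetical helper: does this device type implement the Link registers? */
bool type_has_link_regs(int type)
{
	return type == PCI_EXP_TYPE_ENDPOINT ||
	       type == PCI_EXP_TYPE_LEG_END ||
	       type == PCI_EXP_TYPE_ROOT_PORT ||
	       type == PCI_EXP_TYPE_UPSTREAM ||
	       type == PCI_EXP_TYPE_DOWNSTREAM ||
	       type == PCI_EXP_TYPE_PCI_BRIDGE ||
	       type == PCI_EXP_TYPE_PCIE_BRIDGE;
}
```

With a check of this shape, a read of PCI_EXP_LNKCTL on a switch port reaches the hardware instead of being short-circuited to zero.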
These offsets are not used, and in some cases are completely reserved
even in the spec, but I'm adding them for completeness just to match
the diagrams in the spec, e.g., PCIe spec r3.0, sec 7.8.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The convention of showing bits in a mask of the full register width, e.g.,
"0x00000007" instead of "0x07" for a field in a 32-bit register, is common
but not universal in this file. This patch applies it consistently, at
least for the PCIe capability definitions.
Whitespace and zero-extension changes only; no functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_pcie_cap2() was replaced by pcie_capability_read_word() and similar
functions, so update the comment.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI_EXP_TYPE_PCI_BRIDGE is a *PCIe* function that is a bridge to
PCI/PCI-X. See PCIe spec r3.0, sec 7.8.2.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
All other PCIe capability register fields include "PCI_EXP" + <reg-name> +
<field-name>. This renames PCI_EXP_OBFF_MASK, PCI_EXP_IDO_REQ_EN,
PCI_EXP_LTR_EN, and related fields using the same convention.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com> # for MFD driver
* pci/yinghai-assign-unassigned-v6:
PCI: Assign resources for hot-added host bridge more aggressively
PCI: Move resource reallocation code to non-__init
PCI: Delay enabling bridges until they're needed
PCI: Assign resources on a per-bus basis
PCI: Enable unassigned resource reallocation on per-bus basis
PCI: Turn on reallocation for unassigned resources with host bridge offset
PCI: Look for unassigned resources on per-bus basis
PCI: Drop temporary variable in pci_assign_unassigned_resources()
If a BIOS configures MPS incorrectly, devices may not work normally.
For example, if a bridge has MPS set larger than an endpoint below it,
the endpoint may discard packets.
To help diagnose this issue, print a warning if we find an endpoint
MPS setting different than that of the upstream bridge.
[bhelgaas: changelog, "bridge" temporary, warning text]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=60799
Reported-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jon Mason <jdmason@kudzu.us>
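A minimal sketch of the MPS comparison described above, assuming the Max_Payload_Size field occupies bits 7:5 of the Device Control register and encodes 128 << n bytes. The function names are hypothetical and the warning text is illustrative, not the kernel's actual message.

```c
#include <stdint.h>
#include <stdio.h>

#define PCI_EXP_DEVCTL_PAYLOAD	0x00e0	/* Max_Payload_Size field (assumed) */

/* Decode the MPS in bytes from a raw Device Control register value. */
int mps_from_devctl(uint16_t devctl)
{
	return 128 << ((devctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5);
}

/*
 * Hypothetical check: return 1 and print a diagnostic if the endpoint's
 * MPS differs from that of its upstream bridge, 0 otherwise.
 */
int mps_mismatch(uint16_t ep_devctl, uint16_t bridge_devctl)
{
	int ep_mps = mps_from_devctl(ep_devctl);
	int bridge_mps = mps_from_devctl(bridge_devctl);

	if (ep_mps == bridge_mps)
		return 0;

	printf("Max Payload Size %d, but upstream bridge set to %d\n",
	       ep_mps, bridge_mps);
	return 1;
}
```

The check only reports the mismatch; it does not try to repair the BIOS configuration.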
Correct minor wording issue in MPS peer-to-peer comment. Noticed by Don
Dutile.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We disable BARs while sizing them so we don't cause conflicts with other
devices (see 253d2e5498 and bbffe43524). But if device decoding is already
disabled before we size the BAR, we don't need to disable it again.
[bhelgaas: changelog, add PCI_COMMAND_DECODING_ENABLE for readability]
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
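The idea above can be sketched against a simulated device. Everything here is an assumption made for illustration: struct fake_dev, the write helpers, and the DECODE_ENABLE name are hypothetical, and the sizing logic is reduced to a single 32-bit memory BAR.

```c
#include <stdint.h>

#define PCI_COMMAND_IO		0x1	/* I/O space decoding */
#define PCI_COMMAND_MEMORY	0x2	/* Memory space decoding */
#define DECODE_ENABLE		(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)

/* Hypothetical simulated device with one power-of-two sized memory BAR. */
struct fake_dev {
	uint16_t command;
	uint32_t bar;
	uint32_t size;		/* BAR size in bytes, power of two */
	int command_writes;	/* counts writes to the Command register */
};

void write_command(struct fake_dev *d, uint16_t val)
{
	d->command = val;
	d->command_writes++;
}

void write_bar(struct fake_dev *d, uint32_t val)
{
	d->bar = val & ~(d->size - 1);	/* low address bits are hardwired to 0 */
}

/* Size the BAR, touching the Command register only if decoding was on. */
uint32_t size_bar(struct fake_dev *d)
{
	uint16_t orig_cmd = d->command;
	uint32_t orig_bar, probe;

	if (orig_cmd & DECODE_ENABLE)
		write_command(d, (uint16_t)(orig_cmd & ~DECODE_ENABLE));

	orig_bar = d->bar;
	write_bar(d, 0xffffffffu);
	probe = d->bar;
	write_bar(d, orig_bar);

	if (orig_cmd & DECODE_ENABLE)
		write_command(d, orig_cmd);

	return ~(probe & ~0xfu) + 1;	/* mask the flag bits, then invert */
}
```

When decoding is already off, the sketch performs no Command register writes at all, which is the point of the change.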
Per f5f2b13129 ("msi: sanely support hardware level msi disabling"), we
want pci_msi_off() to work even if MSI support is not compiled into the
kernel, and there are existing callers that use it when CONFIG_PCI_MSI=n.
This adds a comment to that effect.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Platforms may want to provide architecture-specific functionality when
a PCI device is doing a hibernate transition. Add a weak symbol
pcibios_pm_ops that architectures can override to do so.
[bhelgaas: fold in return value checks from v2 patch]
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When booting with "pci=pcie_bus_safe", we previously limited the
fabric MPS to 128 when we found:
(1) A hotplug-capable Downstream Port ("dev->is_hotplug_bridge &&
pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT"), or
(2) A hotplug-capable Root Port with a slot that was either empty or
contained a multi-function device ("dev->is_hotplug_bridge &&
!list_is_singular(&dev->bus->devices)")
Part (1) is valid, but part (2) is not.
After a hot-add in the slot below a Root Port, we can reconfigure all
MPS values in the fabric below the Root Port because the new device is
the only thing below the Root Port and there are no active drivers.
Therefore, there's no reason to limit the MPS for Root Ports, no
matter what's in the slot.
Test info:
-+-[0000:40]-+-07.0-[0000:46]--+-00.0 Intel 82576 NIC
                               \-00.1 Intel 82576 NIC
0000:40:07.0 Root Port bridge to [bus 46] (MPS supported=256)
0000:46:00.0 Endpoint (MPS supported=512)
0000:46:00.1 Endpoint (MPS supported=512)
# echo 0 > /sys/bus/pci/slots/7/power
# echo 1 > /sys/bus/pci/slots/7/power
pcieport 0000:40:07.0: PCI-E Max Payload Size set to 256/ 256 (was 256)
pci 0000:46:00.0: PCI-E Max Payload Size set to 256/ 512 (was 128)
pci 0000:46:00.1: PCI-E Max Payload Size set to 256/ 512 (was 128)
Before this change, we set MPS to 128 for the Root Port and both NICs
because the slot contained a multi-function device and
dev->is_hotplug_bridge && !list_is_singular(&dev->bus->devices)
was true. After this change, we set it to 256.
[bhelgaas: changelog, comments, split out upstream bridge check]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jon Mason <jdmason@kudzu.us>
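The "pcie_bus_safe" decision after this change can be approximated as follows. This is a simplified sketch: struct port and both functions are hypothetical, and it considers only one port and its immediate children, whereas the kernel walks the whole fabric.

```c
#include <stdbool.h>

#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_DOWNSTREAM	0x6

/* Hypothetical, reduced view of a port for illustration only. */
struct port {
	int type;
	bool is_hotplug_bridge;
	int mps_supported;	/* largest MPS the port supports, in bytes */
};

/* After the change, only hotplug-capable non-Root Ports force MPS 128. */
bool must_limit_mps(const struct port *p)
{
	return p->is_hotplug_bridge && p->type != PCI_EXP_TYPE_ROOT_PORT;
}

/* Pick the largest MPS that the port and every child below it supports. */
int safe_mps(const struct port *p, const int *child_mps, int nchildren)
{
	int mps = p->mps_supported;
	int i;

	for (i = 0; i < nchildren; i++)
		if (child_mps[i] < mps)
			mps = child_mps[i];

	if (must_limit_mps(p))
		mps = 128;

	return mps;
}
```

With the values from the test info above (Root Port supporting 256, two functions supporting 512), the sketch selects 256 rather than 128.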
PCIe hotplug bridges are always either Root Ports or Downstream Ports. No
other device type can have a PCIe link leading downstream to a slot.
Root Ports don't have an upstream bridge, so "dev->is_hotplug_bridge &&
dev->bus->self" is true if and only if "dev" is a Downstream Port. That
means we can simplify this by looking at the type of "dev" itself, without
looking upstream at all.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
After 59875ae489 ("PCI/core: Use PCI Express Capability accessors"),
pcie_get_mps() never returns an error, so don't bother to check for it.
No functional change.
[bhelgaas: changelog, fix pcie_get_mps() doc]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Based on a patch by Jon Mason (see URL below).
All users of pcie_bus_configure_settings() pass arguments of the form
"bus, bus->self->pcie_mpss". The "mpss" argument is redundant since we
can easily look it up internally. In addition, all callers check
"bus->self" for NULL, which we can also do internally.
This patch simplifies the interface and the callers. No functional change.
Reference: http://lkml.kernel.org/r/1317048850-30728-2-git-send-email-mason@myri.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The conventional spelling is "PCIe", but I think even that is superfluous,
so remove the whole thing.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Users of pci_reset_bus() and pci_reset_slot() need a way to probe
whether the bus or slot supports reset. Add trivial helper functions
and export them as vfio-pci will make use of these.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
One PCI bus reset function to rule them all.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI spec indicates that with stable power, reset needs to be
asserted for a minimum of 1ms (Trst). We should be able to assume
stable power for a Hot Reset, but we add another millisecond as
a fudge factor to make sure the reset is seen on the bus for at least
a full 1ms.
After reset is de-asserted we must wait for devices to complete
initialization. The specs refer to this as "recovery time" (Trhfa).
For PCI this is 2^25 clock cycles or 2^26 for PCI-X. For minimum
bus speeds, both of those come to 1s. PCIe "softens" this
requirement with the Configuration Request Retry Status (CRS)
completion status. Theoretically we could use CRS to shorten the
wait time. We don't make use of that here, using a fixed 1s delay
to allow devices to re-initialize.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
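The "both of those come to 1s" figure above can be checked with a little arithmetic. The clock rates are my assumptions (a nominal 33 MHz for conventional PCI and 66 MHz for PCI-X), and the constant and function names are hypothetical.

```c
/* Delays described in the changelog above */
#define RESET_ASSERT_MS	2	/* 1ms Trst plus 1ms fudge factor */
#define RECOVERY_MS	1000	/* fixed wait after reset is de-asserted */

/* Time in seconds for 2^log2_clocks bus clock cycles at clock_hz. */
double recovery_seconds(unsigned int log2_clocks, double clock_hz)
{
	return (double)(1ULL << log2_clocks) / clock_hz;
}
```

Under those assumptions, recovery_seconds(25, 33e6) and recovery_seconds(26, 66e6) both come out slightly above one second, so the fixed 1s delay is a round approximation of Trhfa.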
Devices come out of reset in D0. Restoring a device to a different
post-reset state takes more smarts than our simple config space
restore, which can leave devices in an inconsistent state. For
example, if a device is reset in D3, but the restore doesn't
successfully return the device to D3, then the actual state of the
device and dev->current_state are contradictory. Put everything
in D0 going into the reset so that we don't need to do anything
special on the way out.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Sometimes pci_reset_function() is not sufficient. We have cases where
devices do not support any kind of reset, but there might be multiple
functions on the bus preventing pci_reset_function() from doing a
secondary bus reset. We also have cases where a device will advertise
that it supports a PM reset, but really does nothing on D3hot->D0
(graphics cards are notorious for this). These devices often also
have more than one function, so even blacklisting PM reset for them
wouldn't allow a secondary bus reset through pci_reset_function().
If a driver supports multiple devices it should have the ability to
induce a bus reset when it needs to. This patch provides that ability
through pci_reset_slot() and pci_reset_bus(). It's the caller's
responsibility when using these interfaces to understand that all of
the devices in or below the slot (or on or below the bus) will be
reset and therefore should be under control of the caller. PCI state
of all the affected devices is saved and restored around these resets,
but internal state of all of the affected devices is reset (which
should be the intention).
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Only cosmetic code changes to existing paths. Expand the comment in
the new pci_dev_save_and_disable() function since there's a lot
hidden in that Command register write.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If the hotplug controller provides a way to reset a slot, use that
before a direct parent bus reset. Like the bus reset option, this is
only available when a single pci_dev occupies the slot.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe hotplug has a bus per slot, so we can just use a normal
secondary bus reset. However, if a slot supports surprise removal,
a bus reset can be seen as a presence detection change, triggering
a hot-remove followed by a hot-add. Prevent presence detection from
triggering an interrupt or being polled around the bus reset.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This optional callback allows hotplug controllers to perform slot
specific resets. These may be necessary in cases where a normal
secondary bus reset can interact with controller logic and expose
spurious hotplugs.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/vipul-chelsio-reset-v2:
PCI: Use pci_wait_for_pending_transaction() instead of for loop
bnx2x: Use pci_wait_for_pending_transaction() instead of for loop
PCI: Chelsio quirk: Enable Bus Master during Function-Level Reset
PCI: Add pci_wait_for_pending_transaction()
A new routine, pci_wait_for_pending_transaction(), has been added to avoid
duplicating the code that waits for pending PCI transactions to complete.
This patch makes use of that routine.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A new routine, pci_wait_for_pending_transaction(), has been added to avoid
duplicating the code that waits for pending PCI transactions to complete.
This patch makes use of that routine.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Acked-by: David S. Miller <davem@davemloft.net>
T4 can wedge if there are DMAs in flight within the chip and Bus
Master has been disabled. We need to keep it enabled until the Function
Level Reset completes. T4 can also suffer a Head Of Line blocking
problem if MSI-X interrupts are disabled before the FLR has completed.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a new routine to avoid duplicating the code that waits for pending
PCI transactions to complete.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
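A wait loop of the kind this series factors out might look like the sketch below. It runs against a simulated device rather than real hardware; struct fake_dev and its helpers are hypothetical, and the retry count and 100/200/400 ms backoff are assumptions, not something the changelog states.

```c
#include <stdint.h>

#define PCI_EXP_DEVSTA_TRPND	0x0020	/* Transactions Pending (assumed bit) */

/* Hypothetical simulated device: stays busy for a number of status reads. */
struct fake_dev {
	int reads_until_idle;
	unsigned int slept_ms;	/* total simulated sleep time */
};

uint16_t read_devsta(struct fake_dev *d)
{
	if (d->reads_until_idle > 0) {
		d->reads_until_idle--;
		return PCI_EXP_DEVSTA_TRPND;
	}
	return 0;
}

void fake_msleep(struct fake_dev *d, unsigned int ms)
{
	d->slept_ms += ms;	/* record instead of actually sleeping */
}

/* Return 1 once no transactions are pending, 0 if we give up waiting. */
int wait_for_pending_transaction(struct fake_dev *d)
{
	int i;

	for (i = 0; i < 4; i++) {
		if (i)
			fake_msleep(d, (1u << (i - 1)) * 100);

		if (!(read_devsta(d) & PCI_EXP_DEVSTA_TRPND))
			return 1;
	}

	return 0;
}
```

Callers that previously open-coded such a loop can replace it with a single call and a check of the return value.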
* pci/misc:
PCI: exynos: Split into Synopsys part and Exynos part
PCI: mvebu: Make Marvell PCIe driver depend on OF
PCI: mvebu: Convert to use devm_ioremap_resource
The Exynos PCIe IP consists of a Synopsys-specific part and an
Exynos-specific part. Only the core block is a Synopsys Designware part;
the other parts are Exynos-specific.
Also, the Synopsys Designware part can be shared with other
platforms; thus, the driver can be split into two parts: a Synopsys
Designware part and an Exynos-specific part.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Pratyush Anand <pratyush.anand@st.com>
Cc: Mohit KUMAR <Mohit.KUMAR@st.com>
The Marvell PCIe host controller driver is heavily tied to Device Tree
APIs, and can only be used on platforms where the Device Tree is
used. Therefore, it should "depends on OF" to avoid build failures on
!OF configurations.
Reported-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move the secondary bus reset code from pci_parent_bus_reset() into its own
function. Export it as we'll later be calling it from hotplug controllers
and elsewhere.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/wei-resource-cleanups:
PCI: Align bridge I/O windows as required by downstream devices & bridges
PCI: Fix types in pbus_size_io()
PCI: Add comments for pbus_size_mem() parameters
PCI: Enumerate subordinate buses, not devices, in pci_bus_get_depth()
Commit 75096579c3 ("lib: devres: Introduce devm_ioremap_resource()")
introduced devm_ioremap_resource() and deprecated the use of
devm_request_and_ioremap().
While at it, modify mvebu_pcie_map_registers() to propagate error code.
Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
An upstream bridge's I/O window must be at least as aligned as any
downstream device or bridge requires. In particular, if the upstream
bridge supports 1K alignment but a downstream bridge requires 4K alignment,
the upstream window must also be 4K aligned.
Therefore, do not reduce the required alignment ("min_align") based on
the upstream bridge's capabilities.
Reported-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
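The alignment rule above amounts to taking the maximum of the bridge's own window alignment and every downstream requirement, with no final clamp back down to the bridge's capability. A minimal sketch, with a hypothetical function name and plain arrays standing in for the kernel's resource lists:

```c
#include <stdint.h>

/*
 * Hypothetical helper: bridge_align is the smallest alignment the upstream
 * bridge supports (e.g. 1K or 4K); child_align[] holds the alignment each
 * downstream device or bridge requires.  All values are powers of two.
 */
uint64_t io_window_align(uint64_t bridge_align,
			 const uint64_t *child_align, int nchildren)
{
	uint64_t min_align = bridge_align;
	int i;

	for (i = 0; i < nchildren; i++)
		if (child_align[i] > min_align)
			min_align = child_align[i];

	/* Deliberately no "min_align = bridge_align" clamp here. */
	return min_align;
}
```

With a 1K-capable upstream bridge and a downstream bridge that needs 4K, the result is 4K.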
This patch changes the type of "size" to resource_size_t and makes the
corresponding dev_printk() change.
[bhelgaas: changelog]
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch fills in the missing description for two parameters of
pbus_size_mem().
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Normally, a PCI bus has more devices than bridges. When calculating the
depth of a PCI bus, it is more time-efficient to enumerate the child buses
instead of the child devices.
This also makes the code more self-explanatory. Previously, it went
through the devices and checked whether a bridge introduced a child bus,
which takes more background knowledge to understand.
This patch calculates the depth by enumerating the bus hierarchy.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
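The depth calculation described above can be sketched as a recursion over child buses only. struct bus here is a hypothetical stand-in with a simple sibling list, not the kernel's struct pci_bus.

```c
#include <stddef.h>

/* Hypothetical bus node: children are linked through next_sibling. */
struct bus {
	struct bus *first_child;
	struct bus *next_sibling;
};

/* Depth of the hierarchy below a bus; a bus with no child buses has depth 0. */
int bus_depth(const struct bus *bus)
{
	const struct bus *child;
	int depth = 0;

	for (child = bus->first_child; child; child = child->next_sibling) {
		int d = bus_depth(child) + 1;

		if (d > depth)
			depth = d;
	}

	return depth;
}
```

Devices never appear in the walk, so there is no need to test each one for a subordinate bus.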
* pci/misc:
PCI: Fix comment typo for pci_add_cap_save_buffer()
PCI: Return -ENOSYS for SR-IOV operations on non-SR-IOV devices
PCI: Update NumVFs register when disabling SR-IOV
x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero
PCI: Convert class code to use dev_groups
frv/PCI: Mark pcibios_fixup_bus() as non-init
x86/pci/mrst: Cleanup checkpatch.pl warnings
PCI: Rename "PCI Express support" kconfig title
PCI: Fix comment typo in iov.c
* pci/aw-acs-fixes-v2:
PCI: Claim ACS support for AMD southbridge devices
PCI: Differentiate ACS controllable from enabled
PCI: Check all ACS features for multifunction downstream ports