License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
The criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
  considered to have no license information in it, and the top level
  COPYING file license applied.

  For non */uapi/* files that summary was:

  SPDX license identifier                             # files
  ---------------------------------------------------|-------
  GPL-2.0                                               11139

  and resulted in the first patch in this series.

  If that file was a */uapi/* path one, it was "GPL-2.0 WITH
  Linux-syscall-note" otherwise it was "GPL-2.0". The results were:

  SPDX license identifier                             # files
  ---------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                         930

  and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
  of the */uapi/* ones, it was denoted with the Linux-syscall-note if
  any GPL family license was found in the file or had no licensing in
  it (per prior point). Results summary:

  SPDX license identifier                             # files
  ---------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                         270
  GPL-2.0+ WITH Linux-syscall-note                        169
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
  LGPL-2.1+ WITH Linux-syscall-note                        15
  GPL-1.0+ WITH Linux-syscall-note                         14
  ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
  LGPL-2.0+ WITH Linux-syscall-note                         4
  LGPL-2.1 WITH Linux-syscall-note                          3
  ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
  ((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1

  and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
  the concluded license(s).
- when there was disagreement between the two scanners (one detected a
  license but the other didn't, or they both detected different
  licenses) a manual inspection of the file occurred.
  - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply (and
    which scanner probably needed to revisit its heuristics).
  - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.
  - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights.
The Windriver scanner is based in part on an older version of FOSSology,
so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
// SPDX-License-Identifier: GPL-2.0
2005-04-17 06:20:36 +08:00
/*
 * Purpose: PCI Express Port Bus Driver's Core Functions
 *
 * Copyright (C) 2004 Intel
 * Copyright (C) Tom Long Nguyen (tom.l.nguyen@intel.com)
 */

#include <linux/module.h>
#include <linux/pci.h>
#include <linux/kernel.h>
PCI: Add missing link delays required by the PCIe spec
Currently Linux does not follow PCIe spec regarding the required delays
after reset. A concrete example is a Thunderbolt add-in-card that
consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                +-01.0-[04-36]-- DS hotplug port
                                +-02.0-[37]----00.0 xHCI controller
                                \-04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe
gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to PCIe 4.0 section 5.8 the
PCIe switch is put into reset and its power is re-applied. This means
that we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0
GT/s, software must wait a minimum of 100 ms after Link training
completes before sending a Configuration Request to the device
immediately below that Port. Software can determine when Link training
completes by polling the Data Link Layer Link Active bit or by setting
up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA
stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0
pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
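The timing rule can be written down as a small pure function. This is a sketch under stated assumptions, not the code added by this patch: `earliest_config_access_ms` is a hypothetical helper, and the branch for ports limited to 5.0 GT/s or less (100 ms counted from the end of reset rather than from DLLLA) is taken from the same spec section and is not quoted in this changelog.

```c
#include <stdbool.h>

/* Minimum wait from PCIe 4.0 sec 6.6.1, in milliseconds. */
#define PCIE_RESET_READY_DELAY_MS 100

/*
 * Hypothetical helper (not a kernel API): earliest time, in ms, at which
 * a Configuration Request may be sent to the device immediately below a
 * Downstream Port.
 *
 * @faster_than_5gts: port supports Link speeds greater than 5.0 GT/s
 * @reset_done_ms:    time at which the reset ended / power was re-applied
 * @dllla_set_ms:     time at which Data Link Layer Link Active became set
 */
static unsigned int earliest_config_access_ms(bool faster_than_5gts,
					      unsigned int reset_done_ms,
					      unsigned int dllla_set_ms)
{
	/* Fast ports: the 100 ms starts once Link training has completed. */
	if (faster_than_5gts)
		return dllla_set_ms + PCIE_RESET_READY_DELAY_MS;

	/* Assumption: slower ports count the 100 ms from the end of reset. */
	return reset_done_ms + PCIE_RESET_READY_DELAY_MS;
}
```

For the gen3 root port 1b.0, if DLLLA were set 35 ms after power is re-applied, the first access to 0000:01:00.0 would be allowed no earlier than 135 ms.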
I've instrumented the kernel with additional logging so we can see the
actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
pcieport 0000:00:1b.0: waking up bus
pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
...
pcieport 0000:00:1b.0: PME# disabled
pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:01:00.0: PME# disabled
pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:00.0: PME# disabled
pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:01.0: PME# disabled
pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:02.0: PME# disabled
pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:04.0: PME# disabled
pcieport 0000:02:01.0: PME# enabled
pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
pcieport 0000:02:04.0: PME# enabled
pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
...
thunderbolt 0000:03:00.0: PME# disabled
xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
...
xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100 ms but do not take
the DLLLA requirement into account. We then wait 10 ms for the D3hot -> D0
transition of the root port and the two downstream hotplug ports. This
means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can
see the following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
...
pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0)
pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001)
pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702)
pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001)
pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00)
pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400)
pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151)
pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00)
pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161)
pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402)
pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802)
pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302)
pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
...
thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this
were S3 instead of s2idle, then according to PCI FW spec 3.2 section
4.6.8 there is a specific _DSM that allows the OS to skip the delays,
but this platform does not provide the _DSM and does not go to S3 anyway,
so no firmware is involved that could already handle these delays.
On this particular Intel Coffee Lake platform these delays are not
actually needed because there is an additional delay as part of the ACPI
power resource that is used to turn on power to the hierarchy. However,
since that additional delay is not required by any of the standards
(PCIe, ACPI), it is not present on, for example, Intel Ice Lake, where
missing the mandatory delays causes pciehp to start tearing down the
stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they
perform the mandatory delays before the downstream component gets
resumed. We perform the delays before port services are resumed because
otherwise pciehp might find that the link is not up (even if it is just
training) and tear down the hierarchy.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-12 18:57:38 +08:00
#include <linux/delay.h>
2005-04-17 06:20:36 +08:00
#include <linux/errno.h>
#include <linux/pm.h>
2016-06-02 16:17:15 +08:00
#include <linux/pm_runtime.h>
2005-10-31 07:03:48 +08:00
#include <linux/string.h>
#include <linux/slab.h>
PCI: PCIe: Ask BIOS for control of all native services at once
After commit 852972acff8f10f3a15679be2059bb94916cba5d (ACPI: Disable
ASPM if the platform won't provide _OSC control for PCIe) control of
the PCIe Capability Structure is unconditionally requested by
acpi_pci_root_add(), which in principle may cause problems to
happen in two ways. First, the BIOS may refuse to give control of
the PCIe Capability Structure if it is not asked for any of the
_OSC features depending on it at the same time. Second, the BIOS may
assume that control of the _OSC features depending on the PCIe
Capability Structure will be requested in the future and may behave
incorrectly if that doesn't happen. For this reason, control of
the PCIe Capability Structure should always be requested along with
control of any other _OSC features that may depend on it (i.e. PCIe
native PME, PCIe native hot-plug, PCIe AER).
Rework the PCIe port driver so that (1) it checks which native PCIe
port services can be enabled, according to the BIOS, and (2) it
requests control of all these services simultaneously. In
particular, this causes pcie_portdrv_probe() to fail if the BIOS
refuses to grant control of the PCIe Capability Structure, which
means that no native PCIe port services can be enabled for the PCIe
Root Complex the given port belongs to. If that happens, ASPM is
disabled to avoid problems with mishandling it by the part of the
PCIe hierarchy for which control of the PCIe Capability Structure
has not been received.
Make it possible to override this behavior using 'pcie_ports=native'
(use the PCIe native services regardless of the BIOS response to the
control request), or 'pcie_ports=compat' (do not use the PCIe native
services at all).
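The override semantics can be summarized as a three-way decision. The sketch below is illustrative only; the enum and function names are hypothetical and do not correspond to the driver's actual symbols.

```c
#include <stdbool.h>

/* Hypothetical names mirroring the pcie_ports= command line values. */
enum pcie_ports_mode {
	PCIE_PORTS_AUTO,	/* default: follow the BIOS response */
	PCIE_PORTS_NATIVE,	/* pcie_ports=native */
	PCIE_PORTS_COMPAT,	/* pcie_ports=compat */
};

/*
 * Decide whether a native PCIe port service may be used, given the
 * command line mode and whether the BIOS granted control of it.
 */
static bool native_service_allowed(enum pcie_ports_mode mode,
				   bool bios_granted_control)
{
	switch (mode) {
	case PCIE_PORTS_NATIVE:
		return true;	/* ignore the BIOS response */
	case PCIE_PORTS_COMPAT:
		return false;	/* never use native services */
	default:
		return bios_granted_control;
	}
}
```

With 'pcie_ports=native' the service is used even if the BIOS refused control; with 'pcie_ports=compat' it is never used; otherwise the BIOS response decides.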
Accordingly, rework the existing PCIe port service drivers so that
they don't request control of the services directly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-08-22 04:02:38 +08:00
#include <linux/aer.h>
2005-04-17 06:20:36 +08:00

2009-01-13 21:38:34 +08:00
#include "../pci.h"
2005-04-17 06:20:36 +08:00
#include "portdrv.h"

2018-05-18 05:44:16 +08:00
struct portdrv_service_data {
	struct pcie_port_service_driver *drv;
2018-05-18 05:44:17 +08:00
	struct device *dev;
2018-05-18 05:44:16 +08:00
	u32 service;
};

2009-01-02 02:48:55 +08:00
/**
 * release_pcie_device - free PCI Express port service device structure
 * @dev: Port service device to release
 *
 * Invoked automatically when device is being removed in response to
 * device_unregister(dev). Release all resources being claimed.
2005-04-17 06:20:36 +08:00
 */
static void release_pcie_device(struct device *dev)
{
2009-12-15 10:38:04 +08:00
	kfree(to_pcie_device(dev));
2005-04-17 06:20:36 +08:00
}

2017-10-20 21:48:06 +08:00
/*
 * Fill in *pme, *aer, *dpc with the relevant Interrupt Message Numbers if
 * services are enabled in "mask". Return the number of MSI/MSI-X vectors
 * required to accommodate the largest Message Number.
2009-01-24 07:23:22 +08:00
 */
2017-10-20 21:48:06 +08:00
static int pcie_message_numbers(struct pci_dev *dev, int mask,
				u32 *pme, u32 *aer, u32 *dpc)
2009-01-24 07:23:22 +08:00
{
2018-03-23 05:20:55 +08:00
	u32 nvec = 0, pos;
2017-10-20 21:48:06 +08:00
	u16 reg16;
2009-01-24 07:23:22 +08:00

	/*
2017-10-20 21:48:06 +08:00
	 * The Interrupt Message Number indicates which vector is used, i.e.,
	 * the MSI-X table entry or the MSI offset between the base Message
	 * Data and the generated interrupt message. See PCIe r3.1, sec
	 * 7.8.2, 7.10.10, 7.31.2.
2009-01-24 07:23:22 +08:00
	 */
PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
The Interrupt Message Number in the PCIe Capabilities register (PCIe r4.0,
sec 7.5.3.2) indicates which MSI/MSI-X vector is shared by interrupts
related to the PCIe Capability, including Link Bandwidth Management and
Link Autonomous Bandwidth Interrupts (Link Control, 7.5.3.7), Command
Completed and Hot-Plug Interrupts (Slot Control, 7.5.3.10), and the PME
Interrupt (Root Control, 7.5.3.12).
pcie_message_numbers() checked whether we want to enable PME or Hot-Plug
interrupts but neglected to check for Link Bandwidth Management, so if we
only wanted the Bandwidth Management interrupts, it decided we didn't need
any vectors at all. Then pcie_port_enable_irq_vec() tried to reallocate
zero vectors, which failed, resulting in fallback to INTx.
On some systems, e.g., an X79-based workstation, that INTx seems broken or
not handled correctly, so we got spurious IRQ16 interrupts for Bandwidth
Management events.
Change pcie_message_numbers() so that if we want Link Bandwidth Management
interrupts, we use the shared MSI/MSI-X vector from the PCIe Capabilities
register.
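The effect of the change on the vector count can be shown with a user-space sketch that mirrors the field extraction in pcie_message_numbers(). The register masks are the values found in the kernel's pci_regs.h as far as known here; the SVC_* service bits and the `vectors_needed` helper are hypothetical stand-ins, and the sketch omits the check for the presence of the AER and DPC capabilities.

```c
#include <stdint.h>

/* Interrupt Message Number fields (values assumed from pci_regs.h). */
#define PCI_EXP_FLAGS_IRQ	0x3e00		/* PCIe Capabilities, bits 13:9 */
#define PCI_ERR_ROOT_AER_IRQ	0xf8000000u	/* Root Error Status, bits 31:27 */
#define PCI_EXP_DPC_IRQ		0x001f		/* DPC Capability, bits 4:0 */

/* Hypothetical service bits; the real PCIE_PORT_SERVICE_* values may differ. */
#define SVC_PME		(1 << 0)
#define SVC_AER		(1 << 1)
#define SVC_HP		(1 << 2)
#define SVC_DPC		(1 << 3)
#define SVC_BWNOTIF	(1 << 4)

/* Number of vectors needed to cover the largest Message Number in use. */
static uint32_t vectors_needed(int mask, uint16_t pcie_flags,
			       uint32_t aer_root_status, uint16_t dpc_cap)
{
	uint32_t nvec = 0, msg;

	/* PME, hotplug and bandwidth notification share one vector. */
	if (mask & (SVC_PME | SVC_HP | SVC_BWNOTIF)) {
		msg = (pcie_flags & PCI_EXP_FLAGS_IRQ) >> 9;
		nvec = msg + 1;
	}

	if (mask & SVC_AER) {
		msg = (aer_root_status & PCI_ERR_ROOT_AER_IRQ) >> 27;
		if (msg + 1 > nvec)
			nvec = msg + 1;
	}

	if (mask & SVC_DPC) {
		msg = dpc_cap & PCI_EXP_DPC_IRQ;
		if (msg + 1 > nvec)
			nvec = msg + 1;
	}

	return nvec;
}
```

With only SVC_BWNOTIF set and a Capabilities register whose Interrupt Message Number field is 3, the sketch returns 4. Without the bandwidth notification bit in the first test it would return 0, which is the failure described above.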
Fixes: e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth notification")
Link: https://lore.kernel.org/lkml/155597243666.19387.1205950870601742062.stgit@gimli.home
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-04-23 06:43:30 +08:00
	if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP |
		    PCIE_PORT_SERVICE_BWNOTIF)) {
2012-12-06 04:51:18 +08:00
		pcie_capability_read_word(dev, PCI_EXP_FLAGS, &reg16);
2017-10-20 21:48:06 +08:00
		*pme = (reg16 & PCI_EXP_FLAGS_IRQ) >> 9;
		nvec = *pme + 1;
2009-01-24 07:23:22 +08:00
	}

2018-03-23 05:20:55 +08:00
#ifdef CONFIG_PCIEAER
2009-01-24 07:23:22 +08:00
	if (mask & PCIE_PORT_SERVICE_AER) {
2018-03-23 05:20:55 +08:00
		u32 reg32;

		pos = dev->aer_cap;
2017-10-20 21:48:06 +08:00
		if (pos) {
			pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS,
					      &reg32);
			*aer = (reg32 & PCI_ERR_ROOT_AER_IRQ) >> 27;
			nvec = max(nvec, *aer + 1);
		}
2009-01-24 07:23:22 +08:00
	}
2018-03-23 05:20:55 +08:00
#endif
2009-01-24 07:23:22 +08:00

2017-05-23 22:23:59 +08:00
	if (mask & PCIE_PORT_SERVICE_DPC) {
		pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC);
2017-10-20 21:48:06 +08:00
		if (pos) {
			pci_read_config_word(dev, pos + PCI_EXP_DPC_CAP,
					     &reg16);
			*dpc = reg16 & PCI_EXP_DPC_IRQ;
			nvec = max(nvec, *dpc + 1);
		}
	}

	return nvec;
}

2009-01-24 07:23:22 +08:00
/**
2017-05-23 22:23:58 +08:00
 * pcie_port_enable_irq_vec - try to set up MSI-X or MSI as interrupt mode
 * for given port
2009-01-24 07:23:22 +08:00
 * @dev: PCI Express port to handle
2017-02-01 21:41:43 +08:00
 * @irqs: Array of interrupt vectors to populate
2009-01-24 07:23:22 +08:00
 * @mask: Bitmask of port capabilities returned by get_port_device_capability()
 *
 * Return value: 0 on success, error code on failure
 */
2017-05-23 22:23:58 +08:00
static int pcie_port_enable_irq_vec(struct pci_dev *dev, int *irqs, int mask)
2009-01-24 07:23:22 +08:00
{
2019-02-28 04:58:17 +08:00
	int nr_entries, nvec, pcie_irq;
2017-10-20 21:48:06 +08:00
	u32 pme = 0, aer = 0, dpc = 0;
2017-05-23 22:23:59 +08:00

2017-10-20 05:09:26 +08:00
	/* Allocate the maximum possible number of MSI/MSI-X vectors */
2017-05-23 22:23:58 +08:00
	nr_entries = pci_alloc_irq_vectors(dev, 1, PCIE_PORT_MAX_MSI_ENTRIES,
			PCI_IRQ_MSIX | PCI_IRQ_MSI);
2017-02-01 21:41:43 +08:00
	if (nr_entries < 0)
		return nr_entries;
2017-05-23 22:23:59 +08:00

2017-10-20 21:48:06 +08:00
	/* See how many and which Interrupt Message Numbers we actually use */
	nvec = pcie_message_numbers(dev, mask, &pme, &aer, &dpc);
	if (nvec > nr_entries) {
		pci_free_irq_vectors(dev);
		return -EIO;
2017-05-23 22:23:59 +08:00
	}

2009-01-24 07:23:22 +08:00
	/*
PCI/portdrv: Compute MSI/MSI-X IRQ vectors after final allocation
When setting up portdrv MSI/MSI-X interrupts, we previously allocated the
maximum possible number of vectors, read the Interrupt Message Numbers for
each service, saved the IRQ for each, freed the vectors, and finally used
the largest Message Number to reallocate only as many vectors as we need.
The problem is that freeing the vectors invalidates their IRQs, so the
saved IRQ numbers may now be invalid, which can result in errors like
this:
pcie_pme: probe of 0000:00:00.0:pcie001 failed with error -22
pciehp 0000:00:00.0:pcie004: Cannot get irq 20 for the hotplug controller
aer: probe of 0000:00:00.0:pcie002 failed with error -22
dpc 0000:00:00.0:pcie010: request IRQ22 failed: -22
Change the setup so we save the Interrupt Message Numbers (not the IRQs)
before we free the original setup, then use the Message Numbers to compute
the IRQs (via pci_irq_vector()) *after* we reallocate the vectors.
This should always be safe for MSI-X because the Message Numbers are fixed.
For MSI, the hardware is allowed to change Message Numbers when we update
the MSI Multiple Message Enable field when reallocating the vectors, but
since we allocate enough vectors to accommodate the largest Message Number
we found, that's unlikely. See PCIe r3.1, sec 7.8.2, 7.10.10, 7.31.2.
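Why the saved IRQ numbers go stale can be illustrated with a toy model. Everything here is an assumption made for the illustration: the `vec_alloc` structure, the idea that an allocation hands out a contiguous IRQ range, and `irq_for_message` as a stand-in for pci_irq_vector() (which returns a negative errno rather than -1).

```c
/* Toy model of one MSI/MSI-X allocation: IRQs are base, base + 1, ... */
struct vec_alloc {
	int base_irq;	/* first Linux IRQ number of this allocation */
	int nvec;	/* number of vectors in this allocation */
};

/*
 * Hypothetical stand-in for pci_irq_vector(): map a Message Number to
 * the IRQ of the *current* allocation, or -1 if it is out of range.
 */
static int irq_for_message(const struct vec_alloc *alloc, int msg)
{
	if (msg < 0 || msg >= alloc->nvec)
		return -1;

	return alloc->base_irq + msg;
}
```

If the first allocation is { .base_irq = 40, .nvec = 32 } and the reallocation yields { .base_irq = 20, .nvec = 3 }, Message Number 2 maps to IRQ 42 before and IRQ 22 after. Saving 42 across the reallocation would hand the service driver an IRQ it no longer owns, whereas saving the Message Number and mapping it afterwards gives the right answer.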
Fixes: 3674cc49da9a ("PCI/portdrv: Use pci_irq_alloc_vectors()")
Based-on-patch-by: Dongdong Liu <liudongdong3@huawei.com>
Tested-by: Dongdong Liu <liudongdong3@huawei.com> # HiSilicon hip08
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-10-20 21:57:16 +08:00
	 * If we allocated more than we need, free them and reallocate fewer.
	 *
	 * Reallocating may change the specific vectors we get, so
	 * pci_irq_vector() must be done *after* the reallocation.
	 *
	 * If we're using MSI, hardware is *allowed* to change the Interrupt
	 * Message Numbers when we free and reallocate the vectors, but we
	 * assume it won't because we allocate enough vectors for the
	 * biggest Message Number we found.
2009-01-24 07:23:22 +08:00
	 */
2017-02-01 21:41:43 +08:00
	if (nvec != nr_entries) {
		pci_free_irq_vectors(dev);
2009-01-24 07:23:22 +08:00

2017-02-01 21:41:43 +08:00
		nr_entries = pci_alloc_irq_vectors(dev, nvec, nvec,
2017-05-23 22:23:58 +08:00
				PCI_IRQ_MSIX | PCI_IRQ_MSI);
2017-02-01 21:41:43 +08:00
		if (nr_entries < 0)
			return nr_entries;
2009-01-24 07:23:22 +08:00
	}

2019-02-28 04:58:17 +08:00
	/* PME, hotplug and bandwidth notification share an MSI/MSI-X vector */
	if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP |
		    PCIE_PORT_SERVICE_BWNOTIF)) {
		pcie_irq = pci_irq_vector(dev, pme);
		irqs[PCIE_PORT_SERVICE_PME_SHIFT] = pcie_irq;
		irqs[PCIE_PORT_SERVICE_HP_SHIFT] = pcie_irq;
		irqs[PCIE_PORT_SERVICE_BWNOTIF_SHIFT] = pcie_irq;
2009-01-24 07:23:22 +08:00
	}

2017-10-20 21:48:06 +08:00
	if (mask & PCIE_PORT_SERVICE_AER)
		irqs[PCIE_PORT_SERVICE_AER_SHIFT] = pci_irq_vector(dev, aer);
2009-01-24 07:23:22 +08:00

2017-10-20 21:48:06 +08:00
	if (mask & PCIE_PORT_SERVICE_DPC)
		irqs[PCIE_PORT_SERVICE_DPC_SHIFT] = pci_irq_vector(dev, dpc);
2017-05-23 22:23:59 +08:00

2017-02-01 21:41:43 +08:00
	return 0;
2009-01-24 07:23:22 +08:00
}

2009-01-02 02:48:55 +08:00
/**
2017-02-01 21:41:43 +08:00
 * pcie_init_service_irqs - initialize irqs for PCI Express port services
2009-01-02 02:48:55 +08:00
 * @dev: PCI Express port to handle
2009-11-25 20:04:00 +08:00
 * @irqs: Array of irqs to populate
2009-01-02 02:48:55 +08:00
 * @mask: Bitmask of port capabilities returned by get_port_device_capability()
 *
 * Return value: Interrupt mode associated with the port
 */
2017-02-01 21:41:43 +08:00
static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
2005-04-17 06:20:36 +08:00
{
2017-02-01 21:41:43 +08:00
	int ret, i;

	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
		irqs[i] = -1;
2010-02-18 06:40:07 +08:00

2012-07-18 14:06:54 +08:00
	/*
2018-03-10 01:21:27 +08:00
	 * If we support PME but can't use MSI/MSI-X for it, we have to
	 * fall back to INTx or other interrupts, e.g., a system shared
	 * interrupt.
2012-07-18 14:06:54 +08:00
	 */
2017-05-23 22:23:58 +08:00
	if ((mask & PCIE_PORT_SERVICE_PME) && pcie_pme_no_msi())
		goto legacy_irq;

	/* Try to use MSI-X or MSI if supported */
	if (pcie_port_enable_irq_vec(dev, irqs, mask) == 0)
		return 0;
2009-01-13 21:39:39 +08:00

2017-05-23 22:23:58 +08:00
legacy_irq:
	/* fall back to legacy IRQ */
	ret = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY);
2017-02-01 21:41:43 +08:00
	if (ret < 0)
		return -ENODEV;
2009-01-24 07:23:22 +08:00

2018-03-10 01:21:24 +08:00
	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
		irqs[i] = pci_irq_vector(dev, 0);
2005-04-17 06:20:36 +08:00

2009-11-25 20:04:00 +08:00
	return 0;
2005-04-17 06:20:36 +08:00
}

2009-01-02 02:48:55 +08:00
/**
 * get_port_device_capability - discover capabilities of a PCI Express port
 * @dev: PCI Express port to examine
 *
 * The capabilities are read from the port's PCI Express configuration registers
 * as described in PCI Express Base Specification 1.0a sections 7.8.2, 7.8.9 and
 * 7.9 - 7.11.
 *
 * Return value: Bitmask of discovered port capabilities
 */
2005-04-17 06:20:36 +08:00
static int get_port_device_capability(struct pci_dev *dev)
{
2018-03-10 01:21:25 +08:00
	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
2012-07-24 17:20:08 +08:00
	int services = 0;
2010-08-22 04:02:38 +08:00

	if (dev->is_hotplug_bridge &&
	    (pcie_ports_native || host->native_pcie_hotplug)) {
		services |= PCIE_PORT_SERVICE_HP;

		/*
		 * Disable hot-plug interrupts in case they have been enabled
		 * by the BIOS and the hot-plug service driver is not loaded.
		 */
		pcie_capability_clear_word(dev, PCI_EXP_SLTCTL,
			  PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE);
	}

#ifdef CONFIG_PCIEAER
	if (dev->aer_cap && pci_aer_available() &&
	    (pcie_ports_native || host->native_aer)) {
		services |= PCIE_PORT_SERVICE_AER;

		/*
		 * Disable AER on this port in case it's been enabled by the
		 * BIOS (the AER service driver will enable it when necessary).
		 */
		pci_disable_pcie_error_reporting(dev);
	}
#endif

	/*
	 * Root ports are capable of generating PME too.  Root Complex
	 * Event Collectors can also generate PMEs, but we don't handle
	 * those yet.
	 */
	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT &&
	    (pcie_ports_native || host->native_pme)) {
		services |= PCIE_PORT_SERVICE_PME;

		/*
		 * Disable PME interrupt on this port in case it's been enabled
		 * by the BIOS (the PME service driver will enable it when
		 * necessary).
		 */
		pcie_pme_interrupt_enable(dev, false);
	}

	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) &&
	    pci_aer_available() && services & PCIE_PORT_SERVICE_AER)
		services |= PCIE_PORT_SERVICE_DPC;

	if (pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||
	    pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
		services |= PCIE_PORT_SERVICE_BWNOTIF;

	return services;
}

/**
 * pcie_device_init - allocate and initialize PCI Express port service device
 * @pdev: PCI Express port to associate the service device with
 * @service: Type of service to associate with the service device
 * @irq: Interrupt vector to associate with the service device
 */
static int pcie_device_init(struct pci_dev *pdev, int service, int irq)
{
	int retval;
	struct pcie_device *pcie;
	struct device *device;

	pcie = kzalloc(sizeof(*pcie), GFP_KERNEL);
	if (!pcie)
		return -ENOMEM;
	pcie->port = pdev;
	pcie->irq = irq;
	pcie->service = service;

	/* Initialize generic device interface */
	device = &pcie->device;
	device->bus = &pcie_port_bus_type;
	device->release = release_pcie_device;	/* callback to free pcie dev */
	dev_set_name(device, "%s:pcie%03x",
		     pci_name(pdev),
		     get_descriptor_id(pci_pcie_type(pdev), service));
	device->parent = &pdev->dev;
	device_enable_async_suspend(device);

	retval = device_register(device);
	if (retval) {
		put_device(device);
		return retval;
	}

	pm_runtime_no_callbacks(device);

	return 0;
}

/**
 * pcie_port_device_register - register PCI Express port
 * @dev: PCI Express port to register
 *
 * Allocate the port extension structure and register services associated with
 * the port.
 */
int pcie_port_device_register(struct pci_dev *dev)
{
	int status, capabilities, i, nr_service;
	int irqs[PCIE_PORT_DEVICE_MAXSERVICES];

	/* Enable PCI Express port device */
	status = pci_enable_device(dev);
	if (status)
		return status;

	/* Get and check PCI Express port services */
	capabilities = get_port_device_capability(dev);
	if (!capabilities)
		return 0;

	pci_set_master(dev);
	/*
	 * Initialize service irqs. Don't use service devices that
	 * require interrupts if there is no way to generate them.
	 * However, some drivers may have a polling mode (e.g. pciehp_poll_mode)
	 * that can be used in the absence of irqs.  Allow them to determine
	 * if that is to be used.
	 */
	status = pcie_init_service_irqs(dev, irqs, capabilities);
	if (status) {
		capabilities &= PCIE_PORT_SERVICE_HP;
		if (!capabilities)
			goto error_disable;
	}

	/* Allocate child services if any */
	status = -ENODEV;
	nr_service = 0;
	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
		int service = 1 << i;
		if (!(capabilities & service))
			continue;
		if (!pcie_device_init(dev, service, irqs[i]))
			nr_service++;
	}
	if (!nr_service)
		goto error_cleanup_irqs;

	return 0;

error_cleanup_irqs:
	pci_free_irq_vectors(dev);
error_disable:
	pci_disable_device(dev);
	return status;
}
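
The service-allocation loop in pcie_port_device_register() walks a capability bitmask one bit at a time. A minimal userspace sketch of that pattern follows. The `DEMO_*` constants and `demo_count_services()` are hypothetical names chosen for illustration; they are not the kernel's definitions, and the bit values are assumptions.

```c
#include <assert.h>

/* Hypothetical service bits; the kernel's PCIE_PORT_SERVICE_* values may differ. */
#define DEMO_SERVICE_PME	(1 << 0)
#define DEMO_SERVICE_AER	(1 << 1)
#define DEMO_SERVICE_HP		(1 << 2)
#define DEMO_MAXSERVICES	5

/* Count the services whose bit is set in `capabilities`. */
static int demo_count_services(int capabilities)
{
	int i, nr_service = 0;

	assert(capabilities >= 0);
	for (i = 0; i < DEMO_MAXSERVICES; i++) {
		int service = 1 << i;

		if (!(capabilities & service))
			continue;	/* bit i is clear: nothing to set up */
		nr_service++;
	}
	return nr_service;
}
```

Bits at or above `DEMO_MAXSERVICES` are never visited, so they do not contribute to the count.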

#ifdef CONFIG_PM
typedef int (*pcie_pm_callback_t)(struct pcie_device *);

static int pm_iter(struct device *dev, void *data)
{
	struct pcie_port_service_driver *service_driver;
	size_t offset = *(size_t *)data;
	pcie_pm_callback_t cb;

	if ((dev->bus == &pcie_port_bus_type) && dev->driver) {
		service_driver = to_service_driver(dev->driver);
		cb = *(pcie_pm_callback_t *)((void *)service_driver + offset);
		if (cb)
			return cb(to_pcie_device(dev));
	}
	return 0;
}
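
pm_iter() selects which power-management callback to run from a byte offset into the service driver structure, so one iterator can serve suspend, resume and the runtime variants. A minimal userspace sketch of that offsetof-based dispatch is shown here. All `demo_*` names are hypothetical stand-ins, not the kernel types, and the sketch uses `char *` arithmetic where the kernel relies on the GNU `void *` arithmetic extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct demo_device {
	int id;
};

typedef int (*demo_pm_callback_t)(struct demo_device *);

struct demo_service_driver {
	const char *name;
	demo_pm_callback_t suspend;
	demo_pm_callback_t resume;
};

/* Example callback: returns a value derived from the device. */
static int demo_suspend(struct demo_device *d)
{
	return d->id + 1;
}

/* Fetch the callback stored at byte offset `offset` and call it if set. */
static int demo_pm_dispatch(struct demo_service_driver *drv,
			    struct demo_device *dev, size_t offset)
{
	demo_pm_callback_t cb;

	/* The offset must name a callback slot that lies inside the struct. */
	assert(offset + sizeof(cb) <= sizeof(*drv));
	cb = *(demo_pm_callback_t *)((char *)drv + offset);
	return cb ? cb(dev) : 0;
}
```

A caller would pass `offsetof(struct demo_service_driver, suspend)` to run the suspend hook. A driver that leaves a hook unset is skipped and the dispatch returns 0.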
PCI: Add missing link delays required by the PCIe spec
Currently Linux does not follow PCIe spec regarding the required delays
after reset. A concrete example is a Thunderbolt add-in-card that
consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
+-01.0-[04-36]-- DS hotplug port
+-02.0-[37]----00.0 xHCI controller
\-04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe
gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the
PCIe switch is put to reset and its power is re-applied. This means that
we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0
GT/s, software must wait a minimum of 100 ms after Link training
completes before sending a Configuration Request to the device
immediately below that Port. Software can determine when Link training
completes by polling the Data Link Layer Link Active bit or by setting
up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA
stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0
pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the
actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
pcieport 0000:00:1b.0: waking up bus
pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
...
pcieport 0000:00:1b.0: PME# disabled
pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:01:00.0: PME# disabled
pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:00.0: PME# disabled
pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:01.0: PME# disabled
pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:02.0: PME# disabled
pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:04.0: PME# disabled
pcieport 0000:02:01.0: PME# enabled
pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
pcieport 0000:02:04.0: PME# enabled
pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
...
thunderbolt 0000:03:00.0: PME# disabled
xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
...
xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but not taking
into account the DLLLA requirement. We then wait 10ms for D3hot -> D0
transition of the root port and the two downstream hotplug ports. This
means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions we can
see following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
...
pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0)
pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001)
pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702)
pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001)
pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00)
pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400)
pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151)
pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00)
pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161)
pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402)
pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802)
pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302)
pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
...
thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this
would be S3 instead of s2idle then according to PCI FW spec 3.2 section
4.6.8. there is a specific _DSM that allows the OS to skip the delays
but this platform does not provide the _DSM and does not go to S3 anyway
so no firmware is involved that could already handle these delays.
In this particular Intel Coffee Lake platform these delays are not
actually needed because there is an additional delay as part of the ACPI
power resource that is used to turn on power to the hierarchy but since
that additional delay is not required by any of standards (PCIe, ACPI)
it is not present in the Intel Ice Lake, for example where missing the
mandatory delays causes pciehp to start tearing down the stack too early
(links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they
perform the mandatory delays before the downstream component gets
resumed. We perform the delays before port services are resumed because
otherwise pciehp might find that the link is not up (even if it is just
training) and tears-down the hierarchy.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-12 18:57:38 +08:00
static int get_downstream_delay(struct pci_bus *bus)
{
	struct pci_dev *pdev;
	int min_delay = 100;
	int max_delay = 0;

	list_for_each_entry(pdev, &bus->devices, bus_list) {
		if (!pdev->imm_ready)
			min_delay = 0;
		else if (pdev->d3cold_delay < min_delay)
			min_delay = pdev->d3cold_delay;
		if (pdev->d3cold_delay > max_delay)
			max_delay = pdev->d3cold_delay;
	}

	return max(min_delay, max_delay);
}
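
To make the aggregation in get_downstream_delay() easier to follow, here is a userspace sketch that applies the same rules to a plain array instead of the bus device list. `struct demo_dev` and `demo_downstream_delay()` are hypothetical; they only mirror the two fields the listing reads and are not kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct pci_dev fields read above. */
struct demo_dev {
	int imm_ready;		/* mirrors pdev->imm_ready */
	int d3cold_delay;	/* mirrors pdev->d3cold_delay, in ms */
};

/* Apply the same min/max rules as the listing to an array of devices. */
static int demo_downstream_delay(const struct demo_dev *devs, size_t n)
{
	int min_delay = 100;
	int max_delay = 0;
	size_t i;

	assert(devs || n == 0);
	for (i = 0; i < n; i++) {
		if (!devs[i].imm_ready)
			min_delay = 0;
		else if (devs[i].d3cold_delay < min_delay)
			min_delay = devs[i].d3cold_delay;
		if (devs[i].d3cold_delay > max_delay)
			max_delay = devs[i].d3cold_delay;
	}
	return min_delay > max_delay ? min_delay : max_delay;
}
```

With at least one device present, the result under these rules is the largest per-device delay; the initial 100 ms only survives when the list is empty.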

/*
 * wait_for_downstream_link - Wait for downstream link to establish
 * @pdev: PCIe port whose downstream link is waited for
 *
 * Handle delays according to PCIe 4.0 section 6.6.1 before configuration
 * access to the downstream component is permitted.
 *
 * This blocks PCI core resume of the hierarchy below this port until the
 * link is trained. Should be called before resuming port services to
 * prevent pciehp from starting to tear down the hierarchy too soon.
 */
static void wait_for_downstream_link(struct pci_dev *pdev)
{
	int delay;

	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ROOT_PORT &&
	    pci_pcie_type(pdev) != PCI_EXP_TYPE_DOWNSTREAM)
		return;

	if (pci_dev_is_disconnected(pdev))
		return;

	if (!pdev->subordinate || list_empty(&pdev->subordinate->devices) ||
	    !pdev->bridge_d3)
		return;

	delay = get_downstream_delay(pdev->subordinate);
	if (!delay)
		return;

	dev_dbg(&pdev->dev, "waiting downstream link for %d ms\n", delay);

	/*
	 * If the downstream port does not support speeds greater than
	 * 5 GT/s, we need to wait 100 ms. For higher speeds (gen3) we
	 * need to wait first for the data link layer to become active.
	 */
	if (pcie_get_speed_cap(pdev) <= PCIE_SPEED_5_0GT)
		msleep(delay);
	else
		pcie_wait_for_link_delay(pdev, true, delay);
}

/**
 * pcie_port_device_suspend - suspend port services associated with a PCIe port
 * @dev: PCI Express port to handle
 */
int pcie_port_device_suspend(struct device *dev)
{
	size_t off = offsetof(struct pcie_port_service_driver, suspend);
	return device_for_each_child(dev, &off, pm_iter);
}

int pcie_port_device_resume_noirq(struct device *dev)
{
	size_t off = offsetof(struct pcie_port_service_driver, resume_noirq);

	wait_for_downstream_link(to_pci_dev(dev));
	return device_for_each_child(dev, &off, pm_iter);
}

/**
 * pcie_port_device_resume - resume port services associated with a PCIe port
 * @dev: PCI Express port to handle
 */
int pcie_port_device_resume(struct device *dev)
{
	size_t off = offsetof(struct pcie_port_service_driver, resume);
	return device_for_each_child(dev, &off, pm_iter);
}

/**
 * pcie_port_device_runtime_suspend - runtime suspend port services
 * @dev: PCI Express port to handle
 */
int pcie_port_device_runtime_suspend(struct device *dev)
{
	size_t off = offsetof(struct pcie_port_service_driver, runtime_suspend);
	return device_for_each_child(dev, &off, pm_iter);
}

/**
 * pcie_port_device_runtime_resume - runtime resume port services
 * @dev: PCI Express port to handle
 */
int pcie_port_device_runtime_resume(struct device *dev)
{
	size_t off = offsetof(struct pcie_port_service_driver, runtime_resume);
PCI: Add missing link delays required by the PCIe spec
Currently Linux does not follow PCIe spec regarding the required delays
after reset. A concrete example is a Thunderbolt add-in-card that
consists of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
+-01.0-[04-36]-- DS hotplug port
+-02.0-[37]----00.0 xHCI controller
\-04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe
gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the
PCIe switch is put to reset and its power is re-applied. This means that
we must follow the rules in PCIe 4.0 section 6.6.1.
For the PCIe gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0
GT/s, software must wait a minimum of 100 ms after Link training
completes before sending a Configuration Request to the device
immediately below that Port. Software can determine when Link training
completes by polling the Data Link Layer Link Active bit or by setting
up an associated interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA
stands for Data Link Layer Link Active):
pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0
pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with additional logging so we can see the
actual delays the kernel performs:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
pcieport 0000:00:1b.0: waking up bus
pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
...
pcieport 0000:00:1b.0: PME# disabled
pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:01:00.0: PME# disabled
pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:00.0: PME# disabled
pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:01.0: PME# disabled
pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:02.0: PME# disabled
pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:04.0: PME# disabled
pcieport 0000:02:01.0: PME# enabled
pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
pcieport 0000:02:04.0: PME# enabled
pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
...
thunderbolt 0000:03:00.0: PME# disabled
xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
...
xhci_hcd 0000:37:00.0: PME# disabled
For the switch upstream port (01:00.0) we wait for 100ms but do not take
the DLLLA requirement into account. We then wait 10ms for the D3hot -> D0
transition of the root port and the two downstream hotplug ports. This
means that we deviate from what the spec requires.
Performing the same check for system sleep (s2idle) transitions, we can
see the following when resuming from s2idle:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
...
pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
...
pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0)
pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001)
pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702)
pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001)
pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00)
pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400)
pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151)
pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00)
pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161)
pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402)
pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802)
pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302)
pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
...
thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
This is even worse. None of the mandatory delays are performed. If this
were S3 instead of s2idle then, according to PCI FW spec 3.2 section
4.6.8, there is a specific _DSM that allows the OS to skip the delays.
However, this platform does not provide the _DSM and does not go to S3
anyway, so no firmware is involved that could already handle these
delays.
On this particular Intel Coffee Lake platform these delays are not
actually needed, because the ACPI power resource that is used to turn on
power to the hierarchy already includes an additional delay. However,
since that additional delay is not required by any of the standards
(PCIe, ACPI), it is not present on, for example, Intel Ice Lake, where
missing the mandatory delays causes pciehp to start tearing down the
stack too early (links are not yet trained).
For this reason, change the PCIe portdrv PM resume hooks so that they
perform the mandatory delays before the downstream component gets
resumed. We perform the delays before port services are resumed because
otherwise pciehp might find that the link is not up (even if it is just
training) and tear down the hierarchy.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-12 18:57:38 +08:00
	wait_for_downstream_link(to_pci_dev(dev));
	return device_for_each_child(dev, &off, pm_iter);
}
#endif /* PM */

static int remove_iter(struct device *dev, void *data)
{
	if (dev->bus == &pcie_port_bus_type)
		device_unregister(dev);
	return 0;
}

static int find_service_iter(struct device *device, void *data)
{
	struct pcie_port_service_driver *service_driver;
	struct portdrv_service_data *pdrvs;
	u32 service;

	pdrvs = (struct portdrv_service_data *) data;
	service = pdrvs->service;

	if (device->bus == &pcie_port_bus_type && device->driver) {
		service_driver = to_service_driver(device->driver);
		if (service_driver->service == service) {
			pdrvs->drv = service_driver;
			pdrvs->dev = device;
			return 1;
		}
	}

	return 0;
}

/**
 * pcie_port_find_service - find the service driver
 * @dev: PCI Express port the service is associated with
 * @service: Service to find
 *
 * Find PCI Express port service driver associated with given service
 */
struct pcie_port_service_driver *pcie_port_find_service(struct pci_dev *dev,
							u32 service)
{
	struct pcie_port_service_driver *drv;
	struct portdrv_service_data pdrvs;

	pdrvs.drv = NULL;
	pdrvs.service = service;
	device_for_each_child(&dev->dev, &pdrvs, find_service_iter);

	drv = pdrvs.drv;
	return drv;
}

/**
 * pcie_port_find_device - find the struct device
 * @dev: PCI Express port the service is associated with
 * @service: For the service to find
 *
 * Find the struct device associated with given service on a pci_dev
 */
struct device *pcie_port_find_device(struct pci_dev *dev,
				     u32 service)
{
	struct device *device;
	struct portdrv_service_data pdrvs;

	pdrvs.dev = NULL;
	pdrvs.service = service;
	device_for_each_child(&dev->dev, &pdrvs, find_service_iter);

	device = pdrvs.dev;
	return device;
}
EXPORT_SYMBOL_GPL(pcie_port_find_device);

/**
 * pcie_port_device_remove - unregister PCI Express port service devices
 * @dev: PCI Express port the service devices to unregister are associated with
 *
 * Remove PCI Express port service devices associated with given port and
 * disable MSI-X or MSI for the port.
 */
void pcie_port_device_remove(struct pci_dev *dev)
{
	device_for_each_child(&dev->dev, NULL, remove_iter);
	pci_free_irq_vectors(dev);
	pci_disable_device(dev);
}

/**
 * pcie_port_probe_service - probe driver for given PCI Express port service
 * @dev: PCI Express port service device to probe against
 *
 * If PCI Express port service driver is registered with
 * pcie_port_service_register(), this function will be called by the driver core
 * whenever match is found between the driver and a port service device.
 */
static int pcie_port_probe_service(struct device *dev)
{
	struct pcie_device *pciedev;
	struct pcie_port_service_driver *driver;
	int status;

	if (!dev || !dev->driver)
		return -ENODEV;

	driver = to_service_driver(dev->driver);
	if (!driver || !driver->probe)
		return -ENODEV;

	pciedev = to_pcie_device(dev);
	status = driver->probe(pciedev);
	if (status)
		return status;

	get_device(dev);
	return 0;
}

/**
 * pcie_port_remove_service - detach driver from given PCI Express port service
 * @dev: PCI Express port service device to handle
 *
 * If PCI Express port service driver is registered with
 * pcie_port_service_register(), this function will be called by the driver core
 * when device_unregister() is called for the port service device associated
 * with the driver.
 */
static int pcie_port_remove_service(struct device *dev)
{
	struct pcie_device *pciedev;
	struct pcie_port_service_driver *driver;

	if (!dev || !dev->driver)
		return 0;

	pciedev = to_pcie_device(dev);
	driver = to_service_driver(dev->driver);
	if (driver && driver->remove) {
		driver->remove(pciedev);
		put_device(dev);
	}
	return 0;
}

/**
 * pcie_port_shutdown_service - shut down given PCI Express port service
 * @dev: PCI Express port service device to handle
 *
 * If PCI Express port service driver is registered with
 * pcie_port_service_register(), this function will be called by the driver core
 * when device_shutdown() is called for the port service device associated
 * with the driver.
 */
static void pcie_port_shutdown_service(struct device *dev) {}

/**
 * pcie_port_service_register - register PCI Express port service driver
 * @new: PCI Express port service driver to register
 */
int pcie_port_service_register(struct pcie_port_service_driver *new)
{
	if (pcie_ports_disabled)
		return -ENODEV;

	new->driver.name = new->name;
	new->driver.bus = &pcie_port_bus_type;
	new->driver.probe = pcie_port_probe_service;
	new->driver.remove = pcie_port_remove_service;
	new->driver.shutdown = pcie_port_shutdown_service;

	return driver_register(&new->driver);
}
EXPORT_SYMBOL(pcie_port_service_register);

/**
 * pcie_port_service_unregister - unregister PCI Express port service driver
 * @drv: PCI Express port service driver to unregister
 */
void pcie_port_service_unregister(struct pcie_port_service_driver *drv)
{
	driver_unregister(&drv->driver);
}
EXPORT_SYMBOL(pcie_port_service_unregister);