* pci/resource:
PCI: Simplify pci_create_attr() control flow
PCI: Don't leak memory if sysfs_create_bin_file() fails
PCI: Simplify sysfs ROM cleanup
PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
ia64/PCI: Use ioremap() instead of open-coded equivalent
ia64/PCI: Use temporary struct resource * to avoid repetition
PCI: Clean up pci_map_rom() whitespace
PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
PCI: Set ROM shadow location in arch code, not in PCI core
PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
PCI: Don't assign or reassign immutable resources
PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
PCI: Disable IO/MEM decoding for devices with non-compliant BARs
* pci/host-hv:
PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs
PCI: Look up IRQ domain by fwnode_handle
PCI: Add fwnode_handle to x86 pci_sysdata
* pci/aer:
PCI/AER: Log aer_inject error injections
PCI/AER: Log actual error causes in aer_inject
PCI/AER: Use dev_warn() in aer_inject
PCI/AER: Fix aer_inject error codes
* pci/enumeration:
PCI: Fix broken URL for Dell biosdevname
* pci/kconfig:
PCI: Cleanup pci/pcie/Kconfig whitespace
PCI: Include pci/hotplug Kconfig directly from pci/Kconfig
PCI: Include pci/pcie/Kconfig directly from pci/Kconfig
* pci/misc:
PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
PCI: Add QEMU top-level IDs for (sub)vendor & device
unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition
PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h
PCI: Move pci_dma_* helpers to common code
frv/PCI: Remove stray pci_{alloc,free}_consistent() declaration
* pci/virtualization:
PCI: Wait for up to 1000ms after FLR reset
PCI: Support SR-IOV on any function type
* pci/vpd:
PCI: Prevent VPD access for buggy devices
PCI: Sleep rather than busy-wait for VPD access completion
PCI: Fold struct pci_vpd_pci22 into struct pci_vpd
PCI: Rename VPD symbols to remove unnecessary "pci22"
PCI: Remove struct pci_vpd_ops.release function pointer
PCI: Move pci_vpd_release() from header file to pci/access.c
PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code
PCI: Determine actual VPD size on first access
PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy
PCI: Allow access to VPD attributes with size 0
PCI: Update VPD definitions
Add pci_ops.{add,remove}_bus() callbacks, which will be called on every
newly created bus and when a bus is being removed, respectively. This can
be used by drivers to implement driver-specific initialization and teardown
of the bus, in addition to the architecture-specifics implemented by the
pcibios_add_bus() and the pcibios_remove_bus() functions.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There's only one kind of VPD, so we don't need to qualify it as "the
version described by PCI spec rev 2.2."
Rename the following symbols to remove unnecessary "pci22":
PCI_VPD_PCI22_SIZE -> PCI_VPD_MAX_SIZE
pci_vpd_pci22_size() -> pci_vpd_size()
pci_vpd_pci22_wait() -> pci_vpd_wait()
pci_vpd_pci22_read() -> pci_vpd_read()
pci_vpd_pci22_write() -> pci_vpd_write()
pci_vpd_pci22_ops -> pci_vpd_ops
pci_vpd_pci22_init() -> pci_vpd_init()
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
The PCI config header (first 64 bytes of each device's config space) is
defined by the PCI spec so generic software can identify the device and
manage its usage of I/O, memory, and IRQ resources.
Some non-spec-compliant devices put registers other than BARs where the
BARs should be. When the PCI core sizes these "BARs", the reads and writes
it does may have unwanted side effects, and the "BAR" may appear to
describe non-sensical address space.
Add a flag bit to mark non-compliant devices so we don't touch their BARs.
Turn off IO/MEM decoding to prevent the devices from consuming address
space, since we can't read the BARs to find out what that address space
would be.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
CC: stable@vger.kernel.org
If pci_host_bridge_msi_domain() can't find an IRQ domain through the OF
tree, try to look it up directly through the fwnode_handle.
[bhelgaas: changelog]
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
include/asm-generic/pci-bridge.h is now empty, so remove every #include of
it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com> (arm64)
The PCI flag management constants and functions were previously declared in
include/asm-generic/pci-bridge.h. But they are not specific to bridges,
and arches did not include pci-bridge.h consistently.
Move the following interfaces and related constants to include/linux/pci.h
and remove pci-bridge.h:
pci_set_flags()
pci_add_flags()
pci_clear_flags()
pci_has_flag()
This fixes these warnings when building for some arches:
drivers/pci/host/pcie-designware.c:562:20: error: 'PCI_PROBE_ONLY' undeclared (first use in this function)
drivers/pci/host/pcie-designware.c:562:7: error: implicit declaration of function 'pci_has_flag' [-Werror=implicit-function-declaration]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch introduces pci_msi_register_fwnode_provider() for irqchip
to register a callback, to provide a way to determine appropriate MSI
domain for a pci device.
It also introduces pci_host_bridge_acpi_msi_domain(), which returns
the MSI domain of the specified PCI host bridge with DOMAIN_BUS_PCI_MSI
bus token. Then, it is assigned to pci device.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
* pci/aspm:
PCI/ASPM: Make sysfs link_state_store() consistent with link_state_show()
* pci/hotplug:
PCI: pciehp: Always protect pciehp_disable_slot() with hotplug mutex
* pci/misc:
x86/PCI: Simplify pci_bios_{read,write}
PCI: Simplify config space size computation
PCI: Limit config space size for Netronome NFP6000 family
PCI: Add Netronome vendor and device IDs
PCI: Support PCIe devices with short cfg_size
x86/PCI: Clarify AMD Fam10h config access restrictions comment
PCI: Print warnings for all invalid expansion ROM headers
PCI: Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask
* pci/msi:
PCI/MSI: Remove empty pci_msi_init_pci_dev()
PCI/MSI: Initialize MSI capability for all architectures
Restructure the logic so we return the config space size as soon as we know
it. This reduces indentation, removes negations, and removes gotos.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4a7cc83167 ("genirq/MSI: Move msi_list from struct pci_dev to struct
device") removed the contents of pci_msi_init_pci_dev(). All
implementation of it are now empty, so remove it completely.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't
support MSI") moved dev->msi_cap and dev->msix_cap initialization from the
pci_init_capabilities() path (used on all architectures) to the
pci_setup_device() path (not used on Open Firmware architectures).
This broke MSI or MSI-X on Open Firmware machines. 4d9aac397a
("powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case")
fixed it for PowerPC but not for SPARC.
Set up MSI and MSI-X (initialize msi_cap and msix_cap and disable MSI and
MSI-X) in pci_init_capabilities() so all architectures do it the same way.
This reverts 4d9aac397a since this patch fixes the problem generically
for both PowerPC and SPARC.
[bhelgaas: changelog, make pci_msi_setup_pci_dev() static]
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* acpi-smbus:
Revert "ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook"
ACPI / SMBus: Fix boot stalls / high CPU caused by reentrant code
* acpi-ec:
ACPI-EC: Drop unnecessary check made before calling acpi_ec_delete_query()
* acpi-pci:
PCI: Fix OF logic in pci_dma_configure()
This patch fixes a bug introduced by previous commit,
which incorrectly checkes the of_node of the end-point device.
Instead, it should check the of_node of the host bridge.
Fixes: 50230713b6 ("PCI: OF: Move of_pci_dma_configure() to pci_dma_configure()")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Support for the ACPI _CCA configuration object intended to tell
the OS whether or not a bus master device supports hardware
managed cache coherency and a new set of functions to allow
drivers to check the cache coherency support for devices in a
platform firmware interface agnostic way (Suravee Suthikulpanit,
Jeremy Linton).
- ACPI backlight quirks for ESPRIMO Mobile M9410 and Dell XPS L421X
(Aaron Lu, Hans de Goede).
- Fixes for the arm_big_little and s5pv210-cpufreq cpufreq drivers
(Jon Medhurst, Nicolas Pitre).
- kfree()-related fixup for the recently introduced CPPC cpufreq
frontend (Markus Elfring).
- intel_pstate fix reducing kernel log noise on systems where
P-states are managed by hardware (Prarit Bhargava).
- intel_pstate maintainers information update (Srinivas Pandruvada).
- cpufreq core optimization related to the handling of delayed work
items used by governors (Viresh Kumar).
- Locking fixes and cleanups of the Operating Performance Points
(OPP) framework (Viresh Kumar).
- Generic power domains framework cleanups (Lina Iyer).
- cpupower tool updates (Jacob Tanenbaum, Sriram Raghunathan,
Thomas Renninger).
- turbostat tool updates (Len Brown).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWQ96OAAoJEILEb/54YlRxyYYQALJ1HXu76SvYX1re2aawOw6Y
WgzF3Ly7JX034E1VvA2xP6wgkWpBRBDcpnRDeltNA4dYXPBDei/eTcRZTLX12N3g
AfFRGjGWTtLJfpNPecNMmUyF5xHjgDgMIQRabY+Is5NfP5STkPHJeqULnEpvTtx8
bd0lnC5jc4vuZiPEh1xVb+ClYDqWS8YQPyFJVjV/BaIf8Qwe5+oRX36byMBaKc9D
ZgmvmCk5n/HLQQ1uQsqe4xnhFLHN2rypt2BLvFrOtlnSz9VNNpQyB+OIW1mgCD4f
LhpKIwjP8NhZNQUq8HFu7nDlm8ciQtWmeMPB5NdGQ+OESu7yfKAOzQ+3U6Gl2Gaf
66zVGyV6SOJJwfDVJ3qKTtroWps9QV7ZClOJ+zJGgiujwU+tJ3pDQyZM9pa7CL3C
s7ZAUsI6IigSBjD3nJVOyG4DO0a8KQFCIE1mDmyqId45Qz8xJoOrYP33/ZnDuOdo
2OtL/emyfWsz9ixbHVfwIhb7EC6aoaUxQrhSWmNraaQS43YfioZR7h4we8gwenph
X4E1KY4SdML+uFf2VKIcd45NM3IBprCxx5UgFAJdrqe8+otqPNF2AVosG4iqhg/b
k4nxwuIvw2a8Fm77U9ytyXDYMItU/wIlAHMbnmgx+oTwRv6AbZ07MHkyfuQLYuhD
tq5Y14qSiTS7prNacx98
=XZiP
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management and ACPI updates from Rafael Wysocki:
"The only new feature in this batch is support for the ACPI _CCA device
configuration object, which it a pre-requisite for future ACPI PCI
support on ARM64, but should not affect the other architectures.
The rest is fixes and cleanups, mostly in cpufreq (including
intel_pstate), the Operating Performace Points (OPP) framework and
tools (cpupower and turbostat).
Specifics:
- Support for the ACPI _CCA configuration object intended to tell the
OS whether or not a bus master device supports hardware managed
cache coherency and a new set of functions to allow drivers to
check the cache coherency support for devices in a platform
firmware interface agnostic way (Suravee Suthikulpanit, Jeremy
Linton).
- ACPI backlight quirks for ESPRIMO Mobile M9410 and Dell XPS L421X
(Aaron Lu, Hans de Goede).
- Fixes for the arm_big_little and s5pv210-cpufreq cpufreq drivers
(Jon Medhurst, Nicolas Pitre).
- kfree()-related fixup for the recently introduced CPPC cpufreq
frontend (Markus Elfring).
- intel_pstate fix reducing kernel log noise on systems where
P-states are managed by hardware (Prarit Bhargava).
- intel_pstate maintainers information update (Srinivas Pandruvada).
- cpufreq core optimization related to the handling of delayed work
items used by governors (Viresh Kumar).
- Locking fixes and cleanups of the Operating Performance Points
(OPP) framework (Viresh Kumar).
- Generic power domains framework cleanups (Lina Iyer).
- cpupower tool updates (Jacob Tanenbaum, Sriram Raghunathan, Thomas
Renninger).
- turbostat tool updates (Len Brown)"
* tag 'pm+acpi-4.4-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
PCI: ACPI: Add support for PCI device DMA coherency
PCI: OF: Move of_pci_dma_configure() to pci_dma_configure()
of/pci: Fix pci_get_host_bridge_device leak
device property: ACPI: Remove unused DMA APIs
device property: ACPI: Make use of the new DMA Attribute APIs
device property: Adding DMA Attribute APIs for Generic Devices
ACPI: Adding DMA Attribute APIs for ACPI Device
device property: Introducing enum dev_dma_attr
ACPI: Honor ACPI _CCA attribute setting
cpufreq: CPPC: Delete an unnecessary check before the function call kfree()
PM / OPP: Add opp_rcu_lockdep_assert() to _find_device_opp()
PM / OPP: Hold dev_opp_list_lock for writers
PM / OPP: Protect updates to list_dev with mutex
PM / OPP: Propagate error properly from dev_pm_opp_set_sharing_cpus()
cpufreq: s5pv210-cpufreq: fix wrong do_div() usage
MAINTAINERS: update for intel P-state driver
Creating a common structure initialization pattern for struct option
cpupower: Enable disabled Cstates if they are below max latency
cpupower: Remove debug message when using cpupower idle-set -D switch
cpupower: cpupower monitor reports uninitialized values for offline cpus
...
This patch adds support for setting up PCI device DMA coherency from
ACPI _CCA object that should normally be specified in the DSDT node
of its PCI host bridge.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch move of_pci_dma_configure() to a more generic
pci_dma_configure(), which can be extended by non-OF code (e.g. ACPI).
This has no functional change.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Acked-by: Rob Herring <robh+dt@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/aer:
PCI/AER: Clear error status registers during enumeration and restore
* pci/hotplug:
PCI: pciehp: Queue power work requests in dedicated function
* pci/misc:
PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum
x86/PCI: Make pci_subsys_init() static
PCI: Add builtin_pci_driver() to avoid registration boilerplate
PCI: Remove unnecessary "if" statement
* pci/msi:
x86/PCI: Don't alloc pcibios-irq when MSI is enabled
PCI/MSI: Export all remapped MSIs to sysfs attributes
PCI: Disable MSI on SiS 761
* pci/resource:
sparc/PCI: Add mem64 resource parsing for root bus
PCI: Expand Enhanced Allocation BAR output
PCI: Make Enhanced Allocation bitmasks more obvious
PCI: Handle Enhanced Allocation capability for SR-IOV devices
PCI: Add support for Enhanced Allocation devices
PCI: Add Enhanced Allocation register entries
PCI: Handle IORESOURCE_PCI_FIXED when assigning resources
PCI: Handle IORESOURCE_PCI_FIXED when sizing resources
PCI: Clear IORESOURCE_UNSET when reverting to firmware-assigned address
* pci/virtualization:
PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
PCI: Wait 1 second between disabling VFs and clearing NumVFs
PCI: Reorder pcibios_sriov_disable()
PCI: Remove VFs in reverse order if virtfn_add() fails
PCI: Remove redundant validation of SR-IOV offset/stride registers
PCI: Set SR-IOV NumVFs to zero after enumeration
PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs
PCI: Don't try to restore VF BARs
Add support for devices using Enhanced Allocation entries instead of BARs.
This allows the kernel to parse the EA Extended Capability structure in PCI
config space and claim the BAR-equivalent resources.
See https://pcisig.com/sites/default/files/specification_documents/ECN_Enhanced_Allocation_23_Oct_2014_Final.pdf
[bhelgaas: add spec URL, s/pci_ea_set_flags/pci_ea_flags/, consolidate
declarations, print unknown property in hex to match spec]
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
[david.daney@cavium.com: Add more support/checking for Entry Properties,
allow EA behind bridges, rewrite some error messages.]
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
So far, we've always considered that for a given PCI device, its
MSI controller was either set by the architecture-specific
pcibios hook, or simply inherited from the host bridge.
This doesn't cover things like firmware-defined topologies like
msi-map (DT) or IORT (ACPI), which can provide information about
which MSI controller to use on a per-device basis.
This patch adds the necessary hook into the MSI code to allow this
feature, and provides the msi-map functionnality as a first
implementation.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
So far, we have considered that the MSI domain for a device was
either set via the architecture-dependent pcibios implementation
or inherited from the host bridge.
As we're about to break that assumption, add pci_dev_msi_domain
which is the equivalent of pci_host_bridge_msi_domain, but for
a single device.
Other than moving things around a bit, this patch on its own
has no effect.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
SR-IOV creates a virtual bus where bus->self is NULL. When we add VFs and
scan for an MSI domain, pci_set_bus_msi_domain() dereferences bus->self,
which causes a kernel NULL pointer dereference oops.
Scan up to the parent bus until we find a real bridge where we can get the
MSI domain.
[bhelgaas: changelog]
Fixes: 44aa0c657e ("PCI/MSI: Add hooks to populate the msi_domain field")
Tested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
AER errors might be recorded when powering-on devices. These errors can be
ignored, so firmware usually clears them before the OS enumerates devices.
However, firmware is not involved when devices are added via hotplug, so
the OS may discover power-up errors that should be ignored. The same may
happen when powering up devices when resuming after suspend.
Clear the AER error status registers during enumeration and resume.
[bhelgaas: changelog, remove repetitive comments]
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Revert dff22d2054 ("PCI: Call pci_read_bridge_bases() from core instead
of arch code").
Reading PCI bridge windows is not arch-specific in itself, but there is PCI
core code that doesn't work correctly if we read them too early. For
example, Hannes found this case on an ARM Freescale i.mx6 board:
pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]
The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
0x204100 of space, and mem windows are megabyte-aligned.
Bus sizing can increase a bridge window size, but never *decrease* it (see
d65245c329 ("PCI: don't shrink bridge resources")). Prior to
dff22d2054, ARM didn't read bridge windows at all, so the "original size"
was zero, and we assigned a 3MB window.
After dff22d2054, we read the bridge windows before sizing the bus. The
firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
we never decrease the size, we kept 16MB even though we only needed 3MB.
But 16MB doesn't fit in the host bridge aperture, so we failed to assign
space for the window and the downstream devices.
I think this is a defect in the PCI core: we shouldn't rely on the firmware
to assign sensible windows.
Ray reported a similar problem, also on ARM, with Broadcom iProc.
Issues like this are too hard to fix right now, so revert dff22d2054.
Reported-by: Hannes <oe5hpm@gmail.com>
Reported-by: Ray Jui <rjui@broadcom.com>
Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map. This facility is used by the pmem driver to
enable pfn_to_page() operations on the page frames returned by DAX
('direct_access' in 'struct block_device_operations'). For now, the
'memmap' allocation for these "device" pages comes from "System
RAM". Support for allocating the memmap from device memory will
arrive in a later kernel.
2/ Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3. Completion of
the conversion is targeted for v4.4.
3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
5/ Miscellaneous updates and fixes to libnvdimm including support
for issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJV6Nx7AAoJEB7SkWpmfYgCWyYQAI5ju6Gvw27RNFtPovHcZUf5
JGnxXejI6/AqeTQ+IulgprxtEUCrXOHjCDA5dkjr1qvsoqK1qxug+vJHOZLgeW0R
OwDtmdW4Qrgeqm+CPoxETkorJ8wDOc8mol81kTiMgeV3UqbYeeHIiTAmwe7VzZ0C
nNdCRDm5g8dHCjTKcvK3rvozgyoNoWeBiHkPe76EbnxDICxCB5dak7XsVKNMIVFQ
NuYlnw6IYN7+rMHgpgpRux38NtIW8VlYPWTmHExejc2mlioWMNBG/bmtwLyJ6M3e
zliz4/cnonTMUaizZaVozyinTa65m7wcnpjK+vlyGV2deDZPJpDRvSOtB0lH30bR
1gy+qrKzuGKpaN6thOISxFLLjmEeYwzYd7SvC9n118r32qShz+opN9XX0WmWSFlA
sajE1ehm4M7s5pkMoa/dRnAyR8RUPu4RNINdQ/Z9jFfAOx+Q26rLdQXwf9+uqbEb
bIeSQwOteK5vYYCstvpAcHSMlJAglzIX5UfZBvtEIJN7rlb0VhmGWfxAnTu+ktG1
o9cqAt+J4146xHaFwj5duTsyKhWb8BL9+xqbKPNpXEp+PbLsrnE/+WkDLFD67jxz
dgIoK60mGnVXp+16I2uMqYYDgAyO5zUdmM4OygOMnZNa1mxesjbDJC6Wat1Wsndn
slsw6DkrWT60CRE42nbK
=o57/
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release. The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
the introduction of ZONE_DEVICE + devm_memremap_pages().
Summary:
- Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map.
This facility is used by the pmem driver to enable pfn_to_page()
operations on the page frames returned by DAX ('direct_access' in
'struct block_device_operations').
For now, the 'memmap' allocation for these "device" pages comes
from "System RAM". Support for allocating the memmap from device
memory will arrive in a later kernel.
- Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3.
Completion of the conversion is targeted for v4.4.
- Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
- Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
- Miscellaneous updates and fixes to libnvdimm including support for
issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes"
* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
libnvdimm, pmem: direct map legacy pmem by default
libnvdimm, pmem: 'struct page' for pmem
libnvdimm, pfn: 'struct page' provider infrastructure
x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
add devm_memremap_pages
mm: ZONE_DEVICE for "device memory"
mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
dax: drop size parameter to ->direct_access()
nd_blk: change aperture mapping from WC to WB
nvdimm: change to use generic kvfree()
pmem, dax: have direct_access use __pmem annotation
dax: update I/O path to do proper PMEM flushing
pmem: add copy_from_iter_pmem() and clear_pmem()
pmem, x86: clean up conditional pmem includes
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: move x86 PMEM API to new pmem.h header
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
pmem: switch to devm_ allocations
devres: add devm_memremap
libnvdimm, btt: write and validate parent_uuid
...
Pull irq updates from Thomas Gleixner:
"This updated pull request does not contain the last few GIC related
patches which were reported to cause a regression. There is a fix
available, but I let it breed for a couple of days first.
The irq departement provides:
- new infrastructure to support non PCI based MSI interrupts
- a couple of new irq chip drivers
- the usual pile of fixlets and updates to irq chip drivers
- preparatory changes for removal of the irq argument from interrupt
flow handlers
- preparatory changes to remove IRQF_VALID"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
irqchip: Add documentation for the bcm2836 interrupt controller
irqchip/bcm2835: Add support for being used as a second level controller
irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
PCI: xilinx: Fix typo in function name
irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
irqchip/gic: Only allow the primary GIC to set the CPU map
PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
m68k/irq: Prepare irq handlers for irq argument removal
C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
blackfin: Prepare irq handlers for irq argument removal
arc/irq: Prepare idu_cascade_isr for irq argument removal
sparc/irq: Use access helper irq_data_get_affinity_mask()
sparc/irq: Use helper irq_data_get_irq_handler_data()
parisc/irq: Use access helper irq_data_get_affinity_mask()
mn10300/irq: Use access helper irq_data_get_affinity_mask()
irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
...
Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI") changed the location of the code that initialises
dev->msi_cap/msix_cap and then disables MSI/MSI-X interrupts at PCI
probe time in devices that have this flag set. It moved the code from
pci_msi_init_pci_dev() to a new function named pci_msi_setup_pci_dev(),
called by pci_setup_device().
The pseries PCI probing code does not call pci_setup_device(), so since
the aforementioned commit the function pci_msi_setup_pci_dev() is not
called and MSI/MSI-X interrupts are left enabled. Additionally because
dev->msi_cap/msix_cap are not initialised no driver can ever enable
MSI/MSI-X.
To fix this, the pseries PCI probe should manually call
pci_msi_setup_pci_dev(), so this patch makes it non-static.
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
[mpe: Update change log to mention dev->msi_cap/msix_cap]
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Firmware typically configures the PCIe fabric with a consistent Max Payload
Size setting based on the devices present at boot. A hot-added device
typically has the power-on default MPS setting (128 bytes), which may not
match the fabric.
The previous Linux default, in the absence of any "pci=pcie_bus_*" options,
was PCIE_BUS_TUNE_OFF, in which we never touch MPS, even for hot-added
devices.
Add a new default setting, PCIE_BUS_DEFAULT, in which we make sure every
device's MPS setting matches the upstream bridge. This makes it more
likely that a hot-added device will work in a system with optimized MPS
configuration.
Note that if we hot-add a device that only supports 128-byte MPS, it still
likely won't work because we don't reconfigure the rest of the fabric.
Booting with "pci=pcie_bus_peer2peer" is a workaround for this because it
sets MPS to 128 for everything.
[bhelgaas: changelog, new default, rework for pci_configure_device() path]
Tested-by: Keith Busch <keith.busch@intel.com>
Tested-by: Jordan Hargrave <jharg93@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Previously we checked for invalid MPS settings, i.e., a device with MPS
different than its upstream bridge, in pcie_bus_detect_mps(). We only did
this if the arch or hotplug driver called pcie_bus_configure_settings(),
and then only if PCIe bus tuning was disabled (PCIE_BUS_TUNE_OFF).
Move the MPS checking code to pci_configure_device(), so we do it in the
pci_device_add() path for every device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a pci_scan_root_bus_msi() interface so an arch can specify the MSI
controller up front. This removes the need for a pcibios callback to set
the MSI controller later.
This is not exported because I'd like to replace the variety of "scan root
bus" interfaces with a single, more extensible interface that can handle
the MSI controller, domain, pci_ops, resources, etc. I hope this interface
is temporary.
[bhelgaas: changelog, split into separate patch]
Suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
We should not assume any particular hardware topology. Commit d0751b98df
("PCI: Add dev->has_secondary_link to track downstream PCIe links") relied
on the assumption that every PCIe hierarchy is rooted at a Root Port. But
we can't rely on any assumption about what hardware we will find; we just
have to deal with the world as it is.
On some platforms, PCIe devices (endpoints, switch upstream ports, etc.)
appear directly on the root bus, and there is no Root Port in the PCI bus
hierarchy. For example, Meelis observed these top-level devices on a
Sparc V245:
0000:02:00.0 PCI bridge to [bus 03-0d] Switch Upstream Port
0001:02:00.0 PCI bridge to [bus 03] PCIe to PCI/PCI-X Bridge
These devices *look* like they have links going upstream, but there really
are no upstream devices.
In set_pcie_port_type(), we used the parent device to figure out which side
of a switch port has a link, so if the parent device did not exist, we
dereferenced a NULL parent pointer.
Check whether the parent device exists before dereferencing it.
Meelis observed this oops on Sparc V245 and T2000. Ben Herrenschmidt says
this is also possible on IBM PowerVM guests on PowerPC.
[bhelgaas: changelog, comment]
Link: http://lkml.kernel.org/r/alpine.LRH.2.20.1508122118210.18637@math.ut.ee
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
* pci/hotplug:
PCI: pciehp: Remove ignored MRL sensor interrupt events
PCI: pciehp: Remove unused interrupt events
PCI: pciehp: Handle invalid data when reading from non-existent devices
PCI: Hold pci_slot_mutex while searching bus->slots list
PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem
PCI: pciehp: Simplify pcie_poll_cmd()
PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot
* pci/iommu:
PCI: Remove pci_ats_enabled()
PCI: Stop caching ATS Invalidate Queue Depth
PCI: Move ATS declarations to linux/pci.h so they're all together
PCI: Clean up ATS error handling
PCI: Use pci_physfn() rather than looking up physfn by hand
PCI: Inline the ATS setup code into pci_ats_init()
PCI: Rationalize pci_ats_queue_depth() error checking
PCI: Reduce size of ATS structure elements
PCI: Embed ATS info directly into struct pci_dev
PCI: Allocate ATS struct during enumeration
iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth
* pci/irq:
PCI: Kill off set_irq_flags() usage
* pci/virtualization:
PCI: Add ACS quirks for Intel I219-LM/V
Previously, we allocated pci_ats structures when an IOMMU driver called
pci_enable_ats(). An SR-IOV VF shares the STU setting with its PF, so when
enabling ATS on the VF, we allocated a pci_ats struct for the PF if it
didn't already have one. We held the sriov->lock to serialize threads
concurrently enabling ATS on several VFS so only one would allocate the PF
pci_ats.
Gregor reported a deadlock here:
pci_enable_sriov
sriov_enable
virtfn_add
mutex_lock(dev->sriov->lock) # acquire sriov->lock
pci_device_add
device_add
BUS_NOTIFY_ADD_DEVICE notifier chain
iommu_bus_notifier
amd_iommu_add_device # iommu_ops.add_device
init_iommu_group
iommu_group_get_for_dev
iommu_group_add_device
__iommu_attach_device
amd_iommu_attach_device # iommu_ops.attach_device
attach_device
pci_enable_ats
mutex_lock(dev->sriov->lock) # deadlock
There's no reason to delay allocating the pci_ats struct, and if we
allocate it for each device at enumeration-time, there's no need for
locking in pci_enable_ats().
Allocate pci_ats struct during enumeration, when we initialize other
capabilities.
Note that this implementation requires ATS to be enabled on the PF first,
before on any of the VFs because the PF controls the STU for all the VFs.
Link: http://permalink.gmane.org/gmane.linux.kernel.iommu/9433
Reported-by: Gregor Dick <gdick@solarflare.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Quoting Arnd:
I was thinking the opposite approach and basically removing all uses
of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
them.and we can probably replace them all with hardcoded
ioremap_cached() calls in the cases they are actually useful.
All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
ioremap_nocache() if the resource is cacheable, however ioremap() is
uncached by default. Clearly none of the existing usages care about the
cacheability. Particularly devm_ioremap_resource() never worked as
advertised since it always fell back to plain ioremap().
Clean this up as the new direction we want is to convert
ioremap_<type>() usages to memremap(..., flags).
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* pci/irq:
PCI/MSI: Free legacy IRQ when enabling MSI/MSI-X
PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed
PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()
PCI: Add pcibios_alloc_irq() and pcibios_free_irq()
* pci/misc:
PCI: Remove unused "pci_probe" flags
PCI: Add VPD function 0 quirk for Intel Ethernet devices
PCI: Add dev_flags bit to access VPD through function 0
PCI / ACPI: Fix pci_acpi_optimize_delay() comment
PCI: Remove a broken link in quirks.c
PCI: Remove useless redundant code
PCI: Simplify pci_find_(ext_)capability() return value checks
PCI: Move PCI_FIND_CAP_TTL to pci.h and use it in quirks
PCI: Add pcie_downstream_port() (true for Root and Switch Downstream Ports)
PCI: Fix pcie_port_device_resume() comment
PCI: Shift PCI_CLASS_NOT_DEFINED consistently with other classes
PCI: Revert aeb30016fe ("PCI: add Intel USB specific reset method")
PCI: Fix TI816X class code quirk
PCI: Fix generic NCR 53c810 class code quirk
PCI: Use PCI_CLASS_SERIAL_USB instead of bare number
PCI: Add quirk for Intersil/Techwell TW686[4589] AV capture cards
PCI: Remove Intel Cherrytrail D3 delays
* pci/resource:
PCI: Call pci_read_bridge_bases() from core instead of arch code
* pci/virtualization:
PCI: Restore ACS configuration as part of pci_restore_state()
Previously, pci_setup_device() and similar functions searched the
pci_bus->slots list without any locking. It was possible for another
thread to update the list while we searched it.
Add pci_dev_assign_slot() to search the list while holding pci_slot_mutex.
[bhelgaas: changelog, fold in CONFIG_SYSFS fix]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In order to populate the PCI host bridge msi_domain, use the
"msi-parent" attribute to lookup a corresponding irq domain.
If found, this is our MSI domain.
This gets plugged into the core PCI code.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Ma Jun <majun258@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Duc Dang <dhdang@apm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1438091186-10244-6-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In order to be able to populate the device msi_domain field,
add the necessary hooks to propagate the host bridge msi_domain
across secondary busses to devices.
So far, nobody populates the initial msi_domain.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Cc: Ma Jun <majun258@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Duc Dang <dhdang@apm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1438091186-10244-5-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When we scan a PCI bus, we read PCI-PCI bridge window registers with
pci_read_bridge_bases() so we can validate the resource hierarchy. Most
architectures call pci_read_bridge_bases() from pcibios_fixup_bus(), but
PCI-PCI bridges are not arch-specific, so this doesn't need to be in
arch-specific code.
Call pci_read_bridge_bases() directly from the PCI core instead of from
arch code.
For alpha and mips, we now call pci_read_bridge_bases() always; previously
we only called it if PCI_PROBE_ONLY was set.
[bhelgaas: changelog]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: James E.J. Bottomley <jejb@parisc-linux.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: David Howells <dhowells@redhat.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Tony Luck <tony.luck@intel.com>
CC: David S. Miller <davem@davemloft.net>
CC: Ingo Molnar <mingo@redhat.com>
CC: Guenter Roeck <linux@roeck-us.net>
CC: Michal Simek <monstr@monstr.eu>
CC: Chris Zankel <chris@zankel.net>
The PCI class in dev->class is a three-byte value comprising a base class,
sub-class, and interface type. PCI_CLASS_NOT_DEFINED includes the base
class and sub-class, but not the interface type, so it should be shifted to
make space for the interface. It happens that PCI_CLASS_NOT_DEFINED is
zero, so it doesn't matter in the end, but we should still use it
consistently with other class definitions.
Treat PCI_CLASS_NOT_DEFINED as a base class/sub-class value that should
appear in bits 8-23 of dev->class.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No one uses pci_scan_bus_parented() any more, remove it.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>