After recent introduction of new VPD API functions and user migration
these defines and inline functions aren't used outside VPD core any
longer.
Link: https://lore.kernel.org/r/d33e06bf-bc5e-ece7-bf35-7245ae224d1b@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add a pci_vpd_find_id_string() API function to retrieve the ID string from
VPD.
This way callers don't need pci_vpd_lrdt_size() any longer, and it can be
made private to the VPD core.
Link: https://lore.kernel.org/r/c5225bf6-8d29-970d-e271-0d7b52252630@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move pci_vpd_find_tag() post-processing from pci_vpd_find_ro_info_keyword()
to pci_vpd_find_tag(). This simplifies function pci_vpd_find_id_string()
that will be added in a subsequent patch.
Link: https://lore.kernel.org/r/fb15393f-d3b2-e140-2643-570d3abd7382@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that the last users have been migrated to pci_vpd_find_ro_keyword()
we can stop exporting this function. It's still used in VPD core code.
Link: https://lore.kernel.org/r/96ca2a56-383e-9b61-9cba-4f1e5611dc15@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that the last users have been migrated to pci_vpd_find_ro_keyword()
we can stop exporting this function. It's still used in VPD core code.
Link: https://lore.kernel.org/r/71131eca-0502-7878-365f-30b6614161cf@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- Address 3 PCI device power management issues (Rafael Wysocki).
- Add Power Limit4 support for Alder Lake to the Intel RAPL power
capping driver (Sumeet Pawnikar).
- Add HWP guaranteed performance change notification support to
the intel_pstate driver (Srinivas Pandruvada).
- Replace deprecated CPU-hotplug functions in code related to power
management (Sebastian Andrzej Siewior).
- Update CPU PM notifiers to use raw spinlocks (Valentin Schneider).
- Add support for 'required-opps' DT property to the generic power
domains (genpd) framework and use this property for I2C on ARM64
sc7180 (Rajendra Nayak).
- Fix Kconfig issue related to genpd (Geert Uytterhoeven).
- Increase energy calculation precision in the Energy Model (Lukasz
Luba).
- Fix kobject deletion in the exit code of the schedutil cpufreq
governor (Kevin Hao).
- Unmark some functions as kernel-doc in the PM core to avoid
false-positive documentation build warnings (Randy Dunlap).
- Check RTC features instead of ops in suspend_test Alexandre
Belloni).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmEtIvYSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxnQ8QAK2QCyfFAdrPE3zn+XdTcVC8rpYhL0d2
3YpbS2obZ4lOHsrNy7SrrU5NdgvCFPHl14Py5rxu4QZW8EF5km6SqJYj9MhrDsEm
wJ/ZM3Zbaj98MAls7pulHxabMo8hvGfMSmcNOFfgh7yqCm5eu8gNnt/LbLaB9IvI
v46ZGNvnpbtkWgk4Eaq+v3Y5/+4CoUfuAFn7K21SHNcgzDbhE3Ii6cam/rwPeCdn
a+LcDLzeDQpnnYKukx7Zyk7Z+FblXWHWZ/ClLcR8JgYYSrbrXxSLtRdM5EtLje2J
ZnqFKqmlMcq2OhTjBSoHSJjiOkxyz5eWWrEf7d1/oH19WSW+rv5VfZUXAkYtfGKo
OJChuIomrLNqJhAwTnxG3fnVh2NMhLwDVEjkFgvej4shCZXhoVgWu39mWnbyTxV3
57adiuVUv6AboH6Lyvi/yzV7sWmZbL/U/YVvt+Z9SEh0syy6YBb87bW3Mr1qnLeG
QF4AmF553qzHwd9uxUoFfiUoBxH1GvNmOWtx6lYKhy0lm3AwvW1XHKqsDG4KbVJg
zzok7J2yv1/1rd8dJ+8tCwyyC3WJjNjtq/ULOltugwQpqVK4PCwWAUmjJaTP3Gpp
ZH/bGuLSQWxxiinwKCkFSxwbewJu+AvtoXYKBfk/Gf5O6GyUyryBbBMO9dsqP27A
Zg3P4+ejS7RN
=sx6G
-----END PGP SIGNATURE-----
Merge tag 'pm-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These address some PCI device power management issues, add new
hardware support to the RAPL power capping driver, add HWP guaranteed
performance change notification support to the intel_pstate driver,
replace deprecated CPU-hotplug functions in a few places, update CPU
PM notifiers to use raw spinlocks, update the PM domains framework
(new DT property support, Kconfig fix), do a couple of cleanups in
code related to system sleep, and improve the energy model and the
schedutil cpufreq governor.
Specifics:
- Address 3 PCI device power management issues (Rafael Wysocki).
- Add Power Limit4 support for Alder Lake to the Intel RAPL power
capping driver (Sumeet Pawnikar).
- Add HWP guaranteed performance change notification support to the
intel_pstate driver (Srinivas Pandruvada).
- Replace deprecated CPU-hotplug functions in code related to power
management (Sebastian Andrzej Siewior).
- Update CPU PM notifiers to use raw spinlocks (Valentin Schneider).
- Add support for 'required-opps' DT property to the generic power
domains (genpd) framework and use this property for I2C on ARM64
sc7180 (Rajendra Nayak).
- Fix Kconfig issue related to genpd (Geert Uytterhoeven).
- Increase energy calculation precision in the Energy Model (Lukasz
Luba).
- Fix kobject deletion in the exit code of the schedutil cpufreq
governor (Kevin Hao).
- Unmark some functions as kernel-doc in the PM core to avoid
false-positive documentation build warnings (Randy Dunlap).
- Check RTC features instead of ops in suspend_test Alexandre
Belloni)"
* tag 'pm-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: domains: Fix domain attach for CONFIG_PM_OPP=n
powercap: Add Power Limit4 support for Alder Lake SoC
cpufreq: intel_pstate: Process HWP Guaranteed change notification
thermal: intel: Allow processing of HWP interrupt
notifier: Remove atomic_notifier_call_chain_robust()
PM: cpu: Make notifier chain use a raw_spinlock_t
PM: sleep: unmark 'state' functions as kernel-doc
arm64: dts: sc7180: Add required-opps for i2c
PM: domains: Add support for 'required-opps' to set default perf state
opp: Don't print an error if required-opps is missing
cpufreq: schedutil: Use kobject release() method to free sugov_tunables
PM: EM: Increase energy calculation precision
PM: sleep: check RTC features instead of ops in suspend_test
PM: sleep: s2idle: Replace deprecated CPU-hotplug functions
cpufreq: Replace deprecated CPU-hotplug functions
powercap: intel_rapl: Replace deprecated CPU-hotplug functions
PCI: PM: Enable PME if it can be signaled from D3cold
PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently
PCI: Use pci_update_current_state() in pci_enable_device_flags()
HiSilicon KunPeng920 and KunPeng930 have devices that appear as PCI but are
actually on the AMBA bus. These fake PCI devices can support SVA via the
SMMU stall feature.
DT systems can indicate this in the device tree, but ACPI systems don't
have that mechanism, so add a "dma-can-stall" property manually for them.
[bhelgaas: add text from Robin as comment near quirk]
Link: https://lore.kernel.org/r/1626144876-11352-4-git-send-email-zhangfei.gao@linaro.org
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Add a driver for the DesignWare-based PCIe controller found on
RK356X. The existing pcie-rockchip-host driver is only used for
the Rockchip-designed IP found on RK3399.
Link: https://lore.kernel.org/r/20210625065511.1096935-1-xxm@rock-chips.com
Tested-by: Peter Geis <pgwipeout@gmail.com>
Signed-off-by: Simon Xue <xxm@rock-chips.com>
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Reviewed-by: Rob Herring <robh@kernel.org>
As part of code refactoring completed in a0fd361db8 ("PCI: dwc: Move
"dbi", "dbi2", and "addr_space" resource setup into common code"),
dw_plat_add_pcie_ep() was removed and the call to the dw_pcie_ep_init() was
moved into dw_plat_pcie_probe().
This left a break statement behind that is not needed any more as as
dw_plat_pcie_probe() returns immediately after calling dw_pcie_ep_init().
Remove this surplus break statement that became dead code.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20210701210252.1638709-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The switch statement in the artpec6_pcie_probe() has a local code block
where "val" is defined and immediately used by the artpec6_pcie_readl().
This extra code block adds brackets at the same indentation level as the
switch statement itself which can hinder readability of the code.
Move the "val" declaration to the top of the function and remove
the extra code block from the switch statement.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20210701204401.1636562-2-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
As part of code refactoring completed in a0fd361db8 ("PCI: dwc: Move
"dbi", "dbi2", and "addr_space" resource setup into common code"),
artpec6_add_pcie_ep() was removed and the call to the dw_pcie_ep_init()
was moved into artpec6_pcie_probe().
This left a break statement behind that is not needed any more as
artpec6_pcie_probe() returns immediately after calling dw_pcie_ep_init().
Remove this surplus break statement that became dead code.
Link: https://lore.kernel.org/r/20210701204401.1636562-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Add support for the PCIe RC controller on Toshiba Visconti ARM SoCs. This
PCIe controller is based on the Synopsys DesignWare PCIe core.
Link: https://lore.kernel.org/r/20210811083830.784065-3-nobuhiro1.iwamatsu@toshiba.co.jp
Signed-off-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Previously we assumed that all Root Ports and Switch Downstream Ports
supported Link Bandwidth Notification. Per spec, this is only required
for Ports supporting Links wider than x1 and/or multiple Link speeds
(PCIe r5.0, sec 7.5.3.6).
Because we assumed all Ports supported it, we tried to set up a Bandwidth
Notification IRQ, which failed for devices that don't support IRQs at all,
which meant pcieport didn't attach to the Port at all.
Check the Link Bandwidth Notification Capability bit and enable the service
only when the Port supports it.
[bhelgaas: commit log]
Fixes: e8303bb7a7 ("PCI/LINK: Report degraded links via link bandwidth notification")
Link: https://lore.kernel.org/r/20210512213314.7778-1-stuart.w.hayes@gmail.com
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org
Core changes:
- The usual set of small fixes and improvements all over the place, but nothing
outstanding
MSI changes:
- Further consolidation of the PCI/MSI interrupt chip code
- Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts
of platform devices in the same way as PCI exposes them.
Driver changes:
- Support for ARM GICv3 EPPI partitions
- Treewide conversion to generic_handle_domain_irq() for all chained
interrupt controllers
- Conversion to bitmap_zalloc() throughout the irq chip drivers
- The usual set of small fixes and improvements
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmEsnpsTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoS+/EACQdpRkzl3IDIYqThxVZ8KQzp2rKKVn
qisAQiWg/6koNJx/yYy62KNAUyKjCIObNtRnWi7OAOx6OvNtQTD2WOLAwkh3Pgw1
8ePYYl55k+yCs8VoITsZM9jYeO+Tk878pU2A6R943zR+g6G7bskGJrxEyZ9TbzIe
qKfusNKnRY9/jMQaRALUAAtA9VIVR867GqORX5X8hKz8yE2rqlpb4y+1CFba5BTV
Vlxw7cIXvXBn7BKAom5diRqEGDNJEbX+56jJ7yDZshgLo7m11D7QLw72kmb6TNVC
g7PchvFi4afpc1ifEAAp0tk4RiSIAQ91nS3n0+jLcLbodOjIkl14eY02ZCJGAP29
uslyzUbmy1wgejG6CA63JtZ4MYdrf/OSMGuoN78qnOKYcIsWFzOvlJmBWWNW34qW
LCaUF9QdJ/slXu6B4vIx30GfN9q4myml8bFUobE5q9mBRrEk4R0B7iyBvPu1xKYr
ZEan67prI5VEu+afJGpp4r294m4HNVkMLfl3nYmE5+y4MoLeMNKDY3IPTvI9iP4G
kaFgoPvQo23WnuclNYpJ+CaA4aRASlB2nTY+oAXIYfehbey9EW5vq4/EK864ek6w
oyUTepxxNhE81tG2jpQbf2tR4COsEHy986clxqPP4AvsZXcbypCw8O2FcflpQbHO
5DLEAfTmp7cziQ==
=qyll
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates to the interrupt core and driver subsystems:
Core changes:
- The usual set of small fixes and improvements all over the place,
but nothing stands out
MSI changes:
- Further consolidation of the PCI/MSI interrupt chip code
- Make MSI sysfs code independent of PCI/MSI and expose the MSI
interrupts of platform devices in the same way as PCI exposes them.
Driver changes:
- Support for ARM GICv3 EPPI partitions
- Treewide conversion to generic_handle_domain_irq() for all chained
interrupt controllers
- Conversion to bitmap_zalloc() throughout the irq chip drivers
- The usual set of small fixes and improvements"
* tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
platform-msi: Add ABI to show msi_irqs of platform devices
genirq/msi: Move MSI sysfs handling from PCI to MSI core
genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
irqdomain: Export irq_domain_disconnect_hierarchy()
irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
irqchip/apple-aic: Fix irq_disable from within irq handlers
pinctrl/rockchip: drop the gpio related codes
gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
gpio/rockchip: support next version gpio controller
gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
gpio/rockchip: add driver for rockchip gpio
dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
pinctrl/rockchip: add pinctrl device to gpio bank struct
pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
pinctrl/rockchip: always enable clock for gpio controller
genirq: Fix kernel doc indentation
EDAC/altera: Convert to generic_handle_domain_irq()
powerpc: Bulk conversion to generic_handle_domain_irq()
nios2: Bulk conversion to generic_handle_domain_irq()
...
* pm-pci:
PCI: PM: Enable PME if it can be signaled from D3cold
PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently
PCI: Use pci_update_current_state() in pci_enable_device_flags()
* pm-sleep:
PM: sleep: unmark 'state' functions as kernel-doc
PM: sleep: check RTC features instead of ops in suspend_test
PM: sleep: s2idle: Replace deprecated CPU-hotplug functions
* pm-domains:
PM: domains: Fix domain attach for CONFIG_PM_OPP=n
arm64: dts: sc7180: Add required-opps for i2c
PM: domains: Add support for 'required-opps' to set default perf state
opp: Don't print an error if required-opps is missing
* powercap:
powercap: Add Power Limit4 support for Alder Lake SoC
powercap: intel_rapl: Replace deprecated CPU-hotplug functions
When running as Xen PV guest, masking MSI-X is a responsibility of the
hypervisor. The guest has no write access to the relevant BAR at all - when
it tries to, it results in a crash like this:
BUG: unable to handle page fault for address: ffffc9004069100c
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
e1000_probe+0x41f/0xdb0 [e1000e]
local_pci_probe+0x42/0x80
(...)
The recently introduced function msix_mask_all() does not check the global
variable pci_msi_ignore_mask which is set by XEN PV to bypass the masking
of MSI[-X] interrupts.
Add the check to make this function XEN PV compatible.
Fixes: 7d5ec3d361 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210826170342.135172-1-marmarek@invisiblethingslab.com
Some systems, e.g., HiSilicon KunPeng920 and KunPeng930, have devices that
appear as PCI but are actually on the AMBA bus. Some of these fake PCI
devices support a PASID-like feature and they do have a working PASID
capability even though they do not use the PCIe Transport Layer Protocol
and do not support TLP prefixes.
Add a pasid_no_tlp bit for this "PASID works without TLP prefixes" case and
update pci_enable_pasid() so it can enable PASID on these devices.
Set this bit for HiSilicon KunPeng920 and KunPeng930.
[bhelgaas: squashed, commit log]
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/1626144876-11352-2-git-send-email-zhangfei.gao@linaro.org
Link: https://lore.kernel.org/r/1626144876-11352-3-git-send-email-zhangfei.gao@linaro.org
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add 'override_only' field to struct pci_device_id to be used as part of
pci_match_device().
When set, a driver only matches the entry when dev->driver_override is
set to that driver.
In addition, add a helper macro named 'PCI_DEVICE_DRIVER_OVERRIDE' to
enable setting some data on it.
Next patch from this series will use the above functionality.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20210826103912.128972-10-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Use of_get_pci_domain_nr() to get the pci domain.
If the "linux,pci-domain" property is present, we assume that the PCIe
bridge is an individual bridge, hence we only need to parse one port.
Link: https://lore.kernel.org/r/20210823032800.1660-5-chuanjia.liu@mediatek.com
Signed-off-by: Chuanjia Liu <chuanjia.liu@mediatek.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
Use platform_get_irq_byname() to get the irq number
if the "interrupt-names" property is defined.
Link: https://lore.kernel.org/r/20210823032800.1660-4-chuanjia.liu@mediatek.com
Signed-off-by: Chuanjia Liu <chuanjia.liu@mediatek.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
For the new dts format, add a new method to get
shared pcie-cfg base address and use it to configure
the PCIECFG controller
Link: https://lore.kernel.org/r/20210823032800.1660-3-chuanjia.liu@mediatek.com
Signed-off-by: Chuanjia Liu <chuanjia.liu@mediatek.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
irq_mask and irq_unmask callbacks need to be properly guarded by raw spin
locks as masking/unmasking procedure needs atomic read-modify-write
operation on hardware register.
Link: https://lore.kernel.org/r/20210820155020.3000-1-pali@kernel.org
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Add a predicate that returns if PCIe PTM (Precision Time Measurement)
is enabled.
It will only return true if it's enabled in all the ports in the path
from the device to the root.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Make pci_enable_ptm() accessible from the drivers.
Exposing this to the driver enables the driver to use the
'ptm_enabled' field of 'pci_dev' to check if PTM is enabled or not.
This reverts commit ac6c26da29 ("PCI: Make pci_enable_ptm() private").
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Move PCI's MSI sysfs code to the irq core so that other busses such as
platform can reuse it.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210813035628.6844-2-21cnbao@gmail.com
Now we have everything we need, just provide a proper sysdata type for
the bus to use on ARM64 and everything else works.
Link: https://lore.kernel.org/r/20210726180657.142727-9-boqun.feng@gmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Since PCI_HYPERV depends on PCI_MSI_IRQ_DOMAIN which selects
GENERIC_MSI_IRQ_DOMAIN, we can use dev_set_msi_domain() to set up the
MSI domain at probing time, and this works for both x86 and ARM64.
Therefore use it as the preparation for ARM64 Hyper-V PCI support.
As a result, no longer need to maintain ->fwnode in x86 specific
pci_sysdata, and make hv_pcibus_device own it instead.
Link: https://lore.kernel.org/r/20210726180657.142727-8-boqun.feng@gmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change, just store and maintain the PCI domain number in
the ->domain_nr of pci_host_bridge. Note that we still need to keep
the copy of domain number in x86-specific pci_sysdata, because x86 is
not a PCI_DOMAINS_GENERIC=y architecture, so the ->domain_nr of
pci_host_bridge doesn't work for it yet.
Link: https://lore.kernel.org/r/20210726180657.142727-7-boqun.feng@gmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
In order to support ARM64 Hyper-V PCI, we need to set up the bridge at
probing time because ARM64 is a PCI_DOMAIN_GENERIC=y arch and we don't
have pci_config_window (ARM64 sysdata) for a PCI root bus on Hyper-V, so
it's impossible to retrieve the information (e.g. PCI domains, MSI
domains) from bus sysdata on ARM64 after creation.
Originally in create_root_hv_pci_bus(), pci_create_root_bus() is used to
create the root bus and the corresponding bridge based on x86 sysdata.
Now we create a bridge first and then call pci_scan_root_bus_bridge(),
which allows us to do the necessary set-ups for the bridge.
Link: https://lore.kernel.org/r/20210726180657.142727-6-boqun.feng@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Currently, at probing time, the MSI domains of root buses are populated
if either the information of MSI domain is available from firmware (DT
or ACPI), or arch-specific sysdata is used to pass the fwnode of the MSI
domain. These two conditions don't cover all, e.g. Hyper-V virtual PCI
on ARM64, which doesn't have the MSI information in the firmware and
couldn't use arch-specific sysdata because running on an architecture
with PCI_DOMAINS_GENERIC=y.
To support populating MSI domains of the root buses at the probing when
neither of the above condition is true, the ->msi_domain of the
corresponding bridge device is used: in pci_host_bridge_msi_domain(),
which should return the MSI domain of the root bus, the ->msi_domain of
the corresponding bridge is fetched first as a potential value of the
MSI domain of the root bus.
In order to use the approach to populate MSI domains, the driver needs
to dev_set_msi_domain() on the bridge before calling
pci_register_host_bridge(), and makes sure GENERIC_MSI_IRQ_DOMAIN=y.
Another advantage of this new approach is providing an arch-independent
way to populate MSI domains, which allows sharing the driver code as
much as possible between architectures.
Originally-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210726180657.142727-3-boqun.feng@gmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Currently we retrieve the PCI domain number of the host bridge from the
bus sysdata (or pci_config_window if PCI_DOMAINS_GENERIC=y). Actually
we have the information at PCI host bridge probing time, and it makes
sense that we store it into pci_host_bridge. One benefit of doing so is
the requirement for supporting PCI on Hyper-V for ARM64, because the
host bridge of Hyper-V doesn't have pci_config_window, whereas ARM64 is
a PCI_DOMAINS_GENERIC=y arch, so we cannot retrieve the PCI domain
number from pci_config_window on ARM64 Hyper-V guest.
As the preparation for ARM64 Hyper-V PCI support, we introduce the
domain_nr in pci_host_bridge and a sentinel value to allow drivers to
set domain numbers properly at probing time. Currently
CONFIG_PCI_DOMAINS_GENERIC=y archs are only users of this
newly-introduced field.
Link: https://lore.kernel.org/r/20210726180657.142727-2-boqun.feng@gmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Interfaces and structs for saving and restoring PCI Capability state were
declared in include/linux/pci.h, but aren't needed outside drivers/pci/.
Move these to drivers/pci/pci.h:
struct pci_cap_saved_data
struct pci_cap_saved_state
void pci_allocate_cap_save_buffers()
void pci_free_cap_save_buffers()
int pci_add_cap_save_buffer()
int pci_add_ext_cap_save_buffer()
struct pci_cap_saved_state *pci_find_saved_cap()
struct pci_cap_saved_state *pci_find_saved_ext_cap()
Link: https://lore.kernel.org/r/20210802221728.1469304-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
PCIe Address Translation Services (ATS) provides a mechanism for a device
to provide an on-device caching translation agent (device IOTLB). We
already have a means to disable support for this feature via the pci=noats
option. For untrusted and externally facing devices, we not only disable
ATS support for the device, but we use Access Control Services (ACS)
Transaction Blocking to actively prevent devices from sending TLPs with
non-default AT field values.
Extend pci=noats to also make use of PCI_ACS_TB so that not only is ATS
disabled at the device, but blocked at the downstream ports. This provides
a means to further lock-down ATS for cases such as device assignment, where
it may not be the hardware configuration of the device that makes it
untrusted, but the driver running on the device.
Link: https://lore.kernel.org/r/162404966325.2362347.12176138291577486015.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rajat Jain <rajatja@google.com>
Some Cavium endpoints are implemented as multi-function devices without ACS
capability, but they actually don't support peer-to-peer transactions.
Add ACS quirks to declare DMA isolation for the following devices:
- BGX device found on Octeon-TX (8xxx)
- CGX device found on Octeon-TX2 (9xxx)
- RPM device found on Octeon-TX3 (10xxx)
Link: https://lore.kernel.org/r/20210810122425.1115156-1-george.cherian@marvell.com
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since 39850ed510 ("PCI/PTM: Save/restore Precision Time Measurement
Capability for suspend/resume"), devices that have PTM capability but
don't enable it see this message on calls to pci_save_state():
no suspend buffer for PTM
Drop the message, it's perfectly fine not to use a capability.
Fixes: 39850ed510 ("PCI/PTM: Save/restore Precision Time Measurement Capability for suspend/resume")
Link: https://lore.kernel.org/r/20210811185955.3112534-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David E. Box <david.e.box@linux.intel.com>
VPD checksum information and checksum calculation are specified by PCIe
r5.0, sec 6.28.2.2. Therefore checksum handling can and should be moved
into the PCI VPD core.
Add pci_vpd_check_csum() to validate the VPD checksum.
[bhelgaas: split to separate patch]
Link: https://lore.kernel.org/r/1643bd7a-088e-1028-c9b0-9d112cf48d63@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
All users of pci_vpd_find_info_keyword() are interested in the VPD RO
section only. In addition all calls are followed by the same activities to
calculate start of tag data area and size of the data area.
Add pci_vpd_find_ro_info_keyword() that combines these functionalities.
pci_vpd_find_info_keyword() can be phased out once all users are converted.
[bhelgaas: split pci_vpd_check_csum() to separate patch]
Link: https://lore.kernel.org/r/1643bd7a-088e-1028-c9b0-9d112cf48d63@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Several users of the VPD API use a fixed-size buffer and read the VPD into
it for further usage. This requires special handling for the case that the
buffer isn't big enough to hold the full VPD data. Also the buffer is
often allocated on the stack, which isn't too nice.
Add pci_vpd_alloc() to dynamically allocate buffer of the correct size and
read VPD into it.
Link: https://lore.kernel.org/r/955ff598-0021-8446-f856-0c2c077635d7@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add driver for Intel Keem Bay SoC PCIe controller. This controller
is based on DesignWare PCIe core.
In Root Complex mode, only internal reference clock is possible for
Keem Bay A0. For Keem Bay B0, external reference clock can be used
and will be the default configuration. Currently, keembay_pcie_of_data
structure has one member. It will be expanded later to handle this
difference.
Endpoint mode link initialization is handled by the boot firmware.
Link: https://lore.kernel.org/r/20210805211010.29484-3-srikanth.thokala@intel.com
Signed-off-by: Wan Ahmad Zainie <wan.ahmad.zainie.wan.mohamad@intel.com>
Signed-off-by: Srikanth Thokala <srikanth.thokala@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
In commit 6df6ba974a ("PCI: aardvark: Remove PCIe outbound window
configuration") was removed aardvark PCIe outbound window configuration and
commit description said that was recommended solution by HW designers.
But that commit completely removed support for configuring PCIe IO
resources without removing PCIe IO 'ranges' from DTS files. After that
commit PCIe IO space started to be treated as PCIe MEM space and accessing
it just caused kernel crash.
Moreover implementation of PCIe outbound windows prior that commit was
incorrect. It completely ignored offset between CPU address and PCIe bus
address and expected that in DTS is CPU address always same as PCIe bus
address without doing any checks. Also it completely ignored size of every
PCIe resource specified in 'ranges' DTS property and expected that every
PCIe resource has size 128 MB (also for PCIe IO range). Again without any
check. Apparently none of PCIe resource has in DTS specified size of 128
MB. So it was completely broken and thanks to how aardvark mask works,
configuration was completely ignored.
This patch reverts back support for PCIe outbound window configuration but
implementation is a new without issues mentioned above. PCIe outbound
window is required when DTS specify in 'ranges' property non-zero offset
between CPU and PCIe address space. To address recommendation by HW
designers as specified in commit description of 6df6ba974a, set default
outbound parameters as PCIe MEM access without translation and therefore
for this PCIe 'ranges' it is not needed to configure PCIe outbound window.
For PCIe IO space is needed to configure aardvark PCIe outbound window.
This patch fixes kernel crash when trying to access PCIe IO space.
Link: https://lore.kernel.org/r/20210624215546.4015-2-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # 6df6ba974a ("PCI: aardvark: Remove PCIe outbound window configuration")
Not much to see here. Half the fixes this time are for Qualcomm dts files,
fixing small mistakes on certain machines. The other fixes are:
- A 5.13 regression fix for freescale QE interrupt controller\
- A fix for TI OMAP gpt12 timer error handling
- A randconfig build regression fix for ixp4xx
- Another defconfig fix following the CONFIG_FB dependency rework
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmEe0MsACgkQmmx57+YA
GNkQaw/5Ae6pR1VN2r727xHfYPOG9tgHuU0eAUsDvZjVxQSWlC/TowMZ/+eMPoyc
6iwUz2W1BZrkPbc/3GfO/xQQq9HbdQbB0+hma9jkNVueSgURpaDsLHm1Qt8vXKw1
rSa/ITHIOuHbYE63RGU48/8qw/Xyr6JJRJpjZKuRXQRAJhuJisw13w0IJAFStvPC
GhgFkvkKruls1zsaeV5BeU1EZRnFCz9dZL519SPzol/dZW2allu9yiCFUopcdMJ+
G/XyBwL+JVkQuLGy/Y8n4CifbFsyHPOv/dj4SxGDFwXDYPb1l+4+CwkdjuBoSeYE
glbzJQJYZ7/QVyvUDIz5h5eulo03xrsx+80SQPCXjfmut+mWcLL3uSOcXb169F4S
VB0rHgusXLL6Z7NbqWigo5YF58DqpDKa19rLCpW+/QqDuhyusm91RbMIs0oLJm0B
n6HjYganyJM5VWgN5WvTpPGW/yJnt1uJoOwtgxKZSP95lmL7JRhUKzTI2AiZjo+8
6zvy6QFlgMrjoG8mfw7Ns+sS9sAXTxE3YwL8AyFtkn2JiAYH+sP2J4Wn6P+E8/kh
F5WtypSQaE2oQYgF8D06jq2Jd89dZdP+6ZABHlYZyflbbezJ/em1sxhohbRRLU10
5C/Mqwo9/yVg2tOKKDkFqAkb6eq9QKvSDz9L7jrfRDQm5RMYnb4=
=Ohvh
-----END PGP SIGNATURE-----
Merge tag 'soc-fixes-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Not much to see here. Half the fixes this time are for Qualcomm dts
files, fixing small mistakes on certain machines. The other fixes are:
- A 5.13 regression fix for freescale QE interrupt controller\
- A fix for TI OMAP gpt12 timer error handling
- A randconfig build regression fix for ixp4xx
- Another defconfig fix following the CONFIG_FB dependency rework"
* tag 'soc-fixes-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: fsl: qe: fix static checker warning
ARM: ixp4xx: fix building both pci drivers
ARM: configs: Update the nhk8815_defconfig
bus: ti-sysc: Fix error handling for sysc_check_active_timer()
soc: fsl: qe: convert QE interrupt controller to platform_device
arm64: dts: qcom: sdm845-oneplus: fix reserved-mem
arm64: dts: qcom: msm8994-angler: Disable cont_splash_mem
arm64: dts: qcom: sc7280: Fixup cpufreq domain info for cpu7
arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem mapping
arm64: dts: qcom: msm8992-bullhead: Remove PSCI
arm64: dts: qcom: c630: fix correct powerdown pin for WSA881x
Two legacy PCI sysfs objects "legacy_io" and "legacy_mem" were updated
to use an unified address space in the commit 636b21b501 ("PCI: Revoke
mappings like devmem"). This allows for revocations to be managed from
a single place when drivers want to take over and mmap() a /dev/mem
range.
Following the update, both of the sysfs objects should leverage the
iomem_get_mapping() function to get an appropriate address range, but
only the "legacy_io" has been correctly updated - the second attribute
seems to be using a wrong variable to pass the iomem_get_mapping()
function to.
Thus, correct the variable name used so that the "legacy_mem" sysfs
object would also correctly call the iomem_get_mapping() function.
Fixes: 636b21b501 ("PCI: Revoke mappings like devmem")
Link: https://lore.kernel.org/r/20210812132144.791268-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The Renoir XHCI controller apparently doesn't resume reliably with the
standard D3hot-to-D0 delay. Increase it to 20ms.
[Alex: I talked to the AMD USB hardware team and the AMD Windows team and
they are not aware of any HW errata or specific issues. The HW works fine
in Windows. I was told Windows uses a rather generous default delay of
100ms for PCI state transitions.]
Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com
Signed-off-by: Marcin Bachry <hegel666@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Prike Liang <prike.liang@amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com>
AM64 has the same PCIe IP as in J7200 with certain erratas not
applicable (quirk_detect_quiet_flag). Add support for "ti,am64-pcie-host"
compatible and "ti,am64-pcie-ep" compatible that is specific to AM64.
Link: https://lore.kernel.org/r/20210811123336.31357-5-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
J7200 has the same PCIe IP as in J721E with minor changes in the
wrapper. J7200 allows byte access of bridge configuration space
registers and the register field for LINK_DOWN interrupt is different.
J7200 also requires "quirk_detect_quiet_flag" to be set. Configure these
changes as part of driver data applicable only to J7200.
Link: https://lore.kernel.org/r/20210811123336.31357-4-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
PCIe fails to link up if SERDES lanes not used by PCIe are assigned to
another protocol. For example, link training fails if lanes 2 and 3 are
assigned to another protocol while lanes 0 and 1 are used for PCIe to
form a two lane link. This failure is due to an incorrect tie-off on an
internal status signal indicating electrical idle.
Status signals going from SERDES to PCIe Controller are tied-off when a
lane is not assigned to PCIe. Signal indicating electrical idle is
incorrectly tied-off to a state that indicates non-idle. As a result,
PCIe sees unused lanes to be out of electrical idle and this causes
LTSSM to exit Detect.Quiet state without waiting for 12ms timeout to
occur. If a receiver is not detected on the first receiver detection
attempt in Detect.Active state, LTSSM goes back to Detect.Quiet and
again moves forward to Detect.Active state without waiting for 12ms as
required by PCIe base specification. Since wait time in Detect.Quiet is
skipped, multiple receiver detect operations are performed back-to-back
without allowing time for capacitance on the transmit lines to
discharge. This causes subsequent receiver detection to always fail even
if a receiver gets connected eventually.
Add a quirk flag "quirk_detect_quiet_flag" to program the minimum
time the LTSSM should wait on entering Detect.Quiet state here.
This has to be set for J7200 as it has an incorrect tie-off on unused
lanes.
Link: https://lore.kernel.org/r/20210811123336.31357-3-kishon@ti.com
Signed-off-by: Nadeem Athani <nadeem@cadence.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Now that support for SR-IOV is added in PCIe endpoint core, add support
to configure virtual functions in the Cadence PCIe EP driver.
Link: https://lore.kernel.org/r/20210819123343.1951-7-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Simplify code to get register base address for
configuring PCI BAR.
Link: https://lore.kernel.org/r/20210819123343.1951-6-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add virtual function number in pci_epc ops. EPC controller driver
can perform virtual function specific initialization based on the
virtual function number.
Link: https://lore.kernel.org/r/20210819123343.1951-5-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
While the physical function has to be linked to endpoint controller, the
virtual function has to be linked to a physical function. Add support to
link a physical function to a virtual function in pci-ep-cfs.
Link: https://lore.kernel.org/r/20210819123343.1951-4-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add support to add virtual function in endpoint core. The virtual
function can only be associated with a physical function instead of a
endpoint controller. Provide APIs to associate a virtual function with
a physical function here.
[weiyongjun1@huawei.com: PCI: endpoint: Fix missing unlock on error in
pci_epf_add_vepf() - Reported-by: Hulk Robot <hulkci@huawei.com>]
Link: https://lore.kernel.org/r/20210819123343.1951-3-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Change the type of probe argument in functions which implement reset
methods from int to bool to make the context and intent clear.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-10-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
_RST is a standard ACPI method that performs a function level reset of a
device (ACPI v6.3, sec 7.3.25).
Add pci_dev_acpi_reset() to probe for _RST method and execute if present.
The default priority of this reset is set to below device-specific and
above hardware resets.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-9-ameynarkhede03@gmail.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Previously, the ACPI_COMPANION() of a pci_dev was usually set by
acpi_bind_one() in this path:
pci_device_add
pci_configure_device
pci_init_capabilities
device_add
device_platform_notify
acpi_platform_notify
acpi_device_notify # KOBJ_ADD
acpi_bind_one
ACPI_COMPANION_SET
However, things like pci_configure_device() and pci_init_capabilities()
that run before device_add() need the ACPI_COMPANION, e.g.,
acpi_pci_bridge_d3() uses a _DSD method to learn about D3 support. These
places had special-case code to manually look up the ACPI_COMPANION.
Set the ACPI_COMPANION earlier, in pci_setup_device(), so it will be
available while configuring the device. This covers both paths to creating
pci_dev objects:
pci_scan_single_device # for normal non-SR-IOV devices
pci_scan_device
pci_setup_device
pci_set_acpi_fwnode
pci_device_add
pci_iov_add_virtfn # for SR-IOV virtual functions
pci_setup_device
pci_set_acpi_fwnode
Also move the OF fwnode setup to the same spot.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20210817180500.1253-8-ameynarkhede03@gmail.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Use acpi_pci_power_manageable() instead of duplicating the logic in
acpi_pci_bridge_d3(). No functional change intended.
[bhelgaas: split out from
https://lore.kernel.org/r/20210817180500.1253-8-ameynarkhede03@gmail.com]
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move the existing logic from acpi_pci_bridge_d3() to a separate function
pci_set_acpi_fwnode() to set the ACPI fwnode. No functional change
intended.
Link: https://lore.kernel.org/r/20210817180500.1253-7-ameynarkhede03@gmail.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add "reset_method" sysfs attribute to enable user to query and set
preferred device reset methods and their ordering.
[bhelgaas: on invalid sysfs input, return error and preserve previous
config, as in earlier patch versions]
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-6-ameynarkhede03@gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
"reset_fn" indicates whether the device supports any reset mechanism.
Remove the use of reset_fn in favor of the reset_methods array that tracks
supported reset mechanisms of a device and their ordering.
The octeon driver incorrectly used reset_fn to detect whether the device
supports FLR or not. Use pcie_reset_flr() to probe whether it supports FLR.
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-5-ameynarkhede03@gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Add reset_methods[] in struct pci_dev to keep track of reset mechanisms
supported by the device and their ordering.
Refactor probing and reset functions to take advantage of calling
convention of reset functions.
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20210817180500.1253-4-ameynarkhede03@gmail.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Most reset methods are of the form "pci_*_reset(dev, probe)". pcie_flr()
was an exception because it relied on a separate pcie_has_flr() function
instead of taking a "probe" argument.
Add "pcie_reset_flr(dev, probe)" to follow the convention. Remove
pcie_has_flr().
Some pcie_flr() callers that did not use pcie_has_flr() remain.
[bhelgaas: commit log, rework pcie_reset_flr() to use dev->devcap directly]
Link: https://lore.kernel.org/r/20210817180500.1253-3-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Add a new member called devcap in struct pci_dev for caching the PCIe
Device Capabilities register to avoid reading PCI_EXP_DEVCAP multiple
times.
Refactor pcie_has_flr() to use cached device capabilities.
Link: https://lore.kernel.org/r/20210817180500.1253-2-ameynarkhede03@gmail.com
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
When the link is in L1, hardware should return it to L0
automatically whenever a transaction targets a component on the
other end of the link (PCIe r5.0, sec 5.2).
The R-Car PCIe controller doesn't handle this transition correctly.
If the link is not in L0, an MMIO transaction targeting a downstream
device fails, and the controller reports an ARM imprecise external
abort.
Work around this by hooking the abort handler so the driver can
detect this situation and help the hardware complete the link state
transition.
When the R-Car controller receives a PM_ENTER_L1 DLLP from the
downstream component, it sets PMEL1RX bit in PMSR register, but then
the controller enters some sort of in-between state. A subsequent
MMIO transaction will fail, resulting in the external abort. The
abort handler detects this condition and completes the link state
transition by setting the L1IATN bit in PMCTLR and waiting for the
link state transition to complete.
Link: https://lore.kernel.org/r/20210815181650.132579-1-marek.vasut@gmail.com
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: linux-renesas-soc@vger.kernel.org
Hyper-V vPCI protocol version 1_4 adds support for create interrupt
v3. Create interrupt v3 essentially makes the size of the vector
field bigger in the message, thereby allowing bigger vector values.
For example, that will come into play for supporting LPI vectors
on ARM, which start at 8192.
Link: https://lore.kernel.org/r/MW4PR21MB20026A6EA554A0B9EC696AA8C0159@MW4PR21MB2002.namprd21.prod.outlook.com
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Enable PCIe reference clock. There is no remove function that's why
this should be enough for simple operation.
Normally this clock is enabled by default by firmware but there are
usecases where this clock should be enabled by driver itself.
It is also good that PCIe clock is recorded in a clock framework.
Link: https://lore.kernel.org/r/ee6997a08fab582b1c6de05f8be184f3fe8d5357.1624618100.git.michal.simek@xilinx.com
Fixes: ab597d35ef ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller")
Signed-off-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
When both the old and the new PCI drivers are enabled
in the same kernel, there are a couple of namespace
conflicts that cause a build failure:
drivers/pci/controller/pci-ixp4xx.c:38: error: "IXP4XX_PCI_CSR" redefined [-Werror]
38 | #define IXP4XX_PCI_CSR 0x1c
|
In file included from arch/arm/mach-ixp4xx/include/mach/hardware.h:23,
from arch/arm/mach-ixp4xx/include/mach/io.h:15,
from arch/arm/include/asm/io.h:198,
from include/linux/io.h:13,
from drivers/pci/controller/pci-ixp4xx.c:20:
arch/arm/mach-ixp4xx/include/mach/ixp4xx-regs.h:221: note: this is the location of the previous definition
221 | #define IXP4XX_PCI_CSR(x) ((volatile u32 *)(IXP4XX_PCI_CFG_BASE_VIRT+(x)))
|
drivers/pci/controller/pci-ixp4xx.c:148:12: error: 'ixp4xx_pci_read' redeclared as different kind of symbol
148 | static int ixp4xx_pci_read(struct ixp4xx_pci *p, u32 addr, u32 cmd, u32 *data)
| ^~~~~~~~~~~~~~~
Rename both the ixp4xx_pci_read/ixp4xx_pci_write functions and the
IXP4XX_PCI_CSR macro. In each case, I went with the version that
has fewer callers to keep the change small.
Fixes: f7821b4934 ("PCI: ixp4xx: Add a new driver for IXP4xx")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: soc@kernel.org
Link: https://lore.kernel.org/r/20210721151546.2325937-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
pci_dev_str_match_path() is often called with a spinlock held so the
allocation has to be atomic. The call tree is:
pci_specified_resource_alignment() <-- takes spin_lock();
pci_dev_str_match()
pci_dev_str_match_path()
Fixes: 45db33709c ("PCI: Allow specifying devices using a base bus and path of devfns")
Link: https://lore.kernel.org/r/20210812070004.GC31863@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Exporting sysfs files that can't be accessed doesn't make much sense.
Therefore, if either a quirk or the dynamic size calculation result in VPD
being marked as invalid, treat this as though the device has no VPD
capability. One consequence is that the "vpd" sysfs file is not visible.
Link: https://lore.kernel.org/r/6a02b204-4ed2-4553-c3b2-eacf9554fa8d@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Determine VPD size in pci_vpd_init().
Quirks set dev->vpd.len to a non-zero value, so they cause us to skip the
dynamic size calculation. Prerequisite is that we move the quirks from
FINAL to HEADER so they are run before pci_vpd_init().
Link: https://lore.kernel.org/r/cc4a6538-557a-294d-4f94-e6d1d3c91589@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that struct pci_vpd is really small, simplify the code by embedding
struct pci_vpd directly in struct pci_dev instead of dynamically allocating
it.
Link: https://lore.kernel.org/r/d898489e-22ba-71f1-2f31-f1a78dc15849@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Instead of having a separate flag, use vp->len != 0 as indicator that VPD
validity has been checked. Now vpd->len == PCI_VPD_SZ_INVALID indicates
that VPD is invalid.
Link: https://lore.kernel.org/r/9f777bc7-5316-e1b8-e5d4-f9f609bdb5dd@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some multi-function devices share VPD hardware across functions and don't
work correctly for concurrent VPD accesses to different functions.
Struct pci_vpd_ops was added by 932c435cab ("PCI: Add dev_flags bit to
access VPD through function 0") so that on these devices, VPD accesses to
any function would always go to function 0.
It's easier to just check for the PCI_DEV_FLAGS_VPD_REF_F0 quirk bit in the
two places we need it than to deal with the struct pci_vpd_ops.
Simplify the code by removing struct pci_vpd_ops and removing the indirect
calls.
[bhelgaas: check for !func0_dev earlier, commit log]
Link: https://lore.kernel.org/r/b2532a41-df8b-860f-461f-d5c066c819d0@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Switch the PCI/MSI core to use the new mask/unmask functions. No functional
change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.311207034@linutronix.de
The existing mask/unmask functions are convoluted and generate suboptimal
assembly code.
Provide a new set of functions which will be used in later patches to
replace the exisiting ones.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/875ywetozb.ffs@tglx
msi_mask() is calculating the possible mask bits for MSI per vector
masking.
Rename it to msi_multi_mask() and hand the MSI descriptor pointer into it
to simplify the call sites.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.203905260@linutronix.de
Handling of virtual MSI-X is obfuscated by letting pci_msix_desc_addr()
return NULL and checking the pointer.
Just use msi_desc::msi_attrib.is_virtual at the call sites and get rid of
that pointer check.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.151522318@linutronix.de
Three error exits doing exactly the same ask for a common error exit point.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.098828720@linutronix.de
msi_desc::masked is a misnomer. For MSI it's used to cache the MSI mask
bits when the device supports per vector masking. For MSI-X it's used to
cache the content of the vector control word which contains the mask bit
for the vector.
Replace it with a union of msi_mask and msix_ctrl to make the purpose clear
and fix up the usage sites.
No functional change
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729222543.045993608@linutronix.de
No point in looping over all entries when 64bit addressing mode is enabled
for nothing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20210729222542.992849326@linutronix.de
The PCI core already ensures that the MSI[-X] state is correct when MSI[-X]
is disabled. For MSI the reset state is all entries unmasked and for MSI-X
all vectors are masked.
S390 masks all MSI entries and masks the already masked MSI-X entries
again. Remove it and let the device in the correct state.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20210729222542.939798136@linutronix.de
Multi-MSI uses a single MSI descriptor and there is a single mask register
when the device supports per vector masking. To avoid reading back the mask
register the value is cached in the MSI descriptor and updates are done by
clearing and setting bits in the cache and writing it to the device.
But nothing protects msi_desc::masked and the mask register from being
modified concurrently on two different CPUs for two different Linux
interrupts which belong to the same multi-MSI descriptor.
Add a lock to struct device and protect any operation on the mask and the
mask register with it.
This makes the update of msi_desc::masked unconditional, but there is no
place which requires a modification of the hardware register without
updating the masked cache.
msi_mask_irq() is now an empty wrapper which will be cleaned up in follow
up changes.
The problem goes way back to the initial support of multi-MSI, but picking
the commit which introduced the mask cache is a valid cut off point
(2.6.30).
Fixes: f2440d9acb ("PCI MSI: Refactor interrupt masking code")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de
No point in using the raw write function from shutdown. Preparatory change
to introduce proper serialization for the msi_desc::masked cache.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.674391354@linutronix.de
The comments about preserving the cached state in pci_msi[x]_shutdown() are
misleading as the MSI descriptors are freed right after those functions
return. So there is nothing to restore. Preparatory change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.621609423@linutronix.de
msi_mask_irq() takes a mask and a flags argument. The mask argument is used
to mask out bits from the cached mask and the flags argument to set bits.
Some places invoke it with a flags argument which sets bits which are not
used by the device, i.e. when the device supports up to 8 vectors a full
unmask in some places sets the mask to 0xFFFFFF00. While devices probably
do not care, it's still bad practice.
Fixes: 7ba1930db0 ("PCI MSI: Unmask MSI if setup failed")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.568173099@linutronix.de
Nothing enforces the posted writes to be visible when the function
returns. Flush them even if the flush might be redundant when the entry is
masked already as the unmask will flush as well. This is either setup or a
rare affinity change event so the extra flush is not the end of the world.
While this is more a theoretical issue especially the logic in the X86
specific msi_set_affinity() function relies on the assumption that the
update has reached the hardware when the function returns.
Again, as this never has been enforced the Fixes tag refers to a commit in:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.515188147@linutronix.de
The specification (PCIe r5.0, sec 6.1.4.5) states:
For MSI-X, a function is permitted to cache Address and Data values
from unmasked MSI-X Table entries. However, anytime software unmasks a
currently masked MSI-X Table entry either by clearing its Mask bit or
by clearing the Function Mask bit, the function must update any Address
or Data values that it cached from that entry. If software changes the
Address or Data value of an entry while the entry is unmasked, the
result is undefined.
The Linux kernel's MSI-X support never enforced that the entry is masked
before the entry is modified hence the Fixes tag refers to a commit in:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Enforce the entry to be masked across the update.
There is no point in enforcing this to be handled at all possible call
sites as this is just pointless code duplication and the common update
function is the obvious place to enforce this.
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Reported-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.462096385@linutronix.de
When MSI-X is enabled the ordering of calls is:
msix_map_region();
msix_setup_entries();
pci_msi_setup_msi_irqs();
msix_program_entries();
This has a few interesting issues:
1) msix_setup_entries() allocates the MSI descriptors and initializes them
except for the msi_desc:masked member which is left zero initialized.
2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets
up the MSI interrupts which ends up in pci_write_msi_msg() unless the
interrupt chip provides its own irq_write_msi_msg() function.
3) msix_program_entries() does not do what the name suggests. It solely
updates the entries array (if not NULL) and initializes the masked
member for each MSI descriptor by reading the hardware state and then
masks the entry.
Obviously this has some issues:
1) The uninitialized masked member of msi_desc prevents the enforcement
of masking the entry in pci_write_msi_msg() depending on the cached
masked bit. Aside of that half initialized data is a NONO in general
2) msix_program_entries() only ensures that the actually allocated entries
are masked. This is wrong as experimentation with crash testing and
crash kernel kexec has shown.
This limited testing unearthed that when the production kernel had more
entries in use and unmasked when it crashed and the crash kernel
allocated a smaller amount of entries, then a full scan of all entries
found unmasked entries which were in use in the production kernel.
This is obviously a device or emulation issue as the device reset
should mask all MSI-X table entries, but obviously that's just part
of the paper specification.
Cure this by:
1) Masking all table entries in hardware
2) Initializing msi_desc::masked in msix_setup_entries()
3) Removing the mask dance in msix_program_entries()
4) Renaming msix_program_entries() to msix_update_entries() to
reflect the purpose of that function.
As the masking of unused entries has never been done the Fixes tag refers
to a commit in:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.403833459@linutronix.de
The ordering of MSI-X enable in hardware is dysfunctional:
1) MSI-X is disabled in the control register
2) Various setup functions
3) pci_msi_setup_msi_irqs() is invoked which ends up accessing
the MSI-X table entries
4) MSI-X is enabled and masked in the control register with the
comment that enabling is required for some hardware to access
the MSI-X table
Step #4 obviously contradicts #3. The history of this is an issue with the
NIU hardware. When #4 was introduced the table access actually happened in
msix_program_entries() which was invoked after enabling and masking MSI-X.
This was changed in commit d71d6432e1 ("PCI/MSI: Kill redundant call of
irq_set_msi_desc() for MSI-X interrupts") which removed the table write
from msix_program_entries().
Interestingly enough nobody noticed and either NIU still works or it did
not get any testing with a kernel 3.19 or later.
Nevertheless this is inconsistent and there is no reason why MSI-X can't be
enabled and masked in the control register early on, i.e. move step #4
above to step #1. This preserves the NIU workaround and has no side effects
on other hardware.
Fixes: d71d6432e1 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de
The struct pci_vpd.flag member was used only to communicate between
pci_vpd_wait() and its callers. Remove the flag member and pass the value
directly to pci_vpd_wait() to simplify the code.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/e4ef6845-6b23-1646-28a0-d5c5a28347b6@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reading/writing 4 bytes should be fast enough even on a slow bus, therefore
pci_vpd_wait() doesn't have to be interruptible. Making it uninterruptible
allows to simplify the code.
In addition make VPD writes uninterruptible in general. It's about vital
data, and allowing writes to be interruptible may leave the VPD in an
inconsistent state.
Link: https://lore.kernel.org/r/258bf994-bc2a-2907-9181-2c7a562986d5@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
vpd->len is initialized to PCI_VPD_MAX_SIZE, and if a quirk is used to set
a specific VPD size, then pci_vpd_set_size() sets vpd->valid, resulting in
pci_vpd_size() not being called. Therefore we can remove the old_size
argument. Note that we don't have to check off < PCI_VPD_MAX_SIZE because
that's implicitly done by pci_read_vpd().
Link: https://lore.kernel.org/r/ede36c16-5335-6867-43a1-293641348430@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Previously, if we found any error in the VPD, we returned size 0, which
prevents access to all of VPD. But there may be valid resources in VPD
before the error, and there's no reason to prevent access to those.
"off" covers only VPD resources known to have valid header tags. In case
of error, return "off" (which may be zero if we haven't found any valid
header tags at all).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
VPD consists of a series of Small and Large Resources. Computing the size
of VPD requires only the length of each, which is specified in the generic
tag of each resource. We only expect to see ID_STRING, RO_DATA, and
RW_DATA in VPD, but it's not a problem if it contains other resource types
because all we care about is the size.
Drop the validity checking of Large Resource items.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
VPD is limited in size by the 15-bit VPD Address field in the VPD
Capability. Each resource tag includes a length that determines the
overall size of the resource. Reject any resources that would extend past
the maximum VPD size.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tag for toerh trees/branches to pull from in order to have a stable base
to build off of for the "Allow deferred execution of
iomem_get_mapping()" set of sysfs changes
Link: https://lore.kernel.org/r/20210729233235.1508920-1-kw@linux.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYQ0XAw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymPsQCeLzeQco/wi96/nf2fhKqpAPsBtH4AoLqE8R7F
PDJCjDCLsbwL+7ZC2udo
=Fbxh
-----END PGP SIGNATURE-----
Merge tag 'sysfs_defferred_iomem_get_mapping-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core driver-core-next
sysfs: Allow deferred execution of iomem_get_mapping()
Tag for toerh trees/branches to pull from in order to have a stable base
to build off of for the "Allow deferred execution of
iomem_get_mapping()" set of sysfs changes
Link: https://lore.kernel.org/r/20210729233235.1508920-1-kw@linux.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'sysfs_defferred_iomem_get_mapping-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: Rename struct bin_attribute member to f_mapping
sysfs: Invoke iomem_get_mapping() from the sysfs open callback