This time with bigger changes than usual:
* A new IOMMU driver for the ARM SMMUv3. This IOMMU is pretty
different from SMMUv1 and v2 in that it is configured through
in-memory structures and not through the MMIO register region.
The ARM SMMUv3 also supports IO demand paging for PCI devices
with PRI/PASID capabilities, but this is not implemented in
the driver yet.
* Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
* Introduction of default domains into the IOMMU core code. The
rationale behind this is to move functionalily out of the
IOMMU drivers to common code to get to a unified behavior
between different drivers.
The patches here introduce a default domain for iommu-groups
(isolation groups). A device will now always be attached to a
domain, either the default domain or another domain handled by
the device driver. The IOMMU drivers have to be modified to
make use of that feature. So long the AMD IOMMU driver is
converted, with others to follow.
* Patches for the Intel VT-d drvier to fix DMAR faults that
happen when a kdump kernel boots. When the kdump kernel boots
it re-initializes the IOMMU hardware, which destroys all
mappings from the crashed kernel. As this happens before
the endpoint devices are re-initialized, any in-flight DMA
causes a DMAR fault. These faults cause PCI master aborts,
which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old
kernel and keep the old mappings in place until the device
driver of the new kernel takes over. This emulates the the
behavior without an IOMMU to the best degree possible.
* A couple of other small fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJViSIWAAoJECvwRC2XARrjl+cP/2FXS7SWDq91VFiIZfXfPt8H
C5Ef3OGWCnMzn4MKE1ExkyDhC+AH6pF1s4zi3XfT6b8iOA+DUpa51rxJjixszt31
tQwmvB7hWu4mznGxSN7EA0Pm0l/v3tBAY5BvG598af0aNZFFJ6po+31MyQA5X67+
6xpqLbH/hm4IZhFBOEzZwxuWWsNxlJwwzKqeAjGyqeUhdruRYZiPHWQ17sDjwLM/
QcVvWBb7meOtKv1OCtpzC4sglSk3scbAfEHMEBuDt8cI6OD7/t2VzPXDWWZuXGqK
nRAxCT7NrXvyOnv0xwdn0j5p1FUGipVxvhsGWX7sJsh3UHWm8Q+5rRKFFVI9pm50
QcMjiIMazK5VwcAkDnLoDgSz4Zz6TfHXEOqSJ2vjTPt2VDP/J9zdM2iwHx2ujicI
mIkrtmsBprvAPx6e9jcqiS5L/Xy1y1xewXuGxa5F2XOjqdoXkPqaupjlyrWzrChA
MC8w67FdzjHDPCfIqfIWZpJQj4f1OFQGd3HS5HpkBACxIwCg85gRw4DEMfD/sirO
BL2VM0RO/bB5+4R0AY7UA2VszQvNMqedj1bA4vAbrnXqOh8BI/0GgeoWiBMXhyX1
qvT1jl+cxuCm5tgBOMUGYoRyF+//bH+l78jLsTYaWRtuVzFlkAX6idNvYYK0dmNt
tLII2IIZBk87P3pF4d6A
=Zicw
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This time with bigger changes than usual:
- A new IOMMU driver for the ARM SMMUv3.
This IOMMU is pretty different from SMMUv1 and v2 in that it is
configured through in-memory structures and not through the MMIO
register region. The ARM SMMUv3 also supports IO demand paging for
PCI devices with PRI/PASID capabilities, but this is not
implemented in the driver yet.
- Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
- Introduction of default domains into the IOMMU core code.
The rationale behind this is to move functionalily out of the IOMMU
drivers to common code to get to a unified behavior between
different drivers. The patches here introduce a default domain for
iommu-groups (isolation groups).
A device will now always be attached to a domain, either the
default domain or another domain handled by the device driver. The
IOMMU drivers have to be modified to make use of that feature. So
long the AMD IOMMU driver is converted, with others to follow.
- Patches for the Intel VT-d drvier to fix DMAR faults that happen
when a kdump kernel boots.
When the kdump kernel boots it re-initializes the IOMMU hardware,
which destroys all mappings from the crashed kernel. As this
happens before the endpoint devices are re-initialized, any
in-flight DMA causes a DMAR fault. These faults cause PCI master
aborts, which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old kernel
and keep the old mappings in place until the device driver of the
new kernel takes over. This emulates the the behavior without an
IOMMU to the best degree possible.
- A couple of other small fixes and cleanups"
* tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (69 commits)
iommu/amd: Handle large pages correctly in free_pagetable
iommu/vt-d: Don't disable IR when it was previously enabled
iommu/vt-d: Make sure copied over IR entries are not reused
iommu/vt-d: Copy IR table from old kernel when in kdump mode
iommu/vt-d: Set IRTA in intel_setup_irq_remapping
iommu/vt-d: Disable IRQ remapping in intel_prepare_irq_remapping
iommu/vt-d: Move QI initializationt to intel_setup_irq_remapping
iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
iommu/vt-d: Enable Translation only if it was previously disabled
iommu/vt-d: Don't disable translation prior to OS handover
iommu/vt-d: Don't copy translation tables if RTT bit needs to be changed
iommu/vt-d: Don't do early domain assignment if kdump kernel
iommu/vt-d: Allocate si_domain in init_dmars()
iommu/vt-d: Mark copied context entries
iommu/vt-d: Do not re-use domain-ids from the old kernel
iommu/vt-d: Copy translation tables from old kernel
iommu/vt-d: Detect pre enabled translation
iommu/vt-d: Make root entry visible for hardware right after allocation
iommu/vt-d: Init QI before root entry is allocated
iommu/vt-d: Cleanup log messages
...
Give them a common prefix that can be grepped for and
improve the wording here and there.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Refine the interfaces to create IRQ for DMAR unit. It's a preparation
for converting DMAR IRQ to hierarchical irqdomain on x86.
It also moves dmar_alloc_hwirq()/dmar_free_hwirq() from irq_remapping.h
to dmar.h. They are not irq_remapping specific.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: iommu@lists.linux-foundation.org
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Link: http://lkml.kernel.org/r/1428905519-23704-20-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
According to Intel VT-d specification, _DSM method to support DMAR
hotplug should exist directly under corresponding ACPI object
representing PCI host bridge. But some BIOSes doesn't conform to
this, so search for _DSM method in the subtree starting from the
ACPI object representing the PCI host bridge.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
On Intel platforms, an IO Hub (PCI/PCIe host bridge) may contain DMAR
units, so we need to support DMAR hotplug when supporting PCI host
bridge hotplug on Intel platforms.
According to Section 8.8 "Remapping Hardware Unit Hot Plug" in "Intel
Virtualization Technology for Directed IO Architecture Specification
Rev 2.2", ACPI BIOS should implement ACPI _DSM method under the ACPI
object for the PCI host bridge to support DMAR hotplug.
This patch introduces interfaces to parse ACPI _DSM method for
DMAR unit hotplug. It also implements state machines for DMAR unit
hot-addition and hot-removal.
The PCI host bridge hotplug driver should call dmar_hotplug_hotplug()
before scanning PCI devices connected for hot-addition and after
destroying all PCI devices for hot-removal.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce functions to support dynamic IOMMU seq_id allocating and
releasing, which will be used to support DMAR hotplug.
Also rename IOMMU_UNITS_SUPPORTED as DMAR_UNITS_SUPPORTED.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce helper function dmar_walk_resources to walk resource entries
in DMAR table and ACPI buffer object returned by ACPI _DSM method
for IOMMU hot-plug.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The VT-d specification states that an RMRR entry in the DMAR
table needs to specify the full path to the device. This is
also how newer Linux kernels implement it.
Unfortunatly older drivers just match for the target device
and not the full path to the device, so that BIOS vendors
implement that behavior into their BIOSes to make them work
with older Linux kernels. But those RMRR entries break on
newer Linux kernels.
Work around this issue by adding a fall-back into the RMRR
matching code to match those old RMRR entries too.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Checking adev == NULL is not sufficient as
acpi_bus_get_device() might not touch the value of this
parameter in an error case, so check the return value
directly.
Fixes: ed40356b5f
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.
The following Coccinelle semantic patch was used:
@@
@@
- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
- ACPICA update to upstream version 20140724. That includes
ACPI 5.1 material (support for the _CCA and _DSD predefined names,
changes related to the DMAR and PCCT tables and ARM support among
other things) and cleanups related to using ACPICA's header files.
A major part of it is related to acpidump and the core code used
by that utility. Changes from Bob Moore, David E Box, Lv Zheng,
Sascha Wildner, Tomasz Nowicki, Hanjun Guo.
- Radix trees for memory bitmaps used by the hibernation core from
Joerg Roedel.
- Support for waking up the system from suspend-to-idle (also known
as the "freeze" sleep state) using ACPI-based PCI wakeup signaling
(Rafael J Wysocki).
- Fixes for issues related to ACPI button events (Rafael J Wysocki).
- New device ID for an ACPI-enumerated device included into the
Wildcat Point PCH from Jie Yang.
- ACPI video updates related to backlight handling from Hans de Goede
and Linus Torvalds.
- Preliminary changes needed to support ACPI on ARM from Hanjun Guo
and Graeme Gregory.
- ACPI PNP core cleanups from Arjun Sreedharan and Zhang Rui.
- Cleanups related to ACPI_COMPANION() and ACPI_HANDLE() macros
(Rafael J Wysocki).
- ACPI-based device hotplug cleanups from Wei Yongjun and
Rafael J Wysocki.
- Cleanups and improvements related to system suspend from
Lan Tianyu, Randy Dunlap and Rafael J Wysocki.
- ACPI battery cleanup from Wei Yongjun.
- cpufreq core fixes from Viresh Kumar.
- Elimination of a deadband effect from the cpufreq ondemand
governor and intel_pstate driver cleanups from Stratos Karafotis.
- 350MHz CPU support for the powernow-k6 cpufreq driver from
Mikulas Patocka.
- Fix for the imx6 cpufreq driver from Anson Huang.
- cpuidle core and governor cleanups from Daniel Lezcano,
Sandeep Tripathy and Mohammad Merajul Islam Molla.
- Build fix for the big_little cpuidle driver from Sachin Kamat.
- Configuration fix for the Operation Performance Points (OPP)
framework from Mark Brown.
- APM cleanup from Jean Delvare.
- cpupower utility fixes and cleanups from Peter Senna Tschudin,
Andrey Utkin, Himangi Saraogi, Rickard Strandqvist, Thomas Renninger.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJT4nhtAAoJEILEb/54YlRxtZEP/2rtVQFSFdAW8l0Xm1SeSsl4
EnZpSNT1TFn+NdG23vSIot5Jzdz1/dLfeoJEbXpoVt4DPC9/PK4HPlv5FEDQYfh5
srftvvGcAva969sXzSBRNUeR+M8Yd2RdoYCfmqTEUjzf8GJLL4jC0VAIwMtsQklt
EbiQX8JaHQS7RIql7MDg1N2vaTo+zxkf39Kkcl56usmO/uATP7cAPjFreF/xQ3d8
OyBhz1cOXIhPw7bd9Dv9AgpJzA8WFpktDYEgy2sluBWMv+mLYjdZRCFkfpIRzmea
pt+hJDeAy8ZL6/bjWCzz2x6wG7uJdDLblreI28sgnJx/VHR3Co6u4H1BqUBj18ct
CHV6zQ55WFmx9/uJqBtwFy333HS2ysJziC5ucwmg8QjkvAn4RK8S0qHMfRvSSaHj
F9ejnHGxyrc3zzfsngUf/VXIp67FReaavyKX3LYxjHjMPZDMw2xCtCWEpUs52l2o
fAbkv8YFBbUalIv0RtELH5XnKQ2ggMP8UgvT74KyfXU6LaliH8lEV20FFjMgwrPI
sMr2xk04eS8mNRNAXL8OMMwvh6DY/Qsmb7BVg58RIw6CdHeFJl834yztzcf7+j56
4oUmA16QYBCFA3udGQ3Tb07mi8XTfrMdTOGA0koQG9tjswKXuLUXUk9WAXZe4vml
ItRpZKE86BCs3mLJMYre
=ZODv
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"Again, ACPICA leads the pack (47 commits), followed by cpufreq (18
commits) and system suspend/hibernation (9 commits).
From the new code perspective, the ACPICA update brings ACPI 5.1 to
the table, including a new device configuration object called _DSD
(Device Specific Data) that will hopefully help us to operate device
properties like Device Trees do (at least to some extent) and changes
related to supporting ACPI on ARM.
Apart from that we have hibernation changes making it use radix trees
to store memory bitmaps which should speed up some operations carried
out by it quite significantly. We also have some power management
changes related to suspend-to-idle (the "freeze" sleep state) support
and more preliminary changes needed to support ACPI on ARM (outside of
ACPICA).
The rest is fixes and cleanups pretty much everywhere.
Specifics:
- ACPICA update to upstream version 20140724. That includes ACPI 5.1
material (support for the _CCA and _DSD predefined names, changes
related to the DMAR and PCCT tables and ARM support among other
things) and cleanups related to using ACPICA's header files. A
major part of it is related to acpidump and the core code used by
that utility. Changes from Bob Moore, David E Box, Lv Zheng,
Sascha Wildner, Tomasz Nowicki, Hanjun Guo.
- Radix trees for memory bitmaps used by the hibernation core from
Joerg Roedel.
- Support for waking up the system from suspend-to-idle (also known
as the "freeze" sleep state) using ACPI-based PCI wakeup signaling
(Rafael J Wysocki).
- Fixes for issues related to ACPI button events (Rafael J Wysocki).
- New device ID for an ACPI-enumerated device included into the
Wildcat Point PCH from Jie Yang.
- ACPI video updates related to backlight handling from Hans de Goede
and Linus Torvalds.
- Preliminary changes needed to support ACPI on ARM from Hanjun Guo
and Graeme Gregory.
- ACPI PNP core cleanups from Arjun Sreedharan and Zhang Rui.
- Cleanups related to ACPI_COMPANION() and ACPI_HANDLE() macros
(Rafael J Wysocki).
- ACPI-based device hotplug cleanups from Wei Yongjun and Rafael J
Wysocki.
- Cleanups and improvements related to system suspend from Lan
Tianyu, Randy Dunlap and Rafael J Wysocki.
- ACPI battery cleanup from Wei Yongjun.
- cpufreq core fixes from Viresh Kumar.
- Elimination of a deadband effect from the cpufreq ondemand governor
and intel_pstate driver cleanups from Stratos Karafotis.
- 350MHz CPU support for the powernow-k6 cpufreq driver from Mikulas
Patocka.
- Fix for the imx6 cpufreq driver from Anson Huang.
- cpuidle core and governor cleanups from Daniel Lezcano, Sandeep
Tripathy and Mohammad Merajul Islam Molla.
- Build fix for the big_little cpuidle driver from Sachin Kamat.
- Configuration fix for the Operation Performance Points (OPP)
framework from Mark Brown.
- APM cleanup from Jean Delvare.
- cpupower utility fixes and cleanups from Peter Senna Tschudin,
Andrey Utkin, Himangi Saraogi, Rickard Strandqvist, Thomas
Renninger"
* tag 'pm+acpi-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (118 commits)
ACPI / LPSS: add LPSS device for Wildcat Point PCH
ACPI / PNP: Replace faulty is_hex_digit() by isxdigit()
ACPICA: Update version to 20140724.
ACPICA: ACPI 5.1: Update for PCCT table changes.
ACPICA/ARM: ACPI 5.1: Update for GTDT table changes.
ACPICA/ARM: ACPI 5.1: Update for MADT changes.
ACPICA/ARM: ACPI 5.1: Update for FADT changes.
ACPICA: ACPI 5.1: Support for the _CCA predifined name.
ACPICA: ACPI 5.1: New notify value for System Affinity Update.
ACPICA: ACPI 5.1: Support for the _DSD predefined name.
ACPICA: Debug object: Add current value of Timer() to debug line prefix.
ACPICA: acpihelp: Add UUID support, restructure some existing files.
ACPICA: Utilities: Fix local printf issue.
ACPICA: Tables: Update for DMAR table changes.
ACPICA: Remove some extraneous printf arguments.
ACPICA: Update for comments/formatting. No functional changes.
ACPICA: Disassembler: Add support for the ToUUID opererator (macro).
ACPICA: Remove a redundant cast to acpi_size for ACPI_OFFSET() macro.
ACPICA: Work around an ancient GCC bug.
ACPI / processor: Make it possible to get local x2apic id via _MAT
...
Update table compiler and disassembler for new DMAR fields introduced
in Sept. 2013.
Note that Linux DMAR users need to be updated after applying this change.
[zetalog: changing drivers/iommu/dmar.c accordingly]
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Register our DRHD IOMMUs, cross link devices, and provide a base set
of attributes for the IOMMU. Note that IRQ remapping support parses
the DMAR table very early in boot, well before the iommu_class can
reasonably be setup, so our registration is split between
intel_iommu_init(), which occurs later, and alloc_iommu(), which
typically occurs much earlier, but may happen at any time later
with IOMMU hot-add support.
On a typical desktop system, this provides the following (pruned):
$ find /sys | grep dmar
/sys/devices/virtual/iommu/dmar0
/sys/devices/virtual/iommu/dmar0/devices
/sys/devices/virtual/iommu/dmar0/devices/0000:00:02.0
/sys/devices/virtual/iommu/dmar0/intel-iommu
/sys/devices/virtual/iommu/dmar0/intel-iommu/cap
/sys/devices/virtual/iommu/dmar0/intel-iommu/ecap
/sys/devices/virtual/iommu/dmar0/intel-iommu/address
/sys/devices/virtual/iommu/dmar0/intel-iommu/version
/sys/devices/virtual/iommu/dmar1
/sys/devices/virtual/iommu/dmar1/devices
/sys/devices/virtual/iommu/dmar1/devices/0000:00:00.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:01.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:16.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1a.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1b.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1c.0
...
/sys/devices/virtual/iommu/dmar1/intel-iommu
/sys/devices/virtual/iommu/dmar1/intel-iommu/cap
/sys/devices/virtual/iommu/dmar1/intel-iommu/ecap
/sys/devices/virtual/iommu/dmar1/intel-iommu/address
/sys/devices/virtual/iommu/dmar1/intel-iommu/version
/sys/class/iommu/dmar0
/sys/class/iommu/dmar1
(devices also link back to the dmar units)
This makes address, version, capabilities, and extended capabilities
available, just like printed on boot. I've tried not to duplicate
data that can be found in the DMAR table, with the exception of the
address, which provides an easy way to associate the sysfs device with
a DRHD entry in the DMAR. It's tempting to add scopes and RMRR data
here, but the full DMAR table is already exposed under /sys/firmware/
and therefore already provides a way for userspace to learn such
details.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
__dmar_enable_qi() will initialize free_head,free_tail and
free_cnt for q_inval. Remove the redundant initialization
in dmar_enable_qi().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
ia64 and x86 share this driver. x86 is moving to a different irq
allocation and ia64 keeps its private irq_create/destroy stuff.
Use macros to redirect to one or the other. Yes, macros to avoid
include hell.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: x86@kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20140507154336.372289825@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit "59ce0515cdaf iommu/vt-d: Update DRHD/RMRR/ATSR device scope
caches when PCI hotplug happens" introduces a bug, which fails to
match PCI devices with DMAR device scope entries if PCI path array
in the entry has more than one level.
For example, it fails to handle
[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved : 0000
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02
And cause DMA failure on HP DL980 as:
DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.2] fault addr 7f61e000
Reported-and-tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
If we failed to find an ACPI device to correspond to an ANDD record, we
would fail to increment our pointer and would just process the same record
over and over again, with predictable results.
Turn it from a while() loop into a for() loop to let the 'continue' in
the error paths work correctly.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
As pointed out by Jörg and fixed in commit 11f1a7768 ("iommu/vt-d: Check
for NULL pointer in dmar_acpi_dev_scope_init(), this code path can
bizarrely get exercised even on AMD IOMMU systems with IRQ remapping
enabled.
In addition to the defensive check for NULL which Jörg added, let's also
just avoid calling the function at all if there aren't an Intel IOMMU
units in the system.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When ir_dev_scope_init() is called via a rootfs initcall it
will check for irq_remapping_enabled before it calls
(indirectly) into dmar_acpi_dev_scope_init() which uses the
dmar_tbl pointer without any checks.
The AMD IOMMU driver also sets the irq_remapping_enabled
flag which causes the dmar_acpi_dev_scope_init() function to
be called on systems with AMD IOMMU hardware too, causing a
boot-time kernel crash.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
It's not only for PCI devices any more, and the scope information for an
ACPI device provides the bus and devfn so that has to be stored here too.
It is the device pointer itself which needs to be protected with RCU,
so the __rcu annotation follows it into the definition of struct
dmar_dev_scope, since we're no longer just passing arrays of device
pointers around.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Now we have a PCI bus notification based mechanism to update DMAR
device scope array, we could extend the mechanism to support boot
time initialization too, which will help to unify and simplify
the implementation.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Current Intel DMAR/IOMMU driver assumes that all PCI devices associated
with DMAR/RMRR/ATSR device scope arrays are created at boot time and
won't change at runtime, so it caches pointers of associated PCI device
object. That assumption may be wrong now due to:
1) introduction of PCI host bridge hotplug
2) PCI device hotplug through sysfs interfaces.
Wang Yijing has tried to solve this issue by caching <bus, dev, func>
tupple instead of the PCI device object pointer, but that's still
unreliable because PCI bus number may change in case of hotplug.
Please refer to http://lkml.org/lkml/2013/11/5/64
Message from Yingjing's mail:
after remove and rescan a pci device
[ 611.857095] dmar: DRHD: handling fault status reg 2
[ 611.857109] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff7000
[ 611.857109] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.857524] dmar: DRHD: handling fault status reg 102
[ 611.857534] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff6000
[ 611.857534] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.857936] dmar: DRHD: handling fault status reg 202
[ 611.857947] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff5000
[ 611.857947] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.858351] dmar: DRHD: handling fault status reg 302
[ 611.858362] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff4000
[ 611.858362] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.860819] IPv6: ADDRCONF(NETDEV_UP): eth3: link is not ready
[ 611.860983] dmar: DRHD: handling fault status reg 402
[ 611.860995] dmar: INTR-REMAP: Request device [[86:00.3] fault index a4
[ 611.860995] INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
This patch introduces a new mechanism to update the DRHD/RMRR/ATSR device scope
caches by hooking PCI bus notification.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Global DMA and interrupt remapping resources may be accessed in
interrupt context, so use RCU instead of rwsem to protect them
in such cases.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Introduce a global rwsem dmar_global_lock, which will be used to
protect DMAR related global data structures from DMAR/PCI/memory
device hotplug operations in process context.
DMA and interrupt remapping related data structures are read most,
and only change when memory/PCI/DMAR hotplug event happens.
So a global rwsem solution is adopted for balance between simplicity
and performance.
For interrupt remapping driver, function intel_irq_remapping_supported(),
dmar_table_init(), intel_enable_irq_remapping(), disable_irq_remapping(),
reenable_irq_remapping() and enable_drhd_fault_handling() etc
are called during booting, suspending and resuming with interrupt
disabled, so no need to take the global lock.
For interrupt remapping entry allocation, the locking model is:
down_read(&dmar_global_lock);
/* Find corresponding iommu */
iommu = map_hpet_to_ir(id);
if (iommu)
/*
* Allocate remapping entry and mark entry busy,
* the IOMMU won't be hot-removed until the
* allocated entry has been released.
*/
index = alloc_irte(iommu, irq, 1);
up_read(&dmar_global_lock);
For DMA remmaping driver, we only uses the dmar_global_lock rwsem to
protect functions which are only called in process context. For any
function which may be called in interrupt context, we will use RCU
to protect them in following patches.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Introduce for_each_dev_scope()/for_each_active_dev_scope() to walk
{active} device scope entries. This will help following RCU lock
related patches.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Factor out function dmar_alloc_dev_scope() from dmar_parse_dev_scope()
for later reuse.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Clean up most sparse warnings in Intel DMA and interrupt remapping
drivers.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
If dmar_table_init() fails to detect DMAR table on the first call,
it will return wrong result on following calls because it always
sets dmar_table_initialized no matter if succeeds or fails to
detect DMAR table.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Release associated invalidation queue when destroying IOMMU unit
to avoid memory leak.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Simplify vt-d related code with existing macros and introduce a new
macro for_each_active_drhd_unit() to enumerate all active DRHD unit.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Functions alloc_iommu() and parse_ioapics_under_ir()
are only used internally, so mark them as static.
[Joerg: Made detect_intel_iommu() non-static again for IA64]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Remove dead code from VT-d related files.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Conflicts:
drivers/iommu/dmar.c
Flag irq_remapping_enabled is only set by intel_enable_irq_remapping(),
which is called after detect_intel_iommu(). So moving pr_info() from
detect_intel_iommu() to intel_enable_irq_remapping(), which also
slightly simplifies implementation.
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Function dmar_parse_dev_scope() should release the PCI device reference
count gained in function dmar_parse_one_dev_scope() on error recovery,
otherwise it will cause PCI device object leakage.
This patch also introduces dmar_free_dev_scope(), which will be used
to support DMAR device hotplug.
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Mark the functions check_zero_address() and dmar_get_fault_reason() as
static in dmar.c because they are not used outside this file.
This eliminates the following warnings in dmar.c:
drivers/iommu/dmar.c:491:12: warning: no previous prototype for ‘check_zero_address’ [-Wmissing-prototypes]
drivers/iommu/dmar.c:1116:13: warning: no previous prototype for ‘dmar_get_fault_reason’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This time the updates contain:
* Tracepoints for certain IOMMU-API functions to make
their use easier to debug
* A tracepoint for IOMMU page faults to make it easier
to get them in user space
* Updates and fixes for the new ARM SMMU driver after
the first hardware showed up
* Various other fixes and cleanups in other IOMMU drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJShVQAAAoJECvwRC2XARrj4T8P/2C/aej9QoEZhZRsJbClt7d6
6j6VoAYzFGQ5KKGuIXH/qmJqQKrDhRq7O/dP6XZEFYTDiyAcpLPK9sZ3eovNrur6
xW7TIpewZczEOPY0sz7Hkg90DgP0DnU37fELA0oYoUe55jQ0uZrcXmptcWlQssei
UZ6Cx2x9ebVcvPz2Ge7cNuDT8FXpu2MGNR7FLlh49EarFwMkl/al5oFuTcnmAojO
ypsafA5JBmsjhu3VpiI+VolZMEnYzUtZlIjv44cHw891RL5iQkcxVT/UWC8q3jHW
+OZZci21/MN3X4f2GcQUE5lEJTLX+mcmlGRoDF6B4Lh4n0IZGikyNThZMPRU1Q6x
6Ab76qHhOJtcGnxWcMiEbReUC6oPRFyr8YzTrJJfNp6iTMNgXgISKwL6UV1A7Lha
pZDXjAzREgxe8FbU3JZGfgcMg7WlnN/Y33R5E/UGwXK/MDAL0BCwNV4PBE0LCbtH
2qCzBC3TIWF7NMbIp0GnD8cbJJRO7c1hIZkVNRUwbUXkrMT75CfNIhq3l9xeWAIF
ooXvNa+MO0uJPQ/0IAJc5+AGBEEiPnvXEnp8XfTWE6S8dkA26LokKslI6fsZE27s
r0P+dHhc1OiHsIAngYqetWXZ/OdeNMfeBhIWeiKj2VKrT8MG8e/tdO9ICAHkVQSt
dnUAmLQqyR41hcI5hFEu
=IWTK
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This time the updates contain:
- Tracepoints for certain IOMMU-API functions to make their use
easier to debug
- A tracepoint for IOMMU page faults to make it easier to get them in
user space
- Updates and fixes for the new ARM SMMU driver after the first
hardware showed up
- Various other fixes and cleanups in other IOMMU drivers"
* tag 'iommu-updates-v3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (26 commits)
iommu/shmobile: Enable the driver on all ARM platforms
iommu/tegra-smmu: Staticize tegra_smmu_pm_ops
iommu/tegra-gart: Staticize tegra_gart_pm_ops
iommu/vt-d: Use list_for_each_entry_safe() for dmar_domain->devices traversal
iommu/vt-d: Use for_each_drhd_unit() instead of list_for_each_entry()
iommu/vt-d: Fixed interaction of VFIO_IOMMU_MAP_DMA with IOMMU address limits
iommu/arm-smmu: Clear global and context bank fault status registers
iommu/arm-smmu: Print context fault information
iommu/arm-smmu: Check for num_context_irqs > 0 to avoid divide by zero exception
iommu/arm-smmu: Refine check for proper size of mapped region
iommu/arm-smmu: Switch to subsys_initcall for driver registration
iommu/arm-smmu: use relaxed accessors where possible
iommu/arm-smmu: replace devm_request_and_ioremap by devm_ioremap_resource
iommu: Remove stack trace from broken irq remapping warning
iommu: Change iommu driver to call io_page_fault trace event
iommu: Add iommu_error class event to iommu trace
iommu/tegra: gart: cleanup devm_* functions usage
iommu/tegra: Print phys_addr_t using %pa
iommu: No need to pass '0x' when '%pa' is used
iommu: Change iommu driver to call unmap trace event
...
Use for_each_drhd_unit() instead of list_for_each_entry for
better readability.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This patch updates DMAR table header definitions as such enhancement
has been made in ACPICA upstream already. It ports that change to
the Linux source to reduce source code differences between Linux and
ACPICA upstream.
Build test done on x86-64 machine with the following configs enabled:
CONFIG_DMAR_TABLE
CONFIG_IRQ_REMAP
CONFIG_INTEL_IOMMU
This patch does not affect the generation of the Linux kernel binary.
[rjw: Changelog]
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In intel vt-d spec , chapter 8.1 , DMA Remapping Reporting Structure.
In the end of the table, it says:
Remapping Structures[]
-
A list of structures. The list will contain one or
more DMA Remapping Hardware Unit Definition
(DRHD) structures, and zero or more Reserved
Memory Region Reporting (RMRR) and Root Port
ATS Capability Reporting (ATSR) structures.
These structures are described below.
So, there should be at least one DRHD structure in DMA Remapping
reporting table. If there is no DRHD found, a warning is necessary.
Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
ACPI_DMAR_SCOPE_TYPE_HPET is parsed by ir_parse_ioapic_hpet_scope() and
should not be flagged as an unsupported type.
Signed-off-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This patch disables translation(dma-remapping) before its initialization
if it is already enabled.
This is needed for kexec/kdump boot. If dma-remapping is enabled in the
first kernel, it need to be disabled before initializing its page table
during second kernel boot. Wei Hu also reported that this is needed
when second kernel boots with intel_iommu=off.
Basically iommu->gcmd is used to know whether translation is enabled or
disabled, but it is always zero at boot time even when translation is
enabled since iommu->gcmd is initialized without considering such a
case. Therefor this patch synchronizes iommu->gcmd value with global
command register when iommu structure is allocated.
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>