Store the domain information for the device, only if it's not already
attached to a domain.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
is_power_of_2 requires an unsigned long parameter which would
lead to truncation of 64 bit values on 32 bit architectures.
__ffs also expects an unsigned long parameter thus won't work
for 64 bit values on 32 bit architectures.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If somebody attempts to check the capability of an IOMMU domain prior to
device attach, then we'll try to dereference a NULL SMMU pointer through
the SMMU domain (since we can't determine the actual SMMU instance until
we have a device attached).
This patch fixes the capability check so that non-global features are
reported as being absent when no device is attached to the domain.
Signed-off-by: Will Deacon <will.deacon@arm.com>
AMD-Vi support for IOMMU sysfs. This allows us to associate devices
with a specific IOMMU device and examine the capabilities and features
of that IOMMU. The AMD IOMMU is hosted on and actual PCI device, so
we make that device the parent for the IOMMU class device. This
initial implementaiton exposes only the capability header and extended
features register for the IOMMU.
# find /sys | grep ivhd
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:00.0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:02.0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:04.0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:09.0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:11.0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:12.0
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:12.2
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/devices/0000:00:13.0
...
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/power
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/power/control
...
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/device
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/subsystem
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/amd-iommu
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/amd-iommu/cap
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/amd-iommu/features
/sys/devices/pci0000:00/0000:00:00.2/iommu/ivhd0/uevent
/sys/class/iommu/ivhd0
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Register our DRHD IOMMUs, cross link devices, and provide a base set
of attributes for the IOMMU. Note that IRQ remapping support parses
the DMAR table very early in boot, well before the iommu_class can
reasonably be setup, so our registration is split between
intel_iommu_init(), which occurs later, and alloc_iommu(), which
typically occurs much earlier, but may happen at any time later
with IOMMU hot-add support.
On a typical desktop system, this provides the following (pruned):
$ find /sys | grep dmar
/sys/devices/virtual/iommu/dmar0
/sys/devices/virtual/iommu/dmar0/devices
/sys/devices/virtual/iommu/dmar0/devices/0000:00:02.0
/sys/devices/virtual/iommu/dmar0/intel-iommu
/sys/devices/virtual/iommu/dmar0/intel-iommu/cap
/sys/devices/virtual/iommu/dmar0/intel-iommu/ecap
/sys/devices/virtual/iommu/dmar0/intel-iommu/address
/sys/devices/virtual/iommu/dmar0/intel-iommu/version
/sys/devices/virtual/iommu/dmar1
/sys/devices/virtual/iommu/dmar1/devices
/sys/devices/virtual/iommu/dmar1/devices/0000:00:00.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:01.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:16.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1a.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1b.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1c.0
...
/sys/devices/virtual/iommu/dmar1/intel-iommu
/sys/devices/virtual/iommu/dmar1/intel-iommu/cap
/sys/devices/virtual/iommu/dmar1/intel-iommu/ecap
/sys/devices/virtual/iommu/dmar1/intel-iommu/address
/sys/devices/virtual/iommu/dmar1/intel-iommu/version
/sys/class/iommu/dmar0
/sys/class/iommu/dmar1
(devices also link back to the dmar units)
This makes address, version, capabilities, and extended capabilities
available, just like printed on boot. I've tried not to duplicate
data that can be found in the DMAR table, with the exception of the
address, which provides an easy way to associate the sysfs device with
a DRHD entry in the DMAR. It's tempting to add scopes and RMRR data
here, but the full DMAR table is already exposed under /sys/firmware/
and therefore already provides a way for userspace to learn such
details.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IOMMUs currently have no common representation to userspace, most
seem to have no representation at all aside from a few printks
on bootup. There are however features of IOMMUs that are useful
to know about. For instance the IOMMU might support superpages,
making use of processor large/huge pages more important in a device
assignment scenario. It's also useful to create cross links between
devices and IOMMU hardware units, so that users might be able to
load balance their devices to avoid thrashing a single hardware unit.
This patch adds a device create and destroy interface as well as
device linking, making it very lightweight for an IOMMU driver to add
basic support. IOMMU drivers can provide additional attributes
automatically by using an attribute_group.
The attributes exposed are expected to be relatively device specific,
the means to retrieve them certainly are, so there are currently no
common attributes for the new class created here.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The single helper here no longer has any users.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Drop custom code and use IOMMU provided grouping support for PCI.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Varun Sethi <varun.sethi@freescale.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
VT-d code currently makes use of pci_find_upstream_pcie_bridge() in
order to find the topology based alias of a device. This function has
a few problems. First, it doesn't check the entire alias path of the
device to the root bus, therefore if a PCIe device is masked upstream,
the wrong result is produced. Also, it's known to get confused and
give up when it crosses a bridge from a conventional PCI bus to a PCIe
bus that lacks a PCIe capability. The PCI-core provided DMA alias
support solves both of these problems and additionally adds support
for DMA function quirks allowing VT-d to work with devices like
Marvell and Ricoh with known broken requester IDs.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IOMMU code now provides a common interface for finding or
creating an IOMMU group for a device on PCI buses. Make use of it
and remove piles of code.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The common iommu_group_get_for_dev() allows us to greatly simplify
our group lookup for a new device. Also, since we insert IVRS
aliases into the PCI DMA alias quirks, we should alway come up with
the same results as the existing code.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD-Vi already has a concept of an alias provided via the IVRS table.
Now that PCI-core also understands aliases, we need to incorporate
both aspects when programming the IOMMU. IVRS is generally quite
reliable, so we continue to prefer it when an alias is present. For
cases where we have an IVRS alias that does not match the PCI alias
or where PCI does not report an alias, report the mismatch to allow
us to collect more quirks and dynamically incorporate the alias into
the device alias quirks where possible.
This should allow AMD-Vi to work with devices like Marvell and Ricoh
with DMA function alias quirks unknown to the BIOS.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently each IOMMU driver that supports IOMMU groups has its own
code for discovering the base device used in grouping. This code
is generally not specific to the IOMMU hardware, but to the bus of
the devices managed by the IOMMU. We can therefore create a common
interface for supporting devices on different buses.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
suppress compiler warnings:
drivers/iommu/intel-iommu.c: In function ‘device_to_iommu’:
drivers/iommu/intel-iommu.c:673: warning: ‘segment’ may be used uninitialized in this function
drivers/iommu/intel-iommu.c: In function ‘get_domain_for_dev.clone.3’:
drivers/iommu/intel-iommu.c:2217: warning: ‘bridge_bus’ may be used uninitialized in this function
drivers/iommu/intel-iommu.c:2217: warning: ‘bridge_devfn’ may be used uninitialized in this function
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use inline function dma_pte_superpage() instead of macro for
better readability.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Alloc_domain() will initialize domain->nid to -1. So the
initialization for domain->nid in md_domain_init() is redundant,
clear it.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
__dmar_enable_qi() will initialize free_head,free_tail and
free_cnt for q_inval. Remove the redundant initialization
in dmar_enable_qi().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use list_for_each_entry_safe() instead of list_entry()
to simplify code.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
For an SMMU that supports both Stage-1 and Stage-2 mappings (but not
nested translation), then we should prefer stage-1 mappings as we
otherwise rely on the memory attributes of the incoming transactions
for IOMMU_CACHE mappings.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMU driver has supported chained SMMUs (i.e. SMMUs connected
back-to-back in series) via the smmu-parent property in device tree.
This was in anticipation of somebody building such a configuration,
however that seems not to be the case.
This patch removes the unused chained SMMU hack from the driver. We can
consider adding it back later if somebody decided they need it, but for
the time being it's just pointless mess that we're carrying in mainline.
Removal of the feature also makes migration to the generic IOMMU bindings
easier.
Signed-off-by: Will Deacon <will.deacon@arm.com>
MSIs are just seen as bog standard memory writes by the ARM SMMU, so
they can be translated (and isolated) in the same way.
This patch adds the IOMMU_CAP_INTR_REMAP capability to the ARM SMMU
driver and reworks our capabaility code so that we don't assume the
caps are organised as bits in a bitmask (since this isn't the intention).
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch extends the ARM SMMU driver so that it can handle PCI master
devices in addition to platform devices described in the device tree.
The driver is informed about the PCI host controller in the DT via a
phandle to the host controller in the mmu-masters property. The host
controller is then added to the master tree for that SMMU, just like a
normal master (although it probably doesn't advertise any StreamIDs).
When a device is added to the PCI bus, we set the archdata.iommu pointer
for that device to describe its StreamID (actually its RequesterID for
the moment). This allows us to re-use our existing data structures using
the host controller of_node for everything apart from StreamID
configuration, where we reach into the archdata for the information we
require.
Cc: Varun Sethi <varun.sethi@freescale.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
T0SZ controls the input address range for TTBR0, so use the input
address range rather than the output address range for the calculation.
For stage-2, this means using the output size of stage-1.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit e79df31 introduced mmu_notifer_count to protect
against parallel mmu_notifier_invalidate_range_start/end
calls. The patch left a small race condition when
invalidate_range_end() races with a new
invalidate_range_start() the empty page-table may be
reverted leading to stale TLB entries in the IOMMU and the
device. Use a spin_lock instead of just an atomic variable
to eliminate the race.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Function dmar_iommu_notify_scope_dev() makes a wrong assumption that
there's one RMRR for each PCI device at most, which causes DMA failure
on some HP platforms. So enhance dmar_iommu_notify_scope_dev() to
handle multiple RMRRs for the same PCI device.
Fixbug: https://bugzilla.novell.com/show_bug.cgi?id=879482
Cc: <stable@vger.kernel.org> # 3.15
Reported-by: Tom Mingarelli <thomas.mingarelli@hp.com>
Tested-by: Linda Knippers <linda.knippers@hp.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The changes include:
* A new IOMMU driver for ARM Renesas SOCs
* Updates and fixes for the ARM Exynos driver to bring it closer
to a usable state again
* Convert the AMD IOMMUv2 driver to use the
mmu_notifier->release call-back instead of the task_exit
notifier
* Random other fixes and minor improvements to a number of other
IOMMU drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTkI/xAAoJECvwRC2XARrjgPgQANk/62drPe+Am7YIftYWgaMM
e2i5PruVtDfHYo/KhV3lqPr3/GWDKuD6zrrymEYNLoq1T2GH2XtjRvJZZrzXvApO
jEOj9Lf35JQgWnjh1IhFBq2N8baX9UFhNLBUkqBT4+CFQrrqARXk1pZG56VtYjUg
PRn3HCHatEyN/o24tLTpXymWGjH6Z1jJQ8LLFL1/woU4nZRVHSA6HATIx1Ytv+D3
MQTy+r7M+tphT2moeJiiWo2wX98XZH/lM7+4kuc94+7CHiJjnc9rvG+Tt/cp82cL
kSf7EYnW7rEnN1Tr1unFgOkdX8GhIK1Pkm1QiJ5mfIsXdFeRVj66NBpuQhaAXfPU
XhISkbl5K6YqjxLCpbId8KSbonxFfk9sO+LILrjWj6x/YsWpns7LP1OPUbbwJnXh
ncsn/goeCFU5M1JO9AHm2XbrDdumTUceAtgotVRQwo6GDkAd7ReVb+6jj1MND7L7
hol8UbHZKvf41msGILqIDsVTotQvzd1fQg7hl9DLcM+mRyhJo7dgTlq8GcMcYc40
3v+aDFaD1BOtgQ2VMdCiaJ2UqJNDlhC8827wEwqvIlnPXStOOgdErPNKUvr16jYV
qAZsehdFIQWylve528CtR05bG/VuzaldMo4xixktC0wj3zc2gxH1BqNmZU1zluES
qNOsm/wjtaqi1we+DLFu
=1DJL
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into next
Pull IOMMU updates from Joerg Roedel:
"The changes include:
- a new IOMMU driver for ARM Renesas SOCs
- updates and fixes for the ARM Exynos driver to bring it closer to a
usable state again
- convert the AMD IOMMUv2 driver to use the mmu_notifier->release
call-back instead of the task_exit notifier
- random other fixes and minor improvements to a number of other
IOMMU drivers"
* tag 'iommu-updates-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (54 commits)
iommu/msm: Use devm_ioremap_resource to simplify code
iommu/amd: Fix recently introduced compile warnings
arm/ipmmu-vmsa: Fix compile error
iommu/exynos: Fix checkpatch warning
iommu/exynos: Fix trivial typo
iommu/exynos: Remove invalid symbol dependency
iommu: fsl_pamu.c: Fix for possible null pointer dereference
iommu/amd: Remove duplicate checking code
iommu/amd: Handle parallel invalidate_range_start/end calls correctly
iommu/amd: Remove IOMMUv2 pasid_state_list
iommu/amd: Implement mmu_notifier_release call-back
iommu/amd: Convert IOMMUv2 state_table into state_list
iommu/amd: Don't access IOMMUv2 state_table directly
iommu/ipmmu-vmsa: Support clearing mappings
iommu/ipmmu-vmsa: Remove stage 2 PTE bits definitions
iommu/ipmmu-vmsa: Support 2MB mappings
iommu/ipmmu-vmsa: Rewrite page table management
iommu/ipmmu-vmsa: PMD is never folded, PUD always is
iommu/ipmmu-vmsa: Set the PTE contiguous hint bit when possible
iommu/ipmmu-vmsa: Define driver-specific page directory sizes
...
Merge misc updates from Andrew Morton:
- a few fixes for 3.16. Cc'ed to stable so they'll get there somehow.
- various misc fixes and cleanups
- most of the ocfs2 queue. Review is slow...
- most of MM. The MM queue is pretty huge this time, but not much in
the way of feature work.
- some tweaks under kernel/
- printk maintenance work
- updates to lib/
- checkpatch updates
- tweaks to init/
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (276 commits)
fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init
fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul
init/main.c: remove an ifdef
kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND
init/main.c: add initcall_blacklist kernel parameter
init/main.c: don't use pr_debug()
fs/binfmt_flat.c: make old_reloc() static
fs/binfmt_elf.c: fix bool assignements
fs/efs: convert printk(KERN_DEBUG to pr_debug
fs/efs: add pr_fmt / use __func__
fs/efs: convert printk to pr_foo()
scripts/checkpatch.pl: device_initcall is not the only __initcall substitute
checkpatch: check stable email address
checkpatch: warn on unnecessary void function return statements
checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar);
checkpatch: add warning for kmalloc/kzalloc with multiply
checkpatch: warn on #defines ending in semicolon
checkpatch: make --strict a default for files in drivers/net and net/
checkpatch: always warn on missing blank line after variable declaration block
checkpatch: fix wildcard DT compatible string checking
...
This adds support for the DMA Contiguous Memory Allocator for
intel-iommu. This change enables dma_alloc_coherent() to allocate big
contiguous memory.
It is achieved in the same way as nommu_dma_ops currently does, i.e.
trying to allocate memory by dma_alloc_from_contiguous() and
alloc_pages() is used as a fallback.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Don Dutile <ddutile@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull core irq updates from Thomas Gleixner:
"The irq department delivers:
- Another tree wide update to get rid of the horrible create_irq
interface along with its even more horrible variants. That also
gets rid of the last leftovers of the initial sparse irq hackery.
arch/driver specific changes have been either acked or ignored.
- A fix for the spurious interrupt detection logic with threaded
interrupts.
- A new ARM SoC interrupt controller
- The usual pile of fixes and improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
Documentation: brcmstb-l2: Add Broadcom STB Level-2 interrupt controller binding
irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller
genirq: Improve documentation to match current implementation
ARM: iop13xx: fix msi support with sparse IRQ
genirq: Provide !SMP stub for irq_set_affinity_notifier()
irqchip: armada-370-xp: Move the devicetree binding documentation
irqchip: gic: Use mask field in GICC_IAR
genirq: Remove dynamic_irq mess
ia64: Use irq_init_desc
genirq: Replace dynamic_irq_init/cleanup
genirq: Remove irq_reserve_irq[s]
genirq: Replace reserve_irqs in core code
s390: Avoid call to irq_reserve_irqs()
s390: Remove pointless arch_show_interrupts()
s390: pci: Check return value of alloc_irq_desc() proper
sh: intc: Remove pointless irq_reserve_irqs() invocation
x86, irq: Remove pointless irq_reserve_irqs() call
genirq: Make create/destroy_irq() ia64 private
tile: Use SPARSE_IRQ
tile: pci: Use irq_alloc/free_hwirq()
...
Here is the "big" pull request for 3.16-rc1.
Not a lot of changes here, some kernfs work, a revert of a very old
driver core change that ended up cauing some memory leaks on driver
probe error paths, and other minor things.
As was pointed out earlier today, one commit here,
26fc9cd200 (kernfs: move the last
knowledge of sysfs out from kernfs) is also needed in your 3.15-final
branch as well. If you could cherry-pick it there, it would be most
appreciated by Andy Lutomirski to prevent a regression there.
All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iEYEABECAAYFAlONV9YACgkQMUfUDdst+yn0sQCfWWYg1oVXyu6f0uJjYbVBFkpD
UHgAoJxxfwTZJq/xYrnk6+RqUowIsUlh
=ojAS
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core into next
Pull driver core / kernfs changes from Greg KH:
"Here is the "big" pull request for 3.16-rc1.
Not a lot of changes here, some kernfs work, a revert of a very old
driver core change that ended up cauing some memory leaks on driver
probe error paths, and other minor things.
As was pointed out earlier today, one commit here, 26fc9cd200
("kernfs: move the last knowledge of sysfs out from kernfs") is also
needed in your 3.15-final branch as well. If you could cherry-pick it
there, it would be most appreciated by Andy Lutomirski to prevent a
regression there.
All of these have been in linux-next for a while"
* tag 'driver-core-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
crypto/nx/nx-842: dev_set_drvdata can no longer fail
kernfs: move the last knowledge of sysfs out from kernfs
sysfs: fix attribute_group bin file path on removal
sysfs.h: don't return a void-valued expression in sysfs_remove_file
init.h: Update initcall_sync variants to fix build errors
driver core: Inline dev_set/get_drvdata
driver core: dev_get_drvdata: Don't check for NULL dev
driver core: dev_set_drvdata returns void
driver core: dev_set_drvdata can no longer fail
driver core: Move driver_data back to struct device
lib/devres.c: fix checkpatch warnings
lib/devres.c: use dev in devm_request_and_ioremap
kobject: Make support for uevent_helper optional.
kernfs: make kernfs_notify() trigger inotify events too
kernfs: implement kernfs_root->supers list
Use devm_ioremap_resource() to make the code simpler, drop unused variable,
redundant return value check, and error-handing code.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix two compile warnings about unused variables introduced
by commit ecef115.
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
So there is no point in checking its return value, which will soon
disappear.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function arm_iommu_create_mapping lost the order
parameter. Remove it from this IOMMU driver too to make it
build.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Silences the following type of warnings:
WARNING: Missing a blank line after declarations
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
EXYNOS_DEV_SYSMMU symbol is not defined anywhere and prevents building
the Exynos driver. Remove it.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is otherwise a risk of a possible null pointer dereference.
Was largely found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Reviewed-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Acked-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
amd_iommu_rlookup_table[devid] != NULL is already guaranteed
by check_device called before, it's fine to attach device at
this point.
Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add a counter to the pasid_state so that we do not restore
the original page-table before all invalidate_range_start
to invalidate_range_end sections have finished.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This list was only used for the task_exit notifier function.
Now that it is gone we can remove it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Since mmu_notifier call-backs can sleep (because they use
SRCU now) we can use them to tear down PASID mappings. This
allows us to finally remove the hack to use the task_exit
notifier from oprofile to get notified when a process dies.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Jay Cornwall <Jay.Cornwall@amd.com>
The state_table consumes 512kb of memory and is only sparsly
populated. Convert it into a list to save memory. There
should be no measurable performance impact.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Jay Cornwall <Jay.Cornwall@amd.com>
This is a preparation for converting the state_table into a
state_list.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Add support for 2MB block mappings at the PMD level.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IOMMU core will only call us with page sizes advertized as supported
by the driver. We can thus simplify the code by removing loops over PGD
and PMD entries.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The driver only supports the 3-level long descriptor format that has no
PUD and always has a PMD.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The contiguous hint bit signals to the IOMMU that a range of 16 PTEs
refer to physically contiguous memory. It improves performances by
dividing the number of TLB lookups by 16, effectively implementing 64kB
page sizes.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The PTRS_PER_(PUD|PGD|PMD|PTE) macros evaluate to different values
depending on whether LPAE is enabled. The IPMMU driver uses a long
descriptor format regardless of LPAE, making those macros mismatch the
IPMMU configuration on non-LPAE systems.
Replace the macros by driver-specific versions that always evaluate to
the right value.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cache the micro-TLB number in archdata allocated in the .add_device
handler instead of looking it up when the deviced is attached and
detached. This simplifies the .attach_dev and .detach_dev operations and
prepares for DT support.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
ia64 and x86 share this driver. x86 is moving to a different irq
allocation and ia64 keeps its private irq_create/destroy stuff.
Use macros to redirect to one or the other. Yes, macros to avoid
include hell.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: x86@kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20140507154336.372289825@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The create_irq variants are going away. Use the new interface. The
core and arch code already excludes the gsi interrupts from the
allocation, so no functional change.
This does not replace the requirement to move x86 to irq domains, but
it limits the mess to some degree.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: x86@kernel.org
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20140507154334.741805075@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch contains 2 workaround for the System MMU v3.x.
System MMU v3.2 and v3.3 has FLPD cache that caches first level page
table entries to reduce page table walking latency. However, the
FLPD cache is filled with a first level page table entry even though
it is not accessed by a master H/W because System MMU v3.3
speculatively prefetches page table entries that may be accessed
in the near future by the master H/W.
The prefetched FLPD cache entries are not invalidated by iommu_unmap()
because iommu_unmap() only unmaps and invalidates the page table
entries that is mapped.
Because exynos-iommu driver discards a second level page table when
it needs to be replaced with another second level page table or
a first level page table entry with 1MB mapping, It is required to
invalidate FLPD cache that may contain the first level page table
entry that points to the second level page table.
Another workaround of System MMU v3.3 is initializing the first level
page table entries with the second level page table which is filled
with all zeros. This prevents System MMU prefetches 'fault' first
level page table entry which may lead page fault on access to 16MiB
wide.
System MMU 3.x fetches consecutive page table entries by a page
table walking to maximize bus utilization and to minimize TLB miss
panelty.
Unfortunately, functional problem is raised with the fetching behavior
because it fetches 'fault' page table entries that specifies no
translation information and that a valid translation information will
be written to in the near future. The logic in the System MMU generates
page fault with the cached fault entries that is no longer coherent
with the page table which is updated.
There is another workaround that must be implemented by I/O virtual
memory manager: any two consecutive I/O virtual memory area must have
a hole between the two that is larger than or equal to 128KiB.
Also, next I/O virtual memory area must be started from the next
128KiB boundary.
0 128K 256K 384K 512K
|-------------|---------------|-----------------|----------------|
|area1---------------->|.........hole...........|<--- area2 -----
The constraint is depicted above.
The size is selected by the calculation followed:
- System MMU can fetch consecutive 64 page table entries at once
64 * 4KiB = 256KiB. This is the size between 128K ~ 384K of the
above picture. This style of fetching is 'block fetch'. It fetches
the page table entries predefined consecutive page table entries
including the entry that is the reason of the page table walking.
- System MMU can prefetch upto consecutive 32 page table entries.
This is the size between 256K ~ 384K.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This turns on FLPD_CACHE, ACGEN and SYSSEL.
FLPD_CACHE is a cache of 1st level page table entries that contains
the address of a 2nd level page table to reduce latency of page table
walking.
ACGEN is architectural clock gating that gates clocks by System MMU
itself if it is not active. Note that ACGEN is different from clock
gating by the CPU. ACGEN just gates clocks to the internal logic of
System MMU while clock gating by the CPU gates clocks to the System
MMU.
SYSSEL selects System MMU version in some Exynos SoCs. Some Exynos
SoCs have an option to select System MMU versions exclusively because
the SoCs adopts new System MMU version experimentally.
This also always selects LRU as TLB replacement policy. Selecting TLB
replacement policy is deprecated from System MMU 3.2. TLB in System
MMU 3.3 has single TLB replacement policy, LRU. The bit of MMU_CFG
selecting TLB replacement policy is remained as reserved.
QoS value of page table walking is set to 15 (highst value). System
MMU 3.3 can inherit QoS value of page table walking from its master
H/W's transaction. This new feature is enabled by default and QoS
value written to MMU_CFG is ignored.
This patch also adds simplifies the sysmmu version checking by
introducing some macros.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This commit adds device tree support for System MMU.
Also, system mmu handling is improved. Previously, an IOMMU domain is
bound to a System MMU which is not correct. This patch binds an IOMMU
domain with the master device of a System MMU.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Some redundant error message is removed and some error messages
are changed to error level from debug level.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Patch written by Antonios Motakis <a.motakis@virtualopensystems.com>:
IOMMU groups are expected by certain users of the IOMMU API,
e.g. VFIO. Since each device is behind its own System MMU, we
can allocate a new IOMMU group for each device.
Reviewed-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Shaik Ameeer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This commit introduces sysmmu_pte_t for page table entries and
sysmmu_iova_t vor I/O virtual address that is manipulated by
exynos-iommu driver. The purpose of the typedef is to remove
dependencies to the driver code from the change of CPU architecture
from 32 bit to 64 bit.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since acquiring read_lock is not more frequent than write_lock, it is
not beneficial to use rwlock, this commit changes rwlock to spinlock.
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This commit removes custom fault handler. The device drivers that
need to register fault handler can register
with iommu_set_fault_handler().
CC: Grant Grundler <grundler@chromium.org>
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch gates clocks of master H/W as well as clocks of System MMU
if master clocks are specified.
Some Exynos SoCs (i.e. GScalers in Exynos5250) have dependencies in
the gating clocks of master H/W and its System MMU. If a H/W is the
case, accessing control registers of System MMU is prohibited unless
both of the gating clocks of System MMU and its master H/W.
CC: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch removes dbgname member from sysmmu_drvdata structure.
Kernel message for debugging already has the name of a single
System MMU node. It also removes some compilation warnings.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Checking if the probing device has a parent device was just to discover
if the probing device is involved in a power domain when the power
domain controlled by Samsung's custom implementation.
Since generic IO power domain is applied, it is required to remove
the condition to see if the probing device has a parent device.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This commit adds cache flush for removed small and large page entries
in exynos_iommu_unmap(). Missing cache flush of removed page table
entries can cause missing page fault interrupt when a master IP
accesses an unmapped area.
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Tested-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Prefetch buffer is a cache of System MMU 3.x and caches a block of
page table entries to make effect of larger page with small pages.
However, how to control prefetch buffers and the specifications of
prefetch buffers different from minor versions of System MMU v3.
Prefetch buffers must be controled with care because there are some
restrictions in H/W design.
The interface and implementation to initiate prefetch buffers will
be prepared later.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
L2TLB is 8-way set-associative TLB with 512 entries. The number of
sets is 64.
A single 4KB(small page) translation information is cached
only to a set whose index is the same with the lower 6 bits of the page
frame number.
A single 64KB(large page) translation information can be
cached to any 16 sets whose top two bits of their indices are the same
with the bit [5:4] of the page frame number.
A single 1MB(section) or larger translation information can be cached to
any set in the TLB.
It is required to invalidate entire sets that may cache the target
translation information to guarantee that the L2TLB has no stale data.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since kmalloc() does not guarantee that the allignment of 1KiB when it
allocates 1KiB, it is required to allocate lv2 page table from own
slab that guarantees alignment of 1KiB
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch changes not to panic on any error when updating page table.
Instead prints error messages with callstack.
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 25e9d28d92 (ARM: EXYNOS: remove system mmu initialization from
exynos tree) removed arch/arm/mach-exynos/mach/sysmmu.h header without
removing remaining use of it from exynos-iommu driver, thus causing a
compilation error.
This patch fixes the error by removing respective include line
from exynos-iommu.c.
Use of __pa and __va macro is changed to virt_to_phys and phys_to_virt
which are recommended in driver code. printk formatting of physical
address is also fixed to %pa.
Also System MMU driver is changed to control only a single instance
of System MMU at a time. Since a single instance of System MMU has only
a single clock descriptor for its clock gating, single address range
for control registers, there is no need to obtain two or more clock
descriptors and ioremaped region.
CC: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Cho KyongHo <pullip.cho@samsung.com>
Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
set_device_exclusion_range(u16 devid, struct ivmd_header *m) enables
exclusion range for ONE device. IOMMU does not translate the access
to the exclusion range from the device.
The device is specified by input argument 'devid'. But 'devid' is not
passed to the actual set function set_dev_entry_bit(), instead
'm->devid' is passed. 'm->devid' does not specify the exact device
which needs enable the exclusion range. 'm->devid' represents DeviceID
field of IVMD, which has different meaning depends on IVMD type.
The caller init_exclusion_range() sets 'devid' for ONE device. When
m->type is equal to ACPI_IVMD_TYPE_ALL or ACPI_IVMD_TYPE_RANGE,
'm->devid' is not equal to 'devid'.
This patch fixes 'm->devid' to 'devid'.
Signed-off-by: Su Friendy <friendy.su@sony.com.cn>
Signed-off-by: Tamori Masahiro <Masahiro.Tamori@jp.sony.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
get_user_pages requires caller to hold a read lock on mmap_sem.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
An apparent cut and paste error prevents the correct flags from being
set on the alias device resulting in MSI on conventional PCI devices
failing to work. This also produces error events from the IOMMU like:
AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:14.4 address=0x000000fdf8000000 flags=0x0a00]
Where 14.4 is a PCIe-to-PCI bridge with a device behind it trying to
use MSI interrupts.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joro@8bytes.org>
There is already S2CR_TYPE_SHIFT in S2CR_TYPE_TRANS macro, so drop the
second shift. Note that, since S2CR_TYPE_SHIFT is 0x0, there is no
functional change introduced by this patch.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The output size of stage-1 is currently limited by the input size of
stage-2, which is further limited by VA_BITS since we make use of the
standard pgd_alloc functions for creating page tables.
This patch ensures that we use VA_BITS instead of a hardcoded '39'
for the stage-1 output size limit.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Replace the devm_ioremap_nocache() call with devm_ioremap_resource().
This simplifies error checking.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
The prot flags passed to the IOMMU map handler are defined in
include/linux/iommu.h as IOMMU_(READ|WRITE|CACHE|EXEC). However, the
driver expects to receive MMU_RAM_* OMAP-specific flags. This causes
IOMMU flags being interpreted as page sizes, leading to failures.
Hardcode the OMAP mapping parameters to little-endian, 8-bits and
non-mixed page attributes. Furthermore, as the OMAP IOMMU doesn't
support read-only or write-only mappings, ignore the prot value.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sakari Ailus <sakari.ailus@iki.fi>
Acked-by: Suman Anna <s-anna@ti.com>
The IOMMU core breaks out mappings into pages already, there's no
need to support mapping multiple pages in one go.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sakari Ailus <sakari.ailus@iki.fi>
The flush_iotlb_page() function prints a debug message when no
corresponding page was found in the TLB. That condition is incorrectly
checked and always resolves to true, given that the for_each_iotlb_cr()
loop is never interrupted and always reaches obj->nr_tlb_entries.
Given that we can't have two TLB entries for the same VA, break from the
loop when a match is found.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Suman Anna <s-anna@ti.com>
The to_iommu definition is used only locally to the omap-iommu.c
source file, and it has nothing to do with the page attributes
defined in omap-iopgtable.h. So, move the definition out of
omap-iopgtable.h header file.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
The current OMAP IOMMU ops for .domain_has_cap is a stub,
and the iommu core already returns a value of 0 if the
domain doesn't have a .domain_has_cap ops plugged in. So,
clean up this stub function.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
The iotlb_entry field values are used directly in omap2_alloc_cr,
a function used in preparing the MMU_CAM and MMU_RAM registers.
The iotlb_entry.valid value is being set incorrectly to 1 at the
moment, and this would result in overriding the PAGESIZE bit field
of the MMU_CAM register if prefetching of the entries were to be
supported.
The bug has not caused any MMU faults due to incorrect size
programming so far as the prefetching is disabled by default. Fix
this by using the correct init value for the iotlb_entry.valid
field.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
kernel panic happened when iommu_unmap a buffer larger than 2MB,
more than expected pmd entries got “invalidated”, due to a wrong range
passed to arm_smmu_alloc_init_pte. it was likely a typo, now we fix
it, passing the correct "end" address to arm_smmu_alloc_init_pte.
Signed-off-by: Bin Wang <binw@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The IOMMU core expects the unmap operation to return the number of bytes
that have been unmapped or 0 on failure, a negative return value being
treated like a number of bytes.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit "59ce0515cdaf iommu/vt-d: Update DRHD/RMRR/ATSR device scope
caches when PCI hotplug happens" introduces a bug, which fails to
match PCI devices with DMAR device scope entries if PCI path array
in the entry has more than one level.
For example, it fails to handle
[1D2h 0466 1] Device Scope Entry Type : 01
[1D3h 0467 1] Entry Length : 0A
[1D4h 0468 2] Reserved : 0000
[1D6h 0470 1] Enumeration ID : 00
[1D7h 0471 1] PCI Bus Number : 00
[1D8h 0472 2] PCI Path : 1C,04
[1DAh 0474 2] PCI Path : 00,02
And cause DMA failure on HP DL980 as:
DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DRHD: handling fault status reg 602
dmar: DMAR:[DMA Read] Request device [02:00.2] fault addr 7f61e000
Reported-and-tested-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Commit 146922ec79 ("iommu/vt-d: Make get_domain_for_dev() take struct
device") introduced new variables bridge_bus and bridge_devfn to
identify the upstream PCIe to PCI bridge responsible for the given
target device. Leaving the original bus/devfn variables to identify
the target device itself, now that it is no longer assumed to be PCI
and we can no longer trivially find that information.
However, the patch failed to correctly use the new variables in all
cases; instead using the as-yet-uninitialised 'bus' and 'devfn'
variables.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Commit ea8ea46 "iommu/vt-d: Clean up and fix page table clear/free
behaviour" introduces possible leakage of DMA page tables due to:
for (pte = page_address(pg); !first_pte_in_page(pte); pte++) {
if (dma_pte_present(pte) && !dma_pte_superpage(pte))
freelist = dma_pte_list_pagetables(domain, level - 1,
pte, freelist);
}
For the first pte in a page, first_pte_in_page(pte) will always be true,
thus dma_pte_list_pagetables() will never be called and leak DMA page
tables if level is bigger than 1.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This time a few more updates queued up.
* Rework VT-d code to support ACPI devices
* Improvements for memory and PCI hotplug support
in the VT-d driver
* Device-tree support for OMAP IOMMU
* Convert OMAP IOMMU to use devm_* interfaces
* Fixed PASID support for AMD IOMMU
* Other random cleanups and fixes for OMAP, ARM-SMMU
and SHMOBILE IOMMU
Most of the changes are in the VT-d driver because some rework was
necessary for better hotplug and ACPI device support.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTPn8aAAoJECvwRC2XARrj66UQAICYhJKrErry3V8tGxzJ6r/3
jEWCmwd//FmV7rHcdRqckrv8vn9n9wf0lhRoGpbo9iiUrO8ikTdNltccKr1t0cAf
yPabayk1PbbZyc1Dv1hGKbIH52lHfVjbqesQ9Z2gndgVJuXWz5tdGDInXIvuYYEP
vWJFS2D5pfAj/lokvcJ1LcwLoBgDdsKM6dbGb2gOxxHm3gIUFvkZktMWGKgKeEGF
HL8R78tfGj6LRfLILT3RcxW4LfhnM2r1ZSeOm1lCdy7CISptqg6COWdX+gw/UToJ
lQeuPXv/AUaLTFRL+v07qus8iVsvr6E+Fx3ppmi+mkvgoSD0spWPKnAvrAkFDij3
ygXFKumTHqGTIHJQTVNWXrHkYFeM/XaIX7lZv0UwhE2RuLTRjFaIdBo2Wckmu9T3
Un2mZ6NshyI2IQ6NC9ytxHP8BCAP/rhjirKOpegYlKfbU5AHjkBQHHNjL7hcSIH5
g3ZXIV/FOgfNWhv4cdBR8go1N/PO+VMMYUfc1MC0sYi9MeiOnmsq5Y3LhXQWDgPE
eogQiz5j87ciNcA4IMO/tUJ2xBSvU6g2ESSCLZxZ33F6uD6X+VVgvinOeFBncbCS
fD0QUBo27tXUclWJHe7Z2yT4X2Xn4RohGEohqSpD0iNa1RUaNyBf7EzoEkZihxSZ
J7OuEnbE1kVQLh8y5ZdE
=uy/Z
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU upates from Joerg Roedel:
"This time a few more updates queued up.
- Rework VT-d code to support ACPI devices
- Improvements for memory and PCI hotplug support in the VT-d driver
- Device-tree support for OMAP IOMMU
- Convert OMAP IOMMU to use devm_* interfaces
- Fixed PASID support for AMD IOMMU
- Other random cleanups and fixes for OMAP, ARM-SMMU and SHMOBILE
IOMMU
Most of the changes are in the VT-d driver because some rework was
necessary for better hotplug and ACPI device support"
* tag 'iommu-updates-v3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (75 commits)
iommu/vt-d: Fix error handling in ANDD processing
iommu/vt-d: returning free pointer in get_domain_for_dev()
iommu/vt-d: Only call dmar_acpi_dev_scope_init() if DRHD units present
iommu/vt-d: Check for NULL pointer in dmar_acpi_dev_scope_init()
iommu/amd: Fix logic to determine and checking max PASID
iommu/vt-d: Include ACPI devices in iommu=pt
iommu/vt-d: Finally enable translation for non-PCI devices
iommu/vt-d: Remove to_pci_dev() in intel_map_page()
iommu/vt-d: Remove pdev from intel_iommu_attach_device()
iommu/vt-d: Remove pdev from iommu_no_mapping()
iommu/vt-d: Make domain_add_dev_info() take struct device
iommu/vt-d: Make domain_remove_one_dev_info() take struct device
iommu/vt-d: Rename 'hwdev' variables to 'dev' now that that's the norm
iommu/vt-d: Remove some pointless to_pci_dev() calls
iommu/vt-d: Make get_valid_domain_for_dev() take struct device
iommu/vt-d: Make iommu_should_identity_map() take struct device
iommu/vt-d: Handle RMRRs for non-PCI devices
iommu/vt-d: Make get_domain_for_dev() take struct device
iommu/vt-d: Make domain_context_mapp{ed,ing}() take struct device
iommu/vt-d: Make device_to_iommu() cope with non-PCI devices
...
Pull DMA-mapping updates from Marek Szyprowski:
"This contains extension for more efficient handling of io address
space for dma-mapping subsystem for ARM architecture"
* 'for-3.15' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
arm: dma-mapping: remove order parameter from arm_iommu_create_mapping()
arm: dma-mapping: Add support to extend DMA IOMMU mappings
Enumeration
- Increment max correctly in pci_scan_bridge() (Andreas Noever)
- Clarify the "scan anyway" comment in pci_scan_bridge() (Andreas Noever)
- Assign CardBus bus number only during the second pass (Andreas Noever)
- Use request_resource_conflict() instead of insert_ for bus numbers (Andreas Noever)
- Make sure bus number resources stay within their parents bounds (Andreas Noever)
- Remove pci_fixup_parent_subordinate_busnr() (Andreas Noever)
- Check for child busses which use more bus numbers than allocated (Andreas Noever)
- Don't scan random busses in pci_scan_bridge() (Andreas Noever)
- x86: Drop pcibios_scan_root() check for bus already scanned (Bjorn Helgaas)
- x86: Use pcibios_scan_root() instead of pci_scan_bus_with_sysdata() (Bjorn Helgaas)
- x86: Use pcibios_scan_root() instead of pci_scan_bus_on_node() (Bjorn Helgaas)
- x86: Merge pci_scan_bus_on_node() into pcibios_scan_root() (Bjorn Helgaas)
- x86: Drop return value of pcibios_scan_root() (Bjorn Helgaas)
NUMA
- x86: Add x86_pci_root_bus_node() to look up NUMA node from PCI bus (Bjorn Helgaas)
- x86: Use x86_pci_root_bus_node() instead of get_mp_bus_to_node() (Bjorn Helgaas)
- x86: Remove mp_bus_to_node[], set_mp_bus_to_node(), get_mp_bus_to_node() (Bjorn Helgaas)
- x86: Use NUMA_NO_NODE, not -1, for unknown node (Bjorn Helgaas)
- x86: Remove acpi_get_pxm() usage (Bjorn Helgaas)
- ia64: Use NUMA_NO_NODE, not MAX_NUMNODES, for unknown node (Bjorn Helgaas)
- ia64: Remove acpi_get_pxm() usage (Bjorn Helgaas)
- ACPI: Fix acpi_get_node() prototype (Bjorn Helgaas)
Resource management
- i2o: Fix and refactor PCI space allocation (Bjorn Helgaas)
- Add resource_contains() (Bjorn Helgaas)
- Add %pR support for IORESOURCE_UNSET (Bjorn Helgaas)
- Mark resources as IORESOURCE_UNSET if we can't assign them (Bjorn Helgaas)
- Don't clear IORESOURCE_UNSET when updating BAR (Bjorn Helgaas)
- Check IORESOURCE_UNSET before updating BAR (Bjorn Helgaas)
- Don't try to claim IORESOURCE_UNSET resources (Bjorn Helgaas)
- Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit (Bjorn Helgaas)
- Don't enable decoding if BAR hasn't been assigned an address (Bjorn Helgaas)
- Add "weak" generic pcibios_enable_device() implementation (Bjorn Helgaas)
- alpha, microblaze, sh, sparc, tile: Use default pcibios_enable_device() (Bjorn Helgaas)
- s390: Use generic pci_enable_resources() (Bjorn Helgaas)
- Don't check resource_size() in pci_bus_alloc_resource() (Bjorn Helgaas)
- Set type in __request_region() (Bjorn Helgaas)
- Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region() (Bjorn Helgaas)
- Change pci_bus_alloc_resource() type_mask to unsigned long (Bjorn Helgaas)
- Log IDE resource quirk in dmesg (Bjorn Helgaas)
- Revert "[PATCH] Insert GART region into resource map" (Bjorn Helgaas)
PCI device hotplug
- Make check_link_active() non-static (Rajat Jain)
- Use link change notifications for hot-plug and removal (Rajat Jain)
- Enable link state change notifications (Rajat Jain)
- Don't disable the link permanently during removal (Rajat Jain)
- Don't check adapter or latch status while disabling (Rajat Jain)
- Disable link notification across slot reset (Rajat Jain)
- Ensure very fast hotplug events are also processed (Rajat Jain)
- Add hotplug_lock to serialize hotplug events (Rajat Jain)
- Remove a non-existent card, regardless of "surprise" capability (Rajat Jain)
- Don't turn slot off when hot-added device already exists (Yijing Wang)
MSI
- Keep pci_enable_msi() documentation (Alexander Gordeev)
- ahci: Fix broken single MSI fallback (Alexander Gordeev)
- ahci, vfio: Use pci_enable_msi_range() (Alexander Gordeev)
- Check kmalloc() return value, fix leak of name (Greg Kroah-Hartman)
- Fix leak of msi_attrs (Greg Kroah-Hartman)
- Fix pci_msix_vec_count() htmldocs failure (Masanari Iida)
Virtualization
- Device-specific ACS support (Alex Williamson)
Freescale i.MX6
- Wait for retraining (Marek Vasut)
Marvell MVEBU
- Use Device ID and revision from underlying endpoint (Andrew Lunn)
- Fix incorrect size for PCI aperture resources (Jason Gunthorpe)
- Call request_resource() on the apertures (Jason Gunthorpe)
- Fix potential issue in range parsing (Jean-Jacques Hiblot)
Renesas R-Car
- Check platform_get_irq() return code (Ben Dooks)
- Add error interrupt handling (Ben Dooks)
- Fix bridge logic configuration accesses (Ben Dooks)
- Register each instance independently (Magnus Damm)
- Break out window size handling (Magnus Damm)
- Make the Kconfig dependencies more generic (Magnus Damm)
Synopsys DesignWare
- Fix RC BAR to be single 64-bit non-prefetchable memory (Mohit Kumar)
Miscellaneous
- Remove unused SR-IOV VF Migration support (Bjorn Helgaas)
- Enable INTx if BIOS left them disabled (Bjorn Helgaas)
- Fix hex vs decimal typo in cpqhpc_probe() (Dan Carpenter)
- Clean up par-arch object file list (Liviu Dudau)
- Set IORESOURCE_ROM_SHADOW only for the default VGA device (Sander Eikelenboom)
- ACPI, ARM, drm, powerpc, pcmcia, PCI: Use list_for_each_entry() for bus traversal (Yijing Wang)
- Fix pci_bus_b() build failure (Paul Gortmaker)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTOdAZAAoJEFmIoMA60/r8VYUQALRrReyMBk3pjRt/fKIX4Kwi
ydSo/YJeeKTN8K93fLw8bb8bdPItJScJFTfEa4Q2SpZezR/ecGXLowisy0BBaPHK
qtOyB8EqjkLS17GfyecIe9Nd2SIAI2De/0bchK3kDtIX1YlZB/k/tD3eCPMHDnnl
m8c5kAHKPQYd8g01I+S8nrtGHk/A33grfYpJXPZbcqyhE0lWU3SI8KDAGbcKzNHE
23Do0yNyd4nHIdixWlhETcNvzHn35Q/O38JJwW9Mf1aI9gusYuml6GFefCgu/iov
lxqp3CEW7iPZgQEgNbrQ0HzWn/durL2Trd6S/Yh6f2xbm1LGYKWh3LZUFLd3AQDd
INEpUgKsyb//nF3dtiyGnZlp0QykoqFyLo2AEDrb+ILTd4up5DeRY/m1UpjAXR5p
QicBmrDksHrSivPmMZwLx1DFQYKjQbdx5lOqy9hQM/Jmsr+N3/l7QBrbQWXks3JZ
NNAyn4RZHQB7UDQS/MmVPArs+JK5qaEDQD57QuOTlqgP19VY9C9E/l/aEqefjdFo
XOAm7CwGpB/iBAkIbE6ROEDiJArigRVHEfxLYeE/jtGOdRDCD1deWk+g3S8DWD7m
ZxWSgIVB00PMAmomczdg59YVFBhocgwPUa8/cw6yqzx2QKP4mWXIFZ/Sjau5I3tn
WWoxXlUirZfTJc29XnVy
=3mNS
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas:
"Enumeration
- Increment max correctly in pci_scan_bridge() (Andreas Noever)
- Clarify the "scan anyway" comment in pci_scan_bridge() (Andreas Noever)
- Assign CardBus bus number only during the second pass (Andreas Noever)
- Use request_resource_conflict() instead of insert_ for bus numbers (Andreas Noever)
- Make sure bus number resources stay within their parents bounds (Andreas Noever)
- Remove pci_fixup_parent_subordinate_busnr() (Andreas Noever)
- Check for child busses which use more bus numbers than allocated (Andreas Noever)
- Don't scan random busses in pci_scan_bridge() (Andreas Noever)
- x86: Drop pcibios_scan_root() check for bus already scanned (Bjorn Helgaas)
- x86: Use pcibios_scan_root() instead of pci_scan_bus_with_sysdata() (Bjorn Helgaas)
- x86: Use pcibios_scan_root() instead of pci_scan_bus_on_node() (Bjorn Helgaas)
- x86: Merge pci_scan_bus_on_node() into pcibios_scan_root() (Bjorn Helgaas)
- x86: Drop return value of pcibios_scan_root() (Bjorn Helgaas)
NUMA
- x86: Add x86_pci_root_bus_node() to look up NUMA node from PCI bus (Bjorn Helgaas)
- x86: Use x86_pci_root_bus_node() instead of get_mp_bus_to_node() (Bjorn Helgaas)
- x86: Remove mp_bus_to_node[], set_mp_bus_to_node(), get_mp_bus_to_node() (Bjorn Helgaas)
- x86: Use NUMA_NO_NODE, not -1, for unknown node (Bjorn Helgaas)
- x86: Remove acpi_get_pxm() usage (Bjorn Helgaas)
- ia64: Use NUMA_NO_NODE, not MAX_NUMNODES, for unknown node (Bjorn Helgaas)
- ia64: Remove acpi_get_pxm() usage (Bjorn Helgaas)
- ACPI: Fix acpi_get_node() prototype (Bjorn Helgaas)
Resource management
- i2o: Fix and refactor PCI space allocation (Bjorn Helgaas)
- Add resource_contains() (Bjorn Helgaas)
- Add %pR support for IORESOURCE_UNSET (Bjorn Helgaas)
- Mark resources as IORESOURCE_UNSET if we can't assign them (Bjorn Helgaas)
- Don't clear IORESOURCE_UNSET when updating BAR (Bjorn Helgaas)
- Check IORESOURCE_UNSET before updating BAR (Bjorn Helgaas)
- Don't try to claim IORESOURCE_UNSET resources (Bjorn Helgaas)
- Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit (Bjorn Helgaas)
- Don't enable decoding if BAR hasn't been assigned an address (Bjorn Helgaas)
- Add "weak" generic pcibios_enable_device() implementation (Bjorn Helgaas)
- alpha, microblaze, sh, sparc, tile: Use default pcibios_enable_device() (Bjorn Helgaas)
- s390: Use generic pci_enable_resources() (Bjorn Helgaas)
- Don't check resource_size() in pci_bus_alloc_resource() (Bjorn Helgaas)
- Set type in __request_region() (Bjorn Helgaas)
- Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region() (Bjorn Helgaas)
- Change pci_bus_alloc_resource() type_mask to unsigned long (Bjorn Helgaas)
- Log IDE resource quirk in dmesg (Bjorn Helgaas)
- Revert "[PATCH] Insert GART region into resource map" (Bjorn Helgaas)
PCI device hotplug
- Make check_link_active() non-static (Rajat Jain)
- Use link change notifications for hot-plug and removal (Rajat Jain)
- Enable link state change notifications (Rajat Jain)
- Don't disable the link permanently during removal (Rajat Jain)
- Don't check adapter or latch status while disabling (Rajat Jain)
- Disable link notification across slot reset (Rajat Jain)
- Ensure very fast hotplug events are also processed (Rajat Jain)
- Add hotplug_lock to serialize hotplug events (Rajat Jain)
- Remove a non-existent card, regardless of "surprise" capability (Rajat Jain)
- Don't turn slot off when hot-added device already exists (Yijing Wang)
MSI
- Keep pci_enable_msi() documentation (Alexander Gordeev)
- ahci: Fix broken single MSI fallback (Alexander Gordeev)
- ahci, vfio: Use pci_enable_msi_range() (Alexander Gordeev)
- Check kmalloc() return value, fix leak of name (Greg Kroah-Hartman)
- Fix leak of msi_attrs (Greg Kroah-Hartman)
- Fix pci_msix_vec_count() htmldocs failure (Masanari Iida)
Virtualization
- Device-specific ACS support (Alex Williamson)
Freescale i.MX6
- Wait for retraining (Marek Vasut)
Marvell MVEBU
- Use Device ID and revision from underlying endpoint (Andrew Lunn)
- Fix incorrect size for PCI aperture resources (Jason Gunthorpe)
- Call request_resource() on the apertures (Jason Gunthorpe)
- Fix potential issue in range parsing (Jean-Jacques Hiblot)
Renesas R-Car
- Check platform_get_irq() return code (Ben Dooks)
- Add error interrupt handling (Ben Dooks)
- Fix bridge logic configuration accesses (Ben Dooks)
- Register each instance independently (Magnus Damm)
- Break out window size handling (Magnus Damm)
- Make the Kconfig dependencies more generic (Magnus Damm)
Synopsys DesignWare
- Fix RC BAR to be single 64-bit non-prefetchable memory (Mohit Kumar)
Miscellaneous
- Remove unused SR-IOV VF Migration support (Bjorn Helgaas)
- Enable INTx if BIOS left them disabled (Bjorn Helgaas)
- Fix hex vs decimal typo in cpqhpc_probe() (Dan Carpenter)
- Clean up par-arch object file list (Liviu Dudau)
- Set IORESOURCE_ROM_SHADOW only for the default VGA device (Sander Eikelenboom)
- ACPI, ARM, drm, powerpc, pcmcia, PCI: Use list_for_each_entry() for bus traversal (Yijing Wang)
- Fix pci_bus_b() build failure (Paul Gortmaker)"
* tag 'pci-v3.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (108 commits)
Revert "[PATCH] Insert GART region into resource map"
PCI: Log IDE resource quirk in dmesg
PCI: Change pci_bus_alloc_resource() type_mask to unsigned long
PCI: Check all IORESOURCE_TYPE_BITS in pci_bus_alloc_from_region()
resources: Set type in __request_region()
PCI: Don't check resource_size() in pci_bus_alloc_resource()
s390/PCI: Use generic pci_enable_resources()
tile PCI RC: Use default pcibios_enable_device()
sparc/PCI: Use default pcibios_enable_device() (Leon only)
sh/PCI: Use default pcibios_enable_device()
microblaze/PCI: Use default pcibios_enable_device()
alpha/PCI: Use default pcibios_enable_device()
PCI: Add "weak" generic pcibios_enable_device() implementation
PCI: Don't enable decoding if BAR hasn't been assigned an address
PCI: Enable INTx in pci_reenable_device() only when MSI/MSI-X not enabled
PCI: Mark 64-bit resource as IORESOURCE_UNSET if we only support 32-bit
PCI: Don't try to claim IORESOURCE_UNSET resources
PCI: Check IORESOURCE_UNSET before updating BAR
PCI: Don't clear IORESOURCE_UNSET when updating BAR
PCI: Mark resources as IORESOURCE_UNSET if we can't assign them
...
Conflicts:
arch/x86/include/asm/topology.h
drivers/ata/ahci.c
If we failed to find an ACPI device to correspond to an ANDD record, we
would fail to increment our pointer and would just process the same record
over and over again, with predictable results.
Turn it from a while() loop into a for() loop to let the 'continue' in
the error paths work correctly.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
If we hit this error condition then we want to return a NULL pointer and
not a freed variable.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
As pointed out by Jörg and fixed in commit 11f1a7768 ("iommu/vt-d: Check
for NULL pointer in dmar_acpi_dev_scope_init(), this code path can
bizarrely get exercised even on AMD IOMMU systems with IRQ remapping
enabled.
In addition to the defensive check for NULL which Jörg added, let's also
just avoid calling the function at all if there aren't an Intel IOMMU
units in the system.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When ir_dev_scope_init() is called via a rootfs initcall it
will check for irq_remapping_enabled before it calls
(indirectly) into dmar_acpi_dev_scope_init() which uses the
dmar_tbl pointer without any checks.
The AMD IOMMU driver also sets the irq_remapping_enabled
flag which causes the dmar_acpi_dev_scope_init() function to
be called on systems with AMD IOMMU hardware too, causing a
boot-time kernel crash.
Signed-off-by: Joerg Roedel <joro@8bytes.org>
In reality, the spec can only support 16-bit PASID since
INVALIDATE_IOTLB_PAGES and COMPLETE_PPR_REQUEST commands only allow 16-bit
PASID. So, we updated the PASID_MASK accordingly and invoke BUG_ON
if the hardware is reporting PASmax more than 16-bit.
Besides, max PASID is defined as ((2^(PASmax+1)) - 1). The current does not
determine this correctly.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Mostly made redundant by using dev_name() instead of pci_name(), and one
instance of using *dev->dma_mask instead of pdev->dma_mask.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Should hopefully never happen (RMRRs are an abomination) but while we're
busy eliminating all the PCI assumptions, we might as well do it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Pass the struct device to it, and also make it return the bus/devfn to use,
since that is also stored in the DMAR table.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This was problematic because it works by domain/bus/devfn and we want
to make device_to_iommu() use only a struct device * (for handling non-PCI
devices). Now that the iommu pointer is reliably stored in the
device_domain_info, we don't need to look it up.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
By moving this into get_domain_for_dev() we can make dmar_insert_dev_info()
suitable for use with "special" domains such as the si_domain, which
currently use domain_add_dev_info().
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
It's not only for PCI devices any more, and the scope information for an
ACPI device provides the bus and devfn so that has to be stored here too.
It is the device pointer itself which needs to be protected with RCU,
so the __rcu annotation follows it into the definition of struct
dmar_dev_scope, since we're no longer just passing arrays of device
pointers around.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In commit 2e12bc29 ("intel-iommu: Default to non-coherent for domains
unattached to iommus") we decided to err on the side of caution and
always assume that it's possible that a device will be attached which is
behind a non-coherent IOMMU.
In some cases, however, that just *cannot* happen. If there *are* no
IOMMUs in the system which are non-coherent, then we don't need to do
it. And flushing the dcache is a *significant* performance hit.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There is a race condition between the existing clear/free code and the
hardware. The IOMMU is actually permitted to cache the intermediate
levels of the page tables, and doesn't need to walk the table from the
very top of the PGD each time. So the existing back-to-back calls to
dma_pte_clear_range() and dma_pte_free_pagetable() can lead to a
use-after-free where the IOMMU reads from a freed page table.
When freeing page tables we actually need to do the IOTLB flush, with
the 'invalidation hint' bit clear to indicate that it's not just a
leaf-node flush, after unlinking each page table page from the next level
up but before actually freeing it.
So in the rewritten domain_unmap() we just return a list of pages (using
pg->freelist to make a list of them), and then the caller is expected to
do the appropriate IOTLB flush (or tear down the domain completely,
whatever), before finally calling dma_free_pagelist() to free the pages.
As an added bonus, we no longer need to flush the CPU's data cache for
pages which are about to be *removed* from the page table hierarchy anyway,
in the non-cache-coherent case. This drastically improves the performance
of large unmaps.
As a side-effect of all these changes, this also fixes the fact that
intel_iommu_unmap() was neglecting to free the page tables for the range
in question after clearing them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We have this horrid API where iommu_unmap() can unmap more than it's asked
to, if the IOVA in question happens to be mapped with a large page.
Instead of propagating this nonsense to the point where we end up returning
the page order from dma_pte_clear_range(), let's just do it once and adjust
the 'size' parameter accordingly.
Augment pfn_to_dma_pte() to return the level at which the PTE was found,
which will also be useful later if we end up changing the API for
iommu_iova_to_phys() to behave the same way as is being discussed upstream.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>