Commit Graph

91 Commits

Author SHA1 Message Date
Dan Carpenter 3775d4818d iommu/amd: fix type bug in flush code
write_file_bool() modifies 32 bits of data, so "amd_iommu_unmap_flush"
needs to be 32 bits as well or we'll corrupt memory.  Fortunately it
looks like the data is aligned with a gap after the declaration so this
is harmless in production.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-07-02 12:11:40 +02:00
Alex Williamson 664b600331 amd_iommu: Make use of DMA quirks and ACS checks in IOMMU groups
Work around broken devices and adhere to ACS support when determining
IOMMU grouping.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25 13:48:29 +02:00
Alex Williamson 9dcd61303a amd_iommu: Support IOMMU groups
Add IOMMU group support to AMD-Vi device init and uninit code.
Existing notifiers make sure this gets called for each device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25 13:48:28 +02:00
Alex Williamson d72e31c937 iommu: IOMMU Groups
IOMMU device groups are currently a rather vague associative notion
with assembly required by the user or user level driver provider to
do anything useful.  This patch intends to grow the IOMMU group concept
into something a bit more consumable.

To do this, we first create an object representing the group, struct
iommu_group.  This structure is allocated (iommu_group_alloc) and
filled (iommu_group_add_device) by the iommu driver.  The iommu driver
is free to add devices to the group using it's own set of policies.
This allows inclusion of devices based on physical hardware or topology
limitations of the platform, as well as soft requirements, such as
multi-function trust levels or peer-to-peer protection of the
interconnects.  Each device may only belong to a single iommu group,
which is linked from struct device.iommu_group.  IOMMU groups are
maintained using kobject reference counting, allowing for automatic
removal of empty, unreferenced groups.  It is the responsibility of
the iommu driver to remove devices from the group
(iommu_group_remove_device).

IOMMU groups also include a userspace representation in sysfs under
/sys/kernel/iommu_groups.  When allocated, each group is given a
dynamically assign ID (int).  The ID is managed by the core IOMMU group
code to support multiple heterogeneous iommu drivers, which could
potentially collide in group naming/numbering.  This also keeps group
IDs to small, easily managed values.  A directory is created under
/sys/kernel/iommu_groups for each group.  A further subdirectory named
"devices" contains links to each device within the group.  The iommu_group
file in the device's sysfs directory, which formerly contained a group
number when read, is now a link to the iommu group.  Example:

$ ls -l /sys/kernel/iommu_groups/26/devices/
total 0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:00:1e.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.1 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1

$ ls -l  /sys/kernel/iommu_groups/26/devices/*/iommu_group
[truncating perms/owner/timestamp]
/sys/kernel/iommu_groups/26/devices/0000:00:1e.0/iommu_group ->
					../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.0/iommu_group ->
					../../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.1/iommu_group ->
					../../../../kernel/iommu_groups/26

Groups also include several exported functions for use by user level
driver providers, for example VFIO.  These include:

iommu_group_get(): Acquires a reference to a group from a device
iommu_group_put(): Releases reference
iommu_group_for_each_dev(): Iterates over group devices using callback
iommu_group_[un]register_notifier(): Allows notification of device add
        and remove operations relevant to the group
iommu_group_id(): Return the group number

This patch also extends the IOMMU API to allow attaching groups to
domains.  This is currently a simple wrapper for iterating through
devices within a group, but it's expected that the IOMMU API may
eventually make groups a more integral part of domains.

Groups intentionally do not try to manage group ownership.  A user
level driver provider must independently acquire ownership for each
device within a group before making use of the group as a whole.
This may change in the future if group usage becomes more pervasive
across both DMA and IOMMU ops.

Groups intentionally do not provide a mechanism for driver locking
or otherwise manipulating driver matching/probing of devices within
the group.  Such interfaces are generic to devices and beyond the
scope of IOMMU groups.  If implemented, user level providers have
ready access via iommu_group_for_each_dev and group notifiers.

iommu_device_group() is removed here as it has no users.  The
replacement is:

	group = iommu_group_get(dev);
	id = iommu_group_id(group);
	iommu_group_put(group);

AMD-Vi & Intel VT-d support re-added in following patches.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25 13:48:15 +02:00
Joerg Roedel ac1534a55d iommu/amd: Initialize dma_ops for hotplug and sriov devices
When a device is added to the system at runtime the AMD
IOMMU driver initializes the necessary data structures to
handle translation for it. But it forgets to change the
per-device dma_ops to point to the AMD IOMMU driver. So
mapping actually never happens and all DMA accesses end in
an IO_PAGE_FAULT. Fix this.

Reported-by: Stefan Assmann <sassmann@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-25 13:16:00 +02:00
Joerg Roedel eee53537c4 iommu/amd: Fix deadlock in ppr-handling error path
In the error path of the ppr_notifer it can happen that the
iommu->lock is taken recursivly. This patch fixes the
problem by releasing the iommu->lock before any notifier is
invoked. This also requires to move the erratum workaround
for the ppr-log (interrupt may be faster than data in the log)
one function up.

Cc: stable@vger.kernel.org # v3.3, v3.4
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-06-04 12:47:44 +02:00
Joerg Roedel 3d06fca8d2 iommu/amd: Add workaround for event log erratum
Due to a recent erratum it can happen that the head pointer
of the event-log is updated before the actual event-log
entry is written. This patch implements the recommended
workaround.

Cc: stable@vger.kernel.org	# all stable kernels
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-04-12 14:16:14 +02:00
Joerg Roedel a3b9312143 iommu/amd: Check for the right TLP prefix bit
Unfortunatly the PRI spec changed and moved the
TLP-prefix-required bit to a different location. This patch
makes the necessary change in the AMD IOMMU driver.
Regressions are not expected because all hardware
implementing the PRI capability sets this bit to zero
anyway.

Cc: stable@vger.kernel.org # v3.3
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-04-12 12:49:26 +02:00
Linus Torvalds 58bca4a8fa Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA mapping branch from Marek Szyprowski:
 "Short summary for the whole series:

  A few limitations have been identified in the current dma-mapping
  design and its implementations for various architectures.  There exist
  more than one function for allocating and freeing the buffers:
  currently these 3 are used dma_{alloc, free}_coherent,
  dma_{alloc,free}_writecombine, dma_{alloc,free}_noncoherent.

  For most of the systems these calls are almost equivalent and can be
  interchanged.  For others, especially the truly non-coherent ones
  (like ARM), the difference can be easily noticed in overall driver
  performance.  Sadly not all architectures provide implementations for
  all of them, so the drivers might need to be adapted and cannot be
  easily shared between different architectures.  The provided patches
  unify all these functions and hide the differences under the already
  existing dma attributes concept.  The thread with more references is
  available here:

    http://www.spinics.net/lists/linux-sh/msg09777.html

  These patches are also a prerequisite for unifying DMA-mapping
  implementation on ARM architecture with the common one provided by
  dma_map_ops structure and extending it with IOMMU support.  More
  information is available in the following thread:

    http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819

  More works on dma-mapping framework are planned, especially in the
  area of buffer sharing and managing the shared mappings (together with
  the recently introduced dma_buf interface: commit d15bd7ee44
  "dma-buf: Introduce dma buffer sharing mechanism").

  The patches in the current set introduce a new alloc/free methods
  (with support for memory attributes) in dma_map_ops structure, which
  will later replace dma_alloc_coherent and dma_alloc_writecombine
  functions."

People finally started piping up with support for merging this, so I'm
merging it as the last of the pending stuff from the merge window.
Looks like pohmelfs is going to wait for 3.5 and more external support
for merging.

* 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  common: DMA-mapping: add NON-CONSISTENT attribute
  common: DMA-mapping: add WRITE_COMBINE attribute
  common: dma-mapping: introduce mmap method
  common: dma-mapping: remove old alloc_coherent and free_coherent methods
  Hexagon: adapt for dma_map_ops changes
  Unicore32: adapt for dma_map_ops changes
  Microblaze: adapt for dma_map_ops changes
  SH: adapt for dma_map_ops changes
  Alpha: adapt for dma_map_ops changes
  SPARC: adapt for dma_map_ops changes
  PowerPC: adapt for dma_map_ops changes
  MIPS: adapt for dma_map_ops changes
  X86 & IA64: adapt for dma_map_ops changes
  common: dma-mapping: introduce generic alloc() and free() methods
2012-04-04 17:13:43 -07:00
Andrzej Pietrasiewicz baa676fcf8 X86 & IA64: adapt for dma_map_ops changes
Adapt core x86 and IA64 architecture code for dma_map_ops changes: replace
alloc/free_coherent with generic alloc/free methods.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
[removed swiotlb related changes and replaced it with wrappers,
 merged with IA64 patch to avoid inter-patch dependences in intel-iommu code]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Luck <tony.luck@intel.com>
2012-03-28 16:36:31 +02:00
Steffen Persvold 943bc7e110 x86: Fix section warnings
Fix the following section warnings :

WARNING: vmlinux.o(.text+0x49dbc): Section mismatch in reference
from the function acpi_map_cpu2node() to the variable
.cpuinit.data:__apicid_to_node The function acpi_map_cpu2node()
references the variable __cpuinitdata __apicid_to_node. This is
often because acpi_map_cpu2node lacks a __cpuinitdata
annotation or the annotation of __apicid_to_node is wrong.

WARNING: vmlinux.o(.text+0x49dc1): Section mismatch in reference
from the function acpi_map_cpu2node() to the function
.cpuinit.text:numa_set_node() The function acpi_map_cpu2node()
references the function __cpuinit numa_set_node(). This is often
because acpi_map_cpu2node lacks a __cpuinit  annotation or the
annotation of numa_set_node is wrong.

WARNING: vmlinux.o(.text+0x526e77): Section mismatch in
reference from the function prealloc_protection_domains() to the
function .init.text:alloc_passthrough_domain() The function
prealloc_protection_domains() references the function __init
alloc_passthrough_domain(). This is often because
prealloc_protection_domains lacks a __init  annotation or the annotation of alloc_passthrough_domain is wrong.

Signed-off-by: Steffen Persvold <sp@numascale.com>
Link: http://lkml.kernel.org/r/1331810188-24785-1-git-send-email-sp@numascale.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-19 12:01:01 +01:00
Joerg Roedel af1be04901 iommu/amd: Work around broken IVRS tables
On some systems the IVRS table does not contain all PCI
devices present in the system. In case a device not present
in the IVRS table is translated by the IOMMU no DMA is
possible from that device by default.
This patch fixes this by removing the DTE entry for every
PCI device present in the system and not covered by IVRS.

Cc: stable@vger.kernel.org # >= 3.0
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-01-23 14:05:23 +01:00
Joerg Roedel f93ea73387 Merge branches 'iommu/page-sizes' and 'iommu/group-id' into next
Conflicts:
	drivers/iommu/amd_iommu.c
	drivers/iommu/intel-iommu.c
	include/linux/iommu.h
2012-01-09 13:06:28 +01:00
Joerg Roedel 2655d7a297 iommu/amd: Init stats for iommu=pt
The IOMMUv2 driver added a few statistic counter which are
interesting in the iommu=pt mode too. So initialize the
statistic counter for that mode too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-22 14:56:56 +01:00
Joerg Roedel 52efdb89d6 iommu/amd: Add amd_iommu_device_info() function
This function can be used to find out which features
necessary for IOMMUv2 usage are available on a given device.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-15 11:15:29 +01:00
Joerg Roedel 46277b75da iommu/amd: Adapt IOMMU driver to PCI register name changes
The symbolic register names for PCI and PASID changed in
PCI code. This patch adapts the AMD IOMMU driver to these
changes.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-15 11:05:35 +01:00
Joerg Roedel a06ec394c9 Merge branch 'iommu/page-sizes' into x86/amd
Conflicts:
	drivers/iommu/amd_iommu.c
2011-12-14 12:52:09 +01:00
Joerg Roedel 399be2f519 iommu/amd: Add stat counter for IOMMUv2 events
Add some interesting statistic counters for events when
IOMMUv2 is active.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:19:06 +01:00
Joerg Roedel 6a113ddc03 iommu/amd: Add device errata handling
Add infrastructure for errata-handling and handle two known
erratas in the IOMMUv2 code.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:19:06 +01:00
Joerg Roedel f3572db823 iommu/amd: Add function to get IOMMUv2 domain for pdev
The AMD IOMMUv2 driver needs to get the IOMMUv2 domain
associated with a particular device. This patch adds a
function to get this information.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:19:05 +01:00
Joerg Roedel c99afa25b6 iommu/amd: Implement function to send PPR completions
To send completions for PPR requests this patch adds a
function which can be used by the IOMMUv2 driver.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:19:05 +01:00
Joerg Roedel b16137b11b iommu/amd: Implement functions to manage GCR3 table
This patch adds functions necessary to set and clear the
GCR3 values associated with a particular PASID in an IOMMUv2
domain.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:19:04 +01:00
Joerg Roedel 22e266c79b iommu/amd: Implement IOMMUv2 TLB flushing routines
The functions added with this patch allow to manage the
IOMMU and the device TLBs for all devices in an IOMMUv2
domain.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:19:03 +01:00
Joerg Roedel 52815b7568 iommu/amd: Add support for IOMMUv2 domain mode
This patch adds support for protection domains that
implement two-level paging for devices.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 15:18:57 +01:00
Joerg Roedel 132bd68f18 iommu/amd: Add amd_iommu_domain_direct_map function
This function can be used to switch a domain into
paging-mode 0. In this mode all devices can access physical
system memory directly without any remapping.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 14:55:13 +01:00
Joerg Roedel 72e1dcc419 iommu/amd: Implement notifier for PPR faults
Add a notifer at which a module can attach to get informed
about incoming PPR faults.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 14:55:04 +01:00
Joerg Roedel 5abcdba4fa iommu/amd: Put IOMMUv2 capable devices in pt_domain
If the device starts to use IOMMUv2 features the dma handles
need to stay valid. The only sane way to do this is to use a
identity mapping for the device and not translate it by the
iommu. This is implemented with this patch. Since this lifts
the device-isolation there is also a new kernel parameter
which allows to disable that feature.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 14:54:58 +01:00
Joerg Roedel ee6c286845 iommu/amd: Convert dev_table_entry to u64
Convert the contents of 'struct dev_table_entry' to u64 to
allow updating the DTE wit 64bit writes as required by the
spec.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12 14:54:23 +01:00
Alex Williamson bcb71abe7d iommu: Add option to group multi-function devices
The option iommu=group_mf indicates the that the iommu driver should
expose all functions of a multi-function PCI device as the same
iommu_device_group.  This is useful for disallowing individual functions
being exposed as independent devices to userspace as there are often
hidden dependencies.  Virtual functions are not affected by this option.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-11-15 12:22:31 +01:00
Alex Williamson 8fbdce6595 iommu/amd: Implement iommu_device_group
Just use the amd_iommu_alias_table directly.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-11-15 12:22:30 +01:00
Ohad Ben-Cohen aa3de9c050 iommu/amd: announce supported page sizes
Let the IOMMU core know we support arbitrary page sizes (as long as
they're an order of 4KiB).

This way the IOMMU core will retain the existing behavior we're used to;
it will let us map regions that:
- their size is an order of 4KiB
- they are naturally aligned

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Joerg Roedel <Joerg.Roedel@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-11-10 11:40:38 +01:00
Ohad Ben-Cohen 5009065d38 iommu/core: stop converting bytes to page order back and forth
Express sizes in bytes rather than in page order, to eliminate the
size->order->size conversions we have whenever the IOMMU API is calling
the low level drivers' map/unmap methods.

Adopt all existing drivers.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <Joerg.Roedel@amd.com>
Cc: Stepan Moskovchenko <stepanm@codeaurora.org>
Cc: KyongHo Cho <pullip.cho@samsung.com>
Cc: Hiroshi DOYU <hdoyu@nvidia.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-11-10 11:40:37 +01:00
Joerg Roedel 1abb4ba596 Merge branches 'amd/fixes', 'debug/dma-api', 'arm/omap', 'arm/msm', 'core', 'iommu/fault-reporting' and 'api/iommu-ops-per-bus' into next
Conflicts:
	drivers/iommu/amd_iommu.c
	drivers/iommu/iommu.c
2011-10-21 14:38:55 +02:00
Joerg Roedel 2cc21c4236 iommu/amd: Use bus_set_iommu instead of register_iommu
Convert the AMD IOMMU driver to use the new interface for
publishing the iommu_ops.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:21 +02:00
Joerg Roedel fcd0861db1 iommu/amd: Fix wrong shift direction
The shift direction was wrong because the function takes a
page number and i is the address is the loop.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-11 17:41:32 +02:00
Joerg Roedel e33acde911 iommu/amd: Don't take domain->lock recursivly
The domain_flush_devices() function takes the domain->lock.
But this function is only called from update_domain() which
itself is already called unter the domain->lock. This causes
a deadlock situation when the dma-address-space of a domain
grows larger than 1GB.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-09-02 14:19:50 +02:00
Joerg Roedel f1ca1512e7 iommu/amd: Make sure iommu->need_sync contains correct value
The value is only set to true but never set back to false,
which causes to many completion-wait commands to be sent to
hardware. Fix it with this patch.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-09-02 14:10:32 +02:00
Joerg Roedel 17f5b569e0 iommu/amd: Don't use MSI address range for DMA addresses
Reserve the MSI address range in the address allocator so
that MSI addresses are not handed out as dma handles.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-07-06 17:14:44 +02:00
Joerg Roedel 801019d59d Merge branches 'amd/transparent-bridge' and 'core'
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 11:14:10 +02:00
Joerg Roedel 403f81d8ee iommu/amd: Move missing parts to drivers/iommu
A few parts of the driver were missing in drivers/iommu.
Move them there to have the complete driver in that
directory.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:31 +02:00
Ohad Ben-Cohen 29b68415e3 x86: amd_iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:29 +02:00