Until all upstream devices have their DMA ops swizzled to point at the
SMMU, we need to treat the IOMMU_DOMAIN_DMA domain as bypass to avoid
putting devices into an empty address space when detaching from VFIO.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMU attach_dev implementations returns -EEXIST if the device
being attached is already attached to a domain. This doesn't play nicely
with the default domain, resulting in splats such as:
WARNING: at drivers/iommu/iommu.c:1257
Modules linked in:
CPU: 3 PID: 1939 Comm: virtio-net-tx Tainted: G S 4.5.0-rc4+ #1
Hardware name: FVP Base (DT)
task: ffffffc87a9d0000 ti: ffffffc07a278000 task.ti: ffffffc07a278000
PC is at __iommu_detach_group+0x68/0xe8
LR is at __iommu_detach_group+0x48/0xe8
This patch fixes the problem by forcefully detaching the device from
its old domain, if present, when attaching to a new one. The unused
->detach_dev callback is also removed the iommu_ops structures.
Signed-off-by: Will Deacon <will.deacon@arm.com>
With DMA mapping ops provided by the iommu-dma code, only a minimal
contribution from the IOMMU driver is needed to create a suitable
DMA-API domain for them to use. Implement this for the ARM SMMUs.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
It is ILLEGAL to set STE.S1STALLD to 1 if stage 1 is enabled and
either the stall or terminate models are not supported.
This patch fixes the STALLD check and ensures that we don't set STALLD
in the STE when it is not supported.
Signed-off-by: Prem Mallappa <pmallapp@broadcom.com>
[will: consistently use IDR0_STALL_MODEL_* prefix]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When acknowledging global errors, the GERRORN register should be written
with the original GERROR value so that active errors are toggled.
This patch fixed the driver to write the original GERROR value to
GERRORN, instead of an active error mask.
Signed-off-by: Prem Mallappa <pmallapp@broadcom.com>
[will: reworked use of active bits and fixed commit log]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When invalidating an IOVA range potentially spanning multiple pages,
such as when removing an entire intermediate-level table, we currently
only issue an invalidation for the first IOVA of that range. Since the
architecture specifies that address-based TLB maintenance operations
target a single entry, an SMMU could feasibly retain live entries for
subsequent pages within that unmapped range, which is not good.
Make sure we hit every possible entry by iterating over the whole range
at the granularity provided by the pagetable implementation.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: added missing semicolons...]
Signed-off-by: Will Deacon <will.deacon@arm.com>
IOMMU hardware with range-based TLB maintenance commands can work
happily with the iova and size arguments passed via the tlb_add_flush
callback, but for IOMMUs which require separate commands per entry in
the range, it is not straightforward to infer the necessary granularity
when it comes to issuing the actual commands.
Add an additional argument indicating the granularity for the benefit
of drivers needing to know, and update the ARM LPAE code appropriately
(for non-leaf invalidations we currently just assume the worst-case
page granularity rather than walking the table to check).
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Whilst the architecture only defines a few of the possible CERROR values,
we should handle unknown values gracefully rather than go out of bounds
trying to print out an error description.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The basic flow for add a device:
arm_smmu_add_device
|->iommu_group_get_for_dev
|->iommu_group_get
return group; (1)
|->ops->device_group : Init/increase reference count to/by 1.
|->iommu_group_add_device : Increase reference count by 1.
return group (2)
|->return 0;
Since we are adding one device, the flow is (2) and the group reference
count will be increased by 2. So, we need to add iommu_group_put at the
end of arm_smmu_add_device to decrease the count by 1.
Also take the failure path into consideration when fail to add a device.
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When we initialise a bypass STE, we memset the structure to zero and
set the Valid and Config fields to indicate that the stream should
bypass the SMMU. Unfortunately, this results in an SHCFG field of 0
which means that the shareability of any incoming transactions is
overridden with non-shareable, leading to potential coherence problems
down the line.
This patch fixes the issue by initialising bypass STEs to use the
incoming shareability attributes. When translation is in effect at
either stage 1 or stage 2, the shareability is determined by the
page tables.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMUv3 driver uses dma_{alloc,free}_coherent to manage its
queues and configuration data structures.
This patch converts the driver to the managed (dmam_*) API, so that
resources are freed automatically on device teardown. This greatly
simplifies the failure paths and allows us to remove a bunch of
handcrafted freeing code.
Signed-off-by: Will Deacon <will.deacon@arm.com>
PRIQ_0_OF has been removed from the SMMUv3 architecture, so remove its
corresponding (and unused) #define from the driver.
Signed-off-by: Will Deacon <will.deacon@arm.com>
This converts the ARM SMMU and the SMMUv3 driver to use the
new device_group call-back.
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Despite being a platform device, the SMMUv3 is capable of signaling
interrupts using MSIs. Hook it into the platform MSI framework and
enjoy faults being reported in a new and exciting way.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[will: tidied up the binding example and reworked most of the code]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The bitmap allocator returns an int, which is one of the standard
negative values on failure. Rather than assigning this straight to a
u16 (like we do for the ASID and VMID callers), which means that we
won't detect failure correctly, use an int for the purposes of error
checking.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Rather than keep a private list of struct arm_smmu_device and searching
this whenever we need to look up the correct SMMU instance, instead use
the drvdata field in the struct device to take care of the mapping for
us.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Stage-2 TLBI by IPA takes a 48-bit address field, as opposed to the
64-bit field used by the VA-based invalidation commands.
This patch re-jigs the SMMUv3 command construction code so that the
address field is correctly masked.
Signed-off-by: Will Deacon <will.deacon@arm.com>
AArch32-capable SMMU implementations have a minimum IAS of 40 bits, so
ensure that is reflected in the stage-2 page table configuration.
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the io-pgtable code now enforcing its own appropriate sync points,
the vestigial flush_pgtable callback becomes entirely redundant, so
remove it altogether.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With the correct DMA API calls now integrated into the io-pgtable code,
let that handle the flushing of non-coherent page table updates.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A late change to the SMMUv3 architecture ensures that the OAS field
will be monotonically increasing, so we can assume that an unknown OAS
is at least 48-bit and use that, rather than fail the device probe.
Signed-off-by: Will Deacon <will.deacon@arm.com>
If the StreamIDs in a system can all be resolved by a single level-2
stream table (i.e. SIDSIZE < SPLIT), then we currently get our maths
wrong and allocate the largest strtab we support, thanks to unsigned
overflow in our calculation.
This patch fixes the issue by checking the SIDSIZE explicitly when
calculating the size of our first-level stream table.
Reported-by: Matt Evans <matt.evans@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The MSI memory attributes in the SMMUv3 driver are from an older
revision of the spec, which doesn't match the current implementations.
Out with the old, in with the new.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When an ARM SMMUv3 instance supports PRI, the driver registers
an interrupt handler, but fails to enable the generation of
such interrupt at the SMMU level.
This patches simply moves the enable flags to a variable that
gets updated by the PRI handling code before being written to the
SMMU register.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Hisilicon SMMUv3 devices treat CMD_PREFETCH_CONFIG as a illegal command,
execute it will trigger GERROR interrupt. Although the gerror code manage
to turn the prefetch into a SYNC, and the system can continue to run
normally, but it's ugly to print error information.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
[will: extended binding documentation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Because we will choose the minimum value between STRTAB_L1_SZ_SHIFT and
IDR1.SIDSIZE, so enlarge STRTAB_L1_SZ_SHIFT will not impact the platforms
whose IDR1.SIDSIZE is smaller than old STRTAB_L1_SZ_SHIFT value.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm64 CPU architecture defines TCR[8:11] as holding the inner and
outer memory attributes for TTBR0.
This patch fixes the ARM SMMUv3 driver to pack these bits into the
context descriptor, rather than picking up the TTBR1 attributes as it
currently does.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
STRTAB_BASE_CFG.LOG2SIZE should be set to log2(entries), where entries
is the *total* number of entries in the stream table, not just the first
level.
This patch fixes the register setting, which was previously being set to
the size of the l1 thanks to a multi-use "size" variable.
Reported-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The element size of cfg->strtab is just one DWORD, so we should use a
multiply operation instead of a shift when calculating the level 1
index.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The free_io_pgtable_ops() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Version three of the ARM SMMU architecture introduces significant
changes and improvements over previous versions of the specification,
necessitating a new driver in the Linux kernel.
The main change to the programming interface is that the majority of the
configuration data has been moved from MMIO registers to in-memory data
structures, with communication between the CPU and the SMMU being
mediated via in-memory circular queues.
This patch adds an initial driver for SMMUv3 to Linux. We currently
support pinned stage-1 (DMA) and stage-2 (KVM VFIO) mappings using the
generic IO-pgtable code.
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>