Commit Graph

189 Commits

Author SHA1 Message Date
Parav Pandit 7a0f06c197 iommu/vt-d: No need to typecast
Page directory assignment by alloc_pgtable_page() or phys_to_virt()
doesn't need typecasting as both routines return void*. Hence, remove
typecasting from both the calls.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-24-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:14 +02:00
Parav Pandit cee57d4fe7 iommu/vt-d: Remove unnecessary braces
No need for braces for single line statement under if() block.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-22-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:14 +02:00
Parav Pandit 74f6d776ae iommu/vt-d: Removed unused iommu_count in dmar domain
DMAR domain uses per DMAR refcount. It is indexed by iommu seq_id.
Older iommu_count is only incremented and decremented but no decisions
are taken based on this refcount. This is not of much use.

Hence, remove iommu_count and further simplify domain_detach_iommu()
by returning void.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-21-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:14 +02:00
Parav Pandit 1f106ff0ea iommu/vt-d: Use bitfields for DMAR capabilities
IOTLB device presence, iommu coherency and snooping are boolean
capabilities. Use them as bits and keep them adjacent.

Structure layout before the reorg.
$ pahole -C dmar_domain drivers/iommu/intel/dmar.o
struct dmar_domain {
        int                        nid;                  /*     0     4 */
        unsigned int               iommu_refcnt[128];    /*     4   512 */
        /* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
        u16                        iommu_did[128];       /*   516   256 */
        /* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
        bool                       has_iotlb_device;     /*   772     1 */

        /* XXX 3 bytes hole, try to pack */

        struct list_head           devices;              /*   776    16 */
        struct list_head           subdevices;           /*   792    16 */
        struct iova_domain         iovad __attribute__((__aligned__(8)));
							 /*   808  2320 */
        /* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
        struct dma_pte *           pgd;                  /*  3128     8 */
        /* --- cacheline 49 boundary (3136 bytes) --- */
        int                        gaw;                  /*  3136     4 */
        int                        agaw;                 /*  3140     4 */
        int                        flags;                /*  3144     4 */
        int                        iommu_coherency;      /*  3148     4 */
        int                        iommu_snooping;       /*  3152     4 */
        int                        iommu_count;          /*  3156     4 */
        int                        iommu_superpage;      /*  3160     4 */

        /* XXX 4 bytes hole, try to pack */

        u64                        max_addr;             /*  3168     8 */
        u32                        default_pasid;        /*  3176     4 */

        /* XXX 4 bytes hole, try to pack */

        struct iommu_domain        domain;               /*  3184    72 */

        /* size: 3256, cachelines: 51, members: 18 */
        /* sum members: 3245, holes: 3, sum holes: 11 */
        /* forced alignments: 1 */
        /* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));

After arranging it for natural padding and to make flags as u8 bits, it
saves 8 bytes for the struct.

struct dmar_domain {
        int                        nid;                  /*     0     4 */
        unsigned int               iommu_refcnt[128];    /*     4   512 */
        /* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
        u16                        iommu_did[128];       /*   516   256 */
        /* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
        u8                         has_iotlb_device:1;   /*   772: 0  1 */
        u8                         iommu_coherency:1;    /*   772: 1  1 */
        u8                         iommu_snooping:1;     /*   772: 2  1 */

        /* XXX 5 bits hole, try to pack */
        /* XXX 3 bytes hole, try to pack */

        struct list_head           devices;              /*   776    16 */
        struct list_head           subdevices;           /*   792    16 */
        struct iova_domain         iovad __attribute__((__aligned__(8)));
							 /*   808  2320 */
        /* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
        struct dma_pte *           pgd;                  /*  3128     8 */
        /* --- cacheline 49 boundary (3136 bytes) --- */
        int                        gaw;                  /*  3136     4 */
        int                        agaw;                 /*  3140     4 */
        int                        flags;                /*  3144     4 */
        int                        iommu_count;          /*  3148     4 */
        int                        iommu_superpage;      /*  3152     4 */

        /* XXX 4 bytes hole, try to pack */

        u64                        max_addr;             /*  3160     8 */
        u32                        default_pasid;        /*  3168     4 */

        /* XXX 4 bytes hole, try to pack */

        struct iommu_domain        domain;               /*  3176    72 */

        /* size: 3248, cachelines: 51, members: 18 */
        /* sum members: 3236, holes: 3, sum holes: 11 */
        /* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 5 bits */
        /* forced alignments: 1 */
        /* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-20-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
YueHaibing 3bc770b0e9 iommu/vt-d: Use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210528130229.22108-1-yuehaibing@huawei.com
Link: https://lore.kernel.org/r/20210610020115.1637656-19-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu d5b9e4bfe0 iommu/vt-d: Report prq to io-pgfault framework
Let the IO page fault requests get handled through the io-pgfault
framework.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu 4c82b88696 iommu/vt-d: Allocate/register iopf queue for sva devices
This allocates and registers the iopf queue infrastructure for devices
which want to support IO page fault for SVA.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:13 +02:00
Lu Baolu 4048377414 iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers
Align the pasid alloc/free code with the generic helpers defined in the
iommu core. This also refactored the SVA binding code to improve the
readability.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Lu Baolu 521f546b4e iommu/vt-d: Support asynchronous IOMMU nested capabilities
Current VT-d implementation supports nested translation only if all
underlying IOMMUs support the nested capability. This is unnecessary
as the upper layer is allowed to create different containers and set
them with different type of iommu backend. The IOMMU driver needs to
guarantee that devices attached to a nested mode iommu_domain should
support nested capabilility.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210517065701.5078-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Colin Ian King 05d2cbf969 iommu/vt-d: Remove redundant assignment to variable agaw
The variable agaw is initialized with a value that is never read and it
is being updated later with a new value as a counter in a for-loop. The
initialization is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210416171826.64091-1-colin.king@canonical.com
Link: https://lore.kernel.org/r/20210610020115.1637656-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-06-10 09:06:12 +02:00
Lu Baolu 54c80d9074 iommu/vt-d: Use user privilege for RID2PASID translation
When first-level page tables are used for IOVA translation, we use user
privilege by setting U/S bit in the page table entry. This is to make it
consistent with the second level translation, where the U/S enforcement
is not available. Clear the SRE (Supervisor Request Enable) field in the
pasid table entry of RID2PASID so that requests requesting the supervisor
privilege are blocked and treated as DMA remapping faults.

Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210512064426.3440915-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210519015027.108468-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-05-19 08:51:02 +02:00
Dan Carpenter 1a590a1c8b iommu/vt-d: Check for allocation failure in aux_detach_device()
In current kernels small allocations never fail, but checking for
allocation failure is the correct thing to do.

Fixes: 18abda7a2d ("iommu/vt-d: Fix general protection fault in aux_detach_device()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/YJuobKuSn81dOPLd@mwanda
Link: https://lore.kernel.org/r/20210519015027.108468-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-05-19 08:51:02 +02:00
Robin Murphy 2d471b20c5 iommu: Streamline registration interface
Rather than have separate opaque setter functions that are easy to
overlook and lead to repetitive boilerplate in drivers, let's pass the
relevant initialisation parameters directly to iommu_device_register().

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-16 17:20:45 +02:00
Joerg Roedel 49d11527e5 Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', 'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next 2021-04-16 17:16:03 +02:00
Longpeng(Mike) 38c527aeb4 iommu/vt-d: Force to flush iotlb before creating superpage
The translation caches may preserve obsolete data when the
mapping size is changed, suppose the following sequence which
can reveal the problem with high probability.

1.mmap(4GB,MAP_HUGETLB)
2.
  while (1) {
   (a)    DMA MAP   0,0xa0000
   (b)    DMA UNMAP 0,0xa0000
   (c)    DMA MAP   0,0xc0000000
             * DMA read IOVA 0 may failure here (Not present)
             * if the problem occurs.
   (d)    DMA UNMAP 0,0xc0000000
  }

The page table(only focus on IOVA 0) after (a) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003  entry:0xffff89b39cacb000
    PTE: 0x21d200803  entry:0xffff89b3b0a72000

The page table after (b) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003  entry:0xffff89b39cacb000
    PTE: 0x0          entry:0xffff89b3b0a72000

The page table after (c) is:
 PML4: 0x19db5c1003   entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003  entry:0xffff89b35b5c1000
   PDE: 0x21d200883   entry:0xffff89b39cacb000 (*)

Because the PDE entry after (b) is present, it won't be
flushed even if the iommu driver flush cache when unmap,
so the obsolete data may be preserved in cache, which
would cause the wrong translation at end.

However, we can see the PDE entry is finally switch to
2M-superpage mapping, but it does not transform
to 0x21d200883 directly:

1. PDE: 0x1a30a72003
2. __domain_mapping
     dma_pte_free_pagetable
       Set the PDE entry to ZERO
     Set the PDE entry to 0x21d200883

So we must flush the cache after the entry switch to ZERO
to avoid the obsolete info be preserved.

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Gonglei (Arei) <arei.gonglei@huawei.com>

Fixes: 6491d4d028 ("intel-iommu: Free old page tables before creating superpage")
Cc: <stable@vger.kernel.org> # v3.0+
Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210415004628.1779-1-longpeng2@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-15 16:12:31 +02:00
Lu Baolu c0474a606e iommu/vt-d: Invalidate PASID cache when root/context entry changed
When the Intel IOMMU is operating in the scalable mode, some information
from the root and context table may be used to tag entries in the PASID
cache. Software should invalidate the PASID-cache when changing root or
context table entries.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: 7373a8cc38 ("iommu/vt-d: Setup context and enable RID2PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 11:55:47 +02:00
Lu Baolu eea53c5816 iommu/vt-d: Remove WO permissions on second-level paging entries
When the first level page table is used for IOVA translation, it only
supports Read-Only and Read-Write permissions. The Write-Only permission
is not supported as the PRESENT bit (implying Read permission) should
always set. When using second level, we still give separate permissions
that allows WriteOnly which seems inconsistent and awkward. We want to
have consistent behavior. After moving to 1st level, we don't want things
to work sometimes, and break if we use 2nd level for the same mappings.
Hence remove this configuration.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 11:55:47 +02:00
Robin Murphy a250c23f15 iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
Instead make the global iommu_dma_strict paramete in iommu.c canonical by
exporting helpers to get and set it and use those directly in the drivers.

This make sure that the iommu.strict parameter also works for the AMD and
Intel IOMMU drivers on x86.  As those default to lazy flushing a new
IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
represent the default if not overriden by an explicit parameter.

[ported on top of the other iommu_attr changes and added a few small
 missing bits]

Signed-off-by: Robin Murphy <robin.murphy@arm.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210401155256.298656-19-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:56:53 +02:00
Christoph Hellwig 7e14754778 iommu: remove DOMAIN_ATTR_NESTING
Use an explicit enable_nesting method instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Li Yang <leoyang.li@nxp.com>
Link: https://lore.kernel.org/r/20210401155256.298656-17-hch@lst.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:56:53 +02:00
Jean-Philippe Brucker 9003351cb6 iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF
Allow drivers to query and enable IOMMU_DEV_FEAT_IOPF, which amounts to
checking whether PRI is enabled.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210401154718.307519-5-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:54:29 +02:00
Lu Baolu 6c00612d0c iommu/vt-d: Report right snoop capability when using FL for IOVA
The Intel VT-d driver checks wrong register to report snoop capablility
when using first level page table for GPA to HPA translation. This might
lead the IOMMU driver to say that it supports snooping control, but in
reality, it does not. Fix this by always setting PASID-table-entry.PGSNP
whenever a pasid entry is setting up for GPA to HPA translation so that
the IOMMU driver could report snoop capability as long as it runs in the
scalable mode.

Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Suggested-by: Rajesh Sankaran <rajesh.sankaran@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210330021145.13824-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:41:30 +02:00
John Garry 363f266eef iommu/vt-d: Remove IOVA domain rcache flushing for CPU offlining
Now that the core code handles flushing per-IOVA domain CPU rcaches,
remove the handling here.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1616675401-151997-3-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-07 10:27:27 +02:00
Kyung Min Park dec991e472 iommu/vt-d: Disable SVM when ATS/PRI/PASID are not enabled in the device
Currently, the Intel VT-d supports Shared Virtual Memory (SVM) only when
IO page fault is supported. Otherwise, shared memory pages can not be
swapped out and need to be pinned. The device needs the Address Translation
Service (ATS), Page Request Interface (PRI) and Process Address Space
Identifier (PASID) capabilities to be enabled to support IO page fault.

Disable SVM when ATS, PRI and PASID are not enabled in the device.

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210314201534.918-1-kyung.min.park@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 11:23:52 +01:00
Robin Murphy 3542dcb15c iommu/dma: Resurrect the "forcedac" option
In converting intel-iommu over to the common IOMMU DMA ops, it quietly
lost the functionality of its "forcedac" option. Since this is a handy
thing both for testing and for performance optimisation on certain
platforms, reimplement it under the common IOMMU parameter namespace.

For the sake of fixing the inadvertent breakage of the Intel-specific
parameter, remove the dmar_forcedac remnants and hook it up as an alias
while documenting the transition to the new common parameter.

Fixes: c588072bba ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/7eece8e0ea7bfbe2cd0e30789e0d46df573af9b0.1614961776.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-03-18 10:55:23 +01:00
Joerg Roedel 45e606f272 Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next 2021-02-12 15:27:17 +01:00
Yian Chen 31a75cbbb9 iommu/vt-d: Parse SATC reporting structure
Software should parse every SATC table and all devices in the tables
reported by the BIOS and keep the information in kernel list for further
reference.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210203093329.1617808-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Lu Baolu 933fcd01e9 iommu/vt-d: Add iotlb_sync_map callback
Some Intel VT-d hardware implementations don't support memory coherency
for page table walk (presented by the Page-Walk-coherency bit in the
ecap register), so that software must flush the corresponding CPU cache
lines explicitly after each page table entry update.

The iommu_map_sg() code iterates through the given scatter-gather list
and invokes iommu_map() for each element in the scatter-gather list,
which calls into the vendor IOMMU driver through iommu_ops callback. As
the result, a single sg mapping may lead to multiple cache line flushes,
which leads to the degradation of I/O performance after the commit
<c588072bba6b5> ("iommu/vt-d: Convert intel iommu driver to the iommu
ops").

Fix this by adding iotlb_sync_map callback and centralizing the clflush
operations after all sg mappings.

Fixes: c588072bba ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/linux-iommu/D81314ED-5673-44A6-B597-090E3CB83EB0@oracle.com/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
[ cel: removed @first_pte, which is no longer used ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/linux-iommu/161177763962.1311.15577661784296014186.stgit@manet.1015granger.net
Link: https://lore.kernel.org/r/20210204014401.2846425-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Kyung Min Park 010bf5659e iommu/vt-d: Move capability check code to cap_audit files
Move IOMMU capability check and sanity check code to cap_audit files.
Also implement some helper functions for sanity checks.

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Kyung Min Park ad3d190299 iommu/vt-d: Audit IOMMU Capabilities and add helper functions
Audit IOMMU Capability/Extended Capability and check if the IOMMUs have
the consistent value for features. Report out or scale to the lowest
supported when IOMMU features have incompatibility among IOMMUs.

Report out features when below features are mismatched:
  - First Level 5 Level Paging Support (FL5LP)
  - First Level 1 GByte Page Support (FL1GP)
  - Read Draining (DRD)
  - Write Draining (DWD)
  - Page Selective Invalidation (PSI)
  - Zero Length Read (ZLR)
  - Caching Mode (CM)
  - Protected High/Low-Memory Region (PHMR/PLMR)
  - Required Write-Buffer Flushing (RWBF)
  - Advanced Fault Logging (AFL)
  - RID-PASID Support (RPS)
  - Scalable Mode Page Walk Coherency (SMPWC)
  - First Level Translation Support (FLTS)
  - Second Level Translation Support (SLTS)
  - No Write Flag Support (NWFS)
  - Second Level Accessed/Dirty Support (SLADS)
  - Virtual Command Support (VCS)
  - Scalable Mode Translation Support (SMTS)
  - Device TLB Invalidation Throttle (DIT)
  - Page Drain Support (PDS)
  - Process Address Space ID Support (PASID)
  - Extended Accessed Flag Support (EAFS)
  - Supervisor Request Support (SRS)
  - Execute Request Support (ERS)
  - Page Request Support (PRS)
  - Nested Translation Support (NEST)
  - Snoop Control (SC)
  - Pass Through (PT)
  - Device TLB Support (DT)
  - Queued Invalidation (QI)
  - Page walk Coherency (C)

Set capability to the lowest supported when below features are mismatched:
  - Maximum Address Mask Value (MAMV)
  - Number of Fault Recording Registers (NFR)
  - Second Level Large Page Support (SLLPS)
  - Fault Recording Offset (FRO)
  - Maximum Guest Address Width (MGAW)
  - Supported Adjusted Guest Address Width (SAGAW)
  - Number of Domains supported (NDOMS)
  - Pasid Size Supported (PSS)
  - Maximum Handle Mask Value (MHMV)
  - IOTLB Register Offset (IRO)

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Lu Baolu e1ed66ac30 iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration]
trace_qi_submit() could be used when interrupt remapping is supported,
but DMA remapping is not. In this case, the following compile error
occurs.

../drivers/iommu/intel/dmar.c: In function 'qi_submit_sync':
../drivers/iommu/intel/dmar.c:1311:3: error: implicit declaration of function 'trace_qi_submit';
  did you mean 'ftrace_nmi_exit'? [-Werror=implicit-function-declaration]
   trace_qi_submit(iommu, desc[i].qw0, desc[i].qw1,
   ^~~~~~~~~~~~~~~
   ftrace_nmi_exit

Fixes: f2dd871799 ("iommu/vt-d: Add qi_submit trace event")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130151907.3929148-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-02 14:43:56 +01:00
Nadav Amit 29b3283972 iommu/vt-d: Do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.

Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500 ("intel-iommu: Avoid global flushes with caching
mode.")

This behavior changed after commit 13cf017446 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.

Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.

Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.

Fixes: 13cf017446 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org

Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 13:59:02 +01:00
Lu Baolu a8ce9ebbec iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
The Access/Dirty bits in the first level page table entry will be set
whenever a page table entry was used for address translation or write
permission was successfully translated. This is always true when using
the first-level page table for kernel IOVA. Instead of wasting hardware
cycles to update the certain bits, it's better to set them up at the
beginning.

Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-01-28 11:33:35 +01:00
Tian Tao 694a1c0ade iommu/vt-d: Fix duplicate included linux/dma-map-ops.h
linux/dma-map-ops.h is included more than once, Remove the one that
isn't necessary.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609118774-10083-1-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-12 16:56:20 +00:00
Liu Yi L 7c29ada5e7 iommu/vt-d: Fix ineffective devTLB invalidation for subdevices
iommu_flush_dev_iotlb() is called to invalidate caches on a device but
only loops over the devices which are fully-attached to the domain. For
sub-devices, this is ineffective and can result in invalid caching
entries left on the device.

Fix the missing invalidation by adding a loop over the subdevices and
ensuring that 'domain->has_iotlb_device' is updated when attaching to
subdevices.

Fixes: 67b8e02b5e ("iommu/vt-d: Aux-domain specific domain attach/detach")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-4-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:38:15 +00:00
Liu Yi L 18abda7a2d iommu/vt-d: Fix general protection fault in aux_detach_device()
The aux-domain attach/detach are not tracked, some data structures might
be used after free. This causes general protection faults when multiple
subdevices are created and assigned to a same guest machine:

  | general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI
  | RIP: 0010:intel_iommu_aux_detach_device+0x12a/0x1f0
  | [...]
  | Call Trace:
  |  iommu_aux_detach_device+0x24/0x70
  |  vfio_mdev_detach_domain+0x3b/0x60
  |  ? vfio_mdev_set_domain+0x50/0x50
  |  iommu_group_for_each_dev+0x4f/0x80
  |  vfio_iommu_detach_group.isra.0+0x22/0x30
  |  vfio_iommu_type1_detach_group.cold+0x71/0x211
  |  ? find_exported_symbol_in_section+0x4a/0xd0
  |  ? each_symbol_section+0x28/0x50
  |  __vfio_group_unset_container+0x4d/0x150
  |  vfio_group_try_dissolve_container+0x25/0x30
  |  vfio_group_put_external_user+0x13/0x20
  |  kvm_vfio_group_put_external_user+0x27/0x40 [kvm]
  |  kvm_vfio_destroy+0x45/0xb0 [kvm]
  |  kvm_put_kvm+0x1bb/0x2e0 [kvm]
  |  kvm_vm_release+0x22/0x30 [kvm]
  |  __fput+0xcc/0x260
  |  ____fput+0xe/0x10
  |  task_work_run+0x8f/0xb0
  |  do_exit+0x358/0xaf0
  |  ? wake_up_state+0x10/0x20
  |  ? signal_wake_up_state+0x1a/0x30
  |  do_group_exit+0x47/0xb0
  |  __x64_sys_exit_group+0x18/0x20
  |  do_syscall_64+0x57/0x1d0
  |  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix the crash by tracking the subdevices when attaching and detaching
aux-domains.

Fixes: 67b8e02b5e ("iommu/vt-d: Aux-domain specific domain attach/detach")
Co-developed-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-3-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-07 14:35:14 +00:00
Will Deacon c74009f529 Merge branch 'for-next/iommu/fixes' into for-next/iommu/core
Merge in IOMMU fixes for 5.10 in order to resolve conflicts against the
queue for 5.11.

* for-next/iommu/fixes:
  iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
  iommu/vt-d: Don't read VCCAP register unless it exists
  x86/tboot: Don't disable swiotlb when iommu is forced on
  iommu: Check return of __iommu_attach_device()
  arm-smmu-qcom: Ensure the qcom_scm driver has finished probing
  iommu/amd: Enforce 4k mapping for certain IOMMU data structures
  MAINTAINERS: Temporarily add myself to the IOMMU entry
  iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set
  iommu/vt-d: Avoid panic if iommu init fails in tboot system
  iommu/vt-d: Cure VF irqdomain hickup
  x86/platform/uv: Fix copied UV5 output archtype
  x86/platform/uv: Drop last traces of uv_flush_tlb_others
2020-12-08 15:21:49 +00:00
Will Deacon 113eb4ce4f Merge branch 'for-next/iommu/vt-d' into for-next/iommu/core
Intel VT-D updates for 5.11. The main thing here is converting the code
over to the iommu-dma API, which required some improvements to the core
code to preserve existing functionality.

* for-next/iommu/vt-d:
  iommu/vt-d: Avoid GFP_ATOMIC where it is not needed
  iommu/vt-d: Remove set but not used variable
  iommu/vt-d: Cleanup after converting to dma-iommu ops
  iommu/vt-d: Convert intel iommu driver to the iommu ops
  iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
  iommu: Add quirk for Intel graphic devices in map_sg
  iommu: Allow the dma-iommu api to use bounce buffers
  iommu: Add iommu_dma_free_cpu_cached_iovas()
  iommu: Handle freelists when using deferred flushing in iommu drivers
  iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM
2020-12-08 15:11:58 +00:00
Will Deacon a5f12de3ec Merge branch 'for-next/iommu/svm' into for-next/iommu/core
More steps along the way to Shared Virtual {Addressing, Memory} support
for Arm's SMMUv3, including the addition of a helper library that can be
shared amongst other IOMMU implementations wishing to support this
feature.

* for-next/iommu/svm:
  iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
  iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()
  iommu/sva: Add PASID helpers
  iommu/ioasid: Add ioasid references
2020-12-08 15:07:49 +00:00
Christophe JAILLET 33e0715710 iommu/vt-d: Avoid GFP_ATOMIC where it is not needed
There is no reason to use GFP_ATOMIC in a 'suspend' function.
Use GFP_KERNEL instead to give more opportunities to allocate the
requested memory.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201030182630.5154-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201201013149.2466272-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-12-01 15:02:20 +00:00
Lu Baolu 405a43cc00 iommu/vt-d: Remove set but not used variable
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/iommu/intel/iommu.c:5643:27: warning: variable 'last_pfn' set but not used [-Wunused-but-set-variable]
5643 |  unsigned long start_pfn, last_pfn;
     |                           ^~~~~~~~

This variable is never used, so remove it.

Fixes: 2a2b8eaa5b ("iommu: Handle freelists when using deferred flushing in iommu drivers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201127013308.1833610-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-27 09:52:28 +00:00
David Woodhouse d76b42e927 iommu/vt-d: Don't read VCCAP register unless it exists
My virtual IOMMU implementation is whining that the guest is reading a
register that doesn't exist. Only read the VCCAP_REG if the corresponding
capability is set in ECAP_REG to indicate that it actually exists.

Fixes: 3375303e82 ("iommu/vt-d: Add custom allocator for IOASID")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Cc: stable@vger.kernel.org # v5.7+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/de32b150ffaa752e0cff8571b17dfb1213fbe71c.camel@infradead.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-26 14:50:24 +00:00
Lu Baolu 28b41e2c6a iommu: Move def_domain type check for untrusted device into core
So that the vendor iommu drivers are no more required to provide the
def_domain_type callback to always isolate the untrusted devices.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/linux-iommu/243ce89c33fe4b9da4c56ba35acebf81@huawei.com/
Link: https://lore.kernel.org/r/20201124130604.2912899-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:14:33 +00:00
Lu Baolu 58a8bb3949 iommu/vt-d: Cleanup after converting to dma-iommu ops
Some cleanups after converting the driver to use dma-iommu ops.
- Remove nobounce option;
- Cleanup and simplify the path in domain mapping.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-8-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:49 +00:00
Tom Murphy c588072bba iommu/vt-d: Convert intel iommu driver to the iommu ops
Convert the intel iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the intel iommu driver.

Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-7-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:49 +00:00
Lu Baolu c062db039f iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
The iommu-dma constrains IOVA allocation based on the domain geometry
that the driver reports. Update domain geometry everytime a domain is
attached to or detached from a device.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:48 +00:00
Tom Murphy 2a2b8eaa5b iommu: Handle freelists when using deferred flushing in iommu drivers
Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.

This is useful for iommu drivers (in this case the intel iommu driver)
which need to wait for the ioTLB to be flushed before newly
free/unmapped page table pages can be freed. This way we can still batch
ioTLB free operations and handle the freelists.

Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-2-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 12:03:48 +00:00
Jean-Philippe Brucker cb4789b0d1 iommu/ioasid: Add ioasid references
Let IOASID users take references to existing ioasids with ioasid_get().
ioasid_put() drops a reference and only frees the ioasid when its
reference number is zero. It returns true if the ioasid was freed.
For drivers that don't call ioasid_get(), ioasid_put() is the same as
ioasid_free().

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-23 14:16:55 +00:00
Will Deacon 66930e7e1e Merge branch 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb into for-next/iommu/vt-d
Merge swiotlb updates from Konrad, as we depend on the updated function
prototype for swiotlb_tbl_map_single(), which dropped the 'tbl_dma_addr'
argument in -rc4.

* 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single
  swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"
2020-11-23 10:46:43 +00:00
Zhenzhong Duan 4d213e76a3 iommu/vt-d: Avoid panic if iommu init fails in tboot system
"intel_iommu=off" command line is used to disable iommu but iommu is force
enabled in a tboot system for security reason.

However for better performance on high speed network device, a new option
"intel_iommu=tboot_noforce" is introduced to disable the force on.

By default kernel should panic if iommu init fail in tboot for security
reason, but it's unnecessory if we use "intel_iommu=tboot_noforce,off".

Fix the code setting force_on and move intel_iommu_tboot_noforce
from tboot code to intel iommu code.

Fixes: 7304e8f28b ("iommu/vt-d: Correctly disable Intel IOMMU force on")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Tested-by: Lukasz Hawrylko <lukasz.hawrylko@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201110071908.3133-1-zhenzhong.duan@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-18 13:09:07 +00:00
Lukas Bulwahn 68dd9d89ea iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM
Commit 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
introduced intel_iommu_sva_invalidate() when CONFIG_INTEL_IOMMU_SVM.
This function uses the dedicated static variable inv_type_granu_table
and functions to_vtd_granularity() and to_vtd_size().

These parts are unused when !CONFIG_INTEL_IOMMU_SVM, and hence,
make CC=clang W=1 warns with an -Wunused-function warning.

Include these parts conditionally on CONFIG_INTEL_IOMMU_SVM.

Fixes: 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201115205951.20698-1-lukas.bulwahn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-17 22:29:49 +00:00
Lu Baolu 6097df457a iommu/vt-d: Fix kernel NULL pointer dereference in find_domain()
If calling find_domain() for a device which hasn't been probed by the
iommu core, below kernel NULL pointer dereference issue happens.

[  362.736947] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  362.743953] #PF: supervisor read access in kernel mode
[  362.749115] #PF: error_code(0x0000) - not-present page
[  362.754278] PGD 0 P4D 0
[  362.756843] Oops: 0000 [#1] SMP NOPTI
[  362.760528] CPU: 0 PID: 844 Comm: cat Not tainted 5.9.0-rc4-intel-next+ #1
[  362.767428] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake
               U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3384.A02.1909200816
               09/20/2019
[  362.781109] RIP: 0010:find_domain+0xd/0x40
[  362.785234] Code: 48 81 fb 60 28 d9 b2 75 de 5b 41 5c 41 5d 5d c3 0f 1f 00 66
                     2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 e0 02 00
                     00 55 <48> 8b 40 38 48 89 e5 48 83 f8 fe 0f 94 c1 48 85 ff
                     0f 94 c2 08 d1
[  362.804041] RSP: 0018:ffffb09cc1f0bd38 EFLAGS: 00010046
[  362.809292] RAX: 0000000000000000 RBX: ffff905b98e4fac8 RCX: 0000000000000000
[  362.816452] RDX: 0000000000000001 RSI: ffff905b98e4fac8 RDI: ffff905b9ccd40d0
[  362.823617] RBP: ffffb09cc1f0bda0 R08: ffffb09cc1f0bd48 R09: 000000000000000f
[  362.830778] R10: ffffffffb266c080 R11: ffff905b9042602d R12: ffff905b98e4fac8
[  362.837944] R13: ffffb09cc1f0bd48 R14: ffff905b9ccd40d0 R15: ffff905b98e4fac8
[  362.845108] FS:  00007f8485460740(0000) GS:ffff905b9fc00000(0000)
               knlGS:0000000000000000
[  362.853227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  362.858996] CR2: 0000000000000038 CR3: 00000004627a6003 CR4: 0000000000770ef0
[  362.866161] PKRU: fffffffc
[  362.868890] Call Trace:
[  362.871363]  ? show_device_domain_translation+0x32/0x100
[  362.876700]  ? bind_store+0x110/0x110
[  362.880387]  ? klist_next+0x91/0x120
[  362.883987]  ? domain_translation_struct_show+0x50/0x50
[  362.889237]  bus_for_each_dev+0x79/0xc0
[  362.893121]  domain_translation_struct_show+0x36/0x50
[  362.898204]  seq_read+0x135/0x410
[  362.901545]  ? handle_mm_fault+0xeb8/0x1750
[  362.905755]  full_proxy_read+0x5c/0x90
[  362.909526]  vfs_read+0xa6/0x190
[  362.912782]  ksys_read+0x61/0xe0
[  362.916037]  __x64_sys_read+0x1a/0x20
[  362.919725]  do_syscall_64+0x37/0x80
[  362.923329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  362.928405] RIP: 0033:0x7f84855c5e95

Filter out those devices to avoid such error.

Fixes: e2726daea5 ("iommu/vt-d: debugfs: Add support to show page table internals")
Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org#v5.6+
Link: https://lore.kernel.org/r/20201028070725.24979-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-11-03 14:30:10 +01:00
Christoph Hellwig fc0021aa34 swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single
The tbl_dma_addr argument is used to check the DMA boundary for the
allocations, and thus needs to be a dma_addr_t.  swiotlb-xen instead
passed a physical address, which could lead to incorrect results for
strange offsets.  Fix this by removing the parameter entirely and hard
code the DMA address for io_tlb_start instead.

Fixes: 91ffe4ad53 ("swiotlb-xen: introduce phys_to_dma/dma_to_phys translations")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2020-11-02 10:10:39 -05:00
Linus Torvalds 5a32c3413d dma-mapping updates for 5.10
- rework the non-coherent DMA allocator
  - move private definitions out of <linux/dma-mapping.h>
  - lower CMA_ALIGNMENT (Paul Cercueil)
  - remove the omap1 dma address translation in favor of the common
    code
  - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
  - support per-node DMA CMA areas (Barry Song)
  - increase the default seg boundary limit (Nicolin Chen)
  - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
  - various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl+IiPwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPKEQ//TM8vxjucnRl/pklpMin49dJorwiVvROLhQqLmdxw
 286ZKpVzYYAPc7LnNqwIBugnFZiXuHu8xPKQkIiOa2OtNDTwhKNoBxOAmOJaV6DD
 8JfEtZYeX5mKJ/Nqd2iSkIqOvCwZ9Wzii+aytJ2U88wezQr1fnyF4X49MegETEey
 FHWreSaRWZKa0MMRu9AQ0QxmoNTHAQUNaPc0PeqEtPULybfkGOGw4/ghSB7WcKrA
 gtKTuooNOSpVEHkTas2TMpcBp6lxtOjFqKzVN0ml+/nqq5NeTSDx91VOCX/6Cj76
 mXIg+s7fbACTk/BmkkwAkd0QEw4fo4tyD6Bep/5QNhvEoAriTuSRbhvLdOwFz0EF
 vhkF0Rer6umdhSK7nPd7SBqn8kAnP4vBbdmB68+nc3lmkqysLyE4VkgkdH/IYYQI
 6TJ0oilXWFmU6DT5Rm4FBqCvfcEfU2dUIHJr5wZHqrF2kLzoZ+mpg42fADoG4GuI
 D/oOsz7soeaRe3eYfWybC0omGR6YYPozZJ9lsfftcElmwSsFrmPsbO1DM5IBkj1B
 gItmEbOB9ZK3RhIK55T/3u1UWY3Uc/RVr+kchWvADGrWnRQnW0kxYIqDgiOytLFi
 JZNH8uHpJIwzoJAv6XXSPyEUBwXTG+zK37Ce769HGbUEaUrE71MxBbQAQsK8mDpg
 7fM=
 =Bkf/
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of <linux/dma-mapping.h>

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
  dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove <asm/dma-contiguous.h>
  dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split <linux/dma-mapping.h>
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement ->alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
2020-10-15 14:43:29 -07:00
Linus Torvalds 531d29b0b6 IOMMU Updates for Linux v5.10
Including:
 
 	- ARM-SMMU Updates from Will:
 
 	  - Continued SVM enablement, where page-table is shared with
 	    CPU
 
 	  - Groundwork to support integrated SMMU with Adreno GPU
 
 	  - Allow disabling of MSI-based polling on the kernel
 	    command-line
 
 	  - Minor driver fixes and cleanups (octal permissions, error
 	    messages, ...)
 
 	- Secure Nested Paging Support for AMD IOMMU. The IOMMU will
 	  fault when a device tries DMA on memory owned by a guest. This
 	  needs new fault-types as well as a rewrite of the IOMMU memory
 	  semaphore for command completions.
 
 	- Allow broken Intel IOMMUs (wrong address widths reported) to
 	  still be used for interrupt remapping.
 
 	- IOMMU UAPI updates for supporting vSVA, where the IOMMU can
 	  access address spaces of processes running in a VM.
 
 	- Support for the MT8167 IOMMU in the Mediatek IOMMU driver.
 
 	- Device-tree updates for the Renesas driver to support r8a7742.
 
 	- Several smaller fixes and cleanups all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl+Fy9MACgkQK/BELZcB
 GuNxtRAA0TdYHXt6XyLWmvRAX/ySZSz6KOneZWWwpsQ9wh2/iv1PtBsrV0ltf+6g
 CaX4ROZUVRbV9wPD+7maBRbzxrG3QhfEaaV+K45Q2J/QE1wjkyV8qj1eORWTUUoc
 nis4FhGDKk2ER/Gsajy2Hjs4+6i43gdWG/+ghVGaCRo8mCZyoz1/6AyMQyN3deuO
 NqWOv9E7hsavZjRs/w/LXG7eSE20cZwtt//kPVJF0r9eQqC6i1eJDQj48iRqJVqd
 R0dwBQZaLz++qQptyKebDNlmH/3aAsb+A8nCeS7ZwHqWC1QujTWOUYWpFyPPbOmC
 KVsQXzTzRfnVTDECF1Pk5d3yi45KILLU3B4zDJfUJjbL3KDYjuVUvhHF/pcGcjC3
 H1LWJqHSAL8sJwHvKhpi0VtQ5SOxXnLO5fGG/CZT/Xb4QyM+mkwkFLdn1TryZTR/
 M4XA+QuI96TzY7HQUJdSoEDANxoBef6gPnxdDKOnK1v4hfNsPAl7o8hZkM3w0DK8
 GoFZUV+vjBhFcymGcQegSNiea28Hfi+hBe+PPHCmw+tJm47cketD5uP5jJ5NGaUe
 MKU/QXWXc6oqeBTQT6ki5zJbJXKttbPa8eEmp+FrMatc9kruvBVhQoMbj7Vd3CA1
 dC4zK9Awy7yj24ZhZfnAFx2DboCmBTUI3QKjDt9K5PRZyMeyoP8=
 =C0Sg
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - ARM-SMMU Updates from Will:

      - Continued SVM enablement, where page-table is shared with CPU

      - Groundwork to support integrated SMMU with Adreno GPU

      - Allow disabling of MSI-based polling on the kernel command-line

      - Minor driver fixes and cleanups (octal permissions, error
        messages, ...)

 - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when
   a device tries DMA on memory owned by a guest. This needs new
   fault-types as well as a rewrite of the IOMMU memory semaphore for
   command completions.

 - Allow broken Intel IOMMUs (wrong address widths reported) to still be
   used for interrupt remapping.

 - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access
   address spaces of processes running in a VM.

 - Support for the MT8167 IOMMU in the Mediatek IOMMU driver.

 - Device-tree updates for the Renesas driver to support r8a7742.

 - Several smaller fixes and cleanups all over the place.

* tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits)
  iommu/vt-d: Gracefully handle DMAR units with no supported address widths
  iommu/vt-d: Check UAPI data processed by IOMMU core
  iommu/uapi: Handle data and argsz filled by users
  iommu/uapi: Rename uapi functions
  iommu/uapi: Use named union for user data
  iommu/uapi: Add argsz for user filled data
  docs: IOMMU user API
  iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
  iommu/arm-smmu-v3: Add SVA device feature
  iommu/arm-smmu-v3: Check for SVA features
  iommu/arm-smmu-v3: Seize private ASID
  iommu/arm-smmu-v3: Share process page tables
  iommu/arm-smmu-v3: Move definitions to a header
  iommu/io-pgtable-arm: Move some definitions to a header
  iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer
  iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB
  iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR
  iommu/amd: Use 4K page for completion wait write-back semaphore
  iommu/tegra-smmu: Allow to group clients in same swgroup
  iommu/tegra-smmu: Fix iova->phys translation
  ...
2020-10-14 12:08:34 -07:00
Linus Torvalds ac74075e5d Initial support for sharing virtual addresses between the CPU and
devices which doesn't need pinning of pages for DMA anymore. Add support
 for the command submission to devices using new x86 instructions like
 ENQCMD{,S} and MOVDIR64B. In addition, add support for process address
 space identifiers (PASIDs) which are referenced by those command
 submission instructions along with the handling of the PASID state on
 context switch as another extended state. Work by Fenghua Yu, Ashok Raj,
 Yu-cheng Yu and Dave Jiang.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl996DIACgkQEsHwGGHe
 VUqM4A/+JDI3GxNyMyBpJR0nQ2vs23ru1o3OxvxhYtcacZ0cNwkaO7g3TLQxH+LZ
 k1QtvEd4jqI6BXV4de+HdZFDcqzikJf0KHnUflLTx956/Eop5rtxzMWVo69ZmYs8
 QrW0mLhyh8eq19cOHbQBb4M/HFc1DXBw+l7Ft3MeA1divOVESRB/uNxjA25K4PvV
 y+pipyUxqKSNhmBFf2bV8OVZloJiEtg3H6XudP0g/rZgjYe3qWxa+2iv6D08yBNe
 g7NpMDMql2uo1bcFON7se2oF34poAi49BfiIQb5G4m9pnPyvVEMOCijxCx2FHYyF
 nukyxt8g3Uq+UJYoolLNoWijL1jgBWeTBg1uuwsQOqWSARJx8nr859z0GfGyk2RP
 GNoYE4rrWBUMEqWk4xeiPPgRDzY0cgcGh0AeuWqNhgBfbbZeGL0t0m5kfytk5i1s
 W0YfRbz+T8+iYbgVfE/Zpthc7rH7iLL7/m34JC13+pzhPVTT32ECLJov2Ac8Tt15
 X+fOe6kmlDZa4GIhKRzUoR2aEyLpjufZ+ug50hznBQjGrQfcx7zFqRAU4sJx0Yyz
 rxUOJNZZlyJpkyXzc12xUvShaZvTcYenHGpxXl8TU3iMbY2otxk1Xdza8pc1LGQ/
 qneYgILgKa+hSBzKhXCPAAgSYtPlvQrRizArS8Y0k/9rYaKCfBU=
 =K9X4
 -----END PGP SIGNATURE-----

Merge tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 PASID updates from Borislav Petkov:
 "Initial support for sharing virtual addresses between the CPU and
  devices which doesn't need pinning of pages for DMA anymore.

  Add support for the command submission to devices using new x86
  instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support
  for process address space identifiers (PASIDs) which are referenced by
  those command submission instructions along with the handling of the
  PASID state on context switch as another extended state.

  Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang"

* tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction
  x86/asm: Carve out a generic movdir64b() helper for general usage
  x86/mmu: Allocate/free a PASID
  x86/cpufeatures: Mark ENQCMD as disabled when configured out
  mm: Add a pasid member to struct mm_struct
  x86/msr-index: Define an IA32_PASID MSR
  x86/fpu/xstate: Add supervisor PASID state for ENQCMD
  x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  Documentation/x86: Add documentation for SVA (Shared Virtual Addressing)
  iommu/vt-d: Change flags type to unsigned int in binding mm
  drm, iommu: Change type of pasid to u32
2020-10-12 10:40:34 -07:00
Joerg Roedel 7e3c3883c3 Merge branches 'arm/allwinner', 'arm/mediatek', 'arm/renesas', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/amd', 'x86/vt-d' and 'core' into next 2020-10-07 11:51:59 +02:00
Christoph Hellwig 0b1abd1fb7 dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:04 +02:00
Christoph Hellwig 0a0f0d8be7 dma-mapping: split <linux/dma-mapping.h>
Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers.  That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:03 +02:00
Lu Baolu 1a3f2fd7fc iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb()
Lock(&iommu->lock) without disabling irq causes lockdep warnings.

[   12.703950] ========================================================
[   12.703962] WARNING: possible irq lock inversion dependency detected
[   12.703975] 5.9.0-rc6+ #659 Not tainted
[   12.703983] --------------------------------------------------------
[   12.703995] systemd-udevd/284 just changed the state of lock:
[   12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at:
               iommu_flush_dev_iotlb.part.57+0x2e/0x90
[   12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   12.704043]  (&iommu->lock){+.+.}-{2:2}
[   12.704045]

               and interrupts could create inverse lock ordering between
               them.

[   12.704073]
               other info that might help us debug this:
[   12.704085]  Possible interrupt unsafe locking scenario:

[   12.704097]        CPU0                    CPU1
[   12.704106]        ----                    ----
[   12.704115]   lock(&iommu->lock);
[   12.704123]                                local_irq_disable();
[   12.704134]                                lock(device_domain_lock);
[   12.704146]                                lock(&iommu->lock);
[   12.704158]   <Interrupt>
[   12.704164]     lock(device_domain_lock);
[   12.704174]
                *** DEADLOCK ***

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200927062428.13713-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01 14:54:17 +02:00
Jacob Pan 6278eecba3 iommu/vt-d: Check UAPI data processed by IOMMU core
IOMMU generic layer already does sanity checks on UAPI data for version
match and argsz range based on generic information.

This patch adjusts the following data checking responsibilities:
- removes the redundant version check from VT-d driver
- removes the check for vendor specific data size
- adds check for the use of reserved/undefined flags

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/1601051567-54787-7-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01 14:52:46 +02:00
Jacob Pan 8d3bb3b8cb iommu/uapi: Use named union for user data
IOMMU UAPI data size is filled by the user space which must be validated
by the kernel. To ensure backward compatibility, user data can only be
extended by either re-purpose padding bytes or extend the variable sized
union at the end. No size change is allowed before the union. Therefore,
the minimum size is the offset of the union.

To use offsetof() on the union, we must make it named.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/linux-iommu/20200611145518.0c2817d6@x1.home/
Link: https://lore.kernel.org/r/1601051567-54787-4-git-send-email-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-10-01 14:52:46 +02:00
Christoph Hellwig efa70f2fdc dma-mapping: add a new dma_alloc_pages API
This API is the equivalent of alloc_pages, except that the returned memory
is guaranteed to be DMA addressable by the passed in device.  The
implementation will also be used to provide a more sensible replacement
for DMA_ATTR_NON_CONSISTENT flag.

Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages
as its backend.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
2020-09-25 06:20:47 +02:00
Christoph Hellwig 8c1c6c7588 Merge branch 'master' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next
Pull in the latest 5.9 tree for the commit to revert the
V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.
2020-09-25 06:19:19 +02:00
Lu Baolu d2ef096249 iommu/vt-d: Use device numa domain if RHSA is missing
If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR
table, we could default to the device NUMA domain as fall back.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200904010303.2961-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20200922060843.31546-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-24 10:59:55 +02:00
Fenghua Yu c7b6bac9c7 drm, iommu: Change type of pasid to u32
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".

No PASID type change in uapi although it defines PASID as __u64 in
some places.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
2020-09-17 19:21:16 +02:00
Christoph Hellwig 5ceda74093 dma-direct: rename and cleanup __phys_to_dma
The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious.  Try
to improve the situation by renaming __phys_to_dma to
phys_to_dma_unencryped, and not forcing architectures that want to
override phys_to_dma to actually provide __phys_to_dma.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2020-09-11 09:14:43 +02:00
Chris Wilson 29aaebbca4 iommu/vt-d: Handle 36bit addressing for x86-32
Beware that the address size for x86-32 may exceed unsigned long.

[    0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14
[    0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int'

If we don't handle the wide addresses, the pages are mismapped and the
device read/writes go astray, detected as DMAR faults and leading to
device failure. The behaviour changed (from working to broken) in commit
fa954e6831 ("iommu/vt-d: Delegate the dma domain to upper layer"), but
the error looks older.

Fixes: fa954e6831 ("iommu/vt-d: Delegate the dma domain to upper layer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: James Sewart <jamessewart@arista.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: <stable@vger.kernel.org> # v5.3+
Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.uk
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04 12:14:28 +02:00
Lu Baolu 2d33b7d631 iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
The dev_iommu_priv_set() must be called after probe_device(). This fixes
a NULL pointer deference bug when booting a system with kernel cmdline
"intel_iommu=on,igfx_off", where the dev_iommu_priv_set() is abused.

The following stacktrace was produced:

 Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off
 ...
 DMAR: Host address width 39
 DMAR: DRHD base: 0x000000fed90000 flags: 0x0
 DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
 DMAR: DRHD base: 0x000000fed91000 flags: 0x1
 DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
 DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff
 DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff
 DMAR: No ATSR found
 BUG: kernel NULL pointer dereference, address: 0000000000000038
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP PTI
 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2
 Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018
 RIP: 0010:intel_iommu_init+0xed0/0x1136
 Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48
       03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1
       e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0
 RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282
 RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff
 RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0
 R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b
 R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0
 Call Trace:
  ? _raw_spin_unlock_irqrestore+0x1f/0x30
  ? call_rcu+0x10e/0x320
  ? trace_hardirqs_on+0x2c/0xd0
  ? rdinit_setup+0x2c/0x2c
  ? e820__memblock_setup+0x8b/0x8b
  pci_iommu_init+0x16/0x3f
  do_one_initcall+0x46/0x1e4
  kernel_init_freeable+0x169/0x1b2
  ? rest_init+0x9f/0x9f
  kernel_init+0xa/0x101
  ret_from_fork+0x22/0x30
 Modules linked in:
 CR2: 0000000000000038
 ---[ end trace 3653722a6f936f18 ]---

Fixes: 01b9d4e211 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/
Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04 11:50:37 +02:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds 952ace797c IOMMU Updates for Linux v5.9
Including:
 
 	- Removal of the dev->archdata.iommu (or similar) pointers from
 	  most architectures. Only Sparc is left, but this is private to
 	  Sparc as their drivers don't use the IOMMU-API.
 
 	- ARM-SMMU Updates from Will Deacon:
 
 	  -  Support for SMMU-500 implementation in Marvell
 	     Armada-AP806 SoC
 
 	  - Support for SMMU-500 implementation in NVIDIA Tegra194 SoC
 
 	  - DT compatible string updates
 
 	  - Remove unused IOMMU_SYS_CACHE_ONLY flag
 
 	  - Move ARM-SMMU drivers into their own subdirectory
 
 	- Intel VT-d Updates from Lu Baolu:
 
 	  - Misc tweaks and fixes for vSVA
 
 	  - Report/response page request events
 
 	  - Cleanups
 
 	- Move the Kconfig and Makefile bits for the AMD and Intel
 	  drivers into their respective subdirectory.
 
 	- MT6779 IOMMU Support
 
 	- Support for new chipsets in the Renesas IOMMU driver
 
 	- Other misc cleanups and fixes (e.g. to improve compile test
 	  coverage)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl8ygTIACgkQK/BELZcB
 GuPZmRAAzSLuUNoQPWrFUbocNuZ/YHUCKdluKdYx26AgtYFwBrwzDAHPdq8HF8Hm
 y8w2xiUVVP9uZ8gnDkAuwXBtg+yOnG9sRNFZMNdtCy1Q0ehp0HNsn/6NabxVpSml
 QuAmd2PxMMopQRVLOR5YYvZl6JdiZx19W8X+trgwnR9Kghqq+7QXI9+D00jztRxQ
 Qvh/9NvIdX3k+5R4ZPJaV6OhaFvxzQzQZwKuO61VqFOWZRH1z9Oo+aXDCWTFUjYN
 IClTcG8qOK2W9/SOyYDXMoz30Yf0vcuDxhafi2JJVNcTPRmMWoeqff6yKslp76ea
 lTepDcIKld1Ul9NoqfYzhhKiEaLcgMEW2ua6vk5YFVxBBqJfg5qdtDZzBxa0FiNx
 TQrZFX3xjtZC6tRyy+eKWOj6vx7l0ONwwDxRc3HdvL+xE+KUdmsg82qHU4cAHRjp
 U2dgTdlkTEd56q4BEQxmJAHYMIUrx2QAp6pa2+Jv/Iqpi9PsZ2k+l9Gy6h+rM7dn
 Est/1gA4kDhKdCKfTx7g9EL6AAoU50WttxNmwMxrUrXX3fsstfY1fKgyZUPpkL7V
 V5iXbbsdMQLHzOF2qiqIIMxMGYxr/x/FJ1DnSJ7j+jAXMF77d2B9iQttzImOVN2c
 VXBxcVstWN7/xXjIy13C/83bRKwWqXaaS4cbv3Di0ZGFeD2oAF0=
 =3O2Z
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Remove of the dev->archdata.iommu (or similar) pointers from most
   architectures. Only Sparc is left, but this is private to Sparc as
   their drivers don't use the IOMMU-API.

 - ARM-SMMU updates from Will Deacon:

     - Support for SMMU-500 implementation in Marvell Armada-AP806 SoC

     - Support for SMMU-500 implementation in NVIDIA Tegra194 SoC

     - DT compatible string updates

     - Remove unused IOMMU_SYS_CACHE_ONLY flag

     - Move ARM-SMMU drivers into their own subdirectory

 - Intel VT-d updates from Lu Baolu:

     - Misc tweaks and fixes for vSVA

     - Report/response page request events

     - Cleanups

 - Move the Kconfig and Makefile bits for the AMD and Intel drivers into
   their respective subdirectory.

 - MT6779 IOMMU Support

 - Support for new chipsets in the Renesas IOMMU driver

 - Other misc cleanups and fixes (e.g. to improve compile test coverage)

* tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (77 commits)
  iommu/amd: Move Kconfig and Makefile bits down into amd directory
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory
  iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
  iommu: Add gfp parameter to io_pgtable_ops->map()
  iommu: Mark __iommu_map_sg() as static
  iommu/vt-d: Rename intel-pasid.h to pasid.h
  iommu/vt-d: Add page response ops support
  iommu/vt-d: Report page request faults for guest SVA
  iommu/vt-d: Add a helper to get svm and sdev for pasid
  iommu/vt-d: Refactor device_to_iommu() helper
  iommu/vt-d: Disable multiple GPASID-dev bind
  iommu/vt-d: Warn on out-of-range invalidation address
  iommu/vt-d: Fix devTLB flush for vSVA
  iommu/vt-d: Handle non-page aligned address
  iommu/vt-d: Fix PASID devTLB invalidation
  iommu/vt-d: Remove global page support in devTLB flush
  iommu/vt-d: Enforce PASID devTLB field mask
  iommu: Make some functions static
  iommu/amd: Remove double zero check
  ...
2020-08-11 14:13:24 -07:00
Linus Torvalds 049eb096da pci-v5.9-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl8sdUkUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwH2Q/7Brcm1uyLORSzseGsaXSGMncBs2YB
 aKbfhyy4BPsDIZRLnzcfRZzgKo3f4jlLH9dJ6nBukbNXCvS/g7oYCXtNKVuB70MD
 IgBH3OJxLmqsYgDkoQmj1fZBCBhdqMgGbRmeIPLqiIBrWOJkBpGHXKpb0XtyXAas
 CpD0Tvr0JBeHMluZq6Uay09jBDKexeCFrT5HCoVaRMXT/C/iB5K1oMrUczzITsdi
 jB9xesDjh32rYtaePKfuL8itbRT7jtqOwQlk7sCtnMNamaOOaYO/s6hL5v/4GxMh
 rtWa1knOxxA1nOsnEkUEHi0Fj/+9zXDIdb7v6thRDo0ZgWQxl7l3nshvmPcxX421
 tpCm3HqmvHzGqSI85Rtr3p4XKm9e+IjgE2EA/J6Y8Q6Grrb0EGJituhO4meL2Ciq
 6mxdhu7InxDJ2p3TLGas3fB/1hrCO0Fc0pQoBJx7YgqA1ANyld9DYCkDN6IDoZBI
 uUjKgkE1dfbW/pGjotjhBsmz3dycZHkurIFdt1iX/Xtt5KKdPAzu9yM2U03iIS2R
 im1wZ/THiS/YCOlgL/J8+DHTY0ZvXjAdbiSPjTFfwb9XTh8aHVWtFaaZON1jRIjg
 xMpIY0SxfshpLx631ThZdDTDiOwE8D3B+1n/kMwps6HOLpxOoJZeSGTRCt9wGP40
 j58DTtLm5FKpdYc=
 =moI9
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:
   - Fix pci_cfg_wait queue locking problem (Bjorn Helgaas)
   - Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi
     Saheed)
   - Align PCIe capability and PCI accessor return values (Bolarinwa
     Olayemi Saheed)
   - Fix pci_create_slot() reference count leak (Qiushi Wu)
   - Announce device after early fixups (Tiezhu Yang)

  PCI device hotplug:
   - Make rpadlpar functions static (Wei Yongjun)

  Driver binding:
   - Add device even if driver attach failed (Rajat Jain)

  Virtualization:
   - xen: Remove redundant initialization of irq (Colin Ian King)

  IOMMU:
   - Add pci_pri_supported() to check device or associated PF (Ashok Raj)
   - Release IVRS table in AMD ACS quirk (Hanjun Guo)
   - Mark AMD Navi10 GPU rev 0x00 ATS as broken (Kai-Heng Feng)
   - Treat "external-facing" devices themselves as internal (Rajat Jain)

  MSI:
   - Forward MSI-X error code in pci_alloc_irq_vectors_affinity() (Piotr
     Stankiewicz)

  Error handling:
   - Clear PCIe Device Status errors only if OS owns AER (Jonathan
     Cameron)
   - Log correctable errors as warning, not error (Matt Jolly)
   - Use 'pci_channel_state_t' instead of 'enum pci_channel_state' (Luc
     Van Oostenryck)

  Peer-to-peer DMA:
   - Allow P2PDMA on AMD Zen and newer CPUs (Logan Gunthorpe)

  ASPM:
   - Add missing newline in sysfs 'policy' (Xiongfeng Wang)

  Native PCIe controllers:
   - Convert to devm_platform_ioremap_resource_byname() (Dejin Zheng)
   - Convert to devm_platform_ioremap_resource() (Dejin Zheng)
   - Remove duplicate error message from devm_pci_remap_cfg_resource()
     callers (Dejin Zheng)
   - Fix runtime PM imbalance on error (Dinghao Liu)
   - Remove dev_err() when handing an error from platform_get_irq()
     (Krzysztof Wilczyński)
   - Use pci_host_bridge.windows list directly instead of splicing in a
     temporary list for cadence, mvebu, host-common (Rob Herring)
   - Use pci_host_probe() instead of open-coding all the pieces for
     altera, brcmstb, iproc, mobiveil, rcar, rockchip, tegra, v3,
     versatile, xgene, xilinx, xilinx-nwl (Rob Herring)
   - Default host bridge parent device to the platform device (Rob
     Herring)
   - Use pci_is_root_bus() instead of tracking root bus number
     separately in aardvark, designware (imx6, keystone,
     designware-host), mobiveil, xilinx-nwl, xilinx, rockchip, rcar (Rob
     Herring)
   - Set host bridge bus number in pci_scan_root_bus_bridge() instead of
     each driver for aardvark, designware-host, host-common, mediatek,
     rcar, tegra, v3-semi (Rob Herring)
   - Move DT resource setup into devm_pci_alloc_host_bridge() (Rob
     Herring)
   - Set bridge map_irq and swizzle_irq to default functions; drivers
     that don't support legacy IRQs (iproc) need to undo this (Rob
     Herring)

  ARM Versatile PCIe controller driver:
   - Drop flag PCI_ENABLE_PROC_DOMAINS (Rob Herring)

  Cadence PCIe controller driver:
   - Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property
     (Kishon Vijay Abraham I)
   - Remove "mem" from reg binding (Kishon Vijay Abraham I)
   - Fix cdns_pcie_{host|ep}_setup() error path (Kishon Vijay Abraham I)
   - Convert all r/w accessors to perform only 32-bit accesses (Kishon
     Vijay Abraham I)
   - Add support to start link and verify link status (Kishon Vijay
     Abraham I)
   - Allow pci_host_bridge to have custom pci_ops (Kishon Vijay Abraham I)
   - Add new *ops* for CPU addr fixup (Kishon Vijay Abraham I)
   - Fix updating Vendor ID and Subsystem Vendor ID register (Kishon
     Vijay Abraham I)
   - Use bridge resources for outbound window setup (Rob Herring)
   - Remove private bus number and range storage (Rob Herring)

  Cadence PCIe endpoint driver:
   - Add MSI-X support (Alan Douglas)

  HiSilicon PCIe controller driver:
   - Remove non-ECAM HiSilicon hip05/hip06 driver (Rob Herring)

  Intel VMD host bridge driver:
   - Use Shadow MEMBAR registers for QEMU/KVM guests (Jon Derrick)

  Loongson PCIe controller driver:
   - Use DECLARE_PCI_FIXUP_EARLY for bridge_class_quirk() (Tiezhu Yang)

  Marvell Aardvark PCIe controller driver:
   - Indicate error in 'val' when config read fails (Pali Rohár)
   - Don't touch PCIe registers if no card connected (Pali Rohár)

  Marvell MVEBU PCIe controller driver:
   - Setup BAR0 in order to fix MSI (Shmuel Hazan)

  Microsoft Hyper-V host bridge driver:
   - Fix a timing issue which causes kdump to fail occasionally (Wei Hu)
   - Make some functions static (Wei Yongjun)

  NVIDIA Tegra PCIe controller driver:
   - Revert tegra124 raw_violation_fixup (Nicolas Chauvet)
   - Remove PLL power supplies (Thierry Reding)

  Qualcomm PCIe controller driver:
   - Change duplicate PCI reset to phy reset (Abhishek Sahu)
   - Add missing ipq806x clocks in PCIe driver (Ansuel Smith)
   - Add missing reset for ipq806x (Ansuel Smith)
   - Add ext reset (Ansuel Smith)
   - Use bulk clk API and assert on error (Ansuel Smith)
   - Add support for tx term offset for rev 2.1.0 (Ansuel Smith)
   - Define some PARF params needed for ipq8064 SoC (Ansuel Smith)
   - Add ipq8064 rev2 variant (Ansuel Smith)
   - Support PCI speed set for ipq806x (Sham Muthayyan)

  Renesas R-Car PCIe controller driver:
   - Use devm_pci_alloc_host_bridge() (Rob Herring)
   - Use struct pci_host_bridge.windows list directly (Rob Herring)
   - Convert rcar-gen2 to use modern host bridge probe functions (Rob
     Herring)

  TI J721E PCIe driver:
   - Add TI J721E PCIe host and endpoint driver (Kishon Vijay Abraham I)

  Xilinx Versal CPM PCIe controller driver:
   - Add Versal CPM Root Port driver and YAML schema (Bharat Kumar
     Gogada)

  MicroSemi Switchtec management driver:
   - Add missing __iomem and __user tags to fix sparse warnings (Logan
     Gunthorpe)

  Miscellaneous:
   - Replace http:// links with https:// (Alexander A. Klimov)
   - Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn
     Helgaas)
   - Remove unused pci_lost_interrupt() (Heiner Kallweit)
   - Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen)
   - Fix kerneldoc warnings (Krzysztof Kozlowski)"

* tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
  PCI: Fix kerneldoc warnings
  PCI: xilinx-cpm: Add Versal CPM Root Port driver
  PCI: xilinx-cpm: Add YAML schemas for Versal CPM Root Port
  PCI: Set bridge map_irq and swizzle_irq to default functions
  PCI: Move DT resource setup into devm_pci_alloc_host_bridge()
  PCI: rcar-gen2: Convert to use modern host bridge probe functions
  PCI: Remove dev_err() when handing an error from platform_get_irq()
  MAINTAINERS: Add Kishon Vijay Abraham I for TI J721E SoC PCIe
  misc: pci_endpoint_test: Add J721E in pci_device_id table
  PCI: j721e: Add TI J721E PCIe driver
  PCI: switchtec: Add missing __iomem tag to fix sparse warnings
  PCI: switchtec: Add missing __iomem and __user tags to fix sparse warnings
  PCI: rpadlpar: Make functions static
  PCI/P2PDMA: Allow P2PDMA on AMD Zen and newer CPUs
  PCI: Release IVRS table in AMD ACS quirk
  PCI: Announce device after early fixups
  PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
  PCI: Remove unused pci_lost_interrupt()
  dt-bindings: PCI: Add EP mode dt-bindings for TI's J721E SoC
  dt-bindings: PCI: Add host mode dt-bindings for TI's J721E SoC
  ...
2020-08-07 18:48:15 -07:00
Bjorn Helgaas 6585a1a14e Merge branch 'pci/virtualization'
- Remove redundant variable init in xen (Colin Ian King)

- Add pci_pri_supported() to check device or associated PF for PRI support
  (Ashok Raj)

- Mark AMD Navi10 GPU rev 0x00 ATS as broken (Kai-Heng Feng)

- Release IVRS table in AMD ACS quirk (Hanjun Guo)

* pci/virtualization:
  PCI: Release IVRS table in AMD ACS quirk
  PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
  PCI/ATS: Add pci_pri_supported() to check device or associated PF
  xen: Remove redundant initialization of irq
2020-08-05 18:24:17 -05:00
Joerg Roedel 56fbacc9bf Merge branches 'arm/renesas', 'arm/qcom', 'arm/mediatek', 'arm/omap', 'arm/exynos', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'core' into next 2020-07-29 14:42:00 +02:00
Ashok Raj 3f9a7a13fe PCI/ATS: Add pci_pri_supported() to check device or associated PF
For SR-IOV, the PF PRI is shared between the PF and any associated VFs, and
the PRI Capability is allowed for PFs but not for VFs.  Searching for the
PRI Capability on a VF always fails, even if its associated PF supports
PRI.

Add pci_pri_supported() to check whether device or its associated PF
supports PRI.

[bhelgaas: commit log, avoid "!!"]
Fixes: b16d0cb9e2 ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
Link: https://lore.kernel.org/r/1595543849-19692-1-git-send-email-ashok.raj@intel.com
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Cc: stable@vger.kernel.org	# v4.4+
2020-07-24 09:50:41 -05:00
Lu Baolu b1012ca8dc iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.

Unfortunately, some integrated graphic devices fail to do so after some
kind of power state transition. As the result, the system might stuck in
iommu_disable_translation(), waiting for the completion of TE transition.

This provides a quirk list for those devices and skips TE disabling if
the qurik hits.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Koba Ko <koba.ko@canonical.com>
Tested-by: Jun Miao <jun.miao@windriver.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200723013437.2268-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 14:33:39 +02:00
Lu Baolu 02f3effddf iommu/vt-d: Rename intel-pasid.h to pasid.h
As Intel VT-d files have been moved to its own subdirectory, the prefix
makes no sense. No functional changes.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Lu Baolu 8b73712115 iommu/vt-d: Add page response ops support
After page requests are handled, software must respond to the device
which raised the page request with the result. This is done through
the iommu ops.page_response if the request was reported to outside of
vendor iommu driver through iommu_report_device_fault(). This adds the
VT-d implementation of page_response ops.

Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Co-developed-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Lu Baolu dd6692f1b8 iommu/vt-d: Refactor device_to_iommu() helper
It is refactored in two ways:

- Make it global so that it could be used in other files.

- Make bus/devfn optional so that callers could ignore these two returned
values when they only want to get the coresponding iommu pointer.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Jacob Pan 1ff0027965 iommu/vt-d: Warn on out-of-range invalidation address
For guest requested IOTLB invalidation, address and mask are provided as
part of the invalidation data. VT-d HW silently ignores any address bits
below the mask. SW shall also allow such case but give warning if
address does not align with the mask. This patch relax the fault
handling from error to warning and proceed with invalidation request
with the given mask.

Fixes: 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200724014925.15523-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Liu Yi L 0fa1a15fa9 iommu/vt-d: Fix devTLB flush for vSVA
For guest SVA usage, in order to optimize for less VMEXIT, guest request
of IOTLB flush also includes device TLB.

On the host side, IOMMU driver performs IOTLB and implicit devTLB
invalidation. When PASID-selective granularity is requested by the guest
we need to derive the equivalent address range for devTLB instead of
using the address information in the UAPI data. The reason for that is,
unlike IOTLB flush, devTLB flush does not support PASID-selective
granularity. This is to say, we need to set the following in the PASID
based devTLB invalidation descriptor:
- entire 64 bit range in address ~(0x1 << 63)
- S bit = 1 (VT-d CH 6.5.2.6).

Without this fix, device TLB flush range is not set properly for PASID
selective granularity. This patch also merged devTLB flush code for both
implicit and explicit cases.

Fixes: 6ee1b77ba3 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200724014925.15523-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:21 +02:00
Jacob Pan 78df6c86f0 iommu/vt-d: Remove global page support in devTLB flush
Global pages support is removed from VT-d spec 3.0 for dev TLB
invalidation. This patch is to remove the bits for vSVA. Similar change
already made for the native SVA. See the link below.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/linux-iommu/20190830142919.GE11578@8bytes.org/T/
Link: https://lore.kernel.org/r/20200724014925.15523-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 10:51:20 +02:00
Kees Cook 3f649ab728 treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
Rajat Jain 99b50be9d8 PCI: Treat "external-facing" devices themselves as internal
"External-facing" devices are internal devices that expose PCIe hierarchies
such as Thunderbolt outside the platform [1].  Previously these internal
devices were marked as "untrusted" the same as devices downstream from
them.

Use the ACPI or DT information to identify external-facing devices, but
only mark the devices *downstream* from them as "untrusted" [2].  The
external-facing device itself is no longer marked as untrusted.

[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
[2] https://lore.kernel.org/linux-pci/20200610230906.GA1528594@bjorn-Precision-5520/
Link: https://lore.kernel.org/r/20200707224604.3737893-3-rajatja@google.com
Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-07-10 14:03:13 -05:00
Joerg Roedel 01b9d4e211 iommu/vt-d: Use dev_iommu_priv_get/set()
Remove the use of dev->archdata.iommu and use the private per-device
pointer provided by IOMMU core code instead.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200625130836.1916-3-joro@8bytes.org
2020-06-30 11:59:48 +02:00
Lu Baolu 48f0bcfb7a iommu/vt-d: Fix misuse of iommu_domain_identity_map()
The iommu_domain_identity_map() helper takes start/end PFN as arguments.
Fix a misuse case where the start and end addresses are passed.

Fixes: e70b081c6f ("iommu/vt-d: Remove IOVA handling code from the non-dma_ops path")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Tom Murphy <murphyt7@tcd.ie>
Link: https://lore.kernel.org/r/20200622231345.29722-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Lu Baolu 04c00956ee iommu/vt-d: Update scalable mode paging structure coherency
The Scalable-mode Page-walk Coherency (SMPWC) field in the VT-d extended
capability register indicates the hardware coherency behavior on paging
structures accessed through the pasid table entry. This is ignored in
current code and using ECAP.C instead which is only valid in legacy mode.
Fix this so that paging structure updates could be manually flushed from
the cache line if hardware page walking is not snooped.

Fixes: 765b6a98c1 ("iommu/vt-d: Enumerate the scalable mode capability")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Rajat Jain 67e8a5b18d iommu/vt-d: Don't apply gfx quirks to untrusted devices
Currently, an external malicious PCI device can masquerade the VID:PID
of faulty gfx devices, and thus apply iommu quirks to effectively
disable the IOMMU restrictions for itself.

Thus we need to ensure that the device we are applying quirks to, is
indeed an internal trusted device.

Signed-off-by: Rajat Jain <rajatja@google.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:32 +02:00
Lu Baolu 16ecf10e81 iommu/vt-d: Set U/S bit in first level page table by default
When using first-level translation for IOVA, currently the U/S bit in the
page table is cleared which implies DMA requests with user privilege are
blocked. As the result, following error messages might be observed when
passing through a device to user level:

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Read] Request device [41:00.0] PASID 1 fault addr 7ecdcd000
        [fault reason 129] SM: U/S set 0 for first-level translation
        with user privilege

This fixes it by setting U/S bit in the first level page table and makes
IOVA over first level compatible with previous second-level translation.

Fixes: b802d070a5 ("iommu/vt-d: Use iova over first level")
Reported-by: Xin Zeng <xin.zeng@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200622231345.29722-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-23 10:08:31 +02:00
Joerg Roedel 672cf6df9b iommu/vt-d: Move Intel IOMMU driver into subdirectory
Move all files related to the Intel IOMMU driver into its own
subdirectory.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200609130303.26974-3-joro@8bytes.org
2020-06-10 17:46:43 +02:00