OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Robin Murphy	78ca078459	iommu/vt-d: Prepare for multiple DMA domain types In preparation for the strict vs. non-strict decision for DMA domains to be expressed in the domain type, make sure we expose our flush queue awareness by accepting the new domain type, and test the specific feature flag where we want to identify DMA domains in general. The DMA ops reset/setup can simply be made unconditional, since iommu-dma already knows only to touch DMA domains. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/31a8ef868d593a2f3826a6a120edee81815375a7.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-18 13:27:49 +02:00
Robin Murphy	f297e27f83	iommu/vt-d: Drop IOVA cookie management The core code bakes its own cookies now. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e9dbe3b6108f8538e17e0c5f59f8feeb714f51a4.1628682048.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-18 13:25:31 +02:00
Lu Baolu	75cc1018a9	iommu/vt-d: Move clflush'es from iotlb_sync_map() to map_pages() As the Intel VT-d driver has switched to use the iommu_ops.map_pages() callback, multiple pages of the same size will be mapped in a call. There's no need to put the clflush'es in iotlb_sync_map() callback. Move them back into __domain_mapping() to simplify the code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720020615.4144323-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:56:25 +02:00
Lu Baolu	3f34f12597	iommu/vt-d: Implement map/unmap_pages() iommu_ops callback Implement the map_pages() and unmap_pages() callback for the Intel IOMMU driver to allow calls from iommu core to map and unmap multiple pages of the same size in one call. With map/unmap_pages() implemented, the prior map/unmap callbacks are deprecated. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720020615.4144323-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:56:25 +02:00
Lu Baolu	a886d5a7e6	iommu/vt-d: Report real pgsize bitmap to iommu core The pgsize bitmap is used to advertise the page sizes our hardware supports to the IOMMU core, which will then use this information to split physically contiguous memory regions it is mapping into page sizes that we support. Traditionally the IOMMU core just handed us the mappings directly, after making sure the size is an order of a 4KiB page and that the mapping has natural alignment. To retain this behavior, we currently advertise that we support all page sizes that are an order of 4KiB. We are about to utilize the new IOMMU map/unmap_pages APIs. We could change this to advertise the real page sizes we support. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720020615.4144323-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:56:25 +02:00
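The page-size splitting that commit a886d5a7e6 describes can be sketched as follows. This is a minimal illustration, not the kernel's implementation: `EXAMPLE_PGSIZE_BITMAP`, `pick_pgsize()` and `count_chunks()` are hypothetical names, and the bitmap (4 KiB, 2 MiB, 1 GiB) is an assumed set of hardware page sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bitmap: bit N set means a page size of 2^N bytes is supported. */
#define EXAMPLE_PGSIZE_BITMAP ((1ULL << 12) | (1ULL << 21) | (1ULL << 30))

/*
 * Return the largest supported page size that is no bigger than `size`
 * and to which `addr` is naturally aligned, or 0 if none fits.
 */
static uint64_t pick_pgsize(uint64_t bitmap, uint64_t addr, uint64_t size)
{
    for (int bit = 63; bit >= 0; bit--) {
        uint64_t pgsize = 1ULL << bit;

        if (!(bitmap & pgsize))
            continue;
        if (pgsize <= size && (addr & (pgsize - 1)) == 0)
            return pgsize;
    }
    return 0;
}

/* Count the chunks a physically contiguous region is split into. */
static size_t count_chunks(uint64_t bitmap, uint64_t iova, uint64_t paddr,
                           uint64_t size)
{
    size_t chunks = 0;

    while (size) {
        /* Both addresses must be aligned, so test their bitwise OR. */
        uint64_t pgsize = pick_pgsize(bitmap, iova | paddr, size);

        if (!pgsize)
            break; /* the remainder cannot be mapped with this bitmap */
        assert(pgsize <= size);
        iova += pgsize;
        paddr += pgsize;
        size -= pgsize;
        chunks++;
    }
    return chunks;
}
```

With the assumed bitmap, a 2 MiB + 4 KiB region starting at address 0 is split into one 2 MiB chunk followed by one 4 KiB chunk. Advertising only the real hardware sizes lets the caller do this splitting instead of the driver.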
John Garry	308723e358	iommu: Remove mode argument from iommu_set_dma_strict() We only ever now set strict mode enabled in iommu_set_dma_strict(), so just remove the argument. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1626088340-5838-7-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:27:38 +02:00
Zhen Lei	d0e108b8e9	iommu/vt-d: Add support for IOMMU default DMA mode build options Make IOMMU_DEFAULT_LAZY default for when INTEL_IOMMU config is set, as is current behaviour. Also delete global flag intel_iommu_strict: - In intel_iommu_setup(), call iommu_set_dma_strict(true) directly. Also remove the print, as iommu_subsys_init() prints the mode and we have already marked this param as deprecated. - For cap_caching_mode() check in intel_iommu_setup(), call iommu_set_dma_strict(true) directly; also reword the accompanying print with a level downgrade and also add the missing '\n'. - For Ironlake GPU, again call iommu_set_dma_strict(true) directly and keep the accompanying print. [jpg: Remove intel_iommu_strict] Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1626088340-5838-5-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:27:38 +02:00
John Garry	1d479f160c	iommu: Deprecate Intel and AMD cmdline methods to enable strict mode Now that the x86 drivers support iommu.strict, deprecate the custom methods. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1626088340-5838-2-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:27:38 +02:00
Lu Baolu	474dd1c650	iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries The commit `2b0140c696` ("iommu/vt-d: Use pci_real_dma_dev() for mapping") fixes an issue of "sub-device is removed where the context entry is cleared for all aliases". But this commit didn't consider the PASID entry and PASID table in VT-d scalable mode. This fix increases the coverage of scalable mode. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Fixes: `8038bdb855` ("iommu/vt-d: Only clear real DMA device's context entries") Fixes: `2b0140c696` ("iommu/vt-d: Use pci_real_dma_dev() for mapping") Cc: stable@vger.kernel.org # v5.6+ Cc: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210712071712.3416949-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-14 12:58:07 +02:00
Sanjay Kumar	37764b952e	iommu/vt-d: Global devTLB flush when present context entry changed This fixes a bug in context cache clear operation. The code was not following the correct invalidation flow. A global device TLB invalidation should be added after the IOTLB invalidation. At the same time, it uses the domain ID from the context entry. But in scalable mode, the domain ID is in PASID table entry, not context entry. Fixes: `7373a8cc38` ("iommu/vt-d: Setup context and enable RID2PASID support") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210712071315.3416543-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-14 12:57:39 +02:00
Joerg Roedel	2b9d8e3e9a	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/smmu', 'x86/vt-d', 'x86/amd', 'virtio' and 'core' into next	2021-06-25 15:23:25 +02:00
Jean-Philippe Brucker	ac6d704679	iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops() Passing a 64-bit address width to iommu_setup_dma_ops() is valid on virtual platforms, but isn't currently possible. The overflow check in iommu_dma_init_domain() prevents this even when @dma_base isn't 0. Pass a limit address instead of a size, so callers don't have to fake a size to work around the check. The base and limit parameters are being phased out, because: * they are redundant for x86 callers. dma-iommu already reserves the first page, and the upper limit is already in domain->geometry. * they can now be obtained from dev->dma_range_map on Arm. But removing them on Arm isn't completely straightforward so is left for future work. As an intermediate step, simplify the x86 callers by passing dummy limits. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210618152059.1194210-5-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-25 15:02:43 +02:00
Colin Ian King	934ed4580c	iommu/vt-d: Fix dereference of pointer info before it is null checked The assignment of iommu from info->iommu occurs before info is null checked hence leading to a potential null pointer dereference issue. Fix this by assigning iommu and checking if iommu is null after null checking info. Addresses-Coverity: ("Dereference before null check") Fixes: `4c82b88696` ("iommu/vt-d: Allocate/register iopf queue for sva devices") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210611135024.32781-1-colin.king@canonical.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-18 15:19:50 +02:00
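The ordering fix in commit 934ed4580c follows a common pattern: check the outer pointer before reading a field through it. A minimal sketch, using hypothetical stand-in types (`struct example_iommu`, `struct example_dev_info`) rather than the driver's real structures:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures. */
struct example_iommu {
    int seq_id;
};

struct example_dev_info {
    struct example_iommu *iommu;
};

/* Return the IOMMU sequence id, or -1 if either pointer is NULL. */
static int example_iommu_seq_id(const struct example_dev_info *info)
{
    const struct example_iommu *iommu;

    /* Check info first; reading info->iommu before this would be the bug. */
    if (!info)
        return -1;

    iommu = info->iommu;
    if (!iommu)
        return -1;

    return iommu->seq_id;
}
```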
Parav Pandit	7a0f06c197	iommu/vt-d: No need to typecast Page directory assignment by alloc_pgtable_page() or phys_to_virt() doesn't need typecasting as both routines return void*. Hence, remove typecasting from both the calls. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-24-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:14 +02:00
Parav Pandit	cee57d4fe7	iommu/vt-d: Remove unnecessary braces No need for braces for single line statement under if() block. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-22-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:14 +02:00
Parav Pandit	74f6d776ae	iommu/vt-d: Removed unused iommu_count in dmar domain DMAR domain uses per DMAR refcount. It is indexed by iommu seq_id. Older iommu_count is only incremented and decremented but no decisions are taken based on this refcount. This is not of much use. Hence, remove iommu_count and further simplify domain_detach_iommu() by returning void. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-21-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:14 +02:00
Parav Pandit	1f106ff0ea	iommu/vt-d: Use bitfields for DMAR capabilities IOTLB device presence, iommu coherency and snooping are boolean capabilities. Use them as bits and keep them adjacent. Structure layout before the reorg.
$ pahole -C dmar_domain drivers/iommu/intel/dmar.o
struct dmar_domain {
        int nid; /* 0 4 */
        unsigned int iommu_refcnt[128]; /* 4 512 */
        /* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
        u16 iommu_did[128]; /* 516 256 */
        /* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
        bool has_iotlb_device; /* 772 1 */
        /* XXX 3 bytes hole, try to pack */
        struct list_head devices; /* 776 16 */
        struct list_head subdevices; /* 792 16 */
        struct iova_domain iovad __attribute__((__aligned__(8))); /* 808 2320 */
        /* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
        struct dma_pte * pgd; /* 3128 8 */
        /* --- cacheline 49 boundary (3136 bytes) --- */
        int gaw; /* 3136 4 */
        int agaw; /* 3140 4 */
        int flags; /* 3144 4 */
        int iommu_coherency; /* 3148 4 */
        int iommu_snooping; /* 3152 4 */
        int iommu_count; /* 3156 4 */
        int iommu_superpage; /* 3160 4 */
        /* XXX 4 bytes hole, try to pack */
        u64 max_addr; /* 3168 8 */
        u32 default_pasid; /* 3176 4 */
        /* XXX 4 bytes hole, try to pack */
        struct iommu_domain domain; /* 3184 72 */
        /* size: 3256, cachelines: 51, members: 18 */
        /* sum members: 3245, holes: 3, sum holes: 11 */
        /* forced alignments: 1 */
        /* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));
After arranging it for natural padding and to make flags as u8 bits, it saves 8 bytes for the struct.
struct dmar_domain {
        int nid; /* 0 4 */
        unsigned int iommu_refcnt[128]; /* 4 512 */
        /* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
        u16 iommu_did[128]; /* 516 256 */
        /* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
        u8 has_iotlb_device:1; /* 772: 0 1 */
        u8 iommu_coherency:1; /* 772: 1 1 */
        u8 iommu_snooping:1; /* 772: 2 1 */
        /* XXX 5 bits hole, try to pack */
        /* XXX 3 bytes hole, try to pack */
        struct list_head devices; /* 776 16 */
        struct list_head subdevices; /* 792 16 */
        struct iova_domain iovad __attribute__((__aligned__(8))); /* 808 2320 */
        /* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
        struct dma_pte * pgd; /* 3128 8 */
        /* --- cacheline 49 boundary (3136 bytes) --- */
        int gaw; /* 3136 4 */
        int agaw; /* 3140 4 */
        int flags; /* 3144 4 */
        int iommu_count; /* 3148 4 */
        int iommu_superpage; /* 3152 4 */
        /* XXX 4 bytes hole, try to pack */
        u64 max_addr; /* 3160 8 */
        u32 default_pasid; /* 3168 4 */
        /* XXX 4 bytes hole, try to pack */
        struct iommu_domain domain; /* 3176 72 */
        /* size: 3248, cachelines: 51, members: 18 */
        /* sum members: 3236, holes: 3, sum holes: 11 */
        /* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 5 bits */
        /* forced alignments: 1 */
        /* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));
Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-20-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
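The size saving reported for commit 1f106ff0ea comes from packing boolean flags into adjacent single-bit fields. The sketch below shows the same effect on a much smaller pair of hypothetical structs (`struct caps_before`, `struct caps_after`); it assumes a compiler such as GCC or Clang that accepts `uint8_t` bit-fields, and the exact sizes depend on the target ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: one bool plus two int-sized flags. */
struct caps_before {
    int nid;
    bool has_iotlb_device;
    int iommu_coherency;
    int iommu_snooping;
    uint64_t max_addr;
};

/* Hypothetical "after" layout: the three flags share a single byte. */
struct caps_after {
    int nid;
    uint8_t has_iotlb_device:1;
    uint8_t iommu_coherency:1;
    uint8_t iommu_snooping:1;
    uint64_t max_addr;
};

/* Bytes saved by the bit-field layout on the current target. */
static size_t bytes_saved(void)
{
    return sizeof(struct caps_before) - sizeof(struct caps_after);
}
```

Running `pahole` on an object file built from these structs would show the flags sharing one byte in the second layout, in the same way as the dump above.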
YueHaibing	3bc770b0e9	iommu/vt-d: Use DEVICE_ATTR_RO macro Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(), which makes the code a bit shorter and easier to read. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210528130229.22108-1-yuehaibing@huawei.com Link: https://lore.kernel.org/r/20210610020115.1637656-19-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	d5b9e4bfe0	iommu/vt-d: Report prq to io-pgfault framework Let the IO page fault requests get handled through the io-pgfault framework. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	4c82b88696	iommu/vt-d: Allocate/register iopf queue for sva devices This allocates and registers the iopf queue infrastructure for devices which want to support IO page fault for SVA. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	4048377414	iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers Align the pasid alloc/free code with the generic helpers defined in the iommu core. This also refactored the SVA binding code to improve the readability. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Lu Baolu	521f546b4e	iommu/vt-d: Support asynchronous IOMMU nested capabilities Current VT-d implementation supports nested translation only if all underlying IOMMUs support the nested capability. This is unnecessary as the upper layer is allowed to create different containers and set them with different types of iommu backend. The IOMMU driver needs to guarantee that devices attached to a nested mode iommu_domain support the nested capability. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210517065701.5078-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Colin Ian King	05d2cbf969	iommu/vt-d: Remove redundant assignment to variable agaw The variable agaw is initialized with a value that is never read and it is being updated later with a new value as a counter in a for-loop. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210416171826.64091-1-colin.king@canonical.com Link: https://lore.kernel.org/r/20210610020115.1637656-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Lu Baolu	54c80d9074	iommu/vt-d: Use user privilege for RID2PASID translation When first-level page tables are used for IOVA translation, we use user privilege by setting U/S bit in the page table entry. This is to make it consistent with the second level translation, where the U/S enforcement is not available. Clear the SRE (Supervisor Request Enable) field in the pasid table entry of RID2PASID so that requests requesting the supervisor privilege are blocked and treated as DMA remapping faults. Fixes: `b802d070a5` ("iommu/vt-d: Use iova over first level") Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210512064426.3440915-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210519015027.108468-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-19 08:51:02 +02:00
Dan Carpenter	1a590a1c8b	iommu/vt-d: Check for allocation failure in aux_detach_device() In current kernels small allocations never fail, but checking for allocation failure is the correct thing to do. Fixes: `18abda7a2d` ("iommu/vt-d: Fix general protection fault in aux_detach_device()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/YJuobKuSn81dOPLd@mwanda Link: https://lore.kernel.org/r/20210519015027.108468-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-19 08:51:02 +02:00
Robin Murphy	2d471b20c5	iommu: Streamline registration interface Rather than have separate opaque setter functions that are easy to overlook and lead to repetitive boilerplate in drivers, let's pass the relevant initialisation parameters directly to iommu_device_register(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-16 17:20:45 +02:00
Joerg Roedel	49d11527e5	Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', 'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next	2021-04-16 17:16:03 +02:00
Longpeng(Mike)	38c527aeb4	iommu/vt-d: Force to flush iotlb before creating superpage The translation caches may preserve obsolete data when the mapping size is changed. The following sequence can reveal the problem with high probability.
1. mmap(4GB,MAP_HUGETLB)
2. while (1) {
   (a) DMA MAP 0,0xa0000
   (b) DMA UNMAP 0,0xa0000
   (c) DMA MAP 0,0xc0000000
       * DMA read IOVA 0 may fail here (Not present)
       * if the problem occurs.
   (d) DMA UNMAP 0,0xc0000000
   }
The page table (only focus on IOVA 0) after (a) is:
 PML4: 0x19db5c1003 entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003 entry:0xffff89b39cacb000
    PTE: 0x21d200803 entry:0xffff89b3b0a72000
The page table after (b) is:
 PML4: 0x19db5c1003 entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000
   PDE: 0x1a30a72003 entry:0xffff89b39cacb000
    PTE: 0x0 entry:0xffff89b3b0a72000
The page table after (c) is:
 PML4: 0x19db5c1003 entry:0xffff899bdcd2f000
  PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000
   PDE: 0x21d200883 entry:0xffff89b39cacb000 (*)
Because the PDE entry after (b) is present, it won't be flushed even if the iommu driver flushes the cache on unmap, so the obsolete data may be preserved in the cache, which would cause a wrong translation in the end. However, we can see the PDE entry is finally switched to a 2M-superpage mapping, but it does not transform to 0x21d200883 directly:
1. PDE: 0x1a30a72003
2. __domain_mapping
     dma_pte_free_pagetable
       Set the PDE entry to ZERO
     Set the PDE entry to 0x21d200883
So we must flush the cache after the entry is switched to ZERO to avoid the obsolete info being preserved.
Cc: David Woodhouse <dwmw2@infradead.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Gonglei (Arei) <arei.gonglei@huawei.com> Fixes: `6491d4d028` ("intel-iommu: Free old page tables before creating superpage") Cc: <stable@vger.kernel.org> # v3.0+ Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/ Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210415004628.1779-1-longpeng2@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-15 16:12:31 +02:00
Lu Baolu	c0474a606e	iommu/vt-d: Invalidate PASID cache when root/context entry changed When the Intel IOMMU is operating in the scalable mode, some information from the root and context table may be used to tag entries in the PASID cache. Software should invalidate the PASID-cache when changing root or context table entries. Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: `7373a8cc38` ("iommu/vt-d: Setup context and enable RID2PASID support") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 11:55:47 +02:00
Lu Baolu	eea53c5816	iommu/vt-d: Remove WO permissions on second-level paging entries When the first level page table is used for IOVA translation, it only supports Read-Only and Read-Write permissions. The Write-Only permission is not supported as the PRESENT bit (implying Read permission) should always set. When using second level, we still give separate permissions that allows WriteOnly which seems inconsistent and awkward. We want to have consistent behavior. After moving to 1st level, we don't want things to work sometimes, and break if we use 2nd level for the same mappings. Hence remove this configuration. Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: `b802d070a5` ("iommu/vt-d: Use iova over first level") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 11:55:47 +02:00
Robin Murphy	a250c23f15	iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE Instead make the global iommu_dma_strict parameter in iommu.c canonical by exporting helpers to get and set it and use those directly in the drivers. This makes sure that the iommu.strict parameter also works for the AMD and Intel IOMMU drivers on x86. As those default to lazy flushing a new IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to represent the default if not overridden by an explicit parameter. [ported on top of the other iommu_attr changes and added a few small missing bits] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210401155256.298656-19-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:56:53 +02:00
Christoph Hellwig	7e14754778	iommu: remove DOMAIN_ATTR_NESTING Use an explicit enable_nesting method instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Li Yang <leoyang.li@nxp.com> Link: https://lore.kernel.org/r/20210401155256.298656-17-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:56:53 +02:00
Jean-Philippe Brucker	9003351cb6	iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF Allow drivers to query and enable IOMMU_DEV_FEAT_IOPF, which amounts to checking whether PRI is enabled. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20210401154718.307519-5-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:54:29 +02:00
Lu Baolu	6c00612d0c	iommu/vt-d: Report right snoop capability when using FL for IOVA The Intel VT-d driver checks the wrong register to report snoop capability when using first level page table for GPA to HPA translation. This might lead the IOMMU driver to say that it supports snooping control, but in reality, it does not. Fix this by always setting PASID-table-entry.PGSNP whenever a pasid entry is set up for GPA to HPA translation so that the IOMMU driver could report snoop capability as long as it runs in the scalable mode. Fixes: `b802d070a5` ("iommu/vt-d: Use iova over first level") Suggested-by: Rajesh Sankaran <rajesh.sankaran@intel.com> Suggested-by: Kevin Tian <kevin.tian@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210330021145.13824-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:41:30 +02:00
John Garry	363f266eef	iommu/vt-d: Remove IOVA domain rcache flushing for CPU offlining Now that the core code handles flushing per-IOVA domain CPU rcaches, remove the handling here. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1616675401-151997-3-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:27:27 +02:00
Kyung Min Park	dec991e472	iommu/vt-d: Disable SVM when ATS/PRI/PASID are not enabled in the device Currently, the Intel VT-d supports Shared Virtual Memory (SVM) only when IO page fault is supported. Otherwise, shared memory pages can not be swapped out and need to be pinned. The device needs the Address Translation Service (ATS), Page Request Interface (PRI) and Process Address Space Identifier (PASID) capabilities to be enabled to support IO page fault. Disable SVM when ATS, PRI and PASID are not enabled in the device. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210314201534.918-1-kyung.min.park@intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:23:52 +01:00
Robin Murphy	3542dcb15c	iommu/dma: Resurrect the "forcedac" option In converting intel-iommu over to the common IOMMU DMA ops, it quietly lost the functionality of its "forcedac" option. Since this is a handy thing both for testing and for performance optimisation on certain platforms, reimplement it under the common IOMMU parameter namespace. For the sake of fixing the inadvertent breakage of the Intel-specific parameter, remove the dmar_forcedac remnants and hook it up as an alias while documenting the transition to the new common parameter. Fixes: `c588072bba` ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/7eece8e0ea7bfbe2cd0e30789e0d46df573af9b0.1614961776.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 10:55:23 +01:00
Joerg Roedel	45e606f272	Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next	2021-02-12 15:27:17 +01:00
Yian Chen	31a75cbbb9	iommu/vt-d: Parse SATC reporting structure Software should parse every SATC table and all devices in the tables reported by the BIOS and keep the information in kernel list for further reference. Signed-off-by: Yian Chen <yian.chen@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210203093329.1617808-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Lu Baolu	933fcd01e9	iommu/vt-d: Add iotlb_sync_map callback Some Intel VT-d hardware implementations don't support memory coherency for page table walk (presented by the Page-Walk-coherency bit in the ecap register), so that software must flush the corresponding CPU cache lines explicitly after each page table entry update. The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through iommu_ops callback. As the result, a single sg mapping may lead to multiple cache line flushes, which leads to the degradation of I/O performance after the commit <c588072bba6b5> ("iommu/vt-d: Convert intel iommu driver to the iommu ops"). Fix this by adding iotlb_sync_map callback and centralizing the clflush operations after all sg mappings. Fixes: `c588072bba` ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Reported-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/linux-iommu/D81314ED-5673-44A6-B597-090E3CB83EB0@oracle.com/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Robin Murphy <robin.murphy@arm.com> [ cel: removed @first_pte, which is no longer used ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/linux-iommu/161177763962.1311.15577661784296014186.stgit@manet.1015granger.net Link: https://lore.kernel.org/r/20210204014401.2846425-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Kyung Min Park	010bf5659e	iommu/vt-d: Move capability check code to cap_audit files Move IOMMU capability check and sanity check code to cap_audit files. Also implement some helper functions for sanity checks. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Kyung Min Park	ad3d190299	iommu/vt-d: Audit IOMMU Capabilities and add helper functions Audit IOMMU Capability/Extended Capability and check if the IOMMUs have the consistent value for features. Report out or scale to the lowest supported when IOMMU features have incompatibility among IOMMUs. Report out features when below features are mismatched: - First Level 5 Level Paging Support (FL5LP) - First Level 1 GByte Page Support (FL1GP) - Read Draining (DRD) - Write Draining (DWD) - Page Selective Invalidation (PSI) - Zero Length Read (ZLR) - Caching Mode (CM) - Protected High/Low-Memory Region (PHMR/PLMR) - Required Write-Buffer Flushing (RWBF) - Advanced Fault Logging (AFL) - RID-PASID Support (RPS) - Scalable Mode Page Walk Coherency (SMPWC) - First Level Translation Support (FLTS) - Second Level Translation Support (SLTS) - No Write Flag Support (NWFS) - Second Level Accessed/Dirty Support (SLADS) - Virtual Command Support (VCS) - Scalable Mode Translation Support (SMTS) - Device TLB Invalidation Throttle (DIT) - Page Drain Support (PDS) - Process Address Space ID Support (PASID) - Extended Accessed Flag Support (EAFS) - Supervisor Request Support (SRS) - Execute Request Support (ERS) - Page Request Support (PRS) - Nested Translation Support (NEST) - Snoop Control (SC) - Pass Through (PT) - Device TLB Support (DT) - Queued Invalidation (QI) - Page walk Coherency (C) Set capability to the lowest supported when below features are mismatched: - Maximum Address Mask Value (MAMV) - Number of Fault Recording Registers (NFR) - Second Level Large Page Support (SLLPS) - Fault Recording Offset (FRO) - Maximum Guest Address Width (MGAW) - Supported Adjusted Guest Address Width (SAGAW) - Number of Domains supported (NDOMS) - Pasid Size Supported (PSS) - Maximum Handle Mask Value (MHMV) - IOTLB Register Offset (IRO) Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Lu Baolu	e1ed66ac30	iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration] trace_qi_submit() could be used when interrupt remapping is supported, but DMA remapping is not. In this case, the following compile error occurs. ../drivers/iommu/intel/dmar.c: In function 'qi_submit_sync': ../drivers/iommu/intel/dmar.c:1311:3: error: implicit declaration of function 'trace_qi_submit'; did you mean 'ftrace_nmi_exit'? [-Werror=implicit-function-declaration] trace_qi_submit(iommu, desc[i].qw0, desc[i].qw1, ^~~~~~~~~~~~~~~ ftrace_nmi_exit Fixes: `f2dd871799` ("iommu/vt-d: Add qi_submit trace event") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130151907.3929148-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-02 14:43:56 +01:00
Nadav Amit	29b3283972	iommu/vt-d: Do not use flush-queue when caching-mode is on When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and a TLB flush is needed after mapping to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge no real physical IOMMU reports "caching-mode" as turned on. Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit `78d5f0f500` ("intel-iommu: Avoid global flushes with caching mode.") This behavior changed after commit `13cf017446` ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one. Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability. Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization. Fixes: `13cf017446` ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Nadav Amit <namit@vmware.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 13:59:02 +01:00
Lu Baolu	a8ce9ebbec	iommu/vt-d: Preset Access/Dirty bits for IOVA over FL The Access/Dirty bits in the first level page table entry will be set whenever a page table entry was used for address translation or write permission was successfully translated. This is always true when using the first-level page table for kernel IOVA. Instead of wasting hardware cycles to update the certain bits, it's better to set them up at the beginning. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 11:33:35 +01:00
Tian Tao	694a1c0ade	iommu/vt-d: Fix duplicate included linux/dma-map-ops.h linux/dma-map-ops.h is included more than once, Remove the one that isn't necessary. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609118774-10083-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-12 16:56:20 +00:00
Liu Yi L	7c29ada5e7	iommu/vt-d: Fix ineffective devTLB invalidation for subdevices iommu_flush_dev_iotlb() is called to invalidate caches on a device but only loops over the devices which are fully-attached to the domain. For sub-devices, this is ineffective and can result in invalid caching entries left on the device. Fix the missing invalidation by adding a loop over the subdevices and ensuring that 'domain->has_iotlb_device' is updated when attaching to subdevices. Fixes: `67b8e02b5e` ("iommu/vt-d: Aux-domain specific domain attach/detach") Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609949037-25291-4-git-send-email-yi.l.liu@intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 14:38:15 +00:00
Liu Yi L	18abda7a2d	iommu/vt-d: Fix general protection fault in aux_detach_device() The aux-domain attach/detach are not tracked, some data structures might be used after free. This causes general protection faults when multiple subdevices are created and assigned to a same guest machine: \| general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI \| RIP: 0010:intel_iommu_aux_detach_device+0x12a/0x1f0 \| [...] \| Call Trace: \| iommu_aux_detach_device+0x24/0x70 \| vfio_mdev_detach_domain+0x3b/0x60 \| ? vfio_mdev_set_domain+0x50/0x50 \| iommu_group_for_each_dev+0x4f/0x80 \| vfio_iommu_detach_group.isra.0+0x22/0x30 \| vfio_iommu_type1_detach_group.cold+0x71/0x211 \| ? find_exported_symbol_in_section+0x4a/0xd0 \| ? each_symbol_section+0x28/0x50 \| __vfio_group_unset_container+0x4d/0x150 \| vfio_group_try_dissolve_container+0x25/0x30 \| vfio_group_put_external_user+0x13/0x20 \| kvm_vfio_group_put_external_user+0x27/0x40 [kvm] \| kvm_vfio_destroy+0x45/0xb0 [kvm] \| kvm_put_kvm+0x1bb/0x2e0 [kvm] \| kvm_vm_release+0x22/0x30 [kvm] \| __fput+0xcc/0x260 \| ____fput+0xe/0x10 \| task_work_run+0x8f/0xb0 \| do_exit+0x358/0xaf0 \| ? wake_up_state+0x10/0x20 \| ? signal_wake_up_state+0x1a/0x30 \| do_group_exit+0x47/0xb0 \| __x64_sys_exit_group+0x18/0x20 \| do_syscall_64+0x57/0x1d0 \| entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix the crash by tracking the subdevices when attaching and detaching aux-domains. Fixes: `67b8e02b5e` ("iommu/vt-d: Aux-domain specific domain attach/detach") Co-developed-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609949037-25291-3-git-send-email-yi.l.liu@intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 14:35:14 +00:00
Will Deacon	c74009f529	Merge branch 'for-next/iommu/fixes' into for-next/iommu/core Merge in IOMMU fixes for 5.10 in order to resolve conflicts against the queue for 5.11. * for-next/iommu/fixes: iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs iommu/vt-d: Don't read VCCAP register unless it exists x86/tboot: Don't disable swiotlb when iommu is forced on iommu: Check return of __iommu_attach_device() arm-smmu-qcom: Ensure the qcom_scm driver has finished probing iommu/amd: Enforce 4k mapping for certain IOMMU data structures MAINTAINERS: Temporarily add myself to the IOMMU entry iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set iommu/vt-d: Avoid panic if iommu init fails in tboot system iommu/vt-d: Cure VF irqdomain hickup x86/platform/uv: Fix copied UV5 output archtype x86/platform/uv: Drop last traces of uv_flush_tlb_others	2020-12-08 15:21:49 +00:00
Will Deacon	113eb4ce4f	Merge branch 'for-next/iommu/vt-d' into for-next/iommu/core Intel VT-D updates for 5.11. The main thing here is converting the code over to the iommu-dma API, which required some improvements to the core code to preserve existing functionality. * for-next/iommu/vt-d: iommu/vt-d: Avoid GFP_ATOMIC where it is not needed iommu/vt-d: Remove set but not used variable iommu/vt-d: Cleanup after converting to dma-iommu ops iommu/vt-d: Convert intel iommu driver to the iommu ops iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev iommu: Add quirk for Intel graphic devices in map_sg iommu: Allow the dma-iommu api to use bounce buffers iommu: Add iommu_dma_free_cpu_cached_iovas() iommu: Handle freelists when using deferred flushing in iommu drivers iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM	2020-12-08 15:11:58 +00:00


102 Commits