OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Zhen Lei	aa3ac9469c	iommu/iova: Make dma_32bit_pfn implicit Now that the cached node optimisation can apply to all allocations, the couple of users which were playing tricks with dma_32bit_pfn in order to benefit from it can stop doing so. Conversely, there is also no need for all the other users to explicitly calculate a 'real' 32-bit PFN, when init_iova_domain() can happily do that itself from the page granularity. CC: Thierry Reding <thierry.reding@gmail.com> CC: Jonathan Hunter <jonathanh@nvidia.com> CC: David Airlie <airlied@linux.ie> CC: Sudeep Dutt <sudeep.dutt@intel.com> CC: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Zhen Lei <thunder.leizhen@huawei.com> Tested-by: Nate Watterson <nwatters@codeaurora.org> [rm: use iova_shift(), rewrote commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-27 17:09:57 +02:00
Linus Torvalds	4dfc278803	IOMMU Updates for Linux v4.14 Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - Common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - Finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - Finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - New functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - Support for mt2712 in the Mediatek IOMMU driver - New IOMMU driver for QCOM hardware - System PM support for ARM-SMMU - Shutdown method for ARM-SMMU-v3 - Some constification patches - Various other small improvements and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZtCFNAAoJECvwRC2XARrjZnQP/AxC/ezQpq82HbegF4sM/cVE Ep7TeTqodEl75FS/6txe2wU0pwodqk/LB9ajfQZUbE1w8vKsNEqi5qf4ZYHGoxYI 5bWyjJBzKIlwENH5lsBpQNt6XLevrYmRsFy7F0tRYy+qPQq8k+js2i7/XkCL3q7L 3xklF847RRoITaTOhhaROx1pF23dSMEsS2XGuWHcZfjORtep4wcFKzd/2SvlCWCo P2bRU7jBzfJuuGSA80gaiUbDmrULTUfYuZNp7njASzCgsDmagERtvDEpdoXPNNSp u6s4LjU1Dp3fgr6g6cFRO7B6JUbWd619nwo9so/c/wZN54yEngBF9EyeeF3mv2O5 ZbM2mOW3RlZcjxFT/AC8G4cZwwP6MpCEQOdqknoqc6ZQwcDqwN0o9I4+po0wsiAU 89ijZZe9Mx0p9lNpihaBEB1erAUWPo5Obh62zo80W3h6x9WzkGQWM+PyFK2DYoaC 8biEZzcc21sLEHvXQkcEGJSKrihHr9sluOqvxmCw5QAkYIFAeZRoeH7JtZWjVCnr T3XvaG1G1Aw6tS7ErxufdKawREAGki0Rm9i1baiH9sqNj5rllM01Y+PgU6E21Nbg iZp9gJLjfwM4vhYLlovvQK5PRoOBsCkyCpEI4GJqjLeam5p/WN06CFFf0ifQofYr qDPCVDkWHWV8nugFFKE7 =EVh9 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - new functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - support for mt2712 in the Mediatek IOMMU driver - new IOMMU driver for QCOM hardware - system PM support for ARM-SMMU - shutdown method for ARM-SMMU-v3 - some constification patches - various other small improvements and fixes" * tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits) iommu/vt-d: Don't be too aggressive when clearing one context entry iommu: Introduce Interface for IOMMU TLB Flushing iommu/s390: Constify iommu_ops iommu/vt-d: Avoid calling virt_to_phys() on null pointer iommu/vt-d: IOMMU Page Request needs to check if address is canonical. arm/tegra: Call bus_set_iommu() after iommu_device_register() iommu/exynos: Constify iommu_ops iommu/ipmmu-vmsa: Make ipmmu_gather_ops const iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable iommu/amd: Rename a few flush functions iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY iommu/mediatek: Fix a build warning of BIT(32) in ARM iommu/mediatek: Fix a build fail of m4u_type iommu: qcom: annotate PM functions as __maybe_unused iommu/pamu: Fix PAMU boot crash memory: mtk-smi: Degrade SMI init to module_init iommu/mediatek: Enlarge the validate PA range for 4GB mode iommu/mediatek: Disable iommu clock when system suspend iommu/mediatek: Move pgtable allocation into domain_alloc iommu/mediatek: Merge 2 M4U HWs into one iommu domain ...	2017-09-09 15:03:24 -07:00
Linus Torvalds	0d519f2d1e	pci-v4.14-changes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq 64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc 0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6 B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz 4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ EG76ridTolUFVV+wzJD9 =9cLP -----END PGP SIGNATURE----- Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add enhanced Downstream Port Containment support, which prints more details about Root Port Programmed I/O errors (Dongdong Liu) - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang) - add MediaTek MT2712 and MT7622 support (Ryder Lee) - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang) - add Qualcom IPQ8074 support (Varadarajan Narayanan) - add R-Car r8a7743/5 device tree support (Biju Das) - add Rockchip per-lane PHY support for better power management (Shawn Lin) - fix IRQ mapping for hot-added devices by replacing the pci_fixup_irqs() boot-time design with a host bridge hook called at probe-time (Lorenzo Pieralisi, Matthew Minter) - fix race when enabling two devices that results in upstream bridge not being enabled correctly (Srinath Mannam) - fix pciehp power fault infinite loop (Keith Busch) - fix SHPC bridge MSI hotplug events by enabling bus mastering (Aleksandr Bezzubikov) - fix a VFIO issue by correcting PCIe capability sizes (Alex Williamson) - fix an INTD issue on Xilinx and possibly other drivers by unifying INTx IRQ domain support (Paul Burton) - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg Roedel) - allow APM X-Gene device assignment to guests by adding an ACS quirk (Feng Kan) - fix driver crashes by disabling Extended Tags on Broadcom HT2100 (Extended Tags support is required for PCIe Receivers but not Requesters, and we now enable them by default when Requesters support them) (Sinan Kaya) - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs use the real Requester ID (not a phantom RID) (Robin Murphy) - prevent assignment of Intel VMD children to guests (which may be supported eventually, but isn't yet) by not associating an IOMMU with them (Jon Derrick) - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott Bauer) - fix a Function-Level Reset issue with Intel 750 NVMe by waiting longer (up to 60sec instead of 1sec) for device to become ready (Sinan Kaya) - fix a Function-Level Reset issue on iProc Stingray by working around hardware defects in the CRS implementation (Oza Pawandeep) - fix an issue with Intel NVMe P3700 after an iProc reset by adding a delay during shutdown (Oza Pawandeep) - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking in compose_msi_msg() (Stephen Hemminger) - fix a wireless LAN driver timeout by clearing DesignWare MSI interrupt status after it is handled, not before (Faiz Abbas) - fix DesignWare ATU enable checking (Jisheng Zhang) - reduce Layerscape dependencies on the bootloader by doing more initialization in the driver (Hou Zhiqiang) - improve Intel VMD performance allowing allocation of more IRQ vectors than present CPUs (Keith Busch) - improve endpoint framework support for initial DMA mask, different BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon Vijay Abraham I, Stan Drozd) - rework CRS support to add periodic messages while we poll during enumeration and after Function-Level Reset and prepare for possible other uses of CRS (Sinan Kaya) - clean up Root Port AER handling by removing unnecessary code and moving error handler methods to struct pcie_port_service_driver (Christoph Hellwig) - clean up error handling paths in various drivers (Bjorn Andersson, Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen, Lorenzo Pieralisi, Sergei Shtylyov) - clean up SR-IOV resource handling by disabling VF decoding before updating the corresponding resource structs (Gavin Shan) - clean up DesignWare-based drivers by unifying quirks to update Class Code and Interrupt Pin and related handling of write-protected registers (Hou Zhiqiang) - clean up by adding empty generic pcibios_align_resource() and pcibios_fixup_bus() and removing empty arch-specific implementations (Palmer Dabbelt) - request exclusive reset control for several drivers to allow cleanup elsewhere (Philipp Zabel) - constify various structures (Arvind Yadav, Bhumika Goyal) - convert from full_name() to %pOF (Rob Herring) - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn Lin) * tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits) PCI: xgene: Clean up whitespace PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset PCI: xgene: Fix platform_get_irq() error handling PCI: xilinx-nwl: Fix platform_get_irq() error handling PCI: rockchip: Fix platform_get_irq() error handling PCI: altera: Fix platform_get_irq() error handling PCI: spear13xx: Fix platform_get_irq() error handling PCI: artpec6: Fix platform_get_irq() error handling PCI: armada8k: Fix platform_get_irq() error handling PCI: dra7xx: Fix platform_get_irq() error handling PCI: exynos: Fix platform_get_irq() error handling PCI: iproc: Clean up whitespace PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP PCI: iproc: Add 500ms delay during device shutdown PCI: Fix typos and whitespace errors PCI: Remove unused "res" variable from pci_resource_io() PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag() PCI/AER: Reformat AER register definitions iommu/vt-d: Prevent VMD child devices from being remapping targets x86/PCI: Use is_vmd() rather than relying on the domain number ...	2017-09-08 15:47:43 -07:00
Joerg Roedel	47b59d8e40	Merge branches 'arm/exynos', 'arm/renesas', 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-09-01 11:31:42 +02:00
Filippo Sironi	5082219b6a	iommu/vt-d: Don't be too aggressive when clearing one context entry Previously, we were invalidating context cache and IOTLB globally when clearing one context entry. This is a tad too aggressive. Invalidate the context cache and IOTLB for the interested device only. Signed-off-by: Filippo Sironi <sironi@amazon.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-09-01 11:30:41 +02:00
Jon Derrick	5823e330b5	iommu/vt-d: Prevent VMD child devices from being remapping targets VMD child devices must use the VMD endpoint's ID as the requester. Because of this, there needs to be a way to link the parent VMD endpoint's IOMMU group and associated mappings to the VMD child devices such that attaching and detaching child devices modify the endpoint's mappings, while preventing early detaching on a singular device removal or unbinding. The reassignment of individual VMD child devices devices to VMs is outside the scope of VMD, but may be implemented in the future. For now it is best to prevent any such attempts. Prevent VMD child devices from returning an IOMMU, which prevents it from exposing an iommu_group sysfs directory and allowing subsequent binding by userspace-access drivers such as VFIO. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-30 16:43:58 -05:00
Ashok Raj	11b93ebfa0	iommu/vt-d: Avoid calling virt_to_phys() on null pointer New kernels with debug show panic() from __phys_addr() checks. Avoid calling virt_to_phys() when pasid_state_tbl pointer is null To: Joerg Roedel <joro@8bytes.org> To: linux-kernel@vger.kernel.org> Cc: iommu@lists.linux-foundation.org Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jacob Pan <jacob.jun.pan@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Fixes: `2f26e0a9c9` ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-30 17:48:18 +02:00
Joerg Roedel	13cf017446	iommu/vt-d: Make use of iova deferred flushing Remove the deferred flushing implementation in the Intel VT-d driver and use the one from the common iova code instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 18:23:53 +02:00
Joerg Roedel	2926a2aa5c	iommu: Fix wrong freeing of iommu_device->dev The struct iommu_device has a 'struct device' embedded into it, not as a pointer, but the whole struct. In the conversion of the iommu drivers to use struct iommu_device it was forgotten that the relase function for that struct device simply calls kfree() on the pointer. This frees memory that was never allocated and causes memory corruption. To fix this issue, use a pointer to struct device instead of embedding the whole struct. This needs some updates in the iommu sysfs code as well as the Intel VT-d and AMD IOMMU driver. Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: `39ab9555c2` ('iommu: Add sysfs bindings for struct iommu_device') Cc: stable@vger.kernel.org # >= v4.11 Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-08-15 13:58:48 +02:00
David Dillow	bc24c57159	iommu/vt-d: Don't free parent pagetable of the PTE we're adding When adding a large scatterlist entry that covers more than the L3 superpage size (1GB) but has an alignment such that we must use L2 superpages (2MB) , we give dma_pte_free_level() a range that causes it to free the L3 pagetable we're about to populate. We fix this by telling dma_pte_free_pagetable() about the pagetable level we're about to populate to prevent freeing it. For example, mapping a scatterlist with entry lengths 854MB and 1194MB at IOVA 0xffff80000000 would, when processing the 2MB-aligned second entry, cause pfn_to_dma_pte() to create a L3 directory to hold L2 superpages for the mapping at IOVA 0xffffc0000000. We would previously call dma_pte_free_pagetable(domain, 0xffffc0000, 0xfffffffff), which would free the L3 directory pfn_to_dma_pte() just created for IO PFN 0xffffc0000. Telling dma_pte_free_pagetable() to retain the L3 directories while using L2 superpages avoids the erroneous free. Signed-off-by: David Dillow <dillow@google.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-07-26 11:12:58 +02:00
Linus Torvalds	fb4e3beeff	IOMMU Updates for Linux v4.13 This update comes with: * Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM * Some Errata workarounds for ARM SMMU implemenations * Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before * Support for amd_iommu=off when booting with kexec. Problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel * The Rockchip IOMMU driver is now available on ARM64 * Align the return value of the iommu_ops->device_group call-backs to not miss error values * Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT * Various other small cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZZgddAAoJECvwRC2XARrjurgQANO338GIBr2ZkA0oectidDpZ Y4yu7W9RH6NyhupJG/Xooya7daBWFjbaA1AVJ3ZZNlMERh69AmehVfRUfVMzF2w+ buma58HQgiJWN1zFD8xdeMzYKms9P77whA88C/9QvrK/klB3LipWP2SC0yvvvyxJ mMCDpgt+D+CGnIDqbRuyLDQoRu3yjAkAvYb6OzL8DPJVP1Y5oLffGwGnHzJbJnOf eWJwYHM5ai0uF/Qqy6RNNekacObjVaOLihjugGvokH6ipXfOrSSNriXW9pZiWR5m S91898YTP3KuWWsJM+N93UAjvc6pL9PqL/OvbB9zdYpzu+5PtUpFXHYcOebKyEEO 4j9CaRzubsWFTFjbYItJnR4WgXQRf4NKOGfTfHMHA+dY8aODYnlXNVdQDAA2aFgn TUBvHq5xb0zZ3nbPwtTDyW06oDMVfBBarLx2yFI1aQSSh+eg/GtIi5KP28gyFZNz 4gWj0q3g/e3y7WEwNbYV7L3TS0d/p8VUYFtUp7PUCddnWoY+4cJzgidub5xIViZD Ql0nZzga9pXXIE/kE5Pf74WqrG7JJzZsvK2ABy4+XGrMq6RclJf+0pXbSqiXDpXL quw8t0oXw0ZEeavQ31Za8mjXBvo5ocM5iintl1wrl2BujHEO3oKqbGsIOaRcLnlN Ukehbl4OEKzZpD3oLPPk =pmBf -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This update comes with: - Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM - Some Errata workarounds for ARM SMMU implemenations - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before - Support for amd_iommu=off when booting with kexec. The problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel - The Rockchip IOMMU driver is now available on ARM64 - Align the return value of the iommu_ops->device_group call-backs to not miss error values - Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT - Various other small cleanups and fixes" * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits) iommu/vt-d: Constify intel_dma_ops iommu: Warn once when device_group callback returns NULL iommu/omap: Return ERR_PTR in device_group call-back iommu: Return ERR_PTR() values from device_group call-backs iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() iommu/vt-d: Don't disable preemption while accessing deferred_flush() iommu/iova: Don't disable preempt around this_cpu_ptr() iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE iommu/arm-smmu-v3: Remove io-pgtable spinlock iommu/arm-smmu: Remove io-pgtable spinlock iommu/io-pgtable-arm-v7s: Support lockless operation iommu/io-pgtable-arm: Support lockless operation iommu/io-pgtable: Introduce explicit coherency iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap ...	2017-07-12 10:00:04 -07:00
Linus Torvalds	f72e24a124	This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlldmw0LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOiKA/+Ln1mFLSf3nfTzIHa24Bbk8ZTGr0B8TD4Vmyyt8iG oO3AeaTLn3d6ugbH/uih/tPz8PuyXsdiTC1rI/ejDMiwMTSjW6phSiIHGcStSR9X VFNhmMFacp7QpUpvxceV0XZYKDViAoQgHeGdp3l+K5h/v4AYePV/v/5RjQPaEyOh YLbCzETO+24mRWdJxdAqtTW4ovYhzj6XsiJ+pAjlV0+SWU6m5L5E+VAPNi1vqv1H 1O2KeCFvVYEpcnfL3qnkw2timcjmfCfeFAd9mCUAc8mSRBfs3QgDTKw3XdHdtRml LU2WuA5cpMrOdBO4mVra2plo8E2szvpB1OZZXoKKdCpK3VGwVpVHcTvClK2Ks/3B GDLieroEQNu2ZIUIdWXf/g2x6le3BcC9MmpkAhnGPqCZ7skaIBO5Cjpxm0zTJAPl PPY3CMBBEktAvys6DcudOYGixNjKUuAm5lnfpcfTEklFdG0AjhdK/jZOplAFA6w4 LCiy0rGHM8ZbVAaFxbYoFCqgcjnv6EjSiqkJxVI4fu/Q7v9YXfdPnEmE0PJwCVo5 +i7aCLgrYshTdHr/F3e5EuofHN3TDHwXNJKGh/x97t+6tt326QMvDKX059Kxst7R rFukGbrYvG8Y7yXwrSDbusl443ta0Ht7T1oL4YUoJTZp0nScAyEluDTmrH1JVCsT R4o= =0Fso -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...	2017-07-06 19:20:54 -07:00
Christoph Hellwig	5860acc1a9	x86: remove arch specific dma_supported implementation And instead wire it up as method for all the dma_map_ops instances. Note that this also means the arch specific check will be fully instead of partially applied in the AMD iommu driver. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:46 -07:00
Joerg Roedel	6a7086431f	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-06-28 14:45:02 +02:00
Arvind Yadav	01e1932a17	iommu/vt-d: Constify intel_dma_ops Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 14:17:12 +02:00
Sebastian Andrzej Siewior	58c4a95f90	iommu/vt-d: Don't disable preemption while accessing deferred_flush() get_cpu() disables preemption and returns the current CPU number. The CPU number is only used once while retrieving the address of the local's CPU deferred_flush pointer. We can instead use raw_cpu_ptr() while we remain preemptible. The worst thing that can happen is that flush_unmaps_timeout() is invoked multiple times: once by taskA after seeing HIGH_WATER_MARK and then preempted to another CPU and then by taskB which saw HIGH_WATER_MARK on the same CPU as taskA. It is also likely that ->size got from HIGH_WATER_MARK to 0 right after its read because another CPU invoked flush_unmaps_timeout() for this CPU. The access to flush_data is protected by a spinlock so even if we get migrated to another CPU or preempted - the data structure is protected. While at it, I marked deferred_flush static since I can't find a reference to it outside of this file. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:24:40 +02:00
Peter Xu	b316d02a13	iommu/vt-d: Unwrap __get_valid_domain_for_dev() We do find_domain() in __get_valid_domain_for_dev(), while we do the same thing in get_valid_domain_for_dev(). No need to do it twice. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-30 11:41:32 +02:00
Thomas Gleixner	b608fe356f	iommu/vt-d: Adjust system_state checks To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state checks in dmar_parse_one_atsr() and dmar_iommu_notify_scope_dev() to handle the extra states. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20170516184735.712365947@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-05-23 10:01:36 +02:00
KarimAllah Ahmed	f73a7eee90	iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings Ever since commit `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") the kdump kernel copies the IOMMU context tables from the previous kernel. Each device mappings will be destroyed once the driver for the respective device takes over. This unfortunately breaks the workflow of mapping and unmapping a new context to the IOMMU. The mapping function assumes that either: 1) Unmapping did the proper IOMMU flushing and it only ever flush if the IOMMU unit supports caching invalid entries. 2) The system just booted and the initialization code took care of flushing all IOMMU caches. This assumption is not true for the kdump kernel since the context tables have been copied from the previous kernel and translations could have been cached ever since. So make sure to flush the IOTLB as well when we destroy these old copied mappings. Cc: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: stable@vger.kernel.org v4.2+ Fixes: `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-17 14:44:47 +02:00
Joerg Roedel	2c0248d688	Merge branches 'arm/exynos', 'arm/omap', 'arm/rockchip', 'arm/mediatek', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd' and 'core' into next	2017-05-04 18:06:17 +02:00
Shaohua Li	bfd20f1cc8	x86, iommu/vt-d: Add an option to disable Intel IOMMU force on IOMMU harms performance signficantly when we run very fast networking workloads. It's 40GB networking doing XDP test. Software overhead is almost unaware, but it's the IOTLB miss (based on our analysis) which kills the performance. We observed the same performance issue even with software passthrough (identity mapping), only the hardware passthrough survives. The pps with iommu (with software passthrough) is only about ~30% of that without it. This is a limitation in hardware based on our observation, so we'd like to disable the IOMMU force on, but we do want to use TBOOT and we can sacrifice the DMA security bought by IOMMU. I must admit I know nothing about TBOOT, but TBOOT guys (cc-ed) think not eabling IOMMU is totally ok. So introduce a new boot option to disable the force on. It's kind of silly we need to run into intel_iommu_init even without force on, but we need to disable TBOOT PMR registers. For system without the boot option, nothing is changed. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-04-26 23:57:53 +02:00
Joerg Roedel	161b28aae1	iommu/vt-d: Make sure IOMMUs are off when intel_iommu=off When booting into a kexec kernel with intel_iommu=off, and the previous kernel had intel_iommu=on, the IOMMU hardware is still enabled and gets not disabled by the new kernel. This causes the boot to fail because DMA is blocked by the hardware. Disable the IOMMUs when we find it enabled in the kexec kernel and boot with intel_iommu=off. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-03-29 17:02:00 +02:00
Robin Murphy	9d3a4de4cb	iommu: Disambiguate MSI region types The introduction of reserved regions has left a couple of rough edges which we could do with sorting out sooner rather than later. Since we are not yet addressing the potential dynamic aspect of software-managed reservations and presenting them at arbitrary fixed addresses, it is incongruous that we end up displaying hardware vs. software-managed MSI regions to userspace differently, especially since ARM-based systems may actually require one or the other, or even potentially both at once, (which iommu-dma currently has no hope of dealing with at all). Let's resolve the former user-visible inconsistency ASAP before the ABI has been baked into a kernel release, in a way that also lays the groundwork for the latter shortcoming to be addressed by follow-up patches. For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use IOMMU_RESV_MSI to describe the hardware type, and document everything a little bit. Since the x86 MSI remapping hardware falls squarely under this meaning of IOMMU_RESV_MSI, apply that type to their regions as well, so that we tell the same story to userspace across all platforms. Secondly, as the various region types require quite different handling, and it really makes little sense to ever try combining them, convert the bitfield-esque #defines to a plain enum in the process before anyone gets the wrong impression. Fixes: `d30ddcaa7b` ("iommu: Add a new type field in iommu_resv_region") Reviewed-by: Eric Auger <eric.auger@redhat.com> CC: Alex Williamson <alex.williamson@redhat.com> CC: David Woodhouse <dwmw2@infradead.org> CC: kvm@vger.kernel.org Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-03-22 16:16:17 +01:00
Koos Vriezen	5003ae1e73	iommu/vt-d: Fix NULL pointer dereference in device_to_iommu The function device_to_iommu() in the Intel VT-d driver lacks a NULL-ptr check, resulting in this oops at boot on some platforms: BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0 PGD 0 [...] Call Trace: ? find_or_alloc_domain.constprop.29+0x1a/0x300 ? dw_dma_probe+0x561/0x580 [dw_dmac_core] ? __get_valid_domain_for_dev+0x39/0x120 ? __intel_map_single+0x138/0x180 ? intel_alloc_coherent+0xb6/0x120 ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm] ? mutex_lock+0x9/0x30 ? kernfs_add_one+0xdb/0x130 ? devres_add+0x19/0x60 ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm] ? platform_drv_probe+0x30/0x90 ? driver_probe_device+0x1ed/0x2b0 ? __driver_attach+0x8f/0xa0 ? driver_probe_device+0x2b0/0x2b0 ? bus_for_each_dev+0x55/0x90 ? bus_add_driver+0x110/0x210 ? 0xffffffffa11ea000 ? driver_register+0x52/0xc0 ? 0xffffffffa11ea000 ? do_one_initcall+0x32/0x130 ? free_vmap_area_noflush+0x37/0x70 ? kmem_cache_alloc+0x88/0xd0 ? do_init_module+0x51/0x1c4 ? load_module+0x1ee9/0x2430 ? show_taint+0x20/0x20 ? kernel_read_file+0xfd/0x190 ? SyS_finit_module+0xa3/0xb0 ? do_syscall_64+0x4a/0xb0 ? entry_SYSCALL64_slow_path+0x25/0x25 Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49 RIP [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0 RSP <ffffc90001457a78> CR2: 00000000000007ab ---[ end trace 16f974b6d58d0aad ]--- Add the missing pointer check. Fixes: `1c387188c6` ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions") Signed-off-by: Koos Vriezen <koos.vriezen@gmail.com> Cc: stable@vger.kernel.org # 4.8.15+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-03-22 12:01:42 +01:00
Joerg Roedel	a7fdb6e648	iommu/vt-d: Fix crash when accessing VT-d sysfs entries The link between the iommu sysfs-device and the struct intel_iommu is no longer stored as driver-data. Update the code to use the new access method. Reported-by: Dave Jones <davej@codemonkey.org.uk> Fixes: `39ab9555c2` ('iommu: Add sysfs bindings for struct iommu_device') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-28 15:48:15 +01:00
Lucas Stach	712c604dcd	mm: wire up GFP flag passing in dma_alloc_from_contiguous The callers of the DMA alloc functions already provide the proper context GFP flags. Make sure to pass them through to the CMA allocator, to make the CMA compaction context aware. Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alexander Graf <agraf@suse.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-24 17:46:55 -08:00
Joerg Roedel	8d2932dd06	Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas', 'arm/smmu', 'arm/mediatek', 'arm/core', 'x86/vt-d' and 'core' into next	2017-02-10 15:13:10 +01:00
Joerg Roedel	e3d10af112	iommu: Make iommu_device_link/unlink take a struct iommu_device This makes the interface more consistent with iommu_device_sysfs_add/remove. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-10 13:44:57 +01:00
Joerg Roedel	39ab9555c2	iommu: Add sysfs bindings for struct iommu_device There is currently support for iommu sysfs bindings, but those need to be implemented in the IOMMU drivers. Add a more generic version of this by adding a struct device to struct iommu_device and use that for the sysfs bindings. Also convert the AMD and Intel IOMMU driver to make use of it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-10 13:44:57 +01:00
Joerg Roedel	b0119e8708	iommu: Introduce new 'struct iommu_device' This struct represents one hardware iommu in the iommu core code. For now it only has the iommu-ops associated with it, but that will be extended soon. The register/unregister interface is also added, as well as making use of it in the Intel and AMD IOMMU drivers. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-02-10 13:44:57 +01:00
David Dillow	f7116e115a	iommu/vt-d: Don't over-free page table directories dma_pte_free_level() recurses down the IOMMU page tables and frees directory pages that are entirely contained in the given PFN range. Unfortunately, it incorrectly calculates the starting address covered by the PTE under consideration, which can lead to it clearing an entry that is still in use. This occurs if we have a scatterlist with an entry that has a length greater than 1026 MB and is aligned to 2 MB for both the IOMMU and physical addresses. For example, if __domain_mapping() is asked to map a two-entry scatterlist with 2 MB and 1028 MB segments to PFN 0xffff80000, it will ask if dma_pte_free_pagetable() is asked to PFNs from 0xffff80200 to 0xffffc05ff, it will also incorrectly clear the PFNs from 0xffff80000 to 0xffff801ff because of this issue. The current code will set level_pfn to 0xffff80200, and 0xffff80200-0xffffc01ff fits inside the range being cleared. Properly setting the level_pfn for the current level under consideration catches that this PTE is outside of the range being cleared. This patch also changes the value passed into dma_pte_free_level() when it recurses. This only affects the first PTE of the range being cleared, and is handled by the existing code that ensures we start our cursor no lower than start_pfn. This was found when using dma_map_sg() to map large chunks of contiguous memory, which immediatedly led to faults on the first access of the erroneously-deleted mappings. Fixes: `3269ee0bd6` ("intel-iommu: Fix leaks in pagetable freeing") Reviewed-by: Benjamin Serebrin <serebrin@google.com> Signed-off-by: David Dillow <dillow@google.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-31 12:50:05 +01:00
Ashok Raj	21e722c4c8	iommu/vt-d: Tylersburg isoch identity map check is done too late. The check to set identity map for tylersburg is done too late. It needs to be done before the check for identity_map domain is done. To: Joerg Roedel <joro@8bytes.org> To: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Cc: Ashok Raj <ashok.raj@intel.com> Fixes: `86080ccc22` ("iommu/vt-d: Allocate si_domain in init_dmars()") Signed-off-by: Ashok Raj <ashok.raj@intel.com> Reported-by: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-31 12:50:05 +01:00
Eric Auger	0659b8dc45	iommu/vt-d: Implement reserved region get/put callbacks This patch registers the [FEE0_0000h - FEF0_000h] 1MB MSI range as a reserved region and RMRR regions as direct regions. This will allow to report those reserved regions in the iommu-group sysfs. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-01-23 11:48:17 +00:00
Jacob Pan	65ca7f5f7d	iommu/vt-d: Fix pasid table size encoding Different encodings are used to represent supported PASID bits and number of PASID table entries. The current code assigns ecap_pss directly to extended context table entry PTS which is wrong and could result in writing non-zero bits to the reserved fields. IOMMU fault reason 11 will be reported when reserved bits are nonzero. This patch converts ecap_pss to extend context entry pts encoding based on VT-d spec. Chapter 9.4 as follows: - number of PASID bits = ecap_pss + 1 - number of PASID table entries = 2^(pts + 5) Software assigned limit of pasid_max value is also respected to match the allocation limitation of PASID table. cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Tested-by: Mika Kuoppala <mika.kuoppala@intel.com> Fixes: `2f26e0a9c9` ('iommu/vt-d: Add basic SVM PASID support') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-04 15:18:57 +01:00
Xunlei Pang	aec0e86172	iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped We met the DMAR fault both on hpsa P420i and P421 SmartArray controllers under kdump, it can be steadily reproduced on several different machines, the dmesg log is like: HP HPSA Driver (v 3.4.16-0) hpsa 0000:02:00.0: using doorbell to reset controller hpsa 0000:02:00.0: board ready after hard reset. hpsa 0000:02:00.0: Waiting for controller to respond to no-op DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff] DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff] DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff] DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set hpsa 0000:02:00.0: controller message 03:00 timed out hpsa 0000:02:00.0: no-op failed; re-trying After some debugging, we found that the fault addr is from DMA initiated at the driver probe stage after reset(not in-flight DMA), and the corresponding pte entry value is correct, the fault is likely due to the old iommu caches of the in-flight DMA before it. Thus we need to flush the old cache after context mapping is setup for the device, where the device is supposed to finish reset at its driver probe stage and no in-flight DMA exists hereafter. I'm not sure if the hardware is responsible for invalidating all the related caches allocated in the iommu hardware before, but seems not the case for hpsa, actually many device drivers have problems in properly resetting the hardware. Anyway flushing (again) by software in kdump kernel when the device gets context mapped which is a quite infrequent operation does little harm. With this patch, the problematic machine can survive the kdump tests. CC: Myron Stowe <myron.stowe@gmail.com> CC: Joseph Szczypek <jszczype@redhat.com> CC: Don Brace <don.brace@microsemi.com> CC: Baoquan He <bhe@redhat.com> CC: Dave Young <dyoung@redhat.com> Fixes: `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") Fixes: `dbcd861f25` ("iommu/vt-d: Do not re-use domain-ids from the old kernel") Fixes: `cf484d0e69` ("iommu/vt-d: Mark copied context entries") Signed-off-by: Xunlei Pang <xlpang@redhat.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-01-04 15:14:04 +01:00
Linus Torvalds	e71c3978d6	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...	2016-12-12 19:25:04 -08:00
Anna-Maria Gleixner	21647615db	iommu/vt-d: Convert to hotplug state machine Install the callbacks via the state machine. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: rt@linutronix.de Cc: David Woodhouse <dwmw2@infradead.org> Link: http://lkml.kernel.org/r/20161126231350.10321-14-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-12-02 00:52:37 +01:00
Linus Torvalds	105ecadc6d	Merge git://git.infradead.org/intel-iommu Pull IOMMU fixes from David Woodhouse: "Two minor fixes. The first fixes the assignment of SR-IOV virtual functions to the correct IOMMU unit, and the second fixes the excessively large (and physically contiguous) PASID tables used with SVM" * git://git.infradead.org/intel-iommu: iommu/vt-d: Fix PASID table allocation iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions	2016-11-27 08:24:46 -08:00
Joerg Roedel	bea64033dd	iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path It turns out that the disable_dmar_iommu() code-path tried to get the device_domain_lock recursivly, which will dead-lock when this code runs on dmar removal. Fix both code-paths that could lead to the dead-lock. Fixes: `55d940430a` ('iommu/vt-d: Get rid of domain->iommu_lock') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-11-08 15:08:26 +01:00
Ashok Raj	1c387188c6	iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions The VT-d specification (§8.3.3) says: ‘Virtual Functions’ of a ‘Physical Function’ are under the scope of the same remapping unit as the ‘Physical Function’. The BIOS is not required to list all the possible VFs in the scope tables, and arguably shouldn't make any attempt to do so, since there could be a huge number of them. This has been broken basically for ever — the VF is never going to match against a specific unit's scope, so it ends up being assigned to the INCLUDE_ALL IOMMU. Which was always actually correct by coincidence, but now we're looking at Root-Complex integrated devices with SR-IOV support it's going to start being wrong. Fix it to simply use pci_physfn() before doing the lookup for PCI devices. Cc: stable@vger.kernel.org Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2016-10-30 05:32:51 -06:00
Joerg Roedel	1c5ebba95b	iommu/vt-d: Make sure RMRRs are mapped before domain goes public When a domain is allocated through the get_valid_domain_for_dev path, it will be context-mapped before the RMRR regions are mapped in the page-table. This opens a short time window where device-accesses to these regions fail and causing DMAR faults. Fix this by mapping the RMRR regions before the domain is context-mapped. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-09-05 13:00:28 +02:00
Joerg Roedel	76208356a0	iommu/vt-d: Split up get_domain_for_dev function Split out the search for an already existing domain and the context mapping of the device to the new domain. This allows to map possible RMRR regions into the domain before it is context mapped. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-09-05 13:00:28 +02:00
Krzysztof Kozlowski	00085f1efa	dma-mapping: use unsigned long for dma_attrs The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-04 08:50:07 -04:00
Linus Torvalds	dd9671172a	IOMMU Updates for Linux v4.8 In the updates: * Big endian support and preparation for defered probing for the Exynos IOMMU driver * Simplifications in iommu-group id handling * Support for Mediatek generation one IOMMU hardware * Conversion of the AMD IOMMU driver to use the generic IOVA allocator. This driver now also benefits from the recent scalability improvements in the IOVA code. * Preparations to use generic DMA mapping code in the Rockchip IOMMU driver * Device tree adaption and conversion to use generic page-table code for the MSM IOMMU driver * An iova_to_phys optimization in the ARM-SMMU driver to greatly improve page-table teardown performance with VFIO * Various other small fixes and conversions -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJXl3e+AAoJECvwRC2XARrjMIgP/1Mm9qIfcaAxKY4ByqbVfrH8 313PO6rpwUhhywUmnf/1F/x+JbuLv8MmRXfSc106mdB1rq9NXpkORYKrqVxs0cSq 6u6TzZWbF6WN1ipqXxDITNFBSy7u97K1VuFaKyYFfLbg8xrkcdkMZJ7BqM2xIEdk rnRKcfHo6wsmCXJ6InsUPmKAqU6AfMewZTGjO+v77Gce0rZEbsJ8n7BRKC9vO2bc akvN2W+zzEUSyhbuyYQBG+agpmC5GJvz4u+6QvAP5sxTWfAsnwAoPpP4xxR+/KjT eicHlja4v0YK6Hr4AJaMxoKfKIrCdqpWm0D2tg/edyWZCeg98AW/w7/s0I8OD3ao Otj6IqC8nPk0pYciOeEPQ7aqPbvKAqU2FYWt7lWamrdr98u2R3p2nXGl0KthoAj6 JqzrCZXvBS7sj1IPLlGpj939yvbKbjpE0p7y1qhI1VEBXoBWFNvlKydkYx76BTGK F6paGVqn2Zwy00AqAsylTEkvIK063zwShZ6nPqz4bMdVlgzjrjCzdDecjfbHr8Ic 6D2oCwyF+RJ8qw+Ecm9EmWFik80sgb+iUTeeYEXNf+YzLYt5McIj7fi3N+sUPel3 YJ4S4x0sIpgUZZ1i+rOo8ZPAFHRU6SRPYV+ewaeYKrMt+Un5dTn9SddpqrJdbiUu YrF36BaQjc123IRGKrSd =xiS2 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - big-endian support and preparation for defered probing for the Exynos IOMMU driver - simplifications in iommu-group id handling - support for Mediatek generation one IOMMU hardware - conversion of the AMD IOMMU driver to use the generic IOVA allocator. This driver now also benefits from the recent scalability improvements in the IOVA code. - preparations to use generic DMA mapping code in the Rockchip IOMMU driver - device tree adaption and conversion to use generic page-table code for the MSM IOMMU driver - an iova_to_phys optimization in the ARM-SMMU driver to greatly improve page-table teardown performance with VFIO - various other small fixes and conversions * tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits) iommu/amd: Initialize dma-ops domains with 3-level page-table iommu/amd: Update Alias-DTE in update_device_table() iommu/vt-d: Return error code in domain_context_mapping_one() iommu/amd: Use container_of to get dma_ops_domain iommu/amd: Flush iova queue before releasing dma_ops_domain iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back iommu/amd: Use dev_data->domain in get_domain() iommu/amd: Optimize map_sg and unmap_sg iommu/amd: Introduce dir2prot() helper iommu/amd: Implement timeout to flush unmap queues iommu/amd: Implement flush queue iommu/amd: Allow NULL pointer parameter for domain_flush_complete() iommu/amd: Set up data structures for flush queue iommu/amd: Remove align-parameter from __map_single() iommu/amd: Remove other remains of old address allocator iommu/amd: Make use of the generic IOVA allocator iommu/amd: Remove special mapping code for dma_ops path iommu/amd: Pass gfp-flags to iommu_map_page() iommu/amd: Implement apply_dm_region call-back iommu/amd: Create a list of reserved iova addresses ...	2016-08-01 07:25:10 -04:00
Linus Torvalds	194dc870a5	Add braces to avoid "ambiguous ‘else’" compiler warnings Some of our "for_each_xyz()" macro constructs make gcc unhappy about lack of braces around if-statements inside or outside the loop, because the loop construct itself has a "if-then-else" statement inside of it. The resulting warnings look something like this: drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’: drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses] if (ctx != dev_priv->kernel_context) ^ even if the code itself is fine. Since the warning is fairly easy to avoid by adding a braces around the if-statement near the for_each_xyz() construct, do so, rather than disabling the otherwise potentially useful warning. (The if-then-else statements used in the "for_each_xyz()" constructs are designed to be inherently safe even with no braces, but in this case it's quite understandable that gcc isn't really able to tell that). This finally leaves the standard "allmodconfig" build with just a handful of remaining warnings, so new and valid warnings hopefully will stand out. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-27 20:03:31 -07:00
Joerg Roedel	f360d3241f	Merge branches 'x86/amd', 'x86/vt-d', 'arm/exynos', 'arm/mediatek', 'arm/msm', 'arm/rockchip', 'arm/smmu' and 'core' into next	2016-07-26 16:02:37 +02:00
Wei Yang	5c365d18a7	iommu/vt-d: Return error code in domain_context_mapping_one() In 'commit <55d940430ab9> ("iommu/vt-d: Get rid of domain->iommu_lock")', the error handling path is changed a little, which makes the function always return 0. This path fixes this. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Fixes: `55d940430a` ('iommu/vt-d: Get rid of domain->iommu_lock') Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-07-14 10:26:30 +02:00
Aaron Campbell	0caa7616a6	iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum number of possible domains is 64K; indeed this is the maximum value that the cap_ndoms() macro will expand to. Since the value 65536 will not fix in a u16, the 'did' variable must be promoted to an int, otherwise the test for < 65536 will always be true and the loop will never end. The symptom, in my case, was a hung machine during suspend. Fixes: `3bd4f9112f` ("iommu/vt-d: Fix overflow of iommu->domains array") Signed-off-by: Aaron Campbell <aaron@monkey.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-07-04 13:34:52 +02:00
Jan Niehusmann	3bd4f9112f	iommu/vt-d: Fix overflow of iommu->domains array The valid range of 'did' in get_iommu_domain(*iommu, did) is 0..cap_ndoms(iommu->cap), so don't exceed that range in free_all_cpu_cached_iovas(). The user-visible impact of the out-of-bounds access is the machine hanging on suspend-to-ram. It is, in fact, a kernel panic, but due to already suspended devices, that's often not visible to the user. Fixes: `22e2f9fa63` ("iommu/vt-d: Use per-cpu IOVA caching") Signed-off-by: Jan Niehusmann <jan@gondor.com> Tested-By: Marius Vlad <marius.c.vlad@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-06-27 13:21:37 +02:00
Joerg Roedel	a4c34ff1c0	iommu/vt-d: Enable QI on all IOMMUs before setting root entry This seems to be required on some X58 chipsets on systems with more than one IOMMU. QI does not work until it is enabled on all IOMMUs in the system. Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com> Fixes: `5f0a7f7614` ('iommu/vt-d: Make root entry visible for hardware right after allocation') Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-06-17 11:29:48 +02:00
Wei Yang	86f004c77c	iommu/vt-d: Reduce extra first level entry in iommu->domains In commit <8bf478163e69> ("iommu/vt-d: Split up iommu->domains array"), it it splits iommu->domains in two levels. Each first level contains 256 entries of second level. In case of the ndomains is exact a multiple of 256, it would have one more extra first level entry for current implementation. This patch refines this calculation to reduce the extra first level entry. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-06-15 13:36:58 +02:00
Linus Torvalds	2566278551	Merge git://git.infradead.org/intel-iommu Pull intel IOMMU updates from David Woodhouse: "This patchset improves the scalability of the Intel IOMMU code by resolving two spinlock bottlenecks and eliminating the linearity of the IOVA allocator, yielding up to ~5x performance improvement and approaching 'iommu=off' performance" * git://git.infradead.org/intel-iommu: iommu/vt-d: Use per-cpu IOVA caching iommu/iova: introduce per-cpu caching to iova allocation iommu/vt-d: change intel-iommu to use IOVA frame numbers iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs iommu/vt-d: only unmap mapped entries iommu/vt-d: correct flush_unmaps pfn usage iommu/vt-d: per-cpu deferred invalidation queues iommu/vt-d: refactoring of deferred flush entries	2016-05-27 13:49:24 -07:00
Joerg Roedel	6c0b43df74	Merge branches 'arm/io-pgtable', 'arm/rockchip', 'arm/omap', 'x86/vt-d', 'ppc/pamu', 'core' and 'x86/amd' into next	2016-05-09 19:39:17 +02:00
Omer Peleg	22e2f9fa63	iommu/vt-d: Use per-cpu IOVA caching Commit `9257b4a2` ('iommu/iova: introduce per-cpu caching to iova allocation') introduced per-CPU IOVA caches to massively improve scalability. Use them. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> [dwmw2: split out VT-d part into a separate patch] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:44:48 -04:00
Omer Peleg	2aac630429	iommu/vt-d: change intel-iommu to use IOVA frame numbers Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of pointers to "struct iova". This avoids using the iova struct from the IOVA red-black tree and the resulting explicit find_iova() on unmap. This patch will allow us to cache IOVAs in the next patch, in order to avoid rbtree operations for the majority of map/unmap operations. Note: In eliminating the find_iova() operation, we have also eliminated the sanity check previously done in the unmap flow. Arguably, this was overhead that is better avoided in production code, but it could be brought back as a debug option for driver development. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:07:22 -04:00
Omer Peleg	0824c5920b	iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb() for domains with no dev iotlb devices. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [gvdl@google.com: fixed locking issues] Signed-off-by: Godfrey van der Linden <gvdl@google.com> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:06:15 -04:00
Omer Peleg	769530e4ba	iommu/vt-d: only unmap mapped entries Current unmap implementation unmaps the entire area covered by the IOVA range, which is a power-of-2 aligned region. The corresponding map, however, only maps those pages originally mapped by the user. This discrepancy can lead to unmapping of already unmapped entries, which is unneeded work. With this patch, only mapped pages are unmapped. This is also a baseline for a map/unmap implementation based on IOVAs and not iova structures, which will allow caching. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:06:01 -04:00
Omer Peleg	f5c0c08b1e	iommu/vt-d: correct flush_unmaps pfn usage Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi() dma addresses. (x86_64 mm and dma have the same size for pages at the moment, but this usage improves consistency.) Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:05:56 -04:00
Omer Peleg	aa4732406e	iommu/vt-d: per-cpu deferred invalidation queues The IOMMU's IOTLB invalidation is a costly process. When iommu mode is not set to "strict", it is done asynchronously. Current code amortizes the cost of invalidating IOTLB entries by batching all the invalidations in the system and performing a single global invalidation instead. The code queues pending invalidations in a global queue that is accessed under the global "async_umap_flush_lock" spinlock, which can result is significant spinlock contention. This patch splits this deferred queue into multiple per-cpu deferred queues, and thus gets rid of the "async_umap_flush_lock" and its contention. To keep existing deferred invalidation behavior, it still invalidates the pending invalidations of all CPUs whenever a CPU reaches its watermark or a timeout occurs. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:05:24 -04:00
Omer Peleg	314f1dc140	iommu/vt-d: refactoring of deferred flush entries Currently, deferred flushes' info is striped between several lists in the flush tables. Instead, move all information about a specific flush to a single entry in this table. This patch does not introduce any functional change. Signed-off-by: Omer Peleg <omer@cs.technion.ac.il> [mad@cs.technion.ac.il: rebased and reworded the commit message] Signed-off-by: Adam Morrison <mad@cs.technion.ac.il> Reviewed-by: Shaohua Li <shli@fb.com> Reviewed-by: Ben Serebrin <serebrin@google.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2016-04-20 15:05:20 -04:00
Dan Carpenter	0b74ecdfbe	iommu/vt-d: Silence an uninitialized variable warning My static checker complains that "dma_alias" is uninitialized unless we are dealing with a pci device. This is true but harmless. Anyway, we can flip the condition around to silence the warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-04-07 14:51:47 +02:00
Michael S. Tsirkin	3d1a2442d2	x86/vt-d: Fix comment for dma_pte_free_pagetable() dma_pte_free_pagetable no longer depends on last level ptes being clear, it clears them itself. Fix up the comment to match. Cc: Jiang Liu <jiang.liu@linux.intel.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-04-05 17:00:37 +02:00
Joerg Roedel	e6a8c9b337	iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path In the PCI hotplug path of the Intel IOMMU driver, replace the usage of the BUS_NOTIFY_DEL_DEVICE notifier, which is executed before the driver is unbound from the device, with BUS_NOTIFY_REMOVED_DEVICE, which runs after that. This fixes a kernel BUG being triggered in the VT-d code when the device driver tries to unmap DMA buffers and the VT-d driver already destroyed all mappings. Reported-by: Stefani Seibold <stefani@seibold.net> Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-02-29 23:55:16 +01:00
Jeremy McNicoll	da972fb13b	iommu/vt-d: Don't skip PCI devices when disabling IOTLB Fix a simple typo when disabling IOTLB on PCI(e) devices. Fixes: `b16d0cb9e2` ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS") Cc: stable@vger.kernel.org # v4.4 Signed-off-by: Jeremy McNicoll <jmcnicol@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2016-01-29 12:18:13 +01:00
Dan Williams	3e6110fd54	Revert "scatterlist: use sg_phys()" commit `db0fa0cb01` "scatterlist: use sg_phys()" did replacements of the form: phys_addr_t phys = page_to_phys(sg_page(s)); phys_addr_t phys = sg_phys(s) & PAGE_MASK; However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5. Cc: <stable@vger.kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: `db0fa0cb01` ("scatterlist: use sg_phys()") Suggested-by: Joerg Roedel <joro@8bytes.org> Reported-by: Vitaly Lavrov <vel21ripn@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-12-15 12:54:06 -08:00
Mel Gorman	d0164adc89	mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-06 17:50:42 -08:00
Linus Torvalds	39cf7c3981	IOMMU Updates for Linux v4.4 This time including: * A new IOMMU driver for s390 pci devices * Common dma-ops support based on iommu-api for ARM64. The plan is to use this as a basis for ARM32 and hopefully other architectures as well in the future. * MSI support for ARM-SMMUv3 * Cleanups and dead code removal in the AMD IOMMU driver * Better RMRR handling for the Intel VT-d driver * Various other cleanups and small fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJWOz7hAAoJECvwRC2XARrjbvYQALwtITTA5iTm0y/ApwNMxI7n pZpjZVPoBPNsGBc4t/MT8pVhUSdmpBOljbV4Y4CayL1mSSB6Bl2gooZjd66m7Z81 qMJYEVWhFQqVsIKkCSNOgaO7W5y+xt3rTgqN6vCu86/CCDfKrTPP/+CRl1T/z9bo 1J8ioM3KnZG9KzG8JuXYFg5wwbKToaBh6swSmj+O4U9hru7zV/ILP7ikcc9pyMji 12WbzCqchRORsJZD65xMRYAqRaPNN/3IlDejs00TOFhY3qpWgEgFUucyeRJBJ/+q K4U8T5vZsnr1a04l7/BeYbLmP7y/9Qv0N0xMGtTyoy/w/BieGqRWu4hHhqf/44NO EhCSXcEThMNCGTjP2VWC4dnQ/s7Y8OmSW9nCreUcFVxHoE5LfDoh8RngA2fpeNuS ixb3OwP+YXHN9Ck+1BQqQCeBznsPTLuDxlhRjCJsWntIfMSkXebOkz83YxyZ9b0Q gFvptfuknU7cotUwWa3dg8RiUB8kNlKJyEEByaVpWEbEOabnONKEMkstvuBx6Ots kA63wbe7QcPgbUYuq7g0nijDw6E2aEtf0nx2Xx4ZDL932qjg/xUkiBpmbDXHw4Gu nimNXVQtbCzF74SyTvxEtupiijOTm5eHtoKtg0mYnqPZ+V9eOwEvW8IHaFFf8XHD SecikoTtH1Q4RVtqOcAQ =jLlB -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "This time including: - A new IOMMU driver for s390 pci devices - Common dma-ops support based on iommu-api for ARM64. The plan is to use this as a basis for ARM32 and hopefully other architectures as well in the future. - MSI support for ARM-SMMUv3 - Cleanups and dead code removal in the AMD IOMMU driver - Better RMRR handling for the Intel VT-d driver - Various other cleanups and small fixes" * tag 'iommu-updates-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits) iommu/vt-d: Fix return value check of parse_ioapics_under_ir() iommu/vt-d: Propagate error-value from ir_parse_ioapic_hpet_scope() iommu/vt-d: Adjust the return value of the parse_ioapics_under_ir iommu: Move default domain allocation to iommu_group_get_for_dev() iommu: Remove is_pci_dev() fall-back from iommu_group_get_for_dev iommu/arm-smmu: Switch to device_group call-back iommu/fsl: Convert to device_group call-back iommu: Add device_group call-back to x86 iommu drivers iommu: Add generic_device_group() function iommu: Export and rename iommu_group_get_for_pci_dev() iommu: Revive device_group iommu-ops call-back iommu/amd: Remove find_last_devid_on_pci() iommu/amd: Remove first/last_device handling iommu/amd: Initialize amd_iommu_last_bdf for DEV_ALL iommu/amd: Cleanup buffer allocation iommu/amd: Remove cmd_buf_size and evt_buf_size from struct amd_iommu iommu/amd: Align DTE flag definitions iommu/amd: Remove old alias handling code iommu/amd: Set alias DTE in do_attach/do_detach iommu/amd: WARN when __[attach\|detach]_device are called with irqs enabled ...	2015-11-05 16:12:10 -08:00
Linus Torvalds	ab1228e42e	Merge git://git.infradead.org/intel-iommu Pull intel iommu updates from David Woodhouse: "This adds "Shared Virtual Memory" (aka PASID support) for the Intel IOMMU. This allows devices to do DMA using process address space, translated through the normal CPU page tables for the relevant mm. With corresponding support added to the i915 driver, this has been tested with the graphics device on Skylake. We don't have the required TLP support in our PCIe root ports for supporting discrete devices yet, so it's only integrated devices that can do it so far" * git://git.infradead.org/intel-iommu: (23 commits) iommu/vt-d: Fix rwxp flags in SVM device fault callback iommu/vt-d: Expose struct svm_dev_ops without CONFIG_INTEL_IOMMU_SVM iommu/vt-d: Clean up pasid_enabled() and ecs_enabled() dependencies iommu/vt-d: Handle Caching Mode implementations of SVM iommu/vt-d: Fix SVM IOTLB flush handling iommu/vt-d: Use dev_err(..) in intel_svm_device_to_iommu(..) iommu/vt-d: fix a loop in prq_event_thread() iommu/vt-d: Fix IOTLB flushing for global pages iommu/vt-d: Fix address shifting in page request handler iommu/vt-d: shift wrapping bug in prq_event_thread() iommu/vt-d: Fix NULL pointer dereference in page request error case iommu/vt-d: Implement SVM_FLAG_SUPERVISOR_MODE for kernel access iommu/vt-d: Implement SVM_FLAG_PRIVATE_PASID to allocate unique PASIDs iommu/vt-d: Add callback to device driver on page faults iommu/vt-d: Implement page request handling iommu/vt-d: Generalise DMAR MSI setup to allow for page request events iommu/vt-d: Implement deferred invalidate for SVM iommu/vt-d: Add basic SVM PASID support iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS iommu/vt-d: Add initial support for PASID tables ...	2015-11-05 16:06:52 -08:00
Joerg Roedel	b67ad2f7c7	Merge branches 'x86/vt-d', 'arm/omap', 'arm/smmu', 's390', 'core' and 'x86/amd' into next Conflicts: drivers/iommu/amd_iommu_types.h	2015-11-02 20:03:34 +09:00
David Woodhouse	d42fde7084	iommu/vt-d: Clean up pasid_enabled() and ecs_enabled() dependencies When booted with intel_iommu=ecs_off we were still allocating the PASID tables even though we couldn't actually use them. We really want to make the pasid_enabled() macro depend on ecs_enabled(). Which is unfortunate, because currently they're the other way round to cope with the Broadwell/Skylake problems with ECS. Instead of having ecs_enabled() depend on pasid_enabled(), which was never something that made me happy anyway, make it depend in the normal case on the "broken PASID" bit 28 not being set. Then pasid_enabled() can depend on ecs_enabled() as it should. And we also don't need to mess with it if we ever see an implementation that has some features requiring ECS (like PRI) but which doesn't have PASID support. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-24 21:33:01 +02:00
Joerg Roedel	a960fadbe6	iommu: Add device_group call-back to x86 iommu drivers Set the device_group call-back to pci_device_group() for the Intel VT-d and the AMD IOMMU driver. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-10-22 00:00:49 +02:00
Linus Torvalds	8a70dd2669	Merge tag 'for-linus-20151021' of git://git.infradead.org/intel-iommu Pull intel-iommu bugfix from David Woodhouse: "This contains a single fix, for when the IOMMU API is used to overlay an existing mapping comprised of 4KiB pages, with a mapping that can use superpages. For the first superpage in the new mapping, we were correctly¹ freeing the old bottom-level page table page and clearing the link to it, before installing the superpage. For subsequent superpages, however, we weren't. This causes a memory leak, and a warning about setting a PTE which is already set. ¹ Well, not entirely correctly. We just free the page table pages right there and then, which is wrong. In fact they should only be freed after the IOTLB is flushed so we know the hardware will no longer be looking at them.... and in fact I note that the IOTLB flush is completely missing from the intel_iommu_map() code path, although it needs to be there if it's permitted to overwrite existing mappings. Fixing those is somewhat more intrusive though, and will probably need to wait for 4.4 at this point" * tag 'for-linus-20151021' of git://git.infradead.org/intel-iommu: iommu/vt-d: fix range computation when making room for large pages	2015-10-22 06:32:48 +09:00
Sudeep Dutt	b9997e385e	iommu/vt-d: Use dev_err(..) in intel_svm_device_to_iommu(..) This will give a little bit of assistance to those developing drivers using SVM. It might cause a slight annoyance to end-users whose kernel disables the IOMMU when drivers are trying to use it. But the fix there is to fix the kernel to enable the IOMMU. Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-19 15:03:00 +01:00
David Woodhouse	a222a7f0bb	iommu/vt-d: Implement page request handling Largely based on the driver-mode implementation by Jesse Barnes. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 15:35:19 +01:00
David Woodhouse	907fea3491	iommu/vt-d: Implement deferred invalidate for SVM Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 13:22:35 +01:00
David Woodhouse	2f26e0a9c9	iommu/vt-d: Add basic SVM PASID support This provides basic PASID support for endpoint devices, tested with a version of the i915 driver. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 12:55:45 +01:00
David Woodhouse	b16d0cb9e2	iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS The behaviour if you enable PASID support after ATS is undefined. So we have to enable it first, even if we don't know whether we'll need it. This is safe enough; unless we set up a context that permits it, the device can't actually do anything with it. Also shift the feature detction to dmar_insert_one_dev_info() as it only needs to happen once. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 12:05:39 +01:00
David Woodhouse	8a94ade4ce	iommu/vt-d: Add initial support for PASID tables Add CONFIG_INTEL_IOMMU_SVM, and allocate PASID tables on supported hardware. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 11:24:51 +01:00
David Woodhouse	ae853ddb9a	iommu/vt-d: Introduce intel_iommu=pasid28, and pasid_enabled() macro As long as we use an identity mapping to work around the worst of the hardware bugs which caused us to defeature it and change the definition of the capability bit, we can use PASID support on the devices which advertised it in bit 28 of the Extended Capability Register. Allow people to do so with 'intel_iommu=pasid28' on the command line. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 11:24:45 +01:00
David Woodhouse	d14053b3c7	iommu/vt-d: Fix ATSR handling for Root-Complex integrated endpoints The VT-d specification says that "Software must enable ATS on endpoint devices behind a Root Port only if the Root Port is reported as supporting ATS transactions." We walk up the tree to find a Root Port, but for integrated devices we don't find one — we get to the host bridge. In that case we should allow ATS. Currently we don't, which means that we are incorrectly failing to use ATS for the integrated graphics. Fix that. We should never break out of this loop "naturally" with bus==NULL, since we'll always find bridge==NULL in that case (and now return 1). So remove the check for (!bridge) after the loop, since it can never happen. If it did, it would be worthy of a BUG_ON(!bridge). But since it'll oops anyway in that case, that'll do just as well. Cc: stable@vger.kernel.org Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-15 09:28:56 +01:00
Dan Williams	dfddb969ed	iommu/vt-d: Switch from ioremap_cache to memremap In preparation for deprecating ioremap_cache() convert its usage in intel-iommu to memremap. This also eliminates the mishandling of the __iomem annotation in the implementation. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-10-14 15:22:06 +02:00
Christian Zander	ba2374fd2b	iommu/vt-d: fix range computation when making room for large pages In preparation for the installation of a large page, any small page tables that may still exist in the target IOV address range are removed. However, if a scatter/gather list entry is large enough to fit more than one large page, the address space for any subsequent large pages is not cleared of conflicting small page tables. This can cause legitimate mapping requests to fail with errors of the form below, potentially followed by a series of IOMMU faults: ERROR: DMA PTE for vPFN 0xfde00 already set (to 7f83a4003 not 7e9e00083) In this example, a 4MiB scatter/gather list entry resulted in the successful installation of a large page @ vPFN 0xfdc00, followed by a failed attempt to install another large page @ vPFN 0xfde00, due to the presence of a pointer to a small page table @ 0x7f83a4000. To address this problem, compute the number of large pages that fit into a given scatter/gather list entry, and use it to derive the last vPFN covered by the large page(s). Cc: stable@vger.kernel.org Signed-off-by: Christian Zander <christian@nervanasys.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-10-13 20:32:50 +01:00
Linus Torvalds	7554225312	IOMMU Fixes for Linux v4.3-rc5 A few fixes piled up: * Fix for a suspend/resume issue where PCI probing code overwrote dev->irq for the MSI irq of the AMD IOMMU. * Fix for a kernel crash when a 32 bit PCI device was assigned to a KVM guest. * Fix for a possible memory leak in the VT-d driver * A couple of fixes for the ARM-SMMU driver -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJWHNdbAAoJECvwRC2XARrjB/YQAKouJaRMjBaehx6kbaZMhMJy hXDsh8Xl6TtCe6kLD2uXrvjLZAdu32kjrtzhhcM21EO5Ms2Weq6A60/98LwnJ4Eg AqftjfxQsIwf2G1PvHb+xepgcFxIAhW6a3nORzx6d2AGrNWmMtUhbLTSncYjmojf Td4dscuRmRPenJUV1JhcJQBR62QonknIHV99QmevaCSAoUdyuMH+t5kQVEgPjx7C GlMPNEZZmGl7J3NXSWRtDSkUxFZ1OU8MTKc1LmPPHHAOZk37wbePihQbLLySlHPH v4G1R05e2hG7C66yu959fyOleL87lDToUXhwQNFJMqEc+e7IzBzZsB3ANEHjpLQH UJC9COU+sf8mPafja4ge/KbyGDmgDg/OMQJDhU6+DSXUflwymeWJmXr7sLFQex6O nZO/SVzkbKj+PKxV7UnGD0sTeAAk0X6vfhFCL0l/acPpQg0T6Fpky5D5fUMv5dWS xxxvxfwBcDoI44fxWBhfPYvmLFT9f5da+bpbzeeGjVSNezOkPJ65AJcVk5An4kQu PRzJGoq3XpZHOeg5+O7IKzeuJ+3qc7Tz4wAzMxcaNFpVBl2qp1RUkTbmS9/YV1b5 ZOcIFBMLuUROE1ExsU19c5Uo0j1Bvh9jtdy6lNFCagQYzihtA0Jk19ucllx1jIjD sdv2hgDIauRToKF1d9xz =v5G4 -----END PGP SIGNATURE----- Merge tag 'iommu-fixes-v4.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "A few fixes piled up: - Fix for a suspend/resume issue where PCI probing code overwrote dev->irq for the MSI irq of the AMD IOMMU. - Fix for a kernel crash when a 32 bit PCI device was assigned to a KVM guest. - Fix for a possible memory leak in the VT-d driver - A couple of fixes for the ARM-SMMU driver" * tag 'iommu-fixes-v4.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix NULL pointer deref on device detach iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices iommu/vt-d: Fix memory leak in dmar_insert_one_dev_info() iommu/arm-smmu: Use correct address mask for CMD_TLBI_S2_IPA iommu/arm-smmu: Ensure IAS is set correctly for AArch32-capable SMMUs iommu/io-pgtable-arm: Don't use dma_to_phys()	2015-10-13 10:09:59 -07:00
Joerg Roedel	b1ce5b79ae	iommu/vt-d: Create RMRR mappings in newly allocated domains Currently the RMRR entries are created only at boot time. This means they will vanish when the domain allocated at boot time is destroyed. This patch makes sure that also newly allocated domains will get RMRR mappings. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-10-05 17:39:21 +02:00
Joerg Roedel	d66ce54b46	iommu/vt-d: Split iommu_prepare_identity_map Split the part of the function that fetches the domain out and put the rest into into a domain_prepare_identity_map, so that the code can also be used with when the domain is already known. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-10-05 17:38:47 +02:00
Linus Torvalds	8c25ab8b5a	Merge git://git.infradead.org/intel-iommu Pull IOVA fixes from David Woodhouse: "The main fix here is the first one, fixing the over-allocation of size-aligned requests. The other patches simply make the existing IOVA code available to users other than the Intel VT-d driver, with no functional change. I concede the latter really should have been submitted during the merge window, but since it's basically risk-free and people are waiting to build on top of it and it's my fault I didn't get it in, I (and they) would be grateful if you'd take it" * git://git.infradead.org/intel-iommu: iommu: Make the iova library a module iommu: iova: Export symbols iommu: iova: Move iova cache management to the iova library iommu/iova: Avoid over-allocating when size-aligned	2015-10-02 07:59:29 -04:00
Sudip Mukherjee	499f3aa432	iommu/vt-d: Fix memory leak in dmar_insert_one_dev_info() We are returning NULL if we are not able to attach the iommu to the domain but while returning we missed freeing info. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-09-29 15:45:50 +02:00
Linus Torvalds	9a9952bbd7	IOMMU Updates for Linux v4.3 This time the IOMMU updates are mostly cleanups or fixes. No big new features or drivers this time. In particular the changes include: * Bigger cleanup of the Domain<->IOMMU data structures and the code that manages them in the Intel VT-d driver. This makes the code easier to understand and maintain, and also easier to keep the data structures in sync. It is also a preparation step to make use of default domains from the IOMMU core in the Intel VT-d driver. * Fixes for a couple of DMA-API misuses in ARM IOMMU drivers, namely in the ARM and Tegra SMMU drivers. * Fix for a potential buffer overflow in the OMAP iommu driver's debug code * A couple of smaller fixes and cleanups in various drivers * One small new feature: Report domain-id usage in the Intel VT-d driver to easier detect bugs where these are leaked. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJV7sCEAAoJECvwRC2XARrjz3YP/Au4IIfqykfPvmI0cmPhVnAV Q72tltwkbK2u2iP+pHheveaMngJtAshsZrnhBon4KJRIt/KTLZQvsFplHDaRhPfY yw3LIxhC5kLG/S6irY9Ozb0+uTMdQ3BU2uS23pyoFVfCz+RngBrAwDBcTKqZDCDG 8dNd+T21XlzxuyeGr58h9upz2VFtq6feoGFhLU5PNxTlf4JWZe77D7NlbSvx6Nwy 7Ai8dVRgpV9ciUP7w8FXrCUvbMZQDIoTMiWGNSlogVMgA0dllGES91UZYhWf3pil abuX6DeFul/cOhEOnH2xa+j5zz2O/upe9stU4wAFw6IhPiAELTHc2NKlWAhwb0SY bpDRf7dgLnUfqpmZLpWjTwN4jllc0qS2MIHj+eUu0uhdFi4Z0BuH2wSCdbR7xkqk u5u0Jq7hDNKs5FmQTSsWSiAdjakMsRjIN7jMrBbOeZnBSmUnLx74KGPLTb63ncR3 WIOi4Iyu+LSXBIvZDiLu3lIIh7Atzd+y7IDnb8KXdyqfy+h53OZZOJNbP/qTWHgT ZUdm/qrqjIQpTQfleOEadC7vY/y3fR5sBtOQHUamfntni3oYCc4AMRlNdf3eV9lb Tyss6F699mU7d/vennTaIToBgVwaXdLYtmvGWjnoT/kqOMclyDf3cIUtZGtp2rJR ddmzDA3vBUC5pGj8Hd8R =yoGE -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates for from Joerg Roedel: "This time the IOMMU updates are mostly cleanups or fixes. No big new features or drivers this time. In particular the changes include: - Bigger cleanup of the Domain<->IOMMU data structures and the code that manages them in the Intel VT-d driver. This makes the code easier to understand and maintain, and also easier to keep the data structures in sync. It is also a preparation step to make use of default domains from the IOMMU core in the Intel VT-d driver. - Fixes for a couple of DMA-API misuses in ARM IOMMU drivers, namely in the ARM and Tegra SMMU drivers. - Fix for a potential buffer overflow in the OMAP iommu driver's debug code - A couple of smaller fixes and cleanups in various drivers - One small new feature: Report domain-id usage in the Intel VT-d driver to easier detect bugs where these are leaked" * tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (83 commits) iommu/vt-d: Really use upper context table when necessary x86/vt-d: Fix documentation of DRHD iommu/fsl: Really fix init section(s) content iommu/io-pgtable-arm: Unmap and free table when overwriting with block iommu/io-pgtable-arm: Move init-fn declarations to io-pgtable.h iommu/msm: Use BUG_ON instead of if () BUG() iommu/vt-d: Access iomem correctly iommu/vt-d: Make two functions static iommu/vt-d: Use BUG_ON instead of if () BUG() iommu/vt-d: Return false instead of 0 in irq_remapping_cap() iommu/amd: Use BUG_ON instead of if () BUG() iommu/amd: Make a symbol static iommu/amd: Simplify allocation in irq_remapping_alloc() iommu/tegra-smmu: Parameterize number of TLB lines iommu/tegra-smmu: Factor out tegra_smmu_set_pde() iommu/tegra-smmu: Extract tegra_smmu_pte_get_use() iommu/tegra-smmu: Use __GFP_ZERO to allocate zeroed pages iommu/tegra-smmu: Remove PageReserved manipulation iommu/tegra-smmu: Convert to use DMA API iommu/tegra-smmu: smmu_flush_ptc() wants device addresses ...	2015-09-08 17:22:35 -07:00
Linus Torvalds	d975f309a8	Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-block Pull SG updates from Jens Axboe: "This contains a set of scatter-gather related changes/fixes for 4.3: - Add support for limited chaining of sg tables even for architectures that do not set ARCH_HAS_SG_CHAIN. From Christoph. - Add sg chain support to target_rd. From Christoph. - Fixup open coded sg->page_link in crypto/omap-sham. From Christoph. - Fixup open coded crypto ->page_link manipulation. From Dan. - Also from Dan, automated fixup of manual sg_unmark_end() manipulations. - Also from Dan, automated fixup of open coded sg_phys() implementations. - From Robert Jarzmik, addition of an sg table splitting helper that drivers can use" * 'for-4.3/sg' of git://git.kernel.dk/linux-block: lib: scatterlist: add sg splitting function scatterlist: use sg_phys() crypto/omap-sham: remove an open coded access to ->page_link scatterlist: remove open coded sg_unmark_end instances crypto: replace scatterwalk_sg_chain with sg_chain target/rd: always chain S/G list scatterlist: allow limited chaining without ARCH_HAS_SG_CHAIN	2015-09-02 13:22:38 -07:00
Linus Torvalds	26f8b7edc9	PCI changes for the v4.3 merge window: Enumeration Allocate ATS struct during enumeration (Bjorn Helgaas) Embed ATS info directly into struct pci_dev (Bjorn Helgaas) Reduce size of ATS structure elements (Bjorn Helgaas) Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas) iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas) Move MPS configuration check to pci_configure_device() (Bjorn Helgaas) Set MPS to match upstream bridge (Keith Busch) ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri) Add pci_scan_root_bus_msi() (Lorenzo Pieralisi) ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi) Resource management Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi) PCI device hotplug pciehp: Remove unused interrupt events (Bjorn Helgaas) pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas) pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson) pciehp: Simplify pcie_poll_cmd() (Yijing Wang) Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang) Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang) Hold pci_slot_mutex while searching bus->slots list (Yijing Wang) Power management Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui) Virtualization Add ACS quirks for Intel I219-LM/V (Alex Williamson) Restore ACS configuration as part of pci_restore_state() (Alexander Duyck) MSI Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu) Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu) ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi) Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi) Generic host bridge driver Remove dependency on ARM-specific struct hw_pci (Jayachandran C) Build setup-irq.o for arm64 (Jayachandran C) Add arm64 support (Jayachandran C) APM X-Gene host bridge driver Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang) Add support for a 64-bit prefetchable memory window (Duc Dang) Drop owner assignment from platform_driver (Krzysztof Kozlowski) Broadcom iProc host bridge driver Allow BCMA bus driver to be built as module (Hauke Mehrtens) Delete unnecessary checks before phy calls (Markus Elfring) Add arm64 support (Ray Jui) Synopsys DesignWare host bridge driver Don't complain missing config reg space if va_cfg0 is set (Murali Karicheri) TI DRA7xx host bridge driver Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I) Add PM support (Kishon Vijay Abraham I) Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I) Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I) Xilinx AXI host bridge driver Check for MSI interrupt flag before handling as INTx (Russell Joyce) Miscellaneous Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa) Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas) Fix generic NCR 53c810 class code quirk (Bjorn Helgaas) Fix TI816X class code quirk (Bjorn Helgaas) Remove unused "pci_probe" flags (Bjorn Helgaas) Host bridge driver code simplifications (Fabio Estevam) Add dev_flags bit to access VPD through function 0 (Mark Rustad) Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad) Kill off set_irq_flags() usage (Rob Herring) Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar) Clean up pci_find_capability() (Wei Yang) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJV5FE/AAoJEFmIoMA60/r8I2QP/R9b9MrvH2i9tN98/lTDl7g3 czE58ZM1d4kMYtW3Pm/DrYI6y6RprAaB4ZEp5rHxlFLqBPZEQwWodA19NkjECcb6 g5qKWOdIWA4T6Jaab6a/yCmAFa0jni7iAmmTYqca9o3Xj7tFovxDxqPSYkh+rer0 v+1sAr/4HXSiN339KR6teEF3VZqLFp6ewMydQlVS+R7kAOHHYQDqoo9WF6JnIoL5 PO3Kbmr1WN3fZY3s98yLq1x6XmLrLlmGdJI+2r+KewO4r/05CL6wTVP/oTMi+Eti dueseeISlOTcTAUhk87Vap23uJPeB/rJbYoFdCr7+0AkZGe/U/E2dpZm2wyMcCvq OrATuFymgzIuJm5uUPsdH4lzsX97U9BcDccracfC38rYnP5u3bqHCjw8HJzANR7p VYbFBzc5ZCCUYtQAjyrKt2820AvTFo+Bu+z75IsJO8LQQgv/zGtQQ8grIQeAjH+l sAe3xOTwzZnq6Obl4qb/GElHmIGUbQ1X4Dx1mliiijKMKkhYHOA0iFnB/OBILmEZ wHzKU8chWcI9lip0aaX8q9i/qovdVUt2+rdo/N40l7YY66x4jkNgQQXZX+FSKk6H stTvEBQgK28EKCHDxMsgzTGIqllSyk4DnRMA7ij1hRWqdUbGk7wOPTvm9QSwNDWe SokuWzAQD9YeMRGdsYjZ =DX1r -----END PGP SIGNATURE----- Merge tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v4.3 merge window: Enumeration: - Allocate ATS struct during enumeration (Bjorn Helgaas) - Embed ATS info directly into struct pci_dev (Bjorn Helgaas) - Reduce size of ATS structure elements (Bjorn Helgaas) - Stop caching ATS Invalidate Queue Depth (Bjorn Helgaas) - iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth (Bjorn Helgaas) - Move MPS configuration check to pci_configure_device() (Bjorn Helgaas) - Set MPS to match upstream bridge (Keith Busch) - ARM/PCI: Set MPS before pci_bus_add_devices() (Murali Karicheri) - Add pci_scan_root_bus_msi() (Lorenzo Pieralisi) - ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() (Lorenzo Pieralisi) Resource management: - Call pci_read_bridge_bases() from core instead of arch code (Lorenzo Pieralisi) PCI device hotplug: - pciehp: Remove unused interrupt events (Bjorn Helgaas) - pciehp: Remove ignored MRL sensor interrupt events (Bjorn Helgaas) - pciehp: Handle invalid data when reading from non-existent devices (Jarod Wilson) - pciehp: Simplify pcie_poll_cmd() (Yijing Wang) - Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot (Yijing Wang) - Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem (Yijing Wang) - Hold pci_slot_mutex while searching bus->slots list (Yijing Wang) Power management: - Disable async suspend/resume for JMicron multi-function SATA/AHCI (Zhang Rui) Virtualization: - Add ACS quirks for Intel I219-LM/V (Alex Williamson) - Restore ACS configuration as part of pci_restore_state() (Alexander Duyck) MSI: - Add pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) - x86: Implement pcibios_alloc_irq() and pcibios_free_irq() (Jiang Liu) - Add helpers to manage pci_dev->irq and pci_dev->irq_managed (Jiang Liu) - Free legacy IRQ when enabling MSI/MSI-X (Jiang Liu) - ARM/PCI: Remove msi_controller from struct pci_sys_data (Lorenzo Pieralisi) - Remove unused pcibios_msi_controller() hook (Lorenzo Pieralisi) Generic host bridge driver: - Remove dependency on ARM-specific struct hw_pci (Jayachandran C) - Build setup-irq.o for arm64 (Jayachandran C) - Add arm64 support (Jayachandran C) APM X-Gene host bridge driver: - Add APM X-Gene PCIe 64-bit prefetchable window (Duc Dang) - Add support for a 64-bit prefetchable memory window (Duc Dang) - Drop owner assignment from platform_driver (Krzysztof Kozlowski) Broadcom iProc host bridge driver: - Allow BCMA bus driver to be built as module (Hauke Mehrtens) - Delete unnecessary checks before phy calls (Markus Elfring) - Add arm64 support (Ray Jui) Synopsys DesignWare host bridge driver: - Don't complain missing config reg space if va_cfg0 is set (Murali Karicheri) TI DRA7xx host bridge driver: - Disable pm_runtime on get_sync failure (Kishon Vijay Abraham I) - Add PM support (Kishon Vijay Abraham I) - Clear MSE bit during suspend so clocks will idle (Kishon Vijay Abraham I) - Add support to make GPIO drive PERST# line (Kishon Vijay Abraham I) Xilinx AXI host bridge driver: - Check for MSI interrupt flag before handling as INTx (Russell Joyce) Miscellaneous: - Fix Intersil/Techwell TW686[4589] AV capture class code (Krzysztof Hałasa) - Use PCI_CLASS_SERIAL_USB instead of bare number (Bjorn Helgaas) - Fix generic NCR 53c810 class code quirk (Bjorn Helgaas) - Fix TI816X class code quirk (Bjorn Helgaas) - Remove unused "pci_probe" flags (Bjorn Helgaas) - Host bridge driver code simplifications (Fabio Estevam) - Add dev_flags bit to access VPD through function 0 (Mark Rustad) - Add VPD function 0 quirk for Intel Ethernet devices (Mark Rustad) - Kill off set_irq_flags() usage (Rob Herring) - Remove Intel Cherrytrail D3 delays (Srinidhi Kasagar) - Clean up pci_find_capability() (Wei Yang)" * tag 'pci-v4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (72 commits) PCI: Disable async suspend/resume for JMicron multi-function SATA/AHCI PCI: Set MPS to match upstream bridge PCI: Move MPS configuration check to pci_configure_device() PCI: Drop references acquired by of_parse_phandle() PCI/MSI: Remove unused pcibios_msi_controller() hook ARM/PCI: Remove msi_controller from struct pci_sys_data ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi() PCI: Add pci_scan_root_bus_msi() ARM/PCI: Replace panic with WARN messages on failures PCI: generic: Add arm64 support PCI: Build setup-irq.o for arm64 PCI: generic: Remove dependency on ARM-specific struct hw_pci PCI: imx6: Simplify a trivial if-return sequence PCI: spear: Use BUG_ON() instead of condition followed by BUG() PCI: dra7xx: Remove unneeded use of IS_ERR_VALUE() PCI: Remove pci_ats_enabled() PCI: Stop caching ATS Invalidate Queue Depth PCI: Move ATS declarations to linux/pci.h so they're all together PCI: Clean up ATS error handling PCI: Use pci_physfn() rather than looking up physfn by hand ...	2015-08-31 17:14:39 -07:00
Joerg Roedel	4df4eab168	iommu/vt-d: Really use upper context table when necessary There is a bug in iommu_context_addr() which will always use the lower context table, even when the upper context table needs to be used. Fix this issue. Fixes: `03ecc32c52` ("iommu/vt-d: support extended root and context entries") Reported-by: Xiao, Nan <nan.xiao@hp.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-25 11:39:27 +02:00
Dan Williams	db0fa0cb01	scatterlist: use sg_phys() Coccinelle cleanup to replace open coded sg to physical address translations. This is in preparation for introducing scatterlists that reference __pfn_t. // sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg) // usage: make coccicheck COCCI=sg_phys.cocci MODE=patch virtual patch @@ struct scatterlist sg; @@ - page_to_phys(sg_page(sg)) + sg->offset + sg_phys(sg) @@ struct scatterlist sg; @@ - page_to_phys(sg_page(sg)) + sg_phys(sg) & PAGE_MASK Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-08-17 08:13:26 -06:00
Joerg Roedel	543c8dcf1d	iommu/vt-d: Access iomem correctly This fixes wrong accesses to iomem introduced by the kdump fixing code. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-13 19:49:56 +02:00
Joerg Roedel	b690420a40	iommu/vt-d: Make two functions static These functions are only used in that file and can be static. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-13 19:49:51 +02:00
Joerg Roedel	dc02e46e8d	iommu/vt-d: Use BUG_ON instead of if () BUG() Found by a coccicheck script. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-13 19:49:46 +02:00
Joerg Roedel	f303e50766	iommu/vt-d: Avoid duplicate device_domain_info structures When a 'struct device_domain_info' is created as an alias for another device, this struct will not be re-used when the real device is encountered. Fix that to avoid duplicate device_domain_info structures being added. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:37 +02:00
Joerg Roedel	08a7f456a7	iommu/vt-d: Only insert alias dev_info if there is an alias For devices without an PCI alias there will be two device_domain_info structures added. Prevent that by checking if the alias is different from the device. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	127c761598	iommu/vt-d: Pass device_domain_info to __dmar_remove_one_dev_info This struct contains all necessary information for the function already. Also handle the info->dev == NULL case while at it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	2309bd793e	iommu/vt-d: Remove dmar_global_lock from device_notifier The code in the locked section does not touch anything protected by the dmar_global_lock. Remove it from there. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	55d940430a	iommu/vt-d: Get rid of domain->iommu_lock When this lock is held the device_domain_lock is also required to make sure the device_domain_info does not vanish while in use. So this lock can be removed as it gives no additional protection. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	de7e888646	iommu/vt-d: Only call domain_remove_one_dev_info to detach old domain There is no need to make a difference here between VM and non-VM domains, so simplify this code here. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	d160aca527	iommu/vt-d: Unify domain->iommu attach/detachment Move the code to attach/detach domains to iommus and vice verce into a single function to make sure there are no dangling references. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	c6c2cebd66	iommu/vt-d: Establish domain<->iommu link in dmar_insert_one_dev_info This makes domain attachment more synchronous with domain deattachment. The domain<->iommu link is released in dmar_remove_one_dev_info. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	dc534b25d1	iommu/vt-d: Pass an iommu pointer to domain_init() This allows to do domain->iommu attachment after domain_init has run. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:36 +02:00
Joerg Roedel	2452d9db12	iommu/vt-d: Rename iommu_detach_dependent_devices() Rename this function and the ones further down its call-chain to domain_context_clear_*. In particular this means: iommu_detach_dependent_devices -> domain_context_clear iommu_detach_dev_cb -> domain_context_clear_one_cb iommu_detach_dev -> domain_context_clear_one These names match a lot better with its domain_context_mapping counterparts. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:35 +02:00
Joerg Roedel	e6de0f8dfc	iommu/vt-d: Rename domain_remove_one_dev_info() Rename the function to dmar_remove_one_dev_info to match is name better with its dmar_insert_one_dev_info counterpart. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:35 +02:00
Joerg Roedel	5db31569e9	iommu/vt-d: Rename dmar_insert_dev_info() Rename this function to dmar_insert_one_dev_info() to match the name better with its counter part function domain_remove_one_dev_info(). Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:35 +02:00
Joerg Roedel	cc4e2575cc	iommu/vt-d: Move context-mapping into dmar_insert_dev_info Do the context-mapping of devices from a single place in the call-path and clean up the other call-sites. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:35 +02:00
Joerg Roedel	76f45fe35c	iommu/vt-d: Simplify domain_remove_dev_info() Just call domain_remove_one_dev_info() for all devices in the domain instead of reimplementing the functionality. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:35 +02:00
Joerg Roedel	b608ac3b6d	iommu/vt-d: Simplify domain_remove_one_dev_info() Simplify this function as much as possible with the new iommu_refcnt field. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:34 +02:00
Joerg Roedel	42e8c186b5	iommu/vt-d: Simplify io/tlb flushing in intel_iommu_unmap We don't need to do an expensive search for domain-ids anymore, as we keep track of per-iommu domain-ids. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:34 +02:00
Joerg Roedel	29a27719ab	iommu/vt-d: Replace iommu_bmp with a refcount This replaces the dmar_domain->iommu_bmp with a similar reference count array. This allows us to keep track of how many devices behind each iommu are attached to the domain. This is necessary for further simplifications and optimizations to the iommu<->domain attachment code. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:34 +02:00
Joerg Roedel	af1089ce38	iommu/vt-d: Kill dmar_domain->id This field is now obsolete because all places use the per-iommu domain-ids. Kill the remaining uses of this field and remove it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:34 +02:00
Joerg Roedel	0dc7971594	iommu/vt-d: Don't pre-allocate domain ids for si_domain There is no reason for this special handling of the si_domain. The per-iommu domain-id can be allocated on-demand like for any other domain. So remove the pre-allocation code. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:34 +02:00
Joerg Roedel	a1ddcbe930	iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi This function can figure out the domain-id to use itself from the iommu_did array. This is more reliable over different domain types and brings us one step further to remove the domain->id field. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:34 +02:00
Joerg Roedel	de24e55395	iommu/vt-d: Simplify domain_context_mapping_one Get rid of the special cases for VM domains vs. non-VM domains and simplify the code further to just handle the hardware passthrough vs. page-table case. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:33 +02:00
Joerg Roedel	28ccce0d95	iommu/vt-d: Calculate translation in domain_context_mapping_one There is no reason to pass the translation type through multiple layers. It can also be determined in the domain_context_mapping_one function directly. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:33 +02:00
Joerg Roedel	e2411427f7	iommu/vt-d: Get rid of iommu_attach_vm_domain() The special case for VM domains is not needed, as other domains could be attached to the iommu in the same way. So get rid of this special case. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:33 +02:00
Joerg Roedel	8bf478163e	iommu/vt-d: Split up iommu->domains array This array is indexed by the domain-id and contains the pointers to the domains attached to this iommu. Modern systems support 65536 domain ids, so that this array has a size of 512kb, per iommu. This is a huge waste of space, as the array is usually sparsely populated. This patch makes the array two-dimensional and allocates the memory for the domain pointers on-demand. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:33 +02:00
Joerg Roedel	9452d5bfe5	iommu/vt-d: Add access functions for iommu->domains This makes it easier to change the layout of the data structure later. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:33 +02:00
Joerg Roedel	c0e8a6c803	iommu/vt-d: Keep track of per-iommu domain ids Instead of searching in the domain array for already allocated domain ids, keep track of them explicitly. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-12 16:23:32 +02:00
Alex Williamson	2238c0827a	iommu/vt-d: Report domain usage in sysfs Debugging domain ID leakage typically requires long running tests in order to exhaust the domain ID space or kernel instrumentation to track the setting and clearing of bits. A couple trivial intel-iommu specific sysfs extensions make it much easier to expose the IOMMU capabilities and current usage. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:30:57 +02:00
Kees Cook	2439d4aa92	iommu/vt-d: Avoid format string leaks into iommu_device_create This makes sure it won't be possible to accidentally leak format strings into iommu device names. Current name allocations are safe, but this makes the "%s" explicit. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-08-03 16:15:47 +02:00
Sakari Ailus	ae1ff3d623	iommu: iova: Move iova cache management to the iova library This is necessary to separate intel-iommu from the iova library. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-07-28 15:47:58 +01:00
Robin Murphy	8f6429c7cb	iommu/iova: Avoid over-allocating when size-aligned Currently, allocating a size-aligned IOVA region quietly adjusts the actual allocation size in the process, returning a rounded-up power-of-two-sized allocation. This results in mismatched behaviour in the IOMMU driver if the original size was not a power of two, where the original size is mapped, but the rounded-up IOVA size is unmapped. Whilst some IOMMUs will happily unmap already-unmapped pages, others consider this an error, so fix it by computing the necessary alignment padding without altering the actual allocation size. Also clean up by making pad_size unsigned, since its callers always pass unsigned values and negative padding makes little sense here anyway. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-07-28 15:47:56 +01:00
Alex Williamson	46ebb7af7b	iommu/vt-d: Fix VM domain ID leak This continues the attempt to fix commit `fb170fb4c5` ("iommu/vt-d: Introduce helper functions to make code symmetric for readability"). The previous attempt in commit `7168440690` ("iommu/vt-d: Detach domain only from attached iommus") overlooked the fact that dmar_domain.iommu_bmp gets cleared for VM domains when devices are detached: intel_iommu_detach_device domain_remove_one_dev_info domain_detach_iommu The domain is detached from the iommu, but the iommu is still attached to the domain, for whatever reason. Thus when we get to domain_exit(), we can't rely on iommu_bmp for VM domains to find the active iommus, we must check them all. Without that, the corresponding bit in intel_iommu.domain_ids doesn't get cleared and repeated VM domain creation and destruction will run out of domain IDs. Meanwhile we still can't call iommu_detach_domain() on arbitrary non-VM domains or we risk clearing in-use domain IDs, as `7168440690` attempted to address. It's tempting to modify iommu_detach_domain() to test the domain iommu_bmp, but the call ordering from domain_remove_one_dev_info() prevents it being able to work as `fb170fb4c5` seems to have intended. Caching of unused VM domains on the iommu object seems to be the root of the problem, but this code is far too fragile for that kind of rework to be proposed for stable, so we simply revert this chunk to its state prior to `fb170fb4c5`. Fixes: `fb170fb4c5` ("iommu/vt-d: Introduce helper functions to make code symmetric for readability") Fixes: `7168440690` ("iommu/vt-d: Detach domain only from attached iommus") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-07-23 14:17:39 +02:00
Bjorn Helgaas	fb0cc3aa55	iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth We check the ATS state (enabled/disabled) and fetch the PCI ATS Invalidate Queue Depth in performance-sensitive paths. It's easy to cache these, which removes dependencies on PCI. Remember the ATS enabled state. When enabling, read the queue depth once and cache it in the device_domain_info struct. This is similar to what amd_iommu.c does. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de>	2015-07-20 11:49:46 -05:00
Joerg Roedel	8939ddf6d6	iommu/vt-d: Enable Translation only if it was previously disabled Do not touch the TE bit unless we know translation is disabled. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:35 +02:00
Joerg Roedel	60b523ecfe	iommu/vt-d: Don't disable translation prior to OS handover For all the copy-translation code to run, we have to keep translation enabled in intel_iommu_init(). So remove the code disabling it. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:35 +02:00
Joerg Roedel	c3361f2f6e	iommu/vt-d: Don't copy translation tables if RTT bit needs to be changed We can't change the RTT bit when translation is enabled, so don't copy translation tables when we would change the bit with our new root entry. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:35 +02:00
Joerg Roedel	a87f491890	iommu/vt-d: Don't do early domain assignment if kdump kernel When we copied over context tables from an old kernel, we need to defer assignment of devices to domains until the device driver takes over. So skip this part of initialization when we copied over translation tables from the old kernel. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:35 +02:00
Joerg Roedel	86080ccc22	iommu/vt-d: Allocate si_domain in init_dmars() This seperates the allocation of the si_domain from its assignment to devices. It makes sure that the iommu=pt case still works in the kdump kernel, when we have to defer the assignment of devices to domains to device driver initialization time. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:34 +02:00
Joerg Roedel	cf484d0e69	iommu/vt-d: Mark copied context entries Mark the context entries we copied over from the old kernel, so that we don't detect them as present in other code paths. This makes sure we safely overwrite old context entries when a new domain is assigned. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:34 +02:00
Joerg Roedel	dbcd861f25	iommu/vt-d: Do not re-use domain-ids from the old kernel Mark all domain-ids we find as reserved, so that there could be no collision between domains from the previous kernel and our domains in the IOMMU TLB. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:34 +02:00
Joerg Roedel	091d42e43d	iommu/vt-d: Copy translation tables from old kernel If we are in a kdump kernel and find translation enabled in the iommu, try to copy the translation tables from the old kernel to preserve the mappings until the device driver takes over. This supports old and the extended root-entry and context-table formats. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:34 +02:00
Joerg Roedel	4158c2eca3	iommu/vt-d: Detect pre enabled translation Add code to detect whether translation is already enabled in the IOMMU. Save this state in a flags field added to struct intel_iommu. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:34 +02:00
Joerg Roedel	5f0a7f7614	iommu/vt-d: Make root entry visible for hardware right after allocation In case there was an old root entry, make our new one visible immediately after it was allocated. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:34 +02:00
Joerg Roedel	b63d80d1e0	iommu/vt-d: Init QI before root entry is allocated QI needs to be available when we write the root entry into hardware because flushes might be necessary after this. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:33 +02:00
Joerg Roedel	9f10e5bf62	iommu/vt-d: Cleanup log messages Give them a common prefix that can be grepped for and improve the wording here and there. Tested-by: ZhenHua Li <zhen-hual@hp.com> Tested-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-06-16 10:59:33 +02:00
David Woodhouse	c83b2f20fd	iommu/vt-d: Only enable extended context tables if PASID is supported Although the extended tables are theoretically a completely orthogonal feature to PASID and anything else that uses the newly-available bits, some of the early hardware has problems even when all we do is enable them and use only the same bits that were in the old context tables. For now, there's no motivation to support extended tables unless we're going to use PASID support to do SVM. So just don't use them unless PASID support is advertised too. Also add a command-line bailout just in case later chips also have issues. The equivalent problem for PASID support has already been fixed with the upcoming VT-d spec update and commit `bd00c606a` ("iommu/vt-d: Change PASID support to bit 40 of Extended Capability Register"), because the problematic platforms use the old definition of the PASID-capable bit, which is now marked as reserved and meaningless. So with this change, we'll magically start using ECS again only when we see the new hardware advertising "hey, we have PASID support and we actually tested it this time" on bit 40. The VT-d hardware architect has promised that we are not going to have any reason to support ECS without PASID any time soon, and he'll make sure he checks with us before changing that. In the future, if hypothetical new features also use new bits in the context tables and can be seen on implementations without PASID support, we might need to add their feature bits to the ecs_enabled() macro. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-06-12 11:31:25 +01:00
David Woodhouse	4ed6a540fa	iommu/vt-d: Fix passthrough mode with translation-disabled devices When we use 'intel_iommu=igfx_off' to disable translation for the graphics, and when we discover that the BIOS has misconfigured the DMAR setup for I/OAT, we use a special DUMMY_DEVICE_DOMAIN_INFO value in dev->archdata.iommu to indicate that translation is disabled. With passthrough mode, we were attempting to dereference that as a normal pointer to a struct device_domain_info when setting up an identity mapping for the affected device. This fixes the problem by making device_to_iommu() explicitly check for the special value and indicate that no IOMMU was found to handle the devices in question. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@vger.kernel.org (which means you can pick up `18436afdc` now too)	2015-05-11 14:59:20 +01:00
Linus Torvalds	9f86262dcc	Merge git://git.infradead.org/intel-iommu Pull intel iommu updates from David Woodhouse: "This lays a little of the groundwork for upcoming Shared Virtual Memory support — fixing some bogus #defines for capability bits and adding the new ones, and starting to use the new wider page tables where we can, in anticipation of actually filling in the new fields therein. It also allows graphics devices to be assigned to VM guests again. This got broken in 3.17 by disallowing assignment of RMRR-afflicted devices. Like USB, we do understand why there's an RMRR for graphics devices — and unlike USB, it's actually sane. So we can make an exception for graphics devices, just as we do USB controllers. Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty hack to allow broken BIOSes to forbid us from using X2APIC when they do stupid and invasive things and would break if we did. Someone noticed that since Windows doesn't have full IOMMU support for DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid initialising the IOMMU on the graphics unit altogether. This means that it would be available for use in "driver mode", where the IOMMU registers are made available through a BAR of the graphics device and the graphics driver can do SVM all for itself. So they started setting the X2APIC_OPT_OUT bit on all platforms with SVM capabilities. And even the platforms which might, if the planets had been aligned correctly, possibly have had SVM capability but which in practice actually don't" * git://git.infradead.org/intel-iommu: iommu/vt-d: support extended root and context entries iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification iommu/vt-d: Allow RMRR on graphics devices too iommu/vt-d: Print x2apic opt out info instead of printing a warning iommu/vt-d: kill bogus ecap_niotlb_iunits()	2015-04-26 17:47:46 -07:00
Linus Torvalds	79319a052c	IOMMU Updates for Linux v4.1 Not much this time, but the changes include: * Moving domain allocation into the iommu drivers to prepare for the introduction of default domains for devices * Fixing the IO page-table code in the AMD IOMMU driver to correctly encode large page sizes * Extension of the PCI support in the ARM-SMMU driver * Various fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVNFIPAAoJECvwRC2XARrj4v8QAMVsPJ+kmnLvqGDkO9v2i9z6 sFX27h55HhK3Pgb5aEmEhvZd0Eec22KtuADr92LsRSjskgA4FgrzzSlo8w7+MbwM dtowij+5Bzx/jEeexM5gog0ZA9Brl725KSYBmwJIAroKAtl3YXsIA4TO7X/JtXJm 0qWbCxLs9CX5uWyJawkeDl8UAaZYb8AHKv1UhJt8Z5yajM/qITMULi51g2Bgh8kx YaRHeZNj+mFQqb6IlNkmOhILN+dbTdxQREp+aJs1alGdkBGlJyfo6eK4weNOpA4x gc8EXUWZzj1GEPyWMpA/ZMzPzCbj9M6wTeXqRiTq31AMV10zcy545uYcLWks680M CYvWTmjeCvwsbuaj9cn+efa47foH2UoeXxBmXWOJDv4WxcjE1ejmlmSd8WYfwkh9 hIkMzD8tW2iZf3ssnjCeQLa7f6ydL2P4cpnK2JH+N7hN9VOASAlciezroFxtCjU+ 18T7ozgUTbOXZZomBX7OcGQ8ElXMiHB/uaCyNO64yVzApsUnQfpHzcRI5OavOYn5 dznjrzvNLCwHs3QFI4R7rsmIfPkOM0g5nY5drGwJ23+F+rVpLmpWVPR5hqT7a1HM tJVmzces6HzOu7P1Mo0IwvNbZEmNBGTHYjGtWs6e79MQxdriFT4I+DwvFOy7GUq/ Is2b+HPwWhiWJHQXLTT2 =FMxH -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Not much this time, but the changes include: - moving domain allocation into the iommu drivers to prepare for the introduction of default domains for devices - fixing the IO page-table code in the AMD IOMMU driver to correctly encode large page sizes - extension of the PCI support in the ARM-SMMU driver - various fixes and cleanups" * tag 'iommu-updates-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (34 commits) iommu/amd: Correctly encode huge pages in iommu page tables iommu/amd: Optimize amd_iommu_iova_to_phys for new fetch_pte interface iommu/amd: Optimize alloc_new_range for new fetch_pte interface iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface iommu/amd: Return the pte page-size in fetch_pte iommu/amd: Add support for contiguous dma allocator iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE iommu/tegra: smmu: Compute PFN mask at runtime iommu/tegra: gart: Set aperture at domain initialization time iommu/tegra: Setup aperture iommu: Remove domain_init and domain_free iommu_ops iommu/fsl: Make use of domain_alloc and domain_free iommu/rockchip: Make use of domain_alloc and domain_free iommu/ipmmu-vmsa: Make use of domain_alloc and domain_free iommu/shmobile: Make use of domain_alloc and domain_free iommu/msm: Make use of domain_alloc and domain_free iommu/tegra-gart: Make use of domain_alloc and domain_free iommu/tegra-smmu: Make use of domain_alloc and domain_free ...	2015-04-20 10:50:05 -07:00
Rafael J. Wysocki	9a9ca16e7a	Merge branch 'device-properties' * device-properties: device property: Introduce firmware node type for platform data device property: Make it possible to use secondary firmware nodes driver core: Implement device property accessors through fwnode ones driver core: property: Update fwnode_property_read_string_array() driver core: Add comments about returning array counts ACPI: Introduce has_acpi_companion() driver core / ACPI: Represent ACPI companions using fwnode_handle	2015-04-13 00:35:54 +02:00
Joerg Roedel	7f65ef01e1	Merge branches 'iommu/fixes', 'x86/vt-d', 'x86/amd', 'arm/smmu', 'arm/tegra' and 'core' into next Conflicts: drivers/iommu/amd_iommu.c drivers/iommu/tegra-gart.c drivers/iommu/tegra-smmu.c	2015-04-02 13:33:19 +02:00
Joerg Roedel	00a77deb0f	iommu/vt-d: Make use of domain_alloc and domain_free Get rid of domain_init and domain_destroy and implement domain_alloc/domain_free instead. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-03-31 15:32:02 +02:00
David Woodhouse	03ecc32c52	iommu/vt-d: support extended root and context entries Add a new function iommu_context_addr() which takes care of the differences and returns a pointer to a context entry which may be in either format. The formats are binary compatible for all the old fields anyway; the new one is just larger and some of the reserved bits in the original 128 are now meaningful. So far, nothing actually uses the new fields in the extended context entry. Modulo hardware bugs with interpreting the new-style tables, this should basically be a no-op. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2015-03-25 15:46:13 +00:00
David Woodhouse	18436afdc1	iommu/vt-d: Allow RMRR on graphics devices too Commit `c875d2c1` ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains") prevents certain options for devices with RMRRs. This even prevents those devices from getting a 1:1 mapping with 'iommu=pt', because we don't have the code to handle preserving the RMRR regions when moving the device between domains. There's already an exclusion for USB devices, because we know the only reason for RMRRs there is a misguided desire to keep legacy keyboard/mouse emulation running in some theoretical OS which doesn't have support for USB in its own right... but which does enable the IOMMU. Add an exclusion for graphics devices too, so that 'iommu=pt' works there. We should be able to successfully assign graphics devices to guests too, as long as the initial handling of stolen memory is reconfigured appropriately. This has certainly worked in the past. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@vger.kernel.org	2015-03-25 15:36:35 +00:00
Alex Williamson	509fca899d	iommu/vt-d: Remove unused variable Unused after commit `7168440690` ("iommu/vt-d: Detach domain only from attached iommus"). Reported by 0-day builder. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-03-24 15:39:26 +01:00
Alex Williamson	7168440690	iommu/vt-d: Detach domain only from attached iommus Device domains never span IOMMU hardware units, which allows the domain ID space for each IOMMU to be an independent address space. Therefore we can have multiple, independent domains, each with the same domain->id, but attached to different hardware units. This is also why we need to do a heavy-weight search for VM domains since they can span multiple IOMMUs hardware units and we don't require a single global ID to use for all hardware units. Therefore, if we call iommu_detach_domain() across all active IOMMU hardware units for a non-VM domain, the result is that we clear domain IDs that are not associated with our domain, allowing them to be re-allocated and causing apparent coherency issues when the device cannot access IOVAs for the intended domain. This bug was introduced in commit `fb170fb4c5` ("iommu/vt-d: Introduce helper functions to make code symmetric for readability"), but is significantly exacerbated by the more recent commit `62c22167dd` ("iommu/vt-d: Fix dmar_domain leak in iommu_attach_device") which calls domain_exit() more frequently to resolve a domain leak. Fixes: `fb170fb4c5` ("iommu/vt-d: Introduce helper functions to make code symmetric for readability") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: stable@vger.kernel.org # v3.17+ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-03-23 15:22:08 +01:00
Rafael J. Wysocki	ca5b74d267	ACPI: Introduce has_acpi_companion() Now that the ACPI companions of devices are represented by pointers to struct fwnode_handle, it is not quite efficient to check whether or not an ACPI companion of a device is present by evaluating the ACPI_COMPANION() macro. For this reason, introduce a special static inline routine for that, has_acpi_companion(), and update the code to use it where applicable. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-03-16 23:49:08 +01:00
Quentin Lambert	2f119c7895	iommu/vt-d: Convert non-returned local variable to boolean when relevant This patch was produced using Coccinelle. A simplified version of the semantic patch is: @r exists@ identifier f; local idexpression u8 x; identifier xname; @@ f(...) { ...when any ( x@xname = 1; \| x@xname = 0; ) ...when any } @bad exists@ identifier r.f; local idexpression u8 r.x expression e1 != {0, 1}, e2; @@ f(...) { ...when any ( x = e1; \| x + e2 ) ...when any } @depends on !bad@ identifier r.f; local idexpression u8 r.x; identifier r.xname; @@ f(...) { ... ++ bool xname; - int xname; <... ( x = - 1 + true \| x = - -1 + false ) ...> } Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-03-03 14:13:20 +01:00
Robin Murphy	0fb5fe874c	iommu: Make IOVA domain page size explicit Systems may contain heterogeneous IOMMUs supporting differing minimum page sizes, which may also not be common with the CPU page size. Thus it is practical to have an explicit notion of IOVA granularity to simplify handling of mapping and allocation constraints. As an initial step, move the IOVA page granularity from an implicit compile-time constant to a per-domain property so we can make use of it in IOVA domain context at runtime. To keep the abstraction tidy, extend the little API of inline iova_* helpers to parallel some of the equivalent PAGE_* macros. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-01-19 14:55:22 +01:00
Robin Murphy	1b72250076	iommu: Make IOVA domain low limit flexible To share the IOVA allocator with other architectures, it needs to accommodate more general aperture restrictions; move the lower limit from a compile-time constant to a runtime domain property to allow IOVA domains with different requirements to co-exist. Also reword the slightly unclear description of alloc_iova since we're touching it anyway. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-01-19 14:55:22 +01:00
Robin Murphy	85b4545629	iommu: Consolidate IOVA allocator code In order to share the IOVA allocator with other architectures, break the unnecssary dependency on the Intel IOMMU driver and move the remaining IOVA internals to iova.c Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-01-19 14:55:22 +01:00
Joerg Roedel	6d1b9cc9ee	iommu/vt-d: Remove dead code in device_notifier This code only runs when action == BUS_NOTIFY_REMOVED_DEVICE, so it can't be BUS_NOTIFY_DEL_DEVICE. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-01-05 12:23:38 +01:00
Joerg Roedel	62c22167dd	iommu/vt-d: Fix dmar_domain leak in iommu_attach_device Since commit `1196c2f` a domain is only destroyed in the notifier path if it is hot-unplugged. This caused a domain leakage in iommu_attach_device when a driver was unbound from the device and bound to VFIO. In this case the device is attached to a new domain and unlinked from the old domain. At this point nothing points to the old domain anymore and its memory is leaked. Fix this by explicitly freeing the old domain in iommu_attach_domain. Fixes: `1196c2f` (iommu/vt-d: Fix dmar_domain leak in iommu_attach_device) Cc: stable@vger.kernel.org # v3.18 Tested-by: Jerry Hoemann <jerry.hoemann@hp.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2015-01-05 12:23:38 +01:00
Joerg Roedel	76771c938e	Merge branches 'arm/omap', 'arm/msm', 'arm/rockchip', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next Conflicts: drivers/iommu/arm-smmu.c	2014-12-02 13:07:13 +01:00
Jiang Liu	cc4f14aa17	iommu/vt-d: Fix an off-by-one bug in __domain_mapping() There's an off-by-one bug in function __domain_mapping(), which may trigger the BUG_ON(nr_pages < lvl_pages) when (nr_pages + 1) & superpage_mask == 0 The issue was introduced by commit `9051aa0268` "intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()", which sets sg_res to "nr_pages + 1" to avoid some of the 'sg_res==0' code paths. It's safe to remove extra "+1" because sg_res is only used to calculate page size now. Reported-And-Tested-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: <stable@vger.kernel.org> # >= 3.0 Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-12-02 13:03:09 +01:00
Jiang Liu	ffebeb46dd	iommu/vt-d: Enhance intel-iommu driver to support DMAR unit hotplug Implement required callback functions for intel-iommu driver to support DMAR unit hotplug. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Reviewed-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:36 +01:00
Jiang Liu	6b1972493a	iommu/vt-d: Implement DMAR unit hotplug framework On Intel platforms, an IO Hub (PCI/PCIe host bridge) may contain DMAR units, so we need to support DMAR hotplug when supporting PCI host bridge hotplug on Intel platforms. According to Section 8.8 "Remapping Hardware Unit Hot Plug" in "Intel Virtualization Technology for Directed IO Architecture Specification Rev 2.2", ACPI BIOS should implement ACPI _DSM method under the ACPI object for the PCI host bridge to support DMAR hotplug. This patch introduces interfaces to parse ACPI _DSM method for DMAR unit hotplug. It also implements state machines for DMAR unit hot-addition and hot-removal. The PCI host bridge hotplug driver should call dmar_hotplug_hotplug() before scanning PCI devices connected for hot-addition and after destroying all PCI devices for hot-removal. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Reviewed-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:35 +01:00
Jiang Liu	78d8e70461	iommu/vt-d: Dynamically allocate and free seq_id for DMAR units Introduce functions to support dynamic IOMMU seq_id allocating and releasing, which will be used to support DMAR hotplug. Also rename IOMMU_UNITS_SUPPORTED as DMAR_UNITS_SUPPORTED. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Reviewed-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:35 +01:00
Jiang Liu	c2a0b538d2	iommu/vt-d: Introduce helper function dmar_walk_resources() Introduce helper function dmar_walk_resources to walk resource entries in DMAR table and ACPI buffer object returned by ACPI _DSM method for IOMMU hot-plug. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:35 +01:00
Li, Zhen-Hua	1a2262f90f	x86/vt-d: Fix incorrect bit operations in setting values The function context_set_address_root() and set_root_value are setting new address in a wrong way, and this patch is trying to fix this problem. According to Intel Vt-d specs(Feb 2011, Revision 1.3), Chapter 9.1 and 9.2, field ctp in root entry is using bits 12:63, field asr in context entry is using bits 12:63. To set these fields, the following functions are used: static inline void context_set_address_root(struct context_entry context, unsigned long value); and static inline void set_root_value(struct root_entry root, unsigned long value) But they are using an invalid method to set these fields, in current code, only a '\|' operator is used to set it. This will not set the asr to the expected value if it has an old value. For example: Before calling this function, context->lo = 0x3456789012111; value = 0x123456789abcef12; After we call context_set_address_root(context, value), expected result is context->lo == 0x123456789abce111; But the actual result is: context->lo == 0x1237577f9bbde111; So we need to clear bits 12:63 before setting the new value, this will fix this problem. Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-06 14:40:10 +01:00
Olav Haugan	315786ebbf	iommu: Add iommu_map_sg() function Mapping and unmapping are more often than not in the critical path. map_sg allows IOMMU driver implementations to optimize the process of mapping buffers into the IOMMU page tables. Instead of mapping a buffer one page at a time and requiring potentially expensive TLB operations for each page, this function allows the driver to map all pages in one go and defer TLB maintenance until after all pages have been mapped. Additionally, the mapping operation would be faster in general since clients does not have to keep calling map API over and over again for each physically contiguous chunk of memory that needs to be mapped to a virtually contiguous region. Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-04 14:53:36 +01:00
Joerg Roedel	09b5269a1b	Merge branches 'arm/exynos', 'arm/omap', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next Conflicts: drivers/iommu/arm-smmu.c	2014-10-02 12:24:45 +02:00
Joerg Roedel	1196c2fb04	iommu/vt-d: Only remove domain when device is removed This makes sure any RMRR mappings stay in place when the driver is unbound from the device. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Jerry Hoemann <jerry.hoemann@hp.com>	2014-10-02 11:18:58 +02:00
Joerg Roedel	5d587b8de5	iommu/vt-d: Convert to iommu_capable() API function Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-09-25 15:47:37 +02:00
Joerg Roedel	e7f9fa5498	iommu/vt-d: Defer domain removal if device is assigned to a driver When the BUS_NOTIFY_DEL_DEVICE event is received the device might still be attached to a driver. In this case the domain can't be released as the mappings might still be in use. Defer the domain removal in this case until we receivce the BUS_NOTIFY_UNBOUND_DRIVER event. Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: stable@vger.kernel.org # v3.15, v3.16 Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-08-18 13:37:56 +02:00
Alex Williamson	c875d2c1b8	iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains The user of the IOMMU API domain expects to have full control of the IOVA space for the domain. RMRRs are fundamentally incompatible with that idea. We can neither map the RMRR into the IOMMU API domain, nor can we guarantee that the device won't continue DMA with the area described by the RMRR as part of the new domain. Therefore we must prevent such devices from being used by the IOMMU API. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-29 17:38:31 +02:00
Jiang Liu	161f693460	iommu/vt-d: Fix issue in computing domain's iommu_snooping flag IOMMU units may dynamically attached to/detached from domains, so we should scan all active IOMMU units when computing iommu_snooping flag for a domain instead of only scanning IOMMU units associated with the domain. Also check snooping and superpage capabilities when hot-adding DMAR units. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	a156ef99e8	iommu/vt-d: Introduce helper function iova_size() to improve code readability Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	162d1b10d4	iommu/vt-d: Introduce helper domain_pfn_within_range() to simplify code Introduce helper function domain_pfn_within_range() to simplify code and improve readability. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	d41a4adb1b	iommu/vt-d: Simplify intel_unmap_sg() and kill duplicated code Introduce intel_unmap() to reduce duplicated code in intel_unmap_sg() and intel_unmap_page(). Also let dma_pte_free_pagetable() to call dma_pte_clear_range() directly, so caller only needs to call dma_pte_free_pagetable(). Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	2a41ccee2f	iommu/vt-d: Change iommu_enable/disable_translation to return void Simplify error handling path by changing iommu_{enable\|disable}_translation to return void. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	129ad28100	iommu/vt-d: Avoid freeing virtual machine domain in free_dmar_iommu() Virtual machine domains are created by intel_iommu_domain_init() and should be destroyed by intel_iommu_domain_destroy(). So avoid freeing virtual machine domain data structure in free_dmar_iommu() when doamin->iommu_count reaches zero, otherwise it may cause invalid memory access because the IOMMU framework still holds references to the domain structure. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	2a46ddf77c	iommu/vt-d: Fix possible invalid memory access caused by free_dmar_iommu() Static identity and virtual machine domains may be cached in iommu->domain_ids array after corresponding IOMMUs have been removed from domain->iommu_bmp. So we should check domain->iommu_bmp before decreasing domain->iommu_count in function free_dmar_iommu(), otherwise it may cause free of inuse domain data structure. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	44bde61428	iommu/vt-d: Allocate dynamic domain id for virtual domains only Check the same domain id is allocated for si_domain on each IOMMU, otherwise the IOTLB flush for si_domain will fail. Now the rules to allocate and manage domain id are: 1) For normal and static identity domains, domain id is allocated when creating domain structure. And this id will be written into context entry. 2) For virtual machine domain, a virtual id is allocated when creating domain. And when binding virtual machine domain to an iommu, a real domain id is allocated on demand and this domain id will be written into context entry. So domain->id for virtual machine domain may be different from the domain id written into context entry(used by hardware). Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:47 +02:00
Jiang Liu	fb170fb4c5	iommu/vt-d: Introduce helper functions to make code symmetric for readability Introduce domain_attach_iommu()/domain_detach_iommu() and refine iommu_attach_domain()/iommu_detach_domain() to make code symmetric and improve readability. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:46 +02:00
Jiang Liu	ab8dfe2515	iommu/vt-d: Introduce helper functions to improve code readability Introduce domain_type_is_vm() and domain_type_is_vm_or_si() to improve code readability. Also kill useless macro DOMAIN_FLAG_P2P_MULTIPLE_DEVICES. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:46 +02:00
Jiang Liu	18fd779a41	iommu/vt-d: Use correct domain id to flush virtual machine domains For virtual machine domains, domain->id is a virtual id, and the real domain id written into context entry is dynamically allocated. So use the real domain id instead of domain->id when flushing iotlbs for virtual machine domains. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:46 +02:00
Jiang Liu	c3b497c6bb	iommu/vt-d: Match segment number when searching for dev_iotlb capable devices For virtual machine and static identity domains, there may be devices from different PCI segments associated with the same domain. So function iommu_support_dev_iotlb() should also match PCI segment number (iommu unit) when searching for dev_iotlb capable devices. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-23 16:04:46 +02:00
Joerg Roedel	cbb24a25a8	Merge branch 'core' into x86/vt-d Conflicts: drivers/iommu/intel-iommu.c	2014-07-23 16:04:37 +02:00
Thierry Reding	b22f6434cf	iommu: Constify struct iommu_ops This structure is read-only data and should never be modified. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-07 10:36:59 +02:00
Alex Williamson	a5459cfece	iommu/vt-d: Make use of IOMMU sysfs support Register our DRHD IOMMUs, cross link devices, and provide a base set of attributes for the IOMMU. Note that IRQ remapping support parses the DMAR table very early in boot, well before the iommu_class can reasonably be setup, so our registration is split between intel_iommu_init(), which occurs later, and alloc_iommu(), which typically occurs much earlier, but may happen at any time later with IOMMU hot-add support. On a typical desktop system, this provides the following (pruned): $ find /sys \| grep dmar /sys/devices/virtual/iommu/dmar0 /sys/devices/virtual/iommu/dmar0/devices /sys/devices/virtual/iommu/dmar0/devices/0000:00:02.0 /sys/devices/virtual/iommu/dmar0/intel-iommu /sys/devices/virtual/iommu/dmar0/intel-iommu/cap /sys/devices/virtual/iommu/dmar0/intel-iommu/ecap /sys/devices/virtual/iommu/dmar0/intel-iommu/address /sys/devices/virtual/iommu/dmar0/intel-iommu/version /sys/devices/virtual/iommu/dmar1 /sys/devices/virtual/iommu/dmar1/devices /sys/devices/virtual/iommu/dmar1/devices/0000:00:00.0 /sys/devices/virtual/iommu/dmar1/devices/0000:00:01.0 /sys/devices/virtual/iommu/dmar1/devices/0000:00:16.0 /sys/devices/virtual/iommu/dmar1/devices/0000:00:1a.0 /sys/devices/virtual/iommu/dmar1/devices/0000:00:1b.0 /sys/devices/virtual/iommu/dmar1/devices/0000:00:1c.0 ... /sys/devices/virtual/iommu/dmar1/intel-iommu /sys/devices/virtual/iommu/dmar1/intel-iommu/cap /sys/devices/virtual/iommu/dmar1/intel-iommu/ecap /sys/devices/virtual/iommu/dmar1/intel-iommu/address /sys/devices/virtual/iommu/dmar1/intel-iommu/version /sys/class/iommu/dmar0 /sys/class/iommu/dmar1 (devices also link back to the dmar units) This makes address, version, capabilities, and extended capabilities available, just like printed on boot. I've tried not to duplicate data that can be found in the DMAR table, with the exception of the address, which provides an easy way to associate the sysfs device with a DRHD entry in the DMAR. It's tempting to add scopes and RMRR data here, but the full DMAR table is already exposed under /sys/firmware/ and therefore already provides a way for userspace to learn such details. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 12:35:59 +02:00
Alex Williamson	579305f75d	iommu/vt-d: Update to use PCI DMA aliases VT-d code currently makes use of pci_find_upstream_pcie_bridge() in order to find the topology based alias of a device. This function has a few problems. First, it doesn't check the entire alias path of the device to the root bus, therefore if a PCIe device is masked upstream, the wrong result is produced. Also, it's known to get confused and give up when it crosses a bridge from a conventional PCI bus to a PCIe bus that lacks a PCIe capability. The PCI-core provided DMA alias support solves both of these problems and additionally adds support for DMA function quirks allowing VT-d to work with devices like Marvell and Ricoh with known broken requester IDs. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 12:35:58 +02:00
Alex Williamson	e17f9ff413	iommu/vt-d: Use iommu_group_get_for_dev() The IOMMU code now provides a common interface for finding or creating an IOMMU group for a device on PCI buses. Make use of it and remove piles of code. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 12:35:58 +02:00
Yijing Wang	aa4d066a2a	iommu/vt-d: Suppress compiler warnings suppress compiler warnings: drivers/iommu/intel-iommu.c: In function ‘device_to_iommu’: drivers/iommu/intel-iommu.c:673: warning: ‘segment’ may be used uninitialized in this function drivers/iommu/intel-iommu.c: In function ‘get_domain_for_dev.clone.3’: drivers/iommu/intel-iommu.c:2217: warning: ‘bridge_bus’ may be used uninitialized in this function drivers/iommu/intel-iommu.c:2217: warning: ‘bridge_devfn’ may be used uninitialized in this function Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:34:37 +02:00
Yijing Wang	effad4b59f	iommu/vt-d: Remove the useless dma_pte_addr Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:34:20 +02:00
Joerg Roedel	c3c75eb7fa	iommu/vt-d: Don't use magic number in dma_pte_superpage Use the already defined DMA_PTE_LARGE_PAGE for testing instead of hardcoding the value again. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:34:18 +02:00
Yijing Wang	9b27e82d20	iommu/vt-d: Fix reference count in iommu_prepare_isa Decrease the device reference count avoid memory leak. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:34:13 +02:00
Yijing Wang	e16922af9d	iommu/vt-d: Use inline function dma_pte_superpage instead of macros Use inline function dma_pte_superpage() instead of macro for better readability. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:34:07 +02:00
Yijing Wang	8f9d41b430	iommu/vt-d: Clear the redundant assignment for domain->nid Alloc_domain() will initialize domain->nid to -1. So the initialization for domain->nid in md_domain_init() is redundant, clear it. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:34:00 +02:00
Yijing Wang	3a74ca0140	iommu/vt-d: Use list_for_each_safe() to simplify code Use list_for_each_entry_safe() instead of list_entry() to simplify code. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-07-04 11:16:20 +02:00
Jiang Liu	27e249501c	iommu/vt-d: fix bug in handling multiple RMRRs for the same PCI device Function dmar_iommu_notify_scope_dev() makes a wrong assumption that there's one RMRR for each PCI device at most, which causes DMA failure on some HP platforms. So enhance dmar_iommu_notify_scope_dev() to handle multiple RMRRs for the same PCI device. Fixbug: https://bugzilla.novell.com/show_bug.cgi?id=879482 Cc: <stable@vger.kernel.org> # 3.15 Reported-by: Tom Mingarelli <thomas.mingarelli@hp.com> Tested-by: Linda Knippers <linda.knippers@hp.com> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-06-20 14:18:04 +02:00
Akinobu Mita	3674643625	intel-iommu: integrate DMA CMA This adds support for the DMA Contiguous Memory Allocator for intel-iommu. This change enables dma_alloc_coherent() to allocate big contiguous memory. It is achieved in the same way as nommu_dma_ops currently does, i.e. trying to allocate memory by dma_alloc_from_contiguous() and alloc_pages() is used as a fallback. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Don Dutile <ddutile@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 16:53:57 -07:00
David Woodhouse	9f05d3fb64	iommu/vt-d: Fix get_domain_for_dev() handling of upstream PCIe bridges Commit `146922ec79` ("iommu/vt-d: Make get_domain_for_dev() take struct device") introduced new variables bridge_bus and bridge_devfn to identify the upstream PCIe to PCI bridge responsible for the given target device. Leaving the original bus/devfn variables to identify the target device itself, now that it is no longer assumed to be PCI and we can no longer trivially find that information. However, the patch failed to correctly use the new variables in all cases; instead using the as-yet-uninitialised 'bus' and 'devfn' variables. Reported-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-04-14 22:01:30 -07:00
Jiang Liu	adeb25905c	iommu/vt-d: fix memory leakage caused by commit `ea8ea46` Commit `ea8ea46` "iommu/vt-d: Clean up and fix page table clear/free behaviour" introduces possible leakage of DMA page tables due to: for (pte = page_address(pg); !first_pte_in_page(pte); pte++) { if (dma_pte_present(pte) && !dma_pte_superpage(pte)) freelist = dma_pte_list_pagetables(domain, level - 1, pte, freelist); } For the first pte in a page, first_pte_in_page(pte) will always be true, thus dma_pte_list_pagetables() will never be called and leak DMA page tables if level is bigger than 1. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-04-13 13:07:56 +01:00
Dan Carpenter	14d4056996	iommu/vt-d: returning free pointer in get_domain_for_dev() If we hit this error condition then we want to return a NULL pointer and not a freed variable. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-28 11:31:39 +00:00
David Woodhouse	cf04eee8bf	iommu/vt-d: Include ACPI devices in iommu=pt Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:10 +00:00
David Woodhouse	66077edc97	iommu/vt-d: Finally enable translation for non-PCI devices Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:08 +00:00
David Woodhouse	46333e375f	iommu/vt-d: Remove to_pci_dev() in intel_map_page() It might not be... Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:07 +00:00
David Woodhouse	7207d8f925	iommu/vt-d: Remove pdev from intel_iommu_attach_device() Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:05 +00:00
David Woodhouse	ecb509ec2b	iommu/vt-d: Remove pdev from iommu_no_mapping() Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:04 +00:00
David Woodhouse	5913c9bf0e	iommu/vt-d: Make domain_add_dev_info() take struct device Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:03 +00:00
David Woodhouse	bf9c9eda71	iommu/vt-d: Make domain_remove_one_dev_info() take struct device Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:01 +00:00
David Woodhouse	5040a918bd	iommu/vt-d: Rename 'hwdev' variables to 'dev' now that that's the norm Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:08:00 +00:00
David Woodhouse	207e35920d	iommu/vt-d: Remove some pointless to_pci_dev() calls Mostly made redundant by using dev_name() instead of pci_name(), and one instance of using *dev->dma_mask instead of pdev->dma_mask. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:58 +00:00
David Woodhouse	d4b709f48e	iommu/vt-d: Make get_valid_domain_for_dev() take struct device Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:57 +00:00
David Woodhouse	3bdb259116	iommu/vt-d: Make iommu_should_identity_map() take struct device Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:55 +00:00
David Woodhouse	0b9d975315	iommu/vt-d: Handle RMRRs for non-PCI devices Should hopefully never happen (RMRRs are an abomination) but while we're busy eliminating all the PCI assumptions, we might as well do it. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:54 +00:00
David Woodhouse	146922ec79	iommu/vt-d: Make get_domain_for_dev() take struct device Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:52 +00:00
David Woodhouse	e1f167f3fd	iommu/vt-d: Make domain_context_mapp{ed,ing}() take struct device Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:51 +00:00
David Woodhouse	156baca8d3	iommu/vt-d: Make device_to_iommu() cope with non-PCI devices Pass the struct device to it, and also make it return the bus/devfn to use, since that is also stored in the DMAR table. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:49 +00:00
David Woodhouse	9b226624bb	iommu/vt-d: Make identity_mapping() take struct device not struct pci_dev Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:48 +00:00
David Woodhouse	41e80dca52	iommu/vt-d: Remove segment from struct device_domain_info() It's accessible via info->iommu->segment so this is redundant. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:07:46 +00:00
David Woodhouse	7c7faa11ec	iommu/vt-d: Remove device_to_iommu() call from domain_remove_dev_info() This was problematic because it works by domain/bus/devfn and we want to make device_to_iommu() use only a struct device * (for handling non-PCI devices). Now that the iommu pointer is reliably stored in the device_domain_info, we don't need to look it up. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:53 +00:00
David Woodhouse	8bbc441012	iommu/vt-d: Simplify iommu check in domain_remove_one_dev_info() Now we store the iommu in the device_domain_info, we don't need to do a lookup. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:51 +00:00
David Woodhouse	5a8f40e8c8	iommu/vt-d: Always store iommu in device_domain_info Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:44 +00:00
David Woodhouse	e2f8c5f6d4	iommu/vt-d: Use domain_remove_one_dev_info() in domain_add_dev_info() error path Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:42 +00:00
David Woodhouse	0ac7266485	iommu/vt-d: use dmar_insert_dev_info() from dma_add_dev_info() Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:41 +00:00
David Woodhouse	b718cd3d84	iommu/vt-d: Stop dmar_insert_dev_info() freeing domains on losing race By moving this into get_domain_for_dev() we can make dmar_insert_dev_info() suitable for use with "special" domains such as the si_domain, which currently use domain_add_dev_info(). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:39 +00:00
David Woodhouse	64ae892bfe	iommu/vt-d: Pass iommu to domain_context_mapping_one() and iommu_support_dev_iotlb() Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:37 +00:00
David Woodhouse	0bcb3e28c3	iommu/vt-d: Use struct device in device_domain_info, not struct pci_dev Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:36 +00:00
David Woodhouse	1525a29a7d	iommu/vt-d: Make dmar_insert_dev_info() take struct device instead of struct pci_dev Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:34 +00:00
David Woodhouse	3d89194a94	iommu/vt-d: Make iommu_dummy() take struct device instead of struct pci_dev Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:06:33 +00:00
David Woodhouse	832bd85867	iommu/vt-d: Change scope lists to struct device, bus, devfn It's not only for PCI devices any more, and the scope information for an ACPI device provides the bus and devfn so that has to be stored here too. It is the device pointer itself which needs to be protected with RCU, so the __rcu annotation follows it into the definition of struct dmar_dev_scope, since we're no longer just passing arrays of device pointers around. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-24 14:05:08 +00:00
David Woodhouse	d050196087	iommu/vt-d: Be less pessimistic about domain coherency where possible In commit `2e12bc29` ("intel-iommu: Default to non-coherent for domains unattached to iommus") we decided to err on the side of caution and always assume that it's possible that a device will be attached which is behind a non-coherent IOMMU. In some cases, however, that just cannot happen. If there are no IOMMUs in the system which are non-coherent, then we don't need to do it. And flushing the dcache is a significant performance hit. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-19 17:25:48 +00:00
David Woodhouse	214e39aa36	iommu/vt-d: Honour intel_iommu=sp_off for non-VMM domains Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-19 17:22:13 +00:00
David Woodhouse	ea8ea460c9	iommu/vt-d: Clean up and fix page table clear/free behaviour There is a race condition between the existing clear/free code and the hardware. The IOMMU is actually permitted to cache the intermediate levels of the page tables, and doesn't need to walk the table from the very top of the PGD each time. So the existing back-to-back calls to dma_pte_clear_range() and dma_pte_free_pagetable() can lead to a use-after-free where the IOMMU reads from a freed page table. When freeing page tables we actually need to do the IOTLB flush, with the 'invalidation hint' bit clear to indicate that it's not just a leaf-node flush, after unlinking each page table page from the next level up but before actually freeing it. So in the rewritten domain_unmap() we just return a list of pages (using pg->freelist to make a list of them), and then the caller is expected to do the appropriate IOTLB flush (or tear down the domain completely, whatever), before finally calling dma_free_pagelist() to free the pages. As an added bonus, we no longer need to flush the CPU's data cache for pages which are about to be removed from the page table hierarchy anyway, in the non-cache-coherent case. This drastically improves the performance of large unmaps. As a side-effect of all these changes, this also fixes the fact that intel_iommu_unmap() was neglecting to free the page tables for the range in question after clearing them. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-19 17:21:41 +00:00
David Woodhouse	5cf0a76fa2	iommu/vt-d: Clean up size handling for intel_iommu_unmap() We have this horrid API where iommu_unmap() can unmap more than it's asked to, if the IOVA in question happens to be mapped with a large page. Instead of propagating this nonsense to the point where we end up returning the page order from dma_pte_clear_range(), let's just do it once and adjust the 'size' parameter accordingly. Augment pfn_to_dma_pte() to return the level at which the PTE was found, which will also be useful later if we end up changing the API for iommu_iova_to_phys() to behave the same way as is being discussed upstream. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2014-03-19 17:21:32 +00:00
Jiang Liu	75f05569d0	iommu/vt-d: Update IOMMU state when memory hotplug happens If static identity domain is created, IOMMU driver needs to update si_domain page table when memory hotplug event happens. Otherwise PCI device DMA operations can't access the hot-added memory regions. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:06 +01:00
Jiang Liu	2e45528930	iommu/vt-d: Unify the way to process DMAR device scope array Now we have a PCI bus notification based mechanism to update DMAR device scope array, we could extend the mechanism to support boot time initialization too, which will help to unify and simplify the implementation. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:06 +01:00
Jiang Liu	59ce0515cd	iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens Current Intel DMAR/IOMMU driver assumes that all PCI devices associated with DMAR/RMRR/ATSR device scope arrays are created at boot time and won't change at runtime, so it caches pointers of associated PCI device object. That assumption may be wrong now due to: 1) introduction of PCI host bridge hotplug 2) PCI device hotplug through sysfs interfaces. Wang Yijing has tried to solve this issue by caching <bus, dev, func> tupple instead of the PCI device object pointer, but that's still unreliable because PCI bus number may change in case of hotplug. Please refer to http://lkml.org/lkml/2013/11/5/64 Message from Yingjing's mail: after remove and rescan a pci device [ 611.857095] dmar: DRHD: handling fault status reg 2 [ 611.857109] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff7000 [ 611.857109] DMAR:[fault reason 02] Present bit in context entry is clear [ 611.857524] dmar: DRHD: handling fault status reg 102 [ 611.857534] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff6000 [ 611.857534] DMAR:[fault reason 02] Present bit in context entry is clear [ 611.857936] dmar: DRHD: handling fault status reg 202 [ 611.857947] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff5000 [ 611.857947] DMAR:[fault reason 02] Present bit in context entry is clear [ 611.858351] dmar: DRHD: handling fault status reg 302 [ 611.858362] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff4000 [ 611.858362] DMAR:[fault reason 02] Present bit in context entry is clear [ 611.860819] IPv6: ADDRCONF(NETDEV_UP): eth3: link is not ready [ 611.860983] dmar: DRHD: handling fault status reg 402 [ 611.860995] dmar: INTR-REMAP: Request device [[86:00.3] fault index a4 [ 611.860995] INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear This patch introduces a new mechanism to update the DRHD/RMRR/ATSR device scope caches by hooking PCI bus notification. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:06 +01:00
Jiang Liu	0e242612d9	iommu/vt-d: Use RCU to protect global resources in interrupt context Global DMA and interrupt remapping resources may be accessed in interrupt context, so use RCU instead of rwsem to protect them in such cases. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:05 +01:00
Jiang Liu	3a5670e8ac	iommu/vt-d: Introduce a rwsem to protect global data structures Introduce a global rwsem dmar_global_lock, which will be used to protect DMAR related global data structures from DMAR/PCI/memory device hotplug operations in process context. DMA and interrupt remapping related data structures are read most, and only change when memory/PCI/DMAR hotplug event happens. So a global rwsem solution is adopted for balance between simplicity and performance. For interrupt remapping driver, function intel_irq_remapping_supported(), dmar_table_init(), intel_enable_irq_remapping(), disable_irq_remapping(), reenable_irq_remapping() and enable_drhd_fault_handling() etc are called during booting, suspending and resuming with interrupt disabled, so no need to take the global lock. For interrupt remapping entry allocation, the locking model is: down_read(&dmar_global_lock); /* Find corresponding iommu / iommu = map_hpet_to_ir(id); if (iommu) / * Allocate remapping entry and mark entry busy, * the IOMMU won't be hot-removed until the * allocated entry has been released. */ index = alloc_irte(iommu, irq, 1); up_read(&dmar_global_lock); For DMA remmaping driver, we only uses the dmar_global_lock rwsem to protect functions which are only called in process context. For any function which may be called in interrupt context, we will use RCU to protect them in following patches. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:05 +01:00
Jiang Liu	b683b230a2	iommu/vt-d: Introduce macro for_each_dev_scope() to walk device scope entries Introduce for_each_dev_scope()/for_each_active_dev_scope() to walk {active} device scope entries. This will help following RCU lock related patches. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:04 +01:00
Jiang Liu	b5f82ddf22	iommu/vt-d: Fix error in detect ATS capability Current Intel IOMMU driver only matches a PCIe root port with the first DRHD unit with the samge segment number. It will report false result if there are multiple DRHD units with the same segment number, thus fail to detect ATS capability for some PCIe devices. This patch refines function dmar_find_matched_atsr_unit() to search all DRHD units with the same segment number. An example DMAR table entries as below: [1D0h 0464 2] Subtable Type : 0002 <Root Port ATS Capability> [1D2h 0466 2] Length : 0028 [1D4h 0468 1] Flags : 00 [1D5h 0469 1] Reserved : 00 [1D6h 0470 2] PCI Segment Number : 0000 [1D8h 0472 1] Device Scope Entry Type : 02 [1D9h 0473 1] Entry Length : 08 [1DAh 0474 2] Reserved : 0000 [1DCh 0476 1] Enumeration ID : 00 [1DDh 0477 1] PCI Bus Number : 00 [1DEh 0478 2] PCI Path : [02, 00] [1E0h 0480 1] Device Scope Entry Type : 02 [1E1h 0481 1] Entry Length : 08 [1E2h 0482 2] Reserved : 0000 [1E4h 0484 1] Enumeration ID : 00 [1E5h 0485 1] PCI Bus Number : 00 [1E6h 0486 2] PCI Path : [03, 00] [1E8h 0488 1] Device Scope Entry Type : 02 [1E9h 0489 1] Entry Length : 08 [1EAh 0490 2] Reserved : 0000 [1ECh 0492 1] Enumeration ID : 00 [1EDh 0493 1] PCI Bus Number : 00 [1EEh 0494 2] PCI Path : [03, 02] [1F0h 0496 1] Device Scope Entry Type : 02 [1F1h 0497 1] Entry Length : 08 [1F2h 0498 2] Reserved : 0000 [1F4h 0500 1] Enumeration ID : 00 [1F5h 0501 1] PCI Bus Number : 00 [1F6h 0502 2] PCI Path : [03, 03] [1F8h 0504 2] Subtable Type : 0002 <Root Port ATS Capability> [1FAh 0506 2] Length : 0020 [1FCh 0508 1] Flags : 00 [1FDh 0509 1] Reserved : 00 [1FEh 0510 2] PCI Segment Number : 0000 [200h 0512 1] Device Scope Entry Type : 02 [201h 0513 1] Entry Length : 08 [202h 0514 2] Reserved : 0000 [204h 0516 1] Enumeration ID : 00 [205h 0517 1] PCI Bus Number : 40 [206h 0518 2] PCI Path : [02, 00] [208h 0520 1] Device Scope Entry Type : 02 [209h 0521 1] Entry Length : 08 [20Ah 0522 2] Reserved : 0000 [20Ch 0524 1] Enumeration ID : 00 [20Dh 0525 1] PCI Bus Number : 40 [20Eh 0526 2] PCI Path : [02, 02] [210h 0528 1] Device Scope Entry Type : 02 [211h 0529 1] Entry Length : 08 [212h 0530 2] Reserved : 0000 [214h 0532 1] Enumeration ID : 00 [215h 0533 1] PCI Bus Number : 40 [216h 0534 2] PCI Path : [03, 00] [218h 0536 2] Subtable Type : 0002 <Root Port ATS Capability> [21Ah 0538 2] Length : 0020 [21Ch 0540 1] Flags : 00 [21Dh 0541 1] Reserved : 00 [21Eh 0542 2] PCI Segment Number : 0000 [220h 0544 1] Device Scope Entry Type : 02 [221h 0545 1] Entry Length : 08 [222h 0546 2] Reserved : 0000 [224h 0548 1] Enumeration ID : 00 [225h 0549 1] PCI Bus Number : 80 [226h 0550 2] PCI Path : [02, 00] [228h 0552 1] Device Scope Entry Type : 02 [229h 0553 1] Entry Length : 08 [22Ah 0554 2] Reserved : 0000 [22Ch 0556 1] Enumeration ID : 00 [22Dh 0557 1] PCI Bus Number : 80 [22Eh 0558 2] PCI Path : [02, 02] [230h 0560 1] Device Scope Entry Type : 02 [231h 0561 1] Entry Length : 08 [232h 0562 2] Reserved : 0000 [234h 0564 1] Enumeration ID : 00 [235h 0565 1] PCI Bus Number : 80 [236h 0566 2] PCI Path : [03, 00] [238h 0568 2] Subtable Type : 0002 <Root Port ATS Capability> [23Ah 0570 2] Length : 0020 [23Ch 0572 1] Flags : 00 [23Dh 0573 1] Reserved : 00 [23Eh 0574 2] PCI Segment Number : 0000 [240h 0576 1] Device Scope Entry Type : 02 [241h 0577 1] Entry Length : 08 [242h 0578 2] Reserved : 0000 [244h 0580 1] Enumeration ID : 00 [245h 0581 1] PCI Bus Number : C0 [246h 0582 2] PCI Path : [02, 00] [248h 0584 1] Device Scope Entry Type : 02 [249h 0585 1] Entry Length : 08 [24Ah 0586 2] Reserved : 0000 [24Ch 0588 1] Enumeration ID : 00 [24Dh 0589 1] PCI Bus Number : C0 [24Eh 0590 2] PCI Path : [02, 02] [250h 0592 1] Device Scope Entry Type : 02 [251h 0593 1] Entry Length : 08 [252h 0594 2] Reserved : 0000 [254h 0596 1] Enumeration ID : 00 [255h 0597 1] PCI Bus Number : C0 [256h 0598 2] PCI Path : [03, 00] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:04 +01:00
Jiang Liu	a4eaa86c0c	iommu/vt-d: Check for NULL pointer when freeing IOMMU data structure Domain id 0 will be assigned to invalid translation without allocating domain data structure if DMAR unit supports caching mode. So in function free_dmar_iommu(), we should check whether the domain pointer is NULL, otherwise it will cause system crash as below: [ 6.790519] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 [ 6.799520] IP: [<ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430 [ 6.806493] PGD 0 [ 6.817972] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 6.823303] Modules linked in: [ 6.826862] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #126 [ 6.834252] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.1402050741 02/05/2014 [ 6.845951] task: ffff880455a80000 ti: ffff880455a88000 task.ti: ffff880455a88000 [ 6.854437] RIP: 0010:[<ffffffff810e2dc8>] [<ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430 [ 6.864154] RSP: 0000:ffff880455a89ce0 EFLAGS: 00010046 [ 6.870179] RAX: 0000000000000046 RBX: 0000000000000002 RCX: 0000000000000000 [ 6.878249] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000c8 [ 6.886318] RBP: ffff880455a89d40 R08: 0000000000000002 R09: 0000000000000001 [ 6.894387] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880455a80000 [ 6.902458] R13: 0000000000000000 R14: 00000000000000c8 R15: 0000000000000000 [ 6.910520] FS: 0000000000000000(0000) GS:ffff88045b800000(0000) knlGS:0000000000000000 [ 6.919687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.926198] CR2: 00000000000000c8 CR3: 0000000001e0e000 CR4: 00000000001407f0 [ 6.934269] Stack: [ 6.936588] ffffffffffffff10 ffffffff810f59db 0000000000000010 0000000000000246 [ 6.945219] ffff880455a89d10 0000000000000000 ffffffff82bcb980 0000000000000046 [ 6.953850] 0000000000000000 0000000000000000 0000000000000002 0000000000000000 [ 6.962482] Call Trace: [ 6.965300] [<ffffffff810f59db>] ? vprintk_emit+0x4fb/0x5a0 [ 6.971716] [<ffffffff810e3185>] lock_acquire+0x185/0x200 [ 6.977941] [<ffffffff821fbbee>] ? init_dmars+0x839/0xa1d [ 6.984167] [<ffffffff81870b06>] _raw_spin_lock_irqsave+0x56/0x90 [ 6.991158] [<ffffffff821fbbee>] ? init_dmars+0x839/0xa1d [ 6.997380] [<ffffffff821fbbee>] init_dmars+0x839/0xa1d [ 7.003410] [<ffffffff8147d575>] ? pci_get_dev_by_id+0x75/0xd0 [ 7.010119] [<ffffffff821fc146>] intel_iommu_init+0x2f0/0x502 [ 7.016735] [<ffffffff821a7947>] ? iommu_setup+0x27d/0x27d [ 7.023056] [<ffffffff821a796f>] pci_iommu_init+0x28/0x52 [ 7.029282] [<ffffffff81002162>] do_one_initcall+0xf2/0x220 [ 7.035702] [<ffffffff810a4a29>] ? parse_args+0x2c9/0x450 [ 7.041919] [<ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b [ 7.048919] [<ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a [ 7.055336] [<ffffffff8184d3f0>] ? rest_init+0x150/0x150 [ 7.061461] [<ffffffff8184d3fe>] kernel_init+0xe/0x100 [ 7.067393] [<ffffffff8187b5fc>] ret_from_fork+0x7c/0xb0 [ 7.073518] [<ffffffff8184d3f0>] ? rest_init+0x150/0x150 [ 7.079642] Code: 01 76 18 89 05 46 04 36 01 41 be 01 00 00 00 e9 2f 02 00 00 0f 1f 80 00 00 00 00 41 be 01 00 00 00 e9 1d 02 00 00 0f 1f 44 00 00 <49> 81 3e c0 31 34 82 b8 01 00 00 00 0f 44 d8 41 83 ff 01 0f 87 [ 7.104944] RIP [<ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430 [ 7.112008] RSP <ffff880455a89ce0> [ 7.115988] CR2: 00000000000000c8 [ 7.119784] ---[ end trace 13d756f0f462c538 ]--- [ 7.125034] note: swapper/0[1] exited with preempt_count 1 [ 7.131285] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 [ 7.131285] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:03 +01:00
Jiang Liu	9ebd682e5a	iommu/vt-d: Fix incorrect iommu_count for si_domain The iommu_count field in si_domain(static identity domain) is initialized to zero and never increases. It will underflow when tearing down iommu unit in function free_dmar_iommu() and leak memory. So refine code to correctly manage si_domain->iommu_count. Warning message caused by si_domain memory leak: [ 14.609681] IOMMU: Setting RMRR: [ 14.613496] Ignoring identity map for HW passthrough device 0000:00:1a.0 [0xbdcfd000 - 0xbdd1dfff] [ 14.623809] Ignoring identity map for HW passthrough device 0000:00:1d.0 [0xbdcfd000 - 0xbdd1dfff] [ 14.634162] IOMMU: Prepare 0-16MiB unity mapping for LPC [ 14.640329] Ignoring identity map for HW passthrough device 0000:00:1f.0 [0x0 - 0xffffff] [ 14.673360] IOMMU: dmar init failed [ 14.678157] kmem_cache_destroy iommu_devinfo: Slab cache still has objects [ 14.686076] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59 [ 14.694176] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012 [ 14.707412] 0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff880c2cc37c00 [ 14.716407] ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00 [ 14.725468] ffffffff81dc7a6a ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711 [ 14.734464] Call Trace: [ 14.737453] [<ffffffff8156223d>] dump_stack+0x4d/0x66 [ 14.743430] [<ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100 [ 14.750279] [<ffffffff81dc7a6a>] intel_iommu_init+0x122/0x56a [ 14.757035] [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d [ 14.763491] [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52 [ 14.769846] [<ffffffff81000342>] do_one_initcall+0x122/0x180 [ 14.776506] [<ffffffff81077738>] ? parse_args+0x1e8/0x320 [ 14.782866] [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c [ 14.789994] [<ffffffff81d84833>] ? do_early_param+0x88/0x88 [ 14.796556] [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0 [ 14.802626] [<ffffffff8154ffce>] kernel_init+0xe/0x130 [ 14.808698] [<ffffffff815756ac>] ret_from_fork+0x7c/0xb0 [ 14.814963] [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0 [ 14.821640] kmem_cache_destroy iommu_domain: Slab cache still has objects [ 14.829456] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59 [ 14.837562] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012 [ 14.850803] 0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff88102c1ee3c0 [ 14.861222] ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00 [ 14.870284] ffffffff81dc7a76 ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711 [ 14.879271] Call Trace: [ 14.882227] [<ffffffff8156223d>] dump_stack+0x4d/0x66 [ 14.888197] [<ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100 [ 14.895034] [<ffffffff81dc7a76>] intel_iommu_init+0x12e/0x56a [ 14.901781] [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d [ 14.908238] [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52 [ 14.914594] [<ffffffff81000342>] do_one_initcall+0x122/0x180 [ 14.921244] [<ffffffff81077738>] ? parse_args+0x1e8/0x320 [ 14.927598] [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c [ 14.934738] [<ffffffff81d84833>] ? do_early_param+0x88/0x88 [ 14.941309] [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0 [ 14.947380] [<ffffffff8154ffce>] kernel_init+0xe/0x130 [ 14.953430] [<ffffffff815756ac>] ret_from_fork+0x7c/0xb0 [ 14.959689] [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0 [ 14.966299] kmem_cache_destroy iommu_iova: Slab cache still has objects [ 14.973923] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59 [ 14.982020] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012 [ 14.995263] 0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff88042cb5c980 [ 15.004265] ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00 [ 15.013322] ffffffff81dc7a82 ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711 [ 15.022318] Call Trace: [ 15.025238] [<ffffffff8156223d>] dump_stack+0x4d/0x66 [ 15.031202] [<ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100 [ 15.038038] [<ffffffff81dc7a82>] intel_iommu_init+0x13a/0x56a [ 15.044786] [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d [ 15.051242] [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52 [ 15.057601] [<ffffffff81000342>] do_one_initcall+0x122/0x180 [ 15.064254] [<ffffffff81077738>] ? parse_args+0x1e8/0x320 [ 15.070608] [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c [ 15.077747] [<ffffffff81d84833>] ? do_early_param+0x88/0x88 [ 15.084300] [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0 [ 15.090362] [<ffffffff8154ffce>] kernel_init+0xe/0x130 [ 15.096431] [<ffffffff815756ac>] ret_from_fork+0x7c/0xb0 [ 15.102693] [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0 [ 15.189273] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:02 +01:00
Jiang Liu	92d03cc8d0	iommu/vt-d: Reduce duplicated code to handle virtual machine domains Reduce duplicated code to handle virtual machine domains, there's no functionality changes. It also improves code readability. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:01 +01:00
Jiang Liu	e85bb5d4d1	iommu/vt-d: Free resources if failed to create domain for PCIe endpoint Enhance function get_domain_for_dev() to release allocated resources if failed to create domain for PCIe endpoint, otherwise the allocated resources will get lost. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:01 +01:00
Jiang Liu	745f2586e7	iommu/vt-d: Simplify function get_domain_for_dev() Function get_domain_for_dev() is a little complex, simplify it by factoring out dmar_search_domain_by_dev_info() and dmar_insert_dev_info(). Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:01 +01:00
Jiang Liu	b94e4117f8	iommu/vt-d: Move private structures and variables into intel-iommu.c Move private structures and variables into intel-iommu.c, which will help to simplify locking policy for hotplug. Also delete redundant declarations. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:00 +01:00
Jiang Liu	7e7dfab71a	iommu/vt-d: Avoid caching stale domain_device_info when hot-removing PCI device Function device_notifier() in intel-iommu.c only remove domain_device_info data structure associated with a PCI device when handling PCI device driver unbinding events. If a PCI device has never been bound to a PCI device driver, there won't be BUS_NOTIFY_UNBOUND_DRIVER event when hot-removing the PCI device. So associated domain_device_info data structure may get lost. On the other hand, if iommu_pass_through is enabled, function iommu_prepare_static_indentify_mapping() will create domain_device_info data structure for each PCIe to PCIe bridge and PCIe endpoint, no matter whether there are drivers associated with those PCIe devices or not. So those domain_device_info data structures will get lost when hot-removing the assocated PCIe devices if they have never bound to any PCI device driver. To be even worse, it's not only an memory leak issue, but also an caching of stale information bug because the memory are kept in device_domain_list and domain->devices lists. Fix the bug by trying to remove domain_device_info data structure when handling BUS_NOTIFY_DEL_DEVICE event. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:51:00 +01:00
Jiang Liu	816997d03b	iommu/vt-d: Avoid caching stale domain_device_info and fix memory leak Function device_notifier() in intel-iommu.c fails to remove device_domain_info data structures for PCI devices if they are associated with si_domain because iommu_no_mapping() returns true for those PCI devices. This will cause memory leak and caching of stale information in domain->devices list. So fix the issue by not calling iommu_no_mapping() and skipping check of iommu_pass_through. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:50:59 +01:00
Jiang Liu	989d51fc99	iommu/vt-d: Avoid double free of g_iommus on error recovery path Array 'g_iommus' may be freed twice on error recovery path in function init_dmars() and free_dmar_iommu(), thus cause random system crash as below. [ 6.774301] IOMMU: dmar init failed [ 6.778310] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) [ 6.785615] software IO TLB [mem 0x76bcf000-0x7abcf000] (64MB) mapped at [ffff880076bcf000-ffff88007abcefff] [ 6.796887] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [ 6.804173] Modules linked in: [ 6.807731] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #108 [ 6.815122] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.1402050741 02/05/2014 [ 6.836000] task: ffff880455a80000 ti: ffff880455a88000 task.ti: ffff880455a88000 [ 6.844487] RIP: 0010:[<ffffffff8143eea6>] [<ffffffff8143eea6>] memcpy+0x6/0x110 [ 6.853039] RSP: 0000:ffff880455a89cc8 EFLAGS: 00010293 [ 6.859064] RAX: ffff006568636163 RBX: ffff00656863616a RCX: 0000000000000005 [ 6.867134] RDX: 0000000000000005 RSI: ffffffff81cdc439 RDI: ffff006568636163 [ 6.875205] RBP: ffff880455a89d30 R08: 000000000001bc3b R09: 0000000000000000 [ 6.883275] R10: 0000000000000000 R11: ffffffff81cdc43e R12: ffff880455a89da8 [ 6.891338] R13: ffff006568636163 R14: 0000000000000005 R15: ffffffff81cdc439 [ 6.899408] FS: 0000000000000000(0000) GS:ffff88045b800000(0000) knlGS:0000000000000000 [ 6.908575] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.915088] CR2: ffff88047e1ff000 CR3: 0000000001e0e000 CR4: 00000000001407f0 [ 6.923160] Stack: [ 6.925487] ffffffff8143c904 ffff88045b407e00 ffff006568636163 ffff006568636163 [ 6.934113] ffffffff8120a1a9 ffffffff81cdc43e 0000000000000007 0000000000000000 [ 6.942747] ffff880455a89da8 ffff006568636163 0000000000000007 ffffffff81cdc439 [ 6.951382] Call Trace: [ 6.954197] [<ffffffff8143c904>] ? vsnprintf+0x124/0x6f0 [ 6.960323] [<ffffffff8120a1a9>] ? __kmalloc_track_caller+0x169/0x360 [ 6.967716] [<ffffffff81440e1b>] kvasprintf+0x6b/0x80 [ 6.973552] [<ffffffff81432bf1>] kobject_set_name_vargs+0x21/0x70 [ 6.980552] [<ffffffff8143393d>] kobject_init_and_add+0x4d/0x90 [ 6.987364] [<ffffffff812067c9>] ? __kmalloc+0x169/0x370 [ 6.993492] [<ffffffff8102dbbc>] ? cache_add_dev+0x17c/0x4f0 [ 7.000005] [<ffffffff8102ddfa>] cache_add_dev+0x3ba/0x4f0 [ 7.006327] [<ffffffff821a87ca>] ? i8237A_init_ops+0x14/0x14 [ 7.012842] [<ffffffff821a87f8>] cache_sysfs_init+0x2e/0x61 [ 7.019260] [<ffffffff81002162>] do_one_initcall+0xf2/0x220 [ 7.025679] [<ffffffff810a4a29>] ? parse_args+0x2c9/0x450 [ 7.031903] [<ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b [ 7.038904] [<ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a [ 7.045322] [<ffffffff8184d5e0>] ? rest_init+0x150/0x150 [ 7.051447] [<ffffffff8184d5ee>] kernel_init+0xe/0x100 [ 7.057380] [<ffffffff8187b87c>] ret_from_fork+0x7c/0xb0 [ 7.063503] [<ffffffff8184d5e0>] ? rest_init+0x150/0x150 [ 7.069628] Code: 89 e5 53 48 89 fb 75 16 80 7f 3c 00 75 05 e8 d2 f9 ff ff 48 8b 43 58 48 2b 43 50 88 43 4e 5b 5d c3 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b [ 7.094960] RIP [<ffffffff8143eea6>] memcpy+0x6/0x110 [ 7.100856] RSP <ffff880455a89cc8> [ 7.104864] ---[ end trace b5d3fdc6c6c28083 ]--- [ 7.110142] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 7.110142] [ 7.120540] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff) Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-03-04 17:50:59 +01:00
Linus Torvalds	b3a4bcaa5a	IOMMU Updates for Linux v3.14 A few patches have been queued up for this merge window: * Improvements for the ARM-SMMU driver (IOMMU_EXEC support, IOMMU group support) * Updates and fixes for the shmobile IOMMU driver * Various fixes to generic IOMMU code and the Intel IOMMU driver * Some cleanups in IOMMU drivers (dev_is_pci() usage) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJS6XrCAAoJECvwRC2XARrjgy4P/itemtg2U+603Ldje8WcPo0E OCO/0VVSmCTYKUJDZY0hiVwqmhe5gFL3Hm/gGwkS0+UJenFXMmi+aVaPp4pCpgH+ dL2HD3dIEvi14bisrdxG/8MdR6mIx0qzKtnZLkKSR4LXwucyLHvC/DaCoOytb7Yk 7s+eEuo0hj0jAkiqSG/zLEtKElTEnoAAkLOjMy46orecJ5q4HusPZekLtWZs2ETe x3NS63Unb9g1iSQJWIA7HnQlxWIr2+iynoamHHJRiVFzqRF0W0sGvQY3auG0DSCn 70WRNE1rKfEkfXMJxosRQ4394YUQdAkt8MBENNcJcC6E1n5PBi0cEZXH6mCnEIlG jXzIKUY9fz68ZboaqIxXv4Hb+JLlPXCvPBvQzIQiKRgVxd8nncEjn5I9MHdf+je5 BmJlzJLJvP4cFvW8Hc8k2Oq101b1kEcSCLARWWvE9/bk9xIUyrqBkR4XjC0vb6qq 1HbKVdZ7KFKCkBHy9xMpr7CUjKiDiiLeUmqlhyjcK9spicuNIZQnC11HemL6/USP oR6Ext9RGhvz+ch656+5+L6f6FURVP8/ywKiJ3RjmvXV5/fCYo3WMitOB2qzlWCy SYXAczAOMOdOo+1Dxbghrr+7HzUWPqgfPmntZEPGMZhfuZ6xXr+7pGLjAhHb4vcR SZxqkDo1cprqrR9KFAWC =YKLk -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU Updates from Joerg Roedel: "A few patches have been queued up for this merge window: - improvements for the ARM-SMMU driver (IOMMU_EXEC support, IOMMU group support) - updates and fixes for the shmobile IOMMU driver - various fixes to generic IOMMU code and the Intel IOMMU driver - some cleanups in IOMMU drivers (dev_is_pci() usage)" * tag 'iommu-updates-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (36 commits) iommu/vt-d: Fix signedness bug in alloc_irte() iommu/vt-d: free all resources if failed to initialize DMARs iommu/vt-d, trivial: clean sparse warnings iommu/vt-d: fix wrong return value of dmar_table_init() iommu/vt-d: release invalidation queue when destroying IOMMU unit iommu/vt-d: fix access after free issue in function free_dmar_iommu() iommu/vt-d: keep shared resources when failed to initialize iommu devices iommu/vt-d: fix invalid memory access when freeing DMAR irq iommu/vt-d, trivial: simplify code with existing macros iommu/vt-d, trivial: use defined macro instead of hardcoding iommu/vt-d: mark internal functions as static iommu/vt-d, trivial: clean up unused code iommu/vt-d, trivial: check suitable flag in function detect_intel_iommu() iommu/vt-d, trivial: print correct domain id of static identity domain iommu/vt-d, trivial: refine support of 64bit guest address iommu/vt-d: fix resource leakage on error recovery path in iommu_init_domains() iommu/vt-d: fix a race window in allocating domain ID for virtual machines iommu/vt-d: fix PCI device reference leakage on error recovery path drm/msm: Fix link error with !MSM_IOMMU iommu/vt-d: use dedicated bitmap to track remapping entry allocation status ...	2014-01-29 20:00:13 -08:00
Alex Williamson	08336fd218	intel-iommu: fix off-by-one in pagetable freeing dma_pte_free_level() has an off-by-one error when checking whether a pte is completely covered by a range. Take for example the case of attempting to free pfn 0x0 - 0x1ff, ie. 512 entries covering the first 2M superpage. The level_size() is 0x200 and we test: static void dma_pte_free_level(... ... if (!(0 > 0 \|\| 0x1ff < 0 + 0x200)) { ... } Clearly the 2nd test is true, which means we fail to take the branch to clear and free the pagetable entry. As a result, we're leaking pagetables and failing to install new pages over the range. This was found with a PCI device assigned to a QEMU guest using vfio-pci without a VGA device present. The first 1M of guest address space is mapped with various combinations of 4K pages, but eventually the range is entirely freed and replaced with a 2M contiguous mapping. intel-iommu errors out with something like: ERROR: DMA PTE for vPFN 0x0 already set (to 5c2b8003 not 849c00083) In this case 5c2b8003 is the pointer to the previous leaf page that was neither freed nor cleared and 849c00083 is the superpage entry that we're trying to replace it with. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:41 -08:00
Jiang Liu	9bdc531ec6	iommu/vt-d: free all resources if failed to initialize DMARs Enhance intel_iommu_init() to free all resources if failed to initialize DMAR hardware. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <joro@8bytes.org>	2014-01-09 12:44:30 +01:00

... 3 4 5 6 7 ...

534 Commits