Commit Graph

25 Commits

Author SHA1 Message Date
Joerg Roedel 201c1db90c iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVA
The stub function for !CONFIG_IOMMU_IOVA needs to be
'static inline'.

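A minimal sketch of the fix, assuming the stub in question is the
has_iova_flush_queue() helper introduced by the commit below:

    /* include/linux/iova.h, !CONFIG_IOMMU_IOVA branch */
    static inline bool has_iova_flush_queue(struct iova_domain *iovad)
    {
            return false;
    }
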
Fixes: effa467870 ('iommu/vt-d: Don't queue_iova() if there is no flush queue')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-23 09:51:00 +02:00
Dmitry Safonov effa467870 iommu/vt-d: Don't queue_iova() if there is no flush queue
The Intel VT-d driver was reworked to use the common deferred flushing
implementation. Previously there was one global per-cpu flush queue;
afterwards, one per domain.

Before deferring a flush, the queue should be allocated and initialized.

Currently only domains of type IOMMU_DOMAIN_DMA initialize their flush
queue. It's probably worth initializing it for static or unmanaged domains
too, but that may be arguable - I'm leaving it to the iommu folks.

Prevent queuing an iova flush if the domain doesn't have a queue.
The defensive check seems worth keeping even if the queue were
initialized for all kinds of domains, and it is easily backportable.

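Roughly the shape of the check (a sketch; names assumed from this
series, not the exact hunk):

    /* drivers/iommu/iova.c */
    bool has_iova_flush_queue(struct iova_domain *iovad)
    {
            return !!iovad->fq;
    }

    /* drivers/iommu/intel-iommu.c, in intel_unmap(): flush directly
     * instead of deferring when the domain has no flush queue */
    if (!has_iova_flush_queue(&domain->iovad)) {
            iommu_flush_iotlb_psi(iommu, domain, start_pfn,
                                  nrpages, !freelist, 0);
            free_iova_fast(&domain->iovad, iova_pfn, nrpages);
    } else {
            queue_iova(&domain->iovad, iova_pfn, nrpages,
                       (unsigned long)freelist);
    }
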
On the 4.19.43 stable kernel this has a user-visible effect: previously
there were crashes for devices in the si domain, e.g. on sata devices:

 BUG: spinlock bad magic on CPU#6, swapper/0/1
  lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1
 Call Trace:
  <IRQ>
  dump_stack+0x61/0x7e
  spin_bug+0x9d/0xa3
  do_raw_spin_lock+0x22/0x8e
  _raw_spin_lock_irqsave+0x32/0x3a
  queue_iova+0x45/0x115
  intel_unmap+0x107/0x113
  intel_unmap_sg+0x6b/0x76
  __ata_qc_complete+0x7f/0x103
  ata_qc_complete+0x9b/0x26a
  ata_qc_complete_multiple+0xd0/0xe3
  ahci_handle_port_interrupt+0x3ee/0x48a
  ahci_handle_port_intr+0x73/0xa9
  ahci_single_level_irq_intr+0x40/0x60
  __handle_irq_event_percpu+0x7f/0x19a
  handle_irq_event_percpu+0x32/0x72
  handle_irq_event+0x38/0x56
  handle_edge_irq+0x102/0x121
  handle_irq+0x147/0x15c
  do_IRQ+0x66/0xf2
  common_interrupt+0xf/0xf
 RIP: 0010:__do_softirq+0x8c/0x2df

The same happens for usb devices that use ehci-pci:
 BUG: spinlock bad magic on CPU#0, swapper/0/1
  lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4
 Call Trace:
  <IRQ>
  dump_stack+0x61/0x7e
  spin_bug+0x9d/0xa3
  do_raw_spin_lock+0x22/0x8e
  _raw_spin_lock_irqsave+0x32/0x3a
  queue_iova+0x77/0x145
  intel_unmap+0x107/0x113
  intel_unmap_page+0xe/0x10
  usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d
  usb_hcd_unmap_urb_for_dma+0x17/0x100
  unmap_urb_for_dma+0x22/0x24
  __usb_hcd_giveback_urb+0x51/0xc3
  usb_giveback_urb_bh+0x97/0xde
  tasklet_action_common.isra.4+0x5f/0xa1
  tasklet_action+0x2d/0x30
  __do_softirq+0x138/0x2df
  irq_exit+0x7d/0x8b
  smp_apic_timer_interrupt+0x10f/0x151
  apic_timer_interrupt+0xf/0x20
  </IRQ>
 RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: <stable@vger.kernel.org> # 4.14+
Fixes: 13cf017446 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22 17:43:06 +02:00
Thomas Gleixner 55716d2643 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428
Based on 1 normalized pattern(s):

  this file is released under the gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 68 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:16 +02:00
Jinyu Qi 14bd9a607f iommu/iova: Separate atomic variables to improve performance
In struct iova_domain there are three atomic variables: the first two
are TLB flush counters updated with atomic_add operations, and the
third drives the flush timer and is updated with cmpxchg operations.
These variables share a cache line, which causes a performance loss
when many cores call the queue_iova function. Let's separate the two
kinds of atomic variables onto different cache lines to reduce
cache-line contention.

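A sketch of the idea; the counter names exist in the iova code, but the
exact layout here is illustrative only:

    struct iova_domain {
            ...
            /* bumped with atomic64 adds on every TLB flush */
            atomic64_t      fq_flush_start_cnt;
            atomic64_t      fq_flush_finish_cnt;
            ...
            /* toggled with atomic_cmpxchg() by the flush timer; kept
             * out of the counters' cache line to avoid false sharing */
            atomic_t        fq_timer_on;
            ...
    };
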
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Jinyu Qi <jinyuqi@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-11 15:42:54 +02:00
Ganapatrao Kulkarni bee60e94a1 iommu/iova: Optimise attempts to allocate iova from 32bit address range
As an optimisation for PCI devices, a first attempt is always made to
allocate an iova from the SAC address range. This leads to unnecessary
attempts when there are no free ranges available. Fix this by tracking
the size of recently failed iova allocations and allowing further
attempts only if the requested size is smaller than the failed size.
The tracked size is reset whenever the space is replenished.

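A sketch of the mechanism, assuming a max32_alloc_size field as
described above (illustrative, not the exact diff):

    /* in __alloc_and_insert_iova_range(): skip a walk that is known
     * to fail */
    if (limit_pfn <= iovad->dma_32bit_pfn &&
        size >= iovad->max32_alloc_size)
            return -ENOMEM;

    /* on allocation failure: remember the size that did not fit */
    iovad->max32_alloc_size = size;

    /* on free (replenish): allow attempts of any size again */
    iovad->max32_alloc_size = iovad->dma_32bit_pfn;
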
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-09-25 10:18:27 +02:00
Tomasz Nowicki 538d5b3332 iommu/iova: Make rcache flush optional on IOVA allocation failure
Since IOVA allocation failure is not an unusual case, we need to flush
the CPUs' rcaches in the hope that we will succeed in the next round.

However, it is useful to decide whether the rcache flush step is needed,
for two reasons:
- Scalability. On a large system with ~100 CPUs, iterating over and
  flushing the rcache of each CPU becomes a serious bottleneck, so we
  may want to defer it.
- free_cpu_cached_iovas() does not care about the max PFN we are
  interested in. Thus we may flush our rcaches and still get no new
  IOVA, as in the commonly used scenario:

    if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
        iova = alloc_iova_fast(iovad, iova_len, DMA_BIT_MASK(32) >> shift);

    if (!iova)
        iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift);

   1. The first alloc_iova_fast() call is limited to DMA_BIT_MASK(32)
      to get PCI devices a SAC address
   2. alloc_iova() fails because the 32-bit space is full
   3. The rcaches contain PFNs outside the 32-bit space, so
      free_cpu_cached_iovas() throws entries away for nothing and
      alloc_iova() fails again
   4. The next alloc_iova_fast() call cannot take advantage of the
      rcache, since we have just defeated the caches. In this case we
      proceed via the slowest option.

This patch reworks the flushed_rcache local flag into an additional
function argument that controls the rcache flush step. It also updates
all users to do the flush only as a last resort.

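With the new argument, the callers above become roughly this (a sketch,
assuming the parameter is named flush_rcache):

    /* try a SAC address first, without trashing the rcaches */
    if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
        iova = alloc_iova_fast(iovad, iova_len,
                               DMA_BIT_MASK(32) >> shift, false);

    /* last chance: this time allow the rcache flush */
    if (!iova)
        iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift, true);
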
Signed-off-by: Tomasz Nowicki <Tomasz.Nowicki@caviumnetworks.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-10-12 14:18:02 +02:00
Robin Murphy bb68b2fbfb iommu/iova: Add rbtree anchor node
Add a permanent dummy IOVA reservation to the rbtree, such that we can
always access the top of the address space instantly. The immediate
benefit is that we remove the overhead of the rb_last() traversal when
not using the cached node, but it also paves the way for further
simplifications.

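A sketch of the reservation, assuming an IOVA_ANCHOR marker at the top
of the space (illustrative):

    #define IOVA_ANCHOR     ~0UL

    /* in init_iova_domain(): a permanent dummy entry at the very top,
     * so rb_last() is never needed to find it */
    iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
    rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
    rb_insert_color(&iovad->anchor.node, &iovad->rbroot);
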
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-27 17:09:57 +02:00
Zhen Lei aa3ac9469c iommu/iova: Make dma_32bit_pfn implicit
Now that the cached node optimisation can apply to all allocations, the
couple of users which were playing tricks with dma_32bit_pfn in order to
benefit from it can stop doing so. Conversely, there is also no need for
all the other users to explicitly calculate a 'real' 32-bit PFN, when
init_iova_domain() can happily do that itself from the page granularity.

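The init path presumably ends up with this shape (a sketch):

    void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
                          unsigned long start_pfn)
    {
            ...
            /* derive the 32-bit boundary from the page granularity,
             * instead of taking it as a parameter */
            iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
            ...
    }
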
CC: Thierry Reding <thierry.reding@gmail.com>
CC: Jonathan Hunter <jonathanh@nvidia.com>
CC: David Airlie <airlied@linux.ie>
CC: Sudeep Dutt <sudeep.dutt@intel.com>
CC: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
[rm: use iova_shift(), rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-27 17:09:57 +02:00
Robin Murphy e60aa7b538 iommu/iova: Extend rbtree node caching
The cached node mechanism provides a significant performance benefit for
allocations using a 32-bit DMA mask, but in the case of non-PCI devices
or where the 32-bit space is full, the loss of this benefit can be
significant - on large systems there can be many thousands of entries in
the tree, such that walking all the way down to find free space every
time becomes increasingly awful.

Maintain a similar cached node for the whole IOVA space as a superset of
the 32-bit space so that performance can remain much more consistent.

Inspired by work by Zhen Lei <thunder.leizhen@huawei.com>.

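A sketch of the resulting state (field names assumed from the iova
code):

    struct iova_domain {
            spinlock_t      iova_rbtree_lock;
            struct rb_root  rbroot;
            struct rb_node  *cached_node;   /* cached node, whole space */
            struct rb_node  *cached32_node; /* cached node, 32-bit space */
            ...
    };
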
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-09-27 17:09:57 +02:00
Joerg Roedel 9a005a800a iommu/iova: Add flush timer
Add a timer to flush entries from the Flush-Queues every
10ms. This makes sure that no stale TLB entries remain for
too long after an IOVA has been unmapped.

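The arming logic is roughly this (a sketch; names assumed from this
series):

    #define IOVA_FQ_TIMEOUT 10      /* ms */

    /* in queue_iova(): arm the timer once; the first CPU to flip the
     * flag wins */
    if (!atomic_read(&iovad->fq_timer_on) &&
        !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1))
            mod_timer(&iovad->fq_timer,
                      jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
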
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:52 +02:00
Joerg Roedel 8109c2a2f8 iommu/iova: Add locking to Flush-Queues
The lock is taken from the same CPU most of the time, but having it
allows the queue to be flushed from another CPU as well if necessary.

This will be used by a timer to regularly flush any pending IOVAs from
the Flush-Queues.

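A sketch of the pattern (assumed shape):

    struct iova_fq {
            struct iova_fq_entry entries[IOVA_FQ_SIZE];
            unsigned head, tail;
            spinlock_t lock;
    };

    /* in queue_iova(): ring manipulation now happens under fq->lock,
     * so a remote CPU can drain the queue safely too */
    spin_lock_irqsave(&fq->lock, flags);
    ...
    spin_unlock_irqrestore(&fq->lock, flags);
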
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:52 +02:00
Joerg Roedel fb418dab8a iommu/iova: Add flush counters to Flush-Queue implementation
There are two counters:

	* fq_flush_start_cnt  - Increased when a TLB flush
	                        is started.

	* fq_flush_finish_cnt - Increased when a TLB flush
				is finished.

The fq_flush_start_cnt is assigned to every Flush-Queue
entry on its creation. When freeing entries from the
Flush-Queue, the value in the entry is compared to the
fq_flush_finish_cnt. The entry can only be freed when its
value is less than the value of fq_flush_finish_cnt.

The reason for these counters is to take advantage of IOMMU TLB flushes
that happened on other CPUs. Those flushes already covered the
Flush-Queue entries queued on other CPUs, so such entries can be freed
without flushing the TLB again.

This makes it less likely that the Flush-Queue is full and
saves IOMMU TLB flushes.

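In code, the comparison presumably looks like this (a sketch):

    /* in queue_iova(): stamp each new entry with the current
     * start count */
    fq->entries[idx].counter = atomic64_read(&iovad->fq_flush_start_cnt);

    /* in fq_ring_free(): only free entries whose flush has finished */
    counter = atomic64_read(&iovad->fq_flush_finish_cnt);

    fq_ring_for_each(idx, fq) {
            if (fq->entries[idx].counter >= counter)
                    break;
            /* a TLB flush on some CPU already covered this entry */
            ...
    }
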
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:51 +02:00
Joerg Roedel 1928210107 iommu/iova: Implement Flush-Queue ring buffer
Add a function to add entries to the Flush-Queue ring
buffer. If the buffer is full, call the flush-callback and
free the entries.

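A sketch of the add path (assumed shape):

    /* in queue_iova() */
    fq = raw_cpu_ptr(iovad->fq);

    if (fq_full(fq)) {
            /* no room left: flush the IOMMU TLB and drain the ring */
            iovad->flush_cb(iovad);
            fq_ring_free(iovad, fq);
    }

    idx = fq_ring_add(fq);
    fq->entries[idx].iova_pfn = pfn;
    fq->entries[idx].pages    = pages;
    fq->entries[idx].data     = data;
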
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:51 +02:00
Joerg Roedel 42f87e71c3 iommu/iova: Add flush-queue data structures
This patch adds the basic data structures to implement
flush-queues in the generic IOVA code. It also adds the
initialization and destroy routines for these data
structures.

The initialization routine is designed so that the use of
this feature is optional for the users of IOVA code.

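The structures presumably take this shape (a sketch; the ring size is
an assumption):

    #define IOVA_FQ_SIZE    256

    struct iova_fq_entry {
            unsigned long iova_pfn;
            unsigned long pages;
            unsigned long data;
    };

    struct iova_fq {
            struct iova_fq_entry entries[IOVA_FQ_SIZE];
            unsigned head, tail;
    };

    /* opt-in initialization, with a driver-supplied flush callback */
    int init_iova_flush_queue(struct iova_domain *iovad,
                              iova_flush_cb flush_cb,
                              iova_entry_dtor entry_dtor);
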
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-08-15 18:23:50 +02:00
Joerg Roedel b4d8c7aea1 iommu/iova: Fix compile error with CONFIG_IOMMU_IOVA=m
The #ifdef in iova.h only catches the CONFIG_IOMMU_IOVA=y
case, so that compilation as a module fails with duplicate
function definition errors. Fix it by catching both cases in
the #if.

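The fix is presumably the usual pattern for tristate symbols:

    -#ifdef CONFIG_IOMMU_IOVA
    +#if IS_ENABLED(CONFIG_IOMMU_IOVA)

IS_ENABLED() is true for both =y and =m, so the module build no longer
sees duplicate definitions.
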
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-23 00:06:17 +01:00
Thierry Reding 21aff52ab2 iommu: Add dummy implementations for !IOMMU_IOVA
Currently, building code which uses the API guarded by the IOMMU_IOVA
Kconfig symbol will fail to link if IOMMU_IOVA is not enabled. Often
this code will also use the API provided by the IOMMU_API Kconfig
symbol, but support for that can be optional, with the code falling
back to contiguous memory. This commit implements dummy functions for
the IOVA API so that it can be compiled out.

With both IOMMU_API and IOMMU_IOVA optional, code can now be built with
or without support for IOMMU without having to resort to #ifdefs in the
user code.

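One of the stubs, as a sketch (alloc_iova()'s signature with a no-op
body):

    static inline struct iova *alloc_iova(struct iova_domain *iovad,
                                          unsigned long size,
                                          unsigned long limit_pfn,
                                          bool size_aligned)
    {
            return NULL;    /* behaves as if no IOVA space exists */
    }
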
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-22 15:54:07 +01:00
Omer Peleg 9257b4a206 iommu/iova: introduce per-cpu caching to iova allocation
IOVA allocation has two problems that impede high-throughput I/O.
First, it can do a linear search over the allocated IOVA ranges.
Second, the rbtree spinlock that serializes IOVA allocations becomes
contended.

Address these problems by creating an API for caching allocated IOVA
ranges, so that the IOVA allocator isn't accessed frequently.  This
patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs
without taking the rbtree spinlock.  The per-CPU caches are backed by
a global cache, so that the (linear-time) IOVA allocator can be
avoided without making the per-CPU cache size excessive.  This design
is based on magazines, as described in "Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources" (currently
available at https://www.usenix.org/legacy/event/usenix01/bonwick.html).

Adding caching on top of the existing rbtree allocator maintains the
property that IOVAs are densely packed in the IO virtual address space,
which is important for keeping IOMMU page table usage low.

To keep the cache size reasonable, we bound the IOVA space a CPU can
cache by 32 MiB (we cache a bounded number of IOVA ranges, and only
ranges of size <= 128 KiB).  The shared global cache is bounded at
4 MiB of IOVA space.

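A sketch of the resulting layout (structure names assumed; the sizes
follow from the message above):

    struct iova_magazine {
            unsigned long size;
            unsigned long pfns[IOVA_MAG_SIZE];
    };

    struct iova_cpu_rcache {
            spinlock_t lock;
            struct iova_magazine *loaded;   /* magazine in active use */
            struct iova_magazine *prev;     /* one spare magazine */
    };

    struct iova_rcache {
            spinlock_t lock;
            unsigned long depot_size;
            struct iova_magazine *depot[MAX_GLOBAL_MAGS]; /* global backing */
            struct iova_cpu_rcache __percpu *cpu_rcaches;
    };
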
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2016-04-20 15:42:24 -04:00
Sakari Ailus ae1ff3d623 iommu: iova: Move iova cache management to the iova library
This is necessary to separate intel-iommu from the iova library.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2015-07-28 15:47:58 +01:00
Robin Murphy 0fb5fe874c iommu: Make IOVA domain page size explicit
Systems may contain heterogeneous IOMMUs supporting differing minimum
page sizes, which may also differ from the CPU page size.
Thus it is practical to have an explicit notion of IOVA granularity
to simplify handling of mapping and allocation constraints.

As an initial step, move the IOVA page granularity from an implicit
compile-time constant to a per-domain property so we can make use
of it in IOVA domain context at runtime. To keep the abstraction tidy,
extend the little API of inline iova_* helpers to parallel some of the
equivalent PAGE_* macros.

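The helpers parallel PAGE_SHIFT/PAGE_MASK/PAGE_ALIGN, roughly:

    static inline unsigned long iova_shift(struct iova_domain *iovad)
    {
            return __ffs(iovad->granule);
    }

    static inline unsigned long iova_mask(struct iova_domain *iovad)
    {
            return iovad->granule - 1;
    }

    static inline size_t iova_offset(struct iova_domain *iovad, dma_addr_t iova)
    {
            return iova & iova_mask(iovad);
    }

    static inline size_t iova_align(struct iova_domain *iovad, size_t size)
    {
            return ALIGN(size, iovad->granule);
    }
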
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-01-19 14:55:22 +01:00
Robin Murphy 1b72250076 iommu: Make IOVA domain low limit flexible
To share the IOVA allocator with other architectures, it needs to
accommodate more general aperture restrictions; move the lower limit
from a compile-time constant to a runtime domain property to allow
IOVA domains with different requirements to co-exist.

Also reword the slightly unclear description of alloc_iova since we're
touching it anyway.

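A sketch of the reworked init at this point in the series (the granule
parameter arrives with the page-size patch listed above; shape
assumed):

    void init_iova_domain(struct iova_domain *iovad,
                          unsigned long start_pfn, unsigned long pfn_32bit)
    {
            spin_lock_init(&iovad->iova_rbtree_lock);
            iovad->rbroot = RB_ROOT;
            iovad->cached32_node = NULL;
            iovad->start_pfn = start_pfn;   /* was a compile-time constant */
            iovad->dma_32bit_pfn = pfn_32bit;
    }
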
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-01-19 14:55:22 +01:00
Robin Murphy 85b4545629 iommu: Consolidate IOVA allocator code
In order to share the IOVA allocator with other architectures, break
the unnecessary dependency on the Intel IOMMU driver and move the
remaining IOVA internals to iova.c

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-01-19 14:55:22 +01:00
Jiang Liu a156ef99e8 iommu/vt-d: Introduce helper function iova_size() to improve code readability
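From the name alone, the helper is presumably:

    static inline unsigned long iova_size(struct iova *iova)
    {
            return iova->pfn_hi - iova->pfn_lo + 1;
    }
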
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-07-23 16:04:47 +02:00
Jiang Liu 75f05569d0 iommu/vt-d: Update IOMMU state when memory hotplug happens
If a static identity domain is created, the IOMMU driver needs to
update the si_domain page table when a memory hotplug event happens.
Otherwise PCI device DMA operations can't access the hot-added memory
regions.

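A sketch of the wiring (assumed shape, using the generic memory
notifier API):

    static int intel_iommu_memory_notifier(struct notifier_block *nb,
                                           unsigned long val, void *v)
    {
            switch (val) {
            case MEM_GOING_ONLINE:
                    /* extend si_domain's identity map over the new range */
                    break;
            case MEM_OFFLINE:
            case MEM_CANCEL_ONLINE:
                    /* tear the identity mapping down again */
                    break;
            }
            return NOTIFY_OK;
    }

    static struct notifier_block intel_iommu_memory_nb = {
            .notifier_call = intel_iommu_memory_notifier,
    };

    /* at driver init */
    register_memory_notifier(&intel_iommu_memory_nb);
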
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:06 +01:00
David Woodhouse 3d39cecc48 intel-iommu: Remove superfluous iova_alloc_lock from IOVA code
We only ever obtain this lock immediately before the iova_rbtree_lock,
and release it immediately after the iova_rbtree_lock. So ditch it and
just use iova_rbtree_lock.

[v2: Remove the lockdep bits this time too]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15 08:17:02 +01:00
Kay, Allen M 3871794642 VT-d: Changes to support KVM
This patch extends the VT-d driver to support KVM

[Ben: fixed memory pinning]
[avi: move dma_remapping.h as well]

Signed-off-by: Kay, Allen M <allen.m.kay@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com>
Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Acked-by: Mark Gross <mgross@linux.intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15 14:24:08 +02:00