OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Fenghua Yu	701fac4038	iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit PASIDs are process-wide. It was attempted to use refcounted PASIDs to free them when the last thread drops the refcount. This turned out to be complex and error prone. Given the fact that the PASID space is 20 bits, which allows up to 1M processes to have a PASID associated concurrently, PASID resource exhaustion is not a realistic concern. Therefore, it was decided to simplify the approach and stick with lazy on demand PASID allocation, but drop the eager free approach and make an allocated PASID's lifetime bound to the lifetime of the process. Get rid of the refcounting mechanisms and replace/rename the interfaces to reflect this new approach. [ bp: Massage commit message. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220207230254.3342514-6-fenghua.yu@intel.com	2022-02-15 11:31:35 +01:00
Fenghua Yu	7ba564722d	iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA This CONFIG option originally only referred to the Shared Virtual Address (SVA) library. But it is now also used for non-library portions of code. Drop the "_LIB" suffix so that there is just one configuration option for all code relating to SVA. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220207230254.3342514-2-fenghua.yu@intel.com	2022-02-14 19:17:46 +01:00
Rafael J. Wysocki	f266c11bce	iommu/vtd: Replace acpi_bus_get_device() Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1807113.tdWV9SEqCh@kreacher Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-02-14 12:59:45 +01:00
Guoqing Jiang	99e675d473	iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping() After commit `e3beca48a4` ("irqdomain/treewide: Keep firmware node unconditionally allocated"). For tear down scenario, fn is only freed after fail to allocate ir_domain, though it also should be freed in case dmar_enable_qi returns error. Besides free fn, irq_domain and ir_msi_domain need to be removed as well if intel_setup_irq_remapping fails to enable queued invalidation. Improve the rewinding path by add out_free_ir_domain and out_free_fwnode lables per Baolu's suggestion. Fixes: `e3beca48a4` ("irqdomain/treewide: Keep firmware node unconditionally allocated") Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Link: https://lore.kernel.org/r/20220119063640.16864-1-guoqing.jiang@linux.dev Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220128031002.2219155-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-01-31 16:53:09 +01:00
Joerg Roedel	66dc1b791c	Merge branches 'arm/smmu', 'virtio', 'x86/amd', 'x86/vt-d' and 'core' into next	2022-01-04 10:33:45 +01:00
Matthew Wilcox (Oracle)	87f60cc65d	iommu/vt-d: Use put_pages_list page->freelist is for the use of slab. We already have the ability to free a list of pages in the core mm, but it requires the use of a list_head and for the pages to be chained together through page->lru. Switch the Intel IOMMU and IOVA code over to using free_pages_list(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> [rm: split from original patch, cosmetic tweaks, fix fq entries] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2115b560d9a0ce7cd4b948bd51a2b7bde8fdfd59.1639753638.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-12-20 09:03:05 +01:00
Maíra Canal	c95a9c278d	iommu/vt-d: Remove unused dma_to_mm_pfn function Remove dma_to_buf_pfn function, which is not used in the codebase. This was pointed by clang with the following warning: 'dma_to_mm_pfn' [-Wunused-function] static inline unsigned long dma_to_mm_pfn(unsigned long dma_pfn) ^ https://lore.kernel.org/r/YYhY7GqlrcTZlzuA@fedora drivers/iommu/intel/iommu.c:136:29: warning: unused function Signed-off-by: Maíra Canal <maira.canal@usp.br> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211217083817.1745419-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-12-17 09:40:29 +01:00
Kefeng Wang	f5209f9127	iommu/vt-d: Drop duplicate check in dma_pte_free_pagetable() The BUG_ON check exists in dma_pte_clear_range(), kill the duplicate check. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20211025032307.182974-1-wangkefeng.wang@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211217083817.1745419-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-12-17 09:40:29 +01:00
Christophe JAILLET	bb71257396	iommu/vt-d: Use bitmap_zalloc() when applicable 'iommu->domain_ids' is a bitmap. So use 'bitmap_zalloc()' to simplify code and improve the semantic. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/cb7a3e0a8d522447a06298a4f244c3df072f948b.1635018498.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211217083817.1745419-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-12-17 09:40:29 +01:00
Kees Cook	4599d78a82	iommu/vt-d: Use correctly sized arguments for bit field The find.h APIs are designed to be used only on unsigned long arguments. This can technically result in a over-read, but it is harmless in this case. Regardless, fix it to avoid the warning seen under -Warray-bounds, which we'd like to enable globally: In file included from ./include/linux/bitmap.h:9, from drivers/iommu/intel/iommu.c:17: drivers/iommu/intel/iommu.c: In function 'domain_context_mapping_one': ./include/linux/find.h:119:37: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'int[1]' [-Warray-bounds] 119 \| unsigned long val = *addr & GENMASK(size - 1, 0); \| ^~~~~ drivers/iommu/intel/iommu.c:2115:18: note: while referencing 'max_pde' 2115 \| int pds, max_pde; \| ^~~~~~~ Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20211215232432.2069605-1-keescook@chromium.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-12-17 09:05:04 +01:00
Alex Williamson	86dc40c7ea	iommu/vt-d: Fix unmap_pages support When supporting only the .map and .unmap callbacks of iommu_ops, the IOMMU driver can make assumptions about the size and alignment used for mappings based on the driver provided pgsize_bitmap. VT-d previously used essentially PAGE_MASK for this bitmap as any power of two mapping was acceptably filled by native page sizes. However, with the .map_pages and .unmap_pages interface we're now getting page-size and count arguments. If we simply combine these as (page-size * count) and make use of the previous map/unmap functions internally, any size and alignment assumptions are very different. As an example, a given vfio device assignment VM will often create a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff]. On a system that does not support IOMMU super pages, the unmap_pages interface will ask to unmap 1024 4KB pages at the base IOVA. dma_pte_clear_level() will recurse down to level 2 of the page table where the first half of the pfn range exactly matches the entire pte level. We clear the pte, increment the pfn by the level size, but (oops) the next pte is on a new page, so we exit the loop an pop back up a level. When we then update the pfn based on that higher level, we seem to assume that the previous pfn value was at the start of the level. In this case the level size is 256K pfns, which we add to the base pfn and get a results of 0x7fe00, which is clearly greater than 0x401ff, so we're done. Meanwhile we never cleared the ptes for the remainder of the range. When the VM remaps this range, we're overwriting valid ptes and the VT-d driver complains loudly, as reported by the user report linked below. The fix for this seems relatively simple, if each iteration of the loop in dma_pte_clear_level() is assumed to clear to the end of the level pte page, then our next pfn should be calculated from level_pfn rather than our working pfn. Fixes: `3f34f12597` ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback") Reported-by: Ajay Garg <ajaygargnsit@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargnsit@gmail.com/ Link: https://lore.kernel.org/r/163659074748.1617923.12716161410774184024.stgit@omen Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211126135556.397932-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-11-26 22:54:47 +01:00
Christophe JAILLET	4e5973dd27	iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock() If we return -EOPNOTSUPP, the rcu lock remains lock. This is spurious. Go through the end of the function instead. This way, the missing 'rcu_read_unlock()' is called. Fixes: `7afd7f6aa2` ("iommu/vt-d: Check FL and SL capability sanity in scalable mode") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/40cc077ca5f543614eab2a10e84d29dd190273f6.1636217517.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211126135556.397932-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-11-26 22:54:47 +01:00
Longpeng(Mike)	9906b9352a	iommu/vt-d: Avoid duplicate removing in __domain_mapping() The __domain_mapping() always removes the pages in the range from 'iov_pfn' to 'end_pfn', but the 'end_pfn' is always the last pfn of the range that the caller wants to map. This would introduce too many duplicated removing and leads the map operation take too long, for example: Map iova=0x100000,nr_pages=0x7d61800 iov_pfn: 0x100000, end_pfn: 0x7e617ff iov_pfn: 0x140000, end_pfn: 0x7e617ff iov_pfn: 0x180000, end_pfn: 0x7e617ff iov_pfn: 0x1c0000, end_pfn: 0x7e617ff iov_pfn: 0x200000, end_pfn: 0x7e617ff ... it takes about 50ms in total. We can reduce the cost by recalculate the 'end_pfn' and limit it to the boundary of the end of this pte page. Map iova=0x100000,nr_pages=0x7d61800 iov_pfn: 0x100000, end_pfn: 0x13ffff iov_pfn: 0x140000, end_pfn: 0x17ffff iov_pfn: 0x180000, end_pfn: 0x1bffff iov_pfn: 0x1c0000, end_pfn: 0x1fffff iov_pfn: 0x200000, end_pfn: 0x23ffff ... it only need 9ms now. This also removes a meaningless BUG_ON() in __domain_mapping(). Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Tested-by: Liujunjie <liujunjie23@huawei.com> Link: https://lore.kernel.org/r/20211008000433.1115-1-longpeng2@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Fenghua Yu	00ecd54013	iommu/vt-d: Clean up unused PASID updating functions update_pasid() and its call chain are currently unused in the tree because Thomas disabled the ENQCMD feature. The feature will be re-enabled shortly using a different approach and update_pasid() and its call chain will not be used in the new approach. Remove the useless functions. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210920192349.2602141-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Lu Baolu	94f797ad61	iommu/vt-d: Delete dev_has_feat callback The commit `262948f8ba` ("iommu: Delete iommu_dev_has_feature()") has deleted the iommu_dev_has_feature() interface. Remove the dev_has_feat callback to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210929072030.1330225-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Lu Baolu	032c5ee40e	iommu/vt-d: Use second level for GPA->HPA translation The IOMMU VT-d implementation uses the first level for GPA->HPA translation by default. Although both the first level and the second level could handle the DMA translation, they're different in some way. For example, the second level translation has separate controls for the Access/Dirty page tracking. With the first level translation, there's no such control. On the other hand, the second level translation has the page-level control for forcing snoop, but the first level only has global control with pasid granularity. This uses the second level for GPA->HPA translation so that we can provide a consistent hardware interface for use cases like dirty page tracking for live migration. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Lu Baolu	7afd7f6aa2	iommu/vt-d: Check FL and SL capability sanity in scalable mode An iommu domain could be allocated and mapped before it's attached to any device. This requires that in scalable mode, when the domain is allocated, the format (FL or SL) of the page table must be determined. In order to achieve this, the platform should support consistent SL or FL capabilities on all IOMMU's. This adds a check for this and aborts IOMMU probing if it doesn't meet this requirement. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Lu Baolu	b34380a6d7	iommu/vt-d: Remove duplicate identity domain flag The iommu_domain data structure already has the "type" field to keep the type of a domain. It's unnecessary to have the DOMAIN_FLAG_STATIC_IDENTITY flag in the vt-d implementation. This cleans it up with no functionality change. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Kyung Min Park	914ff7719e	iommu/vt-d: Dump DMAR translation structure when DMA fault occurs When the dmar translation fault happens, the kernel prints a single line fault reason with corresponding hexadecimal code defined in the Intel VT-d specification. Currently, when user wants to debug the translation fault in detail, debugfs is used for dumping the dmar_translation_struct, which is not available when the kernel failed to boot. Dump the DMAR translation structure, pagewalk the IO page table and print the page table entry when the fault happens. This takes effect only when CONFIG_DMAR_DEBUG is enabled. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Link: https://lore.kernel.org/r/20210815203845.31287-1-kyung.min.park@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Tvrtko Ursulin	5240aed2cd	iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel option Handling of intel_iommu kernel command line option should return "true" to indicate option is valid and so avoid logging it as unknown by the core parsing code. Also log unknown sub-options at the notice level to let user know of potential typos or similar. Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://lore.kernel.org/r/20210831112947.310080-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Bjorn Helgaas	0b482d0c75	iommu/vt-d: Drop "0x" prefix from PCI bus & device addresses `719a193356` ("iommu/vt-d: Tweak the description of a DMA fault") changed the DMA fault reason from hex to decimal. It also added "0x" prefixes to the PCI bus/device, e.g., - DMAR: [INTR-REMAP] Request device [00:00.5] + DMAR: [INTR-REMAP] Request device [0x00:0x00.5] These no longer match dev_printk() and other similar messages in dmar_match_pci_path() and dmar_acpi_insert_dev_scope(). Drop the "0x" prefixes from the bus and device addresses. Fixes: `719a193356` ("iommu/vt-d: Tweak the description of a DMA fault") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20210903193711.483999-1-helgaas@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210922054726.499110-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-09-28 11:41:11 +02:00
Fenghua Yu	6ef0505158	iommu/vt-d: Fix a deadlock in intel_svm_drain_prq() pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below: idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #1 (&param->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #0 (&param->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &param->lock --> &param->fault_param->lock --> pasid_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(&param->fault_param->lock); lock(pasid_mutex); lock(&param->lock); * DEADLOCK * 2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled. Fixes: `d5b9e4bfe0` ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-09-09 13:18:07 +02:00
Fenghua Yu	a21518cb23	iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm() The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid(). Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm. Fixes: `4048377414` ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-09-09 13:18:06 +02:00
Joerg Roedel	d8768d7eb9	Merge branches 'apple/dart', 'arm/smmu', 'iommu/fixes', 'x86/amd', 'x86/vt-d' and 'core' into next	2021-08-20 17:14:35 +02:00
Liu Yi L	423d39d851	iommu/vt-d: Add present bit check in pasid entry setup helpers The helper functions should not modify the pasid entries which are still in use. Add a check against present bit. Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Liu Yi L	8123b0b868	iommu/vt-d: Use pasid_pte_is_present() helper function Use the pasid_pte_is_present() helper for present bit check in the intel_pasid_tear_down_entry(). Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Andy Shevchenko	9ddc348214	iommu/vt-d: Drop the kernel doc annotation Kernel doc validator is unhappy with the following .../perf.c:16: warning: Function parameter or member 'latency_lock' not described in 'DEFINE_SPINLOCK' .../perf.c:16: warning: expecting prototype for perf.c(). Prototype was for DEFINE_SPINLOCK() instead Drop kernel doc annotation since the top comment is not in the required format. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210729163538.40101-1-andriy.shevchenko@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Lu Baolu	48811c4434	iommu/vt-d: Allow devices to have more than 32 outstanding PRs The minimum per-IOMMU PRQ queue size is one 4K page, this is more entries than the hardcoded limit of 32 in the current VT-d code. Some devices can support up to 512 outstanding PRQs but underutilized by this limit of 32. Although, 32 gives some rough fairness when multiple devices share the same IOMMU PRQ queue, but far from optimal for customized use case. This extends the per-IOMMU PRQ queue size to four 4K pages and let the devices have as many outstanding page requests as they can. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Lu Baolu	289b3b005c	iommu/vt-d: Preset A/D bits for user space DMA usage We preset the access and dirty bits for IOVA over first level usage only for the kernel DMA (i.e., when domain type is IOMMU_DOMAIN_DMA). We should also preset the FL A/D for user space DMA usage. The idea is that even the user space A/D bit memory write is unnecessary. We should avoid it to minimize the overhead. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Lu Baolu	792fb43ce2	iommu/vt-d: Enable Intel IOMMU scalable mode by default The commit `8950dcd83a` ("iommu/vt-d: Leave scalable mode default off") leaves the scalable mode default off and end users could turn it on with "intel_iommu=sm_on". Using the Intel IOMMU scalable mode for kernel DMA, user-level device access and Shared Virtual Address have been enabled. This enables the scalable mode by default if the hardware advertises the support and adds kernel options of "intel_iommu=sm_on/sm_off" for end users to configure it through the kernel parameters. Suggested-by: Ashok Raj <ashok.raj@intel.com> Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Lu Baolu	01dac2d9d2	iommu/vt-d: Refactor Kconfig a bit Put all sub-options inside a "if INTEL_IOMMU" so that they don't need to always depend on INTEL_IOMMU. Use IS_ENABLED() instead of #ifdef as well. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Zhen Lei	5e41c99894	iommu/vt-d: Remove unnecessary oom message Fixes scripts/checkpatch.pl warning: WARNING: Possible unnecessary 'out of memory' message Remove it can help us save a bit of memory. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210609124937.14260-1-thunder.leizhen@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Lu Baolu	4d99efb229	iommu/vt-d: Update the virtual command related registers The VT-d spec Revision 3.3 updated the virtual command registers, virtual command opcode B register, virtual command response register and virtual command capability register (Section 10.4.43, 10.4.44, 10.4.45, 10.4.46). This updates the virtual command interface implementation in the Intel IOMMU driver accordingly. Fixes: `24f27d32ab` ("iommu/vt-d: Enlightened PASID allocation") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Sanjay Kumar <sanjay.k.kumar@intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210713042649.3547403-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Robin Murphy	78ca078459	iommu/vt-d: Prepare for multiple DMA domain types In preparation for the strict vs. non-strict decision for DMA domains to be expressed in the domain type, make sure we expose our flush queue awareness by accepting the new domain type, and test the specific feature flag where we want to identify DMA domains in general. The DMA ops reset/setup can simply be made unconditional, since iommu-dma already knows only to touch DMA domains. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/31a8ef868d593a2f3826a6a120edee81815375a7.1628682049.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-18 13:27:49 +02:00
Robin Murphy	f297e27f83	iommu/vt-d: Drop IOVA cookie management The core code bakes its own cookies now. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e9dbe3b6108f8538e17e0c5f59f8feeb714f51a4.1628682048.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-18 13:25:31 +02:00
Liu Yi L	8798d36411	iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry() This fixes improper iotlb invalidation in intel_pasid_tear_down_entry(). When a PASID was used as nested mode, released and reused, the following error message will appear: [ 180.187556] Unexpected page request in Privilege Mode [ 180.187565] Unexpected page request in Privilege Mode [ 180.279933] Unexpected page request in Privilege Mode [ 180.279937] Unexpected page request in Privilege Mode Per chapter 6.5.3.3 of VT-d spec 3.3, when tear down a pasid entry, the software should use Domain selective IOTLB flush if the PGTT of the pasid entry is SL only or Nested, while for the pasid entries whose PGTT is FL only or PT using PASID-based IOTLB flush is enough. Fixes: `2cd1311a26` ("iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr") Signed-off-by: Kumar Sanjay K <sanjay.k.kumar@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Tested-by: Yi Sun <yi.y.sun@intel.com> Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210817124321.1517985-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-18 13:15:58 +02:00
Fenghua Yu	62ef907a04	iommu/vt-d: Fix PASID reference leak A PASID reference is increased whenever a device is bound to an mm (and its PASID) successfully (i.e. the device's sdev user count is increased). But the reference is not dropped every time the device is unbound successfully from the mm (i.e. the device's sdev user count is decreased). The reference is dropped only once by calling intel_svm_free_pasid() when there isn't any device bound to the mm. intel_svm_free_pasid() drops the reference and only frees the PASID on zero reference. Fix the issue by dropping the PASID reference and freeing the PASID when no reference on successful unbinding the device by calling intel_svm_free_pasid() . Fixes: `4048377414` ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210813181345.1870742-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210817124321.1517985-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-18 13:15:58 +02:00
Lu Baolu	75cc1018a9	iommu/vt-d: Move clflush'es from iotlb_sync_map() to map_pages() As the Intel VT-d driver has switched to use the iommu_ops.map_pages() callback, multiple pages of the same size will be mapped in a call. There's no need to put the clflush'es in iotlb_sync_map() callback. Move them back into __domain_mapping() to simplify the code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720020615.4144323-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:56:25 +02:00
Lu Baolu	3f34f12597	iommu/vt-d: Implement map/unmap_pages() iommu_ops callback Implement the map_pages() and unmap_pages() callback for the Intel IOMMU driver to allow calls from iommu core to map and unmap multiple pages of the same size in one call. With map/unmap_pages() implemented, the prior map/unmap callbacks are deprecated. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720020615.4144323-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:56:25 +02:00
Lu Baolu	a886d5a7e6	iommu/vt-d: Report real pgsize bitmap to iommu core The pgsize bitmap is used to advertise the page sizes our hardware supports to the IOMMU core, which will then use this information to split physically contiguous memory regions it is mapping into page sizes that we support. Traditionally the IOMMU core just handed us the mappings directly, after making sure the size is an order of a 4KiB page and that the mapping has natural alignment. To retain this behavior, we currently advertise that we support all page sizes that are an order of 4KiB. We are about to utilize the new IOMMU map/unmap_pages APIs. We could change this to advertise the real page sizes we support. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720020615.4144323-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:56:25 +02:00
John Garry	308723e358	iommu: Remove mode argument from iommu_set_dma_strict() We only ever now set strict mode enabled in iommu_set_dma_strict(), so just remove the argument. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1626088340-5838-7-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:27:38 +02:00
Zhen Lei	d0e108b8e9	iommu/vt-d: Add support for IOMMU default DMA mode build options Make IOMMU_DEFAULT_LAZY default for when INTEL_IOMMU config is set, as is current behaviour. Also delete global flag intel_iommu_strict: - In intel_iommu_setup(), call iommu_set_dma_strict(true) directly. Also remove the print, as iommu_subsys_init() prints the mode and we have already marked this param as deprecated. - For cap_caching_mode() check in intel_iommu_setup(), call iommu_set_dma_strict(true) directly; also reword the accompanying print with a level downgrade and also add the missing '\n'. - For Ironlake GPU, again call iommu_set_dma_strict(true) directly and keep the accompanying print. [jpg: Remove intel_iommu_strict] Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1626088340-5838-5-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:27:38 +02:00
John Garry	1d479f160c	iommu: Deprecate Intel and AMD cmdline methods to enable strict mode Now that the x86 drivers support iommu.strict, deprecate the custom methods. Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1626088340-5838-2-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-26 13:27:38 +02:00
Lu Baolu	474dd1c650	iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries The commit `2b0140c696` ("iommu/vt-d: Use pci_real_dma_dev() for mapping") fixes an issue of "sub-device is removed where the context entry is cleared for all aliases". But this commit didn't consider the PASID entry and PASID table in VT-d scalable mode. This fix increases the coverage of scalable mode. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Fixes: `8038bdb855` ("iommu/vt-d: Only clear real DMA device's context entries") Fixes: `2b0140c696` ("iommu/vt-d: Use pci_real_dma_dev() for mapping") Cc: stable@vger.kernel.org # v5.6+ Cc: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210712071712.3416949-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-14 12:58:07 +02:00
Sanjay Kumar	37764b952e	iommu/vt-d: Global devTLB flush when present context entry changed This fixes a bug in context cache clear operation. The code was not following the correct invalidation flow. A global device TLB invalidation should be added after the IOTLB invalidation. At the same time, it uses the domain ID from the context entry. But in scalable mode, the domain ID is in PASID table entry, not context entry. Fixes: `7373a8cc38` ("iommu/vt-d: Setup context and enable RID2PASID support") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210712071315.3416543-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-07-14 12:57:39 +02:00
Joerg Roedel	2b9d8e3e9a	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/smmu', 'x86/vt-d', 'x86/amd', 'virtio' and 'core' into next	2021-06-25 15:23:25 +02:00
Jean-Philippe Brucker	ac6d704679	iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops() Passing a 64-bit address width to iommu_setup_dma_ops() is valid on virtual platforms, but isn't currently possible. The overflow check in iommu_dma_init_domain() prevents this even when @dma_base isn't 0. Pass a limit address instead of a size, so callers don't have to fake a size to work around the check. The base and limit parameters are being phased out, because: * they are redundant for x86 callers. dma-iommu already reserves the first page, and the upper limit is already in domain->geometry. * they can now be obtained from dev->dma_range_map on Arm. But removing them on Arm isn't completely straightforward so is left for future work. As an intermediate step, simplify the x86 callers by passing dummy limits. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210618152059.1194210-5-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-25 15:02:43 +02:00
Colin Ian King	934ed4580c	iommu/vt-d: Fix dereference of pointer info before it is null checked The assignment of iommu from info->iommu occurs before info is null checked hence leading to a potential null pointer dereference issue. Fix this by assigning iommu and checking if iommu is null after null checking info. Addresses-Coverity: ("Dereference before null check") Fixes: `4c82b88696` ("iommu/vt-d: Allocate/register iopf queue for sva devices") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210611135024.32781-1-colin.king@canonical.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-18 15:19:50 +02:00
Joerg Roedel	d6a9642bd6	iommu/vt-d: Fix linker error on 32-bit A recent commit broke the build on 32-bit x86. The linker throws these messages: ld: drivers/iommu/intel/perf.o: in function `dmar_latency_snapshot': perf.c:(.text+0x40c): undefined reference to `__udivdi3' ld: perf.c:(.text+0x458): undefined reference to `__udivdi3' The reason are the 64-bit divides in dmar_latency_snapshot(). Use the div_u64() helper function for those. Fixes: `55ee5e67a5` ("iommu/vt-d: Add common code for dmar latency performance monitors") Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20210610083120.29224-1-joro@8bytes.org	2021-06-10 10:33:14 +02:00
Parav Pandit	7a0f06c197	iommu/vt-d: No need to typecast Page directory assignment by alloc_pgtable_page() or phys_to_virt() doesn't need typecasting as both routines return void*. Hence, remove typecasting from both the calls. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-24-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:14 +02:00
Parav Pandit	cee57d4fe7	iommu/vt-d: Remove unnecessary braces No need for braces for single line statement under if() block. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-22-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:14 +02:00
Parav Pandit	74f6d776ae	iommu/vt-d: Removed unused iommu_count in dmar domain DMAR domain uses per DMAR refcount. It is indexed by iommu seq_id. Older iommu_count is only incremented and decremented but no decisions are taken based on this refcount. This is not of much use. Hence, remove iommu_count and further simplify domain_detach_iommu() by returning void. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-21-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:14 +02:00
Parav Pandit	1f106ff0ea	iommu/vt-d: Use bitfields for DMAR capabilities IOTLB device presence, iommu coherency and snooping are boolean capabilities. Use them as bits and keep them adjacent. Structure layout before the reorg. $ pahole -C dmar_domain drivers/iommu/intel/dmar.o struct dmar_domain { int nid; /* 0 4 / unsigned int iommu_refcnt[128]; / 4 512 / / --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- / u16 iommu_did[128]; / 516 256 / / --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- / bool has_iotlb_device; / 772 1 / / XXX 3 bytes hole, try to pack / struct list_head devices; / 776 16 / struct list_head subdevices; / 792 16 / struct iova_domain iovad __attribute__((__aligned__(8))); / 808 2320 / / --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- / struct dma_pte pgd; /* 3128 8 / / --- cacheline 49 boundary (3136 bytes) --- / int gaw; / 3136 4 / int agaw; / 3140 4 / int flags; / 3144 4 / int iommu_coherency; / 3148 4 / int iommu_snooping; / 3152 4 / int iommu_count; / 3156 4 / int iommu_superpage; / 3160 4 / / XXX 4 bytes hole, try to pack / u64 max_addr; / 3168 8 / u32 default_pasid; / 3176 4 / / XXX 4 bytes hole, try to pack / struct iommu_domain domain; / 3184 72 / / size: 3256, cachelines: 51, members: 18 / / sum members: 3245, holes: 3, sum holes: 11 / / forced alignments: 1 / / last cacheline: 56 bytes / } __attribute__((__aligned__(8))); After arranging it for natural padding and to make flags as u8 bits, it saves 8 bytes for the struct. struct dmar_domain { int nid; / 0 4 / unsigned int iommu_refcnt[128]; / 4 512 / / --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- / u16 iommu_did[128]; / 516 256 / / --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- / u8 has_iotlb_device:1; / 772: 0 1 / u8 iommu_coherency:1; / 772: 1 1 / u8 iommu_snooping:1; / 772: 2 1 / / XXX 5 bits hole, try to pack / / XXX 3 bytes hole, try to pack / struct list_head devices; / 776 16 / struct list_head subdevices; / 792 16 / struct iova_domain iovad __attribute__((__aligned__(8))); / 808 2320 / / --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- / struct dma_pte pgd; /* 3128 8 / / --- cacheline 49 boundary (3136 bytes) --- / int gaw; / 3136 4 / int agaw; / 3140 4 / int flags; / 3144 4 / int iommu_count; / 3148 4 / int iommu_superpage; / 3152 4 / / XXX 4 bytes hole, try to pack / u64 max_addr; / 3160 8 / u32 default_pasid; / 3168 4 / / XXX 4 bytes hole, try to pack / struct iommu_domain domain; / 3176 72 / / size: 3248, cachelines: 51, members: 18 / / sum members: 3236, holes: 3, sum holes: 11 / / sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 5 bits / / forced alignments: 1 / / last cacheline: 48 bytes */ } __attribute__((__aligned__(8))); Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com Link: https://lore.kernel.org/r/20210610020115.1637656-20-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
YueHaibing	3bc770b0e9	iommu/vt-d: Use DEVICE_ATTR_RO macro Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(), which makes the code a bit shorter and easier to read. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210528130229.22108-1-yuehaibing@huawei.com Link: https://lore.kernel.org/r/20210610020115.1637656-19-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Gustavo A. R. Silva	606636dcbd	iommu/vt-d: Fix out-bounds-warning in intel/svm.c Replace a couple of calls to memcpy() with simple assignments in order to fix the following out-of-bounds warning: drivers/iommu/intel/svm.c:1198:4: warning: 'memcpy' offset [25, 32] from the object at 'desc' is out of the bounds of referenced subobject 'qw2' with type 'long long unsigned int' at offset 16 [-Warray-bounds] The problem is that the original code is trying to copy data into a couple of struct members adjacent to each other in a single call to memcpy(). This causes a legitimate compiler warning because memcpy() overruns the length of &desc.qw2 and &resp.qw2, respectively. This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210414201403.GA392764@embeddedor Link: https://lore.kernel.org/r/20210610020115.1637656-18-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	0f4834ab25	iommu/vt-d: Add PRQ handling latency sampling The execution time for page fault request handling is performance critical and needs to be monitored. This adds code to sample the execution time of page fault request handling. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-17-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	74eb87a0f9	iommu/vt-d: Add cache invalidation latency sampling Queued invalidation execution time is performance critical and needs to be monitored. This adds code to sample the execution time of IOTLB/ devTLB/ICE cache invalidation. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-16-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	456bb0b97f	iommu/vt-d: Expose latency monitor data through debugfs A debugfs interface /sys/kernel/debug/iommu/intel/dmar_perf_latency is created to control and show counts of execution time ranges for various types per DMAR. The interface may help debug any potential performance issue. By default, the interface is disabled. Possible write value of /sys/kernel/debug/iommu/intel/dmar_perf_latency 0 - disable sampling all latency data 1 - enable sampling IOTLB invalidation latency data 2 - enable sampling devTLB invalidation latency data 3 - enable sampling intr entry cache invalidation latency data 4 - enable sampling prq handling latency data Read /sys/kernel/debug/iommu/intel/dmar_perf_latency gives a snapshot of sampling result of all enabled monitors. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-15-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	55ee5e67a5	iommu/vt-d: Add common code for dmar latency performance monitors The execution time of some operations is very performance critical, such as cache invalidation and PRQ processing time. This adds some common code to monitor the execution time range of those operations. The interfaces include enabling/disabling, checking status, updating sampling data and providing a common string format for users. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-14-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	e93a67f5a0	iommu/vt-d: Add prq_report trace event This adds a new trace event to track the page fault request report. This event will provide almost all information defined in a page request descriptor. A sample output: \| prq_report: dmar0/0000:00:0a.0 seq# 1: rid=0x50 addr=0x559ef6f97 r---- pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 2: rid=0x50 addr=0x559ef6f9c rw--l pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 3: rid=0x50 addr=0x559ef6f98 r---- pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 4: rid=0x50 addr=0x559ef6f9d rw--l pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 5: rid=0x50 addr=0x559ef6f99 r---- pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 6: rid=0x50 addr=0x559ef6f9e rw--l pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 7: rid=0x50 addr=0x559ef6f9a r---- pasid=0x2 index=0x1 \| prq_report: dmar0/0000:00:0a.0 seq# 8: rid=0x50 addr=0x559ef6f9f rw--l pasid=0x2 index=0x1 This will be helpful for I/O page fault related debugging. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	d5b9e4bfe0	iommu/vt-d: Report prq to io-pgfault framework Let the IO page fault requests get handled through the io-pgfault framework. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	4c82b88696	iommu/vt-d: Allocate/register iopf queue for sva devices This allocates and registers the iopf queue infrastructure for devices which want to support IO page fault for SVA. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	ae7f09b14b	iommu/vt-d: Refactor prq_event_thread() Refactor prq_event_thread() by moving handling single prq event out of the main loop. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	9e52cc0fed	iommu/vt-d: Use common helper to lookup svm devices It's common to iterate the svm device list and find a matched device. Add common helpers to do this and consolidate the code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	4048377414	iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers Align the pasid alloc/free code with the generic helpers defined in the iommu core. This also refactored the SVA binding code to improve the readability. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Lu Baolu	100b8a14a3	iommu/vt-d: Add pasid private data helpers We are about to use iommu_sva_alloc/free_pasid() helpers in iommu core. That means the pasid life cycle will be managed by iommu core. Use a local array to save the per pasid private data instead of attaching it the real pasid. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Lu Baolu	521f546b4e	iommu/vt-d: Support asynchronous IOMMU nested capabilities Current VT-d implementation supports nested translation only if all underlying IOMMUs support the nested capability. This is unnecessary as the upper layer is allowed to create different containers and set them with different type of iommu backend. The IOMMU driver needs to guarantee that devices attached to a nested mode iommu_domain should support nested capabilility. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210517065701.5078-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Lu Baolu	879fcc6bda	iommu/vt-d: Select PCI_ATS explicitly The Intel VT-d implementation supports device TLB management. Select PCI_ATS explicitly so that the pci_ats helpers are always available. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210512065313.3441309-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Lu Baolu	719a193356	iommu/vt-d: Tweak the description of a DMA fault The Intel IOMMU driver reports the DMA fault reason in a decimal number while the VT-d specification uses a hexadecimal one. It's inconvenient that users need to covert them everytime before consulting the spec. Let's use hexadecimal number for a DMA fault reason. The fault message uses 0xffffffff as PASID for DMA requests w/o PASID. This is confusing. Tweak this by adding "NO_PASID" explicitly. Reviewed-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210517065425.4953-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Aditya Srivastava	367f82de5a	iommu/vt-d: Fix kernel-doc syntax in file header The opening comment mark '/' is used for highlighting the beginning of kernel-doc comments. The header for drivers/iommu/intel/pasid.c follows this syntax, but the content inside does not comply with kernel-doc. This line was probably not meant for kernel-doc parsing, but is parsed due to the presence of kernel-doc like comment syntax(i.e, '/'), which causes unexpected warnings from kernel-doc: warning: Function parameter or member 'fmt' not described in 'pr_fmt' Provide a simple fix by replacing this occurrence with general comment format, i.e. '/*', to prevent kernel-doc from parsing it. Signed-off-by: Aditya Srivastava <yashsri421@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210523143245.19040-1-yashsri421@gmail.com Link: https://lore.kernel.org/r/20210610020115.1637656-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Colin Ian King	05d2cbf969	iommu/vt-d: Remove redundant assignment to variable agaw The variable agaw is initialized with a value that is never read and it is being updated later with a new value as a counter in a for-loop. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210416171826.64091-1-colin.king@canonical.com Link: https://lore.kernel.org/r/20210610020115.1637656-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Rolf Eike Beer	0ee74d5a48	iommu/vt-d: Fix sysfs leak in alloc_iommu() iommu_device_sysfs_add() is called before, so is has to be cleaned on subsequent errors. Fixes: `39ab9555c2` ("iommu: Add sysfs bindings for struct iommu_device") Cc: stable@vger.kernel.org # 4.11.x Signed-off-by: Rolf Eike Beer <eb@emlix.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/17411490.HIIP88n32C@mobilepool36.emlix.com Link: https://lore.kernel.org/r/20210525070802.361755-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-27 16:07:08 +02:00
Lu Baolu	54c80d9074	iommu/vt-d: Use user privilege for RID2PASID translation When first-level page tables are used for IOVA translation, we use user privilege by setting U/S bit in the page table entry. This is to make it consistent with the second level translation, where the U/S enforcement is not available. Clear the SRE (Supervisor Request Enable) field in the pasid table entry of RID2PASID so that requests requesting the supervisor privilege are blocked and treated as DMA remapping faults. Fixes: `b802d070a5` ("iommu/vt-d: Use iova over first level") Suggested-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210512064426.3440915-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210519015027.108468-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-19 08:51:02 +02:00
Dan Carpenter	1a590a1c8b	iommu/vt-d: Check for allocation failure in aux_detach_device() In current kernels small allocations never fail, but checking for allocation failure is the correct thing to do. Fixes: `18abda7a2d` ("iommu/vt-d: Fix general protection fault in aux_detach_device()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/YJuobKuSn81dOPLd@mwanda Link: https://lore.kernel.org/r/20210519015027.108468-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-19 08:51:02 +02:00
Linus Torvalds	57151b502c	pci-v5.13-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmCRp48UHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vwsVRAAsIYueNKzZczpkeQwHigYzf4HLdKm yyT2c/Zlj9REAUOe7ApkowVAJWiMGDJP0J361KIluAGvAxnkMP1V6WlVdByorYd0 CrXc/UhD//cs+3QDo4SmJRHyL8q5QQTDa8Z/8seVJUYTR/t5OhSpMOuEJPhpeQ1s nqUk0yWNJRoN6wn6T/7KqgYEvPhARXo9epuWy5MNPZ5f8E7SRi/QG/6hP8/YOLpK A+8beIOX5LAvUJaXxEovwv5UQnSUkeZTGDyRietQYE6xXNeHPKCvZ7vDjjSE7NOW mIodD6JcG3n/riYV3sMA5PKDZgsPI3P/qJU6Y6vWBBYOaO/kQX/c7CZ+M2bcZay4 mh1dW0vOqoTy/pAVwQB2aq08Rrg2SAskpNdeyzduXllmuTyuwCMPXzG4RKmbQ8I1 qMFb8qOyNulRAWcTKgSMKByEQYASQsFA5yShtaba6h0+vqrseuP6hchBKKOEan8F 9THTI3ZflKwRvGjkI0MDbp0z0+wPYmNhrcZDpAJ3bEltw58E8TL/9aBtuhajmo8+ wJ64mZclFuMmSyhsfkAXOvjeKXMlEBaw7vinZGbcACmv4ZGI0MV7r4vVYQbQltcy myzB6xJxcWB8N07UpKpUbsGMb9JjTUPlaT36eZNvUZQDntrE1ljt8RSq3nphDrcD KmBRU8ru74I2RE0= =WvTD -----END PGP SIGNATURE----- Merge tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Release OF node when pci_scan_device() fails (Dmitry Baryshkov) - Add pci_disable_parity() (Bjorn Helgaas) - Disable Mellanox Tavor parity reporting (Heiner Kallweit) - Disable N2100 r8169 parity reporting (Heiner Kallweit) - Fix RCiEP device to RCEC association (Qiuxu Zhuo) - Convert sysfs "config", "rom", "reset", "label", "index", "acpi_index" to static attributes to help fix races in device enumeration (Krzysztof Wilczyński) - Convert sysfs "vpd" to static attribute (Heiner Kallweit, Krzysztof Wilczyński) - Use sysfs_emit() in "show" functions (Krzysztof Wilczyński) - Remove unused alloc_pci_root_info() return value (Krzysztof Wilczyński) PCI device hotplug: - Fix acpiphp reference count leak (Feilong Lin) Power management: - Fix acpi_pci_set_power_state() debug message (Rafael J. Wysocki) - Fix runtime PM imbalance (Dinghao Liu) Virtualization: - Increase delay after FLR to work around Intel DC P4510 NVMe erratum (Raphael Norwitz) MSI: - Convert rcar, tegra, xilinx to MSI domains (Marc Zyngier) - For rcar, xilinx, use controller address as MSI doorbell (Marc Zyngier) - Remove unused hv msi_controller struct (Marc Zyngier) - Remove unused PCI core msi_controller support (Marc Zyngier) - Remove struct msi_controller altogether (Marc Zyngier) - Remove unused default_teardown_msi_irqs() (Marc Zyngier) - Let host bridges declare their reliance on MSI domains (Marc Zyngier) - Make pci_host_common_probe() declare its reliance on MSI domains (Marc Zyngier) - Advertise mediatek lack of built-in MSI handling (Thomas Gleixner) - Document ways of ending up with NO_MSI (Marc Zyngier) - Refactor HT advertising of NO_MSI flag (Marc Zyngier) VPD: - Remove obsolete Broadcom NIC VPD length-limiting quirk (Heiner Kallweit) - Remove sysfs VPD size checking dead code (Heiner Kallweit) - Convert VPF sysfs file to static attribute (Heiner Kallweit) - Remove unnecessary pci_set_vpd_size() (Heiner Kallweit) - Tone down "missing VPD" message (Heiner Kallweit) Endpoint framework: - Fix NULL pointer dereference when epc_features not implemented (Shradha Todi) - Add missing destroy_workqueue() in endpoint test (Yang Yingliang) Amazon Annapurna Labs PCIe controller driver: - Fix compile testing without CONFIG_PCI_ECAM (Arnd Bergmann) - Fix "no symbols" warnings when compile testing with CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann) APM X-Gene PCIe controller driver: - Fix cfg resource mapping regression (Dejin Zheng) Broadcom iProc PCIe controller driver: - Return zero for success of iproc_msi_irq_domain_alloc() (Pali Rohár) Broadcom STB PCIe controller driver: - Add reset_control_rearm() stub for !CONFIG_RESET_CONTROLLER (Jim Quinlan) - Fix use of BCM7216 reset controller (Jim Quinlan) - Use reset/rearm for Broadcom STB pulse reset instead of deassert/assert (Jim Quinlan) - Fix brcm_pcie_probe() error return for unsupported revision (Wei Yongjun) Cavium ThunderX PCIe controller driver: - Fix compile testing (Arnd Bergmann) - Fix "no symbols" warnings when compile testing with CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann) Freescale Layerscape PCIe controller driver: - Fix ls_pcie_ep_probe() syntax error (comma for semicolon) (Krzysztof Wilczyński) - Remove layerscape-gen4 dependencies on OF and ARM64, add dependency on ARCH_LAYERSCAPE (Geert Uytterhoeven) HiSilicon HIP PCIe controller driver: - Remove obsolete HiSilicon PCIe DT description (Dongdong Liu) Intel Gateway PCIe controller driver: - Remove unused pcie_app_rd() (Jiapeng Chong) Intel VMD host bridge driver: - Program IRTE with Requester ID of VMD endpoint, not child device (Jon Derrick) - Disable VMD MSI-X remapping when possible so children can use more MSI-X vectors (Jon Derrick) MediaTek PCIe controller driver: - Configure FC and FTS for functions other than 0 (Ryder Lee) - Add YAML schema for MediaTek (Jianjun Wang) - Export pci_pio_to_address() for module use (Jianjun Wang) - Add MediaTek MT8192 PCIe controller driver (Jianjun Wang) - Add MediaTek MT8192 INTx support (Jianjun Wang) - Add MediaTek MT8192 MSI support (Jianjun Wang) - Add MediaTek MT8192 system power management support (Jianjun Wang) - Add missing MODULE_DEVICE_TABLE (Qiheng Lin) Microchip PolarFlare PCIe controller driver: - Make several symbols static (Wei Yongjun) NVIDIA Tegra PCIe controller driver: - Add MCFG quirks for Tegra194 ECAM errata (Vidya Sagar) - Make several symbols const (Rikard Falkeborn) - Fix Kconfig host/endpoint typo (Wesley Sheng) SiFive FU740 PCIe controller driver: - Add pcie_aux clock to prci driver (Greentime Hu) - Use reset-simple in prci driver for PCIe (Greentime Hu) - Add SiFive FU740 PCIe host controller driver and DT binding (Paul Walmsley, Greentime Hu) Synopsys DesignWare PCIe controller driver: - Move MSI Receiver init to dw_pcie_host_init() so it is re-initialized along with the RC in resume (Jisheng Zhang) - Move iATU detection earlier to fix regression (Hou Zhiqiang) TI J721E PCIe driver: - Add DT binding and TI j721e support for refclk to PCIe connector (Kishon Vijay Abraham I) - Add host mode and endpoint mode DT bindings for TI AM64 SoC (Kishon Vijay Abraham I) TI Keystone PCIe controller driver: - Use generic config accessors for TI AM65x (K3) to fix regression (Kishon Vijay Abraham I) Xilinx NWL PCIe controller driver: - Add support for coherent PCIe DMA traffic using CCI (Bharat Kumar Gogada) - Add optional "dma-coherent" DT property (Bharat Kumar Gogada) Miscellaneous: - Fix kernel-doc warnings (Krzysztof Wilczyński) - Remove unused MicroGate SyncLink device IDs (Jiri Slaby) - Remove redundant dev_err() for devm_ioremap_resource() failure (Chen Hui) - Remove redundant initialization (Colin Ian King) - Drop redundant dev_err() for platform_get_irq() errors (Krzysztof Wilczyński)" * tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (98 commits) riscv: dts: Add PCIe support for the SiFive FU740-C000 SoC PCI: fu740: Add SiFive FU740 PCIe host controller driver dt-bindings: PCI: Add SiFive FU740 PCIe host controller MAINTAINERS: Add maintainers for SiFive FU740 PCIe driver clk: sifive: Use reset-simple in prci driver for PCIe driver clk: sifive: Add pcie_aux clock in prci driver for PCIe driver PCI: brcmstb: Use reset/rearm instead of deassert/assert ata: ahci_brcm: Fix use of BCM7216 reset controller reset: add missing empty function reset_control_rearm() PCI: Allow VPD access for QLogic ISP2722 PCI/VPD: Add helper pci_get_func0_dev() PCI/VPD: Remove pci_vpd_find_tag() SRDT handling PCI/VPD: Remove pci_vpd_find_tag() 'offset' argument PCI/VPD: Change pci_vpd_init() return type to void PCI/VPD: Make missing VPD message less alarming PCI/VPD: Remove pci_set_vpd_size() x86/PCI: Remove unused alloc_pci_root_info() return value MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer PCI: mediatek-gen3: Add system PM support PCI: mediatek-gen3: Add MSI support ...	2021-05-05 13:24:11 -07:00
Robin Murphy	2d471b20c5	iommu: Streamline registration interface Rather than have separate opaque setter functions that are easy to overlook and lead to repetitive boilerplate in drivers, let's pass the relevant initialisation parameters directly to iommu_device_register(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-16 17:20:45 +02:00
Joerg Roedel	49d11527e5	Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', 'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next	2021-04-16 17:16:03 +02:00
Longpeng(Mike)	38c527aeb4	iommu/vt-d: Force to flush iotlb before creating superpage The translation caches may preserve obsolete data when the mapping size is changed, suppose the following sequence which can reveal the problem with high probability. 1.mmap(4GB,MAP_HUGETLB) 2. while (1) { (a) DMA MAP 0,0xa0000 (b) DMA UNMAP 0,0xa0000 (c) DMA MAP 0,0xc0000000 * DMA read IOVA 0 may failure here (Not present) * if the problem occurs. (d) DMA UNMAP 0,0xc0000000 } The page table(only focus on IOVA 0) after (a) is: PML4: 0x19db5c1003 entry:0xffff899bdcd2f000 PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000 PDE: 0x1a30a72003 entry:0xffff89b39cacb000 PTE: 0x21d200803 entry:0xffff89b3b0a72000 The page table after (b) is: PML4: 0x19db5c1003 entry:0xffff899bdcd2f000 PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000 PDE: 0x1a30a72003 entry:0xffff89b39cacb000 PTE: 0x0 entry:0xffff89b3b0a72000 The page table after (c) is: PML4: 0x19db5c1003 entry:0xffff899bdcd2f000 PDPE: 0x1a1cacb003 entry:0xffff89b35b5c1000 PDE: 0x21d200883 entry:0xffff89b39cacb000 (*) Because the PDE entry after (b) is present, it won't be flushed even if the iommu driver flush cache when unmap, so the obsolete data may be preserved in cache, which would cause the wrong translation at end. However, we can see the PDE entry is finally switch to 2M-superpage mapping, but it does not transform to 0x21d200883 directly: 1. PDE: 0x1a30a72003 2. __domain_mapping dma_pte_free_pagetable Set the PDE entry to ZERO Set the PDE entry to 0x21d200883 So we must flush the cache after the entry switch to ZERO to avoid the obsolete info be preserved. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Gonglei (Arei) <arei.gonglei@huawei.com> Fixes: `6491d4d028` ("intel-iommu: Free old page tables before creating superpage") Cc: <stable@vger.kernel.org> # v3.0+ Link: https://lore.kernel.org/linux-iommu/670baaf8-4ff8-4e84-4be3-030b95ab5a5e@huawei.com/ Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210415004628.1779-1-longpeng2@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-15 16:12:31 +02:00
Christophe JAILLET	745610c4a3	iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()' If 'intel_cap_audit()' fails, we should return directly, as already done in the surrounding error handling path. Fixes: `ad3d190299` ("iommu/vt-d: Audit IOMMU Capabilities and add helper functions") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/98d531caabe66012b4fffc7813fd4b9470afd517.1618124777.git.christophe.jaillet@wanadoo.fr Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-15 15:44:38 +02:00
Lu Baolu	906f86c860	iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86 Commit `f68c7f539b` ("iommu/vt-d: Enable write protect for supervisor SVM") added pasid_enable_wpe() which hit below compile error with !X86. ../drivers/iommu/intel/pasid.c: In function 'pasid_enable_wpe': ../drivers/iommu/intel/pasid.c:554:22: error: implicit declaration of function 'read_cr0' [-Werror=implicit-function-declaration] 554 \| unsigned long cr0 = read_cr0(); \| ^~~~~~~~ In file included from ../include/linux/build_bug.h:5, from ../include/linux/bits.h:22, from ../include/linux/bitops.h:6, from ../drivers/iommu/intel/pasid.c:12: ../drivers/iommu/intel/pasid.c:557:23: error: 'X86_CR0_WP' undeclared (first use in this function) 557 \| if (unlikely(!(cr0 & X86_CR0_WP))) { \| ^~~~~~~~~~ ../include/linux/compiler.h:78:42: note: in definition of macro 'unlikely' 78 \| # define unlikely(x) __builtin_expect(!!(x), 0) \| ^ ../drivers/iommu/intel/pasid.c:557:23: note: each undeclared identifier is reported only once for each function it appears in 557 \| if (unlikely(!(cr0 & X86_CR0_WP))) { \| ^~~~~~~~~~ ../include/linux/compiler.h:78:42: note: in definition of macro 'unlikely' 78 \| # define unlikely(x) __builtin_expect(!!(x), 0) \| Add the missing dependency. Cc: Sanjay Kumar <sanjay.k.kumar@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Fixes: `f68c7f539b` ("iommu/vt-d: Enable write protect for supervisor SVM") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20210411062312.3057579-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-15 15:43:20 +02:00
Lu Baolu	8b74b6ab25	iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown When a present pasid entry is disassembled, all kinds of pasid related caches need to be flushed. But when a pasid entry is not being used (PRESENT bit not set), we don't need to do this. Check the PRESENT bit in intel_pasid_tear_down_entry() and avoid flushing caches if it's not set. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 11:55:47 +02:00
Lu Baolu	c0474a606e	iommu/vt-d: Invalidate PASID cache when root/context entry changed When the Intel IOMMU is operating in the scalable mode, some information from the root and context table may be used to tag entries in the PASID cache. Software should invalidate the PASID-cache when changing root or context table entries. Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: `7373a8cc38` ("iommu/vt-d: Setup context and enable RID2PASID support") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 11:55:47 +02:00
Lu Baolu	eea53c5816	iommu/vt-d: Remove WO permissions on second-level paging entries When the first level page table is used for IOVA translation, it only supports Read-Only and Read-Write permissions. The Write-Only permission is not supported as the PRESENT bit (implying Read permission) should always set. When using second level, we still give separate permissions that allows WriteOnly which seems inconsistent and awkward. We want to have consistent behavior. After moving to 1st level, we don't want things to work sometimes, and break if we use 2nd level for the same mappings. Hence remove this configuration. Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: `b802d070a5` ("iommu/vt-d: Use iova over first level") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 11:55:47 +02:00
Lu Baolu	03d205094a	iommu/vt-d: Report the right page fault address The Address field of the Page Request Descriptor only keeps bit [63:12] of the offending address. Convert it to a full address before reporting it to device drivers. Fixes: `eb8d93ea3c` ("iommu/vt-d: Report page request faults for guest SVA") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320025415.641201-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 11:55:46 +02:00
Robin Murphy	a250c23f15	iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE Instead make the global iommu_dma_strict paramete in iommu.c canonical by exporting helpers to get and set it and use those directly in the drivers. This make sure that the iommu.strict parameter also works for the AMD and Intel IOMMU drivers on x86. As those default to lazy flushing a new IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to represent the default if not overriden by an explicit parameter. [ported on top of the other iommu_attr changes and added a few small missing bits] Signed-off-by: Robin Murphy <robin.murphy@arm.com>. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210401155256.298656-19-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:56:53 +02:00
Christoph Hellwig	7e14754778	iommu: remove DOMAIN_ATTR_NESTING Use an explicit enable_nesting method instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Li Yang <leoyang.li@nxp.com> Link: https://lore.kernel.org/r/20210401155256.298656-17-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:56:53 +02:00
Jean-Philippe Brucker	9003351cb6	iommu/vt-d: Support IOMMU_DEV_FEAT_IOPF Allow drivers to query and enable IOMMU_DEV_FEAT_IOPF, which amounts to checking whether PRI is enabled. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20210401154718.307519-5-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:54:29 +02:00
Lu Baolu	6c00612d0c	iommu/vt-d: Report right snoop capability when using FL for IOVA The Intel VT-d driver checks wrong register to report snoop capablility when using first level page table for GPA to HPA translation. This might lead the IOMMU driver to say that it supports snooping control, but in reality, it does not. Fix this by always setting PASID-table-entry.PGSNP whenever a pasid entry is setting up for GPA to HPA translation so that the IOMMU driver could report snoop capability as long as it runs in the scalable mode. Fixes: `b802d070a5` ("iommu/vt-d: Use iova over first level") Suggested-by: Rajesh Sankaran <rajesh.sankaran@intel.com> Suggested-by: Kevin Tian <kevin.tian@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210330021145.13824-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:41:30 +02:00
John Garry	363f266eef	iommu/vt-d: Remove IOVA domain rcache flushing for CPU offlining Now that the core code handles flushing per-IOVA domain CPU rcaches, remove the handling here. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1616675401-151997-3-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:27:27 +02:00
Lu Baolu	442b81836d	iommu/vt-d: Make unnecessarily global functions static Make some functions static as they are only used inside pasid.c. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323010600.678627-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:15:58 +02:00
Lu Baolu	1b169fdf42	iommu/vt-d: Remove unused function declarations Some functions have been removed. Remove the remaining function delarations. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323010600.678627-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:15:49 +02:00
Lu Baolu	06905ea831	iommu/vt-d: Remove SVM_FLAG_PRIVATE_PASID The SVM_FLAG_PRIVATE_PASID has never been referenced in the tree, and there's no plan to have anything to use it. So cleanup it. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323010600.678627-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:15:19 +02:00
Lu Baolu	2e1a44c1c4	iommu/vt-d: Remove svm_dev_ops The svm_dev_ops has never been referenced in the tree, and there's no plan to have anything to use it. Remove it to make the code neat. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323010600.678627-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:15:19 +02:00
Lu Baolu	1d421058c8	iommu/vt-d: Don't set then clear private data in prq_event_thread() The VT-d specification (section 7.6) requires that the value in the Private Data field of a Page Group Response Descriptor must match the value in the Private Data field of the respective Page Request Descriptor. The private data field of a page group response descriptor is set then immediately cleared in prq_event_thread(). This breaks the rule defined by the VT-d specification. Fix it by moving clearing code up. Fixes: `5b438f4ba3` ("iommu/vt-d: Support page request in scalable mode") Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320024156.640798-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:09:54 +02:00
Lu Baolu	803766cbf8	iommu/vt-d: Fix lockdep splat in intel_pasid_get_entry() The pasid_lock is used to synchronize different threads from modifying a same pasid directory entry at the same time. It causes below lockdep splat. [ 83.296538] ======================================================== [ 83.296538] WARNING: possible irq lock inversion dependency detected [ 83.296539] 5.12.0-rc3+ #25 Tainted: G W [ 83.296539] -------------------------------------------------------- [ 83.296540] bash/780 just changed the state of lock: [ 83.296540] ffffffff82b29c98 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.0+0x32/0x110 [ 83.296547] but this lock took another, SOFTIRQ-unsafe lock in the past: [ 83.296547] (pasid_lock){+.+.}-{2:2} [ 83.296548] and interrupts could create inverse lock ordering between them. [ 83.296549] other info that might help us debug this: [ 83.296549] Chain exists of: device_domain_lock --> &iommu->lock --> pasid_lock [ 83.296551] Possible interrupt unsafe locking scenario: [ 83.296551] CPU0 CPU1 [ 83.296552] ---- ---- [ 83.296552] lock(pasid_lock); [ 83.296553] local_irq_disable(); [ 83.296553] lock(device_domain_lock); [ 83.296554] lock(&iommu->lock); [ 83.296554] <Interrupt> [ 83.296554] lock(device_domain_lock); [ 83.296555] * DEADLOCK * Fix it by replacing the pasid_lock with an atomic exchange operation. Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320020916.640115-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:08:09 +02:00
Jon Derrick	9b4a824b88	iommu/vt-d: Use Real PCI DMA device for IRTE VMD retransmits child device MSI-X with the VMD endpoint's requester-id. In order to support direct interrupt remapping of VMD child devices, ensure that the IRTE is programmed with the VMD endpoint's requester-id using pci_real_dma_dev(). Link: https://lore.kernel.org/r/20210210161315.316097-2-jonathan.derrick@intel.com Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de>	2021-03-22 14:08:20 +00:00
Jacob Pan	396bd6f3d9	iommu/vt-d: Calculate and set flags for handle_mm_fault Page requests are originated from the user page fault. Therefore, we shall set FAULT_FLAG_USER. FAULT_FLAG_REMOTE indicates that we are walking an mm which is not guaranteed to be the same as the current->mm and should not be subject to protection key enforcement. Therefore, we should set FAULT_FLAG_REMOTE to avoid faults when both SVM and PKEY are used. References: commit `1b2ee1266e` ("mm/core: Do not enforce PKEY permissions on remote mm access") Reviewed-by: Raj Ashok <ashok.raj@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/1614680040-1989-5-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:42:46 +01:00
Jacob Pan	78a523fe73	iommu/vt-d: Reject unsupported page request modes When supervisor/privilige mode SVM is used, we bind init_mm.pgd with a supervisor PASID. There should not be any page fault for init_mm. Execution request with DMA read is also not supported. This patch checks PRQ descriptor for both unsupported configurations, reject them both with invalid responses. Fixes: `1c4f88b7f1` ("iommu/vt-d: Shared virtual address in scalable mode") Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/1614680040-1989-4-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:42:46 +01:00
Jacob Pan	bb0f61533d	iommu/vt-d: Enable write protect propagation from guest Write protect bit, when set, inhibits supervisor writes to the read-only pages. In guest supervisor shared virtual addressing (SVA), write-protect should be honored upon guest bind supervisor PASID request. This patch extends the VT-d portion of the IOMMU UAPI to include WP bit. WPE bit of the supervisor PASID entry will be set to match CPU CR0.WP bit. Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1614680040-1989-3-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:42:46 +01:00
Jacob Pan	f68c7f539b	iommu/vt-d: Enable write protect for supervisor SVM Write protect bit, when set, inhibits supervisor writes to the read-only pages. In supervisor shared virtual addressing (SVA), where page tables are shared between CPU and DMA, IOMMU PASID entry WPE bit should match CR0.WP bit in the CPU. This patch sets WPE bit for supervisor PASIDs if CR0.WP is set. Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1614680040-1989-2-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:42:46 +01:00
Lu Baolu	6ca69e5841	iommu/vt-d: Report more information about invalidation errors When the invalidation queue errors are encountered, dump the information logged by the VT-d hardware together with the pending queue invalidation descriptors. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210318005340.187311-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:27:46 +01:00
Kyung Min Park	dec991e472	iommu/vt-d: Disable SVM when ATS/PRI/PASID are not enabled in the device Currently, the Intel VT-d supports Shared Virtual Memory (SVM) only when IO page fault is supported. Otherwise, shared memory pages can not be swapped out and need to be pinned. The device needs the Address Translation Service (ATS), Page Request Interface (PRI) and Process Address Space Identifier (PASID) capabilities to be enabled to support IO page fault. Disable SVM when ATS, PRI and PASID are not enabled in the device. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210314201534.918-1-kyung.min.park@intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:23:52 +01:00
Robin Murphy	3542dcb15c	iommu/dma: Resurrect the "forcedac" option In converting intel-iommu over to the common IOMMU DMA ops, it quietly lost the functionality of its "forcedac" option. Since this is a handy thing both for testing and for performance optimisation on certain platforms, reimplement it under the common IOMMU parameter namespace. For the sake of fixing the inadvertent breakage of the Intel-specific parameter, remove the dmar_forcedac remnants and hook it up as an alias while documenting the transition to the new common parameter. Fixes: `c588072bba` ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/7eece8e0ea7bfbe2cd0e30789e0d46df573af9b0.1614961776.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 10:55:23 +01:00
Zenghui Yu	444d66a23c	iommu/vt-d: Fix status code for Allocate/Free PASID command As per Intel vt-d spec, Rev 3.0 (section 10.4.45 "Virtual Command Response Register"), the status code of "No PASID available" error in response to the Allocate PASID command is 2, not 1. The same for "Invalid PASID" error in response to the Free PASID command. We will otherwise see confusing kernel log under the command failure from guest side. Fix it. Fixes: `24f27d32ab` ("iommu/vt-d: Enlightened PASID allocation") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210227073909.432-1-yuzenghui@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-04 13:32:04 +01:00
Joerg Roedel	45e606f272	Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next	2021-02-12 15:27:17 +01:00
Yian Chen	31a75cbbb9	iommu/vt-d: Parse SATC reporting structure Software should parse every SATC table and all devices in the tables reported by the BIOS and keep the information in kernel list for further reference. Signed-off-by: Yian Chen <yian.chen@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210203093329.1617808-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Lu Baolu	933fcd01e9	iommu/vt-d: Add iotlb_sync_map callback Some Intel VT-d hardware implementations don't support memory coherency for page table walk (presented by the Page-Walk-coherency bit in the ecap register), so that software must flush the corresponding CPU cache lines explicitly after each page table entry update. The iommu_map_sg() code iterates through the given scatter-gather list and invokes iommu_map() for each element in the scatter-gather list, which calls into the vendor IOMMU driver through iommu_ops callback. As the result, a single sg mapping may lead to multiple cache line flushes, which leads to the degradation of I/O performance after the commit <c588072bba6b5> ("iommu/vt-d: Convert intel iommu driver to the iommu ops"). Fix this by adding iotlb_sync_map callback and centralizing the clflush operations after all sg mappings. Fixes: `c588072bba` ("iommu/vt-d: Convert intel iommu driver to the iommu ops") Reported-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/linux-iommu/D81314ED-5673-44A6-B597-090E3CB83EB0@oracle.com/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Robin Murphy <robin.murphy@arm.com> [ cel: removed @first_pte, which is no longer used ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/linux-iommu/161177763962.1311.15577661784296014186.stgit@manet.1015granger.net Link: https://lore.kernel.org/r/20210204014401.2846425-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Kyung Min Park	010bf5659e	iommu/vt-d: Move capability check code to cap_audit files Move IOMMU capability check and sanity check code to cap_audit files. Also implement some helper functions for sanity checks. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Kyung Min Park	ad3d190299	iommu/vt-d: Audit IOMMU Capabilities and add helper functions Audit IOMMU Capability/Extended Capability and check if the IOMMUs have the consistent value for features. Report out or scale to the lowest supported when IOMMU features have incompatibility among IOMMUs. Report out features when below features are mismatched: - First Level 5 Level Paging Support (FL5LP) - First Level 1 GByte Page Support (FL1GP) - Read Draining (DRD) - Write Draining (DWD) - Page Selective Invalidation (PSI) - Zero Length Read (ZLR) - Caching Mode (CM) - Protected High/Low-Memory Region (PHMR/PLMR) - Required Write-Buffer Flushing (RWBF) - Advanced Fault Logging (AFL) - RID-PASID Support (RPS) - Scalable Mode Page Walk Coherency (SMPWC) - First Level Translation Support (FLTS) - Second Level Translation Support (SLTS) - No Write Flag Support (NWFS) - Second Level Accessed/Dirty Support (SLADS) - Virtual Command Support (VCS) - Scalable Mode Translation Support (SMTS) - Device TLB Invalidation Throttle (DIT) - Page Drain Support (PDS) - Process Address Space ID Support (PASID) - Extended Accessed Flag Support (EAFS) - Supervisor Request Support (SRS) - Execute Request Support (ERS) - Page Request Support (PRS) - Nested Translation Support (NEST) - Snoop Control (SC) - Pass Through (PT) - Device TLB Support (DT) - Queued Invalidation (QI) - Page walk Coherency (C) Set capability to the lowest supported when below features are mismatched: - Maximum Address Mask Value (MAMV) - Number of Fault Recording Registers (NFR) - Second Level Large Page Support (SLLPS) - Fault Recording Offset (FRO) - Maximum Guest Address Width (MGAW) - Supported Adjusted Guest Address Width (SAGAW) - Number of Domains supported (NDOMS) - Pasid Size Supported (PSS) - Maximum Handle Mask Value (MHMV) - IOTLB Register Offset (IRO) Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Lu Baolu	e1ed66ac30	iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration] trace_qi_submit() could be used when interrupt remapping is supported, but DMA remapping is not. In this case, the following compile error occurs. ../drivers/iommu/intel/dmar.c: In function 'qi_submit_sync': ../drivers/iommu/intel/dmar.c:1311:3: error: implicit declaration of function 'trace_qi_submit'; did you mean 'ftrace_nmi_exit'? [-Werror=implicit-function-declaration] trace_qi_submit(iommu, desc[i].qw0, desc[i].qw1, ^~~~~~~~~~~~~~~ ftrace_nmi_exit Fixes: `f2dd871799` ("iommu/vt-d: Add qi_submit trace event") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210130151907.3929148-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-02 14:43:56 +01:00
Lu Baolu	3aa7c62cb7	iommu/vt-d: Use INVALID response code instead of FAILURE The VT-d IOMMU response RESPONSE_FAILURE for a page request in below cases: - When it gets a Page_Request with no PASID; - When it gets a Page_Request with PASID that is not in use for this device. This is allowed by the spec, but IOMMU driver doesn't support such cases today. When the device receives RESPONSE_FAILURE, it sends the device state machine to HALT state. Now if we try to unload the driver, it hangs since the device doesn't send any outbound transactions to host when the driver is trying to clear things up. The only possible responses would be for invalidation requests. Let's use RESPONSE_INVALID instead for now, so that the device state machine doesn't enter HALT state. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210126080730.2232859-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-29 09:25:24 +01:00
Lu Baolu	28a77185f1	iommu/vt-d: Clear PRQ overflow only when PRQ is empty It is incorrect to always clear PRO when it's set w/o first checking whether the overflow condition has been cleared. Current code assumes that if an overflow condition occurs it must have been cleared by earlier loop. However since the code runs in a threaded context, the overflow condition could occur even after setting the head to the tail under some extreme condition. To be sane, we should read both head/tail again when seeing a pending PRO and only clear PRO after all pending PRs have been handled. Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/ Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-29 09:25:24 +01:00
Nadav Amit	29b3283972	iommu/vt-d: Do not use flush-queue when caching-mode is on When an Intel IOMMU is virtualized, and a physical device is passed-through to the VM, changes of the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and a TLB flush is needed after mapping to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge no real physical IOMMU reports "caching-mode" as turned on. Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit `78d5f0f500` ("intel-iommu: Avoid global flushes with caching mode.") This behavior changed after commit `13cf017446` ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one. Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability. Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization. Fixes: `13cf017446` ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Nadav Amit <namit@vmware.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 13:59:02 +01:00
Lu Baolu	494b3688bb	iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid() An incorrect address mask is being used in the qi_flush_dev_iotlb_pasid() to check the address alignment. This leads to a lot of spurious kernel warnings: [ 485.837093] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0 [ 485.837098] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0 [ 492.494145] qi_flush_dev_iotlb_pasid: 5734 callbacks suppressed [ 492.494147] DMAR: Invalidate non-aligned address 7f7728800000, order 11 [ 492.508965] DMAR: Invalidate non-aligned address 7f7728800000, order 11 Fix it by checking the alignment in right way. Fixes: `288d08e780` ("iommu/vt-d: Handle non-page aligned address") Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20210119043500.1539596-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 13:17:08 +01:00
Lu Baolu	a8ce9ebbec	iommu/vt-d: Preset Access/Dirty bits for IOVA over FL The Access/Dirty bits in the first level page table entry will be set whenever a page table entry was used for address translation or write permission was successfully translated. This is always true when using the first-level page table for kernel IOVA. Instead of wasting hardware cycles to update the certain bits, it's better to set them up at the beginning. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 11:33:35 +01:00
Lu Baolu	f2dd871799	iommu/vt-d: Add qi_submit trace event This adds a new trace event to track the submissions of requests to the invalidation queue. This event will provide the information like: - IOMMU name - Invalidation type - Descriptor raw data A sample output like: \| qi_submit: iotlb_inv dmar1: 0x100e2 0x0 0x0 0x0 \| qi_submit: dev_tlb_inv dmar1: 0x1000000003 0x7ffffffffffff001 0x0 0x0 \| qi_submit: iotlb_inv dmar2: 0x800f2 0xf9a00005 0x0 0x0 This will be helpful for queued invalidation related debugging. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210114090400.736104-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 11:32:13 +01:00
Lu Baolu	9872f9bd9d	iommu/vt-d: Consolidate duplicate cache invaliation code The pasid based IOTLB and devTLB invalidation code is duplicate in several places. Consolidate them by using the common helpers. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210114085021.717041-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 11:30:36 +01:00
Tian Tao	694a1c0ade	iommu/vt-d: Fix duplicate included linux/dma-map-ops.h linux/dma-map-ops.h is included more than once, Remove the one that isn't necessary. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609118774-10083-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-12 16:56:20 +00:00
Lu Baolu	2d6ffc63f1	iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev() The VT-d hardware will ignore those Addr bits which have been masked by the AM field in the PASID-based-IOTLB invalidation descriptor. As the result, if the starting address in the descriptor is not aligned with the address mask, some IOTLB caches might not invalidate. Hence people will see below errors. [ 1093.704661] dmar_fault: 29 callbacks suppressed [ 1093.704664] DMAR: DRHD: handling fault status reg 3 [ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2 fault addr 7f81c968d000 [fault reason 113] SM: Present bit in first-level paging entry is clear Fix this by using aligned address for PASID-based-IOTLB invalidation. Fixes: `1c4f88b7f1` ("iommu/vt-d: Shared virtual address in scalable mode") Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-12 15:47:54 +00:00
Liu Yi L	7c29ada5e7	iommu/vt-d: Fix ineffective devTLB invalidation for subdevices iommu_flush_dev_iotlb() is called to invalidate caches on a device but only loops over the devices which are fully-attached to the domain. For sub-devices, this is ineffective and can result in invalid caching entries left on the device. Fix the missing invalidation by adding a loop over the subdevices and ensuring that 'domain->has_iotlb_device' is updated when attaching to subdevices. Fixes: `67b8e02b5e` ("iommu/vt-d: Aux-domain specific domain attach/detach") Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609949037-25291-4-git-send-email-yi.l.liu@intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 14:38:15 +00:00
Liu Yi L	18abda7a2d	iommu/vt-d: Fix general protection fault in aux_detach_device() The aux-domain attach/detach are not tracked, some data structures might be used after free. This causes general protection faults when multiple subdevices are created and assigned to a same guest machine: \| general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI \| RIP: 0010:intel_iommu_aux_detach_device+0x12a/0x1f0 \| [...] \| Call Trace: \| iommu_aux_detach_device+0x24/0x70 \| vfio_mdev_detach_domain+0x3b/0x60 \| ? vfio_mdev_set_domain+0x50/0x50 \| iommu_group_for_each_dev+0x4f/0x80 \| vfio_iommu_detach_group.isra.0+0x22/0x30 \| vfio_iommu_type1_detach_group.cold+0x71/0x211 \| ? find_exported_symbol_in_section+0x4a/0xd0 \| ? each_symbol_section+0x28/0x50 \| __vfio_group_unset_container+0x4d/0x150 \| vfio_group_try_dissolve_container+0x25/0x30 \| vfio_group_put_external_user+0x13/0x20 \| kvm_vfio_group_put_external_user+0x27/0x40 [kvm] \| kvm_vfio_destroy+0x45/0xb0 [kvm] \| kvm_put_kvm+0x1bb/0x2e0 [kvm] \| kvm_vm_release+0x22/0x30 [kvm] \| __fput+0xcc/0x260 \| ____fput+0xe/0x10 \| task_work_run+0x8f/0xb0 \| do_exit+0x358/0xaf0 \| ? wake_up_state+0x10/0x20 \| ? signal_wake_up_state+0x1a/0x30 \| do_group_exit+0x47/0xb0 \| __x64_sys_exit_group+0x18/0x20 \| do_syscall_64+0x57/0x1d0 \| entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix the crash by tracking the subdevices when attaching and detaching aux-domains. Fixes: `67b8e02b5e` ("iommu/vt-d: Aux-domain specific domain attach/detach") Co-developed-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609949037-25291-3-git-send-email-yi.l.liu@intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 14:35:14 +00:00
Liu Yi L	9ad9f45b3b	iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev 'struct intel_svm' is shared by all devices bound to a give process, but records only a single pointer to a 'struct intel_iommu'. Consequently, cache invalidations may only be applied to a single DMAR unit, and are erroneously skipped for the other devices. In preparation for fixing this, rework the structures so that the iommu pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm' to track them in its device list. Fixes: `1c4f88b7f1` ("iommu/vt-d: Shared virtual address in scalable mode") Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Raj Ashok <ashok.raj@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Reported-by: Guo Kaijie <Kaijie.Guo@intel.com> Reported-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Xin Zeng <xin.zeng@intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Cc: stable@vger.kernel.org # v5.0+ Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 14:34:36 +00:00
Lu Baolu	420d42f6f9	iommu/vt-d: Fix lockdep splat in sva bind()/unbind() Lock(&iommu->lock) without disabling irq causes lockdep warnings. ======================================================== WARNING: possible irq lock inversion dependency detected 5.11.0-rc1+ #828 Not tainted -------------------------------------------------------- kworker/0:1H/120 just changed the state of lock: ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.0+0x32/0x120 but this lock took another, SOFTIRQ-unsafe lock in the past: (&iommu->lock){+.+.}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&iommu->lock); local_irq_disable(); lock(device_domain_lock); lock(&iommu->lock); <Interrupt> lock(device_domain_lock); * DEADLOCK * Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 13:27:14 +00:00
Lu Baolu	1efd17e7ac	iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb() Use IS_ALIGNED() instead. Otherwise, an unaligned address will be ignored. Fixes: `33cd6e642d` ("iommu/vt-d: Flush PASID-based iotlb for iova over first level") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-1-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 13:27:14 +00:00
Dinghao Liu	ff2b46d7cf	iommu/intel: Fix memleak in intel_irq_remapping_alloc When irq_domain_get_irq_data() or irqd_cfg() fails at i == 0, data allocated by kzalloc() has not been freed before returning, which leads to memleak. Fixes: `b106ee63ab` ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210105051837.32118-1-dinghao.liu@zju.edu.cn Signed-off-by: Will Deacon <will@kernel.org>	2021-01-05 19:12:06 +00:00
Linus Torvalds	19778dd504	IOMMU updates for 5.11 - IOVA allocation optimisations and removal of unused code - Introduction of DOMAIN_ATTR_IO_PGTABLE_CFG for parameterising the page-table of an IOMMU domain - Support for changing the default domain type in sysfs - Optimisation to the way in which identity-mapped regions are created - Driver updates: * Arm SMMU updates, including continued work on Shared Virtual Memory * Tegra SMMU updates, including support for PCI devices * Intel VT-D updates, including conversion to the IOMMU-DMA API - Cleanup, kerneldoc and minor refactoring -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/XWy8QHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNPejB/46QsXATkWt7hbDPIxlUvzUG8VP/FBNJ6A3 /4Z+4KBXR3zhvZJOEqTarnm6Uc22tWkYpNS3QAOuRW0EfVeD8H+og4SOA2iri5tR x3GZUCng93APWpHdDtJP7kP/xuU47JsBblY/Ip9aJKYoXi9c9svtssAqKr008wxr knv/xv/awQ0O7CNc3gAoz7mUagQxG/no+HMXMT3Fz9KWRzzvTi6s+7ZDm2faI0hO GEJygsKbXxe1qbfeGqKTP/67EJVqjTGsLCF2zMogbnnD7DxadJ2hP0oNg5tvldT/ oDj9YWG6oLMfIVCwDVQXuWNfKxd7RGORMbYwKNAaRSvmkli6625h =KFOO -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull IOMMU updates from Will Deacon: "There's a good mixture of improvements to the core code and driver changes across the board. One thing worth pointing out is that this includes a quirk to work around behaviour in the i915 driver (see `65f746e828` ("iommu: Add quirk for Intel graphic devices in map_sg")), which otherwise interacts badly with the conversion of the intel IOMMU driver over to the DMA-IOMMU APU but has being fixed properly in the DRM tree. We'll revert the quirk later this cycle once we've confirmed that things don't fall apart without it. Summary: - IOVA allocation optimisations and removal of unused code - Introduction of DOMAIN_ATTR_IO_PGTABLE_CFG for parameterising the page-table of an IOMMU domain - Support for changing the default domain type in sysfs - Optimisation to the way in which identity-mapped regions are created - Driver updates: * Arm SMMU updates, including continued work on Shared Virtual Memory * Tegra SMMU updates, including support for PCI devices * Intel VT-D updates, including conversion to the IOMMU-DMA API - Cleanup, kerneldoc and minor refactoring" * tag 'iommu-updates-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (50 commits) iommu/amd: Add sanity check for interrupt remapping table length macros dma-iommu: remove __iommu_dma_mmap iommu/io-pgtable: Remove tlb_flush_leaf iommu: Stop exporting free_iova_mem() iommu: Stop exporting alloc_iova_mem() iommu: Delete split_and_remove_iova() iommu/io-pgtable-arm: Remove unused 'level' parameter from iopte_type() macro iommu: Defer the early return in arm_(v7s/lpae)_map iommu: Improve the performance for direct_mapping iommu: avoid taking iova_rbtree_lock twice iommu/vt-d: Avoid GFP_ATOMIC where it is not needed iommu/vt-d: Remove set but not used variable iommu: return error code when it can't get group iommu: Fix htmldocs warnings in sysfs-kernel-iommu_groups iommu: arm-smmu-impl: Add a space before open parenthesis iommu: arm-smmu-impl: Use table to list QCOM implementations iommu/arm-smmu: Move non-strict mode to use io_pgtable_domain_attr iommu/arm-smmu: Add support for pagetable config domain attribute iommu: Document usage of "/sys/kernel/iommu_groups/<grp_id>/type" file iommu: Take lock before reading iommu group default domain type ...	2020-12-16 13:58:47 -08:00
Linus Torvalds	148842c98a	Yet another large set of x86 interrupt management updates: - Simplification and distangling of the MSI related functionality - Let IO/APIC construct the RTE entries from an MSI message instead of having IO/APIC specific code in the interrupt remapping drivers - Make the retrieval of the parent interrupt domain (vector or remap unit) less hardcoded and use the relevant irqdomain callbacks for selection. - Allow the handling of more than 255 CPUs without a virtualized IOMMU when the hypervisor supports it. This has made been possible by the above modifications and also simplifies the existing workaround in the HyperV specific virtual IOMMU. - Cleanup of the historical timer_works() irq flags related inconsistencies. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/Xxd8THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoYpOD/9C5TppNlPMUyx2SflH6bxt37pJEpln +hYTKsk+jSThntr5mfj+GifGvgmHOVBTGnlDUnUnrpN7TQmLFBzwTOtnBLW53AO2 16/u0+Xci4LNCtEkaymf0Rq4MfsfriXHPJr0A/CnZ0tpHSf5QKHAiitSiGujdMlb gbq43+zXd+jNkH7vkOLPX/7dZVI1hNASQEevJu2tRR4xYTuXFdBxvLgYkHtYKKrK R1sbs6nI6yIzye2u4m4xGu29SxgUft+zdUf+UehJKM3yFmf51d9qpkX+kLaTWuaL VPsMItbn0kdvxwXQWO6DYnIAAnVKCklyHQJTZCoNq9Fe91OoByak1CEVspSOa1av JmycNSch4IYWasR4vVCB1gbb+V9SejcKu5SV3CDrEDqwkOIpfiqpriUXSCJTLlFd QOEDOLuuk/79Qs//J/tb/nJ4IuKv8WPudDfIlMro8wUsAr67DjD4mnXprZ+svwWx Ct/0/Memk+BSa0cw6pvg24BUZGN6zrufkBu2HKT9GOXRUdNkdLkiPhT8mK4T/O0l f90QCLjPSOJ/K/pLEWdUHEPmgC5Q9RsXOmwVGqX+RbjfP7mYTJXlmWnBb+cFNch0 xFIH3SxVGylxxT06NX3SkvinrHj10CoAlmneefBlLtx6dF+2P84DAMZSF0OFToVI c2KMg5zoesI4bg== =8Gfs -----END PGP SIGNATURE----- Merge tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "Yet another large set of x86 interrupt management updates: - Simplification and distangling of the MSI related functionality - Let IO/APIC construct the RTE entries from an MSI message instead of having IO/APIC specific code in the interrupt remapping drivers - Make the retrieval of the parent interrupt domain (vector or remap unit) less hardcoded and use the relevant irqdomain callbacks for selection. - Allow the handling of more than 255 CPUs without a virtualized IOMMU when the hypervisor supports it. This has made been possible by the above modifications and also simplifies the existing workaround in the HyperV specific virtual IOMMU. - Cleanup of the historical timer_works() irq flags related inconsistencies" * tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) x86/ioapic: Cleanup the timer_works() irqflags mess iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select() iommu/amd: Fix IOMMU interrupt generation in X2APIC mode iommu/amd: Don't register interrupt remapping irqdomain when IR is disabled iommu/amd: Fix union of bitfields in intcapxt support x86/ioapic: Correct the PCI/ISA trigger type selection x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it x86/kvm: Enable 15-bit extension when KVM_FEATURE_MSI_EXT_DEST_ID detected iommu/hyper-v: Disable IRQ pseudo-remapping if 15 bit APIC IDs are available x86/apic: Support 15 bits of APIC ID in MSI where available x86/ioapic: Handle Extended Destination ID field in RTE iommu/vt-d: Simplify intel_irq_remapping_select() x86: Kill all traces of irq_remapping_get_irq_domain() x86/ioapic: Use irq_find_matching_fwspec() to find remapping irqdomain x86/hpet: Use irq_find_matching_fwspec() to find remapping irqdomain iommu/hyper-v: Implement select() method on remapping irqdomain iommu/vt-d: Implement select() method on remapping irqdomain iommu/amd: Implement select() method on remapping irqdomain x86/apic: Add select() method on vector irqdomain ...	2020-12-14 18:59:53 -08:00
Will Deacon	c74009f529	Merge branch 'for-next/iommu/fixes' into for-next/iommu/core Merge in IOMMU fixes for 5.10 in order to resolve conflicts against the queue for 5.11. * for-next/iommu/fixes: iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs iommu/vt-d: Don't read VCCAP register unless it exists x86/tboot: Don't disable swiotlb when iommu is forced on iommu: Check return of __iommu_attach_device() arm-smmu-qcom: Ensure the qcom_scm driver has finished probing iommu/amd: Enforce 4k mapping for certain IOMMU data structures MAINTAINERS: Temporarily add myself to the IOMMU entry iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set iommu/vt-d: Avoid panic if iommu init fails in tboot system iommu/vt-d: Cure VF irqdomain hickup x86/platform/uv: Fix copied UV5 output archtype x86/platform/uv: Drop last traces of uv_flush_tlb_others	2020-12-08 15:21:49 +00:00
Will Deacon	113eb4ce4f	Merge branch 'for-next/iommu/vt-d' into for-next/iommu/core Intel VT-D updates for 5.11. The main thing here is converting the code over to the iommu-dma API, which required some improvements to the core code to preserve existing functionality. * for-next/iommu/vt-d: iommu/vt-d: Avoid GFP_ATOMIC where it is not needed iommu/vt-d: Remove set but not used variable iommu/vt-d: Cleanup after converting to dma-iommu ops iommu/vt-d: Convert intel iommu driver to the iommu ops iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev iommu: Add quirk for Intel graphic devices in map_sg iommu: Allow the dma-iommu api to use bounce buffers iommu: Add iommu_dma_free_cpu_cached_iovas() iommu: Handle freelists when using deferred flushing in iommu drivers iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM	2020-12-08 15:11:58 +00:00
Will Deacon	a5f12de3ec	Merge branch 'for-next/iommu/svm' into for-next/iommu/core More steps along the way to Shared Virtual {Addressing, Memory} support for Arm's SMMUv3, including the addition of a helper library that can be shared amongst other IOMMU implementations wishing to support this feature. * for-next/iommu/svm: iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind() iommu/sva: Add PASID helpers iommu/ioasid: Add ioasid references	2020-12-08 15:07:49 +00:00
Christophe JAILLET	33e0715710	iommu/vt-d: Avoid GFP_ATOMIC where it is not needed There is no reason to use GFP_ATOMIC in a 'suspend' function. Use GFP_KERNEL instead to give more opportunities to allocate the requested memory. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201030182630.5154-1-christophe.jaillet@wanadoo.fr Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201201013149.2466272-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-12-01 15:02:20 +00:00
Linus Torvalds	6adf33a5e4	iommu fixes for -rc6 - Fix intel iommu driver when running on devices without VCCAP_REG - Fix swiotlb and "iommu=pt" interaction under TXT (tboot) - Fix missing return value check during device probe() - Fix probe ordering for Qualcomm SMMU implementation - Ensure page-sized mappings are used for AMD IOMMU buffers with SNP RMP -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/A3msQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNAI6B/9/MLjurPmQrSusq9Y7dhnGR7aahtICLAwR UDZHebGTeVxNqtTklQVl/1qgcY7DxU40LAO1777MM7eHOW1FnlCAWrkTo6BBQsGx U1FKegOJ/0eVHFPtFDfM7IA2skeLwlZW+hywNLAksme5mtd6iZG9yQLlDFqAjxL7 v8uXgfHFn6Z2MvMc2O+IeKTtflIwPek/6rYuaEf7UknA6ZYPAD3hnu9i1RTEuUAA h2PoVrrJ/KefEsCMUIq2jwMTsSvxohDH8ClGK9b6h74J2CLKKuhALSgABRAmsEL9 7w5TVdtMQ85n3ccnXyT4RBQ+O/eVtmsKSdfeAbFI4nG9e9j7YE1L =BI1u -----END PGP SIGNATURE----- Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull iommu fixes from Will Deacon: "Here's another round of IOMMU fixes for -rc6 consisting mainly of a bunch of independent driver fixes. Thomas agreed for me to take the x86 'tboot' fix here, as it fixes a regression introduced by a vt-d change. - Fix intel iommu driver when running on devices without VCCAP_REG - Fix swiotlb and "iommu=pt" interaction under TXT (tboot) - Fix missing return value check during device probe() - Fix probe ordering for Qualcomm SMMU implementation - Ensure page-sized mappings are used for AMD IOMMU buffers with SNP RMP" * tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: iommu/vt-d: Don't read VCCAP register unless it exists x86/tboot: Don't disable swiotlb when iommu is forced on iommu: Check return of __iommu_attach_device() arm-smmu-qcom: Ensure the qcom_scm driver has finished probing iommu/amd: Enforce 4k mapping for certain IOMMU data structures	2020-11-27 10:41:19 -08:00
Lu Baolu	405a43cc00	iommu/vt-d: Remove set but not used variable Fixes gcc '-Wunused-but-set-variable' warning: drivers/iommu/intel/iommu.c:5643:27: warning: variable 'last_pfn' set but not used [-Wunused-but-set-variable] 5643 \| unsigned long start_pfn, last_pfn; \| ^~~~~~~~ This variable is never used, so remove it. Fixes: `2a2b8eaa5b` ("iommu: Handle freelists when using deferred flushing in iommu drivers") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201127013308.1833610-1-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-27 09:52:28 +00:00
David Woodhouse	d76b42e927	iommu/vt-d: Don't read VCCAP register unless it exists My virtual IOMMU implementation is whining that the guest is reading a register that doesn't exist. Only read the VCCAP_REG if the corresponding capability is set in ECAP_REG to indicate that it actually exists. Fixes: `3375303e82` ("iommu/vt-d: Add custom allocator for IOASID") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Cc: stable@vger.kernel.org # v5.7+ Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/de32b150ffaa752e0cff8571b17dfb1213fbe71c.camel@infradead.org Signed-off-by: Will Deacon <will@kernel.org>	2020-11-26 14:50:24 +00:00
Lu Baolu	28b41e2c6a	iommu: Move def_domain type check for untrusted device into core So that the vendor iommu drivers are no more required to provide the def_domain_type callback to always isolate the untrusted devices. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Link: https://lore.kernel.org/linux-iommu/243ce89c33fe4b9da4c56ba35acebf81@huawei.com/ Link: https://lore.kernel.org/r/20201124130604.2912899-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-25 12:14:33 +00:00
Lu Baolu	58a8bb3949	iommu/vt-d: Cleanup after converting to dma-iommu ops Some cleanups after converting the driver to use dma-iommu ops. - Remove nobounce option; - Cleanup and simplify the path in domain mapping. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/20201124082057.2614359-8-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-25 12:03:49 +00:00
Tom Murphy	c588072bba	iommu/vt-d: Convert intel iommu driver to the iommu ops Convert the intel iommu driver to the dma-iommu api. Remove the iova handling and reserve region code from the intel iommu driver. Signed-off-by: Tom Murphy <murphyt7@tcd.ie> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/20201124082057.2614359-7-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-25 12:03:49 +00:00
Lu Baolu	c062db039f	iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev The iommu-dma constrains IOVA allocation based on the domain geometry that the driver reports. Update domain geometry everytime a domain is attached to or detached from a device. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-25 12:03:48 +00:00
Tom Murphy	2a2b8eaa5b	iommu: Handle freelists when using deferred flushing in iommu drivers Allow the iommu_unmap_fast to return newly freed page table pages and pass the freelist to queue_iova in the dma-iommu ops path. This is useful for iommu drivers (in this case the intel iommu driver) which need to wait for the ioTLB to be flushed before newly free/unmapped page table pages can be freed. This way we can still batch ioTLB free operations and handle the freelists. Signed-off-by: Tom Murphy <murphyt7@tcd.ie> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/20201124082057.2614359-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-25 12:03:48 +00:00
Jean-Philippe Brucker	cb4789b0d1	iommu/ioasid: Add ioasid references Let IOASID users take references to existing ioasids with ioasid_get(). ioasid_put() drops a reference and only frees the ioasid when its reference number is zero. It returns true if the ioasid was freed. For drivers that don't call ioasid_get(), ioasid_put() is the same as ioasid_free(). Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>	2020-11-23 14:16:55 +00:00
Will Deacon	66930e7e1e	Merge branch 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb into for-next/iommu/vt-d Merge swiotlb updates from Konrad, as we depend on the updated function prototype for swiotlb_tbl_map_single(), which dropped the 'tbl_dma_addr' argument in -rc4. * 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"	2020-11-23 10:46:43 +00:00
Linus Torvalds	fc8299f9f3	iommu fixes for -rc5 - Fix boot when intel iommu initialisation fails under TXT (tboot) - Fix intel iommu compilation error when DMAR is enabled without ATS - Temporarily update IOMMU MAINTAINERs entry -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+3p6gQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNPowCADBKq5PBRwqIM1tmntRu5rePf/WuB1BotlJ cUUR8dIxFDvKMeGvN4mDbn/5jjL1+TlD7sQqFh+gcWJ9tImpAMzp1ENbg71isl5o Ej21X9YhjfiIsKpTrNGxcetCkYaR2tp8Z4/WzSQaO+IGB578dQ9VwiwSnDyFdWUb 1Qcj712ainYvKjbq4NyL80lPm9v43OV8QZ73WeelzBjE2UcDFAV0Xl4BIX8uuGtv I1x03JfgzmzhVBmpj5HUGPsSopMDdo5K4vlcqscqSqvBY90iHD5fgRF4ZatTjR1t PVM/1+c9vdJIWSxSt1hsB8DPtqXVOvGjzrLW/h63rxyWsISwkAq0 =DNX/ -----END PGP SIGNATURE----- Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull iommu fixes from Will Deacon: "Two straightforward vt-d fixes: - Fix boot when intel iommu initialisation fails under TXT (tboot) - Fix intel iommu compilation error when DMAR is enabled without ATS and temporarily update IOMMU MAINTAINERs entry" * tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: MAINTAINERS: Temporarily add myself to the IOMMU entry iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set iommu/vt-d: Avoid panic if iommu init fails in tboot system	2020-11-20 10:20:16 -08:00
Lu Baolu	3645a34f5b	iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set Fix the compile error below (CONFIG_PCI_ATS not set): drivers/iommu/intel/dmar.c: In function ‘vf_inherit_msi_domain’: drivers/iommu/intel/dmar.c:338:59: error: ‘struct pci_dev’ has no member named ‘physfn’; did you mean ‘is_physfn’? 338 \| dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev)); \| ^~~~~~ \| is_physfn Fixes: `ff828729be` ("iommu/vt-d: Cure VF irqdomain hickup") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/linux-iommu/CAMuHMdXA7wfJovmfSH2nbAhN0cPyCiFHodTvg4a8Hm9rx5Dj-w@mail.gmail.com/ Link: https://lore.kernel.org/r/20201119055119.2862701-1-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-19 09:52:26 +00:00
Will Deacon	388255ce95	A small set of fixes for x86: - Cure the fallout from the MSI irqdomain overhaul which missed that the Intel IOMMU does not register virtual function devices and therefore never reaches the point where the MSI interrupt domain is assigned. This makes the VF devices use the non-remapped MSI domain which is trapped by the IOMMU/remap unit. - Remove an extra space in the SGI_UV architecture type procfs output for UV5. - Remove a unused function which was missed when removing the UV BAU TLB shootdown handler. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xJi0THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoVWxD/9Tq4W6Kniln7mtoEWHRvHRceiiGcS3 MocvqurhoJwirH4F2gkvCegTBy0r3FdUORy3OMmChVs6nb8XpPpso84SANCRePWp JZezpVwLSNC4O1/ZCg1Kjj4eUpzLB/UjUUQV9RsjL5wyQEhfCZgb1D40yLM/2dj5 SkVm/EAqWuQNtYe/jqAOwTX/7mV+k2QEmKCNOigM13R9EWgu6a4J8ta1gtNSbwvN jWMW+M1KjZ76pfRK+y4OpbuFixteSzhSWYPITSGwQz4IpQ+Ty2Rv0zzjidmDnAR+ Q73cup0dretdVnVDRpMwDc06dBCmt/rbN50w4yGU0YFRFDgjGc8sIbQzuIP81nEQ XY4l4rcBgyVufFsLrRpQxu1iYPFrcgU38W1kRkkJ3Kl/rY1a2ZU7sLE4kt4Oh55W A9KCmsfqP1PCYppjAQ0QT4NOp4YtecPvAU4UcBOb722DDBd8TfhLWWGw2yG57Q/d Wnu8xCJGy7BaLHLGGGseAft+D4aNnCjKC3jgMyvNtRDXaV2cK2Kdd6ehMlWVUapD xfLlKXE+igXMyoWJIWjTXQJs4dpKu6QpJCPiorwEZ8rmNaRfxsWEJVbeYwEkmUke bMoBBSCbZT86WVOYhI8WtrIemraY0mMYrrcE03M96HU3eYB8BV92KrIzZWThupcQ ZqkZbqCZm3vfHA== =X/P+ -----END PGP SIGNATURE----- Merge tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-next/iommu/fixes Pull in x86 fixes from Thomas, as they include a change to the Intel DMAR code on which we depend: * tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: iommu/vt-d: Cure VF irqdomain hickup x86/platform/uv: Fix copied UV5 output archtype x86/platform/uv: Drop last traces of uv_flush_tlb_others	2020-11-19 09:50:06 +00:00
Zhenzhong Duan	4d213e76a3	iommu/vt-d: Avoid panic if iommu init fails in tboot system "intel_iommu=off" command line is used to disable iommu but iommu is force enabled in a tboot system for security reason. However for better performance on high speed network device, a new option "intel_iommu=tboot_noforce" is introduced to disable the force on. By default kernel should panic if iommu init fail in tboot for security reason, but it's unnecessory if we use "intel_iommu=tboot_noforce,off". Fix the code setting force_on and move intel_iommu_tboot_noforce from tboot code to intel iommu code. Fixes: `7304e8f28b` ("iommu/vt-d: Correctly disable Intel IOMMU force on") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com> Tested-by: Lukasz Hawrylko <lukasz.hawrylko@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201110071908.3133-1-zhenzhong.duan@gmail.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-18 13:09:07 +00:00
Lukas Bulwahn	68dd9d89ea	iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM Commit `6ee1b77ba3` ("iommu/vt-d: Add svm/sva invalidate function") introduced intel_iommu_sva_invalidate() when CONFIG_INTEL_IOMMU_SVM. This function uses the dedicated static variable inv_type_granu_table and functions to_vtd_granularity() and to_vtd_size(). These parts are unused when !CONFIG_INTEL_IOMMU_SVM, and hence, make CC=clang W=1 warns with an -Wunused-function warning. Include these parts conditionally on CONFIG_INTEL_IOMMU_SVM. Fixes: `6ee1b77ba3` ("iommu/vt-d: Add svm/sva invalidate function") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201115205951.20698-1-lukas.bulwahn@gmail.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-17 22:29:49 +00:00
Linus Torvalds	326fd6db61	A small set of fixes for x86: - Cure the fallout from the MSI irqdomain overhaul which missed that the Intel IOMMU does not register virtual function devices and therefore never reaches the point where the MSI interrupt domain is assigned. This makes the VF devices use the non-remapped MSI domain which is trapped by the IOMMU/remap unit. - Remove an extra space in the SGI_UV architecture type procfs output for UV5. - Remove a unused function which was missed when removing the UV BAU TLB shootdown handler. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xJi0THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoVWxD/9Tq4W6Kniln7mtoEWHRvHRceiiGcS3 MocvqurhoJwirH4F2gkvCegTBy0r3FdUORy3OMmChVs6nb8XpPpso84SANCRePWp JZezpVwLSNC4O1/ZCg1Kjj4eUpzLB/UjUUQV9RsjL5wyQEhfCZgb1D40yLM/2dj5 SkVm/EAqWuQNtYe/jqAOwTX/7mV+k2QEmKCNOigM13R9EWgu6a4J8ta1gtNSbwvN jWMW+M1KjZ76pfRK+y4OpbuFixteSzhSWYPITSGwQz4IpQ+Ty2Rv0zzjidmDnAR+ Q73cup0dretdVnVDRpMwDc06dBCmt/rbN50w4yGU0YFRFDgjGc8sIbQzuIP81nEQ XY4l4rcBgyVufFsLrRpQxu1iYPFrcgU38W1kRkkJ3Kl/rY1a2ZU7sLE4kt4Oh55W A9KCmsfqP1PCYppjAQ0QT4NOp4YtecPvAU4UcBOb722DDBd8TfhLWWGw2yG57Q/d Wnu8xCJGy7BaLHLGGGseAft+D4aNnCjKC3jgMyvNtRDXaV2cK2Kdd6ehMlWVUapD xfLlKXE+igXMyoWJIWjTXQJs4dpKu6QpJCPiorwEZ8rmNaRfxsWEJVbeYwEkmUke bMoBBSCbZT86WVOYhI8WtrIemraY0mMYrrcE03M96HU3eYB8BV92KrIzZWThupcQ ZqkZbqCZm3vfHA== =X/P+ -----END PGP SIGNATURE----- Merge tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of fixes for x86: - Cure the fallout from the MSI irqdomain overhaul which missed that the Intel IOMMU does not register virtual function devices and therefore never reaches the point where the MSI interrupt domain is assigned. This made the VF devices use the non-remapped MSI domain which is trapped by the IOMMU/remap unit - Remove an extra space in the SGI_UV architecture type procfs output for UV5 - Remove a unused function which was missed when removing the UV BAU TLB shootdown handler" * tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: iommu/vt-d: Cure VF irqdomain hickup x86/platform/uv: Fix copied UV5 output archtype x86/platform/uv: Drop last traces of uv_flush_tlb_others	2020-11-15 09:49:56 -08:00
Thomas Gleixner	ff828729be	iommu/vt-d: Cure VF irqdomain hickup The recent changes to store the MSI irqdomain pointer in struct device missed that Intel DMAR does not register virtual function devices. Due to that a VF device gets the plain PCI-MSI domain assigned and then issues compat MSI messages which get caught by the interrupt remapping unit. Cure that by inheriting the irq domain from the physical function device. Ideally the irqdomain would be associated to the bus, but DMAR can have multiple units and therefore irqdomains on a single bus. The VF 'bus' could of course inherit the domain from the PF, but that'd be yet another x86 oddity. Fixes: `85a8dfc57a` ("iommm/vt-d: Store irq domain in struct device") Reported-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Link: https://lore.kernel.org/r/draft-87eekymlpz.fsf@nanos.tec.linutronix.de	2020-11-13 12:00:40 +01:00
Linus Torvalds	3d5e28bff7	Merge branch 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fixes from Konrad Rzeszutek Wilk: "Two tiny fixes for issues that make drivers under Xen unhappy under certain conditions" * 'stable/for-linus-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single swiotlb: fix "x86: Don't panic if can not alloc buffer for swiotlb"	2020-11-11 14:15:06 -08:00
Liu, Yi L	71cd8e2d16	iommu/vt-d: Fix a bug for PDP check in prq_event_thread In prq_event_thread(), the QI_PGRP_PDP is wrongly set by 'req->pasid_present' which should be replaced to 'req->priv_data_present'. Fixes: `5b438f4ba3` ("iommu/vt-d: Support page request in scalable mode") Signed-off-by: Liu, Yi L <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1604025444-6954-3-git-send-email-yi.y.sun@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-11-03 14:37:02 +01:00
Liu Yi L	eea4e29ab8	iommu/vt-d: Fix sid not set issue in intel_svm_bind_gpasid() Should get correct sid and set it into sdev. Because we execute 'sdev->sid != req->rid' in the loop of prq_event_thread(). Fixes: `eb8d93ea3c` ("iommu/vt-d: Report page request faults for guest SVA") Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1604025444-6954-2-git-send-email-yi.y.sun@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-11-03 14:37:02 +01:00
Lu Baolu	6097df457a	iommu/vt-d: Fix kernel NULL pointer dereference in find_domain() If calling find_domain() for a device which hasn't been probed by the iommu core, below kernel NULL pointer dereference issue happens. [ 362.736947] BUG: kernel NULL pointer dereference, address: 0000000000000038 [ 362.743953] #PF: supervisor read access in kernel mode [ 362.749115] #PF: error_code(0x0000) - not-present page [ 362.754278] PGD 0 P4D 0 [ 362.756843] Oops: 0000 [#1] SMP NOPTI [ 362.760528] CPU: 0 PID: 844 Comm: cat Not tainted 5.9.0-rc4-intel-next+ #1 [ 362.767428] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3384.A02.1909200816 09/20/2019 [ 362.781109] RIP: 0010:find_domain+0xd/0x40 [ 362.785234] Code: 48 81 fb 60 28 d9 b2 75 de 5b 41 5c 41 5d 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 e0 02 00 00 55 <48> 8b 40 38 48 89 e5 48 83 f8 fe 0f 94 c1 48 85 ff 0f 94 c2 08 d1 [ 362.804041] RSP: 0018:ffffb09cc1f0bd38 EFLAGS: 00010046 [ 362.809292] RAX: 0000000000000000 RBX: ffff905b98e4fac8 RCX: 0000000000000000 [ 362.816452] RDX: 0000000000000001 RSI: ffff905b98e4fac8 RDI: ffff905b9ccd40d0 [ 362.823617] RBP: ffffb09cc1f0bda0 R08: ffffb09cc1f0bd48 R09: 000000000000000f [ 362.830778] R10: ffffffffb266c080 R11: ffff905b9042602d R12: ffff905b98e4fac8 [ 362.837944] R13: ffffb09cc1f0bd48 R14: ffff905b9ccd40d0 R15: ffff905b98e4fac8 [ 362.845108] FS: 00007f8485460740(0000) GS:ffff905b9fc00000(0000) knlGS:0000000000000000 [ 362.853227] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 362.858996] CR2: 0000000000000038 CR3: 00000004627a6003 CR4: 0000000000770ef0 [ 362.866161] PKRU: fffffffc [ 362.868890] Call Trace: [ 362.871363] ? show_device_domain_translation+0x32/0x100 [ 362.876700] ? bind_store+0x110/0x110 [ 362.880387] ? klist_next+0x91/0x120 [ 362.883987] ? domain_translation_struct_show+0x50/0x50 [ 362.889237] bus_for_each_dev+0x79/0xc0 [ 362.893121] domain_translation_struct_show+0x36/0x50 [ 362.898204] seq_read+0x135/0x410 [ 362.901545] ? handle_mm_fault+0xeb8/0x1750 [ 362.905755] full_proxy_read+0x5c/0x90 [ 362.909526] vfs_read+0xa6/0x190 [ 362.912782] ksys_read+0x61/0xe0 [ 362.916037] __x64_sys_read+0x1a/0x20 [ 362.919725] do_syscall_64+0x37/0x80 [ 362.923329] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 362.928405] RIP: 0033:0x7f84855c5e95 Filter out those devices to avoid such error. Fixes: `e2726daea5` ("iommu/vt-d: debugfs: Add support to show page table internals") Reported-and-tested-by: Xu Pengfei <pengfei.xu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: stable@vger.kernel.org#v5.6+ Link: https://lore.kernel.org/r/20201028070725.24979-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-11-03 14:30:10 +01:00
Christoph Hellwig	fc0021aa34	swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single The tbl_dma_addr argument is used to check the DMA boundary for the allocations, and thus needs to be a dma_addr_t. swiotlb-xen instead passed a physical address, which could lead to incorrect results for strange offsets. Fix this by removing the parameter entirely and hard code the DMA address for io_tlb_start instead. Fixes: `91ffe4ad53` ("swiotlb-xen: introduce phys_to_dma/dma_to_phys translations") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2020-11-02 10:10:39 -05:00
David Woodhouse	79eb3581bc	iommu/vt-d: Simplify intel_irq_remapping_select() Now that the old get_irq_domain() method has gone, consolidate on just the map_XXX_to_iommu() functions. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-31-dwmw2@infradead.org	2020-10-28 20:26:28 +01:00
David Woodhouse	ed381fca47	x86: Kill all traces of irq_remapping_get_irq_domain() All users are converted to use the fwspec based parent domain lookup. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-30-dwmw2@infradead.org	2020-10-28 20:26:28 +01:00
David Woodhouse	a87fb465ff	iommu/vt-d: Implement select() method on remapping irqdomain Preparatory for removing irq_remapping_get_irq_domain() Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-26-dwmw2@infradead.org	2020-10-28 20:26:27 +01:00
David Woodhouse	5d5a971338	x86/ioapic: Generate RTE directly from parent irqchip's MSI message The I/O-APIC generates an MSI cycle with address/data bits taken from its Redirection Table Entry in some combination which used to make sense, but now is just a bunch of bits which get passed through in some seemingly arbitrary order. Instead of making IRQ remapping drivers directly frob the I/OA-PIC RTE, let them just do their job and generate an MSI message. The bit swizzling to turn that MSI message into the I/O-APIC's RTE is the same in all cases, since it's a function of the I/O-APIC hardware. The IRQ remappers have no real need to get involved with that. The only slight caveat is that the I/OAPIC is interpreting some of those fields too, and it does want the 'vector' field to be unique to make EOI work. The AMD IOMMU happens to put its IRTE index in the bits that the I/O-APIC thinks are the vector field, and accommodates this requirement by reserving the first 32 indices for the I/O-APIC. The Intel IOMMU doesn't actually use the bits that the I/O-APIC thinks are the vector field, so it fills in the 'pin' value there instead. [ tglx: Replaced the unreadably macro maze with the cleaned up RTE/msi_msg bitfields and added commentry to explain the mapping magic ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-22-dwmw2@infradead.org	2020-10-28 20:26:27 +01:00
Thomas Gleixner	341b4a7211	x86/ioapic: Cleanup IO/APIC route entry structs Having two seperate structs for the I/O-APIC RTE entries (non-remapped and DMAR remapped) requires type casts and makes it hard to map. Combine them in IO_APIC_routing_entry by defining a union of two 64bit bitfields. Use naming which reflects which bits are shared and which bits are actually different for the operating modes. [dwmw2: Fix it up and finish the job, pulling the 32-bit w1,w2 words for register access into the same union and eliminating a few more places where bits were accessed through masks and shifts.] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-21-dwmw2@infradead.org	2020-10-28 20:26:27 +01:00
Thomas Gleixner	a27dca645d	x86/io_apic: Cleanup trigger/polarity helpers 'trigger' and 'polarity' are used throughout the I/O-APIC code for handling the trigger type (edge/level) and the active low/high configuration. While there are defines for initializing these variables and struct members, they are not used consequently and the meaning of 'trigger' and 'polarity' is opaque and confusing at best. Rename them to 'is_level' and 'active_low' and make them boolean in various structs so it's entirely clear what the meaning is. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-20-dwmw2@infradead.org	2020-10-28 20:26:26 +01:00
Thomas Gleixner	5c0d0e2cc6	iommu/intel: Use msi_msg shadow structs Use the bitfields in the x86 shadow struct to compose the MSI message. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-14-dwmw2@infradead.org	2020-10-28 20:26:25 +01:00
Thomas Gleixner	8c44963b60	x86/apic: Cleanup destination mode apic::irq_dest_mode is actually a boolean, but defined as u32 and named in a way which does not explain what it means. Make it a boolean and rename it to 'dest_mode_logical' Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-9-dwmw2@infradead.org	2020-10-28 20:26:25 +01:00
Thomas Gleixner	721612994f	x86/apic: Cleanup delivery mode defines The enum ioapic_irq_destination_types and the enumerated constants starting with 'dest_' are gross misnomers because they describe the delivery mode. Rename then enum and the constants so they actually make sense. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-6-dwmw2@infradead.org	2020-10-28 20:26:24 +01:00
Linus Torvalds	5c7e3f3f5c	IOMMU Fix for Linux v5.10: - Fix a build regression with !CONFIG_IOMMU_API -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl+Ns1UACgkQK/BELZcB GuPo3Q//RxTLMdvQ6Le3VB3ncbmYBKFWTEexStqrurWjIJbbQI/nZCJIGIkOXqi6 HfJLGbn4QxGmYT0GXLqOZCyEgs8cicXgVBrWrWJepleZ9tY9kYMk9EUf8CFvWvBj Kg3B71q1sVflmG4OGxyxXcqZSD105oeOhAg8nDLxrqSgpEd/3lZfMKbWth/WP9GP qknxmhmlWZ3QHLdPvHNG2wuGOOr7jcAn8dSJurMaADF4F7Zk0JE912JBWXORgffU IrsRb8WxGP6wPp2veMKu5HWw7jSKrdpf+5N5gft0pdiF3AY2diJuPzBdvvyJDtGc MKLWL06H7wZsUygt+L9mW8PEH+VbgwErB83UjHgSsDHJfcD3xfXEpUslkRY5nldB 73pXPLttGSVTf1xzN7YUANtcBiYaCoPRgdrHF9gs2/KSCUjcmBOViffsnpC32239 NmJ71VvsWGFj2tRGoKsMxTxxWV9bfrPxdaa8W89HONvkcG2G1fh8escFEJtqxMUT F5f5rkf3O/z6/ikybqz/g1wsLB45zjP0StFTzvLquNwUeXkRdJqgurd6SzUDAFZA pmmdPQXMzpY7GxBBbrWhUGxJgXoz4DvYRscs1dXECYAtA4Jq5cBk9Ag1PSJRvkiy 0X902jtauH1w76sYue2SJLI5XxbS8DDHGm79HrLR6ZhOr7gplAw= =og0e -----END PGP SIGNATURE----- Merge tag 'iommu-fix-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fix from Joerg Roedel: "Fix a build regression with !CONFIG_IOMMU_API" * tag 'iommu-fix-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built	2020-10-20 09:35:06 -07:00
Bartosz Golaszewski	9def3b1a07	iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built Since commit `c40aaaac10` ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths") dmar.c needs struct iommu_device to be selected. We can drop this dependency by not dereferencing struct iommu_device if IOMMU_API is not selected and by reusing the information stored in iommu->drhd->ignored instead. This fixes the following build error when IOMMU_API is not selected: drivers/iommu/intel/dmar.c: In function ‘free_iommu’: drivers/iommu/intel/dmar.c:1139:41: error: ‘struct iommu_device’ has no member named ‘ops’ 1139 \| if (intel_iommu_enabled && iommu->iommu.ops) { ^ Fixes: `c40aaaac10` ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths") Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20201013073055.11262-1-brgl@bgdev.pl Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-19 14:16:02 +02:00
Linus Torvalds	5a32c3413d	dma-mapping updates for 5.10 - rework the non-coherent DMA allocator - move private definitions out of <linux/dma-mapping.h> - lower CMA_ALIGNMENT (Paul Cercueil) - remove the omap1 dma address translation in favor of the common code - make dma-direct aware of multiple dma offset ranges (Jim Quinlan) - support per-node DMA CMA areas (Barry Song) - increase the default seg boundary limit (Nicolin Chen) - misc fixes (Robin Murphy, Thomas Tai, Xu Wang) - various cleanups -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl+IiPwLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYPKEQ//TM8vxjucnRl/pklpMin49dJorwiVvROLhQqLmdxw 286ZKpVzYYAPc7LnNqwIBugnFZiXuHu8xPKQkIiOa2OtNDTwhKNoBxOAmOJaV6DD 8JfEtZYeX5mKJ/Nqd2iSkIqOvCwZ9Wzii+aytJ2U88wezQr1fnyF4X49MegETEey FHWreSaRWZKa0MMRu9AQ0QxmoNTHAQUNaPc0PeqEtPULybfkGOGw4/ghSB7WcKrA gtKTuooNOSpVEHkTas2TMpcBp6lxtOjFqKzVN0ml+/nqq5NeTSDx91VOCX/6Cj76 mXIg+s7fbACTk/BmkkwAkd0QEw4fo4tyD6Bep/5QNhvEoAriTuSRbhvLdOwFz0EF vhkF0Rer6umdhSK7nPd7SBqn8kAnP4vBbdmB68+nc3lmkqysLyE4VkgkdH/IYYQI 6TJ0oilXWFmU6DT5Rm4FBqCvfcEfU2dUIHJr5wZHqrF2kLzoZ+mpg42fADoG4GuI D/oOsz7soeaRe3eYfWybC0omGR6YYPozZJ9lsfftcElmwSsFrmPsbO1DM5IBkj1B gItmEbOB9ZK3RhIK55T/3u1UWY3Uc/RVr+kchWvADGrWnRQnW0kxYIqDgiOytLFi JZNH8uHpJIwzoJAv6XXSPyEUBwXTG+zK37Ce769HGbUEaUrE71MxBbQAQsK8mDpg 7fM= =Bkf/ -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - rework the non-coherent DMA allocator - move private definitions out of <linux/dma-mapping.h> - lower CMA_ALIGNMENT (Paul Cercueil) - remove the omap1 dma address translation in favor of the common code - make dma-direct aware of multiple dma offset ranges (Jim Quinlan) - support per-node DMA CMA areas (Barry Song) - increase the default seg boundary limit (Nicolin Chen) - misc fixes (Robin Murphy, Thomas Tai, Xu Wang) - various cleanups * tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits) ARM/ixp4xx: add a missing include of dma-map-ops.h dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling dma-direct: factor out a dma_direct_alloc_from_pool helper dma-direct check for highmem pages in dma_direct_alloc_pages dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma dma-mapping: move dma-debug.h to kernel/dma/ dma-mapping: remove <asm/dma-contiguous.h> dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> dma-contiguous: remove dma_contiguous_set_default dma-contiguous: remove dev_set_cma_area dma-contiguous: remove dma_declare_contiguous dma-mapping: split <linux/dma-mapping.h> cma: decrease CMA_ALIGNMENT lower limit to 2 firewire-ohci: use dma_alloc_pages dma-iommu: implement ->alloc_noncoherent dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods dma-mapping: add a new dma_alloc_pages API dma-mapping: remove dma_cache_sync 53c700: convert to dma_alloc_noncoherent ...	2020-10-15 14:43:29 -07:00
Linus Torvalds	531d29b0b6	IOMMU Updates for Linux v5.10 Including: - ARM-SMMU Updates from Will: - Continued SVM enablement, where page-table is shared with CPU - Groundwork to support integrated SMMU with Adreno GPU - Allow disabling of MSI-based polling on the kernel command-line - Minor driver fixes and cleanups (octal permissions, error messages, ...) - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when a device tries DMA on memory owned by a guest. This needs new fault-types as well as a rewrite of the IOMMU memory semaphore for command completions. - Allow broken Intel IOMMUs (wrong address widths reported) to still be used for interrupt remapping. - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access address spaces of processes running in a VM. - Support for the MT8167 IOMMU in the Mediatek IOMMU driver. - Device-tree updates for the Renesas driver to support r8a7742. - Several smaller fixes and cleanups all over the place. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl+Fy9MACgkQK/BELZcB GuNxtRAA0TdYHXt6XyLWmvRAX/ySZSz6KOneZWWwpsQ9wh2/iv1PtBsrV0ltf+6g CaX4ROZUVRbV9wPD+7maBRbzxrG3QhfEaaV+K45Q2J/QE1wjkyV8qj1eORWTUUoc nis4FhGDKk2ER/Gsajy2Hjs4+6i43gdWG/+ghVGaCRo8mCZyoz1/6AyMQyN3deuO NqWOv9E7hsavZjRs/w/LXG7eSE20cZwtt//kPVJF0r9eQqC6i1eJDQj48iRqJVqd R0dwBQZaLz++qQptyKebDNlmH/3aAsb+A8nCeS7ZwHqWC1QujTWOUYWpFyPPbOmC KVsQXzTzRfnVTDECF1Pk5d3yi45KILLU3B4zDJfUJjbL3KDYjuVUvhHF/pcGcjC3 H1LWJqHSAL8sJwHvKhpi0VtQ5SOxXnLO5fGG/CZT/Xb4QyM+mkwkFLdn1TryZTR/ M4XA+QuI96TzY7HQUJdSoEDANxoBef6gPnxdDKOnK1v4hfNsPAl7o8hZkM3w0DK8 GoFZUV+vjBhFcymGcQegSNiea28Hfi+hBe+PPHCmw+tJm47cketD5uP5jJ5NGaUe MKU/QXWXc6oqeBTQT6ki5zJbJXKttbPa8eEmp+FrMatc9kruvBVhQoMbj7Vd3CA1 dC4zK9Awy7yj24ZhZfnAFx2DboCmBTUI3QKjDt9K5PRZyMeyoP8= =C0Sg -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - ARM-SMMU Updates from Will: - Continued SVM enablement, where page-table is shared with CPU - Groundwork to support integrated SMMU with Adreno GPU - Allow disabling of MSI-based polling on the kernel command-line - Minor driver fixes and cleanups (octal permissions, error messages, ...) - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when a device tries DMA on memory owned by a guest. This needs new fault-types as well as a rewrite of the IOMMU memory semaphore for command completions. - Allow broken Intel IOMMUs (wrong address widths reported) to still be used for interrupt remapping. - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access address spaces of processes running in a VM. - Support for the MT8167 IOMMU in the Mediatek IOMMU driver. - Device-tree updates for the Renesas driver to support r8a7742. - Several smaller fixes and cleanups all over the place. * tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits) iommu/vt-d: Gracefully handle DMAR units with no supported address widths iommu/vt-d: Check UAPI data processed by IOMMU core iommu/uapi: Handle data and argsz filled by users iommu/uapi: Rename uapi functions iommu/uapi: Use named union for user data iommu/uapi: Add argsz for user filled data docs: IOMMU user API iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate() iommu/arm-smmu-v3: Add SVA device feature iommu/arm-smmu-v3: Check for SVA features iommu/arm-smmu-v3: Seize private ASID iommu/arm-smmu-v3: Share process page tables iommu/arm-smmu-v3: Move definitions to a header iommu/io-pgtable-arm: Move some definitions to a header iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR iommu/amd: Use 4K page for completion wait write-back semaphore iommu/tegra-smmu: Allow to group clients in same swgroup iommu/tegra-smmu: Fix iova->phys translation ...	2020-10-14 12:08:34 -07:00
Linus Torvalds	cf1d2b44f6	ACPI updates for 5.10-rc1 - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron). - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo). - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada). - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki). - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: * Add predefined names from the SMBus sepcification (Bob Moore). * Update acpi_help UUID list (Bob Moore). * Return exceptions for string-to-integer conversions in iASL (Bob Moore). * Add a new "ALL <NameSeg>" debugger command (Bob Moore). * Add support for 64 bit risc-v compilation (Colin Ian King). * Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap). - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung). - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko). - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo). - Add missing config_item_put() to fix refcount leak (Hanjun Guo). - Drop lefrover field from struct acpi_memory_device (Hanjun Guo). - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings). - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov). - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao). - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing). - Serialize tools/power/acpi Makefile (Thomas Renninger). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl+F4IkSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRx1gIQAIZrt09fquEIZhYulGZAkuYhSX2U/DZt poow5+TiGk36JNHlbZS19kZ3F0tJ1wA6CKSfF/bYyULxL+gYaUjdLXzv2kArTSAj nzDXQ2CystpySZI/sEkl4QjsMg0xuZlBhlnCfNHzJw049TgdsJHnxMkJXb8T90A+ l2JKm2OpBkNvQGNpwd3djLg8xSDnHUmuevsWZPHDp92/fLMF9DUBk8dVuEwa0ndF hAUpWm+EL1tJQnhNwtfV/Akd9Ypqgk/7ROFWFHGDtHMZGnBjpyXZw68vHMX7SL6N Ej90GWGPHSJs/7Fsg4Hiaxxcph9WFNLPcpck5lVAMIrNHMKANjqQzCsmHavV/WTG STC9/qwJauA1EOjovlmlCFHctjKE/ya6Hm299WTlfBqB+Lu1L3oMR2CC+Uj0YfyG sv3264rJCsaSw610iwQOG807qHENopASO2q5DuKG0E9JpcaBUwn1N4qP5svvQciq 4aA8Ma6xM/QHCO4CS0Se9C0+WSVtxWwOUichRqQmU4E6u1sXvKJxTeWo79rV7PAh L6BwoOxBLabEiyzpi6HPGs6DoKj/N6tOQenBh4ibdwpAwMtq7hIlBFa0bp19c2wT vx8F2Raa8vbQ2zZ1QEiPZnPLJUoy2DgaCtKJ6E0FTDXNs3VFlWgyhIUlIRqk5BS9 OnAwVAUrTMkJ =feLU -----END PGP SIGNATURE----- Merge tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code. Specifics: - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron) - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo) - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada) - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki) - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: + Add predefined names from the SMBus sepcification (Bob Moore) + Update acpi_help UUID list (Bob Moore) + Return exceptions for string-to-integer conversions in iASL (Bob Moore) + Add a new "ALL <NameSeg>" debugger command (Bob Moore) + Add support for 64 bit risc-v compilation (Colin Ian King) + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap) - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung) - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko) - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo) - Add missing config_item_put() to fix refcount leak (Hanjun Guo) - Drop lefrover field from struct acpi_memory_device (Hanjun Guo) - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings) - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov) - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao) - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing) - Serialize tools/power/acpi Makefile (Thomas Renninger)" * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) ACPICA: Update version to 20200925 Version 20200925 ACPICA: Remove unnecessary semicolon ACPICA: Debugger: Add a new command: "ALL <NameSeg>" ACPICA: iASL: Return exceptions for string-to-integer conversions ACPICA: acpi_help: Update UUID list ACPICA: Add predefined names found in the SMBus sepcification ACPICA: Tree-wide: fix various typos and spelling mistakes ACPICA: Drop the repeated word "an" in a comment ACPICA: Add support for 64 bit risc-v compilation ACPI: button: fix handling lid state changes when input device closed tools/power/acpi: Serialize Makefile ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug() ACPI: memhotplug: Remove 'state' from struct acpi_memory_device ACPI / extlog: Check for RDMSR failure ACPI: Make acpi_evaluate_dsm() prototype consistent docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ...	2020-10-14 11:42:04 -07:00
Rafael J. Wysocki	e4174ff78b	Merge branch 'acpi-numa' * acpi-numa: docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ACPI: Support Generic Initiator only domains ACPI / NUMA: Add stub function for pxm_to_node() irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory ACPI: Remove side effect of partly creating a node in acpi_get_node() ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node() ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node() ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT ACPI: Add out of bounds and numa_off protections to pxm_to_node()	2020-10-13 14:44:50 +02:00
Linus Torvalds	cc7343724e	Surgery of the MSI interrupt handling to prepare the support of upcoming devices which require non-PCI based MSI handling. - Cleanup historical leftovers all over the place - Rework the code to utilize more core functionality - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain assignment to PCI devices possible. - Assign irqdomains to PCI devices at initialization time which allows to utilize the full functionality of hierarchical irqdomains. - Remove arch_._msi_irq() functions from X86 and utilize the irqdomain which is assigned to the device for interrupt management. - Make the arch_._msi_irq() support conditional on a config switch and let the last few users select it. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+EUxcTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoagLEACGp5U7a4mk24GsOZJDhrua1PHR/fhb enn/5yOPpxDXdYmtFHIjV5qrNjDTV/WqDlI96KOi+oinG1Eoj0O/MA1AcSRhp6nf jVdAuK1X0DHDUTEeTAP0JFwqd2j0KlIOphBrIMgeWIf1CRKlYiJaO+ioF9fKgwZ/ /HigOTSykGYMPggm3JXnWTWtJkKSGFxeADBvVHt5RpVmbWtrI4YoSBxKEMtvjyeM 5+GsqbCad1CnFYTN74N+QWVGmgGnUWGEzWsPYnJ9hW+yyjad1kWx3n6NcCWhssaC E4vAXl6JuCPntL7jBFkbfUkQsgq12ThMZYWpCq8pShJA9O2tDKkxIGasHWrIt4cz nYrESiv6hM7edjtOvBc086Gd0A2EyGOM879goHyaNVaTO4rI6jfZG7PlW1HHWibS mf/bdTXBtULGNgEt7T8Qnb8sZ+D01WqzLrq/wm645jIrTzvNHUEpOhT1aH/g4TFQ cNHD5PcM9OTmiBir9srNd47+1s2mpfwdMYHKBt2QgiXMO8fRgdtr6WLQE4vJjmG8 sA0yGGsgdTKeg2wW1ERF1pWL0Lt05Iaa42Skm0D3BwcOG2n5ltkBHzVllto9cTUh kIldAOgxGE6QeCnnlrnbHz5mvzt/3Ih/PIKqPSUAC94Kx1yvVHRYuOvDExeO8DFB P+f0TkrscZObSg== =JlqV -----END PGP SIGNATURE----- Merge tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Thomas Gleixner: "Surgery of the MSI interrupt handling to prepare the support of upcoming devices which require non-PCI based MSI handling: - Cleanup historical leftovers all over the place - Rework the code to utilize more core functionality - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain assignment to PCI devices possible. - Assign irqdomains to PCI devices at initialization time which allows to utilize the full functionality of hierarchical irqdomains. - Remove arch_._msi_irq() functions from X86 and utilize the irqdomain which is assigned to the device for interrupt management. - Make the arch_._msi_irq() support conditional on a config switch and let the last few users select it" * tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS x86/apic/msi: Unbreak DMAR and HPET MSI iommu/amd: Remove domain search for PCI/MSI iommu/vt-d: Remove domain search for PCI/MSI[X] x86/irq: Make most MSI ops XEN private x86/irq: Cleanup the arch__msi_irqs() leftovers PCI/MSI: Make arch_._msi_irq[s] fallbacks selectable x86/pci: Set default irq domain in pcibios_add_device() iommm/amd: Store irq domain in struct device iommm/vt-d: Store irq domain in struct device x86/xen: Wrap XEN MSI management into irqdomain irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() x86/xen: Consolidate XEN-MSI init x86/xen: Rework MSI teardown x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() PCI/MSI: Provide pci_dev_has_special_msi_domain() helper PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI x86/irq: Initialize PCI/MSI domain at PCI init time x86/pci: Reducde #ifdeffery in PCI init code ...	2020-10-12 11:40:41 -07:00
Linus Torvalds	ac74075e5d	Initial support for sharing virtual addresses between the CPU and devices which doesn't need pinning of pages for DMA anymore. Add support for the command submission to devices using new x86 instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support for process address space identifiers (PASIDs) which are referenced by those command submission instructions along with the handling of the PASID state on context switch as another extended state. Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl996DIACgkQEsHwGGHe VUqM4A/+JDI3GxNyMyBpJR0nQ2vs23ru1o3OxvxhYtcacZ0cNwkaO7g3TLQxH+LZ k1QtvEd4jqI6BXV4de+HdZFDcqzikJf0KHnUflLTx956/Eop5rtxzMWVo69ZmYs8 QrW0mLhyh8eq19cOHbQBb4M/HFc1DXBw+l7Ft3MeA1divOVESRB/uNxjA25K4PvV y+pipyUxqKSNhmBFf2bV8OVZloJiEtg3H6XudP0g/rZgjYe3qWxa+2iv6D08yBNe g7NpMDMql2uo1bcFON7se2oF34poAi49BfiIQb5G4m9pnPyvVEMOCijxCx2FHYyF nukyxt8g3Uq+UJYoolLNoWijL1jgBWeTBg1uuwsQOqWSARJx8nr859z0GfGyk2RP GNoYE4rrWBUMEqWk4xeiPPgRDzY0cgcGh0AeuWqNhgBfbbZeGL0t0m5kfytk5i1s W0YfRbz+T8+iYbgVfE/Zpthc7rH7iLL7/m34JC13+pzhPVTT32ECLJov2Ac8Tt15 X+fOe6kmlDZa4GIhKRzUoR2aEyLpjufZ+ug50hznBQjGrQfcx7zFqRAU4sJx0Yyz rxUOJNZZlyJpkyXzc12xUvShaZvTcYenHGpxXl8TU3iMbY2otxk1Xdza8pc1LGQ/ qneYgILgKa+hSBzKhXCPAAgSYtPlvQrRizArS8Y0k/9rYaKCfBU= =K9X4 -----END PGP SIGNATURE----- Merge tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PASID updates from Borislav Petkov: "Initial support for sharing virtual addresses between the CPU and devices which doesn't need pinning of pages for DMA anymore. Add support for the command submission to devices using new x86 instructions like ENQCMD{,S} and MOVDIR64B. In addition, add support for process address space identifiers (PASIDs) which are referenced by those command submission instructions along with the handling of the PASID state on context switch as another extended state. Work by Fenghua Yu, Ashok Raj, Yu-cheng Yu and Dave Jiang" * tag 'x86_pasid_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Add an enqcmds() wrapper for the ENQCMDS instruction x86/asm: Carve out a generic movdir64b() helper for general usage x86/mmu: Allocate/free a PASID x86/cpufeatures: Mark ENQCMD as disabled when configured out mm: Add a pasid member to struct mm_struct x86/msr-index: Define an IA32_PASID MSR x86/fpu/xstate: Add supervisor PASID state for ENQCMD x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions Documentation/x86: Add documentation for SVA (Shared Virtual Addressing) iommu/vt-d: Change flags type to unsigned int in binding mm drm, iommu: Change type of pasid to u32	2020-10-12 10:40:34 -07:00
Joerg Roedel	7e3c3883c3	Merge branches 'arm/allwinner', 'arm/mediatek', 'arm/renesas', 'arm/tegra', 'arm/qcom', 'arm/smmu', 'ppc/pamu', 'x86/amd', 'x86/vt-d' and 'core' into next	2020-10-07 11:51:59 +02:00
David Woodhouse	c40aaaac10	iommu/vt-d: Gracefully handle DMAR units with no supported address widths Instead of bailing out completely, such a unit can still be used for interrupt remapping. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/linux-iommu/549928db2de6532117f36c9c810373c14cf76f51.camel@infradead.org/ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-07 11:49:54 +02:00
Christoph Hellwig	0b1abd1fb7	dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> Merge dma-contiguous.h into dma-map-ops.h, after removing the comment describing the contiguous allocator into kernel/dma/contigous.c. Signed-off-by: Christoph Hellwig <hch@lst.de>	2020-10-06 07:07:04 +02:00
Christoph Hellwig	0a0f0d8be7	dma-mapping: split <linux/dma-mapping.h> Split out all the bits that are purely for dma_map_ops implementations and related code into a new <linux/dma-map-ops.h> header so that they don't get pulled into all the drivers. That also means the architecture specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h> any more, which leads to a missing includes that were pulled in by the x86 or arm versions in a few not overly portable drivers. Signed-off-by: Christoph Hellwig <hch@lst.de>	2020-10-06 07:07:03 +02:00
Lu Baolu	1a3f2fd7fc	iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb() Lock(&iommu->lock) without disabling irq causes lockdep warnings. [ 12.703950] ======================================================== [ 12.703962] WARNING: possible irq lock inversion dependency detected [ 12.703975] 5.9.0-rc6+ #659 Not tainted [ 12.703983] -------------------------------------------------------- [ 12.703995] systemd-udevd/284 just changed the state of lock: [ 12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.57+0x2e/0x90 [ 12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past: [ 12.704043] (&iommu->lock){+.+.}-{2:2} [ 12.704045] and interrupts could create inverse lock ordering between them. [ 12.704073] other info that might help us debug this: [ 12.704085] Possible interrupt unsafe locking scenario: [ 12.704097] CPU0 CPU1 [ 12.704106] ---- ---- [ 12.704115] lock(&iommu->lock); [ 12.704123] local_irq_disable(); [ 12.704134] lock(device_domain_lock); [ 12.704146] lock(&iommu->lock); [ 12.704158] <Interrupt> [ 12.704164] lock(device_domain_lock); [ 12.704174] * DEADLOCK * Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20200927062428.13713-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-01 14:54:17 +02:00
Jacob Pan	6278eecba3	iommu/vt-d: Check UAPI data processed by IOMMU core IOMMU generic layer already does sanity checks on UAPI data for version match and argsz range based on generic information. This patch adjusts the following data checking responsibilities: - removes the redundant version check from VT-d driver - removes the check for vendor specific data size - adds check for the use of reserved/undefined flags Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/1601051567-54787-7-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-01 14:52:46 +02:00
Jacob Pan	8d3bb3b8cb	iommu/uapi: Use named union for user data IOMMU UAPI data size is filled by the user space which must be validated by the kernel. To ensure backward compatibility, user data can only be extended by either re-purpose padding bytes or extend the variable sized union at the end. No size change is allowed before the union. Therefore, the minimum size is the offset of the union. To use offsetof() on the union, we must make it named. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/linux-iommu/20200611145518.0c2817d6@x1.home/ Link: https://lore.kernel.org/r/1601051567-54787-4-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-01 14:52:46 +02:00
Christoph Hellwig	efa70f2fdc	dma-mapping: add a new dma_alloc_pages API This API is the equivalent of alloc_pages, except that the returned memory is guaranteed to be DMA addressable by the passed in device. The implementation will also be used to provide a more sensible replacement for DMA_ATTR_NON_CONSISTENT flag. Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages as its backend. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)	2020-09-25 06:20:47 +02:00
Christoph Hellwig	8c1c6c7588	Merge branch 'master' of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next Pull in the latest 5.9 tree for the commit to revert the V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.	2020-09-25 06:19:19 +02:00
Jonathan Cameron	01feba590c	ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT Several ACPI static tables contain references to proximity domains. ACPI 6.3 has clarified that only entries in SRAT may define a new domain (sec 5.2.16). Those tables described in the ACPI spec have additional clarifying text. NFIT: Table 5-132, "Integer that represents the proximity domain to which the memory belongs. This number must match with corresponding entry in the SRAT table." HMAT: Table 5-145, "... This number must match with the corresponding entry in the SRAT table's processor affinity structure ... if the initiator is a processor, or the Generic Initiator Affinity Structure if the initiator is a generic initiator". IORT and DMAR are defined by external specifications. Intel Virtualization Technology for Directed I/O Rev 3.1 does not make any explicit statements, but the general SRAT statement above will still apply. https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf IO Remapping Table, Platform Design Document rev D, also makes not explicit statement, but refers to ACPI SRAT table for more information and again the generic SRAT statement above applies. https://developer.arm.com/documentation/den0049/d/ In conclusion, any proximity domain specified in these tables, should be a reference to a proximity domain also found in SRAT, and they should not be able to instantiate a new domain. Hence we switch to pxm_to_node() which will only return existing nodes. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-24 12:57:37 +02:00
Lu Baolu	d2ef096249	iommu/vt-d: Use device numa domain if RHSA is missing If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR table, we could default to the device NUMA domain as fall back. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20200904010303.2961-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20200922060843.31546-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-09-24 10:59:55 +02:00
Fenghua Yu	20f0afd1fb	x86/mmu: Allocate/free a PASID A PASID is allocated for an "mm" the first time any thread binds to an SVA-capable device and is freed from the "mm" when the SVA is unbound by the last thread. It's possible for the "mm" to have different PASID values in different binding/unbinding SVA cycles. The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is propagated to a per-thread PASID MSR for all threads within the mm through IPI, context switch, or inherited. This is done to ensure that a running thread has the right PASID in the MSR matching the mm's PASID. [ bp: s/SVM/SVA/g; massage. ] Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com	2020-09-17 20:22:15 +02:00
Fenghua Yu	2a5054c6e7	iommu/vt-d: Change flags type to unsigned int in binding mm "flags" passed to intel_svm_bind_mm() is a bit mask and should be defined as "unsigned int" instead of "int". Change its type to "unsigned int". Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-3-git-send-email-fenghua.yu@intel.com	2020-09-17 19:21:30 +02:00
Fenghua Yu	c7b6bac9c7	drm, iommu: Change type of pasid to u32 PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com	2020-09-17 19:21:16 +02:00
Thomas Gleixner	9f0ffb4bb3	iommu/vt-d: Remove domain search for PCI/MSI[X] Now that the domain can be retrieved through device::msi_domain the domain search for PCI_MSI[X] is not longer required. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112334.305699301@linutronix.de	2020-09-16 16:52:38 +02:00
Thomas Gleixner	85a8dfc57a	iommm/vt-d: Store irq domain in struct device As a first step to make X86 utilize the direct MSI irq domain operations store the irq domain pointer in the device struct when a device is probed. This is done from dmar_pci_bus_add_dev() because it has to work even when DMA remapping is disabled. It only overrides the irqdomain of devices which are handled by a regular PCI/MSI irq domain which protects PCI devices behind special busses like VMD which have their own irq domain. No functional change. It just avoids the redirection through arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq domain alloc/free functions instead of having to look up the irq domain for every single MSI interupt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112333.714566121@linutronix.de	2020-09-16 16:52:37 +02:00
Thomas Gleixner	3b9c1d377d	x86/msi: Consolidate MSI allocation Convert the interrupt remap drivers to retrieve the pci device from the msi descriptor and use info::hwirq. This is the first step to prepare x86 for using the generic MSI domain ops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de	2020-09-16 16:52:35 +02:00
Thomas Gleixner	33a65ba470	x86_ioapic_Consolidate_IOAPIC_allocation Move the IOAPIC specific fields into their own struct and reuse the common devid. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de	2020-09-16 16:52:32 +02:00
Thomas Gleixner	2bf1e7bced	x86/msi: Consolidate HPET allocation None of the magic HPET fields are required in any way. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de	2020-09-16 16:52:31 +02:00
Thomas Gleixner	6b6256e616	iommu/irq_remapping: Consolidate irq domain lookup Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN types, consolidate the two getter functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.741909337@linutronix.de	2020-09-16 16:52:30 +02:00
Thomas Gleixner	60e5a9397c	iommu/vt-d: Consolidate irq domain getter The irq domain request mode is now indicated in irq_alloc_info::type. Consolidate the two getter functions into one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.530546013@linutronix.de	2020-09-16 16:52:29 +02:00
Thomas Gleixner	b4c364da32	x86/irq: Add allocation type for parent domain retrieval irq_remapping_ir_irq_domain() is used to retrieve the remapping parent domain for an allocation type. irq_remapping_irq_domain() is for retrieving the actual device domain for allocating interrupts for a device. The two functions are similar and can be unified by using explicit modes for parent irq domain retrieval. Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu implementations. Drop the parent domain retrieval for PCI_MSI/X as that is unused. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.436350257@linutronix.de	2020-09-16 16:52:29 +02:00
Thomas Gleixner	801b5e4c4e	x86_irq_Rename_X86_IRQ_ALLOC_TYPE_MSI_to_reflect_PCI_dependency No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200826112331.343103175@linutronix.de	2020-09-16 16:52:29 +02:00
Christoph Hellwig	5ceda74093	dma-direct: rename and cleanup __phys_to_dma The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious. Try to improve the situation by renaming __phys_to_dma to phys_to_dma_unencryped, and not forcing architectures that want to override phys_to_dma to actually provide __phys_to_dma. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>	2020-09-11 09:14:43 +02:00
Chris Wilson	29aaebbca4	iommu/vt-d: Handle 36bit addressing for x86-32 Beware that the address size for x86-32 may exceed unsigned long. [ 0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14 [ 0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int' If we don't handle the wide addresses, the pages are mismapped and the device read/writes go astray, detected as DMAR faults and leading to device failure. The behaviour changed (from working to broken) in commit `fa954e6831` ("iommu/vt-d: Delegate the dma domain to upper layer"), but the error looks older. Fixes: `fa954e6831` ("iommu/vt-d: Delegate the dma domain to upper layer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: James Sewart <jamessewart@arista.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: <stable@vger.kernel.org> # v5.3+ Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.uk Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-09-04 12:14:28 +02:00
Lu Baolu	2d33b7d631	iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set() The dev_iommu_priv_set() must be called after probe_device(). This fixes a NULL pointer deference bug when booting a system with kernel cmdline "intel_iommu=on,igfx_off", where the dev_iommu_priv_set() is abused. The following stacktrace was produced: Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off ... DMAR: Host address width 39 DMAR: DRHD base: 0x000000fed90000 flags: 0x0 DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e DMAR: DRHD base: 0x000000fed91000 flags: 0x1 DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff DMAR: No ATSR found BUG: kernel NULL pointer dereference, address: 0000000000000038 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2 Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018 RIP: 0010:intel_iommu_init+0xed0/0x1136 Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48 03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1 e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0 RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282 RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0 R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0 Call Trace: ? _raw_spin_unlock_irqrestore+0x1f/0x30 ? call_rcu+0x10e/0x320 ? trace_hardirqs_on+0x2c/0xd0 ? rdinit_setup+0x2c/0x2c ? e820__memblock_setup+0x8b/0x8b pci_iommu_init+0x16/0x3f do_one_initcall+0x46/0x1e4 kernel_init_freeable+0x169/0x1b2 ? rest_init+0x9f/0x9f kernel_init+0xa/0x101 ret_from_fork+0x22/0x30 Modules linked in: CR2: 0000000000000038 ---[ end trace 3653722a6f936f18 ]--- Fixes: `01b9d4e211` ("iommu/vt-d: Use dev_iommu_priv_get/set()") Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/ Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-09-04 11:50:37 +02:00
Lu Baolu	6e4e9ec650	iommu/vt-d: Serialize IOMMU GCMD register modifications The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General Description) that: If multiple control fields in this register need to be modified, software must serialize the modifications through multiple writes to this register. However, in irq_remapping.c, modifications of IRE and CFI are done in one write. We need to do two separate writes with STS checking after each. It also checks the status register before writing command register to avoid unnecessary register write. Fixes: `af8d102f99` ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-09-04 11:39:21 +02:00
Krzysztof Kozlowski	3207fa325a	iommu/vt-d: Drop kerneldoc marker from regular comment Fix W=1 compile warnings (invalid kerneldoc): drivers/iommu/intel/dmar.c:389: warning: Function parameter or member 'header' not described in 'dmar_parse_one_drhd' Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20200728170859.28143-2-krzk@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-09-04 10:45:05 +02:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Linus Torvalds	9ad57f6dfc	Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction, mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util, memory-hotplug, cleanups, uaccess, migration, gup, pagemap), - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops, checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump, exec, kdump, rapidio, panic, kcov, kgdb, ipc). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits) mm/gup: remove task_struct pointer for all gup code mm: clean up the last pieces of page fault accountings mm/xtensa: use general page fault accounting mm/x86: use general page fault accounting mm/sparc64: use general page fault accounting mm/sparc32: use general page fault accounting mm/sh: use general page fault accounting mm/s390: use general page fault accounting mm/riscv: use general page fault accounting mm/powerpc: use general page fault accounting mm/parisc: use general page fault accounting mm/openrisc: use general page fault accounting mm/nios2: use general page fault accounting mm/nds32: use general page fault accounting mm/mips: use general page fault accounting mm/microblaze: use general page fault accounting mm/m68k: use general page fault accounting mm/ia64: use general page fault accounting mm/hexagon: use general page fault accounting mm/csky: use general page fault accounting ...	2020-08-12 11:24:12 -07:00

... 2 3 4 5 6 ...

384 Commits