set_msi_sid_cb() is used to determine whether device aliases share the
same bus, but it can provide false indications that aliases use the same
bus when in fact they do not. The reason is that set_msi_sid_cb()
assumes that pdev is fixed, while actually pci_for_each_dma_alias() can
call fn() when pdev is set to a subordinate device.
As a result, running an VM on ESX with VT-d emulation enabled can
results in the log warning such as:
DMAR: [INTR-REMAP] Request device [00:11.0] fault index 3b [fault reason 38] Blocked an interrupt request due to source-id verification failure
This seems to cause additional ata errors such as:
ata3.00: qc timeout (cmd 0xa1)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
These timeouts also cause boot to be much longer and other errors.
Fix it by checking comparing the alias with the previous one instead.
Fixes: 3f0c625c6a ("iommu/vt-d: Allow interrupts from the entire bus for aliased devices")
Cc: stable@vger.kernel.org
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Linux IRQ number virq is not used in IRTE allocation. Remove it.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use new helper pci_dev_id() to simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When a device has multiple aliases that all are from the same bus,
we program the IRTE to accept requests from any matching device on the
bus.
This is so NTB devices which can have requests from multiple bus-devfns
can pass MSI interrupts through across the bridge.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The current code uses set_irte_sid() with SVT_VERIFY_BUS and PCI_DEVID
to set the SID value. However, this is very confusing because, with
SVT_VERIFY_BUS, the SID value is not a PCI devfn address, but the start
and end bus numbers to match against.
According to the Intel Virtualization Technology for Directed I/O
Architecture Specification, Rev. 3.0, page 9-36:
The most significant 8-bits of the SID field contains the Startbus#,
and the least significant 8-bits of the SID field contains the Endbus#.
Interrupt requests that reference this IRTE must have a requester-id
whose bus# (most significant 8-bits of requester-id) has a value equal
to or within the Startbus# to Endbus# range.
So to make this more clear, introduce a new set_irte_verify_bus() that
explicitly takes a start bus and end bus so that we can stop abusing
the PCI_DEVID macro.
This helper function will be called a second time in an subsequent patch.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Intel vt-d spec rev3.0 requires software to use 256-bit
descriptors in invalidation queue. As the spec reads in
section 6.5.2:
Remapping hardware supporting Scalable Mode Translations
(ECAP_REG.SMTS=1) allow software to additionally program
the width of the descriptors (128-bits or 256-bits) that
will be written into the Queue. Software should setup the
Invalidation Queue for 256-bit descriptors before progra-
mming remapping hardware for scalable-mode translation as
128-bit descriptors are treated as invalid descriptors
(see Table 21 in Section 6.5.2.10) in scalable-mode.
This patch adds 256-bit invalidation descriptor support
if the hardware presents scalable mode capability.
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
To reuse the static functions and the struct declarations, move them to
corresponding header files and export the needed functions.
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
To address the EBUSY fail of interrupt affinity settings in case that the
previous setting has not been cleaned up yet, use the new apic_ack_irq()
function instead of the special ir_ack_apic_edge() implementation which is
merily a wrapper around ack_APIC_irq().
Preparatory change for the real fix
Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.555716895@linutronix.de
It was noticed that the IRTE configured for guest OS kernel
was over-written while the guest was running. As a result,
vt-d Posted Interrupts configured for the guest are not being
delivered directly, and instead bounces off the host. Every
interrupt delivery takes a VM Exit.
It was noticed that the following stack is doing the over-write:
[ 147.463177] modify_irte+0x171/0x1f0
[ 147.463405] intel_ir_set_affinity+0x5c/0x80
[ 147.463641] msi_domain_set_affinity+0x32/0x90
[ 147.463881] irq_do_set_affinity+0x37/0xd0
[ 147.464125] irq_set_affinity_locked+0x9d/0xb0
[ 147.464374] __irq_set_affinity+0x42/0x70
[ 147.464627] write_irq_affinity.isra.5+0xe1/0x110
[ 147.464895] proc_reg_write+0x38/0x70
[ 147.465150] __vfs_write+0x36/0x180
[ 147.465408] ? handle_mm_fault+0xdf/0x200
[ 147.465671] ? _cond_resched+0x15/0x30
[ 147.465936] vfs_write+0xad/0x1a0
[ 147.466204] SyS_write+0x52/0xc0
[ 147.466472] do_syscall_64+0x74/0x1a0
[ 147.466744] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
reversing the sense of force check in intel_ir_reconfigure_irte()
restores proper posted interrupt functionality
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Fixes: d491bdff88 ('iommu/vt-d: Reevaluate vector configuration on activate()')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The 'early' argument of irq_domain_activate_irq() is actually used to
denote reservation mode. To avoid confusion, rename it before abuse
happens.
No functional change.
Fixes: 7249164346 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Sakari Ailus <sakari.ailus@intel.com>,
Cc: linux-media@vger.kernel.org
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the upcoming reservation/management scheme, early activation will
assign a special vector. The final activation at request_irq() assigns a
real vector, which needs to be updated in the tables.
Split out the reconfiguration code in set_affinity and use it for
reactivation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213155.853028808@linutronix.de
The irq_domain_ops.activate() callback has no return value and no way to
tell the function that the activation is early.
The upcoming changes to support a reservation scheme which allows to assign
interrupt vectors on x86 only when the interrupt is actually requested
requires:
- A return value, so activation can fail at request_irq() time
- Information that the activate invocation is early, i.e. before
request_irq().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.848490816@linutronix.de
This update comes with:
* Support for lockless operation in the ARM io-pgtable code.
This is an important step to solve the scalability problems in
the common dma-iommu code for ARM
* Some Errata workarounds for ARM SMMU implemenations
* Rewrite of the deferred IO/TLB flush code in the AMD IOMMU
driver. The code suffered from very high flush rates, with the
new implementation the flush rate is down to ~1% of what it
was before
* Support for amd_iommu=off when booting with kexec. Problem
here was that the IOMMU driver bailed out early without
disabling the iommu hardware, if it was enabled in the old
kernel
* The Rockchip IOMMU driver is now available on ARM64
* Align the return value of the iommu_ops->device_group
call-backs to not miss error values
* Preempt-disable optimizations in the Intel VT-d and common
IOVA code to help Linux-RT
* Various other small cleanups and fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZZgddAAoJECvwRC2XARrjurgQANO338GIBr2ZkA0oectidDpZ
Y4yu7W9RH6NyhupJG/Xooya7daBWFjbaA1AVJ3ZZNlMERh69AmehVfRUfVMzF2w+
buma58HQgiJWN1zFD8xdeMzYKms9P77whA88C/9QvrK/klB3LipWP2SC0yvvvyxJ
mMCDpgt+D+CGnIDqbRuyLDQoRu3yjAkAvYb6OzL8DPJVP1Y5oLffGwGnHzJbJnOf
eWJwYHM5ai0uF/Qqy6RNNekacObjVaOLihjugGvokH6ipXfOrSSNriXW9pZiWR5m
S91898YTP3KuWWsJM+N93UAjvc6pL9PqL/OvbB9zdYpzu+5PtUpFXHYcOebKyEEO
4j9CaRzubsWFTFjbYItJnR4WgXQRf4NKOGfTfHMHA+dY8aODYnlXNVdQDAA2aFgn
TUBvHq5xb0zZ3nbPwtTDyW06oDMVfBBarLx2yFI1aQSSh+eg/GtIi5KP28gyFZNz
4gWj0q3g/e3y7WEwNbYV7L3TS0d/p8VUYFtUp7PUCddnWoY+4cJzgidub5xIViZD
Ql0nZzga9pXXIE/kE5Pf74WqrG7JJzZsvK2ABy4+XGrMq6RclJf+0pXbSqiXDpXL
quw8t0oXw0ZEeavQ31Za8mjXBvo5ocM5iintl1wrl2BujHEO3oKqbGsIOaRcLnlN
Ukehbl4OEKzZpD3oLPPk
=pmBf
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This update comes with:
- Support for lockless operation in the ARM io-pgtable code.
This is an important step to solve the scalability problems in the
common dma-iommu code for ARM
- Some Errata workarounds for ARM SMMU implemenations
- Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver.
The code suffered from very high flush rates, with the new
implementation the flush rate is down to ~1% of what it was before
- Support for amd_iommu=off when booting with kexec.
The problem here was that the IOMMU driver bailed out early without
disabling the iommu hardware, if it was enabled in the old kernel
- The Rockchip IOMMU driver is now available on ARM64
- Align the return value of the iommu_ops->device_group call-backs to
not miss error values
- Preempt-disable optimizations in the Intel VT-d and common IOVA
code to help Linux-RT
- Various other small cleanups and fixes"
* tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits)
iommu/vt-d: Constify intel_dma_ops
iommu: Warn once when device_group callback returns NULL
iommu/omap: Return ERR_PTR in device_group call-back
iommu: Return ERR_PTR() values from device_group call-backs
iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device()
iommu/vt-d: Don't disable preemption while accessing deferred_flush()
iommu/iova: Don't disable preempt around this_cpu_ptr()
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701)
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74
ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model
iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions
iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE
iommu/arm-smmu-v3: Remove io-pgtable spinlock
iommu/arm-smmu: Remove io-pgtable spinlock
iommu/io-pgtable-arm-v7s: Support lockless operation
iommu/io-pgtable-arm: Support lockless operation
iommu/io-pgtable: Introduce explicit coherency
iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
...
Add the missing name, so debugging will work proper.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.431939968@linutronix.de
struct irq_domain_ops is not modified, so it can be made const.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When booting a new non-kdump kernel, we have below failure message:
[ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar2 but we are not in kdump mode
[ 0.004000] DMAR-IR: Failed to copy IR table for dmar2 from previous kernel
[ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar1 but we are not in kdump mode
[ 0.004000] DMAR-IR: Failed to copy IR table for dmar1 from previous kernel
[ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode
[ 0.004000] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel
[ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar3 but we are not in kdump mode
[ 0.004000] DMAR-IR: Failed to copy IR table for dmar3 from previous kernel
For non-kdump case, we no need to copy IR table from previous kernel
so it's nonthing actually failed. To be less alarming or misleading,
do not print "DMAR-IR: Failed to copy IR table for dmar[0-9] from
previous kernel" messages when booting non-kdump kernel.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Minor register size and interrupt acknowledgement fixes which only showed
up in testing on newer hardware, but mostly a fix to the MM refcount
handling to prevent a recursive refcount issue when mmap() is used on
the file descriptor associated with a bound PASID.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlbC/gAACgkQdwG7hYl686OY8QCfUPH+IB0zou9/MH3JNMz1ujot
I6wAoK0R4KiOFXvjNeNPy+XroZ9xKqv/
=RM+0
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu
Pull IOMMU SVM fixes from David Woodhouse:
"Minor register size and interrupt acknowledgement fixes which only
showed up in testing on newer hardware, but mostly a fix to the MM
refcount handling to prevent a recursive refcount issue when mmap() is
used on the file descriptor associated with a bound PASID"
* tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu:
iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts
iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users
This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.
The remaining ones need more careful inspection before a conversion can
happen. On the TODO.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de
Cc: David Sterba <dsterba@suse.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Propagate the error-value from the function ir_parse_ioapic_hpet_scope()
in parse_ioapics_under_ir() and cleanup its calling loop.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Adjust the return value of parse_ioapics_under_ir as
negative value representing failure and "0" representing
succcess. Just make it consistent with other function
implementations, and we can judge if calling is successfull
by if (!parse_ioapics_under_ir()) style.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If IRTE is in posted format, the 'pda' field goes across the 64-bit
boundary, we need use cmpxchg16b to atomically update it. We only
expose posted-interrupt when X86_FEATURE_CX16 is supported and use
to update it atomically.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In preparation for deprecating ioremap_cache() convert its usage in
intel-iommu to memremap. This also eliminates the mishandling of the
__iomem annotation in the implementation.
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
iommu_load_old_irte() appears to leak the old_irte mapping after use.
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This time with bigger changes than usual:
* A new IOMMU driver for the ARM SMMUv3. This IOMMU is pretty
different from SMMUv1 and v2 in that it is configured through
in-memory structures and not through the MMIO register region.
The ARM SMMUv3 also supports IO demand paging for PCI devices
with PRI/PASID capabilities, but this is not implemented in
the driver yet.
* Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
* Introduction of default domains into the IOMMU core code. The
rationale behind this is to move functionalily out of the
IOMMU drivers to common code to get to a unified behavior
between different drivers.
The patches here introduce a default domain for iommu-groups
(isolation groups). A device will now always be attached to a
domain, either the default domain or another domain handled by
the device driver. The IOMMU drivers have to be modified to
make use of that feature. So long the AMD IOMMU driver is
converted, with others to follow.
* Patches for the Intel VT-d drvier to fix DMAR faults that
happen when a kdump kernel boots. When the kdump kernel boots
it re-initializes the IOMMU hardware, which destroys all
mappings from the crashed kernel. As this happens before
the endpoint devices are re-initialized, any in-flight DMA
causes a DMAR fault. These faults cause PCI master aborts,
which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old
kernel and keep the old mappings in place until the device
driver of the new kernel takes over. This emulates the the
behavior without an IOMMU to the best degree possible.
* A couple of other small fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJViSIWAAoJECvwRC2XARrjl+cP/2FXS7SWDq91VFiIZfXfPt8H
C5Ef3OGWCnMzn4MKE1ExkyDhC+AH6pF1s4zi3XfT6b8iOA+DUpa51rxJjixszt31
tQwmvB7hWu4mznGxSN7EA0Pm0l/v3tBAY5BvG598af0aNZFFJ6po+31MyQA5X67+
6xpqLbH/hm4IZhFBOEzZwxuWWsNxlJwwzKqeAjGyqeUhdruRYZiPHWQ17sDjwLM/
QcVvWBb7meOtKv1OCtpzC4sglSk3scbAfEHMEBuDt8cI6OD7/t2VzPXDWWZuXGqK
nRAxCT7NrXvyOnv0xwdn0j5p1FUGipVxvhsGWX7sJsh3UHWm8Q+5rRKFFVI9pm50
QcMjiIMazK5VwcAkDnLoDgSz4Zz6TfHXEOqSJ2vjTPt2VDP/J9zdM2iwHx2ujicI
mIkrtmsBprvAPx6e9jcqiS5L/Xy1y1xewXuGxa5F2XOjqdoXkPqaupjlyrWzrChA
MC8w67FdzjHDPCfIqfIWZpJQj4f1OFQGd3HS5HpkBACxIwCg85gRw4DEMfD/sirO
BL2VM0RO/bB5+4R0AY7UA2VszQvNMqedj1bA4vAbrnXqOh8BI/0GgeoWiBMXhyX1
qvT1jl+cxuCm5tgBOMUGYoRyF+//bH+l78jLsTYaWRtuVzFlkAX6idNvYYK0dmNt
tLII2IIZBk87P3pF4d6A
=Zicw
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This time with bigger changes than usual:
- A new IOMMU driver for the ARM SMMUv3.
This IOMMU is pretty different from SMMUv1 and v2 in that it is
configured through in-memory structures and not through the MMIO
register region. The ARM SMMUv3 also supports IO demand paging for
PCI devices with PRI/PASID capabilities, but this is not
implemented in the driver yet.
- Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
- Introduction of default domains into the IOMMU core code.
The rationale behind this is to move functionalily out of the IOMMU
drivers to common code to get to a unified behavior between
different drivers. The patches here introduce a default domain for
iommu-groups (isolation groups).
A device will now always be attached to a domain, either the
default domain or another domain handled by the device driver. The
IOMMU drivers have to be modified to make use of that feature. So
long the AMD IOMMU driver is converted, with others to follow.
- Patches for the Intel VT-d drvier to fix DMAR faults that happen
when a kdump kernel boots.
When the kdump kernel boots it re-initializes the IOMMU hardware,
which destroys all mappings from the crashed kernel. As this
happens before the endpoint devices are re-initialized, any
in-flight DMA causes a DMAR fault. These faults cause PCI master
aborts, which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old kernel
and keep the old mappings in place until the device driver of the
new kernel takes over. This emulates the the behavior without an
IOMMU to the best degree possible.
- A couple of other small fixes and cleanups"
* tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (69 commits)
iommu/amd: Handle large pages correctly in free_pagetable
iommu/vt-d: Don't disable IR when it was previously enabled
iommu/vt-d: Make sure copied over IR entries are not reused
iommu/vt-d: Copy IR table from old kernel when in kdump mode
iommu/vt-d: Set IRTA in intel_setup_irq_remapping
iommu/vt-d: Disable IRQ remapping in intel_prepare_irq_remapping
iommu/vt-d: Move QI initializationt to intel_setup_irq_remapping
iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
iommu/vt-d: Enable Translation only if it was previously disabled
iommu/vt-d: Don't disable translation prior to OS handover
iommu/vt-d: Don't copy translation tables if RTT bit needs to be changed
iommu/vt-d: Don't do early domain assignment if kdump kernel
iommu/vt-d: Allocate si_domain in init_dmars()
iommu/vt-d: Mark copied context entries
iommu/vt-d: Do not re-use domain-ids from the old kernel
iommu/vt-d: Copy translation tables from old kernel
iommu/vt-d: Detect pre enabled translation
iommu/vt-d: Make root entry visible for hardware right after allocation
iommu/vt-d: Init QI before root entry is allocated
iommu/vt-d: Cleanup log messages
...
Keep it enabled in kdump kernel to guarantee interrupt
delivery.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Walk over the copied entries and mark the present ones as
allocated.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we are booting into a kdump kernel and find IR enabled,
copy over the contents of the previous IR table so that
spurious interrupts will not be target aborted.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This way we can give the hardware the new IR table right
after it has been allocated and initialized.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move it to this function for now, so that the copy routines
for irq remapping take no effect yet.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
QI needs to be enabled when we program the irq remapping
table to hardware in the prepare phase later.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We need this to be detected already when we program the irq
remapping table pointer to hardware.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Give them a common prefix that can be grepped for and
improve the wording here and there.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When the interrupt is configured in posted mode, the destination of
the interrupt is set in the Posted-Interrupts Descriptor and the
migration of these interrupts happens during vCPU scheduling.
We still update the cached irte, which will be used when changing back
to remapping mode, but we avoid writing the table entry as this would
overwrite the posted mode entry.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-7-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Interrupt chip callback to set the VCPU affinity for posted interrupts.
[ tglx: Use the helper function to copy from the remap irte instead of
open coding it. Massage the comment as well ]
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: joro@8bytes.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-5-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit b106ee63ab ("irq_remapping/vt-d: Enhance Intel IR driver to
support hierarchical irqdomains") caused a regression, which forgot
to initialize remapping data structures other than the first entry
when setting up remapping entries for multiple MSIs.
[ Jiang: Commit message ]
Fixes: b106ee63ab ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: iommu@lists.linux-foundation.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Joerg Roedel <joro@8bytes.org>
Link: http://lkml.kernel.org/r/1430707662-28598-2-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull intel iommu updates from David Woodhouse:
"This lays a little of the groundwork for upcoming Shared Virtual
Memory support — fixing some bogus #defines for capability bits and
adding the new ones, and starting to use the new wider page tables
where we can, in anticipation of actually filling in the new fields
therein.
It also allows graphics devices to be assigned to VM guests again.
This got broken in 3.17 by disallowing assignment of RMRR-afflicted
devices. Like USB, we do understand why there's an RMRR for graphics
devices — and unlike USB, it's actually sane. So we can make an
exception for graphics devices, just as we do USB controllers.
Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to
persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty
hack to allow broken BIOSes to forbid us from using X2APIC when they
do stupid and invasive things and would break if we did.
Someone noticed that since Windows doesn't have full IOMMU support for
DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid
initialising the IOMMU on the graphics unit altogether.
This means that it would be available for use in "driver mode", where
the IOMMU registers are made available through a BAR of the graphics
device and the graphics driver can do SVM all for itself.
So they started setting the X2APIC_OPT_OUT bit on *all* platforms with
SVM capabilities. And even the platforms which *might*, if the
planets had been aligned correctly, possibly have had SVM capability
but which in practice actually don't"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: support extended root and context entries
iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification
iommu/vt-d: Allow RMRR on graphics devices too
iommu/vt-d: Print x2apic opt out info instead of printing a warning
iommu/vt-d: kill bogus ecap_niotlb_iunits()