Slightly more changes than usual this time:
- KDump Kernel IOMMU take-over code for AMD IOMMU. The code now
tries to preserve the mappings of the kernel so that master
aborts for devices are avoided. Master aborts cause some
devices to fail in the kdump kernel, so this code makes the
dump more likely to succeed when AMD IOMMU is enabled.
- Common flush queue implementation for IOVA code users. The
code is still optional, but AMD and Intel IOMMU drivers had
their own implementation which is now unified.
- Finish support for iommu-groups. All drivers implement this
feature now so that IOMMU core code can rely on it.
- Finish support for 'struct iommu_device' in iommu drivers. All
drivers now use the interface.
- New functions in the IOMMU-API for explicit IO/TLB flushing.
This will help to reduce the number of IO/TLB flushes when
IOMMU drivers support this interface.
- Support for mt2712 in the Mediatek IOMMU driver
- New IOMMU driver for QCOM hardware
- System PM support for ARM-SMMU
- Shutdown method for ARM-SMMU-v3
- Some constification patches
- Various other small improvements and fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZtCFNAAoJECvwRC2XARrjZnQP/AxC/ezQpq82HbegF4sM/cVE
Ep7TeTqodEl75FS/6txe2wU0pwodqk/LB9ajfQZUbE1w8vKsNEqi5qf4ZYHGoxYI
5bWyjJBzKIlwENH5lsBpQNt6XLevrYmRsFy7F0tRYy+qPQq8k+js2i7/XkCL3q7L
3xklF847RRoITaTOhhaROx1pF23dSMEsS2XGuWHcZfjORtep4wcFKzd/2SvlCWCo
P2bRU7jBzfJuuGSA80gaiUbDmrULTUfYuZNp7njASzCgsDmagERtvDEpdoXPNNSp
u6s4LjU1Dp3fgr6g6cFRO7B6JUbWd619nwo9so/c/wZN54yEngBF9EyeeF3mv2O5
ZbM2mOW3RlZcjxFT/AC8G4cZwwP6MpCEQOdqknoqc6ZQwcDqwN0o9I4+po0wsiAU
89ijZZe9Mx0p9lNpihaBEB1erAUWPo5Obh62zo80W3h6x9WzkGQWM+PyFK2DYoaC
8biEZzcc21sLEHvXQkcEGJSKrihHr9sluOqvxmCw5QAkYIFAeZRoeH7JtZWjVCnr
T3XvaG1G1Aw6tS7ErxufdKawREAGki0Rm9i1baiH9sqNj5rllM01Y+PgU6E21Nbg
iZp9gJLjfwM4vhYLlovvQK5PRoOBsCkyCpEI4GJqjLeam5p/WN06CFFf0ifQofYr
qDPCVDkWHWV8nugFFKE7
=EVh9
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"Slightly more changes than usual this time:
- KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries
to preserve the mappings of the kernel so that master aborts for
devices are avoided. Master aborts cause some devices to fail in
the kdump kernel, so this code makes the dump more likely to
succeed when AMD IOMMU is enabled.
- common flush queue implementation for IOVA code users. The code is
still optional, but AMD and Intel IOMMU drivers had their own
implementation which is now unified.
- finish support for iommu-groups. All drivers implement this feature
now so that IOMMU core code can rely on it.
- finish support for 'struct iommu_device' in iommu drivers. All
drivers now use the interface.
- new functions in the IOMMU-API for explicit IO/TLB flushing. This
will help to reduce the number of IO/TLB flushes when IOMMU drivers
support this interface.
- support for mt2712 in the Mediatek IOMMU driver
- new IOMMU driver for QCOM hardware
- system PM support for ARM-SMMU
- shutdown method for ARM-SMMU-v3
- some constification patches
- various other small improvements and fixes"
* tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits)
iommu/vt-d: Don't be too aggressive when clearing one context entry
iommu: Introduce Interface for IOMMU TLB Flushing
iommu/s390: Constify iommu_ops
iommu/vt-d: Avoid calling virt_to_phys() on null pointer
iommu/vt-d: IOMMU Page Request needs to check if address is canonical.
arm/tegra: Call bus_set_iommu() after iommu_device_register()
iommu/exynos: Constify iommu_ops
iommu/ipmmu-vmsa: Make ipmmu_gather_ops const
iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable
iommu/amd: Rename a few flush functions
iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY
iommu/mediatek: Fix a build warning of BIT(32) in ARM
iommu/mediatek: Fix a build fail of m4u_type
iommu: qcom: annotate PM functions as __maybe_unused
iommu/pamu: Fix PAMU boot crash
memory: mtk-smi: Degrade SMI init to module_init
iommu/mediatek: Enlarge the validate PA range for 4GB mode
iommu/mediatek: Disable iommu clock when system suspend
iommu/mediatek: Move pgtable allocation into domain_alloc
iommu/mediatek: Merge 2 M4U HWs into one iommu domain
...
The variable amd_iommu_pre_enabled is used in non-init
code-paths, so remove the __initdata annotation.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 3ac3e5ee5e ('iommu/amd: Copy old trans table from old kernel')
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This was reported by the kbuild bot. The condition in which
entry would be used uninitialized can not happen, because
when there is no iommu this function would never be called.
But its no fast-path, so fix the warning anyway.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It's ok to disable iommu early in normal kernel or in kdump kernel when
amd_iommu=off is specified. While we should not disable it in kdump kernel
when on-flight dma is still on-going.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When iommu is pre_enabled in kdump kernel, if a device is set up with
guest translations (DTE.GV=1), then don't copy GCR3 table root pointer
but move the device over to an empty guest-cr3 table and handle the
faults in the PPR log (which answer them with INVALID). After all these
PPR faults are recoverable for the device and we should not allow the
device to change old-kernels data when we don't have to.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD pointed out it's unsafe to update the device-table while iommu
is enabled. It turns out that device-table pointer update is split
up into two 32bit writes in the IOMMU hardware. So updating it while
the IOMMU is enabled could have some nasty side effects.
The safe way to work around this is to always allocate the device-table
below 4G, including the old device-table in normal kernel and the
device-table used for copying the content of the old device-table in kdump
kernel. Meanwhile we need check if the address of old device-table is
above 4G because it might has been touched accidentally in corrupted
1st kernel.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Firstly split the dev table entry copy into address translation part
and irq remapping part. Because these two parts could be enabled
independently.
Secondly do sanity check for address translation and irq remap of old
dev table entry separately.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Here several things need be done:
- If iommu is pre-enabled in a normal kernel, just disable it and print
warning.
- If any one of IOMMUs is not pre-enabled in kdump kernel, just continue
as it does in normal kernel.
- If failed to copy dev table of old kernel, continue to proceed as
it does in normal kernel.
- Only if all IOMMUs are pre-enabled and copy dev table is done well, free
the dev table allocated in early_amd_iommu_init() and make amd_iommu_dev_table
point to the copied one.
- Disable and Re-enable event/cmd buffer, install the copied DTE table
to reg, and detect and enable guest vapic.
- Flush all caches
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add function copy_dev_tables to copy the old DEV table entries of the panicked
kernel to the new allocated device table. Since all iommus share the same device
table the copy only need be done one time. Here add a new global old_dev_tbl_cpy
to point to the newly allocated device table which the content of old device
table will be copied to. Besides, we also need to:
- Check whether all IOMMUs actually use the same device table with the same size
- Verify that the size of the old device table is the expected size.
- Reserve the old domain id occupied in 1st kernel to avoid touching the old
io-page tables. Then on-flight DMA can continue looking it up.
And also define MACRO DEV_DOMID_MASK to replace magic number 0xffffULL, it can be
reused in copy_dev_tables().
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This reverts commit 54bd635704.
We still need the IO_PAGE_FAULT message to warn error after the
issue of on-flight dma in kdump kernel is fixed.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move single iommu enabling codes into a wrapper function early_enable_iommu().
This can make later kdump change easier.
And also add iommu_disable_command_buffer and iommu_disable_event_buffer
for later usage.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add functions to check whether translation is already enabled in IOMMU.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The register_syscore_ops() function takes a mutex and might
sleep. In the IOMMU initialization code it is invoked during
irq-remapping setup already, where irqs are disabled.
This causes a schedule-while-atomic bug:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0
no locks held by swapper/0/1.
irq event stamp: 304
hardirqs last enabled at (303): [<ffffffff818a87b6>] _raw_spin_unlock_irqrestore+0x36/0x60
hardirqs last disabled at (304): [<ffffffff8235d440>] enable_IR_x2apic+0x79/0x196
softirqs last enabled at (36): [<ffffffff818ae75f>] __do_softirq+0x35f/0x4ec
softirqs last disabled at (31): [<ffffffff810c1955>] irq_exit+0x105/0x120
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2.1.el7a.test.x86_64.debug #1
Hardware name: PowerEdge C6145 /040N24, BIOS 3.5.0 10/28/2014
Call Trace:
dump_stack+0x85/0xca
___might_sleep+0x22a/0x260
__might_sleep+0x4a/0x80
__mutex_lock+0x58/0x960
? iommu_completion_wait.part.17+0xb5/0x160
? register_syscore_ops+0x1d/0x70
? iommu_flush_all_caches+0x120/0x150
mutex_lock_nested+0x1b/0x20
register_syscore_ops+0x1d/0x70
state_next+0x119/0x910
iommu_go_to_state+0x29/0x30
amd_iommu_enable+0x13/0x23
Fix it by moving the register_syscore_ops() call to the next
initialization step, which runs with irqs enabled.
Reported-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Artem Savkov <asavkov@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2c0ae1720c ('iommu/amd: Convert iommu initialization to state machine')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IOMMU is programmed with physical addresses for the various tables
and buffers that are used to communicate between the device and the
driver. When the driver allocates this memory it is encrypted. In order
for the IOMMU to access the memory as encrypted the encryption mask needs
to be included in these physical addresses during configuration.
The PTE entries created by the IOMMU should also include the encryption
mask so that when the device behind the IOMMU performs a DMA, the DMA
will be performed to encrypted memory.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Cc: <iommu@lists.linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Toshimitsu Kani <toshi.kani@hpe.com>
Cc: kasan-dev@googlegroups.com
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/3053631ea25ba8b1601c351cb7c541c496f6d9bc.1500319216.git.thomas.lendacky@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
After we made sure that all IOMMUs have been disabled we
need to make sure that all resources we allocated are
released again.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The function will also be used to free iommu resources when
amd_iommu=off was specified on the kernel command line. So
rename the function to reflect that.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When booting, make sure the IOMMUs are disabled. They could
be previously enabled if we boot into a kexec or kdump
kernel. So make sure they are off.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When booting into a kdump kernel, suppress IO_PAGE_FAULTs by
default for all devices. But allow the faults again when a
domain is assigned to a device.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
As newer, higher speed devices are developed, perf data shows that the
amount of MMIO that is performed when submitting commands to the IOMMU
causes performance issues. Currently, the command submission path reads
the command buffer head and tail pointers and then writes the tail
pointer once the command is ready.
The tail pointer is only ever updated by the driver so it can be tracked
by the driver without having to read it from the hardware.
The head pointer is updated by the hardware, but can be read
opportunistically. Reading the head pointer only when it appears that
there might not be room in the command buffer and then re-checking the
available space reduces the number of times the head pointer has to be
read.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently, amd_iommu_pc_get_max_[banks|counters]() use end-point device
ID to locate an IOMMU and check the reported max banks/counters. The
logic assumes that the IOMMU_BASE_DEVID belongs to the first IOMMU, and
uses it to acquire a reference to the first IOMMU, which does not work
on certain systems. Instead, modify the function to take an IOMMU index,
and use it to query the corresponding AMD IOMMU instance.
Currently, hardcode the IOMMU index to 0 since the current AMD IOMMU
perf implementation supports only a single IOMMU. A subsequent patch
will add support for multiple IOMMUs, and will use a proper IOMMU index.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/1487926102-13073-7-git-send-email-Suravee.Suthikulpanit@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce amd_iommu_get_num_iommus(), which returns the value of
amd_iommus_present. The function is used to replace direct access to the
variable, which is now declared as static.
This function will also be used by AMD IOMMU perf driver.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/1487926102-13073-6-git-send-email-Suravee.Suthikulpanit@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The link between the iommu sysfs-device and the struct
amd_iommu is no longer stored as driver-data. Update the
code to the new correct way of getting from device to
amd_iommu.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Fixes: 39ab9555c2 ('iommu: Add sysfs bindings for struct iommu_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is currently support for iommu sysfs bindings, but
those need to be implemented in the IOMMU drivers. Add a
more generic version of this by adding a struct device to
struct iommu_device and use that for the sysfs bindings.
Also convert the AMD and Intel IOMMU driver to make use of
it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This struct represents one hardware iommu in the iommu core
code. For now it only has the iommu-ops associated with it,
but that will be extended soon.
The register/unregister interface is also added, as well as
making use of it in the Intel and AMD IOMMU drivers.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Prevent early_amd_iommu_init() from leaking memory mapped via
acpi_get_table() if check_ivrs_checksum() returns an error.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
- Move some Linux-specific functionality to upstream ACPICA and
update the in-kernel users of it accordingly (Lv Zheng).
- Drop a useless warning (triggered by the lack of an optional
object) from the ACPI namespace scanning code (Zhang Rui).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYW90kAAoJEILEb/54YlRxS78P/RZwSOwkgdowIMGOlO7Hz5Xz
6Psjfc3QdGhVGOau4F3Irg7HIDsPaHE5+xijwujWFK0wh89tknrChrFDrOueND6K
rZ4w72lYROQhtQl4bu9yXZ2TswP3I5aYERDPyqKfDAi+SaQBpZUtmWdoh0I/JN2M
svyuzxoRlNglgp2GWPnnnKHFe6plEeZgKjgoGjPAufDHakqYvP0cQwY3i4+PpuYh
prXb4Xr4fFvAG2MhM9ciT02ewrQJcYUgVj5bnuSBmMu0y0zUBJ1kK7abiQlf+lTy
/BCqQbllhHrrXQD0zkqi2oBFqWL4Bnj9r+5zj03KmBsfoz3u+zXr5CApDei8Yc06
gGIZXX9aCqBirahkpsMFYPEQVPoPFCek0slXSyF5xKjCWYyocwgUtHnB87uIiSFO
jCeSsWwT5IO9HnbsTf6x4xDWBdY6LOJgJyDirIGcNSUX+q4tfgn/LFrgN9Ck4M2g
xkZn8e8H7u09ACX9l6dHZ1PCHlg7bWKeH6gcqdo5R4NOVeZxt9YDIz13ERsG3D17
JZvdrAbBdC1fXfHnOLBj/BRy+TFvieWOC+kHBTAnPQlnkr7NjXXv/vgWU8flVL/z
kxlR2U2lD2XarG/b8OHvYxO2k/TDLRenN0IqqFmYbmyhH0dHYKGGBLY4TetAXKNL
N4EouS+xUBulCP2XOI8N
=55TG
-----END PGP SIGNATURE-----
Merge tag 'acpi-extra-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"Here are new versions of two ACPICA changes that were deferred
previously due to a problem they had introduced, two cleanups on top
of them and the removal of a useless warning message from the ACPI
core.
Specifics:
- Move some Linux-specific functionality to upstream ACPICA and
update the in-kernel users of it accordingly (Lv Zheng)
- Drop a useless warning (triggered by the lack of an optional
object) from the ACPI namespace scanning code (Zhang Rui)"
* tag 'acpi-extra-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
ACPICA: Tables: Allow FADT to be customized with virtual address
ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
ACPI: do not warn if _BQC does not exist
This patch removes the users of the deprectated APIs:
acpi_get_table_with_size()
early_acpi_os_unmap_memory()
The following APIs should be used instead of:
acpi_get_table()
acpi_put_table()
The deprecated APIs are invented to be a replacement of acpi_get_table()
during the early stage so that the early mapped pointer will not be stored
in ACPICA core and thus the late stage acpi_get_table() won't return a
wrong pointer. The mapping size is returned just because it is required by
early_acpi_os_unmap_memory() to unmap the pointer during early stage.
But as the mapping size equals to the acpi_table_header.length
(see acpi_tb_init_table_descriptor() and acpi_tb_validate_table()), when
such a convenient result is returned, driver code will start to use it
instead of accessing acpi_table_header to obtain the length.
Thus this patch cleans up the drivers by replacing returned table size with
acpi_table_header.length, and should be a no-op.
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This will get rid of a lot false positives caused by kmemleak being
unaware of the irq_remap_table. Based on a suggestion from Catalin Marinas.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce struct iommu_dev_data.use_vapic flag, which IOMMU driver
uses to determine if it should enable vAPIC support, by setting
the ga_mode bit in the device's interrupt remapping table entry.
Currently, it is enabled for all pass-through device if vAPIC mode
is enabled.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch adds support to detect and initialize IOMMU Guest vAPIC log
(GALOG). By default, it also enable GALog interrupt to notify IOMMU driver
when GA Log entry is created.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch enables support for the new 128-bit IOMMU IRTE format,
which can be used for both legacy and vapic interrupt remapping modes.
It replaces the existing operations on IRTE, which can only support
the older 32-bit IRTE format, with calls to the new struct amd_irt_ops.
It also provides helper functions for setting up, accessing, and
updating interrupt remapping table entries in different mode.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir,
which can be used to specify different interrupt remapping mode for
passthrough devices to VM guest:
* legacy: Legacy interrupt remapping (w/ 32-bit IRTE)
* vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit IRTE)
Note that in vapic mode, it can also supports legacy interrupt remapping
for non-passthrough devices with the 128-bit IRTE.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is a race condition in the AMD IOMMU init code that
causes requested unity mappings to be blocked by the IOMMU
for a short period of time. This results on boot failures
and IO_PAGE_FAULTs on some machines.
Fix this by making sure the unity mappings are installed
before all other DMA is blocked.
Fixes: aafd8ba0ca ('iommu/amd: Implement add_device and remove_device')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 2a0cb4e2d4 ("iommu/amd: Add new map for storing IVHD dev entry
type HID") added a call to DUMP_printk in init_iommu_from_acpi() which
used the value of devid before this variable was initialized.
Fixes: 2a0cb4e2d4 ('iommu/amd: Add new map for storing IVHD dev entry type HID')
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch introduces a new kernel parameter, ivrs_acpihid.
This is used to override existing ACPI-HID IVHD device entry,
or add an entry in case it is missing in the IVHD.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch introduces acpihid_map, which is used to store
the new IVHD device entry extracted from BIOS IVRS table.
It also provides a utility function add_acpi_hid_device(),
to add this types of devices to the map.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IVRS in more recent AMD system usually contains multiple
IVHD block types (e.g. 0x10, 0x11, and 0x40) for each IOMMU.
The newer IVHD types provide more information (e.g. new features
specified in the IOMMU spec), while maintain compatibility with
the older IVHD type.
Having multiple IVHD type allows older IOMMU drivers to still function
(e.g. using the older IVHD type 0x10) while the newer IOMMU driver can use
the newer IVHD types (e.g. 0x11 and 0x40). Therefore, the IOMMU driver
should only make use of the newest IVHD type that it can support.
This patch adds new logic to determine the highest level of IVHD type
it can support, and use it throughout the to initialize the driver.
This requires adding another pass to the IVRS parsing to determine
appropriate IVHD type (see function get_highest_supported_ivhd_type())
before parsing the contents.
[Vincent: fix the build error of IVHD_DEV_ACPI_HID flag not found]
Signed-off-by: Wan Zongshun <vincent.wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch modifies the existing struct ivhd_header,
which currently only support IVHD type 0x10, to add
new fields from IVHD type 11h and 40h.
It also modifies the pointer calculation to allow
support for IVHD type 11h and 40h
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IVHD header type 11h and 40h introduce the PCSup bit in
the EFR Register Image bit fileds. This should be used to
determine the IOMMU performance support instead of relying
on the PNCounters and PNBanks.
Note also that the PNCouters and PNBanks bits in the IOMMU
attributes field of IVHD headers type 11h are incorrectly
programmed on some systems.
So, we should not rely on it to determine the performance
counter/banks size. Instead, these values should be read
from the MMIO Offset 0030h IOMMU Extended Feature Register.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The AMD Family 15h Models 30h-3Fh (Kaveri) BIOS and Kernel Developer's
Guide omitted part of the BIOS IOMMU L2 register setup specification.
Without this setup the IOMMU L2 does not fully respect write permissions
when handling an ATS translation request.
The IOMMU L2 will set PTE dirty bit when handling an ATS translation with
write permission request, even when PTE RW bit is clear. This may occur by
direct translation (which would cause a PPR) or by prefetch request from
the ATC.
This is observed in practice when the IOMMU L2 modifies a PTE which maps a
pagecache page. The ext4 filesystem driver BUGs when asked to writeback
these (non-modified) pages.
Enable ATS write permission check in the Kaveri IOMMU L2 if BIOS has not.
Signed-off-by: Jay Cornwall <jay@jcornwall.me>
Cc: <stable@vger.kernel.org> # v3.19+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The setup code for the performance counters in the AMD IOMMU driver
tests whether the counters can be written. It tests to setup a counter
for device 00:00.0, which fails on systems where this particular device
is not covered by the IOMMU.
Fix this by not relying on device 00:00.0 but only on the IOMMU being
present.
Cc: stable@vger.kernel.org
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>