Commit Graph

160 Commits

Author SHA1 Message Date
Linus Torvalds e1cbc3b96a IOMMU Updates for Linux v5.19
Including:
 
 	- Intel VT-d driver updates
 	  - Domain force snooping improvement.
 	  - Cleanups, no intentional functional changes.
 
 	- ARM SMMU driver updates
 	  - Add new Qualcomm device-tree compatible strings
 	  - Add new Nvidia device-tree compatible string for Tegra234
 	  - Fix UAF in SMMUv3 shared virtual addressing code
 	  - Force identity-mapped domains for users of ye olde SMMU
 	    legacy binding
 	  - Minor cleanups
 
 	- Patches to fix a BUG_ON in the vfio_iommu_group_notifier
 	  - Groundwork for upcoming iommufd framework
 	  - Introduction of DMA ownership so that an entire IOMMU group
 	    is either controlled by the kernel or by user-space
 
 	- MT8195 and MT8186 support in the Mediatek IOMMU driver
 
 	- Patches to make forcing of cache-coherent DMA more coherent
 	  between IOMMU drivers
 
 	- Fixes for thunderbolt device DMA protection
 
 	- Various smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmKWCbUACgkQK/BELZcB
 GuPHmRAAuoH9iK/jrC3SgrqpBfH2iRN7ovIX8dFvgbQWX27lhXF4gvj2/nYdIvPK
 75j/LmdibuzV3Iez4kjbGKNG1AikwK3dKIH21a84f3ctnoamQyL6nMfCVBFaVD/D
 kvPpTHyjbGPNf6KZyWQdkJ5DXD1aoG1DKkBnslH5pTNPqGuNqbcnRTg0YxiJFLBv
 5w2B6jL06XRzunh+Sp1Dbj+po8ROjLRCEU+tdrndO8W/Dyp6+ZNNuxL9/3BM9zMj
 py0M4piFtGnhmJSdym1eeHm7r1YRjkZw+MN+e8NcrcSihmDutEWo7nRRxA5uVaa+
 3O2DNERqCvQUYxfNRUOKwzV8v51GYQHEPhvOe/MLgaEQDmDmlF2dHNGm93eCMdrv
 m1cT011oU7pa4qHomwLyTJxSsR7FzJ37igq/WeY++MBhl+frqfzEQPVxF+W7GLb8
 QvT/+woCPzLVpJbE7s0FUD4nbPd8c1dAz4+HO1DajxILIOTq1bnPIorSjgXODRjq
 yzsiP1rAg0L0PsL7pXn3cPMzNCE//xtOsRsAGmaVv6wBoMLyWVFCU/wjPEdjrSWA
 nXpAuCL84uxCEl/KLYMsg9UhjT6ko7CuKdsybIG9zNIiUau43uSqgTen0xCpYt0i
 m//O/X3tPyxmoLKRW+XVehGOrBZW+qrQny6hk/Zex+6UJQqVMTA=
 =W0hj
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Intel VT-d driver updates:
     - Domain force snooping improvement.
     - Cleanups, no intentional functional changes.

 - ARM SMMU driver updates:
     - Add new Qualcomm device-tree compatible strings
     - Add new Nvidia device-tree compatible string for Tegra234
     - Fix UAF in SMMUv3 shared virtual addressing code
     - Force identity-mapped domains for users of ye olde SMMU legacy
       binding
     - Minor cleanups

 - Fix a BUG_ON in the vfio_iommu_group_notifier:
     - Groundwork for upcoming iommufd framework
     - Introduction of DMA ownership so that an entire IOMMU group is
       either controlled by the kernel or by user-space

 - MT8195 and MT8186 support in the Mediatek IOMMU driver

 - Make forcing of cache-coherent DMA more coherent between IOMMU
   drivers

 - Fixes for thunderbolt device DMA protection

 - Various smaller fixes and cleanups

* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
  iommu/amd: Increase timeout waiting for GA log enablement
  iommu/s390: Tolerate repeat attach_dev calls
  iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
  iommu/vt-d: Remove domain_update_iommu_snooping()
  iommu/vt-d: Check domain force_snooping against attached devices
  iommu/vt-d: Block force-snoop domain attaching if no SC support
  iommu/vt-d: Size Page Request Queue to avoid overflow condition
  iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
  iommu/vt-d: Change return type of dmar_insert_one_dev_info()
  iommu/vt-d: Remove unneeded validity check on dev
  iommu/dma: Explicitly sort PCI DMA windows
  iommu/dma: Fix iova map result check bug
  iommu/mediatek: Fix NULL pointer dereference when printing dev_name
  iommu: iommu_group_claim_dma_owner() must always assign a domain
  iommu/arm-smmu: Force identity domains for legacy binding
  iommu/arm-smmu: Support Tegra234 SMMU
  dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
  dt-bindings: arm-smmu: Document nvidia,memory-controller property
  iommu/arm-smmu-qcom: Add SC8280XP support
  dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
  ...
2022-05-31 09:56:54 -07:00
Joerg Roedel b0dacee202 Merge branches 'apple/dart', 'arm/mediatek', 'arm/msm', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'vfio-notifier-fix' into next 2022-05-20 12:27:17 +02:00
Joerg Roedel 42bb5aa043 iommu/amd: Increase timeout waiting for GA log enablement
On some systems it can take a long time for the hardware to enable the
GA log of the AMD IOMMU. The current wait time is only 0.1ms, but
testing showed that it can take up to 14ms for the GA log to enter
running state after it has been enabled.

Sometimes the long delay happens when booting the system, sometimes
only on resume. Adjust the timeout accordingly to not print a warning
when hardware takes a longer than usual.

There has already been an attempt to fix this with commit

	9b45a7738e ("iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()")

But that commit was based on some wrong math and did not fix the issue
in all cases.

Cc: "D. Ziegfeld" <dzigg@posteo.de>
Cc: Jörg-Volker Peetz <jvpeetz@web.de>
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220520102214.12563-1-joro@8bytes.org
2022-05-20 12:23:19 +02:00
Vasant Hegde via iommu 9ed1d7f510 iommu/amd: Remove redundant check
smatch static checker warning:
  drivers/iommu/amd/init.c:1989 amd_iommu_init_pci()
  warn: duplicate check 'ret' (previous on line 1978)

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 06687a0380 ("iommu/amd: Improve error handling for amd_iommu_init_pci")
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220314070226.40641-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04 10:32:32 +02:00
Suravee Suthikulpanit 5edde870d3 iommu/amd: Do not call sleep while holding spinlock
Smatch static checker warns:
	drivers/iommu/amd/iommu_v2.c:133 free_device_state()
	warn: sleeping in atomic context

Fixes by storing the list of struct device_state in a temporary
list, and then free the memory after releasing the spinlock.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 9f968fc70d ("iommu/amd: Improve amd_iommu_v2_exit()")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20220314024321.37411-1-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04 10:32:11 +02:00
Jason Gunthorpe 6043257b1d iommu: Introduce the domain op enforce_cache_coherency()
This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.

Currently only Intel and AMD IOMMUs are known to support this
feature. They both implement it as an IOPTE bit, that when set, will cause
PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
the no-snoop bit was clear.

The new API is triggered by calling enforce_cache_coherency() before
mapping any IOVA to the domain which globally switches on no-snoop
blocking. This allows other implementations that might block no-snoop
globally and outside the IOPTE - AMD also documents such a HW capability.

Leave AMD out of sync with Intel and have it block no-snoop even for
in-kernel users. This can be trivially resolved in a follow up patch.

Only VFIO needs to call this API because it does not have detailed control
over the device to avoid requesting no-snoop behavior at the device
level. Other places using domains with real kernel drivers should simply
avoid asking their devices to set the no-snoop bit.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 17:24:57 +02:00
Mario Limonciello f1ca70717b iommu/amd: Indicate whether DMA remap support is enabled
Bit 1 of the IVFS IVInfo field indicates that IOMMU has been used for
pre-boot DMA protection.

Export this capability to allow other places in the kernel to be able to
check for it on AMD systems.

Link: https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ce7627fa1c596878ca6515dd9d4381a45b6ee38c.1650878781.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 10:30:25 +02:00
Mario Limonciello 121660bba6 iommu/amd: Enable swiotlb in all cases
Previously the AMD IOMMU would only enable SWIOTLB in certain
circumstances:
 * IOMMU in passthrough mode
 * SME enabled

This logic however doesn't work when an untrusted device is plugged in
that doesn't do page aligned DMA transactions.  The expectation is
that a bounce buffer is used for those transactions.

This fails like this:

swiotlb buffer is full (sz: 4096 bytes), total 0 (slots), used 0 (slots)

That happens because the bounce buffers have been allocated, followed by
freed during startup but the bounce buffering code expects that all IOMMUs
have left it enabled.

Remove the criteria to set up bounce buffers on AMD systems to ensure
they're always available for supporting untrusted devices.

Fixes: 82612d66d5 ("iommu: Allow the dma-iommu api to use bounce buffers")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220404204723.9767-2-mario.limonciello@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-04-28 10:26:49 +02:00
Christoph Hellwig 78013eaadf x86: remove the IOMMU table infrastructure
The IOMMU table tries to separate the different IOMMUs into different
backends, but actually requires various cross calls.

Rewrite the code to do the generic swiotlb/swiotlb-xen setup directly
in pci-dma.c and then just call into the IOMMU drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:10 +02:00
Joerg Roedel e17c6debd4 Merge branches 'arm/mediatek', 'arm/msm', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'x86/vt-d' and 'x86/amd' into next 2022-03-08 12:21:31 +01:00
Suravee Suthikulpanit 9f968fc70d iommu/amd: Improve amd_iommu_v2_exit()
During module exit, the current logic loops through all possible
16-bit device ID space to search for existing devices and clean up
device state structures. This can be simplified by looping through
the device state list.

Also, refactor various clean up logic into free_device_state()
for better reusability.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20220301085626.87680-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-08 12:19:29 +01:00
Vasant Hegde c1d5b57a1e iommu/amd: Remove unused struct fault.devid
This variable has not been used since it was introduced.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220301085626.87680-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-08 12:19:24 +01:00
Vasant Hegde 3bf01426a5 iommu/amd: Clean up function declarations
Remove unused declarations and add static keyword as needed.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220301085626.87680-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-08 12:19:14 +01:00
Vasant Hegde 434d2defa9 iommu/amd: Call memunmap in error path
Unmap old_devtb in error path.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20220301085626.87680-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-08 12:18:49 +01:00
Suravee Suthikulpanit 06687a0380 iommu/amd: Improve error handling for amd_iommu_init_pci
Add error messages to prevent silent failure.

Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20220301085626.87680-2-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04 10:39:19 +01:00
Lu Baolu 9a630a4b41 iommu: Split struct iommu_ops
Move the domain specific operations out of struct iommu_ops into a new
structure that only has domain specific operations. This solves the
problem of needing to know if the method vector for a given operation
needs to be retrieved from the device or the domain. Logically the domain
ops are the ones that make sense for external subsystems and endpoint
drivers to use, while device ops, with the sole exception of domain_alloc,
are IOMMU API internals.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:25:49 +01:00
Lu Baolu 41bb23e70b iommu: Remove unused argument in is_attach_deferred
The is_attach_deferred iommu_ops callback is a device op. The domain
argument is unnecessary and never used. Remove it to make code clean.

Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28 13:25:49 +01:00
Suravee Suthikulpanit 6b0b2d9a6a iommu/amd: Fix I/O page table memory leak
The current logic updates the I/O page table mode for the domain
before calling the logic to free memory used for the page table.
This results in IOMMU page table memory leak, and can be observed
when launching VM w/ pass-through devices.

Fix by freeing the memory used for page table before updating the mode.

Cc: Joerg Roedel <joro@8bytes.org>
Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Tested-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Fixes: e42ba06330 ("iommu/amd: Restructure code for freeing page table")
Link: https://lore.kernel.org/all/20220118194720.urjgi73b7c3tq2o6@oracle.com/
Link: https://lore.kernel.org/r/20220210154745.11524-1-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-14 12:52:40 +01:00
Lennert Buytenhek 5ce97f4ec5 iommu/amd: Recover from event log overflow
The AMD IOMMU logs I/O page faults and such to a ring buffer in
system memory, and this ring buffer can overflow.  The AMD IOMMU
spec has the following to say about the interrupt status bit that
signals this overflow condition:

	EventOverflow: Event log overflow. RW1C. Reset 0b. 1 = IOMMU
	event log overflow has occurred. This bit is set when a new
	event is to be written to the event log and there is no usable
	entry in the event log, causing the new event information to
	be discarded. An interrupt is generated when EventOverflow = 1b
	and MMIO Offset 0018h[EventIntEn] = 1b. No new event log
	entries are written while this bit is set. Software Note: To
	resume logging, clear EventOverflow (W1C), and write a 1 to
	MMIO Offset 0018h[EventLogEn].

The AMD IOMMU driver doesn't currently implement this recovery
sequence, meaning that if a ring buffer overflow occurs, logging
of EVT/PPR/GA events will cease entirely.

This patch implements the spec-mandated reset sequence, with the
minor tweak that the hardware seems to want to have a 0 written to
MMIO Offset 0018h[EventLogEn] first, before writing an 1 into this
field, or the IOMMU won't actually resume logging events.

Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YVrSXEdW2rzEfOvk@wantstofly.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-14 12:06:55 +01:00
Joerg Roedel 9b45a7738e iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()
The polling loop for the register change in iommu_ga_log_enable() needs
to have a udelay() in it.  Otherwise the CPU might be faster than the
IOMMU hardware and wrongly trigger the WARN_ON() further down the code
stream. Use a 10us for udelay(), has there is some hardware where
activation of the GA log can take more than a 100ms.

A future optimization should move the activation check of the GA log
to the point where it gets used for the first time. But that is a
bigger change and not suitable for a fix.

Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220204115537.3894-1-joro@8bytes.org
2022-02-04 12:57:26 +01:00
Joerg Roedel 66dc1b791c Merge branches 'arm/smmu', 'virtio', 'x86/amd', 'x86/vt-d' and 'core' into next 2022-01-04 10:33:45 +01:00
Matthew Wilcox (Oracle) ce00eece69 iommu/amd: Use put_pages_list
page->freelist is for the use of slab.  We already have the ability
to free a list of pages in the core mm, but it requires the use of a
list_head and for the pages to be chained together through page->lru.
Switch the AMD IOMMU code over to using free_pages_list().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[rm: split from original patch, cosmetic tweaks]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/73af128f651aaa1f38f69e586c66765a88ad2de0.1639753638.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-20 09:03:05 +01:00
Robin Murphy 6b3106e9ba iommu/amd: Simplify pagetable freeing
For reasons unclear, pagetable freeing is an effectively recursive
method implemented via an elaborate system of templated functions that
turns out to account for 25% of the object file size. Implementing it
using regular straightforward recursion makes the code simpler, and
seems like a good thing to do before we work on it further. As part of
that, also fix the types to avoid all the needless casting back and
forth which just gets in the way.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/d3d00c9f3fa0df4756b867072c201e6e82f9ce39.1639753638.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-20 09:03:05 +01:00
Paul Menzel 664c0b58e0 iommu/amd: Fix typo in *glues … together* in comment
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/r/20211217134916.43698-1-pmenzel@molgen.mpg.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-20 09:01:55 +01:00
Maxim Levitsky 575f5cfb13 iommu/amd: Remove useless irq affinity notifier
iommu->intcapxt_notify field is no longer used
after a switch to a separate domain was done

Fixes: d1adcfbb52 ("iommu/amd: Fix IOMMU interrupt generation in X2APIC mode")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-6-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:30:22 +01:00
Maxim Levitsky 1980105e3c iommu/amd: X2apic mode: mask/unmask interrupts on suspend/resume
Use IRQCHIP_MASK_ON_SUSPEND to make the core irq code to
mask the iommu interrupt on suspend and unmask it on the resume.

Since now the unmask function updates the INTX settings,
that will restore them on resume from s3/s4.

Since IRQCHIP_MASK_ON_SUSPEND is only effective for interrupts
which are not wakeup sources, remove IRQCHIP_SKIP_SET_WAKE flag
and instead implement a dummy .irq_set_wake which doesn't allow
the interrupt to become a wakeup source.

Fixes: 6692981295 ("iommu/amd: Add support for X2APIC IOMMU interrupts")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-5-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:30:18 +01:00
Maxim Levitsky 4691f79d62 iommu/amd: X2apic mode: setup the INTX registers on mask/unmask
This is more logically correct and will also allow us to
to use mask/unmask logic to restore INTX setttings after
the resume from s3/s4.

Fixes: 6692981295 ("iommu/amd: Add support for X2APIC IOMMU interrupts")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-4-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:30:14 +01:00
Maxim Levitsky 01b297a48a iommu/amd: X2apic mode: re-enable after resume
Otherwise it is guaranteed to not work after the resume...

Fixes: 6692981295 ("iommu/amd: Add support for X2APIC IOMMU interrupts")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-3-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:20:43 +01:00
Maxim Levitsky a8d4a37d1b iommu/amd: Restore GA log/tail pointer on host resume
This will give IOMMU GA log a chance to work after resume
from s3/s4.

Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20211123161038.48009-2-mlevitsk@redhat.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17 09:20:39 +01:00
Joerg Roedel 717e88aad3 iommu/amd: Clarify AMD IOMMUv2 initialization messages
The messages printed on the initialization of the AMD IOMMUv2 driver
have caused some confusion in the past. Clarify the messages to lower
the confusion in the future.

Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20211123105507.7654-3-joro@8bytes.org
2021-11-26 22:54:20 +01:00
Linus Torvalds 7e113d01f5 IOMMU Updates for Linux v5.16:
Including:
 
   - Intel IOMMU Updates fro Lu Baolu:
     - Dump DMAR translation structure when DMA fault occurs
     - An optimization in the page table manipulation code
     - Use second level for GPA->HPA translation
     - Various cleanups
 
   - Arm SMMU Updates from Will
     - Minor optimisations to SMMUv3 command creation and submission
     - Numerous new compatible string for Qualcomm SMMUv2 implementations
 
   - Fixes for the SWIOTLB based implemenation of dma-iommu code for
     untrusted devices
 
   - Add support for r8a779a0 to the Renesas IOMMU driver and DT matching
     code for r8a77980
 
   - A couple of cleanups and fixes for the Apple DART IOMMU driver
 
   - Make use of generic report_iommu_fault() interface in the AMD IOMMU
     driver
 
   - Various smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmGD6NQACgkQK/BELZcB
 GuOSfg/9FKXl5ym86BP3tAS1fREKH7p59JRGZrrIR89NyHAcEUjtNG3YLPao+YxU
 3CDgLkru+vlDpYY54QoyqcY5FgIHT3Cna/Cdk4zekRmSO/14gHp47jtZRheOUzLF
 rvwfaplcbbtT8akpsVFzvw8YpQLGSDiDQSl7xL2+40Z9hiYX/gS9Af+PH98tAXsa
 yZKZj6gU+JXM58VihO3M7umyE06tovyBaYgcsBZtbf66bGc0ySu+fe75UVWbueRt
 Z8jwqa7TUfVXiYC8h+LqtGET6gtzNSsxAU3VllRe7Brf6K8i/yaRs/TO2Hp83d7/
 q/fcK3vNQ5v3aDNci/DjBB8SEySzCmRz/9ocCOCx8ByuRp+5lwVRPPq3WcUMtsZY
 QpYo9Fk7luFz2Gj5LObKAVBvOoeBZ5Km3oPs4HVmQ6epxn/rVckJDnJnVSLJuATq
 tSZC2heRfFlg1dT6WFaynCTP2RI1LlNEdKhHirV6L368rSjmF0ZdQxdTpHULsHr1
 yMjqL21OfcSkLW91rvfb3g68EsIwDbCPGTOlQWZLmAtwOWtHSCLPgwwEG7WefZbH
 yaslpmlUTOurUnFmpxlfLicy5sqsBL2ASzGJkEKrgunw82Ke96zzkRzi+9j9HeS6
 g0AyIWMi1cUAjONVUZtV4yjImXh63HIPiKx730a9teodusoxm+Q=
 =waUR
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Intel IOMMU Updates fro Lu Baolu:
     - Dump DMAR translation structure when DMA fault occurs
     - An optimization in the page table manipulation code
     - Use second level for GPA->HPA translation
     - Various cleanups

 - Arm SMMU Updates from Will
     - Minor optimisations to SMMUv3 command creation and submission
     - Numerous new compatible string for Qualcomm SMMUv2 implementations

 - Fixes for the SWIOTLB based implemenation of dma-iommu code for
   untrusted devices

 - Add support for r8a779a0 to the Renesas IOMMU driver and DT matching
   code for r8a77980

 - A couple of cleanups and fixes for the Apple DART IOMMU driver

 - Make use of generic report_iommu_fault() interface in the AMD IOMMU
   driver

 - Various smaller fixes and cleanups

* tag 'iommu-updates-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (35 commits)
  iommu/dma: Fix incorrect error return on iommu deferred attach
  iommu/dart: Initialize DART_STREAMS_ENABLE
  iommu/dma: Use kvcalloc() instead of kvzalloc()
  iommu/tegra-smmu: Use devm_bitmap_zalloc when applicable
  iommu/dart: Use kmemdup instead of kzalloc and memcpy
  iommu/vt-d: Avoid duplicate removing in __domain_mapping()
  iommu/vt-d: Convert the return type of first_pte_in_page to bool
  iommu/vt-d: Clean up unused PASID updating functions
  iommu/vt-d: Delete dev_has_feat callback
  iommu/vt-d: Use second level for GPA->HPA translation
  iommu/vt-d: Check FL and SL capability sanity in scalable mode
  iommu/vt-d: Remove duplicate identity domain flag
  iommu/vt-d: Dump DMAR translation structure when DMA fault occurs
  iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel option
  iommu/arm-smmu-qcom: Request direct mapping for modem device
  iommu: arm-smmu-qcom: Add compatible for QCM2290
  dt-bindings: arm-smmu: Add compatible for QCM2290 SoC
  iommu/arm-smmu-qcom: Add SM6350 SMMU compatible
  dt-bindings: arm-smmu: Add compatible for SM6350 SoC
  iommu/arm-smmu-v3: Properly handle the return value of arm_smmu_cmdq_build_cmd()
  ...
2021-11-04 11:11:24 -07:00
Linus Torvalds 2dc26d98cf overflow updates for v5.16-rc1
The end goal of the current buffer overflow detection work[0] is to gain
 full compile-time and run-time coverage of all detectable buffer overflows
 seen via array indexing or memcpy(), memmove(), and memset(). The str*()
 family of functions already have full coverage.
 
 While much of the work for these changes have been on-going for many
 releases (i.e. 0-element and 1-element array replacements, as well as
 avoiding false positives and fixing discovered overflows[1]), this series
 contains the foundational elements of several related buffer overflow
 detection improvements by providing new common helpers and FORTIFY_SOURCE
 changes needed to gain the introspection required for compiler visibility
 into array sizes. Also included are a handful of already Acked instances
 using the helpers (or related clean-ups), with many more waiting at the
 ready to be taken via subsystem-specific trees[2]. The new helpers are:
 
 - struct_group() for gaining struct member range introspection.
 - memset_after() and memset_startat() for clearing to the end of structures.
 - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in structs.
 
 Also included is the beginning of the refactoring of FORTIFY_SOURCE to
 support memcpy() introspection, fix missing and regressed coverage under
 GCC, and to prepare to fix the currently broken Clang support. Finishing
 this work is part of the larger series[0], but depends on all the false
 positives and buffer overflow bug fixes to have landed already and those
 that depend on this series to land.
 
 As part of the FORTIFY_SOURCE refactoring, a set of both a compile-time
 and run-time tests are added for FORTIFY_SOURCE and the mem*()-family
 functions respectively. The compile time tests have found a legitimate
 (though corner-case) bug[6] already.
 
 Please note that the appearance of "panic" and "BUG" in the
 FORTIFY_SOURCE refactoring are the result of relocating existing code,
 and no new use of those code-paths are expected nor desired.
 
 Finally, there are two tree-wide conversions for 0-element arrays and
 flexible array unions to gain sane compiler introspection coverage that
 result in no known object code differences.
 
 After this series (and the changes that have now landed via netdev
 and usb), we are very close to finally being able to build with
 -Warray-bounds and -Wzero-length-bounds. However, due corner cases in
 GCC[3] and Clang[4], I have not included the last two patches that turn
 on these options, as I don't want to introduce any known warnings to
 the build. Hopefully these can be solved soon.
 
 [0] https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/
 [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE
 [2] https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/
 [3] https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/
 [4] https://bugs.llvm.org/show_bug.cgi?id=51682
 [5] https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/
 [6] https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAFWcWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmKFD/45MJdnvW5MhIEeW5tc5UjfcIPS
 ae+YvlEX/2ZwgSlTxocFVocE6hz7b6eCiX3dSAChPkPxsSfgeiuhjxsU+4ROnELR
 04RqTA/rwT6JXfJcXbDPXfxDL4huUkgktAW3m1sT771AZspeap2GrSwFyttlTqKA
 +kTiZ3lXJVFcw10uyhfp3Lk6eFJxdf5iOjuEou5kBOQfpNKEOduRL2K15hSowOwB
 lARiAC+HbmN+E+npvDE7YqK4V7ZQ0/dtB0BlfqgTkn1spQz8N21kBAMpegV5vvIk
 A+qGHc7q2oyk4M14TRTidQHGQ4juW1Kkvq3NV6KzwQIVD+mIfz0ESn3d4tnp28Hk
 Y+OXTI1BRFlApQU9qGWv33gkNEozeyqMLDRLKhDYRSFPA9UKkpgXQRzeTzoLKyrQ
 4B6n5NnUGcu7I6WWhpyZQcZLDsHGyy0vHzjQGs/NXtb1PzXJ5XIGuPdmx9pVMykk
 IVKnqRcWyGWahfh3asOnoXvdhi1No4NSHQ/ZHfUM+SrIGYjBMaUisw66qm3Fe8ZU
 lbO2CFkCsfGSoKNPHf0lUEGlkyxAiDolazOfflDNxdzzlZo2X1l/a7O/yoO4Pqul
 cdL0eDjiNoQ2YR2TSYPnXq5KSL1RI0tlfS8pH8k1hVhZsQx0wpAQ+qki0S+fLePV
 PdA9XB82G2tmqKc9cQ==
 =9xbT
 -----END PGP SIGNATURE-----

Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull overflow updates from Kees Cook:
 "The end goal of the current buffer overflow detection work[0] is to
  gain full compile-time and run-time coverage of all detectable buffer
  overflows seen via array indexing or memcpy(), memmove(), and
  memset(). The str*() family of functions already have full coverage.

  While much of the work for these changes have been on-going for many
  releases (i.e. 0-element and 1-element array replacements, as well as
  avoiding false positives and fixing discovered overflows[1]), this
  series contains the foundational elements of several related buffer
  overflow detection improvements by providing new common helpers and
  FORTIFY_SOURCE changes needed to gain the introspection required for
  compiler visibility into array sizes. Also included are a handful of
  already Acked instances using the helpers (or related clean-ups), with
  many more waiting at the ready to be taken via subsystem-specific
  trees[2].

  The new helpers are:

   - struct_group() for gaining struct member range introspection

   - memset_after() and memset_startat() for clearing to the end of
     structures

   - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
     structs

  Also included is the beginning of the refactoring of FORTIFY_SOURCE to
  support memcpy() introspection, fix missing and regressed coverage
  under GCC, and to prepare to fix the currently broken Clang support.
  Finishing this work is part of the larger series[0], but depends on
  all the false positives and buffer overflow bug fixes to have landed
  already and those that depend on this series to land.

  As part of the FORTIFY_SOURCE refactoring, a set of both a
  compile-time and run-time tests are added for FORTIFY_SOURCE and the
  mem*()-family functions respectively. The compile time tests have
  found a legitimate (though corner-case) bug[6] already.

  Please note that the appearance of "panic" and "BUG" in the
  FORTIFY_SOURCE refactoring are the result of relocating existing code,
  and no new use of those code-paths are expected nor desired.

  Finally, there are two tree-wide conversions for 0-element arrays and
  flexible array unions to gain sane compiler introspection coverage
  that result in no known object code differences.

  After this series (and the changes that have now landed via netdev and
  usb), we are very close to finally being able to build with
  -Warray-bounds and -Wzero-length-bounds.

  However, due corner cases in GCC[3] and Clang[4], I have not included
  the last two patches that turn on these options, as I don't want to
  introduce any known warnings to the build. Hopefully these can be
  solved soon"

Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]

* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
  fortify: strlen: Avoid shadowing previous locals
  compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
  treewide: Replace 0-element memcpy() destinations with flexible arrays
  treewide: Replace open-coded flex arrays in unions
  stddef: Introduce DECLARE_FLEX_ARRAY() helper
  btrfs: Use memset_startat() to clear end of struct
  string.h: Introduce memset_startat() for wiping trailing members and padding
  xfrm: Use memset_after() to clear padding
  string.h: Introduce memset_after() for wiping trailing members/padding
  lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
  fortify: Add compile-time FORTIFY_SOURCE tests
  fortify: Allow strlen() and strnlen() to pass compile-time known lengths
  fortify: Prepare to improve strnlen() and strlen() warnings
  fortify: Fix dropped strcpy() compile-time write overflow check
  fortify: Explicitly disable Clang support
  fortify: Move remaining fortify helpers into fortify-string.h
  lib/string: Move helper functions out of string.c
  compiler_types.h: Remove __compiletime_object_size()
  cm4000_cs: Use struct_group() to zero struct cm4000_dev region
  can: flexcan: Use struct_group() to zero struct flexcan_regs regions
  ...
2021-11-01 17:12:56 -07:00
Tom Lendacky e9d1d2bb75 treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
Replace uses of mem_encrypt_active() with calls to cc_platform_has() with
the CC_ATTR_MEM_ENCRYPT attribute.

Remove the implementation of mem_encrypt_active() across all arches.

For s390, since the default implementation of the cc_platform_has()
matches the s390 implementation of mem_encrypt_active(), cc_platform_has()
does not need to be implemented in s390 (the config option
ARCH_HAS_CC_PLATFORM is not set).

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210928191009.32551-9-bp@alien8.de
2021-10-04 11:47:24 +02:00
Tom Lendacky 32cb4d02fb x86/sme: Replace occurrences of sme_active() with cc_platform_has()
Replace uses of sme_active() with the more generic cc_platform_has()
using CC_ATTR_HOST_MEM_ENCRYPT. If future support is added for other
memory encryption technologies, the use of CC_ATTR_HOST_MEM_ENCRYPT
can be updated, as required.

This also replaces two usages of sev_active() that are really geared
towards detecting if SME is active.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210928191009.32551-6-bp@alien8.de
2021-10-04 11:46:46 +02:00
Lennert Buytenhek 9f78e446bd iommu/amd: Use report_iommu_fault()
This patch makes iommu/amd call report_iommu_fault() when an I/O page
fault occurs, which has two effects:

1) It allows device drivers to register a callback to be notified of
   I/O page faults, via the iommu_set_fault_handler() API.

2) It triggers the io_page_fault tracepoint in report_iommu_fault()
   when an I/O page fault occurs.

The latter point is the main aim of this patch, as it allows
rasdaemon-like daemons to be notified of I/O page faults, and to
possibly initiate corrective action in response.

A number of other IOMMU drivers already use report_iommu_fault(), and
I/O page faults on those IOMMUs therefore already trigger this
tracepoint -- but this isn't yet the case for AMD-Vi and Intel DMAR.

The AMD IOMMU specification suggests that the bit in an I/O page fault
event log entry that signals whether an I/O page fault was for a read
request or for a write request is only meaningful when the faulting
access was to a present page, but some testing on a Ryzen 3700X suggests
that this bit encodes the correct value even for I/O page faults to
non-present pages, and therefore, this patch passes the R/W information
up the stack even for I/O page faults to non-present pages.

Signed-off-by: Lennert Buytenhek <buytenh@arista.com>
Link: https://lore.kernel.org/r/YVLyBW97vZLpOaAp@wantstofly.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-29 14:05:20 +02:00
Kees Cook 43d83af8a5 iommu/amd: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct ivhd_entry around members ext and hidh, so
they can be referenced together. This will allow memcpy() and sizeof()
to more easily reason about sizes, improve readability, and avoid future
warnings about writing beyond the end of ext.

"pahole" shows no size nor member offset changes to struct ivhd_entry.
"objdump -d" shows no object code changes.

Cc: Will Deacon <will@kernel.org>
Cc: iommu@lists.linux-foundation.org
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
2021-09-25 08:20:48 -07:00
Suravee Suthikulpanit eb03f2d2f6 iommu/amd: Remove iommu_init_ga()
Since the function has been simplified and only call iommu_init_ga_log(),
remove the function and replace with iommu_init_ga_log() instead.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-4-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09 13:18:06 +02:00
Wei Huang c3811a50ad iommu/amd: Relocate GAMSup check to early_enable_iommus
Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
(i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set.
However it forgets to clear IRQ_POSTING_CAP from the previously set
amd_iommu_irq_ops.capability.

This triggers an invalid page fault bug during guest VM warm reboot
if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is
incorrectly set, and crash the system with the following kernel trace.

    BUG: unable to handle page fault for address: 0000000000400dd8
    RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
    Call Trace:
     svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
     ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
     kvm_request_apicv_update+0x10c/0x150 [kvm]
     svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
     svm_enable_irq_window+0x26/0xa0 [kvm_amd]
     vcpu_enter_guest+0xbbe/0x1560 [kvm]
     ? avic_vcpu_load+0xd5/0x120 [kvm_amd]
     ? kvm_arch_vcpu_load+0x76/0x240 [kvm]
     ? svm_get_segment_base+0xa/0x10 [kvm_amd]
     kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
     kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
     __x64_sys_ioctl+0x84/0xc0
     do_syscall_64+0x33/0x40
     entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes by moving the initializing of AMD IOMMU interrupt remapping mode
(amd_iommu_guest_ir) earlier before setting up the
amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag.

[joro:	Squashed the two patches and limited
	check_features_on_all_iommus() to CONFIG_IRQ_REMAP
	to fix a compile warning.]

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com
Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09 13:15:02 +02:00
Joerg Roedel d8768d7eb9 Merge branches 'apple/dart', 'arm/smmu', 'iommu/fixes', 'x86/amd', 'x86/vt-d' and 'core' into next 2021-08-20 17:14:35 +02:00
Robin Murphy 6d59603939 iommu/amd: Prepare for multiple DMA domain types
The DMA ops reset/setup can simply be unconditional, since
iommu-dma already knows only to touch DMA domains.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6450b4f39a5a086d505297b4a53ff1e4a7a0fe7c.1628682049.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:27:49 +02:00
Robin Murphy 3f166dae1a iommu/amd: Drop IOVA cookie management
The core code bakes its own cookies now.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/648e74e7422caa6a7db7fb0c36813c7bd2007af8.1628682048.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-18 13:25:31 +02:00
Joerg Roedel 47a70bea54 iommu/amd: Remove stale amd_iommu_unmap_flush usage
Remove the new use of the variable introduced in the AMD driver branch.
The variable was removed already in the iommu core branch, causing build
errors when the brances are merged.

Cc: Nadav Amit <namit@vmware.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20210802150643.3634-1-joro@8bytes.org
2021-08-02 17:07:39 +02:00
Joerg Roedel 1d65b90847 Merge remote-tracking branch 'korg/core' into x86/amd 2021-08-02 17:00:28 +02:00
Nadav Amit a270be1b3f iommu/amd: Use only natural aligned flushes in a VM
When running on an AMD vIOMMU, it is better to avoid TLB flushes
of unmodified PTEs. vIOMMUs require the hypervisor to synchronize the
virtualized IOMMU's PTEs with the physical ones. This process induce
overheads.

AMD IOMMU allows us to flush any range that is aligned to the power of
2. So when running on top of a vIOMMU, break the range into sub-ranges
that are naturally aligned, and flush each one separately. This apporach
is better when running with a vIOMMU, but on physical IOMMUs, the
penalty of IOTLB misses due to unnecessary flushed entries is likely to
be low.

Repurpose (i.e., keeping the name, changing the logic)
domain_flush_pages() so it is used to choose whether to perform one
flush of the whole range or multiple ones to avoid flushing unnecessary
ranges. Use NpCache, as usual, to infer whether the IOMMU is physical or
virtual.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Jiajun Cao <caojiajun@vmware.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210723093209.714328-8-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02 11:26:06 +02:00
Nadav Amit 3b122a5666 iommu/amd: Sync once for scatter-gather operations
On virtual machines, software must flush the IOTLB after each page table
entry update.

The iommu_map_sg() code iterates through the given scatter-gather list
and invokes iommu_map() for each element in the scatter-gather list,
which calls into the vendor IOMMU driver through iommu_ops callback. As
the result, a single sg mapping may lead to multiple IOTLB flushes.

Fix this by adding amd_iotlb_sync_map() callback and flushing at this
point after all sg mappings we set.

This commit is followed and inspired by commit 933fcd01e9
("iommu/vt-d: Add iotlb_sync_map callback").

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Jiajun Cao <caojiajun@vmware.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210723093209.714328-7-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02 11:26:06 +02:00
Nadav Amit fe6d269d0e iommu/amd: Tailored gather logic for AMD
AMD's IOMMU can flush efficiently (i.e., in a single flush) any range.
This is in contrast, for instnace, to Intel IOMMUs that have a limit on
the number of pages that can be flushed in a single flush.  In addition,
AMD's IOMMU do not care about the page-size, so changes of the page size
do not need to trigger a TLB flush.

So in most cases, a TLB flush due to disjoint range is not needed for
AMD. Yet, vIOMMUs require the hypervisor to synchronize the virtualized
IOMMU's PTEs with the physical ones. This process induce overheads, so
it is better not to cause unnecessary flushes, i.e., flushes of PTEs
that were not modified.

Implement and use amd_iommu_iotlb_gather_add_page() and use it instead
of the generic iommu_iotlb_gather_add_page(). Ignore disjoint regions
unless "non-present cache" feature is reported by the IOMMU
capabilities, as this is an indication we are running on a physical
IOMMU. A similar indication is used by VT-d (see "caching mode"). The
new logic retains the same flushing behavior that we had before the
introduction of page-selective IOTLB flushes for AMD.

On virtualized environments, check if the newly flushed region and the
gathered one are disjoint and flush if it is.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Jiajun Cao <caojiajun@vmware.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210723093209.714328-6-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02 11:26:05 +02:00
Nadav Amit 6664340cf1 iommu/amd: Do not use flush-queue when NpCache is on
Do not use flush-queue on virtualized environments, where the NpCache
capability of the IOMMU is set. This is required to reduce
virtualization overheads.

This change follows a similar change to Intel's VT-d and a detailed
explanation as for the rationale is described in commit 29b3283972
("iommu/vt-d: Do not use flush-queue when caching-mode is on").

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Jiajun Cao <caojiajun@vmware.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210723093209.714328-3-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02 11:26:05 +02:00
Nadav Amit fc65d0acaf iommu/amd: Selective flush on unmap
Recent patch attempted to enable selective page flushes on AMD IOMMU but
neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes.

Adapt amd_iommu_iotlb_sync() to use selective flushes and change
amd_iommu_unmap() to collect the flushes. As a defensive measure, to
avoid potential issues as those that the Intel IOMMU driver encountered
recently, flush the page-walk caches by always setting the "pde"
parameter. This can be removed later.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: Jiajun Cao <caojiajun@vmware.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
Link: https://lore.kernel.org/r/20210723093209.714328-2-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-02 11:26:05 +02:00
Lennert Buytenhek ee974d9625 iommu/amd: Fix printing of IOMMU events when rate limiting kicks in
For the printing of RMP_HW_ERROR / RMP_PAGE_FAULT / IO_PAGE_FAULT
events, the AMD IOMMU code uses such logic:

	if (pdev)
		dev_data = dev_iommu_priv_get(&pdev->dev);

	if (dev_data && __ratelimit(&dev_data->rs)) {
		pci_err(pdev, ...
	} else {
		printk_ratelimit() / pr_err{,_ratelimited}(...
	}

This means that if we receive an event for a PCI devid which actually
does have a struct pci_dev and an attached struct iommu_dev_data, but
rate limiting kicks in, we'll fall back to the non-PCI branch of the
test, and print the event in a different format.

Fix this by changing the logic to:

	if (dev_data) {
		if (__ratelimit(&dev_data->rs)) {
			pci_err(pdev, ...
		}
	} else {
		pr_err_ratelimited(...
	}

Suggested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/YPgk1dD1gPMhJXgY@wantstofly.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 14:22:40 +02:00
Xiyu Yang via iommu 8bc54824da iommu/amd: Convert from atomic_t to refcount_t on pasid_state->count
refcount_t type and corresponding API can protect refcounters from
accidental underflow and overflow and further use-after-free situations.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/1626683578-64214-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26 13:46:57 +02:00