This includes a shared branch with VFIO:
- Enhance VFIO_DEVICE_GET_PCI_HOT_RESET_INFO so it can work with iommufd
FDs, not just group FDs. This removes the last place in the uAPI that
required the group fd.
- Give VFIO a new device node /dev/vfio/devices/vfioX (the so called cdev
node) which is very similar to the FD from VFIO_GROUP_GET_DEVICE_FD.
The cdev is associated with the struct device that the VFIO driver is
bound to and shows up in sysfs in the normal way.
- Add a cdev IOCTL VFIO_DEVICE_BIND_IOMMUFD which allows a newly opened
/dev/vfio/devices/vfioX to be associated with an IOMMUFD, this replaces
the VFIO_GROUP_SET_CONTAINER flow.
- Add cdev IOCTLs VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT to allow the IOMMU
translation the vfio_device is associated with to be changed. This is a
significant new feature for VFIO as previously each vfio_device was
fixed to a single translation.
The translation is under the control of iommufd, so it can be any of
the different translation modes that iommufd is learning to create.
At this point VFIO has compilation options to remove the legacy interfaces
and in modern mode it behaves like a normal driver subsystem. The
/dev/vfio/iommu and /dev/vfio/groupX nodes are not present and each
vfio_device only has a /dev/vfio/devices/vfioX cdev node that represents
the device.
On top of this is built some of the new iommufd functionality:
- IOMMU_HWPT_ALLOC allows userspace to directly create the low level
IO Page table objects and affiliate them with IOAS objects that hold
the translation mapping. This is the basic functionality for the
normal IOMMU_DOMAIN_PAGING domains.
- VFIO_DEVICE_ATTACH_IOMMUFD_PT can be used to replace the current
translation. This is wired up to through all the layers down to the
driver so the driver has the ability to implement a hitless
replacement. This is necessary to fully support guest behaviors when
emulating HW (eg guest atomic change of translation)
- IOMMU_GET_HW_INFO returns information about the IOMMU driver HW that
owns a VFIO device. This includes support for the Intel iommu, and
patches have been posted for all the other server IOMMU.
Along the way are a number of internal items:
- New iommufd kapis iommufd_ctx_has_group(), iommufd_device_to_ictx(),
iommufd_device_to_id(), iommufd_access_detach(), iommufd_ctx_from_fd(),
iommufd_device_replace()
- iommufd now internally tracks iommu_groups as it needs some per-group
data
- Reorganize how the internal hwpt allocation flows to have more robust
locking
- Improve the access interfaces to support detach and replace of an IOAS
from an access
- New selftests and a rework of how the selftests creates a mock iommu
driver to be more like a real iommu driver
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZO/QDQAKCRCFwuHvBreF
YZ2iAP4hNEF6MJLRI2A28V3I/80f3x9Ed3Cirp/Q8ZdVEE+HYQD8DFaafJ0y3iPQ
5mxD4ZrZ9KfUns/gUqCT5oPHjrcvSAM=
=EQCw
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"On top of the vfio updates is built some new iommufd functionality:
- IOMMU_HWPT_ALLOC allows userspace to directly create the low level
IO Page table objects and affiliate them with IOAS objects that
hold the translation mapping. This is the basic functionality for
the normal IOMMU_DOMAIN_PAGING domains.
- VFIO_DEVICE_ATTACH_IOMMUFD_PT can be used to replace the current
translation. This is wired up to through all the layers down to the
driver so the driver has the ability to implement a hitless
replacement. This is necessary to fully support guest behaviors
when emulating HW (eg guest atomic change of translation)
- IOMMU_GET_HW_INFO returns information about the IOMMU driver HW
that owns a VFIO device. This includes support for the Intel iommu,
and patches have been posted for all the other server IOMMU.
Along the way are a number of internal items:
- New iommufd kernel APIs: iommufd_ctx_has_group(),
iommufd_device_to_ictx(), iommufd_device_to_id(),
iommufd_access_detach(), iommufd_ctx_from_fd(),
iommufd_device_replace()
- iommufd now internally tracks iommu_groups as it needs some
per-group data
- Reorganize how the internal hwpt allocation flows to have more
robust locking
- Improve the access interfaces to support detach and replace of an
IOAS from an access
- New selftests and a rework of how the selftests creates a mock
iommu driver to be more like a real iommu driver"
Link: https://lore.kernel.org/lkml/ZO%2FTe6LU1ENf58ZW@nvidia.com/
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (34 commits)
iommufd/selftest: Don't leak the platform device memory when unloading the module
iommu/vt-d: Implement hw_info for iommu capability query
iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl
iommufd: Add IOMMU_GET_HW_INFO
iommu: Add new iommu op to get iommu hardware information
iommu: Move dev_iommu_ops() to private header
iommufd: Remove iommufd_ref_to_users()
iommufd/selftest: Make the mock iommu driver into a real driver
vfio: Support IO page table replacement
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
iommufd: Add iommufd_access_replace() API
iommufd: Use iommufd_access_change_ioas in iommufd_access_destroy_object
iommufd: Add iommufd_access_change_ioas(_id) helpers
iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
iommufd/selftest: Return the real idev id from selftest mock_domain
iommufd: Add IOMMU_HWPT_ALLOC
iommufd/selftest: Test iommufd_device_replace()
iommufd: Make destroy_rwsem use a lock class per object type
...