OpenCloudOS-Kernel/drivers/vfio
Brett Creeley 66aa5d95ea vfio/pds: Make sure migration file isn't accessed after reset
[ Upstream commit 457f7308254756b6e4b8fc3876cb770dcf0e7cc7 ]

It's possible the migration file is accessed after reset when it has
been cleaned up, especially when it's initiated by the device. This is
because the driver doesn't rip out the filep when cleaning up it only
frees the related page structures and sets its local struct
pds_vfio_lm_file pointer to NULL. This can cause a NULL pointer
dereference, which is shown in the example below during a restore after
a device initiated reset:

BUG: kernel NULL pointer dereference, address: 000000000000000c
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:pds_vfio_get_file_page+0x5d/0xf0 [pds_vfio_pci]
[...]
Call Trace:
 <TASK>
 pds_vfio_restore_write+0xf6/0x160 [pds_vfio_pci]
 vfs_write+0xc9/0x3f0
 ? __fget_light+0xc9/0x110
 ksys_write+0xb5/0xf0
 __x64_sys_write+0x1a/0x20
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
[...]

Add a disabled flag to the driver's struct pds_vfio_lm_file that gets
set during cleanup. Then make sure to check the flag when the migration
file is accessed via its file_operations. By default this flag will be
false as the memory for struct pds_vfio_lm_file is kzalloc'd, which means
the struct pds_vfio_lm_file is enabled and accessible. Also, since the
file_operations and driver's migration file cleanup happen under the
protection of the same pds_vfio_lm_file.lock, using this flag is thread
safe.

Fixes: 8512ed256334 ("vfio/pds: Always clear the save/restore FDs on reset")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Link: https://lore.kernel.org/r/20240308182149.22036-2-brett.creeley@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03 15:28:59 +02:00
..
cdx vfio/cdx: Remove redundant initialization owner in vfio_cdx_driver 2023-08-16 11:13:51 -06:00
fsl-mc vfio/fsl-mc: Block calling interrupt handler without trigger 2024-04-03 15:28:49 +02:00
mdev vfio/mdev: Fix a null-ptr-deref bug for mdev_unregister_parent() 2023-09-22 12:48:04 -06:00
pci vfio/pds: Make sure migration file isn't accessed after reset 2024-04-03 15:28:59 +02:00
platform vfio/platform: Create persistent IRQ handlers 2024-04-03 15:28:49 +02:00
Kconfig vfio: Compile vfio_group infrastructure optionally 2023-07-25 10:20:50 -06:00
Makefile vfio: Compile vfio_group infrastructure optionally 2023-07-25 10:20:50 -06:00
container.c VFIO updates for v6.3-rc1 2023-02-25 11:52:57 -08:00
device_cdev.c vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT 2023-07-25 10:20:37 -06:00
group.c vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev() 2023-07-25 10:20:41 -06:00
iommufd.c vfio: Support IO page table replacement 2023-07-28 13:31:24 -03:00
iova_bitmap.c iommufd/iova_bitmap: Consider page offset for the pages to be pinned 2024-03-01 13:35:05 +01:00
vfio.h vfio: Compile vfio_group infrastructure optionally 2023-07-25 10:20:50 -06:00
vfio_iommu_spapr_tce.c powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains 2023-03-15 00:51:46 +11:00
vfio_iommu_type1.c vfio: align capability structures 2023-08-17 12:17:44 -06:00
vfio_main.c iommufd for 6.6 2023-08-30 20:41:37 -07:00
virqfd.c vfio: Introduce interface to flush virqfd inject workqueue 2024-04-03 15:28:49 +02:00