OpenCloudOS-Kernel/drivers/vfio
Brett Creeley 72654bc1d4 vfio/pds: Fix possible sleep while in atomic context
[ Upstream commit ae2667cd8a479bb5abd6e24c12fcc9ef5bc06d75 ]

The driver could sleep while in atomic context, resulting in the
following call trace when CONFIG_DEBUG_ATOMIC_SLEEP=y is set:

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2817, name: bash
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
Call Trace:
 <TASK>
 dump_stack_lvl+0x36/0x50
 __might_resched+0x123/0x170
 mutex_lock+0x1e/0x50
 pds_vfio_put_lm_file+0x1e/0xa0 [pds_vfio_pci]
 pds_vfio_put_save_file+0x19/0x30 [pds_vfio_pci]
 pds_vfio_state_mutex_unlock+0x2e/0x80 [pds_vfio_pci]
 pci_reset_function+0x4b/0x70
 reset_store+0x5b/0xa0
 kernfs_fop_write_iter+0x137/0x1d0
 vfs_write+0x2de/0x410
 ksys_write+0x5d/0xd0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

This can happen if pds_vfio_put_restore_file() and/or
pds_vfio_put_save_file() take mutex_lock(&lm_file->lock)
while spin_lock(&pds_vfio->reset_lock) is held, which can
occur while calling pds_vfio_state_mutex_unlock().

Fix this by changing the reset_lock to reset_mutex so there are no such
concerns. Also, make sure to destroy the reset_mutex in the driver-specific
VFIO device release function.
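
The locking problem described above can be illustrated with a small
userspace model. This is only a sketch, not the driver's code: every
model_* name, the atomic_depth counter (a stand-in for preempt_count) and
the violations counter are hypothetical, and the "mutex" here never
blocks; it only records when a sleeping lock is taken in atomic context.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only; none of these names exist in the pds_vfio driver. */
static int atomic_depth; /* stands in for preempt_count */
static int violations;   /* number of "sleep in atomic context" reports */

/* Holding a spinlock puts the caller in atomic context. */
static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* A sleeping lock: taking it in atomic context is what the kernel flags. */
static void model_mutex_lock(const char *name)
{
	if (atomic_depth > 0) {
		violations++;
		fprintf(stderr, "BUG: sleeping lock %s taken, atomic_depth=%d\n",
			name, atomic_depth);
	}
}

static void model_mutex_unlock(void)
{
	/* Nothing to undo: the model mutex does not track ownership. */
}

/* Stand-in for a put_*_file() helper that takes the per-file mutex. */
static void model_put_file(void)
{
	model_mutex_lock("lm_file->lock");
	model_mutex_unlock();
}

/* Before the fix: the outer lock is a spinlock, so the inner mutex is a bug. */
int model_unlock_path_spinlock(void)
{
	atomic_depth = 0;
	violations = 0;
	model_spin_lock();
	model_put_file();
	model_spin_unlock();
	return violations;
}

/* After the fix: the outer lock is a mutex, so nesting a mutex is fine. */
int model_unlock_path_mutex(void)
{
	atomic_depth = 0;
	violations = 0;
	model_mutex_lock("reset_mutex");
	model_put_file();
	model_mutex_unlock();
	return violations;
}
```

In this model, model_unlock_path_spinlock() reports one violation and
model_unlock_path_mutex() reports none, which mirrors why replacing the
outer spinlock with a mutex removes the warning.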

This also fixes a spinlock bad magic BUG that was caused
by not calling spin_lock_init() on the reset_lock. Since the lock is
being changed to a mutex, make sure to call mutex_init() on it.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/kvm/1f9bc27b-3de9-4891-9687-ba2820c1b390@moroto.mountain/
Fixes: bb500dbe2a ("vfio/pds: Add VFIO live migration support")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20231122192532.25791-3-brett.creeley@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08 08:52:25 +01:00
cdx vfio/cdx: Remove redundant initialization owner in vfio_cdx_driver 2023-08-16 11:13:51 -06:00
fsl-mc vfio/fsl-mc: Use module_fsl_mc_driver macro to simplify the code 2023-08-16 11:14:15 -06:00
mdev vfio/mdev: Fix a null-ptr-deref bug for mdev_unregister_parent() 2023-09-22 12:48:04 -06:00
pci vfio/pds: Fix possible sleep while in atomic context 2023-12-08 08:52:25 +01:00
platform vfio-iommufd: Add detach_ioas support for physical VFIO devices 2023-07-25 10:19:12 -06:00
Kconfig vfio: Compile vfio_group infrastructure optionally 2023-07-25 10:20:50 -06:00
Makefile vfio: Compile vfio_group infrastructure optionally 2023-07-25 10:20:50 -06:00
container.c VFIO updates for v6.3-rc1 2023-02-25 11:52:57 -08:00
device_cdev.c vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT 2023-07-25 10:20:37 -06:00
group.c vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev() 2023-07-25 10:20:41 -06:00
iommufd.c vfio: Support IO page table replacement 2023-07-28 13:31:24 -03:00
iova_bitmap.c vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries 2022-12-02 10:09:25 -07:00
vfio.h vfio: Compile vfio_group infrastructure optionally 2023-07-25 10:20:50 -06:00
vfio_iommu_spapr_tce.c powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains 2023-03-15 00:51:46 +11:00
vfio_iommu_type1.c vfio: align capability structures 2023-08-17 12:17:44 -06:00
vfio_main.c iommufd for 6.6 2023-08-30 20:41:37 -07:00
virqfd.c vfio: Use GFP_KERNEL_ACCOUNT for userspace persistent allocations 2023-01-23 11:26:29 -07:00