OpenCloudOS-Kernel/drivers/dma/idxd
Dave Jiang 6b4b87f2c3 dmaengine: idxd: fix submission race window
Konstantin observed that when a descriptor is submitted, it is added to
the pending list only after the submission. This creates a race window:
the descriptor can complete before it is added to the pending list, in
which case the completion handler misses it.

To address the issue, the descriptor must be added to the pending list
before it is submitted to the hardware. However, submitting to a swq with
the ENQCMDS instruction can fail if the wq is either full or not "active".

With descriptor allocation acting as the gate to the wq capacity, an
ENQCMDS submission to the swq cannot hit a retry. The only possible failure
is when the wq is no longer "active" due to a hw error, in which case we
are moving towards taking down the portal. Given that this is a rare
condition and I/O performance is no longer a concern, the driver can walk
the completion lists in order to retrieve and abort the descriptor.

The error path sets the descriptor to aborted status and takes the work
list lock to prevent further processing of the work list. It then does a
delete_all on the pending llist to retrieve all descriptors on it; the
delete_all action does not require a lock. It walks through the acquired
llist to find the aborted descriptor, adding all remaining descriptors to
the work list, which is safe since it holds the lock. If it does not find
the aborted descriptor on the llist, it walks through the work list. If it
still does not find the descriptor, then the interrupt handler has removed
the desc from the llist but is waiting on the work list lock, and will
process it once the error path releases the lock.
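
The ordering and the error path described above can be sketched in
plain C. This is a simplified, single-threaded illustration and not the
driver's code: the type and function names (struct desc, struct
irq_entry, submit, abort_desc and the list helpers) are hypothetical,
ordinary singly linked lists stand in for the kernel llist and work list,
and a boolean stands in for the work list lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum desc_status { DESC_PENDING, DESC_ABORTED };

struct desc {
	int id;
	enum desc_status status;
	struct desc *next;
};

/* Hypothetical per-interrupt state: one pending list, one work list. */
struct irq_entry {
	struct desc *pending;	/* stand-in for the lock-free pending llist */
	struct desc *work;	/* stand-in for the lock-protected work list */
	bool work_locked;	/* stand-in for the work list lock */
};

static void pending_add(struct irq_entry *ie, struct desc *d)
{
	d->next = ie->pending;
	ie->pending = d;
}

/* Detach the whole pending list at once, like a delete_all. */
static struct desc *pending_del_all(struct irq_entry *ie)
{
	struct desc *head = ie->pending;

	ie->pending = NULL;
	return head;
}

static void work_add(struct irq_entry *ie, struct desc *d)
{
	d->next = ie->work;
	ie->work = d;
}

static bool work_remove(struct irq_entry *ie, struct desc *target)
{
	for (struct desc **pp = &ie->work; *pp; pp = &(*pp)->next) {
		if (*pp == target) {
			*pp = target->next;
			target->next = NULL;
			return true;
		}
	}
	return false;
}

/* Error path: returns true if this path retrieved the descriptor. */
static bool abort_desc(struct irq_entry *ie, struct desc *target)
{
	bool found = false;
	struct desc *d, *next;

	target->status = DESC_ABORTED;
	ie->work_locked = true;		/* take the work list lock */

	for (d = pending_del_all(ie); d; d = next) {
		next = d->next;
		if (d == target) {
			d->next = NULL;
			found = true;
		} else {
			work_add(ie, d);	/* safe: lock is held */
		}
	}
	if (!found)
		found = work_remove(ie, target);

	ie->work_locked = false;	/* release the lock */
	return found;
}

/* wq_active simulates whether the hardware accepts the descriptor. */
static int submit(struct irq_entry *ie, struct desc *d, bool wq_active)
{
	d->status = DESC_PENDING;
	pending_add(ie, d);		/* add before touching the hardware */
	if (!wq_active) {
		abort_desc(ie, d);
		return -1;
	}
	return 0;
}
```

When abort_desc() returns false in this sketch, it corresponds to the
last case above: the interrupt handler already holds the descriptor and
will see its aborted status once it obtains the lock.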

Fixes: eb15e7154f ("dmaengine: idxd: add interrupt handle request and release support")
Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162628855747.360485.10101925573082466530.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-07-20 22:55:08 +05:30
..
Makefile dmaengine: idxd: Add IDXD performance monitor support 2021-04-25 21:46:12 +05:30
cdev.c dmaengine updates for v5.14-rc1 2021-07-05 12:05:13 -07:00
device.c dmaengine: idxd: device cmd should use dedicated lock 2021-04-23 23:08:45 +05:30
dma.c dmaengine: idxd: fix dma device lifetime 2021-04-20 16:43:52 +05:30
idxd.h dmaengine: idxd: fix submission race window 2021-07-20 22:55:08 +05:30
init.c dmaengine: idxd: fix sequence for pci driver remove() and shutdown() 2021-07-20 22:54:20 +05:30
irq.c dmaengine: idxd: fix submission race window 2021-07-20 22:55:08 +05:30
perfmon.c dmaengine: idxd: Add IDXD performance monitor support 2021-04-25 21:46:12 +05:30
perfmon.h dmaengine: idxd: Add IDXD performance monitor support 2021-04-25 21:46:12 +05:30
registers.h dmaengine: idxd: Add IDXD performance monitor support 2021-04-25 21:46:12 +05:30
submit.c dmaengine: idxd: fix submission race window 2021-07-20 22:55:08 +05:30
sysfs.c dmaengine: idxd: fix sequence for pci driver remove() and shutdown() 2021-07-20 22:54:20 +05:30