OpenCloudOS-Kernel

History

Sagi Grimberg 2acf70ade7 nvmet-rdma: use a private workqueue for delete	2018-10-05 09:25:18 +02:00

Queue deletion is done asynchronous when the last reference on the queue
is dropped. Thus, in order to make sure we don't over allocate under a
connect/disconnect storm, we let queue deletion complete before making
forward progress.

However, given that we flush the system_wq from rdma_cm context which
runs from a workqueue context, we can have a circular locking complaint
[1]. Fix that by using a private workqueue for queue deletion.

[1]:
======================================================
WARNING: possible circular locking dependency detected
4.19.0-rc4-dbg+ #3 Not tainted
------------------------------------------------------
kworker/5:0/39 is trying to acquire lock:
00000000a10b6db9 (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x6f/0x440 [rdma_cm]

but task is already holding lock:
00000000331b4e2c ((work_completion)(&queue->release_work)){+.+.}, at: process_one_work+0x3ed/0xa20

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&queue->release_work)){+.+.}:
       process_one_work+0x474/0xa20
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #2 ((wq_completion)"events"){+.+.}:
       flush_workqueue+0xf3/0x970
       nvmet_rdma_cm_handler+0x133d/0x1734 [nvmet_rdma]
       cma_ib_req_handler+0x72f/0xf90 [rdma_cm]
       cm_process_work+0x2e/0x110 [ib_cm]
       cm_req_handler+0x135b/0x1c30 [ib_cm]
       cm_work_handler+0x2b7/0x38cd [ib_cm]
       process_one_work+0x4ae/0xa20
nvmet_rdma:nvmet_rdma_cm_handler: nvmet_rdma: disconnected (10): status 0 id 0000000040357082
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30
nvme nvme0: Reconnecting in 10 seconds...

-> #1 (&id_priv->handler_mutex/1){+.+.}:
       __mutex_lock+0xfe/0xbe0
       mutex_lock_nested+0x1b/0x20
       cma_ib_req_handler+0x6aa/0xf90 [rdma_cm]
       cm_process_work+0x2e/0x110 [ib_cm]
       cm_req_handler+0x135b/0x1c30 [ib_cm]
       cm_work_handler+0x2b7/0x38cd [ib_cm]
       process_one_work+0x4ae/0xa20
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #0 (&id_priv->handler_mutex){+.+.}:
       lock_acquire+0xc5/0x200
       __mutex_lock+0xfe/0xbe0
       mutex_lock_nested+0x1b/0x20
       rdma_destroy_id+0x6f/0x440 [rdma_cm]
       nvmet_rdma_release_queue_work+0x8e/0x1b0 [nvmet_rdma]
       process_one_work+0x4ae/0xa20
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

Fixes: `777dc82395` ("nvmet-rdma: occasionally flush ongoing controller teardown")
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Kconfig	nvmet-rdma: depend on INFINIBAND_ADDR_TRANS	2018-04-27 11:15:43 -04:00
Makefile	nvmet: add simple file backed ns support	2018-05-25 16:50:12 +02:00
admin-cmd.c	nvmet: remove redundant module prefix	2018-10-01 14:16:09 -07:00
configfs.c	for-4.19/block-20180812	2018-08-14 10:23:25 -07:00
core.c	nvmet: free workqueue object if module init fails	2018-08-28 08:40:44 +02:00
discovery.c	nvmet-rdma: support max(16KB, PAGE_SIZE) inline data	2018-07-23 09:35:16 +02:00
fabrics-cmd.c	nvmet: remove duplicate NULL initialization for req->ns	2018-05-25 16:50:12 +02:00
fc.c	nvmet_fc: support target port removal with nvmet layer	2018-10-01 14:16:10 -07:00
fcloop.c	nvme-fcloop: Fix dropped LS's to removed target port	2018-08-28 08:40:43 +02:00
io-cmd-bdev.c	nvmet: don't split large I/Os unconditionally	2018-10-01 14:16:13 -07:00
io-cmd-file.c	nvmet: add ns write protect support	2018-08-08 12:00:53 +02:00
loop.c	for-4.19/block-20180812	2018-08-14 10:23:25 -07:00
nvmet.h	nvmet: don't split large I/Os unconditionally	2018-10-01 14:16:13 -07:00
rdma.c	nvmet-rdma: use a private workqueue for delete	2018-10-05 09:25:18 +02:00