OpenCloudOS-Kernel/drivers/block/rnbd
Gioh Kim b168e1d85c block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel
We got a warning message below.
When server tries to close one session by force, it locks the sysfs
interface and locks the srv_sess lock.
The problem is that client can send a request to close at the same time.
By close request, server locks the srv_sess lock and locks the sysfs
to remove the sysfs interfaces.

The simplest way to prevent that situation could be just use
mutex_trylock.

[  234.153965] ======================================================
[  234.154093] WARNING: possible circular locking dependency detected
[  234.154219] 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Tainted: G           O
[  234.154381] ------------------------------------------------------
[  234.154531] kworker/1:1H/618 is trying to acquire lock:
[  234.154651] ffff8887a09db0a8 (kn->count#132){++++}, at: kernfs_remove_by_name_ns+0x40/0x80
[  234.154819]
               but task is already holding lock:
[  234.154965] ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server]
[  234.155132]
               which lock already depends on the new lock.

[  234.155311]
               the existing dependency chain (in reverse order) is:
[  234.155462]
               -> #1 (&srv_sess->lock){+.+.}:
[  234.155614]        __mutex_lock+0x134/0xcb0
[  234.155761]        rnbd_srv_sess_dev_force_close+0x36/0x50 [rnbd_server]
[  234.155889]        rnbd_srv_dev_session_force_close_store+0x69/0xc0 [rnbd_server]
[  234.156042]        kernfs_fop_write+0x13f/0x240
[  234.156162]        vfs_write+0xf3/0x280
[  234.156278]        ksys_write+0xba/0x150
[  234.156395]        do_syscall_64+0x62/0x270
[  234.156513]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  234.156632]
               -> #0 (kn->count#132){++++}:
[  234.156782]        __lock_acquire+0x129e/0x23a0
[  234.156900]        lock_acquire+0xf3/0x210
[  234.157043]        __kernfs_remove+0x42b/0x4c0
[  234.157161]        kernfs_remove_by_name_ns+0x40/0x80
[  234.157282]        remove_files+0x3f/0xa0
[  234.157399]        sysfs_remove_group+0x4a/0xb0
[  234.157519]        rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server]
[  234.157648]        rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server]
[  234.157775]        process_io_req+0x29a/0x6a0 [rtrs_server]
[  234.157924]        __ib_process_cq+0x8c/0x100 [ib_core]
[  234.158709]        ib_cq_poll_work+0x31/0xb0 [ib_core]
[  234.158834]        process_one_work+0x4e5/0xaa0
[  234.158958]        worker_thread+0x65/0x5c0
[  234.159078]        kthread+0x1e0/0x200
[  234.159194]        ret_from_fork+0x24/0x30
[  234.159309]
               other info that might help us debug this:

[  234.159513]  Possible unsafe locking scenario:

[  234.159658]        CPU0                    CPU1
[  234.159775]        ----                    ----
[  234.159891]   lock(&srv_sess->lock);
[  234.160005]                                lock(kn->count#132);
[  234.160128]                                lock(&srv_sess->lock);
[  234.160250]   lock(kn->count#132);
[  234.160364]
                *** DEADLOCK ***

[  234.160536] 3 locks held by kworker/1:1H/618:
[  234.160677]  #0: ffff8883ca1ed528 ((wq_completion)ib-comp-wq){+.+.}, at: process_one_work+0x40a/0xaa0
[  234.160840]  #1: ffff8883d2d5fe10 ((work_completion)(&cq->work)){+.+.}, at: process_one_work+0x40a/0xaa0
[  234.161003]  #2: ffff8887ae5f6518 (&srv_sess->lock){+.+.}, at: rnbd_srv_rdma_ev+0x144/0x1590 [rnbd_server]
[  234.161168]
               stack backtrace:
[  234.161312] CPU: 1 PID: 618 Comm: kworker/1:1H Tainted: G           O      5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
[  234.161490] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
[  234.161643] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[  234.161765] Call Trace:
[  234.161910]  dump_stack+0x96/0xe0
[  234.162028]  check_noncircular+0x29e/0x2e0
[  234.162148]  ? print_circular_bug+0x100/0x100
[  234.162267]  ? register_lock_class+0x1ad/0x8a0
[  234.162385]  ? __lock_acquire+0x68e/0x23a0
[  234.162505]  ? trace_event_raw_event_lock+0x190/0x190
[  234.162626]  __lock_acquire+0x129e/0x23a0
[  234.162746]  ? register_lock_class+0x8a0/0x8a0
[  234.162866]  lock_acquire+0xf3/0x210
[  234.162982]  ? kernfs_remove_by_name_ns+0x40/0x80
[  234.163127]  __kernfs_remove+0x42b/0x4c0
[  234.163243]  ? kernfs_remove_by_name_ns+0x40/0x80
[  234.163363]  ? kernfs_fop_readdir+0x3b0/0x3b0
[  234.163482]  ? strlen+0x1f/0x40
[  234.163596]  ? strcmp+0x30/0x50
[  234.163712]  kernfs_remove_by_name_ns+0x40/0x80
[  234.163832]  remove_files+0x3f/0xa0
[  234.163948]  sysfs_remove_group+0x4a/0xb0
[  234.164068]  rnbd_srv_destroy_dev_session_sysfs+0x19/0x30 [rnbd_server]
[  234.164196]  rnbd_srv_rdma_ev+0x14c/0x1590 [rnbd_server]
[  234.164345]  ? _raw_spin_unlock_irqrestore+0x43/0x50
[  234.164466]  ? lockdep_hardirqs_on+0x1a8/0x290
[  234.164597]  ? mlx4_ib_poll_cq+0x927/0x1280 [mlx4_ib]
[  234.164732]  ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server]
[  234.164859]  process_io_req+0x29a/0x6a0 [rtrs_server]
[  234.164982]  ? rnbd_get_sess_dev+0x270/0x270 [rnbd_server]
[  234.165130]  __ib_process_cq+0x8c/0x100 [ib_core]
[  234.165279]  ib_cq_poll_work+0x31/0xb0 [ib_core]
[  234.165404]  process_one_work+0x4e5/0xaa0
[  234.165550]  ? pwq_dec_nr_in_flight+0x160/0x160
[  234.165675]  ? do_raw_spin_lock+0x119/0x1d0
[  234.165796]  worker_thread+0x65/0x5c0
[  234.165914]  ? process_one_work+0xaa0/0xaa0
[  234.166031]  kthread+0x1e0/0x200
[  234.166147]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[  234.166268]  ret_from_fork+0x24/0x30
[  234.251591] rnbd_server L243: </dev/loop1@close_device_session>: Device closed
[  234.604221] rnbd_server L264: RTRS Session close_device_session disconnected

Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-10-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20 08:59:04 -06:00
..
Kconfig block/rnbd: Select SG_POOL for RNBD_CLIENT 2021-01-08 08:19:18 -07:00
Makefile block/rnbd: include client and server modules into kernel compilation 2020-05-17 18:57:17 -03:00
README block/rnbd: Adding name to the Contributors List 2021-01-08 08:19:18 -07:00
rnbd-clt-sysfs.c block/rnbd: Kill rnbd_clt_destroy_default_group 2021-04-20 08:59:04 -06:00
rnbd-clt.c block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT} 2021-04-20 08:59:04 -06:00
rnbd-clt.h block/rnbd: Kill rnbd_clt_destroy_default_group 2021-04-20 08:59:04 -06:00
rnbd-common.c
rnbd-log.h
rnbd-proto.h block/rnbd: Set write-back cache and fua same to the target device 2020-12-16 14:55:59 -07:00
rnbd-srv-dev.c rnbd: no need to set bi_end_io in rnbd_bio_map_kern 2020-08-06 07:30:04 -06:00
rnbd-srv-dev.h rnbd: remove rnbd_dev_submit_io 2020-08-06 07:30:04 -06:00
rnbd-srv-sysfs.c block/rnbd: call kobject_put in the failure path 2020-12-04 09:41:10 -07:00
rnbd-srv.c block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel 2021-04-20 08:59:04 -06:00
rnbd-srv.h block/rnbd-srv: close a mapped device from server side. 2020-12-04 09:41:10 -07:00

README

********************************
RDMA Network Block Device (RNBD)
********************************

Introduction
------------

RNBD (RDMA Network Block Device) is a pair of kernel modules
(client and server) that allow for remote access of a block device on
the server over RTRS protocol using the RDMA (InfiniBand, RoCE, iWARP)
transport. After being mapped, the remote block devices can be accessed
on the client side as local block devices.

I/O is transferred between client and server by the RTRS transport
modules. The administration of RNBD and RTRS modules is done via
sysfs entries.

Requirements
------------

  RTRS kernel modules

Quick Start
-----------

Server side:
  # modprobe rnbd_server

Client side:
  # modprobe rnbd_client
  # echo "sessname=blya path=ip:10.50.100.66 device_path=/dev/ram0" > \
            /sys/devices/virtual/rnbd-client/ctl/map_device

  Where "sessname=" is a session name, a string to identify the session
  on client and on server sides; "path=" is a destination IP address or
  a pair of a source and a destination IPs, separated by comma.  Multiple
  "path=" options can be specified in order to use multipath  (see RTRS
  description for details); "device_path=" is the block device to be
  mapped from the server side. After the session to the server machine is
  established, the mapped device will appear on the client side under
  /dev/rnbd<N>.


RNBD-Server Module Parameters
=============================

dev_search_path
---------------

When a device is mapped from the client, the server generates the path
to the block device on the server side by concatenating dev_search_path
and the "device_path" that was specified in the map_device operation.

The default dev_search_path is: "/".

dev_search_path option can also contain %SESSNAME% in order to provide
different device namespaces for different sessions.  See "device_path"
option for details.

============================
Protocol (rnbd/rnbd-proto.h)
============================

1. Before mapping first device from a given server, client sends an
RNBD_MSG_SESS_INFO to the server. Server responds with
RNBD_MSG_SESS_INFO_RSP. Currently the messages only contain the protocol
version for backward compatibility.

2. Client requests to open a device by sending RNBD_MSG_OPEN message. This
contains the path to the device and access mode (read-only or writable).
Server responds to the message with RNBD_MSG_OPEN_RSP. This contains
a 32 bit device id to be used for  IOs and device "geometry" related
information: side, max_hw_sectors, etc.

3. Client attaches RNBD_MSG_IO to each IO message send to a device. This
message contains device id, provided by server in his rnbd_msg_open_rsp,
sector to be accessed, read-write flags and bi_size.

4. Client closes a device by sending RNBD_MSG_CLOSE which contains only the
device id provided by the server.

=========================================
Contributors List(in alphabetical order)
=========================================
Danil Kipnis <danil.kipnis@profitbricks.com>
Fabian Holler <mail@fholler.de>
Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Jack Wang <jinpu.wang@profitbricks.com>
Kleber Souza <kleber.souza@profitbricks.com>
Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Milind Dumbare <Milind.dumbare@gmail.com>
Roman Penyaev <roman.penyaev@profitbricks.com>
Swapnil Ingle <ingleswapnil@gmail.com>