Remove a level of indentation from the main code implementating the table
search by using a goto for the APST not supported case. Also move the
main comment above the function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Do not call nvme_configure_apst when the controller is not live, given
that nvme_configure_apst will fail due the lack of an admin queue when
the controller is being torn down and nvme_set_latency_tolerance is
called from dev_pm_qos_hide_latency_tolerance.
Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance")
Reported-by: Peng Liu <liupeng17@lenovo.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Add a 'kato' controller sysfs attribute to display the current
keep-alive timeout value (if any). This allows userspace to identify
persistent discovery controllers, as these will have a non-zero
KATO value.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
According to the NVMe base spec the KATO commands should be sent
at half of the KATO interval, to properly account for round-trip
times.
As we now will only ever send one KATO command per connection we
can easily use the recommended values.
This also fixes a potential issue where the request timeout for
the KATO command does not match the value in the connect command,
which might be causing spurious connection drops from the target.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Issue following command:
nvme set-feature -f 0xf -v 0 /dev/nvme1n1 # disable keep-alive timer
nvme admin-passthru -o 0x18 /dev/nvme1n1 # send keep-alive command
will make keep-alive timer fired and thus delete the controller like
below:
[247459.907635] nvmet: ctrl 1 keep-alive timer (0 seconds) expired!
[247459.930294] nvmet: ctrl 1 fatal error occurred!
Avoid this by not queuing delayed keep-alive if it is disabled when
keep-alive command is received from the admin queue.
Signed-off-by: Hou Pu <houpu.main@gmail.com>
Tested-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Smatch complains that the "type > NUM_DISK_MINORS" should be >=
instead of >. We also need to subtract one from "type" at the start.
Fixes: bf9c0538e4 ("ataflop: use a separate gendisk for each media format")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The function uses "type" as an array index:
q = unit[drive].disk[type]->queue;
Unfortunately the bounds check on "type" isn't done until later in the
function. Fix this by moving the bounds check to the start.
Fixes: bf9c0538e4 ("ataflop: use a separate gendisk for each media format")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding a break statement instead of just
letting the code fall through to the next, and by adding a fallthrough
pseudo-keyword in places whre the code is intended to fall through.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
cppcheck report the following error:
rnbd/rnbd-clt-sysfs.c:522:36: error: The variable 'buf' is used both
as a parameter and as destination in snprintf(). The origin and
destination buffers overlap. Quote from glibc (C-library)
documentation
(http://www.gnu.org/software/libc/manual/html_mono/libc.html#Formatted-Output-Functions):
"If copying takes place between objects that overlap as a result of a
call to sprintf() or snprintf(), the results are undefined."
[sprintfOverlappingData]
Fix it by initializing the buf variable in the first snprintf call.
Fixes: 91f4acb280 ("block/rnbd-clt: support mapping two devices")
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-19-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We always map with SZ_4K, so do not need max_segment_size.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-18-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When an RTRS session state changes, the transport layer generates an event
to RNBD. Then RNBD will change the state of the RNBD client device
accordingly.
This commit add kobject_uevent when the RNBD device state changes. With
this udev rules can be configured to react accordingly.
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-17-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
struct rtrs_srv is not used when handling rnbd_srv_rdma_ev messages, so
cleaned up
rdma_ev function pointer in rtrs_srv_ops also is changed.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-16-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
RNBD can make double-queues for irq-mode and poll-mode.
For example, on 4-CPU system 8 request-queues are created,
4 for irq-mode and 4 for poll-mode.
If the IO has HIPRI flag, the block-layer will call .poll function
of RNBD. Then IO is sent to the poll-mode queue.
Add optional nr_poll_queues argument for map_devices interface.
To support polling of RNBD, RTRS client creates connections
for both of irq-mode and direct-poll-mode.
For example, on 4-CPU system it could've create 5 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
After this patch, it can create 9 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
con[5:8] => DIRECT-POLL cq
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-14-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When unloading the rnbd-clt module, it does not free a memory
including the filename of the symbolic link to /sys/block/rnbdX.
It is found by kmemleak as below.
unreferenced object 0xffff9f1a83d3c740 (size 16):
comm "bash", pid 736, jiffies 4295179665 (age 9841.310s)
hex dump (first 16 bytes):
21 64 65 76 21 6e 75 6c 6c 62 30 40 62 6c 61 00 !dev!nullb0@bla.
backtrace:
[<0000000039f0c55e>] 0xffffffffc0456c24
[<000000001aab9513>] kernfs_fop_write+0xcf/0x1c0
[<00000000db5aa4b3>] vfs_write+0xdb/0x1d0
[<000000007a2e2207>] ksys_write+0x65/0xe0
[<00000000055e280a>] do_syscall_64+0x50/0x1b0
[<00000000c2b51831>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-13-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
clang static analysis reports this problem
rnbd-clt.c:1212:11: warning: Branch condition evaluates to a
garbage value
else if (!first)
^~~~~~
This is triggered in the find_and_get_or_create_sess() call
because the variable first is not initialized and the
earlier check is specifically for
if (sess == ERR_PTR(-ENOMEM))
This is false positive.
But the if-check can be reduced by initializing first to
false and then returning if the call to find_or_creat_sess()
does not set it to true. When it remains false, either
sess will be valid or not. The not case is caught by
find_and_get_or_create_sess()'s caller rnbd_clt_map_device()
sess = find_and_get_or_create_sess(...);
if (IS_ERR(sess))
return ERR_CAST(sess);
Since find_and_get_or_create_sess() initializes first to false
setting it in find_or_create_sess() is not needed.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-12-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We changed the rnbd_srv_sess_dev_force_close to use try-lock
because rnbd_srv_sess_dev_force_close and process_msg_close
can generate a deadlock.
Now rnbd_srv_sess_dev_force_close would do nothing
if it fails to get the lock. So removing the force_close
file should be moved to after the lock. Or the force_close
file is removed but the others are not removed.
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-11-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
They are defined with the same value and similar meaning, let's remove
one of them, then we can remove {WAIT,NOWAIT}.
Also change the type of 'wait' from 'int' to 'enum wait_type' to make
it clear.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-9-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can use destroy_device directly since destroy_device_cb is just the
wrapper of destroy_device.
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-8-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No need to have it since we can call sysfs_remove_group in the
rnbd_clt_destroy_sysfs_files.
Then rnbd_clt_destroy_sysfs_files is paired with it's counterpart
rnbd_clt_create_sysfs_files.
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-7-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It makes more sense to add gendisk in rnbd_clt_setup_gen_disk, instead
of do it in rnbd_clt_map_device.
Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-6-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove them since both sess and idx can be dereferenced from dev. And
sess is not used in the function.
Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-5-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove 'pathname' and 'sess' since we can dereference it from 'dev'.
Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-4-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Two sysfs entries, remap_device and resize, are missing.
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-3-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Danil steps down, Haris will take over.
Also update email address to ionos.com, the old
cloud.ionos.com will still work for some time.
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Acked-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-2-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The variable st is being assigned a value that is never read and
it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Denis Efremov <efremov@linux.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20210415130020.1959951-1-colin.king@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
FLOPPY_SILENT_DCL_CLEAR is not defined anywhere and comes from pre-git
era. Just drop this undef. There is FD_SILENT_DCL_CLEAR which is really
used.
Signed-off-by: Denis Efremov <efremov@linux.com>
Link: https://lore.kernel.org/r/20210416083449.72700-6-efremov@linux.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use ST0 as 0 index for reply_buffer array. get_fdc_version() is the only
function that uses index 0 directly instead of the ST0 define.
Signed-off-by: Denis Efremov <efremov@linux.com>
Link: https://lore.kernel.org/r/20210416083449.72700-3-efremov@linux.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull MD updates from Song:
"1. mddev_find_or_alloc() clean up, from Christoph.
2. Fix NULL pointer deref with external bitmap, from Sudhakar."
* 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md/bitmap: wait for external bitmap writes to complete during tear down
md: do not return existing mddevs from mddev_find_or_alloc
md: refactor mddev_find_or_alloc
md: factor out a mddev_alloc_unit helper from mddev_find
- refactor the ioctl coee
- fix a segmentation fault during io parsing error in nvmet-tcp
(Elad Grupi)
- fix NULL derefence in nvme_ctrl_fast_io_fail_tmo_show/store
(Gopal Tiwari)
- properly respect the sgl_threshold flag in nvme-pci (Niklas Cassel)
- misc cleanups (Niklas Cassel, Amit Engel, Minwoo Im, Colin Ian King)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmB4VTwLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPzyw/+KNku6zEXU+Vqvv/Ud9k8uabkgEaiF37LgcYHPPyQ
xfv9I5M7B7P7yTh4AJGpK0G64sXzW3JWXWcQ5FhsL2del3urjUk9BnUB6Vi6vQw6
HFiSrbLRjx1Njk5mEZGJZNgkxpEB2YZNudzQ02MhZzvModhBhJ4I2IeS2Xa6IjpK
XfaNNJxmL8xiBDkD68R/ancHWkHmYVT5Y5HtEvTU0anqjrWZy2iWlkFH4tkYjZ0l
XbNqLJNPymx83MVINrhmoaSaI3dTxoQsNzpigEageiZ026z0zBElg51YXz2DXErQ
7ETHqt/c3AHlVeEjVtHTcNMoSWHpBnc7MEGAiQ7K6BHF25dn9ifvCxsQgFCKvcv9
EakqQCp20qtOCRtRb3FEV3qImAAv2gvTJIKvG1+42K1S+Hd1pgxesLkqozLFqegG
f70hQwBGMyoTsebDrW+iqsN124u9fcOXN9+71V7AAzOXnoiZn8qcuToOpaBPO6cp
UXZYz0yRyuIVY7gWsDIgnVx8iSYXs75kcqaTfV0ilpYycR8ILy8oRIGXuHpVNuxt
JYW/0pdy5DTR9k9TbpsPBKop9W3pMItE1tXsop6Ybg2TzHa+nbTRnj2ympCkARPS
KnYcU6OAFNvhCTE5+qsefe+hLouVbJsYF/DO44vtgfBlNK69Lep+tQDF8eAwqiOw
kHc=
=30Ha
-----END PGP SIGNATURE-----
Merge tag 'nvme-5.13-2021-04-15' of git://git.infradead.org/nvme into for-5.13/drivers
Pull NVMe updates from Christoph:
"nvme updates for Linux 5.13
- refactor the ioctl code
- fix a segmentation fault during io parsing error in nvmet-tcp
(Elad Grupi)
- fix NULL derefence in nvme_ctrl_fast_io_fail_tmo_show/store
(Gopal Tiwari)
- properly respect the sgl_threshold flag in nvme-pci (Niklas Cassel)
- misc cleanups (Niklas Cassel, Amit Engel, Minwoo Im, Colin Ian King)"
* tag 'nvme-5.13-2021-04-15' of git://git.infradead.org/nvme:
nvme: fix NULL derefence in nvme_ctrl_fast_io_fail_tmo_show/store
nvme: let namespace probing continue for unsupported features
nvme: factor out nvme_ns_open and nvme_ns_release helpers
nvme: move nvme_ns_head_ops to multipath.c
nvme: factor out a nvme_tryget_ns_head helper
nvme: move the ioctl code to a separate file
nvme: don't bother to look up a namespace for controller ioctls
nvme: simplify block device ioctl handling for the !multipath case
nvme: simplify the compat ioctl handling
nvme: factor out a nvme_ns_ioctl helper
nvme: pass a user pointer to nvme_nvm_ioctl
nvme: cleanup setting the disk name
nvme: add a nvme_ns_head_multipath helper
nvme: remove single trailing whitespace
nvme-multipath: remove single trailing whitespace
nvme-pci: remove single trailing whitespace
nvme-pci: don't simple map sgl when sgls are disabled
nvmet: fix a spelling mistake "nubmer" -> "number"
nvmet-fc: simplify nvmet_fc_alloc_hostport
nvmet-tcp: fix a segmentation fault during io parsing error
NULL pointer dereference was observed in super_written() when it tries
to access the mddev structure.
[The below stack trace is from an older kernel, but the problem described
in this patch applies to the mainline kernel.]
[ 1194.474861] task: ffff8fdd20858000 task.stack: ffffb99d40790000
[ 1194.488000] RIP: 0010:super_written+0x29/0xe1
[ 1194.499688] RSP: 0018:ffff8ffb7fcc3c78 EFLAGS: 00010046
[ 1194.512477] RAX: 0000000000000000 RBX: ffff8ffb7bf4a000 RCX: ffff8ffb78991048
[ 1194.527325] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ffb56b8a200
[ 1194.542576] RBP: ffff8ffb7fcc3c90 R08: 000000000000000b R09: 0000000000000000
[ 1194.558001] R10: ffff8ffb56b8a298 R11: 0000000000000000 R12: ffff8ffb56b8a200
[ 1194.573070] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1194.588117] FS: 0000000000000000(0000) GS:ffff8ffb7fcc0000(0000) knlGS:0000000000000000
[ 1194.604264] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1194.617375] CR2: 00000000000002b8 CR3: 00000021e040a002 CR4: 00000000007606e0
[ 1194.632327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1194.647865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1194.663316] PKRU: 55555554
[ 1194.674090] Call Trace:
[ 1194.683735] <IRQ>
[ 1194.692948] bio_endio+0xae/0x135
[ 1194.703580] blk_update_request+0xad/0x2fa
[ 1194.714990] blk_update_bidi_request+0x20/0x72
[ 1194.726578] __blk_end_bidi_request+0x2c/0x4d
[ 1194.738373] __blk_end_request_all+0x31/0x49
[ 1194.749344] blk_flush_complete_seq+0x377/0x383
[ 1194.761550] flush_end_io+0x1dd/0x2a7
[ 1194.772910] blk_finish_request+0x9f/0x13c
[ 1194.784544] scsi_end_request+0x180/0x25c
[ 1194.796149] scsi_io_completion+0xc8/0x610
[ 1194.807503] scsi_finish_command+0xdc/0x125
[ 1194.818897] scsi_softirq_done+0x81/0xde
[ 1194.830062] blk_done_softirq+0xa4/0xcc
[ 1194.841008] __do_softirq+0xd9/0x29f
[ 1194.851257] irq_exit+0xe6/0xeb
[ 1194.861290] do_IRQ+0x59/0xe3
[ 1194.871060] common_interrupt+0x1c6/0x382
[ 1194.881988] </IRQ>
[ 1194.890646] RIP: 0010:cpuidle_enter_state+0xdd/0x2a5
[ 1194.902532] RSP: 0018:ffffb99d40793e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff43
[ 1194.917317] RAX: ffff8ffb7fce27c0 RBX: ffff8ffb7fced800 RCX: 000000000000001f
[ 1194.932056] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000
[ 1194.946428] RBP: ffffb99d40793ea0 R08: 0000000000000004 R09: 0000000000002ed2
[ 1194.960508] R10: 0000000000002664 R11: 0000000000000018 R12: 0000000000000003
[ 1194.974454] R13: 000000000000000b R14: ffffffff925715a0 R15: 0000011610120d5a
[ 1194.988607] ? cpuidle_enter_state+0xcc/0x2a5
[ 1194.999077] cpuidle_enter+0x17/0x19
[ 1195.008395] call_cpuidle+0x23/0x3a
[ 1195.017718] do_idle+0x172/0x1d5
[ 1195.026358] cpu_startup_entry+0x73/0x75
[ 1195.035769] start_secondary+0x1b9/0x20b
[ 1195.044894] secondary_startup_64+0xa5/0xa5
[ 1195.084921] RIP: super_written+0x29/0xe1 RSP: ffff8ffb7fcc3c78
[ 1195.096354] CR2: 00000000000002b8
bio in the above stack is a bitmap write whose completion is invoked after
the tear down sequence sets the mddev structure to NULL in rdev.
During tear down, there is an attempt to flush the bitmap writes, but for
external bitmaps, there is no explicit wait for all the bitmap writes to
complete. For instance, md_bitmap_flush() is called to flush the bitmap
writes, but the last call to md_bitmap_daemon_work() in md_bitmap_flush()
could generate new bitmap writes for which there is no explicit wait to
complete those writes. The call to md_bitmap_update_sb() will return
simply for external bitmaps and the follow-up call to md_update_sb() is
conditional and may not get called for external bitmaps. This results in a
kernel panic when the completion routine, super_written() is called which
tries to reference mddev in the rdev that has been set to
NULL(in unbind_rdev_from_array() by tear down sequence).
The solution is to call md_super_wait() for external bitmaps after the
last call to md_bitmap_daemon_work() in md_bitmap_flush() to ensure there
are no pending bitmap writes before proceeding with the tear down.
Cc: stable@vger.kernel.org
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Reviewed-by: Zhao Heming <heming.zhao@suse.com>
Signed-off-by: Song Liu <song@kernel.org>
Instead of returning an existing mddev, just for it to be discarded
later directly return -EEXIST. Rename the function to mddev_alloc now
that it doesn't find an existing mddev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
Allocate the new mddev first speculatively, which greatly simplifies
the code flow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
Split out a self contained helper to find a free minor for the md
"unit" number.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
Adding entry for dev_attr_fast_io_fail_tmo to avoid the kernel crash
while reading and writing the fast_io_fail_tmo.
Fixes: 09fbed6363 (nvme: export fast_io_fail_tmo to sysfs)
Signed-off-by: Gopal Tiwari <gtiwari@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Instead of failing to scan the namespace entirely when unsupported
features are detected, just mark the gendisk hidden but allow other
access like the upcoming per-namespace character device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
These will be reused for the per-namespace character devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Move the multipath block_device_operations to multipath.c, where they
belong.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Split out the ioctl code from core.c into a new file. Also update
copyrights while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Don't bother to look up a namespace just to drop if after retreiving the
controller for the multipath case. Just look up a live controller for
the subsystem directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Only use the existing ioctl handler for the multipath case, and add a
simpler one that reverts to the pre-multipath case for not shared
use case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Don't bother defining a separate compat_ioctl handler, and just handle
the NVME_IOCTL_SUBMIT_IO32 case inline. Also only defined it for those
ABIs (currently just i386 vs x86_64) that are affected.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Factor out a helper for the namespace based ioctls.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Pass the proper user pointer instead of the not all that useful integer
representation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Javier González <javier.gonz@samsung.com>