-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmDEwCgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgphOPD/0Zv2qkOcuYm/0pRjdU0mXRBA5ImJ5JwJKA
XPe/sm+9JP5CtnP6czyWQFI2oMGwUBIukYfy0JEeZ4ydoAulzflgQionKRy56e13
SFxTVhvrPB9kAIGQ6/xPC20VAZNMDnNgaZMJJnrIoyBRm/5PKUE3Go3W9IjmuVJa
SzPsOB+TM21s03RCsi2MtJjwbfivJlfDBdvYlThMmxfKCgTpQOHNJow8FsL7P94u
RM0jQ2jCbjaHYW2+sHuSRDQF1CHjXoeq9ewrBvnjBLSLBNSrXrdN+slXNzxhAQ2a
nXp26JUs102duclHzE2xhXQhYwOYyPwKbBUmTppNv8DBLgqV/mU/KCLc6S1lDkw0
SKnCBzvrUKgCGIBe/j4eQLvlTO0ckvKdsWjdegj7naajK+VQiTgLw/UuJCAIPI+V
7FNJh0afr3hfVGZnfspIKuvaQ4omP+vCvKhQZEa14uu2pN2WvyTNVYTn8vbprXEl
Q36Aw7/yAzSXYAoUDP+QX2p+WIInpOsiInpa/CETvH43aP6uf18I7ZEjuDr2zJxP
RFpt/ApkWm8D7vq1p4QEG6FLaoGn8hsoChb6Zhax+Ek8XyoOWWs6oJHr19hfMPjc
s+sGDk1+FZqihpWEvNMHwELMbwegKnDSlbRFVu8od3DAPTBfQKHMlQtg2OPrGrA3
2q7/HsWsOw==
=LOdw
-----END PGP SIGNATURE-----
Merge tag 'block-5.13-2021-06-12' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few fixes that should go into 5.13:
- Fix a regression deadlock introduced in this release between open
and remove of a bdev (Christoph)
- Fix an async_xor md regression in this release (Xiao)
- Fix bcache oversized read issue (Coly)"
* tag 'block-5.13-2021-06-12' of git://git.kernel.dk/linux-block:
block: loop: fix deadlock between open and remove
async_xor: check src_offs is not NULL before updating it
bcache: avoid oversized read request in cache missing code path
bcache: remove bcache device self-defined readahead
Commit c76f48eb5c ("block: take bd_mutex around delete_partitions in
del_gendisk") adds disk->part0->bd_mutex in del_gendisk(), this way
causes the following AB/BA deadlock between removing loop and opening
loop:
1) loop_control_ioctl(LOOP_CTL_REMOVE)
-> mutex_lock(&loop_ctl_mutex)
-> del_gendisk
-> mutex_lock(&disk->part0->bd_mutex)
2) blkdev_get_by_dev
-> mutex_lock(&disk->part0->bd_mutex)
-> lo_open
-> mutex_lock(&loop_ctl_mutex)
Add a new Lo_deleting state to remove the need for clearing
->private_data and thus holding loop_ctl_mutex in the ioctl
LOOP_CTL_REMOVE path.
Based on an analysis and earlier patch from
Ming Lei <ming.lei@redhat.com>.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: c76f48eb5c ("block: take bd_mutex around delete_partitions in del_gendisk")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210605140950.5800-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmCexAAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpo3SEADXxkU29LW0GPLvtQqaHSWiHZRHL1BHeqcI
tDRMx4Sch3wGJg6dKV9KoL1xRRTeKNpPFzVvQ0BLDrE+Brqu6S44AnWrYuRFvhAa
uOXsvyUCYkcB2y1ylGxPfj35ccAyFCi204/px96nz7+2J/0ASI5qaPXcZ7yf1yR/
HbN8oV3iM/pYohHlwFkEt785sMSqjDVNFA8gWWwCek+iF0Pp34J8ktesHEJubCJ+
R8B5YEK5JSfQ7ROQFlCYn/dS/DrP5mD6+1Yyy1iDumPkHgkxIz8tJ+z6h8jlyfhE
XqKPtSFVyE6LYyOk0m9j4lmNuNXWcIo1c5iiScMRvcvHyvVMMZoV0mjzSGI7HCsf
RoYQt8Ypi27Iei2EXph1V+WmpdYDhG55649m8ubn2YfMJbbep2+ya5DYZpWO1Ir+
Bof8idZkYFDZVSA6T9eBzMg/XwTvNI5WuwjCdD9tfO0s9R7OSVD0eZQNlLSJSjJA
c7N+jQkod+2uhgMzqGLSDvRze/0BOaN25Xt+R7bbOEG+k/mBd8+xgPIemAPKmS93
s6Ia87SRFdYpcJkxoIPJ6Tqky3QTcmSApTZ9ckYVUCxo8IGSsYV5gaoKX6G4O9nm
eewhdiN7si65f1duDkjXEySQ2eBPqwWpA0/w/O1WUwPDJdIYhXU2d1zDdnVGh0nH
NUcsJD1UDQ==
=JpHn
-----END PGP SIGNATURE-----
Merge tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Fix for shared tag set exit (Bart)
- Correct ioctl range for zoned ioctls (Damien)
- Removed dead/unused function (Lin)
- Fix perf regression for shared tags (Ming)
- Fix out-of-bounds issue with kyber and preemption (Omar)
- BFQ merge fix (Paolo)
- Two error handling fixes for nbd (Sun)
- Fix weight update in blk-iocost (Tejun)
- NVMe pull request (Christoph):
- correct the check for using the inline bio in nvmet (Chaitanya
Kulkarni)
- demote unsupported command warnings (Chaitanya Kulkarni)
- fix corruption due to double initializing ANA state (me, Hou Pu)
- reset ns->file when open fails (Daniel Wagner)
- fix a NULL deref when SEND is completed with error in nvmet-rdma
(Michal Kalderon)
- Fix kernel-doc warning (Bart)
* tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block:
block/partitions/efi.c: Fix the efi_partition() kernel-doc header
blk-mq: Swap two calls in blk_mq_exit_queue()
blk-mq: plug request for shared sbitmap
nvmet: use new ana_log_size instead the old one
nvmet: seset ns->file when open fails
nbd: share nbd_put and return by goto put_nbd
nbd: Fix NULL pointer in flush_workqueue
blkdev.h: remove unused codes blk_account_rq
block, bfq: avoid circular stable merges
blk-iocost: fix weight updates of inner active iocgs
nvmet: demote fabrics cmd parse err msg to debug
nvmet: use helper to remove the duplicate code
nvmet: demote discovery cmd parse err msg to debug
nvmet-rdma: Fix NULL deref when SEND is completed with error
nvmet: fix inline bio check for passthru
nvmet: fix inline bio check for bdev-ns
nvme-multipath: fix double initialization of ANA state
kyber: fix out of bounds access when preempted
block: uapi: fix comment about block device ioctl
Replace the following two statements by the statement “goto put_nbd;”
nbd_put(nbd);
return 0;
Signed-off-by: Sun Ke <sunke32@huawei.com>
Suggested-by: Markus Elfring <Markus.Elfring@web.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20210512114331.1233964-3-sunke32@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmCVVnQQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgps0ND/0SL4zWQJ5fh+NVCyQJFLm0E+ejqWg6Ykmk
EE1Dzhgr9lgxZU19UCXKtN0lF9icWPfoVDxvqsB2luJLc89GciOmla3PaknCgY6N
QZ/GJh/2Kwb9ybVblzKvUNnGSZOZ8gplpAAXu4zlbFXl7xoGBb12kql78fjw84rS
S4IG+nKvTdC6ENVTPwFMj0UREL5nccVJycvsuZgzYsSQ//5i5zViDz7mfdCujAo4
g3rt8rctBqYoF684BG4OVkDp7ivJUFvMW93PVqvx8vw2sAOB11v+sAKvX5cZIsdM
Z01a3C5nY8IQcpXhoI7n6Kgg4VY0ubeiOrlIBssNQWJszquAHPN7s5uiiSFaIKwg
mCyo69Ofmk4wYm2UO0hM8y7x94QvUNKmlcVxb4ls5OEaAKS/v7chnjoovp8s8Me/
2w1BMBB4qPcF99+K2GF9KyT/gKrXDRXkr9ERTtLLPpCf2uIXtFcU+X+Y64cOivhf
ImN1kbN8fQm1ItiEntn5tVd9u9cDnfqTJhzutBolLP33jjarK3TblJ4cUZqN/xAC
uH5k1IXZGHbrE9LuXUJQwFs752m21LElSkfG7OxzlktfJcKxJriM9o/dw0mgEmLv
0i1meb55VMbtYT/dNWZEa2FRVtelFIngfoiLSgH0IHXU7sKgTEpgyLmSu4PrySez
kRVUsF1Lfw==
=Sv+q
-----END PGP SIGNATURE-----
Merge tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- dasd spelling fixes (Bhaskar)
- Limit bio max size on multi-page bvecs to the hardware limit, to
avoid overly large bio's (and hence latencies). Originally queued for
the merge window, but needed a fix and was dropped from the initial
pull (Changheun)
- NVMe pull request (Christoph):
- reset the bdev to ns head when failover (Daniel Wagner)
- remove unsupported command noise (Keith Busch)
- misc passthrough improvements (Kanchan Joshi)
- fix controller ioctl through ns_head (Minwoo Im)
- fix controller timeouts during reset (Tao Chiu)
- rnbd fixes/cleanups (Gioh, Md, Dima)
- Fix iov_iter re-expansion (yangerkun)
* tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block:
block: reexpand iov_iter after read/write
nvmet: remove unsupported command noise
nvme-multipath: reset bdev to ns head when failover
nvme-pci: fix controller reset hang when racing with nvme_timeout
nvme: move the fabrics queue ready check routines to core
nvme: avoid memset for passthrough requests
nvme: add nvme_get_ns helper
nvme: fix controller ioctl through ns_head
bio: limit bio max size
RDMA/rtrs: fix uninitialized symbol 'cnt'
s390: dasd: Mundane spelling fixes
block/rnbd: Remove all likely and unlikely
block/rnbd-clt: Check the return value of the function rtrs_clt_query
block/rnbd: Fix style issues
block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
My UEK-derived config has 1030 files depending on pagemap.h before this
change. Afterwards, just 326 files need to be rebuilt when I touch
pagemap.h. I think blkdev.h is probably included too widely, but
untangling that dependency is harder and this solves my problem. x86
allmodconfig builds, but there may be implicit include problems on other
architectures.
Link: https://lkml.kernel.org/r/20210309195747.283796-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Dan Williams <dan.j.williams@intel.com> [nvdimm]
Acked-by: Jens Axboe <axboe@kernel.dk> [block]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> [scsi]
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The IO performance test with fio after removing the likely and
unlikely macros in all if-statement shows no performance drop.
They do not help for the performance of rnbd.
The fio test did random read on 32 rnbd devices and 64 processes.
Test environment:
- AMD Opteron(tm) Processor 6386 SE
- 125G memory
- kernel version: 5.4.86
- gcc version: gcc (Debian 8.3.0-6) 8.3.0
- Infiniband controller: InfiniBand: Mellanox Technologies MT26428
[ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
before
read: IOPS=549k, BW=2146MiB/s
read: IOPS=544k, BW=2125MiB/s
read: IOPS=553k, BW=2158MiB/s
read: IOPS=535k, BW=2089MiB/s
read: IOPS=543k, BW=2122MiB/s
read: IOPS=552k, BW=2154MiB/s
average: IOPS=546k, BW=2132MiB/s
after
read: IOPS=556k, BW=2172MiB/s
read: IOPS=561k, BW=2191MiB/s
read: IOPS=552k, BW=2156MiB/s
read: IOPS=551k, BW=2154MiB/s
read: IOPS=562k, BW=2194MiB/s
-----------
average: IOPS=556k, BW=2173MiB/s
The IOPS and bandwidth got better slightly after removing
likely/unlikely. (IOPS= +1.8% BW= +1.9%) But we cannot make sure
that removing the likely/unlikely help the performance because it
depends on various situations. We only make sure that removing the
likely/unlikely does not drop the performance.
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-5-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In case none of the paths are in connected state, the function
rtrs_clt_query returns an error. In such a case, error out since the
values in the rtrs_attrs structure would be garbage.
Fixes: f7a7a5c228 ("block/rnbd: client: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-4-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch fixes some style issues detected by scripts/checkpatch.pl
* Resolve spacing and tab issues
* Remove extra braces in rnbd_get_iu
* Use num_possible_cpus() instead of NR_CPUS in alloc_sess
* Fix the comments styling in rnbd_queue_rq
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-3-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The member queue_depth in the structure rnbd_clt_session is read from the
rtrs client side using the function rtrs_clt_query, which in turn is read
from the rtrs_clt structure. It should really be of type size_t.
Fixes: 90426e89f5 ("block/rnbd: client: private header with client structs and functions")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-2-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In null_init, null_add_dev(dev) is called.
In null_add_dev, it calls null_free_zoned_dev(dev) to free dev->zones
via kvfree(dev->zones) in out_cleanup_zone branch and returns err.
Then null_init accept the err code and then calls null_free_dev(dev).
But in null_free_dev(dev), dev->zones is freed again by
null_free_zoned_dev().
My patch set dev->zones to NULL in null_free_zoned_dev() after
kvfree(dev->zones) is called, to avoid the double free.
Fixes: 2984c8684f ("nullb: factor disk parameters")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Link: https://lore.kernel.org/r/20210426143229.7374-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Prior to commit 4a8c31a1c6 ("xen/blkback: rework connect_ring() to avoid
inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
behaviour of xen-blkback when connecting to a frontend was:
- read 'ring-page-order'
- if not present then expect a single page ring specified by 'ring-ref'
- else expect a ring specified by 'ring-refX' where X is between 0 and
1 << ring-page-order
This was correct behaviour, but was broken by the afforementioned commit to
become:
- read 'ring-page-order'
- if not present then expect a single page ring (i.e. ring-page-order = 0)
- expect a ring specified by 'ring-refX' where X is between 0 and
1 << ring-page-order
- if that didn't work then see if there's a single page ring specified by
'ring-ref'
This incorrect behaviour works most of the time but fails when a frontend
that sets 'ring-page-order' is unloaded and replaced by one that does not
because, instead of reading 'ring-ref', xen-blkback will read the stale
'ring-ref0' left around by the previous frontend will try to map the wrong
grant reference.
This patch restores the original behaviour.
Fixes: 4a8c31a1c6 ("xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront")
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: "Roger Pau Monné" <roger.pau@citrix.com>
Link: https://lore.kernel.org/r/20210202175659.18452-1-paul@xen.org
Signed-off-by: Juergen Gross <jgross@suse.com>
While the maximum size of each ramdisk is defined either as a module
parameter, or compile time default, it's impossible to know how many pages
have currently been allocated by each ram%d device, since they're
allocated when used and never freed.
This patch creates a new directory at this location:
/sys/kernel/debug/ramdisk_pages/
which will contain a file named "ram%d" for each instantiated ramdisk on
the system. The file is read-only, and read() will output the number of
pages currently held by that ramdisk.
We lose track how much memory a ramdisk is using as pages once used are
simply recycled but never freed.
In instances where we exhaust the size of the ramdisk with a file that
exceeds it, encounter ENOSPC and delete the file for mitigation; df would
show decrease in used and increase in available blocks but the since we
have touched all pages, the memory footprint of the ramdisk does not
reflect the blocks used/available count
...
[root@localhost ~]# mkfs.ext2 /dev/ram15
mke2fs 1.45.6 (20-Mar-2020)
Creating filesystem with 4096 1k blocks and 1024 inodes
[root@localhost ~]# mount /dev/ram15 /mnt/ram15/
[root@localhost ~]# cat
/sys/kernel/debug/ramdisk_pages/ram15
58
[root@kerneltest008.06.prn3 ~]# df /dev/ram15
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/ram15 3963 31 3728 1% /mnt/ram15
[root@kerneltest008.06.prn3 ~]# dd if=/dev/urandom of=/mnt/ram15/test2
bs=1M count=5
dd: error writing '/mnt/ram15/test2': No space left on device
4+0 records in
3+0 records out
4005888 bytes (4.0 MB, 3.8 MiB) copied, 0.0446614 s, 89.7 MB/s
[root@kerneltest008.06.prn3 ~]# df /mnt/ram15/
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/ram15 3963 3960 0 100% /mnt/ram15
[root@kerneltest008.06.prn3 ~]# cat
/sys/kernel/debug/ramdisk_pages/ram15
1024
[root@kerneltest008.06.prn3 ~]# rm /mnt/ram15/test2
rm: remove regular file '/mnt/ram15/test2'? y
[root@kerneltest008.06.prn3 /var]# df /dev/ram15
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/ram15 3963 31 3728 1% /mnt/ram15
# Acutal memory footprint
[root@kerneltest008.06.prn3 /var]# cat
/sys/kernel/debug/ramdisk_pages/ram15
1024
...
This debugfs counter will always reveal the accurate number of
permanently allocated pages to the ramdisk.
Signed-off-by: Calvin Owens <calvinowens@fb.com>
[cleaned up the !CONFIG_DEBUG_FS case and API changes for HEAD]
Signed-off-by: Kyle McMartin <jkkm@fb.com>
[rebased]
Signed-off-by: Saravanan D <saravanand@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Smatch complains that the "type > NUM_DISK_MINORS" should be >=
instead of >. We also need to subtract one from "type" at the start.
Fixes: bf9c0538e4 ("ataflop: use a separate gendisk for each media format")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The function uses "type" as an array index:
q = unit[drive].disk[type]->queue;
Unfortunately the bounds check on "type" isn't done until later in the
function. Fix this by moving the bounds check to the start.
Fixes: bf9c0538e4 ("ataflop: use a separate gendisk for each media format")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding a break statement instead of just
letting the code fall through to the next, and by adding a fallthrough
pseudo-keyword in places whre the code is intended to fall through.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
cppcheck report the following error:
rnbd/rnbd-clt-sysfs.c:522:36: error: The variable 'buf' is used both
as a parameter and as destination in snprintf(). The origin and
destination buffers overlap. Quote from glibc (C-library)
documentation
(http://www.gnu.org/software/libc/manual/html_mono/libc.html#Formatted-Output-Functions):
"If copying takes place between objects that overlap as a result of a
call to sprintf() or snprintf(), the results are undefined."
[sprintfOverlappingData]
Fix it by initializing the buf variable in the first snprintf call.
Fixes: 91f4acb280 ("block/rnbd-clt: support mapping two devices")
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-19-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We always map with SZ_4K, so do not need max_segment_size.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-18-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When an RTRS session state changes, the transport layer generates an event
to RNBD. Then RNBD will change the state of the RNBD client device
accordingly.
This commit add kobject_uevent when the RNBD device state changes. With
this udev rules can be configured to react accordingly.
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-17-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
struct rtrs_srv is not used when handling rnbd_srv_rdma_ev messages, so
cleaned up
rdma_ev function pointer in rtrs_srv_ops also is changed.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-16-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
RNBD can make double-queues for irq-mode and poll-mode.
For example, on 4-CPU system 8 request-queues are created,
4 for irq-mode and 4 for poll-mode.
If the IO has HIPRI flag, the block-layer will call .poll function
of RNBD. Then IO is sent to the poll-mode queue.
Add optional nr_poll_queues argument for map_devices interface.
To support polling of RNBD, RTRS client creates connections
for both of irq-mode and direct-poll-mode.
For example, on 4-CPU system it could've create 5 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
After this patch, it can create 9 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
con[5:8] => DIRECT-POLL cq
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-14-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When unloading the rnbd-clt module, it does not free a memory
including the filename of the symbolic link to /sys/block/rnbdX.
It is found by kmemleak as below.
unreferenced object 0xffff9f1a83d3c740 (size 16):
comm "bash", pid 736, jiffies 4295179665 (age 9841.310s)
hex dump (first 16 bytes):
21 64 65 76 21 6e 75 6c 6c 62 30 40 62 6c 61 00 !dev!nullb0@bla.
backtrace:
[<0000000039f0c55e>] 0xffffffffc0456c24
[<000000001aab9513>] kernfs_fop_write+0xcf/0x1c0
[<00000000db5aa4b3>] vfs_write+0xdb/0x1d0
[<000000007a2e2207>] ksys_write+0x65/0xe0
[<00000000055e280a>] do_syscall_64+0x50/0x1b0
[<00000000c2b51831>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-13-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
clang static analysis reports this problem
rnbd-clt.c:1212:11: warning: Branch condition evaluates to a
garbage value
else if (!first)
^~~~~~
This is triggered in the find_and_get_or_create_sess() call
because the variable first is not initialized and the
earlier check is specifically for
if (sess == ERR_PTR(-ENOMEM))
This is false positive.
But the if-check can be reduced by initializing first to
false and then returning if the call to find_or_creat_sess()
does not set it to true. When it remains false, either
sess will be valid or not. The not case is caught by
find_and_get_or_create_sess()'s caller rnbd_clt_map_device()
sess = find_and_get_or_create_sess(...);
if (IS_ERR(sess))
return ERR_CAST(sess);
Since find_and_get_or_create_sess() initializes first to false
setting it in find_or_create_sess() is not needed.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-12-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We changed the rnbd_srv_sess_dev_force_close to use try-lock
because rnbd_srv_sess_dev_force_close and process_msg_close
can generate a deadlock.
Now rnbd_srv_sess_dev_force_close would do nothing
if it fails to get the lock. So removing the force_close
file should be moved to after the lock. Or the force_close
file is removed but the others are not removed.
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210419073722.15351-11-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
They are defined with the same value and similar meaning, let's remove
one of them, then we can remove {WAIT,NOWAIT}.
Also change the type of 'wait' from 'int' to 'enum wait_type' to make
it clear.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-9-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can use destroy_device directly since destroy_device_cb is just the
wrapper of destroy_device.
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-8-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No need to have it since we can call sysfs_remove_group in the
rnbd_clt_destroy_sysfs_files.
Then rnbd_clt_destroy_sysfs_files is paired with it's counterpart
rnbd_clt_create_sysfs_files.
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-7-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It makes more sense to add gendisk in rnbd_clt_setup_gen_disk, instead
of do it in rnbd_clt_map_device.
Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-6-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove them since both sess and idx can be dereferenced from dev. And
sess is not used in the function.
Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-5-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove 'pathname' and 'sess' since we can dereference it from 'dev'.
Signed-off-by: Guoqing Jiang <guoqing.jiang@gmx.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210419073722.15351-4-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
FLOPPY_SILENT_DCL_CLEAR is not defined anywhere and comes from pre-git
era. Just drop this undef. There is FD_SILENT_DCL_CLEAR which is really
used.
Signed-off-by: Denis Efremov <efremov@linux.com>
Link: https://lore.kernel.org/r/20210416083449.72700-6-efremov@linux.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use ST0 as 0 index for reply_buffer array. get_fdc_version() is the only
function that uses index 0 directly instead of the ST0 define.
Signed-off-by: Denis Efremov <efremov@linux.com>
Link: https://lore.kernel.org/r/20210416083449.72700-3-efremov@linux.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bio_list_copy_data is only used by pktcdvd, so move it there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210412134658.2623190-2-hch@lst.de
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This will enable changing the virtual boundary of null blk devices. For
now, null blk devices didn't have any restriction on the scatter/gather
elements received from the block layer. Add a module parameter and a
configfs option that will control the virtual boundary. This will
enable testing the efficiency of the block layer bounce buffer in case
a suitable application will send discontiguous IO to the given device.
Initial testing with patched FIO showed the following results (64 jobs,
128 iodepth, 1 nullb device):
IO size READ (virt=false) READ (virt=true) Write (virt=false) Write (virt=true)
---------- ------------------- ----------------- ------------------- -------------------
1k 10.7M 8482k 10.8M 8471k
2k 10.4M 8266k 10.4M 8271k
4k 10.4M 8274k 10.3M 8226k
8k 10.2M 8131k 9800k 7933k
16k 9567k 7764k 8081k 6828k
32k 8865k 7309k 5570k 5153k
64k 7695k 6586k 2682k 2617k
128k 5346k 5489k 1320k 1296k
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210412095523.278632-1-mgurtovoy@nvidia.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
Link: https://lore.kernel.org/r/1617710988-49205-1-git-send-email-huangguobin4@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
swim3 only uses the virtual address of a bio to stash it into the data
transfer using virt_to_bus. But the ppc32 virt_to_bus just uses the
physical address with an offset. Replace virt_to_bus with a local hack
that performs the equivalent transformation and stop asking for block
layer bounce buffering.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210406061839.811588-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Always use the track buffer that is already used for addresses outside
the 16MB address capability of the floppy controller. This allows to
remove a lot of code that relies on kernel virtual addresses. With
this gone there is just a single place left that looks at the bio,
which can be converted to memcpy_{from,to}_page, thus removing the need
for the extra block-layer bounce buffering for highmem pages.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210406061755.811522-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
m68k doesn't support highmem, so don't bother enabling the block layer
bounce buffer code. Just for safety throw in a depend on !HIGHMEM.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210406061725.811389-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
from drivers/block/drbd/drbd_nl.c:24:
drivers/block/drbd/drbd_nl.c: In function ‘drbd_adm_attach’:
drivers/block/drbd/drbd_nl.c:1968:10: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
drivers/block/drbd/drbd_nl.c:930: warning: Function parameter or member 'flags' not described in 'drbd_determine_dev_size'
drivers/block/drbd/drbd_nl.c:930: warning: Function parameter or member 'rs' not described in 'drbd_determine_dev_size'
drivers/block/drbd/drbd_nl.c:1148: warning: Function parameter or member 'dc' not described in 'drbd_check_al_size'
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-12-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
drivers/block/xen-blkfront.c:1960: warning: Function parameter or member 'dev' not described in 'blkfront_probe'
drivers/block/xen-blkfront.c:1960: warning: Function parameter or member 'id' not described in 'blkfront_probe'
drivers/block/xen-blkfront.c:1960: warning: expecting prototype for Allocate the basic(). Prototype was for blkfront_probe() instead
drivers/block/xen-blkfront.c:2085: warning: Function parameter or member 'dev' not described in 'blkfront_resume'
drivers/block/xen-blkfront.c:2085: warning: expecting prototype for or a backend(). Prototype was for blkfront_resume() instead
drivers/block/xen-blkfront.c:2444: warning: wrong kernel-doc identifier on line:
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: xen-devel@lists.xenproject.org
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Link: https://lore.kernel.org/r/20210312105530.2219008-11-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
drivers/block/drbd/drbd_receiver.c:1641: warning: Function parameter or member 'op' not described in 'drbd_submit_peer_request'
drivers/block/drbd/drbd_receiver.c:1641: warning: Function parameter or member 'op_flags' not described in 'drbd_submit_peer_request'
drivers/block/drbd/drbd_receiver.c:1641: warning: Function parameter or member 'fault_type' not described in 'drbd_submit_peer_request'
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-10-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
drivers/block/drbd/drbd_main.c:278: warning: Function parameter or member 'connection' not described in 'tl_clear'
drivers/block/drbd/drbd_main.c:278: warning: Excess function parameter 'device' description in 'tl_clear'
drivers/block/drbd/drbd_main.c:489: warning: Function parameter or member 'cpu_mask' not described in 'drbd_calc_cpu_mask'
drivers/block/drbd/drbd_main.c:528: warning: Excess function parameter 'device' description in 'drbd_thread_current_set_cpu'
drivers/block/drbd/drbd_main.c:549: warning: Function parameter or member 'connection' not described in 'drbd_header_size'
drivers/block/drbd/drbd_main.c:1204: warning: Function parameter or member 'device' not described in 'send_bitmap_rle_or_plain'
drivers/block/drbd/drbd_main.c:1204: warning: Function parameter or member 'c' not described in 'send_bitmap_rle_or_plain'
drivers/block/drbd/drbd_main.c:1335: warning: Function parameter or member 'peer_device' not described in '_drbd_send_ack'
drivers/block/drbd/drbd_main.c:1335: warning: Excess function parameter 'device' description in '_drbd_send_ack'
drivers/block/drbd/drbd_main.c:1379: warning: Function parameter or member 'peer_device' not described in 'drbd_send_ack'
drivers/block/drbd/drbd_main.c:1379: warning: Excess function parameter 'device' description in 'drbd_send_ack'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'connection' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'sock' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'buffer' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'size' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'msg_flags' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:3525: warning: Function parameter or member 'flags' not described in 'drbd_queue_bitmap_io'
drivers/block/drbd/drbd_main.c:3563: warning: Function parameter or member 'flags' not described in 'drbd_bitmap_io'
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-9-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>