If a tag set is shared across request queues (e.g. SCSI LUNs) then the
block layer core keeps track of the number of active request queues in
tags->active_queues. blk_mq_tag_busy() and blk_mq_tag_idle() update that
atomic counter if the hctx flag BLK_MQ_F_TAG_QUEUE_SHARED is set. Make
sure that blk_mq_exit_queue() calls blk_mq_tag_idle() before that flag is
cleared by blk_mq_del_queue_tag_set().
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Fixes: 0d2602ca30 ("blk-mq: improve support for shared tags maps")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210513171529.7977-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In case of shared sbitmap, request won't be held in plug list any more
sine commit 32bc15afed ("blk-mq: Facilitate a shared sbitmap per
tagset"), this way makes request merge from flush plug list & batching
submission not possible, so cause performance regression.
Yanhui reports performance regression when running sequential IO
test(libaio, 16 jobs, 8 depth for each job) in VM, and the VM disk
is emulated with image stored on xfs/megaraid_sas.
Fix the issue by recovering original behavior to allow to hold request
in plug list.
Cc: Yanhui Ma <yama@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: kashyap.desai@broadcom.com
Fixes: 32bc15afed ("blk-mq: Facilitate a shared sbitmap per tagset")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210514022052.1047665-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- correct the check for using the inline bio in nvmet
(Chaitanya Kulkarni)
- demote unsupported command warnings (Chaitanya Kulkarni)
- fix corruption due to double initializing ANA state (me, Hou Pu)
- reset ns->file when open fails (Daniel Wagner)
- fix a NULL deref when SEND is completed with error in nvmet-rdma
(Michal Kalderon)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmCdW5cLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYNizRAAomAnNvWi6i/Y/4eYfPeEk31jImL4izV+L4n5LS86
/l8uacsZbtZZjIkUtDP3yjSKy4GbCQQVZHYVeH582vr6VKXztGKgGeckm1HRHUfc
9cV8fQ3Yx8OdYO7SoO7IAhwkOYNNYoiS3fFFPVLp/UlimtahpDzywum2tX05fsLz
1ucJfFwu9edxJfh1/uNKsXbkoBIXd+2NtdpycYqOjrVYTMNwQOvFq7P+Kd/ETDIU
mFLeiY1b3BwbUUIDADz7QyTL5Ajk1bcXXLVudBrNdy0jg4Bfvzhn7vxBLwYUFAzK
pX54dr8+GQhBqUzrqdRSLg4nfG9mpN+yI5RtqBcxVsGEkRU4w3XABTPgP5MtBQMr
nGcy82KNVG3UA0jAe4wEmlr3tadlTtuH9O7aB+jEsDJIMo4S6SBl5ibsVqKyXFHb
otYto+I9SpjJC5nRJXaFDqmI1pEX0mwmpDIek3YM3ovTzlLbKswBlDdHBJG9WM7H
3fo+OCj9aqj3D9SNBPD6blqBGUNsf+iPMryPxKCK+wQK6RpF9UQfyEvjVGneTx1+
xgjvem8pSipxsh3p/mz7Ktemqrx5oKSOvq6+jBcYRJbrBfRf5z3XLYn3noC/EAL6
RHAKbp0WKhlj2N/wf7pirf/+wkpXlLmpqtdaskpZWkdWT+xexqUITL8eb81RB/Ra
Tew=
=ZwLd
-----END PGP SIGNATURE-----
Merge tag 'nvme-5.13-2021-05-13' of git://git.infradead.org/nvme into block-5.13
Pull NVMe fixes from Christoph:
"nvme fix for Linux 5.13
- correct the check for using the inline bio in nvmet
(Chaitanya Kulkarni)
- demote unsupported command warnings (Chaitanya Kulkarni)
- fix corruption due to double initializing ANA state (me, Hou Pu)
- reset ns->file when open fails (Daniel Wagner)
- fix a NULL deref when SEND is completed with error in nvmet-rdma
(Michal Kalderon)"
* tag 'nvme-5.13-2021-05-13' of git://git.infradead.org/nvme:
nvmet: use new ana_log_size instead the old one
nvmet: seset ns->file when open fails
nvmet: demote fabrics cmd parse err msg to debug
nvmet: use helper to remove the duplicate code
nvmet: demote discovery cmd parse err msg to debug
nvmet-rdma: Fix NULL deref when SEND is completed with error
nvmet: fix inline bio check for passthru
nvmet: fix inline bio check for bdev-ns
nvme-multipath: fix double initialization of ANA state
Reset the ns->file value to NULL also in the error case in
nvmet_file_ns_enable().
The ns->file variable points either to file object or contains the
error code after the filp_open() call. This can lead to following
problem:
When the user first setups an invalid file backend and tries to enable
the ns, it will fail. Then the user switches over to a bdev backend
and enables successfully the ns. The first received I/O will crash the
system because the IO backend is chosen based on the ns->file value:
static u16 nvmet_parse_io_cmd(struct nvmet_req *req)
{
[...]
if (req->ns->file)
return nvmet_file_parse_io_cmd(req);
return nvmet_bdev_parse_io_cmd(req);
}
Reported-by: Enzo Matsumiya <ematsumiya@suse.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Replace the following two statements by the statement “goto put_nbd;”
nbd_put(nbd);
return 0;
Signed-off-by: Sun Ke <sunke32@huawei.com>
Suggested-by: Markus Elfring <Markus.Elfring@web.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20210512114331.1233964-3-sunke32@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Last users of blk_account_rq gone with patch commit a1ce35fa49
("block: remove dead elevator code") and now it gets no caller, it can
be safely removed.
Signed-off-by: Lin Feng <linf@wangsu.com>
Link: https://lore.kernel.org/r/20210512100124.173769-1-linf@wangsu.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
BFQ may merge a new bfq_queue, stably, with the last bfq_queue
created. In particular, BFQ first waits a little bit for some I/O to
flow inside the new queue, say Q2, if this is needed to understand
whether it is better or worse to merge Q2 with the last queue created,
say Q1. This delayed stable merge is performed by assigning
bic->stable_merge_bfqq = Q1, for the bic associated with Q1.
Yet, while waiting for some I/O to flow in Q2, a non-stable queue
merge of Q2 with Q1 may happen, causing the bic previously associated
with Q2 to be associated with exactly Q1 (bic->bfqq = Q1). After that,
Q2 and Q1 may happen to be split, and, in the split, Q1 may happen to
be recycled as a non-shared bfq_queue. In that case, Q1 may then
happen to undergo a stable merge with the bfq_queue pointed by
bic->stable_merge_bfqq. Yet bic->stable_merge_bfqq still points to
Q1. So Q1 would be merged with itself.
This commit fixes this error by intercepting this situation, and
canceling the schedule of the stable merge.
Fixes: 430a67f9d6 ("block, bfq: merge bursts of newly-created queues")
Signed-off-by: Pietro Pedroni <pedroni.pietro.96@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20210512094352.85545-2-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When the weight of an active iocg is updated, weight_updated() is called
which in turn calls __propagate_weights() to update the active and inuse
weights so that the effective hierarchical weights are update accordingly.
The current implementation is incorrect for inner active nodes. For an
active leaf iocg, inuse can be any value between 1 and active and the
difference represents how much the iocg is donating. When weight is updated,
as long as inuse is clamped between 1 and the new weight, we're alright and
this is what __propagate_weights() currently implements.
However, that's not how an active inner node's inuse is set. An inner node's
inuse is solely determined by the ratio between the sums of inuse's and
active's of its children - ie. they're results of propagating the leaves'
active and inuse weights upwards. __propagate_weights() incorrectly applies
the same clamping as for a leaf when an active inner node's weight is
updated. Consider a hierarchy which looks like the following with saturating
workloads in AA and BB.
R
/ \
A B
| |
AA BB
1. For both A and B, active=100, inuse=100, hwa=0.5, hwi=0.5.
2. echo 200 > A/io.weight
3. __propagate_weights() update A's active to 200 and leave inuse at 100 as
it's already between 1 and the new active, making A:active=200,
A:inuse=100. As R's active_sum is updated along with A's active,
A:hwa=2/3, B:hwa=1/3. However, because the inuses didn't change, the
hwi's remain unchanged at 0.5.
4. The weight of A is now twice that of B but AA and BB still have the same
hwi of 0.5 and thus are doing the same amount of IOs.
Fix it by making __propgate_weights() always calculate the inuse of an
active inner iocg based on the ratio of child_inuse_sum to child_active_sum.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dan Schatzberg <dschatzberg@fb.com>
Fixes: 7caa47151a ("blkcg: implement blk-iocost")
Cc: stable@vger.kernel.org # v5.4+
Link: https://lore.kernel.org/r/YJsxnLZV1MnBcqjj@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Host can send invalid commands and flood the target with error messages.
Demote the error message from pr_err() to pr_debug() in
nvmet_parse_fabrics_cmd() and nvmet_parse_connect_cmd().
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Use the helper nvmet_report_invalid_opcode() to report invalid opcode
so we can remove the duplicate code.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Host can send invalid commands and flood the target with error messages
for the discovery controller. Demote the error message from pr_err() to
pr_debug( in nvmet_parse_discovery_cmd().
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When handling passthru commands, for inline bio allocation we only
consider the transfer size. This works well when req->sg_cnt fits into
the req->inline_bvec, but it will result in the early return from
bio_add_hw_page() when req->sg_cnt > NVMET_MAX_INLINE_BVEC.
Consider an I/O of size 32768 and first buffer is not aligned to the
page boundary, then I/O is split in following manner :-
[ 2206.256140] nvmet: sg->length 3440 sg->offset 656
[ 2206.256144] nvmet: sg->length 4096 sg->offset 0
[ 2206.256148] nvmet: sg->length 4096 sg->offset 0
[ 2206.256152] nvmet: sg->length 4096 sg->offset 0
[ 2206.256155] nvmet: sg->length 4096 sg->offset 0
[ 2206.256159] nvmet: sg->length 4096 sg->offset 0
[ 2206.256163] nvmet: sg->length 4096 sg->offset 0
[ 2206.256166] nvmet: sg->length 4096 sg->offset 0
[ 2206.256170] nvmet: sg->length 656 sg->offset 0
Now the req->transfer_size == NVMET_MAX_INLINE_DATA_LEN i.e. 32768, but
the req->sg_cnt is (9) > NVMET_MAX_INLINE_BIOVEC which is (8).
This will result in early return in the following code path :-
nvmet_bdev_execute_rw()
bio_add_pc_page()
bio_add_hw_page()
if (bio_full(bio, len))
return 0;
Use previously introduced helper nvmet_use_inline_bvec() to consider
req->sg_cnt when using inline bio. This only affects nvme-loop
transport.
Fixes: dab3902b19 ("nvmet: use inline bio for passthru fast path")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When handling rw commands, for inline bio case we only consider
transfer size. This works well when req->sg_cnt fits into the
req->inline_bvec, but it will result in the warning in
__bio_add_page() when req->sg_cnt > NVMET_MAX_INLINE_BVEC.
Consider an I/O size 32768 and first page is not aligned to the page
boundary, then I/O is split in following manner :-
[ 2206.256140] nvmet: sg->length 3440 sg->offset 656
[ 2206.256144] nvmet: sg->length 4096 sg->offset 0
[ 2206.256148] nvmet: sg->length 4096 sg->offset 0
[ 2206.256152] nvmet: sg->length 4096 sg->offset 0
[ 2206.256155] nvmet: sg->length 4096 sg->offset 0
[ 2206.256159] nvmet: sg->length 4096 sg->offset 0
[ 2206.256163] nvmet: sg->length 4096 sg->offset 0
[ 2206.256166] nvmet: sg->length 4096 sg->offset 0
[ 2206.256170] nvmet: sg->length 656 sg->offset 0
Now the req->transfer_size == NVMET_MAX_INLINE_DATA_LEN i.e. 32768, but
the req->sg_cnt is (9) > NVMET_MAX_INLINE_BIOVEC which is (8).
This will result in the following warning message :-
nvmet_bdev_execute_rw()
bio_add_page()
__bio_add_page()
WARN_ON_ONCE(bio_full(bio, len));
This scenario is very hard to reproduce on the nvme-loop transport only
with rw commands issued with the passthru IOCTL interface from the host
application and the data buffer is allocated with the malloc() and not
the posix_memalign().
Fixes: 73383adfad ("nvmet: don't split large I/Os unconditionally")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
nvme_init_identify and thus nvme_mpath_init can be called multiple
times and thus must not overwrite potentially initialized or in-use
fields. Split out a helper for the basic initialization when the
controller is initialized and make sure the init_identify path does
not blindly change in-use data structures.
Fixes: 0d0b660f21 ("nvme: add ANA support")
Reported-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
__blk_mq_sched_bio_merge() gets the ctx and hctx for the current CPU and
passes the hctx to ->bio_merge(). kyber_bio_merge() then gets the ctx
for the current CPU again and uses that to get the corresponding Kyber
context in the passed hctx. However, the thread may be preempted between
the two calls to blk_mq_get_ctx(), and the ctx returned the second time
may no longer correspond to the passed hctx. This "works" accidentally
most of the time, but it can cause us to read garbage if the second ctx
came from an hctx with more ctx's than the first one (i.e., if
ctx->index_hw[hctx->type] > hctx->nr_ctx).
This manifested as this UBSAN array index out of bounds error reported
by Jakub:
UBSAN: array-index-out-of-bounds in ../kernel/locking/qspinlock.c:130:9
index 13106 is out of range for type 'long unsigned int [128]'
Call Trace:
dump_stack+0xa4/0xe5
ubsan_epilogue+0x5/0x40
__ubsan_handle_out_of_bounds.cold.13+0x2a/0x34
queued_spin_lock_slowpath+0x476/0x480
do_raw_spin_lock+0x1c2/0x1d0
kyber_bio_merge+0x112/0x180
blk_mq_submit_bio+0x1f5/0x1100
submit_bio_noacct+0x7b0/0x870
submit_bio+0xc2/0x3a0
btrfs_map_bio+0x4f0/0x9d0
btrfs_submit_data_bio+0x24e/0x310
submit_one_bio+0x7f/0xb0
submit_extent_page+0xc4/0x440
__extent_writepage_io+0x2b8/0x5e0
__extent_writepage+0x28d/0x6e0
extent_write_cache_pages+0x4d7/0x7a0
extent_writepages+0xa2/0x110
do_writepages+0x8f/0x180
__writeback_single_inode+0x99/0x7f0
writeback_sb_inodes+0x34e/0x790
__writeback_inodes_wb+0x9e/0x120
wb_writeback+0x4d2/0x660
wb_workfn+0x64d/0xa10
process_one_work+0x53a/0xa80
worker_thread+0x69/0x5b0
kthread+0x20b/0x240
ret_from_fork+0x1f/0x30
Only Kyber uses the hctx, so fix it by passing the request_queue to
->bio_merge() instead. BFQ and mq-deadline just use that, and Kyber can
map the queues itself to avoid the mismatch.
Fixes: a6088845c2 ("block: kyber: make kyber more friendly with merging")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Link: https://lore.kernel.org/r/c7598605401a48d5cfeadebb678abd10af22b83f.1620691329.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix the comment mentioning ioctl command range used for zoned block
devices to reflect the range of commands actually implemented.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Link: https://lore.kernel.org/r/20210509234806.3000-1-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This reverts commit cd2c7545ae.
Alex reports that the commit causes corruption with LUKS on ext4. Revert
it for now so that this can be investigated properly.
Link: https://lore.kernel.org/linux-block/1620493841.bxdq8r5haw.none@localhost/
Reported-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nothing can stop a host from submitting invalid commands. The target
just needs to respond with an appropriate status, but that's not a
target error. Demote invalid command messages to the debug level so
these events don't spam the kernel logs.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a request finally completes in end_io() after it has failed over,
the bdev pointer can be stale and thus the system can crash. Set the
bdev back to ns head, so the request is map to an active path when
resubmitted.
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
reset_work() in nvme-pci may hang forever in the following scenario:
1) A reset caused by a command timeout occurs due to a controller being
temporarily irresponsive.
2) nvme_reset_work() restarts admin queue at nvme_alloc_admin_tags(). At
the same time, a user-submitted admin command is queued and waiting
for completion. Then, reset_work() changes its state to CONNECTING,
and submits an identify command.
3) However, the controller does still not respond to any command,
causing a timeout being fired at the user-submitted command.
Unfortunately, nvme_timeout() does not see the completion on cq, and
any timeout that takes place under CONNECTING state causes a
controller shutdown.
4) Normally, the identify command in reset_work() would be canceled with
SC_HOST_ABORTED by nvme_dev_disable(), then reset_work can tear down
the controller accordingly. But the controller happens to return
online and respond the identify command before nvme_dev_disable()
should have been reaped it off.
5) reset_work() continues to setup_io_queues() as it observes no error
in init_identify(). However, the admin queue has already been
quiesced in dev_disable(). Thus, any following commands would be
blocked forever in blk_execute_rq().
This can be fixed by restricting usercmd commands when controller is not
in a LIVE state in nvme_queue_rq(), as what has been done previously in
fabrics.
```
nvme_reset_work(): |
nvme_alloc_admin_tags() |
| nvme_submit_user_cmd():
nvme_init_identify(): | ...
__nvme_submit_sync_cmd(): |
... | ...
---------------------------------------> nvme_timeout():
(Controller starts reponding commands) | nvme_dev_disable(, true):
nvme_setup_io_queues(): |
__nvme_submit_sync_cmd(): |
(hung in blk_execute_rq |
since run_hw_queue sees |
queue quiesced) |
```
Signed-off-by: Tao Chiu <taochiu@synology.com>
Signed-off-by: Cody Wong <codywong@synology.com>
Reviewed-by: Leon Chien <leonchien@synology.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
queue_rq() in pci only checks if the dispatched queue (nvmeq) is ready,
e.g. not being suspended. Since nvme_alloc_admin_tags() in reset flow
restarts the admin queue, users are able to submit admin commands to a
controller before reset_work() completes. Commands submitted under this
condition may interfere with commands that performs identify, IO queue
setup in reset_work(), and may result in a hang described in the
following patch.
As seen in the fabrics, user commands are prevented from being executed
under inproper controller states. We may reuse this logic to maintain a
clear admin queue during reset_work().
Signed-off-by: Tao Chiu <taochiu@synology.com>
Signed-off-by: Cody Wong <codywong@synology.com>
Reviewed-by: Leon Chien <leonchien@synology.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
nvme_clear_nvme_request() clears the nvme_command, which is unncessary
for passthrough requests as nvme_command is overwritten immediately.
Move clearing part from this helper to the caller, so that double memset
for passthrough requests is avoided.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In multipath case, we should consider namespace attachment with
controllers in a subsystem when we find out the live controller for the
namespace. This patch manually reverted the commit 3557a44097
("nvme: don't bother to look up a namespace for controller ioctls") with
few more updates to nvme_ns_head_chr_ioctl which has been newly updated.
Fixes: 3557a44097 ("nvme: don't bother to look up a namespace for
controller ioctls")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
bio size can grow up to 4GB when muli-page bvec is enabled.
but sometimes it would lead to inefficient behaviors.
in case of large chunk direct I/O, - 32MB chunk read in user space -
all pages for 32MB would be merged to a bio structure if the pages
physical addresses are contiguous. it makes some delay to submit
until merge complete. bio max size should be limited to a proper size.
When 32MB chunk read with direct I/O option is coming from userspace,
kernel behavior is below now in do_direct_IO() loop. it's timeline.
| bio merge for 32MB. total 8,192 pages are merged.
| total elapsed time is over 2ms.
|------------------ ... ----------------------->|
| 8,192 pages merged a bio.
| at this time, first bio submit is done.
| 1 bio is split to 32 read request and issue.
|--------------->
|--------------->
|--------------->
......
|--------------->
|--------------->|
total 19ms elapsed to complete 32MB read done from device. |
If bio max size is limited with 1MB, behavior is changed below.
| bio merge for 1MB. 256 pages are merged for each bio.
| total 32 bio will be made.
| total elapsed time is over 2ms. it's same.
| but, first bio submit timing is fast. about 100us.
|--->|--->|--->|---> ... -->|--->|--->|--->|--->|
| 256 pages merged a bio.
| at this time, first bio submit is done.
| and 1 read request is issued for 1 bio.
|--------------->
|--------------->
|--------------->
......
|--------------->
|--------------->|
total 17ms elapsed to complete 32MB read done from device. |
As a result, read request issue timing is faster if bio max size is limited.
Current kernel behavior with multipage bvec, super large bio can be created.
And it lead to delay first I/O request issue.
Signed-off-by: Changheun Lee <nanich.lee@samsung.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210503095203.29076-1-nanich.lee@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
rtrs_clt_rdma_cq_direct returns an ninitialized value in cnt
if there is no session. This patch makes rtrs_clt_rdma_cq_direct
returns a negative value for block layer not to try again.
Fixes: 2958a995ed ("block/rnbd-clt: Support polling mode for IO latency optimization")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210429092741.266533-1-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The IO performance test with fio after removing the likely and
unlikely macros in all if-statement shows no performance drop.
They do not help for the performance of rnbd.
The fio test did random read on 32 rnbd devices and 64 processes.
Test environment:
- AMD Opteron(tm) Processor 6386 SE
- 125G memory
- kernel version: 5.4.86
- gcc version: gcc (Debian 8.3.0-6) 8.3.0
- Infiniband controller: InfiniBand: Mellanox Technologies MT26428
[ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
before
read: IOPS=549k, BW=2146MiB/s
read: IOPS=544k, BW=2125MiB/s
read: IOPS=553k, BW=2158MiB/s
read: IOPS=535k, BW=2089MiB/s
read: IOPS=543k, BW=2122MiB/s
read: IOPS=552k, BW=2154MiB/s
average: IOPS=546k, BW=2132MiB/s
after
read: IOPS=556k, BW=2172MiB/s
read: IOPS=561k, BW=2191MiB/s
read: IOPS=552k, BW=2156MiB/s
read: IOPS=551k, BW=2154MiB/s
read: IOPS=562k, BW=2194MiB/s
-----------
average: IOPS=556k, BW=2173MiB/s
The IOPS and bandwidth got better slightly after removing
likely/unlikely. (IOPS= +1.8% BW= +1.9%) But we cannot make sure
that removing the likely/unlikely help the performance because it
depends on various situations. We only make sure that removing the
likely/unlikely does not drop the performance.
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-5-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In case none of the paths are in connected state, the function
rtrs_clt_query returns an error. In such a case, error out since the
values in the rtrs_attrs structure would be garbage.
Fixes: f7a7a5c228 ("block/rnbd: client: main functionality")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-4-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch fixes some style issues detected by scripts/checkpatch.pl
* Resolve spacing and tab issues
* Remove extra braces in rnbd_get_iu
* Use num_possible_cpus() instead of NR_CPUS in alloc_sess
* Fix the comments styling in rnbd_queue_rq
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-3-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The member queue_depth in the structure rnbd_clt_session is read from the
rtrs client side using the function rtrs_clt_query, which in turn is read
from the rtrs_clt structure. It should really be of type size_t.
Fixes: 90426e89f5 ("block/rnbd: client: private header with client structs and functions")
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Link: https://lore.kernel.org/r/20210428061359.206794-2-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- Implement concurrent TLB flushes, which overlaps the local TLB flush with the
remote TLB flush. In testing this improved sysbench performance measurably by
a couple of percentage points, especially if TLB-heavy security mitigations
are active.
- Further micro-optimizations to improve the performance of TLB flushes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCKbNcRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hjYBAAsyNUa/gOu0g6/Cx8R86w9HtHHmm5vso/
6nJjWj2fd2qJ9JShlddxvXEMeXtPTYabVWQkiiriFMuofk6JeKnlHm1Jzl6keABX
OQFwjIFeNASPRcdXvuuYPOVWAJJdr2oL9QUr6OOK1ccQJTz/Cd0zA+VQ5YqcsCon
yaWbkxELwKXpgql+qt66eAZ6Q2Y1TKXyrTW7ZgxQi0yeeWqMaEOub0/oyS7Ax1Rg
qEJMwm1prb76NPzeqR/G3e4KTrDZfQ/B/KnSsz36GTJpl4eye6XqWDUgm1nAGNIc
5dbc4Vx7JtZsUOuC0AmzWb3hsDyzVcN/lQvijdZ2RsYR3gvuYGaBhKqExqV0XH6P
oqaWOKWCz+LqWbsgJmxCpqkt1LZl5+VUOcfJ97WkIS7DyIPtSHTzQXbBMZqKLeat
mn5UcKYB2Gi7wsUPv6VC2ChKbDqN0VT8G86XbYylGo4BE46KoZKPUNY/QWKLUPd6
0UKcVeNM2HFyf1C73p/tO/z7hzu3qLuMMnsphP6/c2pKLpdgawEXgbnVKNId1B/c
NrzyhTvVaMt+Um28bBRhHONIlzPJwWcnZbdY7NqMnu+LBKQ68cL/h4FOIV/RDLNb
GJLgfAr8fIw/zIpqYuFHiiMNo9wWqVtZko1MvXhGceXUL69QuzTra2XR/6aDxkPf
6gQVesetTvo=
=3Cyp
-----END PGP SIGNATURE-----
Merge tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 tlb updates from Ingo Molnar:
"The x86 MM changes in this cycle were:
- Implement concurrent TLB flushes, which overlaps the local TLB
flush with the remote TLB flush.
In testing this improved sysbench performance measurably by a
couple of percentage points, especially if TLB-heavy security
mitigations are active.
- Further micro-optimizations to improve the performance of TLB
flushes"
* tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp: Micro-optimize smp_call_function_many_cond()
smp: Inline on_each_cpu_cond() and on_each_cpu()
x86/mm/tlb: Remove unnecessary uses of the inline keyword
cpumask: Mark functions as pure
x86/mm/tlb: Do not make is_lazy dirty for no reason
x86/mm/tlb: Privatize cpu_tlbstate
x86/mm/tlb: Flush remote and local TLBs concurrently
x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy()
x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote()
smp: Run functions concurrently in smp_call_function_many_cond()
- Switch to generic syscall scripts
- Some small fixes
-----BEGIN PGP SIGNATURE-----
iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCYIpg9AAKCRDKSWXLKUoM
IXMaAKCMAvlTOl/sq+44yyqmZafzeka1kQCeOJ64km/oN86kZlhs72oXV3Nevmg=
=VvZO
-----END PGP SIGNATURE-----
Merge tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze
Pull Microblaze updates from Michal Simek:
"No new features, just about cleaning up some code and moving to
generic syscall solution used by other architectures:
- Switch to generic syscall scripts
- Some small fixes"
* tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: add 'fallthrough' to memcpy/memset/memmove
microblaze: Fix a typo
microblaze: tag highmem_setup() with __meminit
microblaze: syscalls: switch to generic syscallhdr.sh
microblaze: syscalls: switch to generic syscalltbl.sh
- removed broken/unmaintained MIPS KVM trap and emulate support
- added support for Loongson-2K1000
- fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmCJS6oaHHRzYm9nZW5k
QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHDnbBAAnfVnVgjRZZ1y+oX9siFk
xrQUVJMxEl02272yptB6VtWOLhTnFnJ2xNyjb4EcoCbTiBReSNQ8ee4WYYdCgSXh
soDW6wYrlpcZau5sRg72xybmmYo4aUEh3D9PWNpCaDXFaSpnbXv28ccCTb0yLvCL
4s72wl+hLGAygJMlA6gkQ3mN3XblRk+ci1MLQOr4WpxCmWx0ozKKnHnlt21rw26w
kokJRsqKAOgzzB5LIBgFBdlNKXMd+Xq8BhDXvilNbH2+LtrNIIbyJDPReZBkkO/T
JWGbcNJMox/lBzqtM+eIMvuf/2QHtUAB4xXSAH+oAno6PlV0S73Oo5E+tI++ax9j
WmdN3nwjhpEmwyovZwZtl+b0P14NXXAYKYOYClaRn3dFxEXnS7Fiq+EyGuOEQYBS
oTa2VXQrFZfpN55wtdwmXdwWDwNuE8UtEbkSpblyTw9HQqh3jnl8xKGnaWzw7HMN
D1yycBxMzuOrMiY5UBcfuzdU1K9Dgxn+ur8IfraXjjcEmZB+1TEB74/GCLK1/UJM
lBUrW9ZrE5e3COA/gqPq5ZQf4W9cp8xLBMWWXxnMSBEErpePDtt3EZP2H62sK+3a
wuyvEwRuiFskeUi/Tr7t89u4oXrcIZ7Q3bM3LKj3IWbsk/Znk8X03GCF86QU27zP
wlrVT2ef0/8DHEI7btraCb4=
=w/c4
-----END PGP SIGNATURE-----
Merge tag 'mips_5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- removed get_fs/set_fs
- removed broken/unmaintained MIPS KVM trap and emulate support
- added support for Loongson-2K1000
- fixes and cleanups
* tag 'mips_5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits)
MIPS: BCM63XX: Use BUG_ON instead of condition followed by BUG.
MIPS: select ARCH_KEEP_MEMBLOCK unconditionally
mips: Do not include hi and lo in clobber list for R6
MIPS:DTS:Correct the license for Loongson-2K
MIPS:DTS:Fix label name and interrupt number of ohci for Loongson-2K
MIPS: Avoid handcoded DIVU in `__div64_32' altogether
lib/math/test_div64: Correct the spelling of "dividend"
lib/math/test_div64: Fix error message formatting
mips/bootinfo:correct some comments of fw_arg
MIPS: Avoid DIVU in `__div64_32' is result would be zero
MIPS: Reinstate platform `__div64_32' handler
div64: Correct inline documentation for `do_div'
lib/math: Add a `do_div' test module
MIPS: Makefile: Replace -pg with CC_FLAGS_FTRACE
MIPS: pci-legacy: revert "use generic pci_enable_resources"
MIPS: Loongson64: Add kexec/kdump support
MIPS: pci-legacy: use generic pci_enable_resources
MIPS: pci-legacy: remove busn_resource field
MIPS: pci-legacy: remove redundant info messages
MIPS: pci-legacy: stop using of_pci_range_to_resource
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmCJUfIACgkQnJ2qBz9k
QNkStAf8CA7beya7LZ/GGN7HzXhv2cs+IpUFhRkynLklEM0lxKsOEagLFSZxkoMD
IBSRSo4odkkderqI9W/yp+9OYhOd9+BQCq4isg1Gh9Tf5xANJEpLvBAPnWVhooJs
9CrYZQY9Bdf+fF/8GHbKlrMAYm56vBCmWqyWTEtWUyPBOA12in2ZHQJmCa+5+nge
zTT/B5cvuhN5K7uYhGM4YfeCU5DBmmvD4sV6YBTkQOgCU0bEF0f9R3JjHDo34a1s
yqna3ypqKNRhsJVs8F+aOGRieUYxFoRqtYNHZK3qI9i07v7ndoTm5jzGN6OFlKs3
U3rF9/+cBgeESahWG6IjHIqhXGXNhg==
=KjNm
-----END PGP SIGNATURE-----
Merge tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
- support for limited fanotify functionality for unpriviledged users
- faster merging of fanotify events
- a few smaller fsnotify improvements
* tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
shmem: allow reporting fanotify events with file handles on tmpfs
fs: introduce a wrapper uuid_to_fsid()
fanotify_user: use upper_32_bits() to verify mask
fanotify: support limited functionality for unprivileged users
fanotify: configurable limits via sysfs
fanotify: limit number of event merge attempts
fsnotify: use hash table for faster events merge
fanotify: mix event info and pid into merge key hash
fanotify: reduce event objectid to 29-bit hash
fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmCJU1UACgkQnJ2qBz9k
QNk62AgAgp05OIXU/AgObb7DvSyI3ycwCV8PeWBpwD8yoDAh5x0tmT7vnJu974p6
yHdnF7rr69ZzvbNCHLJ5kRykRlUao9W7cO5fdOW1uTpL7Ic60QuJMks/NfgVTHp1
2zIQmBDerfn1/LTK8r2pPGcvtcjRcr7Ep4beN0Duw57lfVMJhjsNRPnBbXGBcp0r
QzKk4/8V3DCZvOw+XNC3nto7avjvf+nU9sJmuh83546eqh0atjWivvO5aAlDOe6W
rhBiLlmP0in5u2n1fYqzI1OQvtgtleyEZT2G0CrbAZn0xjmV/if9wl+3K6TOwDvR
778xDEX7sZCaO/xkB+WK3hrd15ftKg==
=0kYE
-----END PGP SIGNATURE-----
Merge tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota, ext2, reiserfs updates from Jan Kara:
- support for path (instead of device) based quotactl syscall
(quotactl_path(2))
- ext2 conversion to kmap_local()
- other minor cleanups & fixes
* tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fs/reiserfs/journal.c: delete useless variables
fs/ext2: Replace kmap() with kmap_local_page()
ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry()
fs/ext2/: fix misspellings using codespell tool
quota: report warning limits for realtime space quotas
quota: wire up quotactl_path
quota: Add mountpath based quota support
- Various minor fixes in online scrub.
- Prevent metadata files from being automatically inactivated.
- Validate btree heights by the computed per-btree limits.
- Don't warn about remounting with deprecated mount options.
- Initialize attr forks at create time if we suspect we're going to need
to store them.
- Reduce memory reallocation workouts in the logging code.
- Fix some theoretical math calculation errors in logged buffers that
span multiple discontig memory ranges but contiguous ondisk regions.
- Speedups in dirty buffer bitmap handling.
- Make type verifier functions more inline-happy to reduce overhead.
- Reduce debug overhead in directory checking code.
- Many many typo fixes.
- Begin to handle the permanent loss of the very end of a filesystem.
- Fold struct xfs_icdinode into xfs_inode.
- Deprecate the long defunct BMV_IF_NO_DMAPI_READ from the bmapx ioctl.
- Remove a broken directory block format check from online scrub.
- Fix a bug where we could produce an unnecessarily tall data fork btree
when creating an attr fork.
- Fix scrub and readonly remounts racing.
- Fix a writeback ioend log deadlock problem by dropping the behavior
where we could preallocate a setfilesize transaction.
- Fix some bugs in the new extent count checking code.
- Fix some bugs in the attr fork preallocation code.
- Refactor if_flags out of the incore inode fork data structure.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmB6MFUACgkQ+H93GTRK
tOvigBAAlpzBUXnZVo+U18u0tSHnq5c1zbXYcf5GPhQv9w3n3TlPi3YhK2vgEXlI
TULwsdU+an30oqWkQiVrwQjKPVaTWeWE3K0sA2MlYX9L2CwPPde4x5hwhyppfQxq
mQyu0suWp480ao7vToXAgZ751OdZRtGu8sRQ7rVQ/FVf9K4R8EqpZMEynNry25f+
hpK235hpf4IUC9E1A4pE2hNBSr/LGPIyu1t5sZsfazcNmtpKcauy5R5b8Pdnzo2/
WFa6PoeE8SRIp4OxZY/c/4QUI5cRubJGyoB+kbl0hg69uYIJO+pc+R69yrQPD9Z+
JDW/FktH+Zz4pstFsC+qnSvhRaF2DvXpvXrIldonQ2Z2ByVqbs3r6HzKySlWQ+QE
jU717HApWl/ADI/kVD2IuQnrbU+Q8Ue8thzgQeEpTRWsea2HzPMofNi5FImU2ulw
g4V7PleQWJ6AsLhcpfA46Y+CUAtjTD1Tvj67JpXuWJ+MFTB4hRm3U7zgCtV/0c3T
wBBUybQjDoVA6DDr6CP/9ki1k0BO3wKJGlZMR0bkEsuxXdFNTvHEz5lmueYT/Wxc
D91+oRbna9NpEeIVFGo6lhMIu2t0iYssFdgQKyn1jXrpGXKvOklP8zDjRdPnnQmz
plT2ajlXPIjc6KjOTP2mbVqKs059LuJoYV7gIWwM7CgtFsMIrd8=
=oRKe
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"The notable user-visible addition this cycle is ability to remove
space from the last AG in a filesystem. This is the first of many
changes needed for full-fledged support for shrinking a filesystem.
Still needed are (a) the ability to reorganize files and metadata away
from the end of the fs; (b) the ability to remove entire allocation
groups; (c) shrink support for realtime volumes; and (d) thorough
testing of (a-c).
There are a number of performance improvements in this code drop: Dave
streamlined various parts of the buffer logging code and reduced the
cost of various debugging checks, and added the ability to pre-create
the xattr structures while creating files. Brian eliminated
transaction reservations that were being held across writeback (thus
reducing livelock potential.
Other random pieces: Pavel fixed the repetitve warnings about
deprecated mount options, I fixed online fsck to behave itself when a
readonly remount comes in during scrub, and refactored various other
parts of that code, Christoph contributed a lot of refactoring this
cycle. The xfs_icdinode structure has been absorbed into the (incore)
xfs_inode structure, and the format and flags handling around
xfs_inode_fork structures has been simplified. Chandan provided a
number of fixes for extent count overflow related problems that have
been shaken out by debugging knobs added during 5.12.
Summary:
- Various minor fixes in online scrub.
- Prevent metadata files from being automatically inactivated.
- Validate btree heights by the computed per-btree limits.
- Don't warn about remounting with deprecated mount options.
- Initialize attr forks at create time if we suspect we're going to
need to store them.
- Reduce memory reallocation workouts in the logging code.
- Fix some theoretical math calculation errors in logged buffers that
span multiple discontig memory ranges but contiguous ondisk
regions.
- Speedups in dirty buffer bitmap handling.
- Make type verifier functions more inline-happy to reduce overhead.
- Reduce debug overhead in directory checking code.
- Many many typo fixes.
- Begin to handle the permanent loss of the very end of a filesystem.
- Fold struct xfs_icdinode into xfs_inode.
- Deprecate the long defunct BMV_IF_NO_DMAPI_READ from the bmapx
ioctl.
- Remove a broken directory block format check from online scrub.
- Fix a bug where we could produce an unnecessarily tall data fork
btree when creating an attr fork.
- Fix scrub and readonly remounts racing.
- Fix a writeback ioend log deadlock problem by dropping the behavior
where we could preallocate a setfilesize transaction.
- Fix some bugs in the new extent count checking code.
- Fix some bugs in the attr fork preallocation code.
- Refactor if_flags out of the incore inode fork data structure"
* tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (77 commits)
xfs: remove xfs_quiesce_attr declaration
xfs: remove XFS_IFEXTENTS
xfs: remove XFS_IFINLINE
xfs: remove XFS_IFBROOT
xfs: only look at the fork format in xfs_idestroy_fork
xfs: simplify xfs_attr_remove_args
xfs: rename and simplify xfs_bmap_one_block
xfs: move the XFS_IFEXTENTS check into xfs_iread_extents
xfs: drop unnecessary setfilesize helper
xfs: drop unused ioend private merge and setfilesize code
xfs: open code ioend needs workqueue helper
xfs: drop submit side trans alloc for append ioends
xfs: fix return of uninitialized value in variable error
xfs: get rid of the ip parameter to xchk_setup_*
xfs: fix scrub and remount-ro protection when running scrub
xfs: move the check for post-EOF mappings into xfs_can_free_eofblocks
xfs: move the xfs_can_free_eofblocks call under the IOLOCK
xfs: precalculate default inode attribute offset
xfs: default attr fork size does not handle device inodes
xfs: inode fork allocation depends on XFS_IFEXTENT flag
...
- Fix some compiler and kernel-doc warnings.
- Various minor cleanups and optimizations.
- Add a new sysfs gfs2 status file with some filesystem wide
information.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmCKiWIUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqXOA//cgEMi+WZ0pQ1m4Z7Yk58ArAGXOW4
L+efdMjk2zoqgixF502tQzaa2ctz6XpukF4oZbM+Jc+yZxrbZ5CLjUIOWc9RH+Id
WQwj5+5GLbMAPnn5ksHUCTK9V+1oAlpgoY4fMtLdKq234Y6xqWj5qBjvtGUTLFAl
ACvy8FUZplFOkaOSBqgh221LT4Oh0Wthe/Elq5qvqwBfdAiE/p1sHSi2FWxktIlU
wV3PKL96rFsnWN8E6jqyJR1RNJ5d5MYA+PDkTHKcoqcXZrzw4mfu2tCh88Bh9wFb
MEyjtLxE09G1+3Li/T/Tb7qbRKWvxEmkLZXaFAjRUp7zYPvM6twKSg8nihcBDtLi
UgvTrc208CYvYj7QpRQ1dU9lEg47A46rB8dgLz+ymlpNNk/G0gqgvWLevMKnBfaX
AkZviI6qm1iNCBd6wWWPUKqR0qrCWqoe9N8F7cWyZBki7dKkoj29Gt1X1SeIQMjd
n8Mkv6Btd39kBt3DydXlCEaREMQYeDrxBJHxur234hEfFLFraFj5tYFjoeSODZdg
Uxsn5X5dgLy/hHjps8YcuBoEgRMR/aKovK5G1FXDcQR5O6UByqtJuJTJBT8jxAld
vLYHqO6vxdgGATaYAkuGLSLrJjwES+7tEXjtrdarswGo55dPitwtOB1NRWuOor/Z
uTnzJbykMdIzEHs=
=VblJ
-----END PGP SIGNATURE-----
Merge tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Fix some compiler and kernel-doc warnings
- Various minor cleanups and optimizations
- Add a new sysfs gfs2 status file with some filesystem wide
information
* tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Fix fall-through warnings for Clang
gfs2: Fix a number of kernel-doc warnings
gfs2: Make gfs2_setattr_simple static
gfs2: Add new sysfs file for gfs2 status
gfs2: Silence possible null pointer dereference warning
gfs2: Turn gfs2_meta_indirect_buffer into gfs2_meta_buffer
gfs2: Replace gfs2_lblk_to_dblk with gfs2_get_extent
gfs2: Turn gfs2_extent_map into gfs2_{get,alloc}_extent
gfs2: Add new gfs2_iomap_get helper
gfs2: Remove unused variable sb_format
gfs2: Fix dir.c function parameter descriptions
gfs2: Eliminate gh parameter from go_xmote_bh func
gfs2: don't create empty buffers for NO_CREATE
- Improve write performance with dirsync mount option.
- Improve lookup performance.
- Add support for FITRIM ioctl.
- Fix a bug with discard option.
-----BEGIN PGP SIGNATURE-----
iQJMBAABCgA2FiEE6NzKS6Uv/XAAGHgyZwv7A1FEIQgFAmCIADwYHG5hbWphZS5q
ZW9uQHNhbXN1bmcuY29tAAoJEGcL+wNRRCEIalwP/jepL2Y7B8FOhMH1R4whbY6x
5fFlG608tR7jxUEMpAfMTcFC2+5nKpHN1keoHV8SXdJz2esd1ZUknaok5CIVQFYX
I7TTOWn/genJ6nVmHeTLZLAxAWznSlIqRdeuSnDMLMKSLDn7noW5l+vzgYy5dnaF
SzICtNXJqA7D7PcMIwTtfszcSEwmt1aLiRZZabZ+y0xSF2ha0mRB6B3hCx9sDXh5
iVVbSn10lk6ULENYcjaUZhF1Dt9Lv4pOZ9drr8bVfRmWBeWspB/X4/TgO+Mf/xP8
5el7OL965ofUaaCmhyfDMrWCBffeFbf0K8sUM/psiAUuKTugKvlZniZ0RB8zLEvZ
Bn2yvr+walpdkVNBbv3qi32/4xAx5ng90rSdqX3bvY3Zq1UDW49MIIZOh6uQCNnD
9ravgGoU2ujVcUQUKEU01dj1ora83IKQ7WuxcGXQOt/xNFMRG1FmiKgS7aj2HgMM
ax9pjZZY4H8ZUZOC2hpph+x6Kc+ZQDqzR6+0qbWM2/hlLgIURDXTqVe3m1DCuvDI
c2rvtpVtpYGz+HTZW+krUzc4kOAfUsS98D+7bLS5X/f4x4e5FxQDGyXaCNUI0OCr
ed8kHmHQsKvbIui4RHgZvZylj2xO34arSuMURL41ifuT9jFoQVxPcGluXNvbdcu8
yOyqeaqwgh5AJzXPVsRv
=igtQ
-----END PGP SIGNATURE-----
Merge tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Improve write performance with dirsync mount option
- Improve lookup performance
- Add support for FITRIM ioctl
- Fix a bug with discard option
* tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: speed up iterate/lookup by fixing start point of traversing cluster chain
exfat: improve write performance when dirsync enabled
exfat: add support ioctl and FITRIM function
exfat: introduce bitmap_lock for cluster bitmap access
exfat: fix erroneous discard when clear cluster bit
This series consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core
change is using a sbitmap instead of an atomic for queue tracking.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYInvqCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYh2AP0SgqqL
WYZRT2oiyBOKD28v+ceOSiXvgjPlqABwVMC0BAEAn29/wNCxyvzZ1k/b0iPJ4M+S
klkSxLzXKQLzJBgdK5w=
=p5B/
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).
The major core change is using a sbitmap instead of an atomic for
queue tracking"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits)
scsi: target: tcm_fc: Fix a kernel-doc header
scsi: target: Shorten ALUA error messages
scsi: target: Fix two format specifiers
scsi: target: Compare explicitly with SAM_STAT_GOOD
scsi: sd: Introduce a new local variable in sd_check_events()
scsi: dc395x: Open-code status_byte(u8) calls
scsi: 53c700: Open-code status_byte(u8) calls
scsi: smartpqi: Remove unused functions
scsi: qla4xxx: Remove an unused function
scsi: myrs: Remove unused functions
scsi: myrb: Remove unused functions
scsi: mpt3sas: Fix two kernel-doc headers
scsi: fcoe: Suppress a compiler warning
scsi: libfc: Fix a format specifier
scsi: aacraid: Remove an unused function
scsi: core: Introduce enum scsi_disposition
scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
scsi: core: Rename scsi_softirq_done() into scsi_complete()
scsi: core: Remove an incorrect comment
scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
...
usual suspects are here: i.MX, Qualcomm, Renesas, Allwinner, Samsung, and
Rockchip, but it feels pretty light on commits. There's only one real commit to
the framework core and that's to consolidate code. Otherwise the diffstat is
dominated by many Qualcomm clk driver patches that modernize the driver for the
proper way of speciying clk parents. That's shifting data around, which could
subtly break things so I'll be on the lookout for fixes.
New Drivers:
- Proper clk driver for Mediatek MT7621 SoCs
- Support for the clock controller on the new Rockchip rk3568
Updates:
- Simplify Zynq Kconfig dependencies
- Use clk_hw pointers in socfpga driver
- Cleanup parent data in qcom clk drivers
- Some cleanups for rk3399 modularization
- Fix reparenting of i.MX UART clocks by initializing only the ones
associated to stdout
- Correct the PCIE clocks for i.MX8MP and i.MX8MQ
- Make i.MX LPCG and SCU clocks return on registering failure
- Kernel doc fixes
- Add DAB hardware accelerator clocks on Renesas R-Car E3 and M3-N
- Add timer (TMU) clocks on Renesas R-Car H3 ES1.0
- Add Timer (TMU & CMT) and thermal sensor (TSC) clocks on Renesas R-Car V3U
- Sigma-delta modulation on Allwinner V3s audio PLL
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmCJ9PcRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSXEFhAAuWMm47mZjE4TG+6qizcGC7kEDwS9Rs+j
0oYzdWBmSn7vGBZCCdr/IgvjK3CviNDXE076w5ntZm5kajD3yUOBNkFpHxOa9la7
NuH+OJiJQfzfzwFPwQmMZ0TjeaPDLqXcneKC80nYnauo7HScnySdiLdAmEnzJn7v
3a6Jf0Wzv1QIMZxdX3HgHts/VcWwonNJ1IGPUl5Ac7tJgUKLi+l1R27dow9eCH83
7KPFXjCsiE9jwIBvEUEvxguz5SPQujSHc81cLp8FnIsY7YsTx48NaHQrpcqQM8f8
zliC29QIjvWr8BGMKcHmuDwEG68gDdjHzmaKJSbcVTZ/jikQu+lOMNX96HRSdNe1
48UTk8Gqwyj9xI4OGC8A2ssLCGvDivSpdEfYoH7hE3K/+dxjwita9ZGIiD7Zp7C+
omjVGIUeSqoyQBXINC3sVwR+8nqCyR6ceRTxeELa8XwjPLwVg2JGPvJCKLHr5RED
TXtt3SYCcoKbdxqrBwOsMx1dhWvkx2f0iAKyUKt8jbAIE+bNwbqGOnWMaqiM+j8r
ixG8WL59087XYBjeO7kwn16Gszf5EkZWRTsA27tHni4hDUFp1ZpPRUJwvHgT1S5v
o1Tvo5AmtvT9HMasPa5cqlQh7FOrRb2FPXxaQRFR8VX6GQanWf9N+MP3THwRSNVn
2oadQvAJm/w=
=z+1G
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"Here's a collection of largely clk driver updates. The usual suspects
are here: i.MX, Qualcomm, Renesas, Allwinner, Samsung, and Rockchip,
but it feels pretty light on commits.
There's only one real commit to the framework core and that's to
consolidate code. Otherwise the diffstat is dominated by many Qualcomm
clk driver patches that modernize the driver for the proper way of
speciying clk parents. That's shifting data around, which could subtly
break things so I'll be on the lookout for fixes.
New Drivers:
- Proper clk driver for Mediatek MT7621 SoCs
- Support for the clock controller on the new Rockchip rk3568
Updates:
- Simplify Zynq Kconfig dependencies
- Use clk_hw pointers in socfpga driver
- Cleanup parent data in qcom clk drivers
- Some cleanups for rk3399 modularization
- Fix reparenting of i.MX UART clocks by initializing only the ones
associated to stdout
- Correct the PCIE clocks for i.MX8MP and i.MX8MQ
- Make i.MX LPCG and SCU clocks return on registering failure
- Kernel doc fixes
- Add DAB hardware accelerator clocks on Renesas R-Car E3 and M3-N
- Add timer (TMU) clocks on Renesas R-Car H3 ES1.0
- Add Timer (TMU & CMT) and thermal sensor (TSC) clocks on
Renesas R-Car V3U
- Sigma-delta modulation on Allwinner V3s audio PLL"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (82 commits)
MAINTAINERS: add MT7621 CLOCK maintainer
staging: mt7621-dts: use valid vendor 'mediatek' instead of invalid 'mtk'
staging: mt7621-dts: make use of new 'mt7621-clk'
clk: ralink: add clock driver for mt7621 SoC
clk: uniphier: Fix potential infinite loop
clk: qcom: rpmh: add support for SDX55 rpmh IPA clock
clk: qcom: gcc-sdm845: get rid of the test clock
clk: qcom: convert SDM845 Global Clock Controller to parent_data
dt-bindings: clock: separate SDM845 GCC clock bindings
clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE
clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
clk: qcom: a7-pll: Add missing MODULE_DEVICE_TABLE
dt: bindings: add mt7621-sysc device tree binding documentation
dt-bindings: clock: add dt binding header for mt7621 clocks
clk: samsung: Remove redundant dev_err calls
clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable
clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback
clk: zynqmp: Drop dependency on ARCH_ZYNQMP
clk: zynqmp: Enable the driver if ZYNQMP_FIRMWARE is selected
clk: qcom: gcc-sm8350: use ARRAY_SIZE instead of specifying num_parents
...
- Add support for Software Nodes to MFD Core
- Remove support for Device Properties from MFD Core
- Use standard APIs in MFD Core
- New Drivers
- Add support for ROHM BD9576MUF and BD9573MUF PMICs
- Add support for Netronix Embedded Controller, PWM and RTC
- Add support for Actions Semi ATC260x PMICs and OnKey
- New Device Support
- Add support for DG1 PCIe Graphics Card to Intel PMT
- Add support for ROHM BD71815 PMIC to ROHM BD71828
- Add support for Tolino Shine 2 HD to Netronix Embedded Controller
- Add support for AX10 BMC Secure Updates to Intel M10 BMC
- Removed Device Support
- Remove Arizona Extcon support from MFD
- Remove ST-E AB8500 Power Supply code from MFD
- Remove AB3100 altogether
- New Functionality
- Add support for SMBus and I2C modes to Dialog DA9063
- Switch to using Software Nodes in Intel (various)
- New/converted Device Tree bindings; rohm,bd71815-pmic, rohm,bd9576-pmic,
netronix,ntxec, actions,atc260x,
ricoh,rn5t618, qcom-pm8xxx
- Fix-ups
- Fix error handling/path; intel_pmt
- Simplify code; rohm-bd718x7, ab8500-core, intel-m10-bmc
- Trivial clean-ups (reordering, spelling); rohm-generic, rn5t618, max8997
- Use correct data-type; db8500-prcmu
- Remove superfluous code; lp87565, intel_quark_i2c_gpi, lpc_sch, twl
- Use generic APIs/defines; lm3533-core, intel_quark_i2c_gpio
- Regmap related fix-ups; intel-m10-bmc, sec-core
- Reorder resource freeing during remove; intel_quark_i2c_gpio
- Make table indexing more robust; intel_quark_i2c_gpio
- Fix reference imbalances; arizona-irq
- Staticify and (un)constify things; arizona-spi, stmpe, ene-kb3930,
intel-lpss-acpi, intel-lpss-pci,
atc260x-i2c, intel_quark_i2c_gpio
- Bug Fixes
- Fix incorrect (register) values; intel-m10-bmc
- Kconfig related fixes; ABX500_CORE
- Do not clear the Auto Reload Register; stm32-timers
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAmCJIPEACgkQUa+KL4f8
d2FRZA//Xu9f8u2uLuIfuhxIjUUXOqIjRAFnkhKlgGZhKsY8BohjQ80Tj9yp6UKy
St6ABwACO0hJap4zL4FxPW9+HXTmqZvAibnvHnvZdYSQ3ai6x9h6kTNvhSNLeRQU
fuY7eN8kpAHHiHNKNJCsQLQMvcIyP7+0KAP6qir5GYsMjiXspWq7THUnfBi2JXC6
y60guDo9XrgmQTO+pB870UJrKLM/h+iiohNRGxLFlShKhFCgbTB/wyw6bFeKy1SB
0/6XuY6fOt1IQyBDuzw383Q2faMWO9U+es29bwvFxdqJDK0MHQXC47zBba2q94wL
/9i/HSoz9dRHnTJNYUKWsVcPv4T84w/Iq7scyDvE00ubehJ+oo/M7Au3M6Tt3M1/
6lBAwFYXiwhQnp9EP3nwPwgJF6JzX1IGuMOsUAqrVFOEMuIkZKbRdUlatUhqepJT
spV4/TOfztAhY/7BzEOZLnF8cFNjmL5sn42/UzSRW708V5SxuTNsS48KJ4l0c7Er
CZSTlR/T1rKkWqf7ejaS2TNqMCdYyB3vZW0quDxZTHTZHv9huNUvtbKPR7jmd+4p
mrMIik7EE4BzC5m8tBPnXXZl+Og0keeYv4LUDBuLDX1agrxYIErl4ITvQTqqMfX1
Jt14SIjSO56iu2ngQuvGWwegVK4/urO2kBJKUAH1QN1OocNaajQ=
=IJSL
-----END PGP SIGNATURE-----
Merge tag 'mfd-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"Core Framework:
- Add support for Software Nodes to MFD Core
- Remove support for Device Properties from MFD Core
- Use standard APIs in MFD Core
New Drivers:
- Add support for ROHM BD9576MUF and BD9573MUF PMICs
- Add support for Netronix Embedded Controller, PWM and RTC
- Add support for Actions Semi ATC260x PMICs and OnKey
New Device Support:
- Add support for DG1 PCIe Graphics Card to Intel PMT
- Add support for ROHM BD71815 PMIC to ROHM BD71828
- Add support for Tolino Shine 2 HD to Netronix Embedded Controller
- Add support for AX10 BMC Secure Updates to Intel M10 BMC
Removed Device Support:
- Remove Arizona Extcon support from MFD
- Remove ST-E AB8500 Power Supply code from MFD
- Remove AB3100 altogether
New Functionality:
- Add support for SMBus and I2C modes to Dialog DA9063
- Switch to using Software Nodes in Intel (various)
New/converted Device Tree bindings:
- rohm bd71815-pmic, rohm bd9576-pmic, netronix ntxec, actions
atc260x, ricoh rn5t618, qcom pm8xxx
- Fix-ups:
- Fix error handling/path; intel_pmt
- Simplify code; rohm-bd718x7, ab8500-core, intel-m10-bmc
- Trivial clean-ups (reordering, spelling); rohm-generic, rn5t618,
max8997
- Use correct data-type; db8500-prcmu
- Remove superfluous code; lp87565, intel_quark_i2c_gpi, lpc_sch, twl
- Use generic APIs/defines; lm3533-core, intel_quark_i2c_gpio
- Regmap related fix-ups; intel-m10-bmc, sec-core
- Reorder resource freeing during remove; intel_quark_i2c_gpio
- Make table indexing more robust; intel_quark_i2c_gpio
- Fix reference imbalances; arizona-irq
- Staticify and (un)constify things; arizona-spi, stmpe, ene-kb3930,
intel-lpss-acpi, intel-lpss-pci, atc260x-i2c, intel_quark_i2c_gpio
Bug Fixes:
- Fix incorrect (register) values; intel-m10-bmc
- Kconfig related fixes; ABX500_CORE
- Do not clear the Auto Reload Register; stm32-timers"
* tag 'mfd-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (84 commits)
mfd: intel-m10-bmc: Add support for MAX10 BMC Secure Updates
Revert "mfd: max8997: Add of_compatible to Extcon and Charger mfd_cell"
mfd: twl: Remove unused inline function twl4030charger_usb_en()
dt-bindings: mfd: Convert pm8xxx bindings to yaml
dt-bindings: mfd: Add compatible for pmk8350 rtc
i2c: designware: Get rid of legacy platform data
mfd: intel_quark_i2c_gpio: Convert I²C to use software nodes
mfd: lpc_sch: Partially revert "Add support for Intel Quark X1000"
mfd: arizona: Fix rumtime PM imbalance on error
mfd: max8997: Replace 8998 with 8997
mfd: core: Use acpi_find_child_device() for child devices lookup
mfd: intel_quark_i2c_gpio: Don't play dirty trick with const
mfd: intel_quark_i2c_gpio: Enable MSI interrupt
mfd: intel_quark_i2c_gpio: Reuse BAR definitions for MFD cell indexing
mfd: ntxec: Support for EC in Tolino Shine 2 HD
mfd: stm32-timers: Avoid clearing auto reload register
mfd: intel_quark_i2c_gpio: Replace I²C speeds with descriptive definitions
mfd: intel_quark_i2c_gpio: Remove unused struct device member
mfd: intel_quark_i2c_gpio: Unregister resources in reversed order
mfd: Kconfig: ABX500_CORE should depend on ARCH_U8500
...