linux-sg2042/drivers/md
Ming Lei 396eaf21ee blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback
blk_insert_cloned_request() is called in the fast path of a dm-rq driver
(e.g. blk-mq request-based DM mpath).  blk_insert_cloned_request() uses
blk_mq_request_bypass_insert() to directly append the request to the
blk-mq hctx->dispatch_list of the underlying queue.

1) This way isn't efficient enough because the hctx spinlock is always
used.

2) With blk_insert_cloned_request(), we completely bypass underlying
queue's elevator and depend on the upper-level dm-rq driver's elevator
to schedule IO.  But dm-rq currently can't get the underlying queue's
dispatch feedback at all.  Without knowing whether a request was issued
or not (e.g. due to underlying queue being busy) the dm-rq elevator will
not be able to provide effective IO merging (as a side-effect of dm-rq
currently blindly destaging a request from its elevator only to requeue
it after a delay, which kills any opportunity for merging).  This
obviously causes very bad sequential IO performance.

Fix this by updating blk_insert_cloned_request() to use
blk_mq_request_direct_issue().  blk_mq_request_direct_issue() allows a
request to be issued directly to the underlying queue and returns the
dispatch feedback (blk_status_t).  If blk_mq_request_direct_issue()
returns BLK_SYS_RESOURCE the dm-rq driver will now use DM_MAPIO_REQUEUE
to _not_ destage the request.  Whereby preserving the opportunity to
merge IO.

With this, request-based DM's blk-mq sequential IO performance is vastly
improved (as much as 3X in mpath/virtio-scsi testing).

Signed-off-by: Ming Lei <ming.lei@redhat.com>
[blk-mq.c changes heavily influenced by Ming Lei's initial solution, but
they were refactored to make them less fragile and easier to read/review]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17 09:46:54 -07:00
..
bcache bcache: closures: move control bits one bit right 2018-01-09 12:18:51 -07:00
persistent-data - A few conversions from atomic_t to ref_count_t 2017-11-14 15:50:56 -08:00
Kconfig md-cluster: update document for raid10 2017-11-01 21:32:25 -07:00
Makefile Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
dm-bio-prison-v1.c dm bio prison: use rb_entry() rather than container_of() 2017-06-19 11:03:50 -04:00
dm-bio-prison-v1.h block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
dm-bio-prison-v2.c dm bio prison: use rb_entry() rather than container_of() 2017-06-19 11:03:50 -04:00
dm-bio-prison-v2.h dm bio prison v2: new interface for the bio prison 2017-03-07 11:30:16 -05:00
dm-bio-record.h block: replace bi_bdev with a gendisk pointer and partitions index 2017-08-23 12:49:55 -06:00
dm-bufio.c dm bufio: fix shrinker scans when (nr_to_scan < retain_target) 2017-12-08 10:54:25 -05:00
dm-bufio.h dm integrity: optimize writing dm-bufio buffers that are partially changed 2017-08-28 11:47:17 -04:00
dm-builtin.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dm-cache-background-tracker.c dm cache background tracker: limit amount of background work that may be issued at once 2017-11-10 15:45:03 -05:00
dm-cache-background-tracker.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-block-types.h linux: drop __bitwise__ everywhere 2016-12-16 00:13:41 +02:00
dm-cache-metadata.c dm cache: convert dm_cache_metadata.ref_count from atomic_t to refcount_t 2017-10-24 15:09:51 -04:00
dm-cache-metadata.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-policy-internal.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-policy-smq.c dm cache policy smq: allocate cache blocks in order 2017-11-10 15:45:05 -05:00
dm-cache-policy.c
dm-cache-policy.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-target.c dm: fix various targets to dm_register_target after module __init resources created 2017-12-04 10:23:10 -05:00
dm-core.h dm: allocate struct mapped_device with kvzalloc 2017-11-10 15:44:55 -05:00
dm-crypt.c dm-crypt: don't clear bvec->bv_page in crypt_free_buffer_pages() 2018-01-06 09:18:00 -07:00
dm-delay.c md: Convert timers to use timer_setup() 2017-11-14 20:11:57 -07:00
dm-era-target.c dm: do not set 'discards_supported' in targets that do not need it 2017-11-16 16:33:54 -05:00
dm-exception-store.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-exception-store.h dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-flakey.c - Some request-based DM core and DM multipath fixes and cleanups 2017-09-14 13:43:16 -07:00
dm-integrity.c md: Convert timers to use timer_setup() 2017-11-14 20:11:57 -07:00
dm-io.c block: replace bi_bdev with a gendisk pointer and partitions index 2017-08-23 12:49:55 -06:00
dm-ioctl.c dm ioctl: fix alignment of event number in the device list 2017-09-25 11:18:29 -04:00
dm-kcopyd.c locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() 2017-10-25 11:01:08 +02:00
dm-linear.c - Some request-based DM core and DM multipath fixes and cleanups 2017-09-14 13:43:16 -07:00
dm-log-userspace-base.c dm: drop NULL test before kmem_cache_destroy() and mempool_destroy() 2015-10-31 19:06:00 -04:00
dm-log-userspace-transfer.c dm log userspace transfer: match wait_for_completion_timeout return type 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.h
dm-log-writes.c dm log writes: add support for DAX 2017-11-10 15:44:51 -05:00
dm-log.c block,fs: use REQ_* flags directly 2016-11-01 09:43:26 -06:00
dm-mpath.c dm mpath: Use blk_path_error 2018-01-10 10:52:20 -07:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-queue-length.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-raid.c - A DM multipath stable@ fix to silence an annoying error message that 2017-11-17 09:40:12 -08:00
dm-raid1.c md: Convert timers to use timer_setup() 2017-11-14 20:11:57 -07:00
dm-region-hash.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-round-robin.c dm round robin: revert "use percpu 'repeat_count' and 'current_path'" 2017-02-17 00:54:09 -05:00
dm-rq.c blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback 2018-01-17 09:46:54 -07:00
dm-rq.h dm rq: do not update rq partially in each ending bio 2017-08-28 10:23:28 -04:00
dm-service-time.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-snap-persistent.c dm: make flush bios explicitly sync 2017-05-31 10:50:23 -04:00
dm-snap-transient.c dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-snap.c dm: fix various targets to dm_register_target after module __init resources created 2017-12-04 10:23:10 -05:00
dm-stats.c Merge branch 'linus' into locking/core, to resolve conflicts 2017-11-07 10:32:44 +01:00
dm-stats.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dm-stripe.c - Some request-based DM core and DM multipath fixes and cleanups 2017-09-14 13:43:16 -07:00
dm-switch.c locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() 2017-10-25 11:01:08 +02:00
dm-sysfs.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-table.c dm table: fix regression from improper dm_dev_internal.count refcount_t conversion 2017-12-04 10:23:10 -05:00
dm-target.c dm: don't return errnos from ->map 2017-06-09 09:27:32 -06:00
dm-thin-metadata.c dm thin metadata: call precommit before saving the roots 2017-05-15 15:09:49 -04:00
dm-thin-metadata.h dm thin: fix a race condition between discarding and provisioning a block 2016-07-20 12:43:35 -04:00
dm-thin.c dm: fix various targets to dm_register_target after module __init resources created 2017-12-04 10:23:10 -05:00
dm-uevent.c
dm-uevent.h
dm-verity-fec.c dm verity fec: fix GFP flags used with mempool_alloc() 2017-07-26 15:55:44 -04:00
dm-verity-fec.h dm verity fec: limit error correction recursion 2017-03-16 09:37:31 -04:00
dm-verity-target.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2017-11-14 10:52:09 -08:00
dm-verity.h dm: move dm-verity to generic async completion 2017-11-03 22:11:20 +08:00
dm-zero.c dm: don't return errnos from ->map 2017-06-09 09:27:32 -06:00
dm-zoned-metadata.c block: replace bi_bdev with a gendisk pointer and partitions index 2017-08-23 12:49:55 -06:00
dm-zoned-reclaim.c dm zoned: use GFP_NOIO in I/O path 2017-07-26 15:55:43 -04:00
dm-zoned-target.c dm zoned: ignore last smaller runt zone 2017-11-10 15:44:53 -05:00
dm-zoned.h dm zoned: drive-managed zoned block device target 2017-06-19 11:05:20 -04:00
dm.c dm: fix incomplete request_queue initialization 2018-01-15 08:54:32 -07:00
dm.h dm: convert dm_dev_internal.count from atomic_t to refcount_t 2017-10-24 15:09:51 -04:00
md-bitmap.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-bitmap.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-cluster.c md-cluster: update document for raid10 2017-11-01 21:32:25 -07:00
md-cluster.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
md-faulty.c md: rename some drivers/md/ files to have an "md-" prefix 2017-10-16 19:06:36 -07:00
md-linear.c md: rename some drivers/md/ files to have an "md-" prefix 2017-10-16 19:06:36 -07:00
md-linear.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-multipath.c md: remove redundant variable q 2017-11-01 21:32:24 -07:00
md-multipath.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md.c md: limit mdstat resync progress to max_sectors 2017-12-01 12:19:47 -08:00
md.h md: use lockdep_assert_held 2017-11-01 21:32:22 -07:00
raid0.c md: remove special meaning of ->quiesce(.., 2) 2017-11-01 21:32:20 -07:00
raid0.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid1-10.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid1.c md/raid1/10: add missed blk plug 2017-12-01 12:19:48 -08:00
raid1.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid5-cache.c md/r5cache: move mddev_lock() out of r5c_journal_mode_set() 2017-12-01 11:27:32 -08:00
raid5-log.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
raid5-ppl.c raid5-ppl: check recovery_offset when performing ppl recovery 2017-10-16 19:06:34 -07:00
raid5.c md/raid5: correct degraded calculation in raid5_error 2017-12-01 11:27:32 -08:00
raid5.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid10.c md/raid1/10: add missed blk plug 2017-12-01 12:19:48 -08:00
raid10.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00