linux-sg2042/block
Liu Bo 991f61fe7e Blk-throttle: reduce tail io latency when iops limit is enforced
When an application's iops has exceeded its cgroup's iops limit, surely it
is throttled and kernel will set a timer for dispatching, thus IO latency
includes the delay.

However, the dispatch delay which is calculated by the limit and the
elapsed jiffies is suboptimal.  As the dispatch delay is only calculated
once the application's iops is (iops limit + 1), it doesn't need to wait
any longer than the remaining time of the current slice.

The difference can be proved by the following fio job and cgroup iops
setting,
-----
$ echo 4 > /mnt/config/nullb/disk1/mbps    # limit nullb's bandwidth to 4MB/s for testing.
$ echo "253:1 riops=100 rbps=max" > /sys/fs/cgroup/unified/cg1/io.max
$ cat r2.job
[global]
name=fio-rand-read
filename=/dev/nullb1
rw=randread
bs=4k
direct=1
numjobs=1
time_based=1
runtime=60
group_reporting=1

[file1]
size=4G
ioengine=libaio
iodepth=1
rate_iops=50000
norandommap=1
thinktime=4ms
-----

wo patch:
file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7-66-gedfc
Starting 1 process

   read: IOPS=99, BW=400KiB/s (410kB/s)(23.4MiB/60001msec)
    slat (usec): min=10, max=336, avg=27.71, stdev=17.82
    clat (usec): min=2, max=28887, avg=5929.81, stdev=7374.29
     lat (usec): min=24, max=28901, avg=5958.73, stdev=7366.22
    clat percentiles (usec):
     |  1.00th=[    4],  5.00th=[    4], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    6], 60.00th=[11731],
     | 70.00th=[11863], 80.00th=[11994], 90.00th=[12911], 95.00th=[22676],
     | 99.00th=[23725], 99.50th=[23987], 99.90th=[23987], 99.95th=[25035],
     | 99.99th=[28967]

w/ patch:
file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7-66-gedfc
Starting 1 process

   read: IOPS=100, BW=400KiB/s (410kB/s)(23.4MiB/60005msec)
    slat (usec): min=10, max=155, avg=23.24, stdev=16.79
    clat (usec): min=2, max=12393, avg=5961.58, stdev=5959.25
     lat (usec): min=23, max=12412, avg=5985.91, stdev=5951.92
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    4], 20.00th=[    4],
     | 30.00th=[    4], 40.00th=[    5], 50.00th=[   47], 60.00th=[11863],
     | 70.00th=[11994], 80.00th=[11994], 90.00th=[11994], 95.00th=[11994],
     | 99.00th=[11994], 99.50th=[11994], 99.90th=[12125], 99.95th=[12125],
     | 99.99th=[12387]

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09 12:43:16 -06:00
..
partitions partitions/aix: append null character to print data from disk 2018-07-27 09:17:41 -06:00
Kconfig block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
Kconfig.iosched License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Makefile block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-11-03 11:29:50 -07:00
bfq-cgroup.c block: use ktime_get_ns() instead of sched_clock() for cfq and bfq 2018-05-09 08:33:06 -06:00
bfq-iosched.c block, bfq: give a better name to bfq_bfqq_may_idle 2018-07-09 09:07:52 -06:00
bfq-iosched.h block, bfq: add/remove entity weights correctly 2018-07-09 09:07:52 -06:00
bfq-wf2q.c block, bfq: fix service being wrongly set to zero in case of preemption 2018-07-09 09:07:52 -06:00
bio-integrity.c block: move bio_integrity_{intervals,bytes} into blkdev.h 2018-07-26 15:49:41 -06:00
bio.c block: bvec_nr_vecs() returns value for wrong slab 2018-08-09 08:25:04 -06:00
blk-cgroup.c blk-cgroup: hold the queue ref during throttling 2018-08-01 09:16:03 -06:00
blk-core.c block: Introduce blk_exit_queue() 2018-08-09 09:12:59 -06:00
blk-exec.c blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request 2018-01-17 09:49:21 -07:00
blk-flush.c block: fix use-after-free in block flush handling 2018-06-09 06:37:14 -06:00
blk-integrity.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
blk-ioc.c block, mm: remove unnecessary __GFP_HIGH flag 2018-07-09 09:07:54 -06:00
blk-iolatency.c block: make iolatency avg_lat exponentially decay 2018-08-02 09:58:14 -06:00
blk-lib.c block: fix infinite loop if the device loses discard capability 2018-07-09 09:07:54 -06:00
blk-map.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
blk-merge.c block: don't use blocking queue entered for recursive bio submits 2018-06-02 20:35:00 -06:00
blk-mq-cpumap.c blk-mq: don't keep offline CPUs mapped to hctx 0 2018-04-10 08:38:46 -06:00
blk-mq-debugfs-zoned.c block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n 2018-07-09 09:07:52 -06:00
blk-mq-debugfs.c blk-mq: dequeue request one by one from sw queue if hctx is busy 2018-07-09 09:07:53 -06:00
blk-mq-debugfs.h block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n 2018-07-09 09:07:52 -06:00
blk-mq-pci.c blk-mq: code clean-up by adding an API to clear set->mq_map 2018-07-09 09:07:53 -06:00
blk-mq-rdma.c block: Add rdma affinity based queue mapping helper 2017-08-08 14:58:03 -04:00
blk-mq-sched.c blk-mq: issue directly if hw queue isn't busy in case of 'none' 2018-07-17 16:04:00 -06:00
blk-mq-sched.h block: move sysfs_lock into elevator_init 2018-06-01 07:38:19 -06:00
blk-mq-sysfs.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
blk-mq-tag.c blk-mq: count the hctx as active before allocating tag 2018-08-09 08:34:17 -06:00
blk-mq-tag.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c blk-mq: count the hctx as active before allocating tag 2018-08-09 08:34:17 -06:00
blk-mq.h blk-mq: issue directly if hw queue isn't busy in case of 'none' 2018-07-17 16:04:00 -06:00
blk-rq-qos.c blk-rq-qos: make depth comparisons unsigned 2018-07-22 11:30:53 -06:00
blk-rq-qos.h blk-rq-qos: make depth comparisons unsigned 2018-07-22 11:30:53 -06:00
blk-settings.c block: allow max_discard_segments to be stacked 2018-07-24 14:46:39 -06:00
blk-softirq.c block: fix timeout changes for legacy request drivers 2018-06-19 11:27:18 -06:00
blk-stat.c blk-stat: export helpers for modifying blk_rq_stat 2018-07-09 09:07:54 -06:00
blk-stat.h blk-stat: export helpers for modifying blk_rq_stat 2018-07-09 09:07:54 -06:00
blk-sysfs.c block: Ensure that a request queue is dissociated from the cgroup controller 2018-08-09 09:13:00 -06:00
blk-tag.c for-linus-20180616 2018-06-17 05:37:55 +09:00
blk-throttle.c Blk-throttle: reduce tail io latency when iops limit is enforced 2018-08-09 12:43:16 -06:00
blk-timeout.c blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE 2018-06-23 10:25:45 -06:00
blk-wbt.c blk-wbt: Avoid lock contention and thundering herd issue in wbt_wait 2018-08-07 14:40:49 -06:00
blk-wbt.h block: remove external dependency on wbt_flags 2018-07-09 09:07:54 -06:00
blk-zoned.c block: Remove a superfluous cast from blkdev_report_zones() 2018-07-09 09:07:52 -06:00
blk.h block: Introduce blk_exit_queue() 2018-08-09 09:12:59 -06:00
bounce.c block: unexport bio_clone_bioset 2018-07-24 14:43:26 -06:00
bsg-lib.c block/bsg-lib: use PTR_ERR_OR_ZERO to simplify the flow path 2018-08-01 09:13:03 -06:00
bsg.c Linux 4.18-rc6 2018-08-05 19:32:09 -06:00
cfq-iosched.c cfq: Suppress compiler warnings about comparisons 2018-08-07 17:57:13 -06:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat_ioctl.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
deadline-iosched.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
elevator.c block: split the blk-mq case from elevator_init 2018-06-01 07:38:21 -06:00
genhd.c block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
ioctl.c block: pass inclusive 'lend' parameter to truncate_inode_pages_range 2018-02-23 15:20:19 -07:00
ioprio.c block: add ioprio_check_cap function 2018-05-31 10:50:54 -04:00
kyber-iosched.c block: kyber: make kyber more friendly with merging 2018-05-30 10:47:40 -06:00
mq-deadline.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled 2017-09-11 09:45:52 -06:00
partition-generic.c block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
scsi_ioctl.c block: consistently use GFP_NOIO instead of __GFP_NORECLAIM 2018-05-14 08:55:18 -06:00
sed-opal.c block: sed-opal: Fix a couple off by one bugs 2018-06-20 12:04:06 -06:00
t10-pi.c block: move dif_prepare/dif_complete functions to block layer 2018-07-30 08:27:02 -06:00