linux-sg2042/fs
Vivek Goyal d9449ce35a Fix regression in direct writes performance due to WRITE_ODIRECT flag removal
There seems to be a regression in direct write path due to following
commit in for-2.6.33 branch of block tree.

commit 1af60fbd75
Author: Jeff Moyer <jmoyer@redhat.com>
Date:   Fri Oct 2 18:56:53 2009 -0400

    block: get rid of the WRITE_ODIRECT flag

Marking direct writes as WRITE_SYNC_PLUG instead of WRITE_ODIRECT, sets
the NOIDLE flag in bio and hence in request. This tells CFQ to not expect
more request from the queue and not idle on it (despite the fact that
queue's think time is less and it is not seeky).

So direct writers lose big time when competing with sequential readers.

Using fio, I have run one direct writer and two sequential readers and
following are the results with 2.6.32-rc7 kernel and with for-2.6.33
branch.

Test
====
1 direct writer and 2 sequential reader running simultaneously.

[global]
directory=/mnt/sdc/fio/
runtime=10

[seqwrite]
rw=write
size=4G
direct=1

[seqread]
rw=read
size=2G
numjobs=2

2.6.32-rc7
==========
direct writes: aggrb=2,968KB/s
readers	     : aggrb=101MB/s

for-2.6.33 branch
=================
direct write: aggrb=19KB/s
readers	      aggrb=137MB/s

This patch brings back the WRITE_ODIRECT flag, with the difference that we
don't set the BIO_RW_UNPLUG flag so that device is not unplugged after
submission of request and an explicit unplug from submitter is required.

That way we fix the jeff's issue of not enough merging taking place in aio
path as well as make sure direct writes get their fair share.

After the fix
=============
for-2.6.33 + fix
----------------
direct writes: aggrb=2,728KB/s
reads: aggrb=103MB/s

Thanks
Vivek

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-11-26 09:46:46 +01:00
..
9p 9p: Add fscache support to 9p 2009-09-23 13:03:46 -05:00
adfs adfs: remove redundant test on unsigned 2009-09-24 07:21:05 -07:00
affs affs: add ->sync_fs 2009-06-11 21:36:14 -04:00
afs afs: remove cache.h 2009-10-01 16:11:16 -07:00
autofs trivial: remove unnecessary semicolons 2009-09-21 15:14:58 +02:00
autofs4 autofs4 - fix missed case when changing to use struct path 2009-08-31 17:44:05 -10:00
befs fs: Make unload_nls() NULL pointer safe 2009-09-24 07:47:42 -04:00
bfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
btrfs Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable 2009-10-15 15:06:37 -07:00
cachefiles enforce ->sync_fs is only called for rw superblock 2009-06-11 21:36:06 -04:00
cifs [CIFS] Fixing to avoid invalid kfree() in cifs_get_tcp_session() 2009-10-06 18:31:29 +00:00
coda headers: remove sched.h from poll.h 2009-10-04 15:05:10 -07:00
configfs writeback: add name to backing_dev_info 2009-09-11 09:20:26 +02:00
cramfs
debugfs debugfs: use specified mode to possibly mark files read/write only 2009-06-15 21:30:28 -07:00
devpts Move magic numbers into magic.h 2009-09-23 07:39:28 -07:00
dlm dlm: fix socket fd translation 2009-09-30 12:19:44 -05:00
ecryptfs ima: ecryptfs fix imbalance message 2009-10-08 11:31:38 -05:00
efs get rid of BKL in fs/efs 2009-06-17 00:36:36 -04:00
exofs exofs: remove BKL from super operations 2009-09-24 07:47:38 -04:00
exportfs
ext2 Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2009-09-24 07:53:22 -07:00
ext3 ext3: Don't update superblock write time when filesystem is read-only 2009-10-13 00:06:43 +02:00
ext4 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2009-10-03 11:24:19 -07:00
fat Merge git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6 2009-09-30 09:31:14 -07:00
freevxfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
fscache FS-Cache: Fixup renamed filenames in comments in internal.h 2009-05-27 10:20:13 -07:00
fuse const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
gfs2 const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
hfs hfs: fix oops on mount with corrupted btree extent records 2009-10-29 07:39:29 -07:00
hfsplus hfsplus: refuse to mount volumes larger than 2TB 2009-10-29 07:39:27 -07:00
hostfs hostfs: set maximum filesize in superblock for proper LFS support 2009-06-30 18:56:03 -07:00
hpfs headers: smp_lock.h redux 2009-07-12 12:22:34 -07:00
hppfs
hugetlbfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-09-24 08:32:11 -07:00
isofs fs: Make unload_nls() NULL pointer safe 2009-09-24 07:47:42 -04:00
jbd jbd: Annotate transaction start also for journal_restart() 2009-09-16 17:44:10 +02:00
jbd2 const: constify remaining file_operations 2009-10-01 16:11:11 -07:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2009-09-23 10:07:49 -07:00
jfs fs: Make unload_nls() NULL pointer safe 2009-09-24 07:47:42 -04:00
lockd headers: utsname.h redux 2009-09-23 18:13:10 -07:00
minix V3 minixfs: add missing directory type checking 2009-09-23 07:39:57 -07:00
ncpfs const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
nfs NFSv4: The link() operation should return any delegation on the file 2009-10-26 08:09:46 -04:00
nfs_common
nfsd const: constify remaining file_operations 2009-10-01 16:11:11 -07:00
nilfs2 const: constify remaining file_operations 2009-10-01 16:11:11 -07:00
nls Merge git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6 2009-09-30 09:31:14 -07:00
notify dnotify: ignore FS_EVENT_ON_CHILD 2009-10-20 18:02:33 -04:00
ntfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-09-24 08:32:11 -07:00
ocfs2 const: constify remaining file_operations 2009-10-01 16:11:11 -07:00
omfs const: constify remaining file_operations 2009-10-01 16:11:11 -07:00
openpromfs
partitions partitions: read whole sector with EFI GPT header 2009-11-23 09:29:58 +01:00
proc hwpoison: fix/proc/meminfo alignment 2009-10-29 07:39:25 -07:00
qnx4 qnx4: remove write support 2009-09-23 07:39:30 -07:00
quota const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
ramfs truncate: use new helpers 2009-09-24 08:41:47 -04:00
reiserfs const: make struct super_block::s_qcop const 2009-09-22 07:17:24 -07:00
romfs ROMFS: fix length used with romfs_dev_strnlen() function 2009-10-11 11:33:56 -07:00
smbfs fs: Make unload_nls() NULL pointer safe 2009-09-24 07:47:42 -04:00
squashfs const: mark remaining super_operations const 2009-09-22 07:17:24 -07:00
sysfs sysfs: Allow sysfs_notify_dirent to be called from interrupt context. 2009-10-14 15:16:25 -07:00
sysv get rid of BKL in fs/sysv 2009-06-17 00:36:37 -04:00
ubifs const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
udf udf: Fix possible corruption when close races with write 2009-09-14 19:13:01 +02:00
ufs ufs: sector_t cannot be negative 2009-06-18 13:03:46 -07:00
xfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs 2009-10-31 12:12:49 -07:00
Kconfig Merge branch 'sh/for-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 2009-10-29 09:07:15 -07:00
Kconfig.binfmt
Makefile nilfs2: update makefile and Kconfig 2009-04-07 08:31:16 -07:00
aio.c block: move bdi/address_space unplug functions to backing-dev.h 2009-10-29 13:59:26 +01:00
anon_inodes.c headers: remove sched.h from poll.h 2009-10-04 15:05:10 -07:00
attr.c truncate: new helpers 2009-09-24 08:41:47 -04:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c elf: clean up fill_note_info() 2009-09-24 07:21:01 -07:00
binfmt_elf_fdpic.c fdpic: ignore the loader's PT_GNU_STACK when calculating the stack size 2009-09-24 07:21:02 -07:00
binfmt_em86.c
binfmt_flat.c flat: use IS_ERR_VALUE() helper macro 2009-09-24 07:21:03 -07:00
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c block: Create bip slabs with embedded integrity vectors 2009-07-01 10:56:25 +02:00
bio.c block: add helpers to run flush_dcache_page() against a bio and a request's pages 2009-11-26 09:16:19 +01:00
block_dev.c Merge branch 'for-linus' into for-2.6.33 2009-11-03 21:14:39 +01:00
buffer.c Merge branch 'writeback' of git://git.kernel.dk/linux-2.6-block 2009-09-25 09:27:30 -07:00
char_dev.c fs/char_dev.c: remove useless loop 2009-09-24 07:21:03 -07:00
compat.c fs: fix overflow in sys_mount() for in-kernel calls 2009-09-24 08:40:15 -04:00
compat_binfmt_elf.c
compat_ioctl.c compat_ioctl: hook up compat handler for FIEMAP ioctl 2009-08-07 10:39:56 -07:00
dcache.c sched: Pull up the might_sleep() check into cond_resched() 2009-07-18 15:51:44 +02:00
dcookies.c
direct-io.c Fix regression in direct writes performance due to WRITE_ODIRECT flag removal 2009-11-26 09:46:46 +01:00
drop_caches.c sysctl: remove "struct file *" argument of ->proc_handler 2009-09-24 07:21:04 -07:00
eventfd.c anonfd: split interface into file creation and install 2009-09-23 07:39:29 -07:00
eventpoll.c epoll: fix nested calls support 2009-06-18 13:03:41 -07:00
exec.c task_struct cleanup: move binfmt field to mm_struct 2009-09-24 07:21:05 -07:00
fcntl.c fcntl: add F_[SG]ETOWN_EX 2009-09-24 07:21:01 -07:00
fifo.c
file.c headers: remove sched.h from interrupt.h 2009-10-11 11:20:58 -07:00
file_table.c sysctl: remove "struct file *" argument of ->proc_handler 2009-09-24 07:21:04 -07:00
filesystems.c fs: Mark get_filesystem_list() as __init function. 2009-04-20 23:02:52 -04:00
fs-writeback.c writeback: pass in super_block to bdi_start_writeback() 2009-09-26 00:10:40 +02:00
fs_struct.c
generic_acl.c
inode.c vfs: optimize touch_time() too 2009-09-24 07:47:27 -04:00
internal.h fs: fix overflow in sys_mount() for in-kernel calls 2009-09-24 08:40:15 -04:00
ioctl.c vfs: explicitly cast s_maxbytes in fiemap_check_ranges 2009-09-24 07:47:31 -04:00
ioprio.c
libfs.c libfs: return error code on failed attr set 2009-09-24 07:47:30 -04:00
locks.c const: make lock_manager_operations const 2009-09-22 07:17:25 -07:00
mbcache.c
mpage.c ext4: Properly initialize the buffer_head state 2009-05-13 15:13:42 -04:00
namei.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 2009-09-11 08:55:49 -07:00
namespace.c fs: fix overflow in sys_mount() for in-kernel calls 2009-09-24 08:40:15 -04:00
nfsctl.c
no-block.c
open.c fs: change sys_truncate length parameter type 2009-09-23 09:21:05 -07:00
pipe.c fs: pipe.c null pointer dereference 2009-10-22 08:11:44 +09:00
pnode.c
pnode.h
posix_acl.c
read_write.c sendfile(): check f_op.splice_write() rather than f_op.sendpage() 2009-11-04 09:09:52 +01:00
read_write.h
readdir.c
select.c headers: remove sched.h from poll.h 2009-10-04 15:05:10 -07:00
seq_file.c vfs: seq_file: add helpers for data filling 2009-09-24 07:47:35 -04:00
signalfd.c
splice.c sendfile(): check f_op.splice_write() rather than f_op.sendpage() 2009-11-04 09:09:52 +01:00
stack.c
stat.c kill vfs_stat_fd / vfs_lstat_fd 2009-04-20 23:02:52 -04:00
super.c freeze_bdev: grab active reference to frozen superblocks 2009-09-24 07:47:41 -04:00
sync.c fs/buffer.c: clean up EXPORT* macros 2009-09-23 07:39:29 -07:00
timerfd.c
utimes.c
xattr.c VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx. 2009-09-10 10:11:22 +10:00
xattr_acl.c