linux-sg2042

Jens Axboe 8c83878877 io_uring: fix poll races

This is a straight port of Al's fix for the aio poll implementation, since the io_uring version is heavily based on that. The below description is almost straight from that patch, just modified to fit the io_uring situation.

io_poll() has to cope with several unpleasant problems:
* requests that might stay around indefinitely need to be made visible for io_cancel(2); that must not be done to a request already completed, though.
* in cases when ->poll() has placed us on a waitqueue, wakeup might have happened (and request completed) before ->poll() returns.
* worse, in some early wakeup cases request might end up re-added into the queue later - we can't treat "woken up and currently not in the queue" as "it's not going to stick around indefinitely"
* ... moreover, ->poll() might have decided not to put it on any queues to start with, and that needs to be distinguished from the previous case
* ->poll() might have tried to put us on more than one queue. Only the first will succeed for io poll, so we might end up missing wakeups. OTOH, we might very well notice that only after the wakeup hits and request gets completed (all before ->poll() gets around to the second poll_wait()). In that case it's too late to decide that we have an error.

req->woken was an attempt to deal with that. Unfortunately, it was broken. What we need to keep track of is not that wakeup has happened - the thing might come back after that. It's that async reference is already gone and won't come back, so we can't (and needn't) put the request on the list of cancellables.

The easiest case is "request hadn't been put on any waitqueues"; we can tell by seeing NULL apt.head, and in that case there won't be anything async. We should either complete the request ourselves (if vfs_poll() reports anything of interest) or return an error.

In all other cases we get exclusion with wakeups by grabbing the queue lock.

If request is currently on queue and we have something interesting from vfs_poll(), we can steal it and complete the request ourselves.

If it's on queue and vfs_poll() has not reported anything interesting, we either put it on the cancellable list, or, if we know that it hadn't been put on all queues ->poll() wanted it on, we steal it and return an error.

If it's _not_ on queue, it's either been already dealt with (in which case we do nothing), or there's io_poll_complete_work() about to be executed. In that case we either put it on the cancellable list, or, if we know it hadn't been put on all queues ->poll() wanted it on, simulate what cancel would've done.

Fixes: `221c5eb233` ("io_uring: add support for IORING_OP_POLL")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2019-03-15 15:28:57 -06:00
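
The case analysis in the commit message above can be read as a small decision table. What follows is a minimal user-space sketch of that table only, not the code in `io_uring.c`: the enum, the struct, its field names and `poll_decide()` are all hypothetical names introduced for illustration, and the real implementation makes these checks while holding the completion lock and the waitqueue lock, which this model leaves out.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; none of these names exist in the kernel. */
enum poll_outcome {
	POLL_COMPLETE_INLINE,  /* complete the request ourselves */
	POLL_RETURN_ERROR,     /* steal it if queued and return an error */
	POLL_ADD_CANCELLABLE,  /* leave it async and put it on the cancellable list */
	POLL_SIMULATE_CANCEL,  /* do what cancel would have done */
	POLL_NOTHING           /* already dealt with, nothing left to do */
};

/* Hypothetical snapshot of the state the commit message reasons about. */
struct poll_state {
	bool has_head;   /* request was put on some waitqueue (non-NULL head) */
	bool on_queue;   /* request is still on that waitqueue */
	bool done;       /* the async side has already dealt with the request */
	bool missed;     /* ->poll() wanted more queues than we got onto */
	unsigned mask;   /* events of interest reported by vfs_poll() */
};

/* Map a state snapshot to the action the commit message describes. */
enum poll_outcome poll_decide(const struct poll_state *s)
{
	/* Easiest case: never queued, so nothing async can exist. */
	if (!s->has_head)
		return s->mask ? POLL_COMPLETE_INLINE : POLL_RETURN_ERROR;

	if (s->on_queue) {
		/* Still queued: we can steal it from the waitqueue. */
		if (s->mask)
			return POLL_COMPLETE_INLINE;
		return s->missed ? POLL_RETURN_ERROR : POLL_ADD_CANCELLABLE;
	}

	/* Not queued: either finished already, or completion work is pending. */
	if (s->done)
		return POLL_NOTHING;
	return s->missed ? POLL_SIMULATE_CANCEL : POLL_ADD_CANCELLABLE;
}
```

For example, a request that is still queued, saw no events, and is on every queue it wanted would map to `POLL_ADD_CANCELLABLE`, which matches the "put it on the cancellable list" branch in the description.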
..
9p	Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2018-11-01 19:58:52 -07:00
adfs	adfs: use timespec64 for time conversion	2018-08-22 10:52:51 -07:00
affs	affs: fix potential memory leak when parsing option 'prefix'	2018-05-28 12:36:41 +02:00
afs	afs: Fix manually set volume location server list	2019-02-25 11:59:07 -08:00
autofs	autofs: clear O_NONBLOCK on the pipe	2019-03-07 18:32:01 -08:00
befs	fix a series of Documentation/ broken file name references	2018-06-15 18:10:01 -03:00
bfs	bfs: extra sanity checking and static inode bitmap	2019-01-04 13:13:47 -08:00
btrfs	for-5.1/block-20190302	2019-03-08 14:12:17 -08:00
cachefiles	fscache, cachefiles: remove redundant variable 'cache'	2018-11-30 16:00:58 +00:00
ceph	ceph: avoid repeatedly adding inode to mdsc->snap_flush_list	2019-02-18 18:08:29 +01:00
cifs	fs: cifs: Kconfig: pedantic formatting	2019-03-06 21:55:12 -06:00
coda	vfs: change inode times to use struct timespec64	2018-06-05 16:57:31 -07:00
configfs	configfs: fix registered group removal	2018-07-17 06:14:07 -07:00
cramfs	Make the Cramfs code more robust against filesystem corruptions,	2018-10-30 12:46:25 -07:00
crypto	fscrypt updates for v5.1	2019-03-09 10:54:24 -08:00
debugfs	Merge 5.0-rc6 into driver-core-next	2019-02-11 09:09:02 +01:00
devpts	devpts: Convert to new IDA API	2018-08-21 23:54:17 -04:00
dlm	socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes	2019-02-03 11:17:31 -08:00
ecryptfs	crypto: clarify name of WEAK_KEY request flag	2019-01-25 18:41:52 +08:00
efivarfs	efivars: Call guid_parse() against guid_t type of variable	2018-07-22 14:13:44 +02:00
efs	…
exportfs	exportfs: do not read dentry after free	2018-11-23 09:08:17 -05:00
ext2	\n	2019-03-07 09:01:33 -08:00
ext4	fscrypt updates for v5.1	2019-03-09 10:54:24 -08:00
f2fs	fscrypt updates for v5.1	2019-03-09 10:54:24 -08:00
fat	fat: enable .splice_write to support splice on O_DIRECT file	2019-03-07 18:32:01 -08:00
freevxfs	freevxfs_lookup(): use d_splice_alias()	2018-05-22 14:27:51 -04:00
fscache	fscache: fix race between enablement and dropping of object	2018-11-30 15:57:31 +00:00
fuse	fuse: decrement NR_WRITEBACK_TEMP on the right page	2019-01-16 10:27:59 +01:00
gfs2	We've only got three patches ready for this merge window:	2019-03-09 11:52:11 -08:00
hfs	hfs: do not free node before using	2018-11-30 14:56:14 -08:00
hfsplus	hfsplus: return file attributes on statx	2019-01-04 13:13:47 -08:00
hostfs	vfs: discard ATTR_ATTR_FLAG	2018-08-17 16:20:28 -07:00
hpfs	hpfs: remove unnecessary checks on the value of r when assigning error code	2018-08-25 12:42:33 -07:00
hugetlbfs	mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd	2019-03-05 21:07:19 -08:00
isofs	Update email address	2018-09-29 22:47:48 -04:00
jbd2	jbd2: clean up indentation issue, replace spaces with tab	2018-12-04 00:20:10 -05:00
jffs2	jffs2: Fix use of uninitialized delayed_work, lockdep breakage	2018-12-02 09:20:34 +01:00
jfs	jfs: remove redundant dquot_initialize() in jfs_evict_inode()	2018-09-20 09:28:49 -05:00
kernfs	Driver core patches for 5.1-rc1	2019-03-06 14:52:48 -08:00
lockd	NFS client updates for Linux 4.21	2019-01-02 16:35:23 -08:00
minix	minix_lookup: use d_splice_alias()	2018-05-22 14:27:52 -04:00
nfs	Merge branch 'fixes-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2019-02-20 09:09:33 -08:00
nfs_common	…
nfsd	Revert "nfsd4: return default lease period"	2019-02-14 12:33:19 -05:00
nilfs2	nilfs2: Use xa_erase_irq	2018-11-05 14:57:05 -05:00
nls	…
notify	fanotify: Make waits for fanotify events only killable	2019-02-21 11:47:23 +01:00
ntfs	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
ocfs2	ocfs2: Use zero-sized array and struct_size() in kzalloc()	2019-03-05 21:07:13 -08:00
omfs	omfs_lookup(): report IO errors, use d_splice_alias()	2018-05-22 14:27:58 -04:00
openpromfs	fs/openpromfs: Use of_node_name_eq for node name comparisons	2018-11-18 13:35:19 -08:00
orangefs	orangefs: remove two un-needed BUG_ONs...	2019-02-20 15:12:52 -05:00
overlayfs	Revert "ovl: relax permission checking on underlying layers"	2018-12-04 11:31:30 +01:00
proc	5.1 Merge Window Pull Request	2019-03-09 15:53:03 -08:00
pstore	pstore/ram: Avoid needless alloc during header write	2019-02-12 13:45:53 -08:00
qnx4	qnx4_lookup: use d_splice_alias()	2018-05-22 14:27:52 -04:00
qnx6	qnx6_lookup: switch to d_splice_alias()	2018-05-22 14:27:54 -04:00
quota	quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls.	2018-12-18 18:29:15 +01:00
ramfs	…
reiserfs	reiserfs: remove workaround code for GCC 3.x	2018-10-31 08:54:14 -07:00
romfs	romfs_lookup: switch to d_splice_alias()	2018-05-22 14:27:55 -04:00
squashfs	Squashfs: Compute expected length from inode size rather than block length	2018-08-02 09:34:02 -07:00
sysfs	sysfs: remove unused include of kernfs-internal.h	2019-02-08 12:57:31 +01:00
sysv	sysv: return 'err' instead of 0 in __sysv_write_inode	2018-11-10 08:02:40 -05:00
tracefs	tracefs: Annotate tracefs_ops with __ro_after_init	2018-07-31 11:32:44 -04:00
ubifs	fscrypt: remove filesystem specific build config option	2019-01-23 23:56:43 -05:00
udf	udf: Drop pointless check from udf_sync_fs()	2019-02-21 19:25:36 +01:00
ufs	fs/ufs: use ktime_get_real_seconds for sb and cg timestamps	2018-08-17 16:20:27 -07:00
xfs	for-5.1/block-20190302	2019-03-08 14:12:17 -08:00
Kconfig	scsi: fs: remove exofs	2019-02-05 21:28:13 -05:00
Kconfig.binfmt	kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt	2018-08-02 08:06:55 +09:00
Makefile	SCSI misc on 20190306	2019-03-09 16:53:47 -08:00
aio.c	Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2019-03-05 14:08:26 -08:00
anon_inodes.c	anon_inode_getfile(): switch to alloc_file_pseudo()	2018-07-12 10:04:27 -04:00
attr.c	fs: Fix attr.c kernel-doc	2018-07-03 16:44:45 -04:00
bad_inode.c	get rid of 'opened' argument of ->atomic_open() - part 3	2018-07-12 10:04:20 -04:00
binfmt_aout.c	a.out: remove core dumping support	2019-03-05 10:00:35 -08:00
binfmt_elf.c	fs/binfmt_elf.c: spread const a little	2019-03-07 18:32:01 -08:00
binfmt_elf_fdpic.c	treewide: kmalloc() -> kmalloc_array()	2018-06-12 16:19:22 -07:00
binfmt_em86.c	…
binfmt_flat.c	…
binfmt_misc.c	turn filp_clone_open() into inline wrapper for dentry_open()	2018-07-10 23:29:03 -04:00
binfmt_script.c	exec: load_script: Do not exec truncated interpreter path	2019-02-18 16:49:36 -08:00
block_dev.c	block: add bio_set_polled() helper	2019-02-24 08:20:17 -07:00
buffer.c	fs: fix guard_bio_eod to check for real EOD errors	2019-02-28 13:59:41 -07:00
char_dev.c	…
compat.c	ncpfs: remove compat functionality	2018-06-05 19:23:26 +02:00
compat_binfmt_elf.c	y2038: globally rename compat_time to old_time32	2018-08-27 14:48:48 +02:00
compat_ioctl.c	media updates for v4.20-rc1	2018-10-29 14:29:58 -07:00
coredump.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
d_path.c	…
dax.c	dax fix 4.21	2018-12-31 09:46:39 -08:00
dcache.c	fs/dcache: Track & report number of negative dentries	2019-01-30 11:02:11 -08:00
dcookies.c	…
direct-io.c	block: allow bio_for_each_segment_all() to iterate over multi-page bvec	2019-02-15 08:40:11 -07:00
drop_caches.c	fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()	2019-02-01 15:46:24 -08:00
eventfd.c	Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL	2018-06-28 10:40:47 -07:00
eventpoll.c	epoll: use rwlock in order to reduce ep_poll_callback() contention	2019-03-07 18:32:01 -08:00
exec.c	exec: increase BINPRM_BUF_SIZE to 256	2019-03-07 18:32:01 -08:00
fcntl.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
fhandle.c	…
file.c	io_uring-2019-03-06	2019-03-08 14:48:40 -08:00
file_table.c	fs: add fget_many() and fput_many()	2019-02-28 08:24:23 -07:00
filesystems.c	proc: introduce proc_create_single{,_data}	2018-05-16 07:23:35 +02:00
fs-writeback.c	writeback: synchronize sync(2) against cgroup writeback membership switches	2019-01-22 14:39:38 -07:00
fs_pin.c	…
fs_struct.c	…
fs_types.c	fs: common implementation of file type	2019-01-21 17:48:13 +01:00
inode.c	fs/inode.c: inode_set_flags(): replace opencoded set_mask_bits()	2019-03-05 21:07:13 -08:00
internal.h	overlayfs update for 4.19	2018-08-21 18:19:09 -07:00
io_uring.c	io_uring: fix poll races	2019-03-15 15:28:57 -06:00
ioctl.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
iomap.c	iomap: wire up the iopoll method	2019-02-24 08:20:17 -07:00
libfs.c	…
locks.c	locking/percpu-rwsem: Remove preempt_disable variants	2019-02-28 07:55:37 +01:00
mbcache.c	treewide: kmalloc() -> kmalloc_array()	2018-06-12 16:19:22 -07:00
mount.h	…
mpage.c	block: allow bio_for_each_segment_all() to iterate over multi-page bvec	2019-02-15 08:40:11 -07:00
namei.c	Merge branch 'akpm' (patches from Andrew)	2019-03-07 19:25:37 -08:00
namespace.c	audit/stable-5.1 PR 20190305	2019-03-07 12:20:11 -08:00
no-block.c	…
nsfs.c	…
open.c	overlayfs update for 4.19	2018-08-21 18:19:09 -07:00
pipe.c	memcg: localize memcg_kmem_enabled() check	2019-03-05 21:07:15 -08:00
pnode.c	vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled	2018-12-20 16:32:56 +00:00
pnode.h	…
posix_acl.c	…
proc_namespace.c	…
read_write.c	get rid of legacy 'get_ds()' function	2019-03-04 10:50:14 -08:00
readdir.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
select.c	y2038: syscalls: rename y2038 compat syscalls	2019-02-07 00:13:27 +01:00
seq_file.c	fs/seq_file.c: simplify seq_file iteration code and interface	2018-08-17 16:20:28 -07:00
signalfd.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
splice.c	fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes	2019-03-04 16:10:17 -08:00
stack.c	…
stat.c	y2038: Remove newstat family from default syscall set	2018-08-29 15:42:20 +02:00
statfs.c	vfs: add vfs_get_fsid() helper	2019-02-07 16:38:35 +01:00
super.c	mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNT	2018-12-21 11:51:23 -05:00
sync.c	…
timerfd.c	y2038: syscalls: rename y2038 compat syscalls	2019-02-07 00:13:27 +01:00
userfaultfd.c	userfaultfd: clear flag if remap event not enabled	2018-12-28 12:11:51 -08:00
utimes.c	y2038: syscalls: rename y2038 compat syscalls	2019-02-07 00:13:27 +01:00
xattr.c	sysfs: Do not return POSIX ACL xattrs via listxattr	2018-09-18 07:30:48 -04:00