linux-sg2042

Johannes Weiner 147e1a97c4 fs: kernfs: add poll file operation	2019-03-05 21:07:17 -08:00

Patch series "psi: pressure stall monitors", v3.

Android is adopting psi to detect and remedy memory pressure that results in stuttering and decreased responsiveness on mobile devices.

Psi gives us the stall information, but because we're dealing with latencies in the millisecond range, periodically reading the pressure files to detect stalls in a timely fashion is not feasible. Psi also doesn't aggregate its averages at a high enough frequency right now.

This patch series extends the psi interface such that users can configure sensitive latency thresholds and use poll() and friends to be notified when these are breached.

As high-frequency aggregation is costly, it implements an aggregation method that is optimized for fast, short-interval averaging, and makes the aggregation frequency adaptive, such that high-frequency updates only happen while monitored stall events are actively occurring.

With these patches applied, Android can monitor for, and ward off, mounting memory shortages before they cause problems for the user. For example, using memory stall monitors in userspace low memory killer daemon (lmkd) we can detect mounting pressure and kill less important processes before device becomes visibly sluggish. In our memory stress testing psi memory monitors produce roughly 10x less false positives compared to vmpressure signals. Having ability to specify multiple triggers for the same psi metric allows other parts of Android framework to monitor memory state of the device and act accordingly.

The new interface is straightforward. The user opens one of the pressure files for writing and writes a trigger description into the file descriptor that defines the stall state - some or full, and the maximum stall time over a given window of time. E.g.:

	/* Signal when stall time exceeds 100ms of a 1s window */
	char trigger[] = "full 100000 1000000";
	fd = open("/proc/pressure/memory");
	write(fd, trigger, sizeof(trigger));
	while (poll() >= 0) {
		...
	}
	close(fd);

When the monitored stall state is entered, psi adapts its aggregation frequency according to what the configured time window requires in order to emit event signals in a timely fashion. Once the stalling subsides, aggregation reverts back to normal.

The trigger is associated with the open file descriptor. To stop monitoring, the user only needs to close the file descriptor and the trigger is discarded.

Patches 1-4 prepare the psi code for polling support. Patch 5 implements the adaptive polling logic, the pressure growth detection optimized for short intervals, and hooks up write() and poll() on the pressure files.

The patches were developed in collaboration with Johannes Weiner.

This patch (of 5):

Kernfs has a standardized poll/notification mechanism for waking all pollers on all fds when a filesystem node changes. To allow polling for custom events, add a .poll callback that can override the default.

This is in preparation for pollable cgroup pressure files which have per-fd trigger configurations.

Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.com
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
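
The trigger snippet in the commit message is abbreviated: its open() and poll() calls omit their arguments. The following is a minimal userspace sketch of the same sequence with the arguments filled in, not code from the patch series. The helper names (`psi_format_trigger`, `psi_open_trigger`, `psi_wait_event`) are hypothetical. The `O_RDWR | O_NONBLOCK` open flags and the use of `POLLPRI` are assumptions that follow the example in the kernel's psi documentation. The thresholds are taken to be in microseconds, which matches the "100ms of a 1s window" comment above.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build a trigger description: "<some|full> <stall us> <window us>".
 * Returns the string length, or -1 if it does not fit in buf. */
int psi_format_trigger(char *buf, size_t len, const char *state,
                       unsigned int stall_us, unsigned int window_us)
{
    int n = snprintf(buf, len, "%s %u %u", state, stall_us, window_us);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Open a pressure file and register the trigger on the new descriptor.
 * Returns the descriptor, or -1 with errno set. */
int psi_open_trigger(const char *path, const char *trigger)
{
    int fd = open(path, O_RDWR | O_NONBLOCK); /* flags are an assumption */
    if (fd < 0)
        return -1;
    /* The terminating NUL is written too, like sizeof(trigger) above. */
    if (write(fd, trigger, strlen(trigger) + 1) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Block until the trigger fires once.
 * Returns 1 on an event, 0 if the event source is gone, -1 on error. */
int psi_wait_event(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };

    if (poll(&pfd, 1, -1) < 0)
        return -1;
    if (pfd.revents & POLLERR)
        return 0;
    return (pfd.revents & POLLPRI) ? 1 : -1;
}
```

A caller would format `"full 100000 1000000"`, pass it with `/proc/pressure/memory` to `psi_open_trigger()`, loop on `psi_wait_event()` while it returns 1, and close the descriptor to discard the trigger, as the commit message describes.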
9p	Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2018-11-01 19:58:52 -07:00
adfs	adfs: use timespec64 for time conversion	2018-08-22 10:52:51 -07:00
affs	affs: fix potential memory leak when parsing option 'prefix'	2018-05-28 12:36:41 +02:00
afs	afs: Fix manually set volume location server list	2019-02-25 11:59:07 -08:00
autofs	autofs: fix error return in autofs_fill_super()	2019-02-01 15:46:24 -08:00
befs	fix a series of Documentation/ broken file name references	2018-06-15 18:10:01 -03:00
bfs	bfs: extra sanity checking and static inode bitmap	2019-01-04 13:13:47 -08:00
btrfs	for-5.0-rc4-tag	2019-02-03 08:48:33 -08:00
cachefiles	fscache, cachefiles: remove redundant variable 'cache'	2018-11-30 16:00:58 +00:00
ceph	ceph: avoid repeatedly adding inode to mdsc->snap_flush_list	2019-02-18 18:08:29 +01:00
cifs	cifs: update internal module version number	2019-01-31 07:05:06 -06:00
coda	vfs: change inode times to use struct timespec64	2018-06-05 16:57:31 -07:00
configfs	configfs: fix registered group removal	2018-07-17 06:14:07 -07:00
cramfs	Make the Cramfs code more robust against filesystem corruptions,	2018-10-30 12:46:25 -07:00
crypto	crypto: clarify name of WEAK_KEY request flag	2019-01-25 18:41:52 +08:00
debugfs	debugfs: debugfs_lookup() should return NULL if not found	2019-01-30 12:39:49 +01:00
devpts	devpts: Convert to new IDA API	2018-08-21 23:54:17 -04:00
dlm	socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes	2019-02-03 11:17:31 -08:00
ecryptfs	crypto: clarify name of WEAK_KEY request flag	2019-01-25 18:41:52 +08:00
efivarfs	efivars: Call guid_parse() against guid_t type of variable	2018-07-22 14:13:44 +02:00
efs	…
exofs	exofs_mount(): fix leaks on failure exits	2018-12-17 18:36:33 -05:00
exportfs	exportfs: do not read dentry after free	2018-11-23 09:08:17 -05:00
ext2	\n	2018-12-27 17:00:35 -08:00
ext4	Revert "ext4: use ext4_write_inode() when fsyncing w/o a journal"	2019-01-31 23:41:11 -05:00
f2fs	f2fs-for-4.21-rc1	2018-12-31 09:41:37 -08:00
fat	Merge branch 'akpm' (patches from Andrew)	2019-01-05 09:16:18 -08:00
freevxfs	…
fscache	fscache: fix race between enablement and dropping of object	2018-11-30 15:57:31 +00:00
fuse	fuse: decrement NR_WRITEBACK_TEMP on the right page	2019-01-16 10:27:59 +01:00
gfs2	Revert "gfs2: read journal in large chunks to locate the head"	2019-02-14 09:52:51 -08:00
hfs	hfs: do not free node before using	2018-11-30 14:56:14 -08:00
hfsplus	hfsplus: return file attributes on statx	2019-01-04 13:13:47 -08:00
hostfs	vfs: discard ATTR_ATTR_FLAG	2018-08-17 16:20:28 -07:00
hpfs	hpfs: remove unnecessary checks on the value of r when assigning error code	2018-08-25 12:42:33 -07:00
hugetlbfs	hugetlbfs: fix races and page leaks during migration	2019-03-01 09:02:33 -08:00
isofs	Update email address	2018-09-29 22:47:48 -04:00
jbd2	jbd2: clean up indentation issue, replace spaces with tab	2018-12-04 00:20:10 -05:00
jffs2	jffs2: Fix use of uninitialized delayed_work, lockdep breakage	2018-12-02 09:20:34 +01:00
jfs	jfs: remove redundant dquot_initialize() in jfs_evict_inode()	2018-09-20 09:28:49 -05:00
kernfs	fs: kernfs: add poll file operation	2019-03-05 21:07:17 -08:00
lockd	NFS client updates for Linux 4.21	2019-01-02 16:35:23 -08:00
minix	…
nfs	Merge branch 'fixes-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2019-02-20 09:09:33 -08:00
nfs_common	…
nfsd	Revert "nfsd4: return default lease period"	2019-02-14 12:33:19 -05:00
nilfs2	nilfs2: Use xa_erase_irq	2018-11-05 14:57:05 -05:00
nls	…
notify	inotify: Fix fd refcount leak in inotify_add_watch().	2019-01-02 18:28:37 +01:00
ntfs	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
ocfs2	ocfs2: Use zero-sized array and struct_size() in kzalloc()	2019-03-05 21:07:13 -08:00
omfs	…
openpromfs	fs/openpromfs: Use of_node_name_eq for node name comparisons	2018-11-18 13:35:19 -08:00
orangefs	orangefs: remove two un-needed BUG_ONs...	2019-02-20 15:12:52 -05:00
overlayfs	Revert "ovl: relax permission checking on underlying layers"	2018-12-04 11:31:30 +01:00
proc	mm: convert PG_balloon to PG_offline	2019-03-05 21:07:14 -08:00
pstore	pstore/ram: Avoid allocation and leak of platform data	2019-01-20 14:44:52 -08:00
qnx4	…
qnx6	…
quota	quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls.	2018-12-18 18:29:15 +01:00
ramfs	…
reiserfs	reiserfs: remove workaround code for GCC 3.x	2018-10-31 08:54:14 -07:00
romfs	…
squashfs	Squashfs: Compute expected length from inode size rather than block length	2018-08-02 09:34:02 -07:00
sysfs	sysfs: convert BUG_ON to WARN_ON	2019-01-07 08:53:32 +01:00
sysv	sysv: return 'err' instead of 0 in __sysv_write_inode	2018-11-10 08:02:40 -05:00
tracefs	tracefs: Annotate tracefs_ops with __ro_after_init	2018-07-31 11:32:44 -04:00
ubifs	mm: migrate: drop unused argument of migrate_page_move_mapping()	2018-12-28 12:11:51 -08:00
udf	\n	2018-12-27 17:00:35 -08:00
ufs	fs/ufs: use ktime_get_real_seconds for sb and cg timestamps	2018-08-17 16:20:27 -07:00
xfs	xfs: set buffer ops when repair probes for btree type	2019-02-03 14:03:59 -08:00
Kconfig	autofs: remove left-over autofs4 stubs	2018-06-11 08:22:34 -07:00
Kconfig.binfmt	kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt	2018-08-02 08:06:55 +09:00
Makefile	autofs: remove left-over autofs4 stubs	2018-06-11 08:22:34 -07:00
aio.c	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2019-03-04 13:24:27 -08:00
anon_inodes.c	anon_inode_getfile(): switch to alloc_file_pseudo()	2018-07-12 10:04:27 -04:00
attr.c	fs: Fix attr.c kernel-doc	2018-07-03 16:44:45 -04:00
bad_inode.c	get rid of 'opened' argument of ->atomic_open() - part 3	2018-07-12 10:04:20 -04:00
binfmt_aout.c	a.out: remove core dumping support	2019-03-05 10:00:35 -08:00
binfmt_elf.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
binfmt_elf_fdpic.c	treewide: kmalloc() -> kmalloc_array()	2018-06-12 16:19:22 -07:00
binfmt_em86.c	…
binfmt_flat.c	…
binfmt_misc.c	turn filp_clone_open() into inline wrapper for dentry_open()	2018-07-10 23:29:03 -04:00
binfmt_script.c	exec: load_script: Do not exec truncated interpreter path	2019-02-18 16:49:36 -08:00
block_dev.c	blockdev: Fix livelocks on loop device	2019-01-15 07:30:56 -07:00
buffer.c	fs: ratelimit __find_get_block_slow() failure message.	2019-02-06 12:58:56 -07:00
char_dev.c	…
compat.c	ncpfs: remove compat functionality	2018-06-05 19:23:26 +02:00
compat_binfmt_elf.c	y2038: globally rename compat_time to old_time32	2018-08-27 14:48:48 +02:00
compat_ioctl.c	media updates for v4.20-rc1	2018-10-29 14:29:58 -07:00
coredump.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
d_path.c	…
dax.c	dax fix 4.21	2018-12-31 09:46:39 -08:00
dcache.c	fs/dcache: Track & report number of negative dentries	2019-01-30 11:02:11 -08:00
dcookies.c	…
direct-io.c	direct-io: allow direct writes to empty inodes	2019-01-22 08:26:44 -07:00
drop_caches.c	fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()	2019-02-01 15:46:24 -08:00
eventfd.c	Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL	2018-06-28 10:40:47 -07:00
eventpoll.c	Merge branch 'akpm' (patches from Andrew)	2019-01-05 09:16:18 -08:00
exec.c	exec: Fix mem leak in kernel_read_file	2019-02-18 21:26:24 -05:00
fcntl.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
fhandle.c	…
file.c	fs/file.c: initialize init_files.resize_wait	2019-03-05 21:07:14 -08:00
file_table.c	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
filesystems.c	…
fs-writeback.c	writeback: synchronize sync(2) against cgroup writeback membership switches	2019-01-22 14:39:38 -07:00
fs_pin.c	…
fs_struct.c	…
inode.c	fs/inode.c: inode_set_flags(): replace opencoded set_mask_bits()	2019-03-05 21:07:13 -08:00
internal.h	overlayfs update for 4.19	2018-08-21 18:19:09 -07:00
ioctl.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
iomap.c	iomap: fix a use after free in iomap_dio_rw	2019-01-27 08:47:42 -08:00
libfs.c	…
locks.c	locks: fix error in locks_move_blocks()	2019-01-02 20:14:50 -05:00
mbcache.c	treewide: kmalloc() -> kmalloc_array()	2018-06-12 16:19:22 -07:00
mount.h	…
mpage.c	mpage: mpage_readpages() should submit IO as read-ahead	2018-08-17 16:20:29 -07:00
namei.c	Revert "vfs: Allow userns root to call mknod on owned filesystems."	2018-12-22 14:18:34 -08:00
namespace.c	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2019-03-04 13:24:27 -08:00
no-block.c	…
nsfs.c	…
open.c	overlayfs update for 4.19	2018-08-21 18:19:09 -07:00
pipe.c	memcg: localize memcg_kmem_enabled() check	2019-03-05 21:07:15 -08:00
pnode.c	vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled	2018-12-20 16:32:56 +00:00
pnode.h	…
posix_acl.c	…
proc_namespace.c	…
read_write.c	get rid of legacy 'get_ds()' function	2019-03-04 10:50:14 -08:00
readdir.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
select.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
seq_file.c	fs/seq_file.c: simplify seq_file iteration code and interface	2018-08-17 16:20:28 -07:00
signalfd.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
splice.c	fs: Make splice() and tee() take into account O_NONBLOCK flag on pipes	2019-03-04 16:10:17 -08:00
stack.c	…
stat.c	y2038: Remove newstat family from default syscall set	2018-08-29 15:42:20 +02:00
statfs.c	kernel: add kcompat_sys_{f,}statfs64()	2018-07-12 14:49:48 +01:00
super.c	mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNT	2018-12-21 11:51:23 -05:00
sync.c	…
timerfd.c	y2038: globally rename compat_time to old_time32	2018-08-27 14:48:48 +02:00
userfaultfd.c	userfaultfd: clear flag if remap event not enabled	2018-12-28 12:11:51 -08:00
utimes.c	y2038: utimes: Rework #ifdef guards for compat syscalls	2018-08-29 15:42:23 +02:00
xattr.c	sysfs: Do not return POSIX ACL xattrs via listxattr	2018-09-18 07:30:48 -04:00