OpenCloudOS-Kernel/kernel
Joel Fernandes (Google) b53b0b9d9a
pidfd: add polling support
This patch adds polling support to pidfd.

Android low memory killer (LMK) needs to know when a process dies once
it is sent the kill signal. It does so by checking for the existence of
/proc/pid which is both racy and slow. For example, if a PID is reused
between when LMK sends a kill signal and checks for existence of the
PID, since the wrong PID is now possibly checked for existence.
Using the polling support, LMK will be able to get notified when a process
exists in race-free and fast way, and allows the LMK to do other things
(such as by polling on other fds) while awaiting the process being killed
to die.

For notification to polling processes, we follow the same existing
mechanism in the kernel used when the parent of the task group is to be
notified of a child's death (do_notify_parent). This is precisely when the
tasks waiting on a poll of pidfd are also awakened in this patch.

We have decided to include the waitqueue in struct pid for the following
reasons:
1. The wait queue has to survive for the lifetime of the poll. Including
   it in task_struct would not be option in this case because the task can
   be reaped and destroyed before the poll returns.

2. By including the struct pid for the waitqueue means that during
   de_thread(), the new thread group leader automatically gets the new
   waitqueue/pid even though its task_struct is different.

Appropriate test cases are added in the second patch to provide coverage of
all the cases the patch is handling.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Jonathan Kowalski <bl0pbl33p@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@android.com
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Co-developed-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-28 12:17:55 +02:00
..
bpf bpf: fix undefined behavior in narrow load handling 2019-05-13 02:05:50 +02:00
cgroup kernel/sched/psi.c: expose pressure metrics on root cgroup 2019-05-14 19:52:48 -07:00
configs kvm_config: add CONFIG_VIRTIO_MENU 2018-10-24 20:55:56 -04:00
debug kdb: Fix bound check compiler warning 2019-05-14 13:44:24 +01:00
dma DMA mapping updates for 5.2 2019-05-09 08:40:55 -07:00
events mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
gcov gcov: clang support 2019-05-14 19:52:51 -07:00
irq Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-05-19 10:58:45 -07:00
livepatch The major changes in this tracing update includes: 2019-05-15 16:05:47 -07:00
locking locking/rwsem: Prevent decrement of reader count before increment 2019-05-07 08:46:46 +02:00
power Power management updates for 5.2-rc1 2019-05-06 19:40:31 -07:00
printk panic: add an option to replay all the printk message in buffer 2019-05-18 15:52:26 -07:00
rcu The major changes in this tracing update includes: 2019-05-15 16:05:47 -07:00
sched kernel/sched/psi.c: expose pressure metrics on root cgroup 2019-05-14 19:52:48 -07:00
time Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-05-16 11:00:20 -07:00
trace The major changes in this tracing update includes: 2019-05-15 16:05:47 -07:00
.gitignore Provide in-kernel headers to make extending kernel easier 2019-04-29 16:48:03 +02:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb()) 2019-05-06 16:57:52 -07:00
Kconfig.preempt kconfig: warn no new line at end of file 2018-12-15 17:44:35 +09:00
Makefile kernel/Makefile: don't assume that kernel/gen_ikh_data.sh is executable 2019-05-14 19:52:47 -07:00
acct.c acct_on(): don't mess with freeze protection 2019-04-04 21:04:13 -04:00
async.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
audit.c audit: connect LOGIN record to its syscall record 2019-03-20 20:57:48 -04:00
audit.h audit_compare_dname_path(): switch to const struct qstr * 2019-04-28 20:33:43 -04:00
audit_fsnotify.c audit_compare_dname_path(): switch to const struct qstr * 2019-04-28 20:33:43 -04:00
audit_tree.c fsnotify: switch send_to_group() and ->handle_event to const struct qstr * 2019-04-26 13:51:03 -04:00
audit_watch.c audit_compare_dname_path(): switch to const struct qstr * 2019-04-28 20:33:43 -04:00
auditfilter.c Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-05-07 20:03:32 -07:00
auditsc.c Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-05-07 20:03:32 -07:00
backtracetest.c backtrace-test: Simplify stack trace handling 2019-04-29 12:37:47 +02:00
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-10-31 08:54:14 -07:00
capability.c LSM: add SafeSetID module that gates setid calls 2019-01-25 11:22:43 -08:00
compat.c kernel/compat.c: mark expected switch fall-throughs 2019-05-15 08:16:14 -07:00
configs.c kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
context_tracking.c
cpu.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-05-06 14:50:46 -07:00
cpu_pm.c
crash_core.c kexec: export PG_offline to VMCOREINFO 2019-03-05 21:07:14 -08:00
crash_dump.c
cred.c SELinux: Remove cred security blob poisoning 2019-01-08 13:18:44 -08:00
delayacct.c delayacct: track delays from thrashing cache pages 2018-10-26 16:26:32 -07:00
dma.c
elfcore.c
exec_domain.c
exit.c mm: change mm_update_next_owner() to update mm->owner with WRITE_ONCE 2019-05-14 19:52:47 -07:00
extable.c
fail_function.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
fork.c pidfd: add polling support 2019-06-28 12:17:55 +02:00
freezer.c
futex.c mm/gup: change GUP fast to use flags rather than a write 'bool' 2019-05-14 09:47:46 -07:00
gen_ikh_data.sh Provide in-kernel headers to make extending kernel easier 2019-04-29 16:48:03 +02:00
groups.c
hung_task.c kernel/hung_task.c: Use continuously blocked time when reporting. 2019-03-07 18:31:59 -08:00
iomem.c mm/resource: Use resource_overlaps() to simplify region_intersects() 2019-04-19 12:59:36 +02:00
irq_work.c irq_work: Do not raise an IPI when queueing work on the local CPU 2019-04-18 14:07:52 +02:00
jump_label.c locking/static_key: Don't take sleeping locks in __static_key_slow_dec_deferred() 2019-04-29 08:29:21 +02:00
kallsyms.c bpf: Add module name [bpf] to ksymbols for bpf programs 2019-01-21 17:38:56 -03:00
kcmp.c
kcov.c kcov: convert kcov.refcount to refcount_t 2019-03-07 18:32:02 -08:00
kexec.c
kexec_core.c power/suspend: Add function to disable secondaries for suspend 2019-05-03 19:42:41 +02:00
kexec_file.c mm: memblock: make keeping memblock memory opt-in rather than opt-out 2019-05-14 09:47:50 -07:00
kexec_internal.h
kheaders.c Provide in-kernel headers to make extending kernel easier 2019-04-29 16:48:03 +02:00
kmod.c
kprobes.c kprobes: Fix error check when reusing optimized probes 2019-04-16 09:38:16 +02:00
ksysfs.c
kthread.c include/: refactor headers to allow kthread.h inclusion in psi_types.h 2019-05-14 19:52:48 -07:00
latencytop.c kernel/latencytop.c: rename clear_all_latency_tracing to clear_tsk_latency_tracing 2019-05-14 19:52:49 -07:00
memremap.c kernel/memremap.c: remove the unused device_private_entry_fault() export 2019-05-14 09:47:51 -07:00
module-internal.h kallsyms: store type information in its own array 2019-03-28 15:00:37 +01:00
module.c Modules updates for v5.2 2019-05-14 10:55:54 -07:00
module_signing.c modsign: use all trusted keys to verify module signature 2018-11-07 14:41:41 +01:00
notifier.c kernel/notifier.c: double register detection 2019-05-14 19:52:49 -07:00
nsproxy.c
padata.c padata: Replace padata_attr_type default_attrs field with groups 2019-04-25 22:06:11 +02:00
panic.c panic: add an option to replay all the printk message in buffer 2019-05-18 15:52:26 -07:00
params.c
pid.c pidfd: add polling support 2019-06-28 12:17:55 +02:00
pid_namespace.c signal: Use group_send_sig_info to kill all processes in a pid namespace 2018-09-16 16:08:25 +02:00
profile.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ptrace.c ptrace: take into account saved_sigmask in PTRACE{GET,SET}SIGMASK 2019-03-29 10:01:37 -07:00
range.c
reboot.c panic/reboot: allow specifying reboot_mode for panic only 2019-05-14 19:52:51 -07:00
relay.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-03-12 13:27:20 -07:00
resource.c mm/resource: Use resource_overlaps() to simplify region_intersects() 2019-04-19 12:59:36 +02:00
rseq.c rseq: Remove superfluous rseq_len from task_struct 2019-04-19 12:39:32 +02:00
seccomp.c audit/stable-5.2 PR 20190507 2019-05-07 19:06:04 -07:00
signal.c pidfd: add polling support 2019-06-28 12:17:55 +02:00
smp.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-01-30 19:27:00 +01:00
smpboot.c
smpboot.h
softirq.c softirq: Remove tasklet_hrtimer 2019-03-22 14:36:02 +01:00
stackleak.c stackleak: Mark stackleak_track_stack() as notrace 2018-12-05 19:31:44 -08:00
stacktrace.c stacktrace: Provide common infrastructure 2019-04-29 12:37:57 +02:00
stop_machine.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
sys.c kernel/sys.c: prctl: fix false positive in validate_prctl_map() 2019-05-14 09:47:44 -07:00
sys_ni.c signal: support CLONE_PIDFD with pidfd_send_signal 2019-05-07 14:31:03 +02:00
sysctl.c kernel/sysctl.c: fix proc_do_large_bitmap for large input buffers 2019-05-14 19:52:51 -07:00
sysctl_binary.c kernel/sysctl: add panic_print into sysctl 2019-01-04 13:13:47 -08:00
task_work.c
taskstats.c genetlink: optionally validate strictly/dumps 2019-04-27 17:07:22 -04:00
test_kprobes.c
torture.c torture: Don't try to offline the last CPU 2019-03-26 14:42:53 -07:00
tracepoint.c tracing: Replace synchronize_sched() and call_rcu_sched() 2018-11-27 09:21:41 -08:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c umh: add exit routine for UMH process 2019-01-11 18:05:40 -08:00
up.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
user-return-notifier.c
user.c kernel/user.c: clean up some leftover code 2019-05-14 19:52:49 -07:00
user_namespace.c userns: also map extents in the reverse map to kernel IDs 2018-11-07 23:51:16 -06:00
utsname.c
utsname_sysctl.c
watchdog.c watchdog: Fix typo in comment 2019-04-18 14:05:51 +02:00
watchdog_hld.c kernel/watchdog_hld.c: hard lockup message should end with a newline 2019-04-19 09:46:05 -07:00
workqueue.c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2019-05-09 13:48:52 -07:00
workqueue_internal.h sched/core, workqueues: Distangle worker accounting from rq lock 2019-04-16 16:55:15 +02:00