OpenCloudOS-Kernel/fs/proc
Linus Torvalds f511c0b17b "Yes, people use FOLL_FORCE ;)"
This effectively reverts commit 8ee74a91ac ("proc: try to remove use
of FOLL_FORCE entirely")

It turns out that people do depend on FOLL_FORCE for the /proc/<pid>/mem
case, and we're talking not just debuggers. Talking to the affected people, the use-cases are:

Keno Fischer:
 "We used these semantics as a hardening mechanism in the julia JIT. By
  opening /proc/self/mem and using these semantics, we could avoid
  needing RWX pages, or a dual mapping approach. We do have fallbacks to
  these other methods (though getting EIO here actually causes an assert
  in released versions - we'll updated that to make sure to take the
  fall back in that case).

  Nevertheless the /proc/self/mem approach was our favored approach
  because it a) Required an attacker to be able to execute syscalls
  which is a taller order than getting memory write and b) didn't double
  the virtual address space requirements (as a dual mapping approach
  would).

  I think in general this feature is very useful for anybody who needs
  to precisely control the execution of some other process. Various
  debuggers (gdb/lldb/rr) certainly fall into that category, but there's
  another class of such processes (wine, various emulators) which may
  want to do that kind of thing.

  Now, I suspect most of these will have the other process under ptrace
  control, so maybe allowing (same_mm || ptraced) would be ok, but at
  least for the sandbox/remote-jit use case, it would be perfectly
  reasonable to not have the jit server be a ptracer"

Robert O'Callahan:
 "We write to readonly code and data mappings via /proc/.../mem in lots
  of different situations, particularly when we're adjusting program
  state during replay to match the recorded execution.

  Like Julia, we can add workarounds, but they could be expensive."

so not only do people use FOLL_FORCE for both reads and writes, but they
use it for both the local mm and remote mm.

With these comments in mind, we likely also cannot add the "are we
actively ptracing" check either, so this keeps the new code organization
and does not do a real revert that would add back the original comment
about "Maybe we should limit FOLL_FORCE to actual ptrace users?"

Reported-by: Keno Fischer <keno@juliacomputing.com>
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-30 12:38:59 -07:00
..
Kconfig fs, proc: add help for CONFIG_PROC_CHILDREN 2015-07-17 16:39:52 -07:00
Makefile fs/proc: Add compiler check for -Wno-override-init to support gcc < 4.2 2016-08-03 12:45:23 -04:00
array.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
base.c "Yes, people use FOLL_FORCE ;)" 2017-05-30 12:38:59 -07:00
cmdline.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
consoles.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
cpuinfo.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
devices.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
fd.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
fd.h proc: unsigned file descriptors 2016-09-27 18:47:38 -04:00
generic.c proc: Fix unbalanced hard link numbers 2017-04-28 21:05:26 -05:00
inode.c fs/proc/inode.c: remove cast from memory allocation 2017-05-08 17:15:10 -07:00
internal.h Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
interrupts.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
kcore.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
kmsg.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
loadavg.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h> 2017-03-02 08:42:34 +01:00
meminfo.c meminfo: break apart a very long seq_printf with #ifdefs 2016-10-07 18:46:30 -07:00
namespaces.c pidns: expose task pid_ns_for_children to userspace 2017-05-08 17:15:12 -07:00
nommu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
page.c mm: fix KPF_SWAPCACHE in /proc/kpageflags 2017-02-07 12:08:32 -08:00
proc_net.c Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
proc_sysctl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-05-05 11:08:43 -07:00
proc_tty.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
root.c Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
self.c vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
softirqs.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
stat.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
task_mmu.c proc: show MADV_FREE pages info in smaps 2017-05-03 15:52:08 -07:00
task_nommu.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> 2017-03-02 08:42:28 +01:00
thread_self.c vfs: remove ".readlink = generic_readlink" assignments 2016-12-09 16:45:04 +01:00
uptime.c sched/cputime: Convert kcpustat to nsecs 2017-02-01 09:13:47 +01:00
version.c fs/proc: don't use module_init for non-modular core code 2014-01-23 16:37:02 -08:00
vmcore.c userfaultfd: non-cooperative: add event for memory unmaps 2017-02-24 17:46:55 -08:00