OpenCloudOS-Kernel/kernel/bpf
Daniel Borkmann 85782e037f bpf: undo prog rejection on read-only lock failure
Partially undo commit 9facc33687 ("bpf: reject any prog that failed
read-only lock") since it caused a regression: syzkaller managed to
cause a panic via fault injection deep in the set_memory_ro() path by
letting an allocation fail. In x86's __change_page_attr_set_clr() it
was able to change the attributes of the primary mapping but not those
of the alias mapping via cpa_process_alias(), so the second, inner call
to __change_page_attr() via __change_page_attr_set_clr() had to split
a larger page and failed in alloc_pages() with the artificially
triggered allocation error, which was then propagated down to the call
site.
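
For illustration, the failure mode boils down to the following, heavily
simplified sketch; change_primary() and change_alias() are hypothetical
stand-ins and not the actual pageattr code, the point is merely that a
set_memory_ro() error does not imply "nothing changed":

  /* Hypothetical sketch only -- the real logic lives in
   * arch/x86/mm/pageattr.c and is considerably more involved.
   */
  #include <linux/errno.h>

  static int change_primary(unsigned long addr, int numpages)
  {
          return 0;               /* attribute change succeeds */
  }

  static int change_alias(unsigned long addr, int numpages)
  {
          return -ENOMEM;         /* page split allocation fails */
  }

  static int set_ro_sketch(unsigned long addr, int numpages)
  {
          int err;

          /* Primary mapping switches to read-only just fine. */
          err = change_primary(addr, numpages);
          if (err)
                  return err;

          /* Alias mapping needs to split a large page first; the
           * allocation inside the split fails (here via fault
           * injection), and the error is propagated to the caller
           * while the primary mapping stays read-only: "half done".
           */
          return change_alias(addr, numpages);
  }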

Thus, set_memory_ro() returned with an error, yet debugging with a
probe_kernel_write() revealed EFAULT on that memory since the primary
mapping had already been changed. The subsequent hdr->locked = 0 reset
therefore triggered the panic, as it was performed on read-only memory.
The call-site assumption that set_memory_ro() would either fully
succeed /or/ not change anything at all was in fact wrong, since the
set_memory_*() calls provide no rollback after a partial change of
mappings; in other words, we're left in a state that is "half done".
A later undo via set_memory_rw() does succeed, though, due to matching
permissions on that part (that is, due to try_preserve_large_page()
succeeding). When reproducing this locally by explicitly triggering the
error, the initial splitting only happens on rare occasions, and in the
real world it would additionally require OOM conditions; but that said,
it can partially fail. Therefore, with the set_memory_*() semantics we
have today, it is definitely wrong to bail out on a set_memory_ro()
error and reject the program. We shouldn't have gone the extra mile,
since in fact no other in-tree user checks for set_memory_*() errors:
neither module_enable_ro() / module_disable_ro() for module RO/NX
handling, which is mostly enabled by default these days, nor the
kprobes core with alloc_insn_page() / free_insn_page(), both of which
can be invoked long after boot-up. The original commit 314beb9bca
("x86: bpf_jit_comp: secure bpf jit against spraying attacks") did not
check either when this was first introduced to BPF, so "improving" it
by bailing out was clearly not right as long as set_memory_*() cannot
handle it.
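
As a rough sketch of the intended semantics after this undo (the helper
names below are illustrative, not the literal in-tree code): treat
set_memory_ro() as best effort on the lock side and keep the RO -> RW
undo unconditional on the unlock side.

  #include <linux/filter.h>
  #include <linux/set_memory.h>

  /* Illustrative sketch, not the literal diff: a set_memory_ro()
   * failure is tolerated -- no write into the possibly read-only
   * area on the error path and no rejection of the program.
   */
  static void bpf_lock_ro_sketch(struct bpf_prog *fp)
  {
          /* Best effort; if this partially fails, the prog still runs. */
          set_memory_ro((unsigned long)fp, fp->pages);
  }

  static void bpf_unlock_ro_sketch(struct bpf_prog *fp)
  {
          /* The RW undo is expected to always succeed. */
          set_memory_rw((unsigned long)fp, fp->pages);
  }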

Kees suggested that if set_memory_*() can fail, it should be annotated
with __must_check, and all callers would need to deal with failure
gracefully, given that the set_memory_*() markings aren't "advisory"
but are expected to actually do what they say. This might be an option
worth moving towards in the future, but it would at the same time
require that set_memory_*() calls on supporting archs are guaranteed to
be "atomic" in the sense that they roll back if part of the range
fails. Once that is the case, the RW -> RO transition could be made
more robust that way, while the subsequent RO -> RW transition /must/
continue to guarantee that the undo always succeeds.
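
A rough sketch of that idea (assuming rollback-capable semantics; this
is not an existing in-tree contract):

  #include <linux/compiler.h>
  #include <linux/filter.h>

  /* Sketch only: assumes a future set_memory_ro() that is __must_check
   * and guaranteed to roll back the whole range on partial failure.
   */
  int __must_check set_memory_ro(unsigned long addr, int numpages);

  static int bpf_lock_ro_checked_sketch(struct bpf_prog *fp)
  {
          int err = set_memory_ro((unsigned long)fp, fp->pages);

          /* With rollback semantics the range is guaranteed to still
           * be fully RW here, so rejecting the program on error would
           * be safe again.
           */
          return err;
  }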

Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com
Reported-by: syzbot+d866d1925855328eac3b@syzkaller.appspotmail.com
Fixes: 9facc33687 ("bpf: reject any prog that failed read-only lock")
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-29 10:47:35 -07:00
Makefile bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP 2018-05-03 15:55:24 -07:00
arraymap.c bpf: btf: Rename btf_key_id and btf_value_id in bpf_map_info 2018-05-23 12:03:32 +02:00
bpf_lru_list.c bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
bpf_lru_list.h bpf: Only set node->ref = 1 if it has not been set 2017-09-01 09:57:39 -07:00
btf.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
cgroup.c bpf: fix attach type BPF_LIRC_MODE2 dependency wrt CONFIG_CGROUP_BPF 2018-06-26 11:28:38 +02:00
core.c bpf: undo prog rejection on read-only lock failure 2018-06-29 10:47:35 -07:00
cpumap.c xdp: introduce xdp_return_frame_rx_napi 2018-05-24 18:36:15 -07:00
devmap.c xdp: Fix handling of devmap in generic XDP 2018-06-15 23:47:15 +02:00
disasm.c bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
disasm.h bpf: Remove struct bpf_verifier_env argument from print_bpf_insn 2018-03-23 17:38:57 +01:00
hashtab.c bpf: avoid retpoline for lookup/update/delete calls on maps 2018-06-03 07:45:37 -07:00
helpers.c bpf: implement bpf_get_current_cgroup_id() helper 2018-06-03 18:22:41 -07:00
inode.c bpf: implement dummy fops for bpf objects 2018-06-08 10:58:48 -07:00
lpm_trie.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
map_in_map.c bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
map_in_map.h bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
offload.c bpf: offload: allow offloaded programs to use perf event arrays 2018-05-04 23:41:03 +02:00
percpu_freelist.c bpf: fix lockdep splat 2017-11-15 19:46:32 +09:00
percpu_freelist.h bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
sockmap.c bpf: fix attach type BPF_LIRC_MODE2 dependency wrt CONFIG_CGROUP_BPF 2018-06-26 11:28:38 +02:00
stackmap.c bpf: avoid -Wmaybe-uninitialized warning 2018-05-28 17:40:59 +02:00
syscall.c bpf: fix attach type BPF_LIRC_MODE2 dependency wrt CONFIG_CGROUP_BPF 2018-06-26 11:28:38 +02:00
tnum.c bpf/verifier: improve register value range tracking with ARSH 2018-04-29 08:45:53 -07:00
verifier.c treewide: Use array_size() in vzalloc() 2018-06-12 16:19:22 -07:00
xskmap.c xsk: clean up SPDX headers 2018-05-18 16:07:02 +02:00