linux-sg2042/kernel
Jiong Wang e434b8cdf7 bpf: relax verifier restriction on BPF_MOV | BPF_ALU
Currently, the destination register is marked as unknown for 32-bit
sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
SCALAR_VALUE.

This is too conservative that some valid cases will be rejected.
Especially, this may turn a constant scalar value into unknown value that
could break some assumptions of verifier.

For example, test_l4lb_noinline.c has the following C code:

    struct real_definition *dst

1:  if (!get_packet_dst(&dst, &pckt, vip_info, is_ipv6))
2:    return TC_ACT_SHOT;
3:
4:  if (dst->flags & F_IPV6) {

get_packet_dst is responsible for initializing "dst" into valid pointer and
return true (1), otherwise return false (0). The compiled instruction
sequence using alu32 will be:

  412: (54) (u32) r7 &= (u32) 1
  413: (bc) (u32) r0 = (u32) r7
  414: (95) exit

insn 413, a BPF_MOV | BPF_ALU, however will turn r0 into unknown value even
r7 contains SCALAR_VALUE 1.

This causes trouble when verifier is walking the code path that hasn't
initialized "dst" inside get_packet_dst, for which case 0 is returned and
we would then expect verifier concluding line 1 in the above C code pass
the "if" check, therefore would skip fall through path starting at line 4.
Now, because r0 returned from callee has became unknown value, so verifier
won't skip analyzing path starting at line 4 and "dst->flags" requires
dereferencing the pointer "dst" which actually hasn't be initialized for
this path.

This patch relaxed the code marking sub-register move destination. For a
SCALAR_VALUE, it is safe to just copy the value from source then truncate
it into 32-bit.

A unit test also included to demonstrate this issue. This test will fail
before this patch.

This relaxation could let verifier skipping more paths for conditional
comparison against immediate. It also let verifier recording a more
accurate/strict value for one register at one state, if this state end up
with going through exit without rejection and it is used for state
comparison later, then it is possible an inaccurate/permissive value is
better. So the real impact on verifier processed insn number is complex.
But in all, without this fix, valid program could be rejected.

>From real benchmarking on kernel selftests and Cilium bpf tests, there is
no impact on processed instruction number when tests ares compiled with
default compilation options. There is slightly improvements when they are
compiled with -mattr=+alu32 after this patch.

Also, test_xdp_noinline/-mattr=+alu32 now passed verification. It is
rejected before this fix.

Insn processed before/after this patch:

                        default     -mattr=+alu32

Kernel selftest

===
test_xdp.o              371/371      369/369
test_l4lb.o             6345/6345    5623/5623
test_xdp_noinline.o     2971/2971    rejected/2727
test_tcp_estates.o      429/429      430/430

Cilium bpf
===
bpf_lb-DLB_L3.o:        2085/2085     1685/1687
bpf_lb-DLB_L4.o:        2287/2287     1986/1982
bpf_lb-DUNKNOWN.o:      690/690       622/622
bpf_lxc.o:              95033/95033   N/A
bpf_netdev.o:           7245/7245     N/A
bpf_overlay.o:          2898/2898     3085/2947

NOTE:
  - bpf_lxc.o and bpf_netdev.o compiled by -mattr=+alu32 are rejected by
    verifier due to another issue inside verifier on supporting alu32
    binary.
  - Each cilium bpf program could generate several processed insn number,
    above number is sum of them.

v1->v2:
 - Restrict the change on SCALAR_VALUE.
 - Update benchmark numbers on Cilium bpf tests.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-10 09:23:33 -08:00
..
bpf bpf: relax verifier restriction on BPF_MOV | BPF_ALU 2018-12-10 09:23:33 -08:00
cgroup for-linus-20181102 2018-11-02 11:25:48 -07:00
configs kvm_config: add CONFIG_VIRTIO_MENU 2018-10-24 20:55:56 -04:00
debug kdb: kdb_support: mark expected switch fall-throughs 2018-11-13 20:38:50 +00:00
dma swiotlb: Skip cache maintenance on map error 2018-11-21 18:47:58 +01:00
events Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-03 18:13:43 -07:00
gcov gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT 2018-06-08 18:56:02 +09:00
irq irq/matrix: Fix memory overallocation 2018-11-01 10:00:38 +01:00
livepatch Merge branch 'for-4.19/upstream' into for-linus 2018-08-20 18:33:50 +02:00
locking mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
power memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
printk mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
rcu Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' and 'torture.2018.08.29a' into HEAD 2018-08-30 16:12:53 -07:00
sched Merge branch 'akpm' (patches from Andrew) 2018-11-18 11:31:26 -08:00
time posix-cpu-timers: Remove useless call to check_dl_overrun() 2018-11-08 07:43:35 +01:00
trace bpf: fix check of allowed specifiers in bpf_trace_printk 2018-11-23 21:54:14 +01:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt kconfig: include kernel/Kconfig.preempt from init/Kconfig 2018-08-02 08:06:54 +09:00
Makefile x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls 2018-09-04 10:35:47 -07:00
acct.c
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2018-02-06 18:32:44 -08:00
audit.c audit: use ktime_get_coarse_real_ts64() for timestamps 2018-07-17 14:45:08 -04:00
audit.h audit: track the owner of the command mutex ourselves 2018-02-23 11:22:22 -05:00
audit_fsnotify.c fsnotify: add fsnotify_add_inode_mark() wrappers 2018-05-18 14:58:22 +02:00
audit_tree.c \n 2018-08-17 09:41:28 -07:00
audit_watch.c audit: fix use-after-free in audit_add_watch 2018-07-18 11:43:36 -04:00
auditfilter.c audit: rename FILTER_TYPE to FILTER_EXCLUDE 2018-06-19 10:39:54 -04:00
auditsc.c audit/stable-4.18 PR 20180814 2018-08-15 10:46:54 -07:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-10-31 08:54:14 -07:00
capability.c
compat.c y2038: globally rename compat_time to old_time32 2018-08-27 14:48:48 +02:00
configs.c
context_tracking.c
cpu.c Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 18:43:04 +01:00
cpu_pm.c
crash_core.c kernel/crash_core.c: print timestamp using time64_t 2018-08-22 10:52:47 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: track delays from thrashing cache pages 2018-10-26 16:26:32 -07:00
dma.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
elfcore.c
exec_domain.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
exit.c signal: Pass pid type into group_send_sig_info 2018-07-21 12:57:35 -05:00
extable.c extable: Make init_kernel_text() global 2018-02-21 16:54:06 +01:00
fail_function.c kernel/fail_function.c: remove meaningless null pointer check before debugfs_remove_recursive 2018-10-31 08:54:12 -07:00
fork.c New gcc plugin: stackleak 2018-11-01 11:46:27 -07:00
freezer.c PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
futex.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
futex_compat.c y2038: globally rename compat_time to old_time32 2018-08-27 14:48:48 +02:00
groups.c
hung_task.c kernel: hung_task.c: disable on suspend 2018-10-25 18:45:08 +02:00
iomem.c memremap: split devm_memremap_pages() and memremap() infrastructure 2018-05-15 23:08:33 -07:00
irq_work.c
jump_label.c Merge branch 'x86/build' into locking/core, to pick up dependent patches and unify jump-label work 2018-10-16 17:30:11 +02:00
kallsyms.c kallsyms: reduce size a little on 64-bit 2018-09-10 22:54:33 +09:00
kcmp.c
kcov.c sched/core / kcov: avoid kcov_area during task switch 2018-06-15 07:55:24 +09:00
kexec.c kexec: add call to LSM hook in original kexec_load syscall 2018-07-16 12:31:57 -07:00
kexec_core.c kexec: Allocate decrypted control pages for kdump if SME is enabled 2018-10-06 12:01:51 +02:00
kexec_file.c kernel/kexec_file.c: remove some duplicated includes 2018-11-03 10:09:37 -07:00
kexec_internal.h
kmod.c
kprobes.c kprobes: Don't call BUG_ON() if there is a kprobe in use on free list 2018-09-12 08:01:16 +02:00
ksysfs.c
kthread.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
latencytop.c
memremap.c Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
module-internal.h modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
module.c jump_table: Move entries into ro_after_init region 2018-09-27 17:56:49 +02:00
module_signing.c modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
notifier.c
nsproxy.c
padata.c
panic.c kernel/panic.c: filter out a potential trailing newline 2018-10-31 08:54:14 -07:00
params.c kernel/params.c: downgrade warning for unsafe parameters 2018-04-11 10:28:37 -07:00
pid.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
pid_namespace.c signal: Use group_send_sig_info to kill all processes in a pid namespace 2018-09-16 16:08:25 +02:00
profile.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ptrace.c Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-10-24 11:22:39 +01:00
range.c
reboot.c kernel/reboot.c: export pm_power_off_prepare 2018-09-11 16:13:24 +01:00
relay.c kernel/relay.c: change return type to vm_fault_t 2018-06-15 07:55:24 +09:00
resource.c resource/docs: Complete kernel-doc style function documentation 2018-11-07 16:47:47 +01:00
rseq.c rseq: uapi: Declare rseq_cs field as union, update includes 2018-07-10 22:18:52 +02:00
seccomp.c Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2018-10-24 11:49:35 +01:00
signal.c kernel/signal.c: fix a comment error 2018-10-31 08:54:14 -07:00
smp.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
smpboot.c smpboot: Remove cpumask from the API 2018-07-03 09:20:44 +02:00
smpboot.h
softirq.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-25 11:43:47 -07:00
stackleak.c stackleak: Allow runtime disabling of kernel stack erasing 2018-09-04 10:35:48 -07:00
stacktrace.c
stop_machine.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
sys.c kernel/sys.c: remove duplicated include 2018-09-20 22:01:11 +02:00
sys_ni.c Merge branch 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-10 10:17:09 -07:00
sysctl.c kernel/sysctl.c: remove duplicated include 2018-11-03 10:09:37 -07:00
sysctl_binary.c staging: irda: remove remaining remants of irda code removal 2018-04-16 11:26:49 +02:00
task_work.c
taskstats.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
test_kprobes.c kprobes: Remove jprobe API implementation 2018-06-21 12:33:05 +02:00
torture.c rcutorture: Check GP completion at stutter end 2018-08-29 09:20:48 -07:00
tracepoint.c tracepoint: Fix tracepoint array element size mismatch 2018-10-17 15:35:29 -04:00
tsacct.c
ucount.c headers: untangle kmemleak.h from mm.h 2018-04-05 21:36:27 -07:00
uid16.c fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers 2018-04-02 20:15:59 +02:00
uid16.h kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c 2018-04-02 20:15:30 +02:00
umh.c umh: Add command line to user mode helpers 2018-10-22 19:37:36 -07:00
up.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
user-return-notifier.c
user.c userns: use irqsave variant of refcount_dec_and_lock() 2018-08-22 10:52:47 -07:00
user_namespace.c userns: also map extents in the reverse map to kernel IDs 2018-11-07 23:51:16 -06:00
utsname.c uts: create "struct uts_namespace" from kmem_cache 2018-04-11 10:28:35 -07:00
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-08-11 02:05:53 -05:00
watchdog.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
watchdog_hld.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
workqueue.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
workqueue_internal.h workqueue: Set worker->desc to workqueue name by default 2018-05-18 08:47:13 -07:00