forked from OSchip/llvm-project
5fbe01b12d
For networking-type bpf program, it often needs to access packet data. A context data structure is provided to the bpf programs with two fields: u32 data; u32 data_end; User can access these two fields with ctx->data and ctx->data_end. During program verification process, the kernel verifier modifies the bpf program with loading of actual pointer value from kernel data structure. r = ctx->data ===> r = actual data start ptr r = ctx->data_end ===> r = actual data end ptr A typical program accessing ctx->data like char *data_ptr = (char *)(long)ctx->data will result in a 32-bit load followed by a zero extension. Such an operation is combined into a single LDW in DAG combiner as bpf LDW does zero extension automatically. In cases like the below (which can be a result of global value numbering and partial redundancy elimination before insn selection): B1: u32 a = load-32-bit &ctx->data u64 pa = zext a ... B2: u32 b = load-32-bit &ctx->data u64 pb = zext b ... B3: u32 m = PHI(a, b) u64 pm = zext m In B3, "pm = zext m" cannot be removed, which although is legal from compiler perspective, will generate incorrect code after kernel verification. This patch recognizes this pattern and traces through PHI node to see whether the operand of "zext m" is defined with LDWs or not. If it is, the "zext m" itself can be removed. The patch also recognizes the pattern where the load and use of the load value not in the same basic block, where truncate operation may be removed as well. The patch handles 1-byte, 2-byte and 4-byte truncation. Two test cases are added to verify the transformation happens properly for the above code pattern. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 306685 |
||
---|---|---|
.. | ||
alu8.ll | ||
atomics.ll | ||
basictest.ll | ||
byval.ll | ||
cc_args.ll | ||
cc_args_be.ll | ||
cc_ret.ll | ||
cmp.ll | ||
dwarfdump.ll | ||
ex1.ll | ||
fi_ri.ll | ||
intrinsics.ll | ||
lit.local.cfg | ||
load.ll | ||
loops.ll | ||
many_args1.ll | ||
many_args2.ll | ||
mem_offset.ll | ||
mem_offset_be.ll | ||
objdump_atomics.ll | ||
objdump_intrinsics.ll | ||
objdump_trivial.ll | ||
reloc.ll | ||
remove_truncate_1.ll | ||
remove_truncate_2.ll | ||
rodata_1.ll | ||
rodata_2.ll | ||
rodata_3.ll | ||
rodata_4.ll | ||
sanity.ll | ||
sdiv_error.ll | ||
setcc.ll | ||
shifts.ll | ||
sockex2.ll | ||
struct_ret1.ll | ||
struct_ret2.ll | ||
undef.ll | ||
vararg1.ll | ||
warn-call.ll | ||
warn-stack.ll |