When investigating an issue with bcc tool inject.py, I found
a verifier failure with the latest clang. The relevant code can be
illustrated as below:
struct pid_struct {
  u64 curr_call;
  u64 conds_met;
  u64 stack[2];
};
struct pid_struct *bpf_map_lookup_elem();
int foo() {
  struct pid_struct *p = bpf_map_lookup_elem();
  if (!p) return 0;
  p->curr_call--;
  if (p->conds_met < 1 || p->conds_met >= 3)
    return 0;
  if (p->stack[p->conds_met - 1] == p->curr_call)
    p->conds_met--;
  ...
}
The verifier failure looks like:
...
8: (79) r1 = *(u64 *)(r0 +0)
R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R10=fp0 fp-8=mmmm????
9: (07) r1 += -1
10: (7b) *(u64 *)(r0 +0) = r1
R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmm????
11: (79) r2 = *(u64 *)(r0 +8)
R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmm????
12: (bf) r3 = r2
13: (07) r3 += -3
14: (b7) r4 = -2
15: (2d) if r4 > r3 goto pc+13
R0=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1=inv(id=0) R2=inv(id=2)
R3=inv(id=0,umin_value=18446744073709551614,var_off=(0xffffffff00000000; 0xffffffff))
R4=inv-2 R10=fp0 fp-8=mmmm????
16: (07) r2 += -1
17: (bf) r3 = r2
18: (67) r3 <<= 3
19: (bf) r4 = r0
20: (0f) r4 += r3
math between map_value pointer and register with unbounded min value is not allowed
Here the compiler optimized "p->conds_met < 1 || p->conds_met >= 3" to
r2 = p->conds_met
r3 = r2
r3 += -3
r4 = -2
if (r3 < r4) return 0
r2 += -1
r3 = r2
...
In the above, r3 is initially a copy of r2, but it is the modified r3
(after r3 += -3) that the comparison bounds. Later on r2 itself, for
which the verifier has no bounds, is used again. This causes the
verification failure.
The BPF backend has a pass, AdjustOpt, to prevent such a transformation,
but it only handled signed integers, since typical bpf helpers return
signed integers. To fix this case, let us handle unsigned integers as well.
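For illustration, a minimal sketch (not from the patch) of the effect
AdjustOpt aims for in the foo() above: keeping the two compares separate,
so the verifier can bound the value that is later used as an index.
  /* sketch: with the compares kept separate, the verifier learns
   * 1 <= conds_met < 3 directly on the value used as the index */
  u64 conds_met = p->conds_met;
  if (conds_met < 1)
    return 0;
  if (conds_met >= 3)
    return 0;
  if (p->stack[conds_met - 1] == p->curr_call)
    p->conds_met--;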
Differential Revision: https://reviews.llvm.org/D121937
When checking a bcc issue related to bcc tool inject.py,
I found a bug in the icmp transformation of the BPFAdjustOpt pass,
caused by a typo. For the following condition:
Cond2Op != ICmpInst::ICMP_SLT && Cond1Op != ICmpInst::ICMP_SLE
it should be
Cond2Op != ICmpInst::ICMP_SLT && Cond2Op != ICmpInst::ICMP_SLE
This patch fixes the problem and adds a test case.
Differential Revision: https://reviews.llvm.org/D121883
Jussi Maki reported a fatal error like below for a bitfield
CO-RE relocation:
fatal error: error in backend: Unsupported field expression for
llvm.bpf.preserve.field.info, requiring too big alignment
The failure is related to the kernel struct thread_struct. The following
is a simplified example.
Suppose we have the structure below:
struct t2 {
  int a[8];
} __attribute__((aligned(64))) __attribute__((preserve_access_index));
struct t1 {
  int f1:1;
  int f2:2;
  struct t2 f3;
} __attribute__((preserve_access_index));
Note that struct t2 is aligned to 64 bytes, which is sometimes used in
the kernel to enforce cache line alignment.
The above structs will be encoded into BTF. The following is what the C
code reconstructed from BTF looks like, as the structs would appear in a
file like vmlinux.h:
struct t2 {
  int a[8];
  long: 64;
  long: 64;
  long: 64;
  long: 64;
} __attribute__((preserve_access_index));
struct t1 {
  int f1: 1;
  int f2: 2;
  long: 61;
  long: 64;
  long: 64;
  long: 64;
  long: 64;
  long: 64;
  long: 64;
  long: 64;
  struct t2 f3;
} __attribute__((preserve_access_index));
Note that after the
origin_source -> BTF -> new_source
transition, the new source has the same memory layout as the old one,
but the alignment interpretation inside the compiler could be different.
The bpf program will use the later explicitly padded structure as in
vmlinux.h.
In the above case, the compiler internal ABI alignment for the new
struct t1 is 16, while it is 4 for the old struct t1. I didn't do a
thorough investigation into why the ABI alignment is 16; I suspect it
is related to the anonymous padding above.
The current BPF bitfield CO-RE handling requires alignment <= 8 so a
proper bitfield operation can be performed. Therefore, alignment 16
causes a compiler fatal error.
To handle ABI alignment >= 16, let us check whether the bitfield
can be held within an 8-byte-aligned range. If so,
we can use alignment 8. Otherwise, a fatal error is reported.
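For illustration, a minimal sketch of the new check (the helper name is
hypothetical):
  #include <stdbool.h>
  /* sketch: can the bitfield be covered by one 8-byte load whose address
   * is 8-byte aligned? If so, alignment 8 suffices for the relocation. */
  static bool fits_in_aligned_8_bytes(unsigned bit_offset, unsigned bit_size) {
    unsigned window_start = (bit_offset / 64) * 64; /* round down to 64 bits */
    return bit_offset + bit_size <= window_start + 64;
  }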
Differential Revision: https://reviews.llvm.org/D121821
In BPF backend, BTF type generation may skip
some debuginfo types if they are the pointee
type of a struct member. For example,
struct task_struct {
...
struct mm_struct *mm;
...
};
The BPF backend may generate a forward decl for
'struct mm_struct' instead of the full type if
there are no other uses of 'struct mm_struct'.
The reason is to avoid bringing too many unneeded types
into BTF.
Alexei found a pruning bug where we may miss
some full type generation. The following is an illustrative
example:
struct t1 { ... };
struct t2 { struct t1 *p; };
struct t2 g;
void foo(struct t1 *arg) { ... }
In the above case, we will have partial debuginfo chain like below:
struct t2 -> member p
\ -> ptr -> struct t1
/
foo -> argument arg
During traversal of
struct t2 -> member p -> ptr -> struct t1
the corresponding BTF types are generated, except 'struct t1', which
is left in the FixUp stage. Later, when traversing
foo -> argument arg -> ptr -> struct t1
the 'ptr' BTF type has already been generated, and the current
implementation ignores the 'pointer' type, hence 'struct t1' is never
generated.
This patch fixes the issue, not just for the above case but for the
general case with multiple derived types, e.g.,
struct t2 -> member p
\ -> const -> ptr -> volatile -> struct t1
/
foo -> argument arg
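For illustration, a hypothetical C source that could produce such a
chain (assuming the debuginfo nodes for the member and the argument
are shared):
  struct t1 { int a; };
  /* member p: const -> ptr -> volatile -> struct t1 */
  struct t2 { volatile struct t1 *const p; };
  struct t2 g;
  void foo(volatile struct t1 *const arg);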
Differential Revision: https://reviews.llvm.org/D119986
Kumar Kartikeya Dwivedi reported a bug ([1]) where BTF_KIND_TYPE_TAG types
are not generated.
Currently, BPF backend only generates BTF types which are used by
the program, e.g., global variables, functions and some builtin functions.
For example, suppose we have
struct task_struct {
...
struct task_group *sched_task_group;
struct mm_struct *mm;
...
pid_t pid;
pid_t tgid;
...
};
If a BPF program intends to access task_struct->pid and task_struct->tgid,
there is really no need to generate BTF types for struct task_group
and mm_struct.
In BPF backend, during BTF generation, when generating BTF for struct
task_struct, if types for task_group and mm_struct have not been generated
yet, a Fixup structure will be created, which will be reexamined later
to instantiate into either a full type or a forward type.
In the current implementation, if we have something like
struct foo {
struct bar __tag1 *f;
};
and struct bar's type has not yet been generated when generating types
for struct foo, the __tag1 will be lost during later
Fixup instantiation. This patch fixes the issue by properly
handling btf_type_tag types during the Fixup instantiation stage.
[1] https://lore.kernel.org/bpf/20220210232411.pmhzj7v5uptqby7r@apollo.legion/
Differential Revision: https://reviews.llvm.org/D119799
Patch [1] added a further InstCombine trunc+icmp => mask+icmp
optimization, and this caused a couple of bpf selftest failures.
A previous llvm BPF backend patch [2] introduced the llvm.bpf.compare
builtin to handle such situations.
This patch further adds support for the ">" and ">=" icmp opcodes.
Tested with bpf selftests; all tests pass, including the two
previously failing ones.
Note that Patch [1] also added an optimization for the case where the
to-be-compared constant is negative-power-of-2 (-C) or not-of-power-of-2 (~C).
This patch didn't implement these two cases, as a typical bpf
program compares a scalar to a positive length or boundary value,
and this scalar is later used as an index into an array buffer
or packet buffer.
[1] https://reviews.llvm.org/D112634
[2] https://reviews.llvm.org/D112938
Differential Revision: https://reviews.llvm.org/D114215
For the declaration like below:
int __tag1 * __tag1 __tag2 *g
Commit 41860e602a ("BPF: Support btf_type_tag attribute")
implemented the following encoding:
VAR(g) -> __tag1 --> __tag2 -> pointer -> __tag1 -> pointer -> int
Some further experiments with linux btf_type_tag support, esp.
with generating attributes in vmlinux.h, and also some internal
discussion showed the following format is more desirable:
VAR(g) -> pointer -> __tag2 -> __tag1 -> pointer -> __tag1 -> int
The format makes it similar to other modifiers like 'const', e.g.,
const int *g
which has encoding VAR(g) -> PTR -> CONST -> int
Differential Revision: https://reviews.llvm.org/D113496
InstCombine converts range tests of the form (X > C1 && X < C2) or
(X < C1 || X > C2) into checks of the form (X + C3 < C4) or
(X + C3 > C4). It is possible to express all range tests in either
of these forms (with different choices of constants), but currently
neither of them is considered canonical. We may have equivalent
range tests using either ult or ugt.
This proposes to canonicalize all range tests to use ult. An
alternative would be to canonicalize to either ult or ugt depending
on the specific constants involved -- e.g. in practice we currently
generate ult for && style ranges and ugt for || style ranges when
going through the insertRangeTest() helper. In fact, the "clamp like"
fold was relying on this, which is why I had to tweak it to not
assume whether inversion is needed based on just the predicate.
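For a concrete illustration (not from the patch), an && style range
test folding into a single unsigned (ult) compare:
  /* 5 <= x && x <= 10 folds to one unsigned compare:
   * (x - 5) ult 6, i.e. (X + C3) < C4 with C3 = -5, C4 = 6 */
  int in_range(unsigned x) {
    return x - 5u < 6u;
  }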
Proof: https://alive2.llvm.org/ce/z/_SP_rQ
Differential Revision: https://reviews.llvm.org/D113366
A new kind BTF_KIND_TYPE_TAG is defined. The tags associated
with a pointer type are emitted in their IR order as modifiers.
For example, for the following declaration:
int __tag1 * __tag1 __tag2 *g;
The BTF type chain will look like
VAR(g) -> __tag1 --> __tag2 -> pointer -> __tag1 -> pointer -> int
In the above "->" means BTF CommonType.Type which indicates
the point-to type.
Differential Revision: https://reviews.llvm.org/D113222
Commit acabad9ff6 ("[InstCombine] try to canonicalize icmp with
trunc op into mask and cmp") added a transformation to
convert "(conv)a < power_2_const" to "a & <const>" in certain
cases. The bpf kernel verifier has to handle the resulting code
conservatively, and this may reject otherwise legitimate programs.
This commit tries to prevent such a transformation. A bpf backend
builtin, llvm.bpf.compare, is added. The ICMP insn, which is subject to
the above InstCombine transformation, is converted to the builtin
function. The builtin function is later lowered back to the original
ICMP insn, after the InstCombine pass has run.
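For illustration, a sketch (hypothetical code, not from the patch) of
the kind of source pattern affected:
  /* sketch: the trunc+icmp pair below may be canonicalized into a mask
   * test like ((a & 0xf0) == 0), which the verifier cannot use to
   * bound the index */
  int read_elem(const int *buf, unsigned long a) {
    unsigned char idx = (unsigned char)a; /* trunc */
    if (idx < 16)                         /* icmp ult power-of-2 const */
      return buf[idx];
    return 0;
  }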
With this change, all affected bpf strobemeta* selftests
pass now.
Differential Revision: https://reviews.llvm.org/D112938
If a typedef type has __attribute__((btf_decl_tag("str"))) with
bpf target, emit BTF_KIND_DECL_TAG for that type in the BTF.
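For example, a declaration like the following minimal sketch gets a
BTF_KIND_DECL_TAG entry pointing at the typedef (names are illustrative):
  typedef int int_tagged __attribute__((btf_decl_tag("tag1")));
  int_tagged g;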
Differential Revision: https://reviews.llvm.org/D112259
Currently, .BTF and .BTF.ext has default alignment of 1.
For example,
$ cat t.c
int foo() { return 0; }
$ clang -target bpf -O2 -c -g t.c
$ llvm-readelf -S t.o
...
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
...
[ 7] .BTF PROGBITS 0000000000000000 000167 00008b 00 0 0 1
[ 8] .BTF.ext PROGBITS 0000000000000000 0001f2 000050 00 0 0 1
But to avoid misaligned data accesses, .BTF and .BTF.ext
actually require an alignment of 4. Misalignment is not an issue
for architectures like x64/arm64, which handle it well, but
some architectures like mips may incur a trap if .BTF/.BTF.ext
is not properly aligned.
This patch explicitly forces .BTF and .BTF.ext alignment to be 4.
For the above example, we will have
[ 7] .BTF PROGBITS 0000000000000000 000168 00008b 00 0 0 4
[ 8] .BTF.ext PROGBITS 0000000000000000 0001f4 000050 00 0 0 4
Differential Revision: https://reviews.llvm.org/D112106
Paul Chaignon reported a bpf verifier failure ([1]) due to using
non-ABI register R11. For the test case, llvm11 is okay while
llvm12 and later generates verifier unfriendly code.
The failure is related to variable length array size.
The following mimics the variable length array definition
in the test case:
struct t { char a[20]; };
void foo(void *);
int test() {
  const int a = 8;
  char tmp[AA + sizeof(struct t) + a];
  foo(tmp);
  ...
}
Paul helped bisect and found the following llvm commit
responsible:
552c6c2328 ("PR44406: Follow behavior of array bound constant
folding in more recent versions of GCC.")
Basically, before the above commit, the clang frontend constant-folded
the array size "AA + sizeof(struct t) + a" to 68,
so alloca was used for the stack allocation. After the above commit,
the clang frontend no longer constant-folds the array size,
which results in a VLA, and llvm.stacksave/llvm.stackrestore
are generated.
The BPF architecture does not support a stack pointer (sp) register.
LLVM internally uses R11 to represent the sp register, but it must not
appear in the final code; otherwise, the kernel verifier will reject it.
An earlier patch ([2]) tried to fix the issue in the clang frontend,
but the upstream discussion considered the frontend fix a hack;
the backend should properly undo llvm.stacksave/llvm.stackrestore instead.
This patch implements a bpf IR phase to remove these intrinsics
unconditionally. If the alloca can eventually be resolved with a
constant size, r11 will not be generated. If the alloca cannot be
resolved with a constant size, SelectionDAG will complain, the same
as without this patch.
[1] https://lore.kernel.org/bpf/20210809151202.GB1012999@Mem/
[2] https://reviews.llvm.org/D107882
Differential Revision: https://reviews.llvm.org/D111897
Per discussion in https://reviews.llvm.org/D111199,
the existing btf_tag attribute is renamed to
btf_decl_tag. This patch updates the BTF backend to
use the btf_decl_tag attribute name and also
renames BTF_KIND_TAG to BTF_KIND_DECL_TAG.
Differential Revision: https://reviews.llvm.org/D111592
In llvm, for non-alu32 mode, the stack alignment is 64bit, so there is
only one 64bit spill per 64bit slot. For alu32 mode, the stack alignment
is 32bit, so it is possible to have two 32bit spills per
64bit slot.
Currently, the bpf kernel verifier does not preserve register states
for 32bit spills. That is, a 32bit register may hold a constant
value or a bounded range before the spill, but after the reload from
the stack the information is lost, and sometimes this causes a
verifier failure. For 64bit register spills, the verifier
does try to preserve the register state for reloading.
The current verifier can be modestly changed to handle one
32bit spill per 64bit stack slot with a state-preserving reload.
Handling two 32bit spills per 64bit stack slot would require
substantial changes.
This patch changes the stack alignment for alu32 to 64bit.
This way, for any 64bit slot in alu32 mode, only one
32bit or 64bit register value can be saved. Together
with the previously mentioned verifier enhancement, 32bit
spills can be handled with state preserving.
Note that llvm stack slot coalescing
seems to do only adjacent packing, which may leave some holes
in the stack. For example,
stack slot 8 <== 8 bytes
stack slot 4 <== 8 bytes with 4 byte hole
stack slot 8 <== 8 bytes
stack slot 4 <== 4 bytes
Differential Revision: https://reviews.llvm.org/D109073
Currently, opaque pointers are supported in two forms: The
-force-opaque-pointers mode, where all pointers are opaque and
typed pointers do not exist. And as a simple ptr type that can
coexist with typed pointers.
This patch removes support for the mixed mode. You either get
typed pointers, or you get opaque pointers, but not both. In the
(current) default mode, using ptr is forbidden. In -opaque-pointers
mode, all pointers are opaque.
The motivation here is that the mixed mode introduces additional
issues that don't exist in fully opaque mode. D105155 is an example
of a design problem. Looking at D109259, it would probably need
additional work to support mixed mode (e.g. to generate GEPs for
typed base but opaque result). Mixed mode will also end up
inserting many casts between i8* and ptr, which would require
significant additional work to consistently avoid.
I don't think the mixed mode is particularly valuable, as it
doesn't align with our end goal. The only thing I've found it to
be moderately useful for is adding some opaque pointer tests in
between typed pointer tests, but I think we can live without that.
Differential Revision: https://reviews.llvm.org/D109290
Previously we had the following binary representation:
struct btf_type { name, info, type }
struct btf_tag { __u32 component_idx; }
If the tag points to a struct/union/var/func type, we have
kflag = 1, component_idx = 0
If the tag points to a struct/union member or func argument, we have
kflag = 0, component_idx = 0, ..., vlen - 1
The above makes the interface rather complex: both kflag and
component_idx are needed to determine a tag's legality and index.
This patch simplifies the interface by removing the kflag involvement:
component_idx = (u32)-1 : tag pointing to a type
component_idx = 0 ... vlen - 1 : tag pointing to a member or argument
and kflag is always 0, so there is no need to check it.
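A sketch (hypothetical consumer code) of how the simplified encoding is
interpreted:
  #include <stdint.h>
  struct btf_tag { uint32_t component_idx; };
  /* (u32)-1 means the tag applies to the referenced type itself;
   * otherwise it applies to member/argument number component_idx */
  int tag_targets_type(const struct btf_tag *tag) {
    return tag->component_idx == (uint32_t)-1;
  }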
Differential Revision: https://reviews.llvm.org/D109560
A new kind BTF_KIND_TAG is added to .BTF to encode
btf_tag attributes. The format looks like:
CommonType.name : attribute string
CommonType.type : the struct/union/func/var the tag is attached to
CommonType.info : encodes BTF_KIND_TAG;
kflag == 1 indicates the attribute is for CommonType.type,
kflag == 0 for a struct/union member or func argument
one uint32_t : encodes which member/argument, starting from 0.
If a particular type or member/argument has more than one attribute,
multiple BTF_KIND_TAG entries will be generated.
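For illustration, a hypothetical example of attributes that become
BTF_KIND_TAG entries (names are illustrative):
  #define __tag(x) __attribute__((btf_tag(x)))
  struct pkt {
    int len __tag("member_tag"); /* kflag == 0, member index 0 */
  } __tag("struct_tag");         /* kflag == 1 */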
Differential Revision: https://reviews.llvm.org/D106622
Kuniyuki Iwashima reported in [1] that the llvm compiler may
convert a loop exit condition "i < bound" to "i != bound", where
"i" is the loop index variable and "bound" is the upper bound.
If "bound" is not a constant, the verifier cannot prove that
"i != bound" ever becomes false, so to the verifier this is
an infinite loop, causing a verifier failure.
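A sketch (hypothetical code) of the affected pattern:
  /* with a non-constant bound, "i < bound" rewritten to "i != bound"
   * looks like a possibly infinite loop to the verifier */
  int sum(const int *arr, int bound) {
    int s = 0;
    for (int i = 0; i < bound; i++) /* may be rewritten to i != bound */
      s += arr[i];
    return s;
  }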
The fix is to avoid transforming "i < bound" to "i != bound".
In llvm, the transformation is done by the IndVarSimplify pass.
The compiler checks the cost of the loop increment (i = i + 1), and if
the cost is low enough, it may transform "i < bound" into "i != bound".
This patch implements getArithmeticInstrCost() in the BPF
TargetTransformInfo class to return a higher cost for such operations,
which prevents the transformation for the test case
added in this patch.
[1] https://lore.kernel.org/netdev/1994df05-8f01-371f-3c3b-d33d7836878c@fb.com/
Differential Revision: https://reviews.llvm.org/D107483
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep':
define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
%gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
%gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3
%sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1
ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
%sel = select i1 %a2, i64 %a1, i64 %a3
%gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
ret <4 x i32>* %gep
}
This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address:
define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) {
%gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
%sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep
ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) {
%sel = select i1 %a2, i64 0, i64 %a1
%gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
ret <4 x i32>* %gep
}
Reapplied with a fix for the bpf "-bpf-disable-avoid-speculation" tests
Differential Revision: https://reviews.llvm.org/D105901
Currently, BPF only contains three relocations:
R_BPF_NONE for no relocation
R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation
R_BPF_64_32 for call insn and normal 32-bit data relocation
Also, the .BTF and .BTF.ext sections contain symbols referring to
allocated program and data sections. These two sections reserve 32bit
space to hold the offset relative to the symbol's section.
When the LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld
may attempt to resolve relocations for .BTF and .BTF.ext,
which we want to prevent, so we used R_BPF_NONE for such relocations.
This all works fine until we try to link
multiple objects together:
. R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data
is different, so lld target->relocate() needs more context
to do a correct job.
. The same for R_BPF_64_32. More context is needed for
lld target->relocate() to differentiate call insn vs.
normal 32-bit data relocation.
. Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE,
they will not be relocated properly when multiple .BTF/.BTF.ext
sections are merged by lld.
This patch intends to address this issue by adding additional
relocation kinds:
R_BPF_64_ABS64 for normal 64-bit data relocation
R_BPF_64_ABS32 for normal 32-bit data relocation
R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations.
The old R_BPF_64_{64,32} semantics:
R_BPF_64_64 for LD_imm64 relocation
R_BPF_64_32 for call insn relocation
The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values
is maintained, since they are the most common use cases for
bpf programs and we want to maintain backward compatibility
as much as possible.
ExecutionEngine RuntimeDyld BPF relocations are adjusted as well.
R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and
other relocations will be ignored.
Two tests are added for RuntimeDyld; not handling R_BPF_64_NODYLD32 in
RuntimeDyldELF.cpp results in a "Relocation type not implemented yet!"
fatal error.
FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp
are removed as they are never triggered in the BPF backend;
the BPF backend uses FK_SecRel_8 for LD_imm64 instruction operands.
Differential Revision: https://reviews.llvm.org/D102712
Lorenz Bauer reported an issue on the bpf mailing list ([1]): for a
FIELD_EXISTS relocation, if the object is an array subscript,
the patched immediate is the object's offset from the base address
instead of 1.
Currently, in the BPF AbstractMemberAccess pass, the final offset
from the base address is used as the patched immediate, except for
FIELD_EXISTS, whose immediate should be 1 unconditionally. In this
particular case, the last data structure access is not a field
(struct/union offset), so it did not hit the code that sets the
patched immediate to 1.
This patch fixes the issue by checking the relocation type:
if the type is FIELD_EXISTS, just set the immediate to 1.
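For illustration, a sketch of the reported case (struct and field names
are hypothetical; bpf_core_field_exists is the libbpf CO-RE macro):
  #include <bpf/bpf_core_read.h>
  struct s { int vals[4]; }; /* hypothetical struct */
  int probe(struct s *obj) {
    /* FIELD_EXISTS on an array subscript: the patched immediate must
     * be 1 (the field exists), not the offset of vals[2] in struct s */
    if (bpf_core_field_exists(obj->vals[2]))
      return 1;
    return 0;
  }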
Tested by modifying some bpf selftests; libbpf is okay with
such types carrying a FIELD_EXISTS relocation.
[1] https://lore.kernel.org/bpf/CACAyw99n-cMEtVst7aK-3BfHb99GMEChmRLCvhrjsRpHhPrtvA@mail.gmail.com/
Differential Revision: https://reviews.llvm.org/D102036
https://reviews.llvm.org/D101194 changed the default getMultiarchTriple
in the toolchain, so -march=bpf on AIX now gets the triple bpf-ibm-aix,
which is unexpected and causes test failures.
BPF on AIX is not supported (yet), so disable the codegen test on AIX
in the lit cfg.
Reviewed By: yonghong-song
Differential Revision: https://reviews.llvm.org/D101866
For an example like below,
extern int do_work(int);
long bpf_helper(void *callback_fn);
long prog() {
  return bpf_helper(&do_work);
}
The final generated code looks like:
r1 = do_work ll
call bpf_helper
exit
where we have debuginfo for do_work() extern function:
!17 = !DISubprogram(name: "do_work", ...)
This patch adds additional checking
when processing LD_imm64 operands for possible function pointers,
so BTF for the bpf function do_work() can be properly generated.
The original function name processReloc() is renamed to
processGlobalValue() to better reflect what the function does.
Differential Revision: https://reviews.llvm.org/D100568
Currently, any extern variable that doesn't have a
section attribute is put into a default ".extern"
btf DataSec. The initial design was to put every extern
variable in a DataSec so libbpf can use it.
But later on, libbpf actually required extern variables
to be put into special sections, e.g., ".kconfig", ".ksyms", etc.,
so they can be used properly based on the section name.
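For example, the libbpf convention looks like this (a sketch;
LINUX_KERNEL_VERSION is a name libbpf recognizes):
  /* an extern resolved by libbpf from the kernel config, placed in the
   * ".kconfig" section rather than a generic ".extern" DataSec */
  extern unsigned int LINUX_KERNEL_VERSION __attribute__((section(".kconfig")));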
Andrii mentioned that since ".extern" variables are
not actually used, it makes sense to remove the DataSec from
the compiler so libbpf does not need to deal with it,
esp. for static linking. The BTF for these extern variables
is still generated.
With this patch, I tested kernel selftests/bpf and all tests
passed. Indeed, removing the ".extern" DataSec seems to have no
impact.
Differential Revision: https://reviews.llvm.org/D100392
For a global weak symbol defined as below:
char g __attribute__((weak)) = 2;
LLVM generates an allocated global with WeakAnyLinkage,
for which BPF backend generates proper BTF info.
For the above example, if a modifier "const" is added like
const char g __attribute__((weak)) = 2;
LLVM generates an allocated global with WeakODRLinkage,
for which the BPF backend didn't generate any BTF, as it
didn't handle WeakODRLinkage.
This patch adds support for WeakODRLinkage so proper
BTF info can be generated for weak symbols defined with the
"const" modifier.
Differential Revision: https://reviews.llvm.org/D100362
This permits an extern function (BTF_KIND_FUNC) to be added
to a BTF_KIND_DATASEC if a section name is specified.
For example,
-bash-4.4$ cat t.c
void foo(int) __attribute__((section(".kernel.funcs")));
int test(void) {
  foo(5);
  return 0;
}
The extern function foo (BTF_KIND_FUNC) will be put into a
BTF_KIND_DATASEC named ".kernel.funcs".
This will help differentiate two kinds of external functions:
functions in the kernel and functions defined in other bpf programs.
Differential Revision: https://reviews.llvm.org/D93563
Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types. At the
moment there is no BTF_KIND for floats and doubles, so the release
version of LLVM ends up emitting type id 0 for them, which the
in-kernel verifier does not accept.
Introduce support for such types by representing them using
the new BTF_KIND_FLOAT.
Reviewed By: yonghong-song
Differential Revision: https://reviews.llvm.org/D83289
Lorenz Bauer from Cloudflare tried to use "const struct <name>"
as the type for __builtin_btf_type_id(*(const struct <name>)0, 1)
relocation and hit a llvm BPF fatal error.
https://lore.kernel.org/bpf/a3782f71-3f6b-1e75-17a9-1827822c2030@fb.com/
...
fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc
Currently, we require that the debuginfo type itself have a name.
In this case, the debuginfo type is "const", which points to
"struct <name>". The "const" type does not have a name, hence the
above fatal error is triggered.
Let us permit "const" and "volatile" type modifiers. We skip modifiers
in some other cases as well like structure member type tracing.
This can aviod the above fatal error.
Differential Revision: https://reviews.llvm.org/D97986
Andrei Matei reported an llvm11 core dump for his bpf program:
https://bugs.llvm.org/show_bug.cgi?id=48578
The core dump happens in the LiveVariables analysis phase.
#4 0x00007fce54356bb0 __restore_rt
#5 0x00007fce4d51785e llvm::LiveVariables::HandleVirtRegUse(unsigned int,
llvm::MachineBasicBlock*, llvm::MachineInstr&)
#6 0x00007fce4d519abe llvm::LiveVariables::runOnInstr(llvm::MachineInstr&,
llvm::SmallVectorImpl<unsigned int>&)
#7 0x00007fce4d519ec6 llvm::LiveVariables::runOnBlock(llvm::MachineBasicBlock*, unsigned int)
#8 0x00007fce4d51a4bf llvm::LiveVariables::runOnMachineFunction(llvm::MachineFunction&)
The bug can be reproduced with llvm12 and the latest trunk as well.
Further analysis shows a bug in the BPF peephole
TRUNC elimination optimization, which tries to remove
unnecessary TRUNC operations (a <<= 32; a >>= 32).
Specifically, the compiler did a wrong transformation for the
following pattern:
%1 = LDW ...
%2 = SLL_ri %1, 32
%3 = SRL_ri %2, 32
... %3 ...
%4 = SRA_ri %2, 32
... %4 ...
The current transformation did not check the number of uses of %2
and rewrote the code as
%1 = LDW ...
... %1 ...
%4 = SRA_ri %2, 32
... %4 ...
leaving pseudo register %2 used but not defined, which
caused the LiveVariables analysis core dump.
To fix the issue, when traversing back from SRL_ri to SLL_ri,
check to ensure the SLL_ri result has only one use. Otherwise, don't
do the transformation.
Differential Revision: https://reviews.llvm.org/D97792
Implement fetch_<op>/fetch_and_<op>/exchange/compare-and-exchange
instructions for BPF. Specifically, the following gcc intrinsics
are implemented:
__sync_fetch_and_add (32, 64)
__sync_fetch_and_sub (32, 64)
__sync_fetch_and_and (32, 64)
__sync_fetch_and_or (32, 64)
__sync_fetch_and_xor (32, 64)
__sync_lock_test_and_set (32, 64)
__sync_val_compare_and_swap (32, 64)
For __sync_fetch_and_sub, internally, it is implemented as
a negation followed by __sync_fetch_and_add.
For __sync_lock_test_and_set, despite its name, it actually
does an atomic exchange and returns the old content:
https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html
For intrinsics like __sync_{add,sub}_and_fetch and
__sync_bool_compare_and_swap, the compiler is able to generate
code using __sync_fetch_and_{add,sub} and __sync_val_compare_and_swap.
Similar to xadd, atomic xadd, xand, xor and xxor (atomic_<op>)
instructions are added for atomic operations which do not
have return values. LLVM checks the return value for
__sync_fetch_and_{add,and,or,xor}:
if the return value is used, atomic_fetch_<op> instructions
will be used; otherwise, atomic_<op> instructions will be used.
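A sketch of how the return-value check plays out at the source level:
  /* return value used -> atomic_fetch_add; ignored -> atomic_add */
  long used(long *p, long v) {
    return __sync_fetch_and_add(p, v); /* atomic_fetch_add */
  }
  void ignored(long *p, long v) {
    (void)__sync_fetch_and_add(p, v);  /* atomic_add */
  }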
All new instructions support only 64bit, or 32bit with alu32 mode.
The old xadd instruction still supports 32bit without alu32 mode.
For encoding, please take a look at the test atomics_2.ll.
Differential Revision: https://reviews.llvm.org/D72184
This adds a test for the bug
https://bugs.llvm.org/show_bug.cgi?id=47591
Previously, selection dag had a bug where it may incorrectly
assume no alias when crossing a lifetime boundary, and this
may generate incorrect code, as demonstrated in the above bug.
It looks like the bug is fixed by https://reviews.llvm.org/D91833.
Basically, when comparing two potential memory access dag nodes,
a store and a lifetime.start with the same frame index,
it may previously have been decided that they do not alias. With the
above fix, these two will be considered aliasing, which prevents
incorrect code scheduling.
Differential Revision: https://reviews.llvm.org/D92451
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.
This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.
Since callbacks may end up not adding passes, we need to check if the
pass managers are empty before adding them, so PassManager now has an
isEmpty() function. For example, polly adds callbacks but doesn't always
add passes in those callbacks, so this is necessary to keep
-debug-pass-manager tests' output from changing depending on if polly is
enabled or not.
Tests are a continuation of those added in
https://reviews.llvm.org/D89083.
Reviewed By: asbirlea, Meinersbur
Differential Revision: https://reviews.llvm.org/D89158
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.
This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.
Tests are a continuation of those added in
https://reviews.llvm.org/D89083.
In order to prevent TargetMachines from adding unnecessary optimization
passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be
changed to take an OptimizationLevel, but that will be done separately.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D89158
Or else on optnone functions we get the following during instruction selection:
fatal error: error in backend: Cannot select: intrinsic %llvm.preserve.struct.access.index
Currently the -O0 pipeline doesn't properly run passes registered via
TargetMachine::registerPassBuilderCallbacks(), so don't add that RUN
line yet. That will be fixed after this.
Reviewed By: yonghong-song
Differential Revision: https://reviews.llvm.org/D89083
Currently, the bpf backend instruction selection DAG2DAG phase has
an optimization to replace loads of constant struct members
or array elements with direct values. The reason is that these
locally defined struct or array variables may have their
initial values stored in a readonly section, and the early bpf
ecosystem was not able to handle such cases.
The bpf ecosystem can now handle not only readonly sections
but also global variables. A global variable can also have
initialized data and may or may not be constant,
i.e., global variable data can be put in the .data section or the
.rodata section. This exposed a bug in the DAG2DAG load optimization,
as it did not check whether the global variable is constant
or not.
This patch fixes the bug by checking whether the global variable,
representing the initial data, is constant, and skipping the
optimization if it is not.
Another bug is also fixed in this patch: check whether
the load is simple (not volatile/atomic) or not. If it is
not simple, we will not do the optimization. To summarize for
globals:
- struct t var = { ... } ; // no load optimization
- const struct t var = { ... }; // load optimization is possible
- volatile const struct t var = { ... }; // no load optimization
Differential Revision: https://reviews.llvm.org/D89021
Add an IR phase right before the main module optimization.
It modifies the IR to restrict certain later optimizations
in order to generate verifier friendly code:
> prevent certain instcombine optimizations, handling both
in-block/cross-block instcombines.
> avoid speculative code motion if the variable used in the
condition is also used in later blocks.
Internally, a bpf IR builtin
result = __builtin_bpf_passthrough(seq_num, result)
is used to enforce ordering. This builtin is only used
during target independent IR optimizations and is
removed at the beginning of target dependent IR
optimizations.
For example, removing the following workaround,
--- a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
+++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
@@ -47,7 +47,7 @@ int sysctl_tcp_mem(struct bpf_sysctl *ctx)
/* a workaround to prevent compiler from generating
* codes verifier cannot handle yet.
*/
- volatile int ret;
+ int ret;
this patch is able to generate code which passed the verifier.
To disable optimization, users need to use "opt" command like below:
clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes test.c
// disable icmp serialization
opt -O2 -bpf-disable-serialize-icmp test.ll | llvm-dis > t.ll
// disable avoid-speculation
opt -O2 -bpf-disable-avoid-speculation test.ll | llvm-dis > t.ll
llc t.ll
Differential Revision: https://reviews.llvm.org/D85570
This patch fixes two issues related to relocation globals.
In LLVM, if a global, e.g. with name "g", is created and
conflicts with another global of the same name, LLVM will
rename the new global, e.g., to "g.2". Since
relocation global names have special meaning, we do not want
llvm to change them, so internally we have logic to check
whether duplication happens. If it does, we just reuse
the previous global.
The first bug is related to non-btf-id relocations
(BPFAbstractMemberAccess.cpp). Commit 54d9f743c8
("BPF: move AbstractMemberAccess and PreserveDIType passes
to EP_EarlyAsPossible") changed the ModulePass to a FunctionPass,
i.e., handling one function at a time. But since just
one BPFAbstractMemberAccess object was still created, module
level de-duplication remained possible. Commit 40251fee00
("[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer
pipeline") then made a change to create one BPFAbstractMemberAccess
object per function, so module level de-duplication is not
possible any more without going through all module globals.
This patch simply changes the map which holds reloc globals
to a class static, so it is available to all
BPFAbstractMemberAccess objects across functions.
The second bug is related to btf-id relocations
(BPFPreserveDIType.cpp). Before commit 54d9f743c8, the pass
was a ModulePass, so a local count variable, incremented for
each instance, worked fine. But after commit 54d9f743c8,
the pass became a FunctionPass, and a local variable won't work
properly since different functions will start with the same
initial value. Fix the issue by changing the local count variable
to static, so it is truly unique across the whole module
compilation.
Differential Revision: https://reviews.llvm.org/D88942
This involves porting BPFAbstractMemberAccess and BPFPreserveDIType to
the NPM, then adding them to BPFTargetMachine::registerPassBuilderCallbacks
(the NPM equivalent of adjustPassManager()).
Reviewed By: yonghong-song, asbirlea
Differential Revision: https://reviews.llvm.org/D88855