Commit Graph

243 Commits

Author SHA1 Message Date
Yonghong Song 2e94d8e67a [BPF] handle unsigned icmp ops in BPFAdjustOpt pass
When investigating an issue with bcc tool inject.py, I found
a verifier failure with latest clang. The portion of code
can be illustrated as below:
  struct pid_struct {
    u64 curr_call;
    u64 conds_met;
    u64 stack[2];
  };
  struct pid_struct *bpf_map_lookup_elem();
  int foo() {
    struct pid_struct *p = bpf_map_lookup_elem();
    if (!p) return 0;
    p->curr_call--;
    if (p->conds_met < 1 || p->conds_met >= 3)
        return 0;
    if (p->stack[p->conds_met - 1] == p->curr_call)
        p->conds_met--;
    ...
  }

The verifier failure looks like:
  ...
  8: (79) r1 = *(u64 *)(r0 +0)
   R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R10=fp0 fp-8=mmmm????
  9: (07) r1 += -1
  10: (7b) *(u64 *)(r0 +0) = r1
   R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmm????
  11: (79) r2 = *(u64 *)(r0 +8)
   R0_w=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmm????
  12: (bf) r3 = r2
  13: (07) r3 += -3
  14: (b7) r4 = -2
  15: (2d) if r4 > r3 goto pc+13
   R0=map_value(id=0,off=0,ks=4,vs=32,imm=0) R1=inv(id=0) R2=inv(id=2)
   R3=inv(id=0,umin_value=18446744073709551614,var_off=(0xffffffff00000000; 0xffffffff))
   R4=inv-2 R10=fp0 fp-8=mmmm????
  16: (07) r2 += -1
  17: (bf) r3 = r2
  18: (67) r3 <<= 3
  19: (bf) r4 = r0
  20: (0f) r4 += r3
  math between map_value pointer and register with unbounded min value is not allowed

Here the compiler optimized "p->conds_met < 1 || p->conds_met >= 3" to
  r2 = p->conds_met
  r3 = r2
  r3 += -3
  r4 = -2
  if (r3 < r4) return 0
  r2 += -1
  r3 = r2
  ...
In the above, r3 is initially equal to r2, but is modified used by the comparison.
But later on r2 is used again. This caused verification failure.

BPF backend has a pass, AdjustOpt, to prevent such transformation, but only
focused on signed integers since typical bpf helper returns signed integers.
To fix this case, let us handle unsigned integers as well.

Differential Revision: https://reviews.llvm.org/D121937
2022-03-17 16:24:39 -07:00
Yonghong Song d2b4a675a8 [BPF] Fix a bug in BPFAdjustOpt pass for icmp transformation
When checking a bcc issue related to bcc tool inject.py,
I found a bug in BPFAdjustOpt pass for icmp transformation,
caused by typo's. For the following condition:
  Cond2Op != ICmpInst::ICMP_SLT && Cond1Op != ICmpInst::ICMP_SLE
it should be
  Cond2Op != ICmpInst::ICMP_SLT && Cond2Op != ICmpInst::ICMP_SLE

This patch fixed the problem and a test case is added.

Differential Revision: https://reviews.llvm.org/D121883
2022-03-17 09:25:18 -07:00
Yonghong Song 98e2274458 [BPF] fix a CO-RE bitfield relocation error with >8 record alignment
Jussi Maki reported a fatal error like below for a bitfield
CO-RE relocation:
  fatal error: error in backend: Unsupported field expression for
  llvm.bpf.preserve.field.info, requiring too big alignment
The failure is related to kernel struct thread_struct. The following
is a simplied example.

Suppose we have below structure:
  struct t2 {
    int a[8];
  } __attribute__((aligned(64))) __attribute__((preserve_access_index));
  struct t1 {
    int f1:1;
    int f2:2;
    struct t2 f3;
  } __attribute__((preserve_access_index));

Note that struct t2 has aligned 64, which is used sometimes in the
kernel to enforce cache line alignment.

The above struct will be encoded into BTF and the following is what
C code looks like and the struct will appear in the file like vmlinux.h.
  struct t2 {
        int a[8];
        long: 64;
        long: 64;
        long: 64;
        long: 64;
  } __attribute__((preserve_access_index));
  struct t1 {
        int f1: 1;
        int f2: 2;
        long: 61;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        long: 64;
        struct t2 f3;
  } __attribute__((preserve_access_index));

Note that after
  origin_source -> BTF -> new_source
transition, the new source has the same memory layout as the old one
but the alignment interpretation inside the compiler could be different.
The bpf program will use the later explicitly padded structure as in
vmlinux.h.

In the above case, the compiler internal ABI alignment for new struct t1
is 16 while it is 4 for old struct t1. I didn't do a thorough investigation
why the ABI alignment is 16 and I suspect it is related to anonymous padding
in the above.

Current BPF bitfield CO-RE handling requires alignment <= 8 so proper
bitfield operatin can be performed. Therefore, alignment 16 will cause
a compiler fatal error.

To fix the ABI alignment >=16, let us check whether the bitfield
can be held within a 8-byte-aligned range. If this is the case,
we can use alignment 8. Otherwise, a fatal error will be reported.

Differential Revision: https://reviews.llvm.org/D121821
2022-03-16 12:16:46 -07:00
Reid Kleckner f58fb8ae7f [BPF] Fix tests that fail if /tmp/t.c exists
IMO the BPF backend shouldn't read random source files referenced from
debug info. I filed llvm.org/pr54092 about this.
2022-02-25 14:55:53 -08:00
Yonghong Song 3671bdbcd2 [BPF] Fix a BTF type pruning bug
In BPF backend, BTF type generation may skip
some debuginfo types if they are the pointee
type of a struct member. For example,
  struct task_struct {
    ...
    struct mm_struct                *mm;
    ...
  };
BPF backend may generate a forward decl for
'struct mm_struct' instead of full type if
there are no other usage of 'struct mm_struct'.
The reason is to avoid bringing too much unneeded types
in BTF.

Alexei found a pruning bug where we may miss
some full type generation. The following is an illustrating
example:
   struct t1 { ... }
   struct t2 { struct t1 *p; };
   struct t2 g;
   void foo(struct t1 *arg) { ... }
In the above case, we will have partial debuginfo chain like below:
   struct t2 -> member p
                        \ -> ptr -> struct t1
                        /
     foo -> argument arg
During traversing
   struct t2 -> member p -> ptr -> struct t1
The corresponding BTF types are generated except 'struct t1' which
will be in FixUp stage. Later, when traversing
   foo -> argument arg -> ptr -> struct t1
The 'ptr' BTF type has been generated and currently implementation
ignores 'pointer' type hence 'struct t1' is not generated.

This patch fixed the issue not just for the above case, but for
general case with multiple derived types, e.g.,
   struct t2 -> member p
                        \ -> const -> ptr -> volatile -> struct t1
                        /
     foo -> argument arg

Differential Revision: https://reviews.llvm.org/D119986
2022-02-16 17:23:34 -08:00
Yonghong Song f419029fcd [BPF] Fix a bug in BTF_KIND_TYPE_TAG generation
Kumar Kartikeya Dwivedi reported a bug ([1]) where BTF_KIND_TYPE_TAG types
are not generated.

Currently, BPF backend only generates BTF types which are used by
the program, e.g., global variables, functions and some builtin functions.
For example, suppose we have
  struct task_struct {
    ...
    struct task_group               *sched_task_group;
    struct mm_struct                *mm;
    ...
    pid_t                           pid;
    pid_t                           tgid;
    ...
  }
If BPF program intends to access task_struct->pid and task_struct->tgid,
there really no need to generate BTF types for struct task_group
and mm_struct.

In BPF backend, during BTF generation, when generating BTF for struct
task_struct, if types for task_group and mm_struct have not been generated
yet, a Fixup structure will be created, which will be reexamined later
to instantiate into either a full type or a forward type.

In current implementation, if we have something like
  struct foo {
     struct bar  __tag1    *f;
  };
and when generating types for struct foo, struct bar type
has not been generated, the __tag1 will be lost during later
Fixup instantiation. This patch fixed this issue by properly
handling btf_type_tag's during Fixup instantiation stage.

  [1] https://lore.kernel.org/bpf/20220210232411.pmhzj7v5uptqby7r@apollo.legion/

Differential Revision: https://reviews.llvm.org/D119799
2022-02-14 19:43:57 -08:00
Nikita Popov f430c1eb64 [Tests] Add elementtype attribute to indirect inline asm operands (NFC)
This updates LLVM tests for D116531 by adding elementtype attributes
to operands that correspond to indirect asm constraints.
2022-01-06 14:23:51 +01:00
Yonghong Song 8fb3f84484 BPF: Workaround InstCombine trunc+icmp => mask+icmp Optimization
Patch [1] added further InstCombine trunc+icmp => mask+icmp
optimization and this caused a couple of bpf selftest failure.
Previous llvm BPF backend patch [2] introduced llvm.bpf.compare
builtin to handle such situations.

This patch further added support ">" and ">=" icmp opcodes.
Tested with bpf selftests and all tests are passed including two
previously failed ones.

Note Patch [1] also added optimization if the to-be-compared
constant is negative-power-of-2 (-C) or not-of-power-of-2 (~C).
This patch didn't implement these two cases as typical bpf
program compares a scalar to a positive length or boundary value,
and this scalar later is used as a index into an array buffer
or packet buffer.

  [1] https://reviews.llvm.org/D112634
  [2] https://reviews.llvm.org/D112938

Differential Revision: https://reviews.llvm.org/D114215
2021-11-18 20:25:28 -08:00
Yonghong Song 8d499bd5bc BPF: change btf_type_tag BTF output format
For the declaration like below:
  int __tag1 * __tag1 __tag2 *g
Commit 41860e602a ("BPF: Support btf_type_tag attribute")
implemented the following encoding:
  VAR(g) -> __tag1 --> __tag2 -> pointer -> __tag1 -> pointer -> int

Some further experiments with linux btf_type_tag support, esp.
with generating attributes in vmlinux.h, and also some internal
discussion showed the following format is more desirable:
  VAR(g) -> pointer -> __tag2 -> __tag1 -> pointer -> __tag1 -> int

The format makes it similar to other modifier like 'const', e.g.,
  const int *g
which has encoding VAR(g) -> PTR -> CONST -> int

Differential Revision: https://reviews.llvm.org/D113496
2021-11-09 11:34:25 -08:00
Nikita Popov 1376301c87 [InstCombine] Canonicalize range test idiom
InstCombine converts range tests of the form (X > C1 && X < C2) or
(X < C1 || X > C2) into checks of the form (X + C3 < C4) or
(X + C3 > C4). It is possible to express all range tests in either
of these forms (with different choices of constants), but currently
neither of them is considered canonical. We may have equivalent
range tests using either ult or ugt.

This proposes to canonicalize all range tests to use ult. An
alternative would be to canonicalize to either ult or ugt depending
on the specific constants involved -- e.g. in practice we currently
generate ult for && style ranges and ugt for || style ranges when
going through the insertRangeTest() helper. In fact, the "clamp like"
fold was relying on this, which is why I had to tweak it to not
assume whether inversion is needed based on just the predicate.

Proof: https://alive2.llvm.org/ce/z/_SP_rQ

Differential Revision: https://reviews.llvm.org/D113366
2021-11-08 21:15:46 +01:00
Yonghong Song 41860e602a BPF: Support btf_type_tag attribute
A new kind BTF_KIND_TYPE_TAG is defined. The tags associated
with a pointer type are emitted in their IR order as modifiers.
For example, for the following declaration:
  int __tag1 * __tag1 __tag2 *g;
The BTF type chain will look like
  VAR(g) -> __tag1 --> __tag2 -> pointer -> __tag1 -> pointer -> int
In the above "->" means BTF CommonType.Type which indicates
the point-to type.

Differential Revision: https://reviews.llvm.org/D113222
2021-11-04 17:01:36 -07:00
Yonghong Song f63405f6e3 BPF: Workaround an InstCombine ICmp transformation with llvm.bpf.compare builtin
Commit acabad9ff6 ("[InstCombine] try to canonicalize icmp with
trunc op into mask and cmp") added a transformation to
convert "(conv)a < power_2_const" to "a & <const>" in certain
cases and bpf kernel verifier has to handle the resulted code
conservatively and this may reject otherwise legitimate program.

This commit tries to prevent such a transformation. A bpf backend
builtin llvm.bpf.compare is added. The ICMP insn, which is subject to
above InstCombine transformation, is converted to the builtin
function. The builtin function is later lowered to original ICMP insn,
certainly after InstCombine pass.

With this change, all affected bpf strobemeta* selftests are
passed now.

Differential Revision: https://reviews.llvm.org/D112938
2021-11-01 14:46:20 -07:00
Yonghong Song 0472e83ffc BPF: emit BTF_KIND_DECL_TAG for typedef types
If a typedef type has __attribute__((btf_decl_tag("str"))) with
bpf target, emit BTF_KIND_DECL_TAG for that type in the BTF.

Differential Revision: https://reviews.llvm.org/D112259
2021-10-21 12:09:42 -07:00
Yonghong Song cd40b5a712 BPF: set .BTF and .BTF.ext section alignment to 4
Currently, .BTF and .BTF.ext has default alignment of 1.
For example,
  $ cat t.c
    int foo() { return 0; }
  $ clang -target bpf -O2 -c -g t.c
  $ llvm-readelf -S t.o
    ...
    Section Headers:
    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    ...
    [ 7] .BTF              PROGBITS        0000000000000000 000167 00008b 00      0   0  1
    [ 8] .BTF.ext          PROGBITS        0000000000000000 0001f2 000050 00      0   0  1

But to have no misaligned data access, .BTF and .BTF.ext
actually requires alignment of 4. Misalignment is not an issue
for architecture like x64/arm64 as it can handle it well. But
some architectures like mips may incur a trap if .BTF/.BTF.ext
is not properly aligned.

This patch explicitly forced .BTF and .BTF.ext alignment to be 4.
For the above example, we will have
    [ 7] .BTF              PROGBITS        0000000000000000 000168 00008b 00      0   0  4
    [ 8] .BTF.ext          PROGBITS        0000000000000000 0001f4 000050 00      0   0  4

Differential Revision: https://reviews.llvm.org/D112106
2021-10-19 16:26:01 -07:00
Yonghong Song 009f3a89d8 BPF: remove intrindics @llvm.stacksave() and @llvm.stackrestore()
Paul Chaignon reported a bpf verifier failure ([1]) due to using
non-ABI register R11. For the test case, llvm11 is okay while
llvm12 and later generates verifier unfriendly code.

The failure is related to variable length array size.
The following mimics the variable length array definition
in the test case:

struct t { char a[20]; };
void foo(void *);
int test() {
   const int a = 8;
   char tmp[AA + sizeof(struct t) + a];
   foo(tmp);
   ...
}

Paul helped bisect that the following llvm commit is
responsible:

552c6c2328 ("PR44406: Follow behavior of array bound constant
              folding in more recent versions of GCC.")

Basically, before the above commit, clang frontend did constant
folding for array size "AA + sizeof(struct t) + a" to be 68,
so used alloca for stack allocation. After the above commit,
clang frontend didn't do constant folding for array size
any more, which results in a VLA and llvm.stacksave/llvm.stackrestore
is generated.

BPF architecture API does not support stack pointer (sp) register.
The LLVM internally used R11 to indicate sp register but it should
not be in the final code. Otherwise, kernel verifier will reject it.

The early patch ([2]) tried to fix the issue in clang frontend.
But the upstream discussion considered frontend fix is really a
hack and the backend should properly undo llvm.stacksave/llvm.stackrestore.
This patch implemented a bpf IR phase to remove these intrinsics
unconditionally. If eventually the alloca can be resolved with
constant size, r11 will not be generated. If alloca cannot be
resolved with constant size, SelectionDag will complain, the same
as without this patch.

 [1] https://lore.kernel.org/bpf/20210809151202.GB1012999@Mem/
 [2] https://reviews.llvm.org/D107882

Differential Revision: https://reviews.llvm.org/D111897
2021-10-18 09:51:19 -07:00
Yonghong Song 1321e47298 BPF: rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Per discussion in https://reviews.llvm.org/D111199,
the existing btf_tag attribute will be renamed to
btf_decl_tag. This patch updated BTF backend to
use btf_decl_tag attribute name and also
renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG.

Differential Revision: https://reviews.llvm.org/D111592
2021-10-11 21:33:39 -07:00
Yonghong Song ea72b0319d BPF: make 32bit register spill with 64bit alignment
In llvm, for non-alu32 mode, the stack alignment is 64bit so only one
64bit spill per 64bit slot. For alu32 mode, the stack alignment
is 32bit, so it is possible to have two 32bit spills per
64bit slot.

Currently, bpf kernel verifier does not preserve register states
for 32bit spills. That is, one 32bit register may hold a constant
value or a bounded range before spill. After reload from the
stack, the information is lost and sometimes this may cause
verifier failure. For 64bit register spill, the verifier
indeed tries to preserve the register state for reloading.

The current verifier can be modestly changed to handle one
32bit spill per 64bit stack slot with state-preserving reload.
Handling two 32bit spills per 64bit stack slot will require
substantial changes.

This patch changes stack alignment for alu32 to be 64bit.
This way, for any 64bit slot in alu32 mode, only one
32bit or 64bit register values can be saved. Together
with previous-mentioned verifier enhancement, 32bit
spill can be handled with state preserving.

Note that llvm stack slot coallescing
seems only doing adjacent packing which may leave some holes
in the stack. For example,
   stack slot 8   <== 8 bytes
   stack slot 4   <== 8 bytes with 4 byte hole
   stack slot 8   <== 8 bytes
   stack slot 4   <== 4 bytes

Differential Revision: https://reviews.llvm.org/D109073
2021-09-20 21:00:25 -07:00
Nikita Popov 90ec6dff86 [OpaquePtr] Forbid mixing typed and opaque pointers
Currently, opaque pointers are supported in two forms: The
-force-opaque-pointers mode, where all pointers are opaque and
typed pointers do not exist. And as a simple ptr type that can
coexist with typed pointers.

This patch removes support for the mixed mode. You either get
typed pointers, or you get opaque pointers, but not both. In the
(current) default mode, using ptr is forbidden. In -opaque-pointers
mode, all pointers are opaque.

The motivation here is that the mixed mode introduces additional
issues that don't exist in fully opaque mode. D105155 is an example
of a design problem. Looking at D109259, it would probably need
additional work to support mixed mode (e.g. to generate GEPs for
typed base but opaque result). Mixed mode will also end up
inserting many casts between i8* and ptr, which would require
significant additional work to consistently avoid.

I don't think the mixed mode is particularly valuable, as it
doesn't align with our end goal. The only thing I've found it to
be moderately useful for is adding some opaque pointer tests in
between typed pointer tests, but I think we can live without that.

Differential Revision: https://reviews.llvm.org/D109290
2021-09-10 15:18:23 +02:00
Yonghong Song e52617c31d BPF: change BTF_KIND_TAG format
Previously we have the following binary representation:
    struct bpf_type { name, info, type }
    struct btf_tag { __u32 component_idx; }
If the tag points to a struct/union/var/func type, we will have
   kflag = 1, component_idx = 0
if the tag points to struct/union member or func argument, we will have
   kflag = 0, component_idx = 0, ..., vlen - 1

The above rather makes interface complex to have both kflag and
component needed to determine its legality and index.

This patch simplifies the interface by removing kflag involvement.
   component_idx = (u32)-1 : tag pointing to a type
   component_idx = 0 ... vlen - 1 : tag pointing to a member or argument
and kflag is always 0 and there is no need to check.

Differential Revision: https://reviews.llvm.org/D109560
2021-09-09 19:03:57 -07:00
Yonghong Song 4948927058 [BPF] support btf_tag attribute in .BTF section
A new kind BTF_KIND_TAG is added to .BTF to encode
btf_tag attributes. The format looks like
   CommonType.name : attribute string
   CommonType.type : attached to a struct/union/func/var.
   CommonType.info : encoding BTF_KIND_TAG
                     kflag == 1 to indicate the attribute is
                     for CommonType.type, or kflag == 0
                     for struct/union member or func argument.
   one uint32_t    : to encode which member/argument starting from 0.

If one particular type or member/argument has more than one attribute,
multiple BTF_KIND_TAG will be generated.

Differential Revision: https://reviews.llvm.org/D106622
2021-08-28 21:02:27 -07:00
Yonghong Song e52946b9ab BPF: avoid NE/EQ loop exit condition
Kuniyuki Iwashima reported in [1] that llvm compiler may
convert a loop exit condition with "i < bound" to "i != bound", where
"i" is the loop index variable and "bound" is the upper bound.
In case that "bound" is not a constant, verifier will always have "i != bound"
true, which will cause verifier failure since to verifier this is
an infinite loop.

The fix is to avoid transforming "i < bound" to "i != bound".
In llvm, the transformation is done by IndVarSimplify pass.
The compiler checks loop condition cost (i = i + 1) and if the
cost is lower, it may transform "i < bound" to "i != bound".
This patch implemented getArithmeticInstrCost() in BPF TargetTransformInfo
class to return a higher cost for such an operation, which
will prevent the transformation for the test case
added in this patch.

 [1] https://lore.kernel.org/netdev/1994df05-8f01-371f-3c3b-d33d7836878c@fb.com/

Differential Revision: https://reviews.llvm.org/D107483
2021-08-04 16:54:16 -07:00
Nikita Popov be5af50e7d [BPF] Use elementtype attribute for preserve.array/struct.index intrinsics
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.

This patch:

 * Adds a verifier check that the attribute is required.
 * Adds it in the IRBuilder methods for these intrinsics.
 * Autoupgrades old bitcode without the attribute.
 * Updates the lowering code to use the attribute rather than
   the pointer element type.
 * Updates lots of tests to specify the attribute.
 * Adds -force-opaque-pointers to the intrinsic-array.ll test
   to demonstrate they work now.

https://reviews.llvm.org/D106184
2021-07-17 11:09:18 +02:00
Simon Pilgrim d561b6fbdb [InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183) (REAPPLIED)
As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep':

define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
  %gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3
  %sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1
  ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) {
  %sel = select i1 %a2, i64 %a1, i64 %a3
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  ret <4 x i32>* %gep
}

This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address:

define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1
  %sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep
  ret <4 x i32>* %sel
}
-->
define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) {
  %sel = select i1 %a2, i64 0, i64 %a1
  %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel
  ret <4 x i32>* %gep
}

Reapplied with a fix for the bpf "-bpf-disable-avoid-speculation" tests

Differential Revision: https://reviews.llvm.org/D105901
2021-07-14 12:21:01 +01:00
Yonghong Song 6a2ea84600 BPF: Add more relocation kinds
Currently, BPF only contains three relocations:
  R_BPF_NONE   for no relocation
  R_BPF_64_64  for LD_imm64 and normal 64-bit data relocation
  R_BPF_64_32  for call insn and normal 32-bit data relocation

Also .BTF and .BTF.ext sections contain symbols in allocated
program and data sections. These two sections reserved 32bit
space to hold the offset relative to the symbol's section.
When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld
may attempt to resolve relocations for .BTF and .BTF.ext,
which we want to prevent. So we used R_BPF_NONE for such relocations.

This all works fine until when we try to do linking of
multiple objects.
  . R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data
    is different, so lld target->relocate() needs more context
    to do a correct job.
  . The same for R_BPF_64_32. More context is needed for
    lld target->relocate() to differentiate call insn vs.
    normal 32-bit data relocation.
  . Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE,
    they will not be relocated properly when multiple .BTF/.BTF.ext
    sections are merged by lld.

This patch intends to address this issue by adding additional
relocation kinds:
  R_BPF_64_ABS64     for normal 64-bit data relocation
  R_BPF_64_ABS32     for normal 32-bit data relocation
  R_BPF_64_NODYLD32  for .BTF and .BTF.ext style relocations.
The old R_BPF_64_{64,32} semantics:
  R_BPF_64_64        for LD_imm64 relocation
  R_BPF_64_32        for call insn relocation

The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values
is maintained. They are the most common use cases for
bpf programs and we want to maintain backward compatibility
as much as possible.

ExecutionEngine RuntimeDyld BPF relocations are adjusted as well.
R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and
other relocations will be ignored.
Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in
RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!"
fatal error.

FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp
are removed as they are not triggered in BPF backend.
BPF backend used FK_SecRel_8 for LD_imm64 instruction operands.

Differential Revision: https://reviews.llvm.org/D102712
2021-05-25 08:19:13 -07:00
serge-sans-paille 4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Alessandro Decina 833e9b2ea7 [BPF] add support for 32 bit registers in inline asm
Add "w" constraint type which allows selecting 32 bit registers.
32 bit registers were added in https://reviews.llvm.org/rGca31c3bb3ff149850b664838fbbc7d40ce571879.

Differential Revision: https://reviews.llvm.org/D102118
2021-05-16 11:01:47 -07:00
Yonghong Song 605c811d2b BPF: fix FIELD_EXISTS relocation with array subscripts
Lorenz Bauer reported an issue in bpf mailing list ([1]) where
for FIELD_EXISTS relocation, if the object is an array subscript,
the patched immediate is the object offset from the base address,
instead of 1.

Currently in BPF AbstractMemberAccess pass, the final offset
from the base address is the patched offset except FIELD_EXISTS
which is 1 unconditionally. In this particular case, the last
data structure access is not a field (struct/union offset)
so it didn't hit the place to set patched immediate to be 1.

This patch fixed the issue by checking the relocation type.
If the type is FIELD_EXISTS, just set to 1.
Tested by modifying some bpf selftests, libbpf is okay with
such types with FIELD_EXISTS relocation.

 [1] https://lore.kernel.org/bpf/CACAyw99n-cMEtVst7aK-3BfHb99GMEChmRLCvhrjsRpHhPrtvA@mail.gmail.com/

Differential Revision: https://reviews.llvm.org/D102036
2021-05-06 22:37:02 -07:00
Jinsong Ji 6bdfcb165e [BPF][Test] Disable codegen test on AIX
https://reviews.llvm.org/D101194 changed the default getMultiarchTriple in toolchain.
So -march=bpf on AIX will get triple of bpf-ibm-aix now,
this is unexpected and causing test failures.

BPF on AIX is not supported (yet), disable the codegen test on AIX in lit cfg.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D101866
2021-05-06 02:38:46 +00:00
Yonghong Song bba7338b8f BPF: generate BTF info for LD_imm64 loaded function pointer
For an example like below,
    extern int do_work(int);
    long bpf_helper(void *callback_fn);
    long prog() {
        return bpf_helper(&do_work);
    }

The final generated codes look like:
    r1 = do_work ll
    call bpf_helper
    exit
where we have debuginfo for do_work() extern function:
    !17 = !DISubprogram(name: "do_work", ...)

This patch implemented to add additional checking
in processing LD_imm64 operands for possible function pointers
so BTF for bpf function do_work() can be properly generated.
The original llvm function name processReloc() is renamed to
processGlobalValue() to better reflect what the function is doing.

Differential Revision: https://reviews.llvm.org/D100568
2021-04-26 17:23:36 -07:00
Yonghong Song a285bdb56f BPF: remove default .extern data section
Currently, for any extern variable, if it doesn't have
section attribution, it will be put into a default ".extern"
btf DataSec. The initial design is to put every extern
variable in a DataSec so libbpf can use it.

But later on, libbpf actually requires extern variables
to put into special sections, e.g., ".kconfig", ".ksyms", etc.
so they can be used properly based on section name.

Andrii mentioned since ".extern" variables are
not actually used, it makes sense to remove it from
the compiler so libbpf does not need to deal with it,
esp. for static linking. The BTF for these extern variables
is still generated.

With this patch, I tested kernel selftests/bpf and all tests
passed. Indeed, removing ".extern" DataSec seems having no
impact.

Differential Revision: https://reviews.llvm.org/D100392
2021-04-13 11:35:52 -07:00
Yonghong Song 968292cb93 BPF: generate proper BTF for globals with WeakODRLinkage
For a global weak symbol defined as below:
  char g __attribute__((weak)) = 2;
LLVM generates an allocated global with WeakAnyLinkage,
for which BPF backend generates proper BTF info.

For the above example, if a modifier "const" is added like
  const char g __attribute__((weak)) = 2;
LLVM generates an allocated global with WeakODRLinkage,
for which BPF backend didn't generate any BTF as it
didn't handle WeakODRLinkage.

This patch addes support for WeakODRLinkage and proper
BTF info can be generated for weak symbol defined with
"const" modifier.

Differential Revision: https://reviews.llvm.org/D100362
2021-04-13 08:54:05 -07:00
Yonghong Song 886f9ff531 BPF: add extern func to data sections if specified
This permits extern function (BTF_KIND_FUNC) be added
to BTF_KIND_DATASEC if a section name is specified.
For example,

-bash-4.4$ cat t.c
void foo(int) __attribute__((section(".kernel.funcs")));
int test(void) {
  foo(5);
  return 0;
}

The extern function foo (BTF_KIND_FUNC) will be put into
BTF_KIND_DATASEC with name ".kernel.funcs".

This will help to differentiate two kinds of external functions,
functions in kernel and functions defined in other bpf programs.

Differential Revision: https://reviews.llvm.org/D93563
2021-03-25 16:03:29 -07:00
Ilya Leoshkevich a7137b238a [BPF] Add support for floats and doubles
Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types. At the
moment there is no BTF_KIND for floats and doubles, so the release
version of LLVM ends up emitting type id 0 for them, which the
in-kernel verifier does not accept.

Introduce support for such types to libbpf by representing them using
the new BTF_KIND_FLOAT.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D83289
2021-03-05 15:10:11 +01:00
Yonghong Song 9c0274cdea BPF: permit type modifiers for __builtin_btf_type_id() relocation
Lorenz Bauer from Cloudflare tried to use "const struct <name>"
as the type for __builtin_btf_type_id(*(const struct <name>)0, 1)
relocation and hit a llvm BPF fatal error.
   https://lore.kernel.org/bpf/a3782f71-3f6b-1e75-17a9-1827822c2030@fb.com/

   ...
   fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc

Currently, we require the debuginfo type itself must have a name.
In this case, the debuginfo type is "const" which points to "struct <name>".
The "const" type does not have a name, hence the above fatal error
will be triggered.

Let us permit "const" and "volatile" type modifiers. We skip modifiers
in some other cases as well like structure member type tracing.
This can aviod the above fatal error.

Differential Revision: https://reviews.llvm.org/D97986
2021-03-04 16:27:23 -08:00
Yonghong Song 51cdb780db BPF: Fix a bug in peephole TRUNC elimination optimization
Andrei Matei reported a llvm11 core dump for his bpf program
   https://bugs.llvm.org/show_bug.cgi?id=48578
The core dump happens in LiveVariables analysis phase.
  #4 0x00007fce54356bb0 __restore_rt
  #5 0x00007fce4d51785e llvm::LiveVariables::HandleVirtRegUse(unsigned int,
      llvm::MachineBasicBlock*, llvm::MachineInstr&)
  #6 0x00007fce4d519abe llvm::LiveVariables::runOnInstr(llvm::MachineInstr&,
      llvm::SmallVectorImpl<unsigned int>&)
  #7 0x00007fce4d519ec6 llvm::LiveVariables::runOnBlock(llvm::MachineBasicBlock*, unsigned int)
  #8 0x00007fce4d51a4bf llvm::LiveVariables::runOnMachineFunction(llvm::MachineFunction&)
The bug can be reproduced with llvm12 and latest trunk as well.

Futher analysis shows that there is a bug in BPF peephole
TRUNC elimination optimization, which tries to remove
unnecessary TRUNC operations (a <<= 32; a >>= 32).
Specifically, the compiler did wrong transformation for the
following patterns:
   %1 = LDW ...
   %2 = SLL_ri %1, 32
   %3 = SRL_ri %2, 32
   ... %3 ...
   %4 = SRA_ri %2, 32
   ... %4 ...

The current transformation did not check how many uses of %2
and did transformation like
   %1 = LDW ...
   ... %1 ...
   %4 = SRL_ri %2, 32
   ... %4 ...
and pseudo register %2 is used by not defined and
caused LiveVariables analysis core dump.

To fix the issue, when traversing back from SRL_ri to SLL_ri,
check to ensure SLL_ri has only one use. Otherwise, don't
do transformation.

Differential Revision: https://reviews.llvm.org/D97792
2021-03-02 13:03:42 -08:00
Yonghong Song 286daafd65 [BPF] support atomic instructions
Implement fetch_<op>/fetch_and_<op>/exchange/compare-and-exchange
instructions for BPF.  Specially, the following gcc intrinsics
are implemented.
  __sync_fetch_and_add (32, 64)
  __sync_fetch_and_sub (32, 64)
  __sync_fetch_and_and (32, 64)
  __sync_fetch_and_or  (32, 64)
  __sync_fetch_and_xor (32, 64)
  __sync_lock_test_and_set (32, 64)
  __sync_val_compare_and_swap (32, 64)

For __sync_fetch_and_sub, internally, it is implemented as
a negation followed by __sync_fetch_and_add.
For __sync_lock_test_and_set, despite its name, it actually
does an atomic exchange and return the old content.
  https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

For intrinsics like __sync_{add,sub}_and_fetch and
__sync_bool_compare_and_swap, the compiler is able to generate
codes using __sync_fetch_and_{add,sub} and __sync_val_compare_and_swap.

Similar to xadd, atomic xadd, xor and xxor (atomic_<op>)
instructions are added for atomic operations which do not
have return values. LLVM will check the return value for
__sync_fetch_and_{add,and,or,xor}.
If the return value is used, instructions atomic_fetch_<op>
will be used. Otherwise, atomic_<op> instructions will be used.

All new instructions only support 64bit and 32bit with alu32 mode.
old xadd instruction still supports 32bit without alu32 mode.

For encoding, please take a look at test atomics_2.ll.

Differential Revision: https://reviews.llvm.org/D72184
2020-12-03 07:38:00 -08:00
Yonghong Song 61a06c071d BPF: add a test for selectiondag alias analysis w.r.t. lifetime
This adds a test for the bug
  https://bugs.llvm.org/show_bug.cgi?id=47591

Previously, selection dag has a bug which may incorrectly
assume no alias when crossing a lifetime boundary and this
may generate incorrect code as demonstrated in the above bug.

It looks the bug is fixed by https://reviews.llvm.org/D91833.
Basically, when comparing two potential memory access dag nodes,
  a store and a lifetime.start,
with the same frame index.
Previously, it may be decided no alias. With the above fix,
these two will be considered aliasing which will prevent
incorrect code scheduling.

Differential Revision: https://reviews.llvm.org/D92451
2020-12-02 22:27:17 -08:00
Arthur Eubanks 92a67e131f [BPF][NewPM] Port bpf-adjust-opt to NPM and add it to pipeline
Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D91990
2020-11-26 10:11:26 -08:00
Matt Arsenault 06c192d454 OpaquePtr: Bulk update tests to use typed byval
Upgrade of the IR text tests should be the only thing blocking making
typed byval mandatory. Partially done through regex and partially
manual.
2020-11-20 14:00:46 -05:00
Yonghong Song 4369223ea7 BPF: make __builtin_btf_type_id() return 64bit int
Linux kernel recently added support for kernel modules
  https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org/

In such cases, a type id in the kernel needs to be presented
as (btf id for modules, btf type id for this module).
Change __builtin_btf_type_id() to return 64bit value
so libbpf can do the above encoding.

Differential Revision: https://reviews.llvm.org/D91489
2020-11-16 07:08:41 -08:00
Arthur Eubanks b6ccff3d5f [NewPM] Provide method to run all pipeline callbacks, used for -O0
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.

This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.

Since callbacks may end up not adding passes, we need to check if the
pass managers are empty before adding them, so PassManager now has an
isEmpty() function. For example, polly adds callbacks but doesn't always
add passes in those callbacks, so this is necessary to keep
-debug-pass-manager tests' output from changing depending on if polly is
enabled or not.

Tests are a continuation of those added in
https://reviews.llvm.org/D89083.

Reviewed By: asbirlea, Meinersbur

Differential Revision: https://reviews.llvm.org/D89158
2020-11-11 15:10:27 -08:00
Simon Pilgrim 2224c2f8bc [BPF] intrinsic-array-2.ll - remove unused check prefixes
Just use default CHECK
2020-11-11 18:38:21 +00:00
Arthur Eubanks 226e179f74 Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0"
This reverts commit ae38540042.
As well as some follow-up test fixes.

The original change causes new-pass-manager.ll to fail when polly is enabled.
2020-11-08 00:32:35 -08:00
Arthur Eubanks ae38540042 [NewPM] Provide method to run all pipeline callbacks, used for -O0
Some targets may add required passes via
    TargetMachine::registerPassBuilderCallbacks(). We need to run those even
    under -O0. As an example, BPFTargetMachine adds
    BPFAbstractMemberAccessPass, a required pass.

    This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
    usage of the NPM) by allowing us to share added passes like coroutines
    and sanitizers between -O0 and other optimization levels.

    Tests are a continuation of those added in
    https://reviews.llvm.org/D89083.

    In order to prevent TargetMachines from adding unnecessary optimization
    passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be
    changed to take an OptimizationLevel, but that will be done separately.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89158
2020-11-04 22:27:16 -08:00
Arthur Eubanks 2218e6d0a8 [BPF] Make BPFAbstractMemberAccessPass required
Or else on optnone functions we get the following during instruction selection:
  fatal error: error in backend: Cannot select: intrinsic %llvm.preserve.struct.access.index

Currently the -O0 pipeline doesn't properly run passes registered via
TargetMachine::registerPassBuilderCallbacks(), so don't add that RUN
line yet. That will be fixed after this.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D89083
2020-10-09 11:26:37 -07:00
Yonghong Song 3161172168 BPF: fix incorrect DAG2DAG load optimization
Currently, bpf backend Instruction section DAG2DAG phase has
an optimization to replace loading constant struct memeber
or array element with direct values. The reason is that these
locally defined struct or array variables may have their
initial values stored in a readonly section and early bpf
ecosystem is not able to handle such cases.

Bpf ecosystem now can not only handle readonly sections,
but also global variables. global variable can also have
initialized data and global variable may or may not be constant,
i.e., global variable data can be put in .data section or .rodata
section. This exposed a bug in DAG2DAG Load optimization
as it did not check whether the global variable is constant
or not.

This patch fixed the bug by checking whether global variable,
representing the initial data, is constant or not and will not
do optimization if it is not a constant.

Another bug is also fixed in this patch to check whether
the load is simple (not volatile/atomic) or not. If it is
not simple, we will not do optimization. To summary for
globals:
   - struct t var = { ... } ;  // no load optimization
   - const struct t var = { ... }; // load optimization is possible
   - volatile const struct t var = { ... }; // no load optimization

Differential Revision: https://reviews.llvm.org/D89021
2020-10-07 19:08:40 -07:00
Yonghong Song ddf1864ace BPF: add AdjustOpt IR pass to generate verifier friendly codes
Add an IR phase right before main module optimization.
This is to modify IR to restrict certain downward optimizations
in order to generate verifier friendly code.
  > prevent certain instcombine optimizations, handling both
    in-block/cross-block instcombines.
  > avoid speculative code motion if the variable used in
    condition is also used in the later blocks.

Internally, a bpf IR builtin
  result = __builtin_bpf_passthrough(seq_num, result)
is used to enforce ordering. This builtin is only used
during target independent IR optimizations and it will
be removed at the beginning of target dependent IR
optimizations.

For example, removing the following workaround,
  --- a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
  +++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
  @@ -47,7 +47,7 @@ int sysctl_tcp_mem(struct bpf_sysctl *ctx)
          /* a workaround to prevent compiler from generating
           * codes verifier cannot handle yet.
           */
  -       volatile int ret;
  +       int ret;
this patch is able to generate code which passed the verifier.

To disable optimization, users need to use "opt" command like below:
  clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes test.c
  // disable icmp serialization
  opt -O2 -bpf-disable-serialize-icmp test.ll | llvm-dis > t.ll
  // disable avoid-speculation
  opt -O2 -bpf-disable-avoid-speculation test.ll | llvm-dis > t.ll
  llc t.ll

Differential Revision: https://reviews.llvm.org/D85570
2020-10-07 08:49:10 -07:00
Yonghong Song edd71db38b BPF: avoid duplicated globals for CORE relocations
This patch fixed two issues related with relocation globals.
In LLVM, if a global, e.g. with name "g", is created and
conflict with another global with the same name, LLVM will
rename the global, e.g., with a new name "g.2". Since
relocation global name has special meaning, we do not want
llvm to change it, so internally we have logic to check
whether duplication happens or not. If happens, just reuse
the previous global.

The first bug is related to non-btf-id relocation
(BPFAbstractMemberAccess.cpp). Commit 54d9f743c8
("BPF: move AbstractMemberAccess and PreserveDIType passes
to EP_EarlyAsPossible") changed ModulePass to FunctionPass,
i.e., handling each function at a time. But still just
one BPFAbstractMemberAccess object is created so module
level de-duplication still possible. Commit 40251fee00
("[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer
pipeline") made a change to create a BPFAbstractMemberAccess
object per function so module level de-duplication is not
possible any more without going through all module globals.
This patch simply changed the map which holds reloc globals
as class static, so it will be available to all
BPFAbstractMemberAccess objects for different functions.

The second bug is related to btf-id relocation
(BPFPreserveDIType.cpp). Before Commit 54d9f743c8, the pass
is a ModulePass, so we have a local variable, incremented for
each instance, and works fine. But after Commit 54d9f743c8,
the pass becomes a FunctionPass. Local variable won't work
properly since different functions will start with the same
initial value. Fix the issue by change the local count variable
as static, so it will be truely unique across the whole module
compilation.

Differential Revision: https://reviews.llvm.org/D88942
2020-10-06 22:37:49 -07:00
Arthur Eubanks 40251fee00 [BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer pipeline
This involves porting BPFAbstractMemberAccess and BPFPreserveDIType to
NPM, then adding them BPFTargetMachine::registerPassBuilderCallbacks
(the NPM equivalent of adjustPassManager()).

Reviewed By: yonghong-song, asbirlea

Differential Revision: https://reviews.llvm.org/D88855
2020-10-06 07:42:32 -07:00