llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	a213f735d8	[IR] Deprecate GetElementPtrInst::CreateInBounds without element type This API is not compatible with opaque pointers, the method accepting an explicit pointer element type should be used instead. Thankfully there were few in-tree users. The BPF case still ends up using the pointer element type for now and needs something like D105407 to avoid doing so.	2021-07-04 16:49:30 +02:00
Simon Pilgrim	ab2d295552	BPFISelDAGToDAG.cpp - don't dereference a dyn_cast<> result. NFCI. Use cast<> instead which will assert that the cast is correct and not just return null. Fixes static analysis warnings.	2021-06-06 13:24:29 +01:00
Yonghong Song	6a2ea84600	BPF: Add more relocation kinds Currently, BPF only contains three relocations: R_BPF_NONE for no relocation R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation R_BPF_64_32 for call insn and normal 32-bit data relocation Also .BTF and .BTF.ext sections contain symbols in allocated program and data sections. These two sections reserved 32bit space to hold the offset relative to the symbol's section. When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld may attempt to resolve relocations for .BTF and .BTF.ext, which we want to prevent. So we used R_BPF_NONE for such relocations. This all works fine until when we try to do linking of multiple objects. . R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data is different, so lld target->relocate() needs more context to do a correct job. . The same for R_BPF_64_32. More context is needed for lld target->relocate() to differentiate call insn vs. normal 32-bit data relocation. . Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE, they will not be relocated properly when multiple .BTF/.BTF.ext sections are merged by lld. This patch intends to address this issue by adding additional relocation kinds: R_BPF_64_ABS64 for normal 64-bit data relocation R_BPF_64_ABS32 for normal 32-bit data relocation R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations. The old R_BPF_64_{64,32} semantics: R_BPF_64_64 for LD_imm64 relocation R_BPF_64_32 for call insn relocation The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values is maintained. They are the most common use cases for bpf programs and we want to maintain backward compatibility as much as possible. ExecutionEngine RuntimeDyld BPF relocations are adjusted as well. R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and other relocations will be ignored. Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in RuntimeDyldELF.cpp will result in "Relocation type not implemented yet!" fatal error. 
FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp are removed as they are not triggered in BPF backend. BPF backend used FK_SecRel_8 for LD_imm64 instruction operands. Differential Revision: https://reviews.llvm.org/D102712	2021-05-25 08:19:13 -07:00
Alessandro Decina	833e9b2ea7	[BPF] add support for 32 bit registers in inline asm Add "w" constraint type which allows selecting 32 bit registers. 32 bit registers were added in https://reviews.llvm.org/rGca31c3bb3ff149850b664838fbbc7d40ce571879. Differential Revision: https://reviews.llvm.org/D102118	2021-05-16 11:01:47 -07:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
Yonghong Song	605c811d2b	BPF: fix FIELD_EXISTS relocation with array subscripts Lorenz Bauer reported an issue in bpf mailing list ([1]) where for FIELD_EXISTS relocation, if the object is an array subscript, the patched immediate is the object offset from the base address, instead of 1. Currently in BPF AbstractMemberAccess pass, the final offset from the base address is the patched offset except FIELD_EXISTS which is 1 unconditionally. In this particular case, the last data structure access is not a field (struct/union offset) so it didn't hit the place to set patched immediate to be 1. This patch fixed the issue by checking the relocation type. If the type is FIELD_EXISTS, just set to 1. Tested by modifying some bpf selftests, libbpf is okay with such types with FIELD_EXISTS relocation. [1] https://lore.kernel.org/bpf/CACAyw99n-cMEtVst7aK-3BfHb99GMEChmRLCvhrjsRpHhPrtvA@mail.gmail.com/ Differential Revision: https://reviews.llvm.org/D102036	2021-05-06 22:37:02 -07:00
Yonghong Song	bba7338b8f	BPF: generate BTF info for LD_imm64 loaded function pointer For an example like below, extern int do_work(int); long bpf_helper(void *callback_fn); long prog() { return bpf_helper(&do_work); } The final generated codes look like: r1 = do_work ll call bpf_helper exit where we have debuginfo for do_work() extern function: !17 = !DISubprogram(name: "do_work", ...) This patch implemented to add additional checking in processing LD_imm64 operands for possible function pointers so BTF for bpf function do_work() can be properly generated. The original llvm function name processReloc() is renamed to processGlobalValue() to better reflect what the function is doing. Differential Revision: https://reviews.llvm.org/D100568	2021-04-26 17:23:36 -07:00
Yonghong Song	a285bdb56f	BPF: remove default .extern data section Currently, for any extern variable, if it doesn't have a section attribute, it will be put into a default ".extern" btf DataSec. The initial design is to put every extern variable in a DataSec so libbpf can use it. But later on, libbpf actually requires extern variables to be put into special sections, e.g., ".kconfig", ".ksyms", etc. so they can be used properly based on section name. Andrii mentioned since ".extern" variables are not actually used, it makes sense to remove it from the compiler so libbpf does not need to deal with it, esp. for static linking. The BTF for these extern variables is still generated. With this patch, I tested kernel selftests/bpf and all tests passed. Indeed, removing ".extern" DataSec seems to have no impact. Differential Revision: https://reviews.llvm.org/D100392	2021-04-13 11:35:52 -07:00
Yonghong Song	968292cb93	BPF: generate proper BTF for globals with WeakODRLinkage For a global weak symbol defined as below: char g __attribute__((weak)) = 2; LLVM generates an allocated global with WeakAnyLinkage, for which BPF backend generates proper BTF info. For the above example, if a modifier "const" is added like const char g __attribute__((weak)) = 2; LLVM generates an allocated global with WeakODRLinkage, for which BPF backend didn't generate any BTF as it didn't handle WeakODRLinkage. This patch adds support for WeakODRLinkage and proper BTF info can be generated for weak symbol defined with "const" modifier. Differential Revision: https://reviews.llvm.org/D100362	2021-04-13 08:54:05 -07:00
Sander de Smalen	db134e2428	[TTI] NFC: Change getCmpSelInstrCost to return InstructionCost This patch migrates the TTI cost interfaces to return an InstructionCost. See this patch for the introduction of the type: https://reviews.llvm.org/D91174 See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100203	2021-04-13 14:21:01 +01:00
Yonghong Song	886f9ff531	BPF: add extern func to data sections if specified This permits extern function (BTF_KIND_FUNC) be added to BTF_KIND_DATASEC if a section name is specified. For example, -bash-4.4$ cat t.c void foo(int) __attribute__((section(".kernel.funcs"))); int test(void) { foo(5); return 0; } The extern function foo (BTF_KIND_FUNC) will be put into BTF_KIND_DATASEC with name ".kernel.funcs". This will help to differentiate two kinds of external functions, functions in kernel and functions defined in other bpf programs. Differential Revision: https://reviews.llvm.org/D93563	2021-03-25 16:03:29 -07:00
Yonghong Song	379d908848	BPF: provide better error message for unsupported atomic operations Currently, BPF backend does not support all variants of atomic_load_{add,and,or,xor}, atomic_swap and atomic_cmp_swap For example, it only supports 32bit (with alu32 mode) and 64bit operations for atomic_load_{and,or,xor}, atomic_swap and atomic_cmp_swap. Due to historical reason, atomic_load_add is always supported with 32bit and 64bit. If user used an unsupported atomic operation, currently, codegen selectiondag cannot find bpf support and will issue a fatal error. This is not user friendly as user may mistakenly think this is a compiler bug. This patch added Custom rule for unsupported atomic operations and will emit better error message during ReplaceNodeResults() callback. The following is an example output. $ cat t.c short sync(short *p) { return __sync_val_compare_and_swap (p, 2, 3); } $ clang -target bpf -O2 -g -c t.c t.c:2:11: error: Unsupported atomic operations, please use 64 bit version return __sync_val_compare_and_swap (p, 2, 3); ^ fatal error: error in backend: Cannot select: t19: i64,ch = AtomicCmpSwap<(load store seq_cst seq_cst 2 on %ir.p)> t0, t2, Constant:i64<2>, Constant:i64<3>, t.c:2:11 t2: i64,ch = CopyFromReg t0, Register:i64 %0 t1: i64 = Register %0 t11: i64 = Constant<2> t10: i64 = Constant<3> In function: sync PLEASE submit a bug report ... Fatal error will still happen since we did not really do proper lowering for these unsupported atomic operations. But we do get a much better error message. Differential Revision: https://reviews.llvm.org/D98471	2021-03-11 19:19:00 -08:00
Ilya Leoshkevich	a7137b238a	[BPF] Add support for floats and doubles Some BPF programs compiled on s390 fail to load, because s390 arch-specific linux headers contain float and double types. At the moment there is no BTF_KIND for floats and doubles, so the release version of LLVM ends up emitting type id 0 for them, which the in-kernel verifier does not accept. Introduce support for such types to libbpf by representing them using the new BTF_KIND_FLOAT. Reviewed By: yonghong-song Differential Revision: https://reviews.llvm.org/D83289	2021-03-05 15:10:11 +01:00
Yonghong Song	9c0274cdea	BPF: permit type modifiers for __builtin_btf_type_id() relocation Lorenz Bauer from Cloudflare tried to use "const struct <name>" as the type for __builtin_btf_type_id(*(const struct <name>)0, 1) relocation and hit a llvm BPF fatal error. https://lore.kernel.org/bpf/a3782f71-3f6b-1e75-17a9-1827822c2030@fb.com/ ... fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc Currently, we require the debuginfo type itself must have a name. In this case, the debuginfo type is "const" which points to "struct <name>". The "const" type does not have a name, hence the above fatal error will be triggered. Let us permit "const" and "volatile" type modifiers. We skip modifiers in some other cases as well like structure member type tracing. This can avoid the above fatal error. Differential Revision: https://reviews.llvm.org/D97986	2021-03-04 16:27:23 -08:00
Yonghong Song	51cdb780db	BPF: Fix a bug in peephole TRUNC elimination optimization Andrei Matei reported a llvm11 core dump for his bpf program https://bugs.llvm.org/show_bug.cgi?id=48578 The core dump happens in LiveVariables analysis phase. #4 0x00007fce54356bb0 __restore_rt #5 0x00007fce4d51785e llvm::LiveVariables::HandleVirtRegUse(unsigned int, llvm::MachineBasicBlock*, llvm::MachineInstr&) #6 0x00007fce4d519abe llvm::LiveVariables::runOnInstr(llvm::MachineInstr&, llvm::SmallVectorImpl<unsigned int>&) #7 0x00007fce4d519ec6 llvm::LiveVariables::runOnBlock(llvm::MachineBasicBlock*, unsigned int) #8 0x00007fce4d51a4bf llvm::LiveVariables::runOnMachineFunction(llvm::MachineFunction&) The bug can be reproduced with llvm12 and latest trunk as well. Further analysis shows that there is a bug in BPF peephole TRUNC elimination optimization, which tries to remove unnecessary TRUNC operations (a <<= 32; a >>= 32). Specifically, the compiler did wrong transformation for the following patterns: %1 = LDW ... %2 = SLL_ri %1, 32 %3 = SRL_ri %2, 32 ... %3 ... %4 = SRA_ri %2, 32 ... %4 ... The current transformation did not check how many uses %2 has and did transformation like %1 = LDW ... ... %1 ... %4 = SRL_ri %2, 32 ... %4 ... and pseudo register %2 is used but not defined and caused LiveVariables analysis core dump. To fix the issue, when traversing back from SRL_ri to SLL_ri, check to ensure SLL_ri has only one use. Otherwise, don't do transformation. Differential Revision: https://reviews.llvm.org/D97792	2021-03-02 13:03:42 -08:00
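A minimal C sketch of why the shifted value in the pattern above has two distinct consumers: a left shift by 32 followed by a logical right shift zero-extends the low 32 bits, while the same left shift followed by an arithmetic right shift sign-extends them. The function names are hypothetical and only model the SLL_ri/SRL_ri and SLL_ri/SRA_ri pairs; this is not code from the BPF backend.

```c
#include <stdint.h>

/* Models SLL_ri 32 followed by SRL_ri 32: keeps the low 32 bits,
 * zero-extended to 64 bits. */
static uint64_t shl32_then_srl32(uint64_t v)
{
    return (v << 32) >> 32;
}

/* Models SLL_ri 32 followed by SRA_ri 32: keeps the low 32 bits,
 * sign-extended to 64 bits. Written without a signed right shift so the
 * result does not depend on implementation-defined behavior. */
static int64_t shl32_then_sra32(uint64_t v)
{
    uint64_t low = v & 0xffffffffu;

    if (low & 0x80000000u)
        return (int64_t)low - ((int64_t)1 << 32);
    return (int64_t)low;
}
```

For a value with bit 31 set the two results differ, so removing the shared left shift on behalf of one consumer leaves the other reading an undefined register, which is the failure the commit describes.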
Yonghong Song	6d102f15a3	BPF: Add LLVMTransformUtils in CMakefile LINK_COMPONENTS Commit `1959ead525` ("BPF: Implement TTI.getCmpSelInstrCost() properly") introduced a dependency on LLVMTransformUtils library. Let us encode this dependency explicitly in CMakefile to avoid build error.	2021-02-25 15:43:25 -08:00
Yonghong Song	1959ead525	BPF: Implement TTI.getCmpSelInstrCost() properly The Select insn in BPF is expensive as BPF backend needs to resolve with conditionals. This patch sets the getCmpSelInstrCost() to SCEVCheapExpansionBudget for Select insn to prevent some Select insn related optimizations. This change is motivated during bcc code review for https://github.com/iovisor/bcc/pull/3270 where IndVarSimplifyPass eventually caused generating the following asm code: ; for (i = 0; (i < VIRTIO_MAX_SGS) && (i < num); i++) { 14: 16 05 40 00 00 00 00 00 if w5 == 0 goto +64 <LBB0_6> 15: bc 51 00 00 00 00 00 00 w1 = w5 16: 04 01 00 00 ff ff ff ff w1 += -1 17: 67 05 00 00 20 00 00 00 r5 <<= 32 18: 77 05 00 00 20 00 00 00 r5 >>= 32 19: a6 01 01 00 05 00 00 00 if w1 < 5 goto +1 <LBB0_4> 20: b7 05 00 00 06 00 00 00 r5 = 6 00000000000000a8 <LBB0_4>: 21: b7 02 00 00 00 00 00 00 r2 = 0 22: b7 01 00 00 00 00 00 00 r1 = 0 ; for (i = 0; (i < VIRTIO_MAX_SGS) && (i < num); i++) { 23: 7b 1a e0 ff 00 00 00 00 *(u64 *)(r10 - 32) = r1 24: 7b 5a c0 ff 00 00 00 00 *(u64 *)(r10 - 64) = r5 Note that insn #15 has w1 = w5 and w1 is refined later but r5(w5) is eventually saved on stack at insn #24 for later use. This causes later verifier failures. With this change, IndVarSimplifyPass won't do the above transformation any more. Differential Revision: https://reviews.llvm.org/D97479	2021-02-25 14:48:53 -08:00
Juneyoung Lee	e4d751c271	Update BPFAdjustOpt.cpp to accept select form of or as well This is a minor pattern-match update to BPFAdjustOpt.cpp to accept not only 'or i1 a, b' but also 'select i1 a, i1 true, i1 b'. This resolves regression after SimplifyCFG's creating select form of and/or instead (https://reviews.llvm.org/D95026). This is a small change, and currently such select form isn't created or doesn't reach to the late pipeline (because InstCombine eagerly folds it into and/or i1), so I chose to commit without a review process.	2021-02-20 18:29:58 +09:00
Yonghong Song	74975d35b4	BPF: Add LLVMAnalysis in CMakefile LINK_COMPONENTS buildbot reported a build error like below: BPFTargetMachine.cpp:(.text._ZN4llvm19TargetTransformInfo5ModelINS_10BPFTTIImplEED2Ev [_ZN4llvm19TargetTransformInfo5ModelINS_10BPFTTIImplEED2Ev]+0x14): undefined reference to `llvm::TargetTransformInfo::Concept::~Concept()' lib/Target/BPF/CMakeFiles/LLVMBPFCodeGen.dir/BPFTargetMachine.cpp.o: In function `llvm::TargetTransformInfo::Model<llvm::BPFTTIImpl>::~Model()': Commit `a260ae7160` ("BPF: Implement TTI.IntImmCost() properly") added TargetTransformInfo to BPF, which requires LLVMAnalysis dependence. In certain cmake configurations, lacking explicit LLVMAnalysis dependency may cause compilation error. Similar to other targets, this patch added LLVMAnalysis in CMakefile LINK_COMPONENTS explicitly.	2021-02-11 10:24:22 -08:00
Craig Topper	5744502a13	[TargetLowering][RISCV][AArch64][PowerPC] Enable BuildUDIV/BuildSDIV on illegal types before type legalization if we can find a larger legal type that supports MUL. If we wait until the type is legalized, we'll lose information about the original type and need to use larger magic constants. This gets especially bad on RISCV64 where i64 is the only legal type. I've limited this to simple scalar types so it only works for i8/i16/i32 which are most likely to occur. For more odd types we might want to do a small promotion to a type where MULH is legal instead. Unfortunately, this does prevent some urem/srem+seteq matching since that still requires legal types. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D96210	2021-02-11 09:43:13 -08:00
Yonghong Song	a260ae7160	BPF: Implement TTI.IntImmCost() properly This patch implemented TTI.IntImmCost() properly. Each BPF insn has 32bit immediate space, so for any immediate which can be represented as 32bit signed int, the cost is technically free. If an int cannot be represented as a 32bit signed int, a ld_imm64 instruction is needed and a TCC_Basic is returned. This change is motivated when we observed that several bpf selftests failed with latest llvm trunk, e.g., #10/16 strobemeta.o:FAIL #10/17 strobemeta_nounroll1.o:FAIL #10/18 strobemeta_nounroll2.o:FAIL #10/19 strobemeta_subprogs.o:FAIL #96 snprintf_btf:FAIL The failure occurs because SpeculateAroundPHIsPass did aggressive transformation which alters control flow in a way the verifier currently cannot handle well. In llvm12, SpeculateAroundPHIsPass is not called. SpeculateAroundPHIsPass relied on TTI.getIntImmCost() and TTI.getIntImmCostInst() for profitability analysis. This patch implemented TTI.getIntImmCost() properly for BPF backend which also prevented transformation which caused the above test failures. Differential Revision: https://reviews.llvm.org/D96448	2021-02-11 08:35:25 -08:00
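The cost rule described in the commit above can be sketched in plain C as follows. This is an illustration only, not the LLVM implementation: the enum and function names are hypothetical, and the two cost values simply stand in for "free" and TCC_Basic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the "free" and TCC_Basic cost levels. */
enum imm_cost { IMM_COST_FREE = 0, IMM_COST_BASIC = 1 };

/* True when the immediate fits in the 32-bit signed immediate field
 * that each BPF instruction carries. */
static bool fits_simm32(int64_t imm)
{
    return imm >= INT32_MIN && imm <= INT32_MAX;
}

/* Immediates that fit are free; anything wider needs a ld_imm64
 * instruction and is charged the basic cost. */
static enum imm_cost bpf_int_imm_cost(int64_t imm)
{
    return fits_simm32(imm) ? IMM_COST_FREE : IMM_COST_BASIC;
}
```

Under these assumptions, `bpf_int_imm_cost(-1)` is free, while `bpf_int_imm_cost(2147483648LL)` is charged the basic cost because it is one past the largest 32-bit signed value.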
Kazu Hirata	8ed1636184	[llvm] Use isa instead of dyn_cast (NFC)	2021-01-29 23:23:37 -08:00
Kazu Hirata	7dc3575ef2	[llvm] Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-01-14 20:30:34 -08:00
Kazu Hirata	8a20e2b3d3	[llvm] Use Optional::getValueOr (NFC)	2021-01-12 21:43:50 -08:00
Kazu Hirata	985f899bf2	[Target] Use llvm::append_range (NFC)	2021-01-03 09:57:43 -08:00
Kazu Hirata	b557c32ae9	[MemorySSA, BPF] Use isa instead of dyn_cast (NFC)	2020-12-31 09:39:13 -08:00
Yonghong Song	286daafd65	[BPF] support atomic instructions Implement fetch_<op>/fetch_and_<op>/exchange/compare-and-exchange instructions for BPF. Specifically, the following gcc intrinsics are implemented. __sync_fetch_and_add (32, 64) __sync_fetch_and_sub (32, 64) __sync_fetch_and_and (32, 64) __sync_fetch_and_or (32, 64) __sync_fetch_and_xor (32, 64) __sync_lock_test_and_set (32, 64) __sync_val_compare_and_swap (32, 64) For __sync_fetch_and_sub, internally, it is implemented as a negation followed by __sync_fetch_and_add. For __sync_lock_test_and_set, despite its name, it actually does an atomic exchange and returns the old content. https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html For intrinsics like __sync_{add,sub}_and_fetch and __sync_bool_compare_and_swap, the compiler is able to generate codes using __sync_fetch_and_{add,sub} and __sync_val_compare_and_swap. Similar to xadd, atomic xadd, xor and xxor (atomic_<op>) instructions are added for atomic operations which do not have return values. LLVM will check the return value for __sync_fetch_and_{add,and,or,xor}. If the return value is used, instructions atomic_fetch_<op> will be used. Otherwise, atomic_<op> instructions will be used. All new instructions only support 64bit and 32bit with alu32 mode. old xadd instruction still supports 32bit without alu32 mode. For encoding, please take a look at test atomics_2.ll. Differential Revision: https://reviews.llvm.org/D72184	2020-12-03 07:38:00 -08:00
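A short C sketch of three of the intrinsics listed in the commit above, using the GCC/Clang `__sync_*` builtins on 64-bit values. The wrapper names are hypothetical; the first wrapper mirrors the described lowering of fetch-and-sub as a negation followed by fetch-and-add, expressed at the source level rather than as the backend does it.

```c
#include <stdint.h>

/* Fetch-and-sub expressed as negate + fetch-and-add. Unsigned arithmetic
 * keeps the negation well defined for every input. Returns the old value. */
static uint64_t fetch_sub_via_add(uint64_t *p, uint64_t v)
{
    return __sync_fetch_and_add(p, (uint64_t)0 - v);
}

/* Despite its name, __sync_lock_test_and_set performs an atomic exchange
 * and returns the previous contents. */
static uint64_t exchange(uint64_t *p, uint64_t v)
{
    return __sync_lock_test_and_set(p, v);
}

/* Compare-and-swap: stores desired only if *p equals expected, and
 * returns the value that was in *p before the operation. */
static uint64_t cas(uint64_t *p, uint64_t expected, uint64_t desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}
```

Because the return value is used in each wrapper, this corresponds to the atomic_fetch_<op> case in the commit message; discarding the result of a fetch-and-<op> call is the case the message says maps to the atomic_<op> instructions without a return value.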
Arthur Eubanks	92a67e131f	[BPF][NewPM] Port bpf-adjust-opt to NPM and add it to pipeline Reviewed By: yonghong-song Differential Revision: https://reviews.llvm.org/D91990	2020-11-26 10:11:26 -08:00
Florian Hahn	b2f4c5fddc	[AsmWriter] Factor out mnemonic generation to accessible getMnemonic. This patch factors out the part of printInstruction that gets the mnemonic string for a given MCInst. This is intended to be used subsequently for the instruction-mix remarks to display the final mnemonic (D90040). Unfortunately making `getMnemonic` available to the AsmPrinter seems to require making it virtual. Not sure if there's a way around that with the current layering of the AsmPrinters. Reviewed By: Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D90039	2020-11-17 09:47:38 +00:00
Yonghong Song	4369223ea7	BPF: make __builtin_btf_type_id() return 64bit int Linux kernel recently added support for kernel modules https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org/ In such cases, a type id in the kernel needs to be presented as (btf id for modules, btf type id for this module). Change __builtin_btf_type_id() to return 64bit value so libbpf can do the above encoding. Differential Revision: https://reviews.llvm.org/D91489	2020-11-16 07:08:41 -08:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These functions store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Arthur Eubanks	ab0ddbc38a	Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 13:11:40 -08:00
Arthur Eubanks	9173b5a99d	Revert "[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback" This reverts commit `7a83aa0520`. Causing buildbot failures.	2020-11-04 12:57:32 -08:00
Arthur Eubanks	7a83aa0520	[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 12:53:30 -08:00
Jameson Nash	a0ad066ce4	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed, this just cleans up the code so that it can be exposed publicly. Replaces https://reviews.llvm.org/D74158 Differential Revision: https://reviews.llvm.org/D89613	2020-11-03 10:02:09 -05:00
Jameson Nash	4242df1470	Revert "make the AsmPrinterHandler array public" I messed up one of the tests.	2020-10-16 17:22:07 -04:00
Jameson Nash	ac2def2d8d	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed, this just cleans up the code so that it can be exposed publicly. Differential Revision: https://reviews.llvm.org/D74158	2020-10-16 16:27:31 -04:00
Arthur Eubanks	2218e6d0a8	[BPF] Make BPFAbstractMemberAccessPass required Or else on optnone functions we get the following during instruction selection: fatal error: error in backend: Cannot select: intrinsic %llvm.preserve.struct.access.index Currently the -O0 pipeline doesn't properly run passes registered via TargetMachine::registerPassBuilderCallbacks(), so don't add that RUN line yet. That will be fixed after this. Reviewed By: yonghong-song Differential Revision: https://reviews.llvm.org/D89083	2020-10-09 11:26:37 -07:00
Yonghong Song	3161172168	BPF: fix incorrect DAG2DAG load optimization Currently, bpf backend instruction selection DAG2DAG phase has an optimization to replace loading constant struct member or array element with direct values. The reason is that these locally defined struct or array variables may have their initial values stored in a readonly section and early bpf ecosystem is not able to handle such cases. Bpf ecosystem now can not only handle readonly sections, but also global variables. global variable can also have initialized data and global variable may or may not be constant, i.e., global variable data can be put in .data section or .rodata section. This exposed a bug in DAG2DAG Load optimization as it did not check whether the global variable is constant or not. This patch fixed the bug by checking whether global variable, representing the initial data, is constant or not and will not do optimization if it is not a constant. Another bug is also fixed in this patch to check whether the load is simple (not volatile/atomic) or not. If it is not simple, we will not do optimization. To summarize for globals: - struct t var = { ... } ; // no load optimization - const struct t var = { ... }; // load optimization is possible - volatile const struct t var = { ... }; // no load optimization Differential Revision: https://reviews.llvm.org/D89021	2020-10-07 19:08:40 -07:00
Yonghong Song	ddf1864ace	BPF: add AdjustOpt IR pass to generate verifier friendly codes Add an IR phase right before main module optimization. This is to modify IR to restrict certain downward optimizations in order to generate verifier friendly code. > prevent certain instcombine optimizations, handling both in-block/cross-block instcombines. > avoid speculative code motion if the variable used in condition is also used in the later blocks. Internally, a bpf IR builtin result = __builtin_bpf_passthrough(seq_num, result) is used to enforce ordering. This builtin is only used during target independent IR optimizations and it will be removed at the beginning of target dependent IR optimizations. For example, removing the following workaround, --- a/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c +++ b/tools/testing/selftests/bpf/progs/test_sysctl_loop1.c @@ -47,7 +47,7 @@ int sysctl_tcp_mem(struct bpf_sysctl *ctx) /* a workaround to prevent compiler from generating * codes verifier cannot handle yet. */ - volatile int ret; + int ret; this patch is able to generate code which passed the verifier. To disable optimization, users need to use "opt" command like below: clang -target bpf -O2 -S -emit-llvm -Xclang -disable-llvm-passes test.c // disable icmp serialization opt -O2 -bpf-disable-serialize-icmp test.ll | llvm-dis > t.ll // disable avoid-speculation opt -O2 -bpf-disable-avoid-speculation test.ll | llvm-dis > t.ll llc t.ll Differential Revision: https://reviews.llvm.org/D85570	2020-10-07 08:49:10 -07:00
Yonghong Song	edd71db38b	BPF: avoid duplicated globals for CORE relocations This patch fixed two issues related with relocation globals. In LLVM, if a global, e.g. with name "g", is created and conflict with another global with the same name, LLVM will rename the global, e.g., with a new name "g.2". Since relocation global name has special meaning, we do not want llvm to change it, so internally we have logic to check whether duplication happens or not. If happens, just reuse the previous global. The first bug is related to non-btf-id relocation (BPFAbstractMemberAccess.cpp). Commit `54d9f743c8` ("BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible") changed ModulePass to FunctionPass, i.e., handling each function at a time. But still just one BPFAbstractMemberAccess object is created so module level de-duplication is still possible. Commit `40251fee00` ("[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer pipeline") made a change to create a BPFAbstractMemberAccess object per function so module level de-duplication is not possible any more without going through all module globals. This patch simply changed the map which holds reloc globals as class static, so it will be available to all BPFAbstractMemberAccess objects for different functions. The second bug is related to btf-id relocation (BPFPreserveDIType.cpp). Before Commit `54d9f743c8`, the pass is a ModulePass, so we have a local variable, incremented for each instance, and works fine. But after Commit `54d9f743c8`, the pass becomes a FunctionPass. Local variable won't work properly since different functions will start with the same initial value. Fix the issue by changing the local count variable to static, so it will be truly unique across the whole module compilation. Differential Revision: https://reviews.llvm.org/D88942	2020-10-06 22:37:49 -07:00
Arthur Eubanks	40251fee00	[BPF][NewPM] Make BPFTargetMachine properly adjust NPM optimizer pipeline This involves porting BPFAbstractMemberAccess and BPFPreserveDIType to NPM, then adding them BPFTargetMachine::registerPassBuilderCallbacks (the NPM equivalent of adjustPassManager()). Reviewed By: yonghong-song, asbirlea Differential Revision: https://reviews.llvm.org/D88855	2020-10-06 07:42:32 -07:00
Yonghong Song	54d9f743c8	BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible Move abstractMemberAccess and PreserveDIType passes as early as possible, right after clang code generation. Currently, compiler may transform the following code p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); bpf_probe_read(buf, buf_size, p2); } to p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { bpf_probe_read(buf, buf_size, p2); } and eventually assembly code looks like reloc_exist = 1; reloc_member_offset = 10; //calculate member offset from base p2 = base + reloc_member_offset; if (reloc_exist) { bpf_probe_read(bpf, buf_size, p2); } if during libbpf relocation resolution, reloc_exist is actually resolved to 0 (not exist), reloc_member_offset relocation cannot be resolved and will be patched with illegal instruction. This will cause verifier failure. This patch attempts to address this issue by doing chaining analysis and replacing chains with special globals right after clang code gen. This will remove the cse possibility described in the above. The IR typically looks like %6 = load @llvm.sk_buff:0:50$0:0:0:2:0 %7 = bitcast %struct.sk_buff* %2 to i8* %8 = getelementptr i8, i8* %7, %6 for a particular address computation relocation. But this transformation has another consequence, code sinking may happen like below: PHI = <possibly different @preserve_*_access_globals> %7 = bitcast %struct.sk_buff* %2 to i8* %8 = getelementptr i8, i8* %7, %6 For such cases, we will not be able to generate relocations since multiple relocations are merged into one. 
This patch introduced a passthrough builtin to prevent such optimization. Looks like inline assembly has more impact for optimization, e.g., inlining. Using passthrough has less impact on optimizations. A new IR pass is introduced at the beginning of target-dependent IR optimization, which does: - report fatal error if any reloc global in PHI nodes - remove all bpf passthrough builtin functions Changes for existing CORE tests: - for clang tests, add "-Xclang -disable-llvm-passes" flags to avoid builtin->reloc_global transformation so the test is still able to check correctness for clang generated IR. - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command before "llc" command since "opt" is needed to call newly-placed builtin->reloc_global transformation. Add target triple in the IR file since "opt" requires it. - Since target triple is added in IR file, if a test may produce different results for different endianness, two tests will be created, one for bpfeb and another for bpfel, e.g., some tests for relocation of lshift/rshift of bitfields. - field-reloc-bitfield-1.ll has different relocations compared to old codes. This is because for the structure in the test, new code returns struct layout alignment 4 while old code is 8. Align 8 is more precise and permits double load. With align 4, the new mechanism uses 4-byte load, so generating different relocations. - test intrinsic-transforms.ll is removed. This is used to test cse on intrinsics so we do not lose metadata. Now metadata is attached to global and not instruction, it won't get lost with cse. Differential Revision: https://reviews.llvm.org/D87153	2020-09-28 16:56:22 -07:00
Simon Pilgrim	95ca3aacf0	BTFDebug.h - reduce MachineInstr.h include to forward declaration. NFCI.	2020-09-07 17:51:13 +01:00
Craig Topper	c7a0b2684f	[X86][MC][Target] Initial backend support a tune CPU to support -mtune This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line. This patch adds MC layer support for a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables. These feature lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all targets as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned. One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU. I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning. Differential Revision: https://reviews.llvm.org/D85165	2020-08-14 15:31:50 -07:00
Yonghong Song	c50f5dece9	BPF: fix libLLVMBPFCodeGen.so build failure Buildbot reported a build failure when building shared library libLLVMBPFCodeGen.so with unknown reference to "createCFGSimplificationPass". Commit `87cba43402` ("BPF: add a SimplifyCFG IR pass during generic Scalar/IPO optimization") added an IR pass SimplifyCFG by BPF target. The commit called function createCFGSimplificationPass() defined in "Scalar" library. Add this library in Target/BPF/LLVMBuild.txt so shared library build can succeed.	2020-08-06 15:27:15 -07:00
Yonghong Song	87cba43402	BPF: add a SimplifyCFG IR pass during generic Scalar/IPO optimization The following bpf linux kernel selftest failed with latest llvm: $ ./test_progs -n 7/10 ... The sequence of 8193 jumps is too complex. verification time 126272 usec stack depth 320 processed 114799 insns (limit 1000000) ... libbpf: failed to load object 'pyperf600_nounroll.o' test_bpf_verif_scale:FAIL:110 #7/10 pyperf600_nounroll.o:FAIL #7 bpf_verif_scale:FAIL After some investigation, I found the following llvm patch https://reviews.llvm.org/D84108 is responsible. The patch disabled hoisting common instructions in SimplifyCFG by default. Later on, the code changes and a SimplifyCFG phase with hoisting on cannot do the work any more. A test is provided to demonstrate the problem. The IR before simplifyCFG looks like: for.cond: %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] %cmp = icmp ult i32 %i.0, 6 br i1 %cmp, label %for.body, label %for.cond.cleanup for.cond.cleanup: %2 = load i8, i8* %frame_ptr, align 8, !tbaa !2 %cmp2 = icmp eq i8* %2, null %conv = zext i1 %cmp2 to i32 call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3 call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3 ret i32 %conv for.body: %3 = load i8, i8* %frame_ptr, align 8, !tbaa !2 %tobool.not = icmp eq i8* %3, null br i1 %tobool.not, label %for.inc, label %land.lhs.true The first two insns of `for.cond.cleanup` and `for.body`, load and icmp, can be hoisted to `for.cond` block. With Patch D84108, the optimization is delayed. But unfortunately, later on loop rotation added additional phi nodes to `for.body` and hoisting cannot be done any more. Note such a hoisting is beneficial to bpf programs as bpf verifier does path sensitive analysis and verification. The hoisting prevents reloading from stack which will assume conservative value and increase explored insns. In this case, it caused verifier failure.
To fix this problem, I added an IR pass from bpf target to perform an additional simplifycfg with hoisting common inst enabled. Differential Revision: https://reviews.llvm.org/D85434	2020-08-06 13:16:00 -07:00
Yonghong Song	00602ee7ef	BPF: simplify IR generation for __builtin_btf_type_id() This patch simplified IR generation for __builtin_btf_type_id(). For __builtin_btf_type_id(obj, flag), previously the IR builtin looked like if (obj is a lvalue) llvm.bpf.btf.type.id(obj.ptr, 1, flag) !type else llvm.bpf.btf.type.id(obj, 0, flag) !type The purpose of the 2nd argument is to differentiate __builtin_btf_type_id(obj, flag) where obj is a lvalue vs. __builtin_btf_type_id(obj.ptr, flag) Note that obj or obj.ptr is never used by the backend and the `obj` argument is only used to derive the type. This code sequence is subject to potential llvm CSE when - obj is the same, e.g., nullptr - flag is the same - metadata type is different, e.g., typedef of struct "s" and struct "s". In the above, we don't want CSE since their metadata is different. This patch changes the IR builtin to llvm.bpf.btf.type.id(seq_num, flag) !type and seq_num is always increasing. This will prevent potential llvm CSE. Also report an error if the type name is empty for remote relocation since remote relocation needs non-empty type name to do relocation against vmlinux. Differential Revision: https://reviews.llvm.org/D85174	2020-08-04 16:29:42 -07:00
Yonghong Song	6d218b4adb	BPF: support type exist/size and enum exist/value relocations Four new CO-RE relocations are introduced: - TYPE_EXISTENCE: whether a typedef/record/enum type exists - TYPE_SIZE: the size of a typedef/record/enum type - ENUM_VALUE_EXISTENCE: whether an enum value of an enum type exists - ENUM_VALUE: the enum value of an enum type These additional relocations will make CO-RE bpf programs more adaptive for potential kernel internal data structure changes. Differential Revision: https://reviews.llvm.org/D83878	2020-08-04 12:35:39 -07:00
Fangrui Song	40da58a04b	[MC] Default MCAsmBackend::mayNeedRelaxation() to false	2020-08-02 22:13:59 -07:00

416 Commits