llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	902cbcd59e	Use llvm::is_contained where appropriate (NFC) Summary: This patch replaces std::find with llvm::is_contained where appropriate. Reviewers: efriedma, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, jvesely, nhaehnle, hiraditya, rogfer01, kerbowa, llvm-commits, vkmr Tags: #llvm Differential Revision: https://reviews.llvm.org/D84489	2020-07-27 10:20:44 -07:00
Simon Pilgrim	0128b9505c	Revert rG5dd566b7c7b78bd- "PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI." This reverts commit `5dd566b7c7`. Causing some buildbot failures that I'm not seeing on MSVC builds.	2020-07-24 13:02:33 +01:00
Simon Pilgrim	5dd566b7c7	PassManager.h - remove unnecessary Function.h/Module.h includes. NFCI. PassManager.h is one of the top headers in the ClangBuildAnalyzer frontend worst offenders list. This exposes a large number of implicit dependencies on various forward declarations/includes in other headers that need addressing.	2020-07-24 12:40:50 +01:00
Yuanfang Chen	ca1e69a675	[NFC] remove unused includes of SelectionDAGISel.h	2020-07-20 10:43:29 -07:00
Simon Pilgrim	017e5c949b	MCFixup.h - remove unnecessary MCExpr.h include. NFCI. Move the include down to files that actually depend on MCExpr definitions. Also exposes an implicit dependency on MCContext in AVRAsmBackend.h	2020-07-20 15:17:19 +01:00
Yonghong Song	0e347c0ff0	BPF: generate .rodata BTF datasec for certain initialized local var's Currently, BTF datasec type for .rodata is generated only if there are user-defined readonly global variables which have debuginfo generated. Certain readonly global variables may be generated from initialized local variables. For example, void foo(const void *); int test() { const struct { unsigned a[4]; char b; } val = { .a = {2, 3, 4, 5}, .b = 6 }; foo(&val); return 0; } The clang will create a private linkage const global to store the initialized value: @__const.test.val = private unnamed_addr constant %struct.anon { [4 x i32] [i32 2, i32 3, i32 4, i32 5], i8 6 }, align 4 This global variable eventually is put in .rodata ELF section. If there is .rodata ELF section, libbpf expects a BTF .rodata datasec as well even though it may be empty meaning there are no global readonly variables with proper debuginfo. Martin reported a bug where without this empty BTF .rodata datasec, the bpftool gen will exit with an error. This patch fixed the issue by generating .rodata BTF datasec if there exists local var intial data which will result in .rodata ELF section. Differential Revision: https://reviews.llvm.org/D84002	2020-07-17 09:45:57 -07:00
Logan Smith	a19461d9e1	[NFC] Add 'override' keyword where missing in include/ and lib/. This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual. Differential Revision: https://reviews.llvm.org/D83709	2020-07-14 09:47:29 -07:00
Yonghong Song	152a9fef1b	BPF: permit .maps section variables with typedef type Currently, llvm when see a global variable in .maps section, it ensures its type must be a struct type. Then pointee will be further evaluated for the structure members. In normal cases, the pointee type will be skipped. Although this is what current all bpf programs are doing, but it is a little bit restrictive. For example, it is legitimate for users to have: typedef struct { int key_size; int value_size; } __map_t; __map_t map __attribute__((section(".maps"))); This patch lifts this restriction and typedef of a struct type is also allowed for .maps section variables. To avoid create unnecessary fixup entries when traversal started with typedef/struct type, the new implementation first traverse all map struct members and then traverse the typedef/struct type. This way, in internal BTFDebug implementation, no fixup entries are generated. Two new unit tests are added for typedef and const struct in .maps section. Also tested with kernel bpf selftests. Differential Revision: https://reviews.llvm.org/D83638	2020-07-12 09:42:25 -07:00
Yonghong Song	3eacfdc72f	[BPF] Fix a BTF gen bug related to a pointer struct member Currently, BTF generation stops at pointer struct members if the pointee type is a struct. This is to avoid bloating generated BTF size. The following is the process to correctly record types for these pointee struct types. - During type traversal stage, when a struct member, which is a pointer to another struct, is encountered, the pointee struct type, keyed with its name, is remembered in a Fixup map. - Later, when all type traversal is done, the Fixup map is scanned, based on struct name matching, to either resolve as pointing to a real already generated type or as a forward declaration. Andrii discovered a bug if the struct member pointee struct is anonymous. In this case, a struct with empty name is recorded in Fixup map, and later it happens another anonymous struct with empty name is defined in BTF. So wrong type resolution happens. To fix the problem, if the struct member pointee struct is anonymous, pointee struct type will be generated in stead of being put in Fixup map. Differential Revision: https://reviews.llvm.org/D82976	2020-07-01 09:55:01 -07:00
Guillaume Chatelet	0f9d623b63	[Alignment][NFC] Use Align for BPFAbstractMemberAccess::RecordAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82962	2020-07-01 16:23:52 +00:00
Yonghong Song	7f6bc84a97	[BPF] Fix a bug for __builtin_preserve_field_info() processing Andrii discovered a problem where a simple case similar to below will generate wrong relocation kind: enum { FIELD_EXISTENCE = 2, }; struct s1 { int a1; }; int test() { struct s1 *v = 0; return __builtin_preserve_field_info(v[0], FIELD_EXISTENCE); } The expected relocation kind should be FIELD_EXISTENCE, but recorded reloc kind in the final object file is FIELD_BYTE_OFFSET, which is incorrect. This exposed a bug in generating access strings from intrinsics. The current access string generation has two steps: step 1: find the base struct/union type, step 2: traverse members in the base type. The current implementation relies on at lease one member access in step 2 to get the correct relocation kind, which is true in typical cases. But if there is no member accesses, the current implementation falls to the default info kind FIELD_BYTE_OFFSET. This is incorrect, we should still record the reloc kind based on the user input. This patch fixed this issue by properly recording the reloc kind in such cases. Differential Revision: https://reviews.llvm.org/D82932	2020-06-30 23:45:37 -07:00
Guillaume Chatelet	c1cd61e02a	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemcpy to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82849	2020-06-30 13:12:31 +00:00
Yonghong Song	89648eb16d	[BPF] fix a bug for BTF pointee type pruning In BTF, pointee type pruning is used to reduce cluttering too many unused types into prog BTF. For example, struct task_struct { ... struct mm_struct mm; ... } If bpf program does not access members of "struct mm_struct", there is no need to bring types for "struct mm_struct" to BTF. This patch fixed a bug where an incorrect pruning happened. The test case like below: struct t; typedef struct t _t; struct s1 { _t c; }; int test1(struct s1 arg) { ... } struct t { int a; int b; }; struct s2 { _t c; } int test2(struct s2 arg) { ... } After processing test1(), among others, BPF backend generates BTF types for "struct s1", "_t" and a placeholder for "struct t". Note that "struct t" is not really generated. If later a direct access to "struct t" member happened, "struct t" BTF type will be generated properly. During processing test2(), when processing member type "_t c", BPF backend sees type "_t" already generated, so returned. This caused the problem that "struct t" BTF type is never generated and eventually causing incorrect type definition for "struct s2". To fix the issue, during DebugInfo type traversal, even if a typedef/const/volatile/restrict derived type has been recorded in BTF, if it is not a type pruning candidate, type traversal of its base type continues. Differential Revision: https://reviews.llvm.org/D82041	2020-06-17 15:13:46 -07:00
Benjamin Kramer	df9a51dab3	Remove global std::strings. NFCI.	2020-06-17 14:29:42 +02:00
Yonghong Song	4db1878158	[BPF] fix incorrect type in BPFISelDAGToDAG readonly load optimization In BPF Instruction Selection DAGToDAG transformation phase, BPF backend had an optimization to turn load from readonly data section to direct load of the values. This phase is implemented before libbpf has readonly section support and before alu32 is supported. This phase however may generate incorrect type when alu32 is enabled. The following is an example, -bash-4.4$ cat ~/tmp2/t.c struct t { unsigned char a; unsigned char b; unsigned char c; }; extern void foo(void ); int test() { struct t v = { .b = 2, }; foo(&v); return 0; } The compiler will turn local variable "v" into a readonly section. During instruction selection phase, the compiler generates two loads from readonly section, one 2 byte load or 1 byte load, e.g., for 2 loads, t8: i32,ch = load<(dereferenceable load 2 from `i8 getelementptr inbounds (%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1), anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64 t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8, FrameIndex:i64<0>, undef:i64 BPF backend changed t8 to i64 = Constant<2> and eventually the generated machine IR: t10: i64 = MOV_ri TargetConstant:i64<2> t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8> t41: i32 = OR_ri_32 t40, TargetConstant:i64<0> t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>, TargetConstant:i64<0>, t3 Note that t10 in the above is not correct. The type should be i32 and instruction should be MOV_ri_32. The reason for incorrect insn selection is BPF insn selection generated an i64 constant instead of an i32 constant as specified in the original load instruction. Such incorrect insn sequence eventually caused the following fatal error when a COPY insn tries to copy a 64bit register to a 32bit subregister. Impossible reg-to-reg copy UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42! This patch fixed the issue by using the load result type instead of always i64 when doing readonly load optimization. Differential Revision: https://reviews.llvm.org/D81630	2020-06-11 19:31:06 -07:00
Yonghong Song	3659559cf3	[BPF] Remove unnecessary MOV_32_64 instructions Commit `13f6c81c5d` ("[BPF] simplify zero extension with MOV_32_64") tried to use MOV_32_64 instructions instead of lshift/rshift instructions for zero extension. This has the benefit to remove the number of instructions and may help verifier too. But the same commit also removed the old MOV_32_64 pruning as it deems unsafe as MOV_32_64 does have the side effect, zeroing out the top 32bit in the register. This caused the following failure in kernel selftest test_cls_redirect.o. In linux kernel, we have struct __sk_buff { __u32 data; __u32 data_end; }; The compiler will generate 32bit load for __sk_buff->data and __sk_buff->data_end. But kernel verifier will actually loads an address (64bit address on 64bit kernel) to the result register. In this particular example, the explicit zext was not optimized away and destroyed top 32bit address and the verifier rejected the program : w2 = (u32 )(r1 + 76) ... r2 = w2 /* MOV_32_64: this will clear top 32bit */ Currently, if the load and the zext are next to each other, the instruction pattern match can actually capture this to avoid MOV_32_64, e.g., in BPFInstrInfo.td, we have def : Pat<(i64 (zextloadi32 ADDRri:$src)), (SUBREG_TO_REG (i64 0), (LDW32 ADDRri:$src), sub_32)>; However, if they are not next to each other, LDW32 and MOV_32_64 are generated, which may cause the above mentioned problem. BPF Backend already tried to optimize away pattern mov_32_64 + lshift + rshift Commit `13f6c81c5d` may generate mov_32_64 not followed by shifts. This patch added optimization for only mov_32_64 too. Differential Revision: https://reviews.llvm.org/D81048	2020-06-03 08:14:54 -07:00
John Fastabend	13f6c81c5d	[BPF] simplify zero extension with MOV_32_64 The current pattern matching for zext results in the following code snippet being produced, w1 = w0 r1 <<= 32 r1 >>= 32 Because BPF implementations require zero extension on 32bit loads this both adds a few extra unneeded instructions but also makes it a bit harder for the verifier to track the r1 register bounds. For example in this verifier trace we see at the end of the snippet R2 offset is unknown. However, if we track this correctly we see w1 should have the same bounds as r8. R8 smax is less than U32 max value so a zero extend load should keep the same value. Adding a max value of 800 (R8=inv(id=0,smax_value=800)) to an off=0, as seen in R7 should create a max offset of 800. However at the end of the snippet we note the R2 max offset is 0xffffFFFF. R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9=inv800 R10=fp0 fp-8=mmmm???? 58: (1c) w9 -= w8 59: (bc) w1 = w8 60: (67) r1 <<= 32 61: (77) r1 >>= 32 62: (bf) r2 = r7 63: (0f) r2 += r1 64: (bf) r1 = r6 65: (bc) w3 = w9 66: (b7) r4 = 0 67: (85) call bpf_get_stack#67 R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R3_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R4_w=inv0 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R10=fp0 fp-8=mmmm???? After this patch R1 bounds are not smashed by the <<=32 >>=32 shift and we get correct bounds on R2 umax_value=800. Further it reduces 3 insns to 1. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Differential Revision: https://reviews.llvm.org/D73985	2020-05-27 11:26:39 -07:00
Yonghong Song	eec758825d	[BPF] fix an asan issue when disassemble an illegal instruction Commit `8e8f1bd75a` ("[BPF] Return fail if disassembled insn registers out of range") tried to fix a segfault when an illegal instruction is decoded. A test case is added to emulate such an illegal instruction. The llvm buildbot reported an asan issue with this test case. ERROR: AddressSanitizer: global-buffer-overflow on address ... decodeMemoryOpValue(llvm::MCInst&, unsigned int, ...) llvm::MCDisassembler::DecodeStatus llvm::decodeToMCInst<unsigned long>(...) llvm::MCDisassembler::DecodeStatus llvm::decodeInstruction<unsigned long>(...) in (anonymous namespace)::BPFDisassembler::getInstruction(...) ... Basically, the fix in Commit `8e8f1bd75a` is too later to prevent the asan. The fix in this patch moved the register number check earlier during decodeInstruction(). It will return fail for decodeInstruction() if the register number is out of range. Note that DecodeGPRRegisterClass() and DecodeGPR32RegisterClass() already have register number checking, so here we only check decodeMemoryOpValue().	2020-05-18 22:33:34 -07:00
Yonghong Song	8e8f1bd75a	[BPF] Return fail if disassembled insn registers out of range Daniel reported a llvm-objdump segfault like below: $ llvm-objdump -D bpf_xdp.o ... 0000000000000000 <.strtab>: 0: 00 63 69 6c 69 75 6d 5f <unknown> 1: 6c 62 36 5f 61 66 66 69 w2 <<= w6 ... (llvm-objdump: lib/Target/BPF/BPFGenAsmWriter.inc:1087: static const char* llvm::BPFInstPrinter::getRegisterName(unsigned int): Assertion `RegNo && RegNo < 25 && "Invalid register number!"' failed. Stack dump: 0. Program arguments: llvm-objdump -D bpf_xdp.o ... abort ... llvm::BPFInstPrinter::getRegisterName(unsigned int) llvm::BPFInstPrinter::printMemOperand(llvm::MCInst const, int, llvm::raw_ostream&, char const) llvm::BPFInstPrinter::printInstruction(llvm::MCInst const, unsigned long, llvm::raw_ostream&) llvm::BPFInstPrinter::printInst(llvm::MCInst const, unsigned long, llvm::StringRef, llvm::MCSubtargetInfo const&, llvm::raw_ostream&) ... Basically, since -D enables disassembly for all sections, .strtab is also disassembled, but some strings are decoded as legal instructions but with illegal register numbers. When llvm-objdump tries to print register name for these illegal register numbers, assertion and segfault happens. The patch fixed the issue by returning fail for a disassembled insn if that insn contains a reg operand with illegal reg number. The insn will be printed as "<unknown>" instead of causing an assertion.	2020-05-18 18:53:23 -07:00
Yonghong Song	ddff9799d2	[BPF] Prevent disassembly segfault for NOP insn For a simple program like below: -bash-4.4$ cat t.c int test() { asm volatile("r0 = r0" ::); return 0; } compiled with clang -target bpf -O2 -c t.c the following llvm-objdump command will segfault. llvm-objdump -d t.o 0: bf 00 00 00 00 00 00 00 nop llvm-objdump: ../include/llvm/ADT/SmallVector.h:180 ... Assertion `idx < size()' failed ... abort ... llvm::BPFInstPrinter::printOperand llvm::BPFInstPrinter::printInstruction ... The reason is both NOP and MOV_rr (r0 = r0) having the same encoding. The disassembly getInstruction() decodes to be a NOP instruciton but during printInstruction() the same encoding is interpreted as a MOV_rr instruction. Such a mismatcch caused the segfault. The fix is to make NOP instruction as CodeGen only so disassembler will skip NOP insn for disassembling. Note that instruction "r0 = r0" should not appear in non inline asm codes since BPF Machine Instruction Peephole optimization will remove it. Differential Revision: https://reviews.llvm.org/D80156	2020-05-18 17:40:18 -07:00
Yonghong Song	6b01b46538	[BPF] preserve debuginfo types for builtin __builtin__btf_type_id() The builtin function u32 btf_type_id = __builtin_btf_type_id(param, 0) can help preserve type info for the following use case: extern void foo(..., void *data, int size); int test(...) { struct t { int a; int b; int c; } d; d.a = ...; d.b = ...; d.c = ...; foo(..., &d, sizeof(d)); } The function "foo" in the above only see raw data and does not know what type of the data is. In certain cases, e.g., logging, the additional type information will help pretty print. This patch handles the builtin in BPF backend. It includes an IR pass to translate the IR intrinsic to a load of a global variable which carries the metadata, and an MI pass to remove the intermediate load of the global variable. Finally, in AsmPrinter pass, proper instruction are generated. In the above example, the second argument for __builtin_btf_type_id() is 0, which means a relocation for local adjustment, i.e., w.r.t. bpf program BTF change, will be generated. The value 1 for the second argument means a relocation for remote adjustment, e.g., against vmlinux. Differential Revision: https://reviews.llvm.org/D74572	2020-05-15 08:00:44 -07:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, to removes use of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
Simon Pilgrim	fa6b68a404	BPFMCTargetDesc.h - remove unused raw_ostream forward declaration. NFC.	2020-04-22 18:26:50 +01:00
Simon Pilgrim	54b3f91d20	[BPF] Remove unused forward declarations. NFC.	2020-04-22 15:07:18 +01:00
Shengchen Kan	8bb059ab63	[MC][Bugfix] Remove redundant parameter for relaxInstruction Summary: Before this patch, `relaxInstruction` takes three arguments, the first argument refers to the instruction before relaxation and the third argument is the output instruction after relaxation. There are two quite strange things: 1) The first argument's type is `const MCInst &`, the third argument's type is `MCInst &`, but they may be aliased to the same variable 2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third argument is a fresh uninitialized `MCInst` even if `relaxInstruction` may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a loop. In this patch, we drop the thrid argument, and let `relaxInstruction` directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon. Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain Reviewed By: Razer6, MaskRay, bcain Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78364	2020-04-21 11:06:55 +08:00
Yonghong Song	3cb7e7bf95	BPF: fix a CORE optimization bug For the test case in this patch like below struct t { int a; } __attribute__((preserve_access_index)); int foo(void ); int test(struct t arg) { long param[1]; param[0] = (long)&arg->a; return foo(param); } The IR right before BPF SimplifyPatchable phase: %1:gpr = LD_imm64 @"llvm.t:0:0$0:0" %2:gpr = LDD killed %1:gpr, 0 %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %2:gpr STD killed %3:gpr, %stack.0.param, 0 After SimplifyPatchable phase, the incorrect IR is generated: %1:gpr = LD_imm64 @"llvm.t:0:0$0:0" %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr CORE_MEM killed %3:gpr, 306, %0:gpr, @"llvm.t:0:0$0:0" Note that CORE_MEM pseudo op is introduced to encode memory operations related to CORE. In the above, we intend to check whether we have a store like (%3:gpr + 0) = ... and if this is the case, we could replace it with (%0:gpr + @"llvm.t:0:0$0:0"_ = ... Unfortunately, in the above, IR for the store is *(%stack.0.param + 0) = %3:gpr and transformation should not happen. Note that we won't have problem if the actual CORE dereference (arg->a) happens. This patch fixed the problem by skip CORE optimization if the use of ADD_rr result is not the base address of the store operation. Differential Revision: https://reviews.llvm.org/D78466	2020-04-20 19:54:51 -07:00
Simon Pilgrim	cbd790a443	DebugHandlerBase.h - reduce MachineInstr.h include to DebugLoc.h include. We were only including MachineInstr.h for DebugLoc.h. This exposes an implicit include dependency in BTFDebug.h where I've had to add the MachineInstr.h include.	2020-04-19 11:14:01 +01:00
LemonBoy	aad3d578da	[DebugInfo] Change DIEnumerator payload type from int64_t to APInt This allows the representation of arbitrarily large enumeration values. See https://lists.llvm.org/pipermail/llvm-dev/2017-December/119475.html for context. Reviewed By: andrewrk, aprantl, MaskRay Differential Revision: https://reviews.llvm.org/D62475	2020-04-18 12:49:31 -07:00
Fangrui Song	7d1ff446b6	[MC] Rename MCSection::getSectionName() to getName(). NFC A pending change will merge MCSection::getName() to MCSection::getName().	2020-04-15 16:48:14 -07:00
Fangrui Song	d2e5157c1f	[MC] Add UseIntegratedAssembler = false. NFC	2020-04-11 10:13:49 -07:00
Simon Pilgrim	a88cc20456	ProfileSummaryInfo.h - remove unnecessary includes. NFC Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration. This exposed a couple of source files that were implicitly replying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.	2020-04-10 16:25:48 +01:00
Eli Friedman	565b56a72c	[NFC] Clean up uses of LoadInst constructor.	2020-04-07 16:28:53 -07:00
Yonghong Song	ced0d1f42b	[BPF] support 128bit int explicitly in layout spec Currently, bpf does not specify 128bit alignment in its layout spec. So for a structure like struct ipv6_key_t { unsigned pid; unsigned __int128 saddr; unsigned short lport; }; clang will generate IR type %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] } Additional padding is to ensure later IR->MIR can generate correct stack layout with target layout spec. But it is common practice for a tracing program to be first compiled with target flag (e.g., x86_64 or aarch64) through clang to generate IR and then go through llc to generate bpf byte code. Tracing program often refers to kernel internal data structures which needs to be compiled with non-bpf target. But such a compilation model may cause a problem on aarch64. The bcc issue https://github.com/iovisor/bcc/issues/2827 reported such a problem. For the above structure, since aarch64 has "i128:128" in its layout string, the generated IR will have %struct.ipv6_key_t = type { i32, i128, i16 } Since bpf does not have "i128:128" in its spec string, the selectionDAG assumes alignment 8 for i128 and computes the stack storage size for the above is 32 bytes, which leads incorrect code later. The x86_64 does not have this issue as it does not have "i128:128" in its layout spec as it does permits i128 to be alignmented at 8 bytes at stack. Its IR type looks like %struct.ipv6_key_t = type { i32, [12 x i8], i128, i16, [14 x i8] } The fix here is add i128 support in layout spec, the same as aarch64. The only downside is we may have less optimal stack allocation in certain cases since we require 16byte alignment for i128 instead of 8. But this is probably fine as i128 is not used widely and in most cases users should already have proper alignment. Differential Revision: https://reviews.llvm.org/D76587	2020-03-28 11:46:29 -07:00
Fangrui Song	692e0c9648	[MC] Add MCStreamer::emitInt{8,16,32,64} Similar to AsmPrinter::emitInt{8,16,32,64}.	2020-02-29 09:40:21 -08:00
Fangrui Song	774971030d	[MCStreamer] De-capitalize EmitValue EmitIntValue{,InHex}	2020-02-14 23:08:40 -08:00
Fangrui Song	6d2d589b06	[MC] De-capitalize another set of MCStreamer::Emit* functions Emit{ValueTo,Code}Alignment Emit{DTP,TP,GP}* EmitSymbolValue etc	2020-02-14 19:26:52 -08:00
Fangrui Song	a55daa1461	[MC] De-capitalize some MCStreamer::Emit* functions	2020-02-14 19:11:53 -08:00
Fangrui Song	bcd24b2d43	[AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI*	2020-02-13 22:08:55 -08:00
Fangrui Song	0bc77a0f0d	[AsmPrinter] De-capitalize some AsmPrinter::Emit* functions Similar to rL328848.	2020-02-13 13:38:33 -08:00
Yonghong Song	61bd33e37b	[BPF] explicit warning of not supporting dynamic stack allocation Currently, BPF does not support dynamic static allocation. For a program like below: extern void bar(int *); void foo(int n) { int a[n]; bar(a); } The current error message looks like: unimplemented operand UNREACHABLE executed at /.../llvm/lib/Target/BPF/BPFISelLowering.cpp:199! Let us make error message explicit so it will be clear to the user what is the problem. With this patch, the error message looks like: fatal error: error in backend: Unsupported dynamic stack allocation ... Differential Revision: https://reviews.llvm.org/D74521	2020-02-12 20:43:06 -08:00
Yonghong Song	29bc5dd194	[BPF] implement isTruncateFree and isZExtFree in BPFTargetLowering Currently, isTruncateFree() and isZExtFree() callbacks return false as they are not implemented in BPF backend. This may cause suboptimal code generation. For example, if the load in the context of zero extension has more than one use, the pattern zextload{i8,i16,i32} will not be generated. Rather, the load will be matched first and then the result is zero extended. For example, in the test together with this commit, we have I1: %0 = load i32, i32* %data_end1, align 4, !tbaa !2 I2: %conv = zext i32 %0 to i64 ... I3: %2 = load i32, i32* %data, align 4, !tbaa !7 I4: %conv2 = zext i32 %2 to i64 ... I5: %4 = trunc i64 %sub.ptr.lhs.cast to i32 I6: %conv13 = sub i32 %4, %2 ... The I1 and I2 will match to one zextloadi32 DAG node, where SUBREG_TO_REG is used to convert a 32bit register to 64bit one. During code generation, SUBREG_TO_REG is a noop. The %2 in I3 is used in both I4 and I6. If isTruncateFree() is false, the current implementation will generate a SLL_ri and SRL_ri for the zext part during lowering. This patch implement isTruncateFree() in the BPF backend, so for the above example, I3 and I4 will generate a zextloadi32 DAG node with SUBREG_TO_REG is generated during lowering to Machine IR. isZExtFree() is also implemented as it should help code gen as well. This patch also enables the change in https://reviews.llvm.org/D73985 since it won't kick in generates MOV_32_64 machine instruction. Differential Revision: https://reviews.llvm.org/D74101	2020-02-11 09:59:19 -08:00
Eric Astor	8d5bf0422b	[ms] [llvm-ml] Add support for attempted register parsing Summary: Add a new method (tryParseRegister) that attempts to parse a register specification. MASM allows the use of IFDEF <register>, as well as IFDEF <symbol>. To accommodate this, we make it possible to check whether a register specification can be parsed at the current location, without failing the entire parse if it can't. Reviewers: thakis Reviewed By: thakis Tags: #llvm Differential Revision: https://reviews.llvm.org/D73486	2020-02-11 10:45:33 -05:00
Yonghong Song	d96c1bbaa0	[BPF] disable ReduceLoadWidth during SelectionDag phase The compiler may transform the following code ctx = ctx + reloc_offset ... ((u32 )ctx) & 0x8000 ... to ctx = ctx + reloc_offset ... ((u8 )(ctx + 1)) & 0x80 ... where reloc_offset will be replaced with a constant during AsmPrinter phase. The above transformed code will be rejected the kernel verifier as it does not allow (type )((ctx + non_zero_offset1) + non_zero_offset2) style access pattern. It is hard at SelectionDag phase to identify whether a load is related to context or not. Sometime, interprocedure analysis may be needed. So let us simply prevent such optimization from happening. Differential Revision: https://reviews.llvm.org/D73997	2020-02-04 18:37:43 -08:00
Yonghong Song	6d07802d63	[BPF] handle typedef of struct/union for CO-RE relocations Linux commit `1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)` uses "#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)" to apply CO-RE relocations to all records including the following pattern: #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record) typedef struct { int a; } __t; #pragma clang attribute pop int test(__t *arg) { return arg->a; } The current approach to use struct/union type in the relocation record will result in an anonymous struct, which make later type matching difficult in bpf loader. In fact, current BPF backend will fail the above program with assertion: clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ... Assertion `TypeName.size()' failed. clang will change to use the type of the base of the member access which will preserve the typedef modifier for the preserve_{struct,union}_access_index intrinsics in the above example. Here we adjust BPF backend to accept that the debuginfo type metadata may be 'typedef' and handle them properly. Differential Revision: https://reviews.llvm.org/D73902	2020-02-04 08:53:03 -08:00
Guillaume Chatelet	b8144c0536	[NFC] Encapsulate MemOp logic Summary: This patch simply introduces functions instead of directly accessing the fields. This helps introducing additional check logic. A second patch will add simplifying functions. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73945	2020-02-04 10:36:26 +01:00
Simon Moll	5c8ba508b2	[NFC] unsigned->Register in storeRegTo/loadRegFromStack Summary: This patch makes progress on the 'unsigned -> Register' rewrite for `TargetInstrInfo::loadRegFromStack` and `TII::storeRegToStack`. Reviewers: arsenm, craig.topper, uweigand, jpienaar, atanasyan, venkatra, robertlytton, dylanmckay, t.p.northover, kparzysz, tstellar, k-ishizaka Reviewed By: arsenm Subscribers: wuzish, merge_guards_bot, jyknight, sdardis, nemanjai, jvesely, wdng, nhaehnle, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73870	2020-02-03 14:22:16 +01:00
Guillaume Chatelet	3c89b75f23	[NFC] Introduce a type to model memory operation Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785	2020-01-31 17:29:01 +01:00
Yonghong Song	795bbb3662	[BPF] fix a bug in BPFMISimplifyPatchable pass with -O0 The recommended optimization level for BPF programs is O2 since (1). BPF is running inside the kernel and linux kernel won't work at -O0 level, and (2). Verifier is not able to handle O0 code properly, e.g., potential large stack size and a lot of spills. But we should keep -O0 at least compiling. This patch fixed a bug in BPFMISimplifyPatchable phase where with -O0, a segmentation fault will happen for a simple program like: int test(int a, int b) { return a + b; } A test case is added to capture such a case. Differential Revision: https://reviews.llvm.org/D73681	2020-01-30 08:28:39 -08:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Tom Stellard	0dbcb36394	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 36221 nm after/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439	2020-01-14 19:46:52 -08:00
Fangrui Song	6fdd6a7b3f	[Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction() The argument is llvm::null() everywhere except llvm::errs() in llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds. If we ever have the needs to add verbose log to disassemblers, we can record log with a member function, instead of passing it around as an argument.	2020-01-11 13:34:52 -08:00
Yonghong Song	fbb64aa698	[BPF] extend BTF_KIND_FUNC to cover global, static and extern funcs Previously extern function is added as BTF_KIND_VAR. This does not work well with existing BTF infrastructure as function expected to use BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO. This patch added extern function to BTF_KIND_FUNC. The two bits 0:1 of btf_type.info are used to indicate what kind of function it is: 0: static 1: global 2: extern Differential Revision: https://reviews.llvm.org/D71638	2020-01-10 09:06:31 -08:00
Fangrui Song	3d87d0b925	[MC] Add parameter `Address` to MCInstrPrinter::printInstruction Follow-up of D72172. Reviewed By: jhenderson, rnk Differential Revision: https://reviews.llvm.org/D72180	2020-01-06 20:44:14 -08:00
Fangrui Song	aa708763d3	[MC] Add parameter `Address` to MCInstPrinter::printInst printInst prints a branch/call instruction as `b offset` (there are many variants on various targets) instead of `b address`. It is a convention to use address instead of offset in most external symbolizers/disassemblers. This difference makes `llvm-objdump -d` output unsatisfactory. Add `uint64_t Address` to printInst(), so that it can pass the argument to printInstruction(). `raw_ostream &OS` is moved to the last to be consistent with other print* methods. The next step is to pass `Address` to printInstruction() (generated by tablegen from the instruction set description). We can gradually migrate targets to print addresses instead of offsets. In any case, downstream projects which don't know `Address` can pass 0 as the argument. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72172	2020-01-06 20:42:22 -08:00
Yonghong Song	ffd57408ef	[BPF] Enable relocation location for load/store/shifts Previous btf field relocation is always at assignment like r1 = 4 which is converted from an ld_imm64 instruction. This patch did an optimization such that relocation instruction might be load/store/shift. Specically, the following insns may also have relocation, except BPF_MOV: LDB, LDH, LDW, LDD, STB, STH, STW, STD, LDB32, LDH32, LDW32, STB32, STH32, STW32, SLL, SRL, SRA To accomplish this, a few BPF target specific codegen only instructions are invented. They are generated at backend BPF SimplifyPatchable phase, which is at early llc phase when SSA form is available. The new codegen only instructions will be converted to real proper instructions at the codegen and BTF emission stage. Note that, as revealed by a few tests, this optimization might be actual generating more relocations: Scenario 1: if (...) { ... __builtin_preserve_field_info(arg->b2, 0) ... } else { ... __builtin_preserve_field_info(arg->b2, 0) ... } Compiler could do CSE to only have one relocation. But if both of the above is translated into codegen internal instructions, the compiler will not be able to do that. Scenario 2: offset = ... __builtin_preserve_field_info(arg->b2, 0) ... ... ... offset ... ... offset ... ... offset ... For whatever reason, the compiler might be temporarily do copy propagation of the righthand of "offset" assignment like ... __builtin_preserve_field_info(arg->b2, 0) ... ... __builtin_preserve_field_info(arg->b2, 0) ... and CSE will be able to deduplicate later. But if these intrinsics are converted to BPF pseudo instructions, they will not be able to get deduplicated. I do not expect we have big instruction count difference. It may actually reduce instruction count since now relocation is in deeper insn dependency chain. For example, for test offset-reloc-fieldinfo-2.ll, this patch generates 7 instead of 6 relocations for non-alu32 mode, but it actually reduced instruction count from 29 to 26. Differential Revision: https://reviews.llvm.org/D71790	2019-12-26 09:07:39 -08:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Yonghong Song	7d0e8930ed	[BPF] put not-section-attribute externs into BTF ".extern" data section Currently for extern variables with section attribute, those BTF_KIND_VARs will not be placed in any DataSec. This is inconvenient as any other generated BTF_KIND_VAR belongs to one DataSec. This patch put these extern variables into ".extern" section so bpf loader can have a consistent processing mechanism for all data sections and variables.	2019-12-10 11:45:17 -08:00
Yonghong Song	4448125007	[BPF] Support to emit debugInfo for extern variables extern variable usage in BPF is different from traditional pure user space application. Recent discussion in linux bpf mailing list has two use cases where debug info types are required to use extern variables: - extern types are required to have a suitable interface in libbpf (bpf loader) to provide kernel config parameters to bpf programs. https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t - extern types are required so kernel bpf verifier can verify program which uses external functions more precisely. This will make later link with actual external function no need to reverify. https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed This patch added bpf support to consume such info into BTF, which can then be used by bpf loader. Function processFuncPrototypes() only adds extern function definitions into BTF. The functions with actual definition have been added to BTF in some other places. Differential Revision: https://reviews.llvm.org/D70697	2019-12-09 21:53:29 -08:00
Yonghong Song	5ea611daf9	[BPF] Support weak global variables for BTF Generate types for global variables with "weak" attribute. Keep allocation scope the same for both weak and non-weak globals as ELF symbol table can determine whether a global symbol is weak or not. Differential Revision: https://reviews.llvm.org/D71162	2019-12-07 08:58:19 -08:00
Yonghong Song	6db023b99b	[BPF] add "llvm." prefix to BPF internally created globals Currently, BPF backend creates some global variables with name like <type_name>:<reloc_type>:<patch_imm>$<access_str> to carry certain information to BPF backend. With direct clang compilation, the following code in llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp is triggered and the above globals are emitted to the ELF file. (clang enabled this as opt flag -faddrsig is on by default.) if (TM.Options.EmitAddrsig) { // Emit address-significance attributes for all globals. OutStreamer->EmitAddrsig(); for (const GlobalValue &GV : M.global_values()) if (!GV.use_empty() && !GV.isThreadLocal() && !GV.hasDLLImportStorageClass() && !GV.getName().startswith("llvm.") && !GV.hasAtLeastLocalUnnamedAddr()) OutStreamer->EmitAddrsigSym(getSymbol(&GV)); } ... 10162: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND tcp_sock:0:2048$0:117 10163: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND tcp_sock:0:2112$0:126:0 10164: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND tcp_sock:1:8$0:31:6 ... While in llc, those globals are not emited since EmitAddrsig default option is false for llc. The llc flag "-addrsig" can be used to enable the above code. This patch added "llvm." prefix to these internal globals so that they can be ignored in the above codes and possible other places. Differential Revision: https://reviews.llvm.org/D70703	2019-11-25 21:34:46 -08:00
Yonghong Song	9e6aa81588	[BPF] Fix a recursion bug in BPF Peephole ZEXT optimization Commit `a0841dfe85` ("[BPF] Fix a bug in peephole optimization") fixed a bug in peephole optimization. Recursion is introduced to handle COPY and PHI instructions. Unfortunately, multiple PHI instructions may form a cycle and this will cause infinite recursion, eventual segfault. For Commit `a0841dfe85`, I indeed tried a few loops to ensure that I won't see the recursion, but I did not try with complex control flows, which, as demonstrated with the test case in this patch, may introduce PHI cycles. This patch fixed the issue by introducing a set to remember visited PHI instructions. This way, cycles can be properly detected and handled. Differential Revision: https://reviews.llvm.org/D70586	2019-11-22 08:05:43 -08:00
Tom Stellard	ab411801b8	[cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraires: 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON. 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined lib lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future. I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179	2019-11-21 10:48:08 -08:00
Yonghong Song	a0841dfe85	[BPF] Fix a bug in peephole optimization One of current peephole optimiations is to remove SLL/SRL if the sub register has been zero extended. This phase has two bugs and one limitations. First, for the physical subregister used in pseudo insn COPY like below, it permits incorrect optimization. %0:gpr32 = COPY $w0 ... %4:gpr = MOV_32_64 %0:gpr32 %5:gpr = SLL_ri %4:gpr(tied-def 0), 32 %6:gpr = SRA_ri %5:gpr(tied-def 0), 32 The $w0 could be from the return value of a previous function call and its upper 32-bit value might contain some non-zero values. The same applies to function arguments. Second, the current code may permits removing SLL/SRA like below: %0:gpr32 = COPY $w0 %1:gpr32 = COPY %0:gpr32 ... %4:gpr = MOV_32_64 %1:gpr32 %5:gpr = SLL_ri %4:gpr(tied-def 0), 32 %6:gpr = SRA_ri %5:gpr(tied-def 0), 32 The reason is that it did not follow def-use chain to skip all intermediate 32bit-to-32bit COPY instructions. The current implementation is also very conservative for PHI instructions. If any PHI insn component is another PHI or COPY insn, it will just permit SLL/SRA. This patch fixed the issue as follows: - During def/use chain traversal, if any physical register is read, SLL/SRA will be preserved as these physical registers are mostly from function return values or current function arguments. - Recursively visit all COPY and PHI instructions.	2019-11-20 15:19:59 -08:00
Simon Pilgrim	f784ad8ff3	Fix uninitialized variable warning. NFCI.	2019-11-14 14:21:17 +00:00
Yonghong Song	166cdc0281	[BPF] generate BTF_KIND_VARs for all non-static globals Enable to generate BTF_KIND_VARs for non-static default-section globals which is not allowed previously. Modified the existing test case to accommodate the new change. Also removed unused linkage enum members VAR_GLOBAL_TENTATIVE and VAR_GLOBAL_EXTERNAL. Differential Revision: https://reviews.llvm.org/D70145	2019-11-12 14:34:08 -08:00
Matt Arsenault	e6c9a9af39	Use MCRegister in copyPhysReg	2019-11-11 14:42:33 +05:30
Yonghong Song	6b8baf3062	[BPF] turn on -mattr=+alu32 for cpu version v3 and later -mattr=+alu32 has shown good performance vs. without this attribute. Based on discussion at https://lore.kernel.org/bpf/1ec37838-966f-ec0b-5223-ca9b6eb0860d@fb.com/T/#t cpu version v3 should support -mattr=+alu32. This patch enabled alu32 if cpu version is v3, either specified by user or probed by the llvm. Differential Revision: https://reviews.llvm.org/D69957	2019-11-07 22:08:46 -08:00
Yonghong Song	9f34447f3f	[BPF] fix a use after free bug Commit `fff2721286` ("[BPF] Fix CO-RE bugs with bitfields") fixed CO-RE handling bitfield issues. But the implementation introduced a use after free bug. The "Base" of the intrinsic might be freed so later on accessing the Type of "Base" might access the freed memory. The failed test case, CodeGen/BPF/CORE/offset-reloc-middle-chain.ll is exactly used to test such a case. Similarly to previous attempt to remember Metadata etc, remember "Base" pointee Alignment in advance to avoid such use after free bug.	2019-11-04 22:20:23 -08:00
Yonghong Song	fff2721286	[BPF] Fix CO-RE bugs with bitfields bitfield handling is not robust with current implementation. I have seen two issues as described below. Issue 1: struct s { long long f1; char f2; char b1:1; } *p; The current approach will generate an access bit size 56 (from b1 to the end of structure) which will be rejected as it is not power of 2. Issue 2: struct s { char f1; char b1:3; char b2:5; char b3:6: char b4:2; char f2; }; The LLVM will group 4 bitfields together with 2 bytes. But loading 2 bytes is not correct as it violates alignment requirement. Note that sometimes, LLVM breaks a large bitfield groups into multiple groups, but not in this case. To resolve the above two issues, this patch takes a different approach. The alignment for the structure is used to construct the offset of the bitfield access. The bitfield incurred memory access is an aligned memory access with alignment/size equal to the alignment of the structure. This also simplified the code. This may not be the optimal memory access in terms of memory access width. But this should be okay since extracting the bitfield value will have the same amount of work regardless of what kind of memory access width. Differential Revision: https://reviews.llvm.org/D69837	2019-11-04 20:08:05 -08:00
Yonghong Song	c430533771	[BPF] fix a bug in __builtin_preserve_field_info() with FIELD_BYTE_SIZE During deriving proper bitfield access FIELD_BYTE_SIZE, function Member->getStorageOffsetInBits() is used to get llvm IR type storage offset in bits so that the byte size can permit aligned loads/stores with previously derived FIELD_BYTE_OFFSET. The function should only be used with bitfield members and it will assert if ASSERT is turned on during cmake build. Constant getStorageOffsetInBits() const { assert(getTag() == dwarf::DW_TAG_member && isBitField()); if (auto C = cast_or_null<ConstantAsMetadata>(getExtraData())) return C->getValue(); return nullptr; } This patch fixed the issue by using Member->isBitField() directly and a test case is added to cover this missing case. This issue is discovered when running Andrii's linux kernel CO-RE tests. Differential Revision: https://reviews.llvm.org/D69761	2019-11-03 08:18:28 -08:00
Yonghong Song	a27c998c00	[BPF] fix a CO-RE issue with -mattr=+alu32 Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue that with -mattr=+alu32 CO-RE has a segfault in BPF MISimplifyPatchable pass. The pattern will be transformed by MISimplifyPatchable pass looks like below: r5 = ld_imm64 @"b:0:0$0:0" r2 = ldw r5, 0 ... r2 ... // use r2 The pass will remove the intermediate 'ldw' instruction and replacing all r2 with r5 likes below: r5 = ld_imm64 @"b:0:0$0:0" ... r5 ... // use r5 Later, the ld_imm64 insn will be replaced with r5 = <patched immediate> for field relocation purpose. With -mattr=+alu32, the input code may become r5 = ld_imm64 @"b:0:0$0:0" w2 = ldw32 r5, 0 ... w2 ... // use w2 Replacing "w2" with "r5" is incorrect and will trigger compiler internal errors. To fix the problem, if the register class of ldw* dest register is sub_32, we just replace the original ldw* register with: w2 = w5 Directly replacing all uses of w2 with in-place constructed w5 for the use operand seems not working in all cases. The latest kernel will have -mattr=+alu32 on by default, so added this flag to all CORE tests. Tested with latest kernel bpf-next branch as well with this patch. Differential Revision: https://reviews.llvm.org/D69438	2019-10-25 14:27:25 -07:00
Mirko Brkusanin	4b63ca1379	[Mips] Use appropriate private label prefix based on Mips ABI MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64 regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo we can find out Mips ABI and pick appropriate prefix. Tags: #llvm, #clang, #lldb Differential Revision: https://reviews.llvm.org/D66795	2019-10-23 12:24:35 +02:00
Yonghong Song	ee881197b0	[BPF] fix indirect call assembly code Currently, for indirect call, the assembly code printed out as callx <imm> This is not right, it should be callx <reg> Fixed the issue with proper format. Differential Revision: https://reviews.llvm.org/D69229 llvm-svn: 375386	2019-10-21 03:22:03 +00:00
Reid Kleckner	904cd3e06b	Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each Now X86ISelLowering doesn't depend on many IR analyses. llvm-svn: 375320	2019-10-19 01:31:09 +00:00
Guillaume Chatelet	882c43d703	[Alignment][NFC] Use Align for TargetFrameLowering/Subtarget Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68993 llvm-svn: 375084	2019-10-17 07:49:39 +00:00
Jiong Wang	ec51851026	bpf: fix wrong truncation elimination when there is back-edge/loop Currently, BPF backend is doing truncation elimination. If one truncation is performed on a value defined by narrow loads, then it could be redundant given BPF loads zero extend the destination register implicitly. When the definition of the truncated value is a merging value (PHI node) that could come from different code paths, then checks need to be done on all possible code paths. Above described optimization was introduced as r306685, however it doesn't work when there is back-edge, for example when loop is used inside BPF code. For example for the following code, a zero-extended value should be stored into b[i], but the "and reg, 0xffff" is wrongly eliminated which then generates corrupted data. void cal1(unsigned short a, unsigned long b, unsigned int k) { unsigned short e; e = *a; for (unsigned int i = 0; i < k; i++) { b[i] = e; e = ~e; } } The reason is r306685 was trying to do the PHI node checks inside isel DAG2DAG phase, and the checks are done on MachineInstr. This is actually wrong, because MachineInstr is being built during isel phase and the associated information is not completed yet. A quick search shows none target other than BPF is access MachineInstr info during isel phase. For an PHI node, when you reached it during isel phase, it may have all predecessors linked, but not successors. It seems successors are linked to PHI node only when doing SelectionDAGISel::FinishBasicBlock and this happens later than PreprocessISelDAG hook. Previously, BPF program doesn't allow loop, there is probably the reason why this bug was not exposed. This patch therefore fixes the bug by the following approach: - The existing truncation elimination code and the associated "load_to_vreg_" records are removed. - Instead, implement truncation elimination using MachineSSA pass, this is where all information are built, and keep the pass together with other similar peephole optimizations inside BPFMIPeephole.cpp. Redundant move elimination logic is updated accordingly. - Unit testcase included + no compilation errors for kernel BPF selftest. Patch Review === Patch was sent to and reviewed by BPF community at: https://lore.kernel.org/bpf Reported-by: David Beckett <david.beckett@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 375007	2019-10-16 15:27:59 +00:00
Kadir Cetinkaya	dd37a26f6d	Fix assertions disabled builds after rL374367 llvm-svn: 374372	2019-10-10 16:04:45 +00:00
Yonghong Song	d46a6a9e68	[BPF] Remove relocation for patchable externs Previously, patchable extern relocations are introduced to patch external variables used for multi versioning in compile once, run everywhere use case. The load instruction will be converted into a move with an patchable immediate which can be changed by bpf loader on the host. The kernel verifier has evolved and is able to load and propagate constant values, so compiler relocation becomes unnecessary. This patch removed codes related to this. Differential Revision: https://reviews.llvm.org/D68760 llvm-svn: 374367	2019-10-10 15:33:09 +00:00
Yonghong Song	05e46979d2	[BPF] do compile-once run-everywhere relocation for bitfields A bpf specific clang intrinsic is introduced: u32 __builtin_preserve_field_info(member_access, info_kind) Depending on info_kind, different information will be returned to the program. A relocation is also recorded for this builtin so that bpf loader can patch the instruction on the target host. This clang intrinsic is used to get certain information to facilitate struct/union member relocations. The offset relocation is extended by 4 bytes to include relocation kind. Currently supported relocation kinds are enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; for __builtin_preserve_field_info. The old access offset relocation is covered by FIELD_BYTE_OFFSET = 0. An example: struct s { int a; int b1:9; int b2:4; }; enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64, }; void bpf_probe_read(void , unsigned, const void ); int field_read(struct s arg) { unsigned long long ull = 0; unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET); unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE); #ifdef USE_PROBE_READ bpf_probe_read(&ull, size, (const void )arg + offset); unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ lshift = lshift + (size << 3) - 64; #endif #else switch(size) { case 1: ull = (unsigned char )((void )arg + offset); break; case 2: ull = (unsigned short )((void )arg + offset); break; case 4: ull = (unsigned int )((void )arg + offset); break; case 8: ull = (unsigned long long )((void )arg + offset); break; } unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64); #endif ull <<= lshift; if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS)) return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64); } There is a minor overhead for bpf_probe_read() on big endian. The code and relocation generated for field_read where bpf_probe_read() is used to access argument data on little endian mode: r3 = r1 r1 = 0 r1 = 4 <=== relocation (FIELD_BYTE_OFFSET) r3 += r1 r1 = r10 r1 += -8 r2 = 4 <=== relocation (FIELD_BYTE_SIZE) call bpf_probe_read r2 = 51 <=== relocation (FIELD_LSHIFT_U64) r1 = (u64 )(r10 - 8) r1 <<= r2 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) r0 = r1 r0 >>= r2 r3 = 1 <=== relocation (FIELD_SIGNEDNESS) if r3 == 0 goto LBB0_2 r1 s>>= r2 r0 = r1 LBB0_2: exit Compare to the above code between relocations FIELD_LSHIFT_U64 and FIELD_LSHIFT_U64, the code with big endian mode has four more instructions. r1 = 41 <=== relocation (FIELD_LSHIFT_U64) r6 += r1 r6 += -64 r6 <<= 32 r6 >>= 32 r1 = (u64 )(r10 - 8) r1 <<= r6 r2 = 60 <=== relocation (FIELD_RSHIFT_U64) The code and relocation generated when using direct load. r2 = 0 r3 = 4 r4 = 4 if r4 s> 3 goto LBB0_3 if r4 == 1 goto LBB0_5 if r4 == 2 goto LBB0_6 goto LBB0_9 LBB0_6: # %sw.bb1 r1 += r3 r2 = (u16 )(r1 + 0) goto LBB0_9 LBB0_3: # %entry if r4 == 4 goto LBB0_7 if r4 == 8 goto LBB0_8 goto LBB0_9 LBB0_8: # %sw.bb9 r1 += r3 r2 = (u64 )(r1 + 0) goto LBB0_9 LBB0_5: # %sw.bb r1 += r3 r2 = (u8 )(r1 + 0) goto LBB0_9 LBB0_7: # %sw.bb5 r1 += r3 r2 = (u32 )(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 r2 <<= r1 r1 = 60 r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit Considering verifier is able to do limited constant propogation following branches. The following is the code actually traversed. r2 = 0 r3 = 4 <=== relocation r4 = 4 <=== relocation if r4 s> 3 goto LBB0_3 LBB0_3: # %entry if r4 == 4 goto LBB0_7 LBB0_7: # %sw.bb5 r1 += r3 r2 = (u32 )(r1 + 0) LBB0_9: # %sw.epilog r1 = 51 <=== relocation r2 <<= r1 r1 = 60 <=== relocation r0 = r2 r0 >>= r1 r3 = 1 if r3 == 0 goto LBB0_11 r2 s>>= r1 r0 = r2 LBB0_11: # %sw.epilog exit For native load case, the load size is calculated to be the same as the size of load width LLVM otherwise used to load the value which is then used to extract the bitfield value. Differential Revision: https://reviews.llvm.org/D67980 llvm-svn: 374099	2019-10-08 18:23:17 +00:00
Jordan Rose	fdaa742174	Second attempt to add iterator_range::empty() Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935	2019-10-07 18:14:24 +00:00
Yonghong Song	02ac75092d	[BPF] Handle offset reloc endpoint ending in the middle of chain properly During studying support for bitfield, I found an issue for an example like the one in test offset-reloc-middle-chain.ll. struct t1 { int c; }; struct s1 { struct t1 b; }; struct r1 { struct s1 a; }; #define _(x) __builtin_preserve_access_index(x) void test1(void p1, void p2, void p3); void test(struct r1 arg) { struct s1 ps = _(&arg->a); struct t1 pt = _(&arg->a.b); int *pi = _(&arg->a.b.c); test1(ps, pt, pi); } The IR looks like: %0 = llvm.preserve.struct.access(base, ...) %1 = llvm.preserve.struct.access(%0, ...) %2 = llvm.preserve.struct.access(%1, ...) using %0, %1 and %2 In this case, we need to generate three relocatiions corresponding to chains: (%0), (%0, %1) and (%0, %1, %2). After collecting all the chains, the current implementation process each chain (in a map) with code generation sequentially. For example, after (%0) is processed, the code may look like: %0 = base + special_global_variable // llvm.preserve.struct.access(base, ...) is delisted // from the instruction stream. %1 = llvm.preserve.struct.access(%0, ...) %2 = llvm.preserve.struct.access(%1, ...) using %0, %1 and %2 When processing chain (%0, %1), the current implementation tries to visit intrinsic llvm.preserve.struct.access(base, ...) to get some of its properties and this caused segfault. This patch fixed the issue by remembering all necessary information (kind, metadata, access_index, base) during analysis phase, so in code generation phase there is no need to examine the intrinsic call instructions. This also simplifies the code. Differential Revision: https://reviews.llvm.org/D68389 llvm-svn: 373621	2019-10-03 16:30:29 +00:00
Guillaume Chatelet	18f805a7ea	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081	2019-09-27 12:54:21 +00:00
Simon Pilgrim	93c8951147	[BPF] Remove unused variables. NFCI. Fixes a dyn_cast<> null dereference warning. llvm-svn: 372958	2019-09-26 10:55:57 +00:00
Yonghong Song	1487bf6c82	[BPF] Generate array dimension size properly for zero-size elements Currently, if an array element type size is 0, the number of array elements will be set to 0, regardless of what user specified. This implementation is done in the beginning where BTF is mostly used to calculate the member offset. For example, struct s {}; struct s1 { int b; struct s a[2]; }; struct s1 s1; The BTF will have struct "s1" member "a" with element count 0. Now BTF types are used for compile-once and run-everywhere relocations and we need more precise type representation for type comparison. Andrii reported the issue as there are differences between original structure and BTF-generated structure. This patch made the change to correctly assign "2" as the number elements of member "a". Some dead codes related to ElemSize compuation are also removed. Differential Revision: https://reviews.llvm.org/D67979 llvm-svn: 372785	2019-09-24 22:38:43 +00:00
Yonghong Song	c68ee0ce70	[BPF] Permit all user instructed offset relocatiions Currently, not all user specified relocations (with clang intrinsic __builtin_preserve_access_index()) will turn into relocations. In the current implementation, a __builtin_preserve_access_index() chain is turned into relocation only if the result of the clang intrinsic is used in a function call or a nonzero offset computation of getelementptr. For all other cases, the relocatiion request is ignored and the __builtin_preserve_access_index() is turned into regular getelementptr instructions. The main reason is to mimic bpf_probe_read() requirement. But there are other use cases where relocatable offset is generated but not used for bpf_probe_read(). This patch relaxed previous constraints when to generate relocations. Now, all user __builtin_preserve_access_index() will have relocations generated. Differential Revision: https://reviews.llvm.org/D67688 llvm-svn: 372198	2019-09-18 03:49:07 +00:00
Guillaume Chatelet	ad1cea0dda	[Alignment][NFC] Use Align with TargetLowering::setPrefFunctionAlignment Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, javed.absar, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, ychen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67267 llvm-svn: 371212	2019-09-06 15:03:49 +00:00
Guillaume Chatelet	4fc3ad9e13	[Alignment][NFC] Use Align with TargetLowering::setMinFunctionAlignment Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67229 llvm-svn: 371200	2019-09-06 12:48:34 +00:00
Guillaume Chatelet	aff45e4b23	[LLVM][Alignment] Make functions using log of alignment explicit Summary: This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation, - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation, Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045	2019-09-05 10:00:22 +00:00
Sam Clegg	90b6bb75e8	[MC] Minor cleanup to MCFixup::Kind handling. NFC. Prefer `MCFixupKind` where possible and add getTargetKind() to convert to `unsigned` when needed rather than scattering cast operators around the place. Differential Revision: https://reviews.llvm.org/D59890 llvm-svn: 369720	2019-08-23 01:00:55 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Bill Wendling	ebc2cf9c27	Use "isa" since the variable isn't used. llvm-svn: 367985	2019-08-06 07:27:26 +00:00
Yonghong Song	37d24a696b	[BPF] Handling type conversions correctly for CO-RE With newly added debuginfo type metadata for preserve_array_access_index() intrinsic, this patch did the following two things: (1). checking validity before adding a new access index to the access chain. (2). calculating access byte offset in IR phase BPFAbstractMemberAccess instead of when BTF is emitted. For (1), the metadata provided by all preserve_*_access_index() intrinsics are used to check whether the to-be-added type is a proper struct/union member or array element. For (2), with all available metadata, calculating access byte offset becomes easier in BPFAbstractMemberAccess IR phase. This enables us to remove the unnecessary complexity in BTFDebug.cpp. New tests are added for . user explicit casting to array/structure/union . global variable (or its dereference) as the source of base . multi demensional arrays . array access given a base pointer . cases where we won't generate relocation if we cannot find type name. Differential Revision: https://reviews.llvm.org/D65618 llvm-svn: 367735	2019-08-02 23:16:44 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Yonghong Song	329abf2939	[BPF] fix typedef issue for offset relocation Currently, the CO-RE offset relocation does not work if any struct/union member or array element is a typedef. For example, typedef const int arr_t[7]; struct input { arr_t a; }; func(...) { struct input *in = ...; ... __builtin_preserve_access_index(&in->a[1]) ... } The BPF backend calculated default offset is 0 while 4 is the correct answer. Similar issues exist for struct/union typedef's. When getting struct/union member or array element type, we should trace down to the type by skipping typedef and qualifiers const/volatile as this is what clang did to generate getelementptr instructions. (const/volatile member type qualifiers are already ignored by clang.) This patch fixed this issue, for each access index, skipping typedef and const/volatile/restrict BTF types. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D65259 llvm-svn: 367062	2019-07-25 21:47:27 +00:00
Yonghong Song	d8efec97be	[BPF] fix CO-RE incorrect index access string Currently, we expect the CO-RE offset relocation records a string encoding the original getelementptr access index, so kernel bpf loader can decode it correctly. For example, struct s { int a; int b; }; struct t { int c; int d; }; #define _(x) (__builtin_preserve_access_index(x)) int get_value(const void addr1, const void addr2); int test(struct s arg1, struct t arg2) { return get_value(_(&arg1->b), _(&arg2->d)); } We expect two offset relocations: reloc 1: type s, access index 0, 1 reloc 2: type t, access index 0, 1 Two globals are created to retain access indexes for the above two relocations with global variable names. The first global has a name "0:1:". Unfortunately, the second global has the name "0:1:.1" as the llvm internals automatically add suffix ".1" to a global with the same name. Later on, the BPF peels the last character and record "0:1" and "0:1:." in the relocation table. This is not desirable. BPF backend could use the global variable suffix knowledge to generate correct access str. This patch rather took an approach not relying on that knowledge. It generates "s:0:1:" and "t:0:1:" to avoid global variable suffixes and later on generate correct index access string "0:1" for both records. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D65258 llvm-svn: 367030	2019-07-25 16:01:26 +00:00
Yonghong Song	a1b2a27a38	[BPF] Fix a typo in the file name Fixed the file name from BPFAbstrctMemberAccess.cpp to BPFAbstractMemberAccess.cpp. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 365532	2019-07-09 18:35:46 +00:00
Yonghong Song	d3d88d08b5	[BPF] Support for compile once and run everywhere Introduction ============ This patch added intial support for bpf program compile once and run everywhere (CO-RE). The main motivation is for bpf program which depends on kernel headers which may vary between different kernel versions. The initial discussion can be found at https://lwn.net/Articles/773198/. Currently, bpf program accesses kernel internal data structure through bpf_probe_read() helper. The idea is to capture the kernel data structure to be accessed through bpf_probe_read() and relocate them on different kernel versions. On each host, right before bpf program load, the bpfloader will look at the types of the native linux through vmlinux BTF, calculates proper access offset and patch the instruction. To accommodate this, three intrinsic functions preserve_{array,union,struct}_access_index are introduced which in clang will preserve the base pointer, struct/union/array access_index and struct/union debuginfo type information. Later, bpf IR pass can reconstruct the whole gep access chains without looking at gep itself. This patch did the following: . An IR pass is added to convert preserve__access_index to global variable who name encodes the getelementptr access pattern. The global variable has metadata attached to describe the corresponding struct/union debuginfo type. . An SimplifyPatchable MachineInstruction pass is added to remove unnecessary loads. . The BTF output pass is enhanced to generate relocation records located in .BTF.ext section. Typical CO-RE also needs support of global variables which can be assigned to different values to different hosts. For example, kernel version can be used to guard different versions of codes. This patch added the support for patchable externals as well. Example ======= The following is an example. struct pt_regs { long arg1; long arg2; }; struct sk_buff { int i; struct net_device dev; }; #define _(x) (__builtin_preserve_access_index(x)) static int (bpf_probe_read)(void dst, int size, const void unsafe_ptr) = (void ) 4; extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version; int bpf_prog(struct pt_regs ctx) { struct net_device dev = 0; // ctx->arg* does not need bpf_probe_read if (__kernel_version >= 41608) bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff )ctx->arg1)->dev)); else bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff )ctx->arg2)->dev)); return dev != 0; } In the above, we want to translate the third argument of bpf_probe_read() as relocations. -bash-4.4$ clang -target bpf -O2 -g -S trace.c The compiler will generate two new subsections in .BTF.ext, OffsetReloc and ExternReloc. OffsetReloc is to record the structure member offset operations, and ExternalReloc is to record the external globals where only u8, u16, u32 and u64 are supported. BPFOffsetReloc Size struct SecLOffsetReloc for ELF section #1 A number of struct BPFOffsetReloc for ELF section #1 struct SecOffsetReloc for ELF section #2 A number of struct BPFOffsetReloc for ELF section #2 ... BPFExternReloc Size struct SecExternReloc for ELF section #1 A number of struct BPFExternReloc for ELF section #1 struct SecExternReloc for ELF section #2 A number of struct BPFExternReloc for ELF section #2 struct BPFOffsetReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t TypeID; ///< TypeID for the relocation uint32_t OffsetNameOff; ///< The string to traverse types }; struct BPFExternReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t ExternNameOff; ///< The string for external variable }; Note that only externs with attribute section ".BPF.patchable_externs" are considered for Extern Reloc which will be patched by bpf loader right before the load. For the above test case, two offset records and one extern record will be generated: OffsetReloc records: .long .Ltmp12 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String .long .Ltmp18 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String ExternReloc record: .long .Ltmp5 # Insn Offset .long 165 # External Variable In string table: .ascii "0:1" # string offset=242 .ascii "__kernel_version" # string offset=165 The default member offset can be calculated as the 2nd member offset (0 representing the 1st member) of struct "sk_buff". The asm code: .Ltmp5: .Ltmp6: r2 = 0 r3 = 41608 .Ltmp7: .Ltmp8: .loc 1 18 9 is_stmt 0 # t.c:18:9 .Ltmp9: if r3 > r2 goto LBB0_2 .Ltmp10: .Ltmp11: .loc 1 0 9 # t.c:0:9 .Ltmp12: r2 = 8 .Ltmp13: .loc 1 19 66 is_stmt 1 # t.c:19:66 .Ltmp14: .Ltmp15: r3 = (u64 )(r1 + 0) goto LBB0_3 .Ltmp16: .Ltmp17: LBB0_2: .loc 1 0 66 is_stmt 0 # t.c:0:66 .Ltmp18: r2 = 8 .loc 1 21 66 is_stmt 1 # t.c:21:66 .Ltmp19: r3 = (u64 )(r1 + 8) .Ltmp20: .Ltmp21: LBB0_3: .loc 1 0 66 is_stmt 0 # t.c:0:66 r3 += r2 r1 = r10 .Ltmp22: .Ltmp23: .Ltmp24: r1 += -8 r2 = 8 call 4 For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number 8 is the structure offset based on the current BTF. Loader needs to adjust it if it changes on the host. For instruction .Ltmp5, "r2 = 0", the external variable got a default value 0, loader needs to supply an appropriate value for the particular host. Compiling to generate object code and disassemble: 0000000000000000 bpf_prog: 0: b7 02 00 00 00 00 00 00 r2 = 0 1: 7b 2a f8 ff 00 00 00 00 (u64 )(r10 - 8) = r2 2: b7 02 00 00 00 00 00 00 r2 = 0 3: b7 03 00 00 88 a2 00 00 r3 = 41608 4: 2d 23 03 00 00 00 00 00 if r3 > r2 goto +3 <LBB0_2> 5: b7 02 00 00 08 00 00 00 r2 = 8 6: 79 13 00 00 00 00 00 00 r3 = (u64 )(r1 + 0) 7: 05 00 02 00 00 00 00 00 goto +2 <LBB0_3> 0000000000000040 LBB0_2: 8: b7 02 00 00 08 00 00 00 r2 = 8 9: 79 13 08 00 00 00 00 00 r3 = (u64 )(r1 + 8) 0000000000000050 LBB0_3: 10: 0f 23 00 00 00 00 00 00 r3 += r2 11: bf a1 00 00 00 00 00 00 r1 = r10 12: 07 01 00 00 f8 ff ff ff r1 += -8 13: b7 02 00 00 08 00 00 00 r2 = 8 14: 85 00 00 00 04 00 00 00 call 4 Instructions #2, #5 and #8 need relocation resoutions from the loader. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61524 llvm-svn: 365503	2019-07-09 15:28:41 +00:00
Matt Arsenault	e3a676e9ad	CodeGen: Introduce a class for registers Avoids using a plain unsigned for registers throughoug codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg(). llvm-svn: 364191	2019-06-24 15:50:29 +00:00
Tom Stellard	4b0b26199b	Revert CMake: Make most target symbols hidden by default This reverts r362990 (git commit `374571301d`) This was causing linker warnings on Darwin: ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)' from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol 'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&), std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings. llvm-svn: 363028	2019-06-11 03:21:13 +00:00

1 2 3 4 5 ...

416 Commits