llvm-project

Commit Graph

Author	SHA1	Message	Date
Shiva Chen	801bf7ebbe	[DebugInfo] Examine all uses of isDebugValue() for debug instructions. Because we create a new kind of debug instruction, DBG_LABEL, we need to check all passes which use isDebugValue() to check MachineInstr is debug instruction or not. When expelling debug instructions, we should expel both DBG_VALUE and DBG_LABEL. So, I create a new function, isDebugInstr(), in MachineInstr to check whether the MachineInstr is debug instruction or not. This patch has no new test case. I have run regression test and there is no difference in regression test. Differential Revision: https://reviews.llvm.org/D45342 Patch by Hsiangkai Wang. llvm-svn: 331844	2018-05-09 02:42:00 +00:00
Nico Weber	5d53aed419	Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt llvm-svn: 330584	2018-04-23 12:49:34 +00:00
Yonghong Song	149d4d3730	bpf: signal error instead of silent drop for certain invalid asm insn Currently, an invalid asm insn, either in an asm file or in an inline asm format, might be silently dropped. This patch fixed two places where this may happen by signaling the error so user knows what goes wrong. The following is an example to demonstrate error messages: -bash-4.2$ cat t.c int test(void ctx) { #if defined(NO_ERROR) asm volatile("r0 = (u16 )skb[%0]" : : "i"(2)); #elif defined(ERROR_1) asm volatile("r20 = (u16 )skb[%0]" : : "i"(2)); #elif defined(ERROR_2) asm volatile("r0 = (u16 )(r1 + ?)" : :); #endif return 0; } -bash-4.2$ cat run.sh for macro in NO_ERROR ERROR_1 ERROR_2; do echo "===== compile for macro" $macro clang -D${macro} -O2 -target bpf -emit-llvm -S t.c echo "==llc==" llc -march=bpf -filetype=obj t.ll done -bash-4.2$ ./run.sh ===== compile for macro NO_ERROR ==llc== ===== compile for macro ERROR_1 ==llc== <inline asm>:1:2: error: invalid register/token name r20 = (u16 )skb[2] ^ note: !srcloc = 135 ===== compile for macro ERROR_2 ==llc== <inline asm>:1:21: error: unexpected token r0 = (u16 *)(r1 + ?) ^ note: !srcloc = 210 -bash-4.2$ Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 329849	2018-04-11 20:24:52 +00:00
Nico Weber	1cbd096914	Sort targetgen calls in lib/Target/*/CMakeLists. Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181	2018-04-04 12:37:44 +00:00
Yonghong Song	d3b522f519	bpf: fix incorrect SELECT_CC lowering Commit 37962a331c77 ("bpf: Improve expanding logic in LowerSELECT_CC") intended to improve code quality for certain jmp conditions. The commit, however, has a couple of issues: (1). In code, just swap is not enough, ConditionalCode CC should also be swapped, otherwise incorrect code will be generated. (2). The ConditionalCode swap should be subject to getHasJmpExt(). If getHasJmpExt() is False, certain conditional codes will not be supported and swap may generate incorrect code. The original goal for this patch is to optimize jmp operations which does not have JmpExt turned on. If JmpExt is on, better code could be generated. For example, the test select_ri.ll is introduced to demonstrate the optimization. The same result can be achieved with -mcpu=v2 flag. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 329043	2018-04-03 03:56:37 +00:00
Fangrui Song	956ee79795	Fix a bunch of typoes. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
David Blaikie	b3f471a4bd	Remove some unused includes to fix layering. llvm-svn: 328745	2018-03-29 00:29:45 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
Yonghong Song	82bf8bcb4f	bpf: Enhance debug information for peephole optimization passes Add more debug information for peephole optimization passes. These would only be enabled for debug version binary and could help analyzing why some optimization opportunities were missed. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327371	2018-03-13 06:47:07 +00:00
Yonghong Song	e91802f336	bpf: New post-RA peephole optimization pass to eliminate bad RA codegen This new pass eliminate identical move: MOV rA, rA This is particularly possible to happen when sub-register support enabled. The special type cast insn MOV_32_64 involves different register class on src (i32) and dst (i64), RA could generate useless instruction due to this. This pass also could serve as the bast for further post-RA optimization. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327370	2018-03-13 06:47:06 +00:00
Yonghong Song	80b882ecc5	bpf: Don't expand BSWAP on i32, promote it Currently, there is no ALU32 bswap support in eBPF ISA. BSWAP on i32 was set to EXPAND which would need about eight instructions for single BSWAP. It would be more efficient to promote it to i64, then doing BSWAP on i64. For eBPF programs, most of the promotion are zero extensions which are likely be elimiated later by peephole optimizations. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327369	2018-03-13 06:47:05 +00:00
Yonghong Song	1d28a759d9	bpf: Support subregister definition check on PHI node This patch relax the subregister definition check on Phi node. Previously, we just cancel the optimizatoin when the definition is Phi node while actually we could further check the definitions of incoming parameters of PHI node. This helps catch more elimination opportunities. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327368	2018-03-13 06:47:04 +00:00
Yonghong Song	c88bcdec43	bpf: Extends zero extension elimination beyond comparison instructions The current zero extension elimination was restricted to operands of comparison. It actually could be extended to more cases. For example: int inc_p (int p, unsigned a) { return p + a; } 'a' will be promoted to i64 during addition, and the zero extension could be eliminated as well. For the elimination optimization, it should be much better to start recognizing the candidate sequence from the SRL instruction instead of J* instructions. This patch makes it an generic zero extension elimination pass instead of one restricted with comparison. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327367	2018-03-13 06:47:03 +00:00
Yonghong Song	905d13c123	bpf: J_RR should check both operands There is a mistake in current code that we "break" out the optimization when the first operand of J_RR doesn't qualify the elimination. This caused some elimination opportunities missed, for example the one in the testcase. The code should just fall through to handle the second operand. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327366	2018-03-13 06:47:02 +00:00
Yonghong Song	89e47ac671	bpf: Tighten subregister definition check The current subregister definition check stops after the MOV_32_64 instruction. This means we are thinking all the following instruction sequences are safe to be eliminated: MOV_32_64 rB, wA SLL_ri rB, rB, 32 SRL_ri rB, rB, 32 However, this is not true. The source subregister wA of MOV_32_64 could come from a implicit truncation of 64-bit register in which case the high bits of the 64-bit register is not zeroed, therefore we can't eliminate above sequence. For example, for i32_val, we shouldn't do the elimination: long long bar (); int foo (int b, int c) { unsigned int i32_val = (unsigned int) bar(); if (i32_val < 10) return b; else return c; } Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 327365	2018-03-13 06:47:00 +00:00
Yonghong Song	03e1c8b8f9	bpf: introduce -mattr=dwarfris to disable DwarfUsesRelocationsAcrossSections Commit e4507fb8c94b ("bpf: disable DwarfUsesRelocationsAcrossSections") disables MCAsmInfo DwarfUsesRelocationsAcrossSections unconditionally so that dwarf will not use cross section (between dwarf and symbol table) relocations. This new debug format enables pahole to dump structures correctly as libdwarves.so does not have BPF backend support yet. This new debug format, however, breaks bcc (https://github.com/iovisor/bcc) source debug output as llvm in-memory Dwarf support has some issues to handle it. More specifically, with DwarfUsesRelocationsAcrossSections disabled, JIT compiler does not generate .debug_abbrev and Dwarf DIE (debug info entry) processing is not happy about this. This patch introduces a new flag -mattr=dwarfris (dwarf relocation in section) to disable DwarfUsesRelocationsAcrossSections. DwarfUsesRelocationsAcrossSections is true by default. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 326505	2018-03-01 23:04:59 +00:00
Yonghong Song	60fed1fef0	bpf: New optimization pass for eliminating unnecessary i32 promotions This pass performs peephole optimizations to cleanup ugly code sequences at MachineInstruction layer. Currently, the only optimization in this pass is to eliminate type promotion sequences for zero extending 32-bit subregisters to 64-bit registers. If the compiler could prove the zero extended source come from 32-bit subregistere then it is safe to erase those promotion sequece, because the upper half of the underlying 64-bit registers were zeroed implicitly already. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325991	2018-02-23 23:49:32 +00:00
Yonghong Song	ae961bb061	bpf: New decoder namespace for 32-bit subregister load/store When -mattr=+alu32 passed to the disassembler, use decoder namespace for 32-bit subregister. This is to disassemble load and store instructions in preferred B format as described in previous commit: w = (u8 ) (r + off) // BPF_LDX \| BPF_B w = (u16 )(r + off) // BPF_LDX \| BPF_H w = (u32 )(r + off) // BPF_LDX \| BPF_W (u8 ) (r + off) = w // BPF_STX \| BPF_B (u16 )(r + off) = w // BPF_STX \| BPF_H (u32 )(r + off) = w // BPF_STX \| BPF_W NOTE: all other instructions should still use the default decoder namespace. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325990	2018-02-23 23:49:31 +00:00
Yonghong Song	ca31c3bb3f	bpf: Enable 32-bit subregister support for -mattr=+alu32 After all those preparation patches, now we could enable 32-bit subregister support once -mattr=+alu32 specified. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325989	2018-02-23 23:49:30 +00:00
Yonghong Song	fcd1e0f625	bpf: Support 32-bit subregister in various InstrInfo hooks This patch support 32-bit subregister in three InstrInfo hooks, i.e. copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot, Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325988	2018-02-23 23:49:29 +00:00
Yonghong Song	b1a52bd756	bpf: New instruction patterns for 32-bit subregister load and store The instruction mapping between eBPF/arm64/x86_64 are: eBPF arm64 x86_64 LD1 BPF_LDX \| BPF_B ldrb movzbl LD2 BPF_LDX \| BPF_H ldrh movzwl LD4 BPF_LDX \| BPF_W ldr movl movzbl/movzwl/movl on x86_64 accept 32-bit sub-register, for example %eax, the same for ldrb/ldrh on arm64 which accept 32-bit "w" register. And actually these instructions only accept sub-registers. There is no point to have LD1/2/4 (unsigned) for 64-bit register, because on these arches, upper 32-bits are guaranteed to be zeroed by hardware or VM, so load into the smallest available register class is the best choice for maintaining type information. For eBPF we should adopt the same philosophy, to change current format (A): r = (u8 ) (r + off) // BPF_LDX \| BPF_B r = (u16 )(r + off) // BPF_LDX \| BPF_H r = (u32 )(r + off) // BPF_LDX \| BPF_W (u8 ) (r + off) = r // BPF_STX \| BPF_B (u16 )(r + off) = r // BPF_STX \| BPF_H (u32 )(r + off) = r // BPF_STX \| BPF_W into B: w = (u8 ) (r + off) // BPF_LDX \| BPF_B w = (u16 )(r + off) // BPF_LDX \| BPF_H w = (u32 )(r + off) // BPF_LDX \| BPF_W (u8 ) (r + off) = w // BPF_STX \| BPF_B (u16 )(r + off) = w // BPF_STX \| BPF_H (u32 )(r + off) = w // BPF_STX \| BPF_W There is no change on encoding nor how should they be interpreted, everything is as it is, load the specified length, write into low bits of the register then zeroing all remaining high bits. The only change is their associated register class and how compiler view them. Format A still need to be kept, because eBPF LLVM backend doesn't support sub-registers at default, but once 32-bit subregister is enabled, it should use format B. This patch implemented this together with all those necessary extended load and truncated store patterns. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325987	2018-02-23 23:49:28 +00:00
Yonghong Song	63cf273f55	bpf: Support i32 in getScalarShiftAmountTy method getScalarShiftAmount method should be implemented for eBPF backend to make sure shift amount could still get correct type once 32-bit subregisters support are enabled. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325986	2018-02-23 23:49:26 +00:00
Yonghong Song	59fc805c7e	bpf: Support condition comparison on i32 We need to support condition comparison on i32. All these comparisons are supposed to be combined into BPF_J* instructions which only support i64. For ISD::BR_CC we need to promote it to i64 first, then do custom lowering. For ISD::SET_CC, just expand to SELECT_CC like what's been done for i64. For ISD::SELECT_CC, we also want to do custom lower for i32. However, after 32-bit subregister support enabled, it is possible the comparison operands are i32 while the selected value are i64, or the comparison operands are i64 while the selected value are i32. We need to define extra instruction pattern and support them in custom instruction inserter. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325985	2018-02-23 23:49:25 +00:00
Yonghong Song	219156cff0	bpf: Handle i32 for ALU operations without ISA support There is no eBPF ISA support for BSWAP, ROTR, ROTL, SREM, SDIVREM, MULHU, ADDC, ADDE etc on i32. They could be emulated by other basic BPF_ALU operations, we'd set their lowering action the same as i64. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325984	2018-02-23 23:49:24 +00:00
Yonghong Song	07a7a41753	bpf: New calling convention for 32-bit subregisters This patch add new calling conventions to allow GPR32RegClass as valid register class for arguments and return types. New calling convention will only be choosen when -mattr=+alu32 specified. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325983	2018-02-23 23:49:23 +00:00
Yonghong Song	42389377d8	bpf: New target attribute "alu32" for 32-bit subregister support This new attribute aims to control the enablement of 32-bit subregister support on eBPF backend. Name the interface as "alu32" is because we in particular want to enable the generation of BPF_ALU32 instructions by enable subregister support. This attribute could be used in the following format with llc: llc -mtriple=bpf -mattr=[+\|-]alu32 It is disabled at default. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325982	2018-02-23 23:49:22 +00:00
Yonghong Song	0252f35362	bpf: Define instruction patterns for extensions and truncations between i32 to i64 For transformations between i32 and i64, if it is explicit signed extension: - first cast the operand to i64 - then use SLL + SRA to finish the extension. if it is explicit zero extension: - first cast the operand to i64 - then use SLL + SRL to finish the extension. if it is explicit any extension: - just refer to 64-bit register. if it is explicit truncation: - just refer to 32-bit subregister. NOTE: Some of the zero extension sequences might be unnecessary, they will be removed by an peephole pass on MachineInstruction layer. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325981	2018-02-23 23:49:21 +00:00
Yonghong Song	3a564a8f6e	bpf: Tighten the immediate predication for 32-bit alu instructions These 32-bit ALU insn patterns which takes immediate as one operand were initially added to enable AsmParser support, and the AsmMatcher uses "ins" and "outs" fields to deduct the operand constraint. However, the instruction selector doesn't work the same as AsmMatcher. The selector will use the "pattern" field for which we are not setting the predication for immediate operands correctly. Without this patch, i32 would eventually means all i32 operands are valid, both imm and gpr, while these patterns should allow imm only. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325980	2018-02-23 23:49:19 +00:00
Yonghong Song	ec84e2f1b0	bpf: Use markSuperRegs to mark reserved registers markSuperRegs is the canonical helper function used to mark reserved registers. It could mark any overlapping sub-registers automatically. Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 325979	2018-02-23 23:49:18 +00:00
Yonghong Song	9fdd139b41	bpf: disable DwarfUsesRelocationsAcrossSections The pahole does not work with BPF backend properly: -bash-4.2$ cat test.c struct test_t { int a; int b; }; int test(struct test_t s) { return s->a; } -bash-4.2$ clang -g -O2 -target bpf -c test.c -bash-4.2$ pahole test.o struct clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) { clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); / 0 4 / clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); / 4 4 / / size: 8, cachelines: 1, members: 2 / / last cacheline: 8 bytes / }; -bash-4.2$ The reason is that BPF backend is not yet implemented in elfutils backend https://github.com/threatstack/elfutils/tree/master/backends and pahole depends on elfutils for dwarf parsing and resolving relocation. More specifically, the unsupported relocation in .debug_info for type/member name against symbol table caused the incorrect result above. The following is the raw .rel.debug_info for the above example, Hex dump of section '.rel.debug_info': 0x00000000 06000000 00000000 0a000000 0b000000 ................ 0x00000010 0c000000 00000000 0a000000 01000000 ................ 0x00000020 12000000 00000000 0a000000 02000000 ................ 0x00000030 16000000 00000000 0a000000 0e000000 ................ 0x00000040 1a000000 00000000 0a000000 03000000 ................ ----------------- -------- -------- reloc location type symtab index Hex dump of section '.debug_info': 0x00000000 7b000000 04000000 00000801 00000000 {............... 0x00000010 0c000000 00000000 00000000 00000000 ................ 0x00000020 00000000 00001000 00000200 00000000 ................ Based on "type", the proper value will be extracted from symbol table and filled in .debug_info so later on .debug_info can be properly resolved against debug strings. There are two ways to fix this problem. One is to fix elfutils by adding BPF support which is desirable. This could take a long time and won't work with already deployed pahole. For a short term workaround, we can disable dwarf cross-section relation which specifically avoids debug_info and symbol table cross relocation. This should help any dwarf-related tool which has not implement BPF specific relocations yet. Now .rel.debug_info does not have any relocation for symbol table and .debug_info itself contains necessary relocation information by itself. Hex dump of section '.debug_info': 0x00000000 7b000000 04000000 00000801 00000000 {............... 0x00000010 0c003700 00000000 00003e00 00000000 ..7.......>..... 0x00000020 00000000 00001000 00000200 00000000 ................ location 0xc has 0, 0x12 has 0x37, 0x1a has 0x3e in place which will be used in relocation resolution. Here, the values of 0, 0x37 and 0x3e are offset in .debug_str section. Please note the difference between two above .debug_info dumps. With the fix, pahole works properly with BPF backend: -bash-4.2$ clang -O2 -g -target bpf -c test.c -bash-4.2$ pahole test.o struct test_t { int a; / 0 4 / int b; / 4 4 / / size: 8, cachelines: 1, members: 2 / / last cacheline: 8 bytes */ }; Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 325735	2018-02-21 22:59:14 +00:00
Jonas Paulsson	891789c299	[BPF] Return true in enableMultipleCopyHints(). Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Yonghong Song llvm-svn: 325457	2018-02-18 10:09:54 +00:00
Yonghong Song	920df52a93	bpf: fix a bug in dag2dag optimization for loads from readonly section The reference '&' is missing in the function parameter. If there are back-to-back optimizations in terms of dag node list like below: t29: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8)+12](dereferenceable), zext from i32> t3, t43, undef:i64 t34: i64,ch = load<LD4[bitcast (%struct.test_t @test.t to i8*)](dereferenceable), zext from i32> t3, t41, undef:i64 The bug will trigger a segfault for the added test case remove_truncate_5.ll: LLVMSymbolizer: error reading file: No such file or directory #0 0x000000000241c4d9 (llc+0x241c4d9) #1 0x000000000241c56a (llc+0x241c56a) #2 0x000000000241aa50 (llc+0x241aa50) ... #22 0x0000000000fd5edf (llc+0xfd5edf) #23 0x00007f0fe03bec05 __libc_start_main (/lib64/libc.so.6+0x21c05) #24 0x0000000000fd3e69 (llc+0xfd3e69) ... Segmentation fault Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 325267	2018-02-15 17:06:45 +00:00
Yonghong Song	f2075aef68	bpf: Improve expanding logic in LowerSELECT_CC LowerSELECT_CC is not generating optimal Select_Ri pattern at the moment. It is not guaranteed to place ConstantNode at RHS which would miss matching Select_Ri. A new testcase added into the existing select_ri.ll, also there is an existing case in cmp.ll which would be improved to use Select_Ri after this patch, it is adjusted accordingly. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 324560	2018-02-08 04:37:49 +00:00
Craig Topper	8f324bb1a4	[SelectionDAGISel] Add a debug print before call to Select. Adjust where blank lines are printed during isel process to make things more sensibly grouped. Previously some targets printed their own message at the start of Select to indicate what they were selecting. For the targets that didn't, it means there was no print of the root node before any custom handling in the target executed. So if the target did something custom and never called SelectNodeCommon, no print would be made. For the targets that did print a message in Select, if they didn't custom handle a node SelectNodeCommon would reprint the root node before walking the isel table. It seems better to just print the message before the call to Select so all targets behave the same. And then remove the root node printing from SelectNodeCommon and just leave a message that says we're starting the table search. There were also some oddities in blank line behavior. Usually due to a \n after a call to SelectionDAGNode::dump which already inserted a new line. llvm-svn: 323551	2018-01-26 19:34:20 +00:00
Yonghong Song	035dd256d5	[BPF] Mark pseudo insn patterns as isCodeGenOnly These pseudos are not supposed to be visible to user. This patch reduced the auto-generated instruction matcher. For example, the following words are removed from keyword list of LLVM BPF assembler. - MCK__35_, // '#' - MCK__COLON_, // ':' - MCK__63_, // '?' - MCK_ADJCALLSTACKDOWN, // 'ADJCALLSTACKDOWN' - MCK_ADJCALLSTACKUP, // 'ADJCALLSTACKUP' - MCK_PSEUDO, // 'PSEUDO' - MCK_Select, // 'Select' Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 322535	2018-01-16 07:27:20 +00:00
Yonghong Song	b42c7c7863	[BPF] Teach DAG2DAG AND elimination about load intrinsics As commented on the existing code: // The Reg operand should be a virtual register, which is defined // outside the current basic block. DAG combiner has done a pretty // good job in removing truncating inside a single basic block. However, when the Reg operand comes from bpf_load_[byte \| half \| word] intrinsics, the generic optimizer doesn't understand their results are zero extended, so these single basic block elimination opportunities were missed. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 322534	2018-01-16 07:27:19 +00:00
Alex Bradbury	b22f751fa7	Thread MCSubtargetInfo through Target::createMCAsmBackend Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend. D20830 threaded an MCSubtargetInfo reference through MCAsmBackend::relaxInstruction, but this isn't the only function that would benefit from access. This patch removes the Triple and CPUString arguments from createMCAsmBackend and replaces them with MCSubtargetInfo. This patch just changes the interface without making any intentional functional changes. Once in, several cleanups are possible: * Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend * Support 16-bit instructions when valid in MipsAsmBackend::writeNopData * Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl * Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221) This change initially exposed PR35686, which has since been resolved in r321026. Differential Revision: https://reviews.llvm.org/D41349 llvm-svn: 321692	2018-01-03 08:53:05 +00:00
Yonghong Song	25bf825961	bpf: add support for objdump -print-imm-hex Add support for 'objdump -print-imm-hex' for imm64, operand imm and branch target. If user programs encode immediate values as hex numbers, such an option will make it easy to correlate asm insns with source code. This option also makes it easy to correlate imm values with insn encoding. There is one changed behavior in this patch. In old way, we print the 64bit imm as u64: O << (uint64_t)Op.getImm(); and the new way is: O << formatImm(Op.getImm()); The formatImm is defined in llvm/MC/MCInstPrinter.h as format_object<int64_t> formatImm(int64_t Value) So the new way to print 64bit imm is i64 type. If a 64bit value has the highest bit set, the old way will print the value as a positive value and the new way will print as a negative value. The new way is consistent with x86_64. For the code (see the test program): ... if (a == 0xABCDABCDabcdabcdULL) ... x86_64 objdump, with and without -print-imm-hex, looks like: 48 b8 cd ab cd ab cd ab cd ab movabsq $-6067004223159161907, %rax 48 b8 cd ab cd ab cd ab cd ab movabsq $-0x5432543254325433, %rax Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 321215	2017-12-20 19:39:58 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Francis Visoiu Mistrih	a8a83d150f	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/<def>//g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name "*.s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022	2017-12-07 10:40:31 +00:00
Francis Visoiu Mistrih	25528d6de7	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber/" << printMBBReference(\1)/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber/" << printMBBReference(\1)/g' * find . $ -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665	2017-12-04 17:18:51 +00:00
Francis Visoiu Mistrih	c71cced0aa	[CodeGen] Always use `printReg` to print registers in both MIR and debug output As part of the unification of the debug format and the MIR format, always use `printReg` to print all kinds of registers. Updated the tests using '_' instead of '%noreg' until we decide which one we want to be the default one. Differential Revision: https://reviews.llvm.org/D40421 llvm-svn: 319445	2017-11-30 16:12:24 +00:00
Francis Visoiu Mistrih	93ef145862	[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically: * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed Differential Revision: https://reviews.llvm.org/D40420 llvm-svn: 319427	2017-11-30 12:12:19 +00:00
Alexei Starovoitov	9bd566f8c8	[bpf] remove unused variable Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 318615	2017-11-19 02:41:53 +00:00
Alexei Starovoitov	9a67245d88	[bpf] allow direct and indirect calls kernel verifier is becoming smarter and soon will support direct and indirect function calls. Remove obsolete error from BPF backend. Make call to use PCRel_4 fixup. 'bpf to bpf' calls are distinguished from 'bpf to kernel' calls by insn->src_reg == BPF_PSEUDO_CALL == 1 which is used as relocation indicator similar to ld_imm64->src_reg == BPF_PSEUDO_MAP_FD == 1 The actual 'call' instruction remains the same for both 'bpf to kernel' and 'bpf to bpf' calls. Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 318614	2017-11-19 01:35:00 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
Yonghong Song	ce96738dee	bpf: print backward branch target properly Currently, it prints the backward branch offset as unsigned value like below: 7: 7d 34 0b 00 00 00 00 00 if r4 s>= r3 goto 11 <LBB0_3> 8: b7 00 00 00 00 00 00 00 r0 = 0 LBB0_2: 9: 07 00 00 00 01 00 00 00 r0 += 1 ...... 17: bf 31 00 00 00 00 00 00 r1 = r3 18: 6d 32 f6 ff 00 00 00 00 if r2 s> r3 goto 65526 <LBB0_3+0x7FFB0> The correct print insn 18 should be: 18: 6d 32 f6 ff 00 00 00 00 if r2 s> r3 goto -10 <LBB0_2> To provide better clarity and be consistent with kernel verifier output, the insn 7 output is changed to the following with "+" added to non-negative branch offset: 7: 7d 34 0b 00 00 00 00 00 if r4 s>= r3 goto +11 <LBB0_3> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 318442	2017-11-16 19:15:36 +00:00
Yonghong Song	4c3ce59e61	bpf: enable llvm-objdump to print out symbolized jmp target Add hook in BPF backend so that llvm-objdump can print out the jmp target with label names, e.g., ... if r1 != 2 goto 6 <LBB0_2> ... goto 7 <LBB0_4> ... LBB0_2: ... LBB0_4: ... Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 318358	2017-11-16 00:52:30 +00:00
Daniel Sanders	725584e26d	Add backend name to Target to enable runtime info to be fed back into TableGen Summary: Make it possible to feed runtime information back to tablegen to enable profile-guided tablegen-eration, detection of untested tablegen definitions, etc. Being a cross-compiler by nature, LLVM will potentially collect data for multiple architectures (e.g. when running 'ninja check'). We therefore need a way for TableGen to figure out what data applies to the backend it is generating at the time. This patch achieves that by including the name of the 'def X : Target ...' for the backend in the TargetRegistry. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev Differential Revision: https://reviews.llvm.org/D39742 llvm-svn: 318352	2017-11-15 23:55:44 +00:00
David Blaikie	3f833edc7c	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647	2017-11-08 01:01:31 +00:00
David Blaikie	1be62f0327	Move TargetFrameLowering.h to CodeGen where it's implemented This header already includes a CodeGen header and is implemented in lib/CodeGen, so move the header there to match. This fixes a link error with modular codegeneration builds - where a header and its implementation are circularly dependent and so need to be in the same library, not split between two like this. llvm-svn: 317379	2017-11-03 22:32:11 +00:00
Yonghong Song	9af998e86e	bpf: fix an uninitialized variable issue Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 316519	2017-10-24 21:36:33 +00:00
Yonghong Song	ee68d8e41f	bpf: fix a bug in trunc-op optimization Previous implementation for per-function scope is incorrect and too conservative. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 316481	2017-10-24 18:21:10 +00:00
Yonghong Song	0f836d5dc5	bpf: fix a bug in bpf-isel trunc-op optimization In BPF backend, we try to optimize away redundant trunc operations so that kernel verifier rewrite remains valid. Previous implementation only works for a single function. This patch fixed the issue for multiple functions. It clears internal map data structure before performing optimization for each function. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 316469	2017-10-24 17:29:03 +00:00
Yonghong Song	6621cf67cf	bpf: fix bug on silently truncating 64-bit immediate We came across an llvm bug when compiling some testcases that 64-bit immediates are silently truncated into 32-bit and then packed into BPF_JMP \| BPF_K encoding. This caused comparison with wrong value. This bug looks to be introduced by r308080. The Select_Ri pattern is supposed to be lowered into J_Ri while the latter only support 32-bit immediate encoding, therefore Select_Ri should have similar immediate predicate check as what J_Ri are doing. Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 315889	2017-10-16 04:14:53 +00:00
Matthias Braun	bb8507e63c	Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine" Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots. This reverts commit r315633. llvm-svn: 315637	2017-10-12 22:57:28 +00:00
Matthias Braun	3a9c114b24	TargetMachine: Merge TargetMachine and LLVMTargetMachine Merge LLVMTargetMachine into TargetMachine. - There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine. - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen - This simplifies the code and avoids methods ending up in the wrong interface. Differential Revision: https://reviews.llvm.org/D38489 llvm-svn: 315633	2017-10-12 22:28:54 +00:00
Lang Hames	2241ffa43c	[MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr. MCObjectStreamer owns its MCCodeEmitter -- this fixes the types to reflect that, and allows us to remove the last instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315531	2017-10-11 23:34:47 +00:00
Oliver Stannard	4191b9eaea	[Asm] Add debug tracing in table-generated assembly matcher This adds debug tracing to the table-generated assembly instruction matcher, enabled by the -debug-only=asm-matcher option. The changes in the target AsmParsers are to add an MCInstrInfo reference under a consistent name, so that we can use it from table-generated code. This was already being used this way for targets that use deprecation warnings, but 5 targets did not have it, and Hexagon had it under a different name to the other backends. llvm-svn: 315445	2017-10-11 09:17:43 +00:00
Lang Hames	02d330548d	[MC] Have MCObjectStreamer take its MCAsmBackend argument via unique_ptr. MCObjectStreamer owns its MCAsmBackend -- this fixes the types to reflect that, and allows us to remove another instance of MCObjectStreamer's weird "holding ownership via someone else's reference" trick. llvm-svn: 315410	2017-10-11 01:57:21 +00:00
Lang Hames	60fbc7cc38	[MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriter functions. This makes the ownership of the resulting MCObjectWriter clear, and allows us to remove one instance of MCObjectStreamer's bizarre "holding ownership via someone else's reference" trick. llvm-svn: 315327	2017-10-10 16:28:07 +00:00
Lang Hames	dcb312bdb9	[MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter to ELFObjectWriter's constructor. Fixes the same ownership issue for ELF that r315245 did for MachO: ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr. llvm-svn: 315254	2017-10-09 23:53:15 +00:00
Yonghong Song	09b01b3555	bpf: fix an insn encoding issue for neg insn Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 314911	2017-10-04 16:11:52 +00:00
Yonghong Song	ef29a84d48	bpf: fix a bug for disassembling ld_pseudo inst Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 314469	2017-09-28 22:47:34 +00:00
Yonghong Song	e9165f8720	bpf: add new insns for bswap_to_le and negation This patch adds new insn, "reg = be16/be32/be64 reg", for bswap to little endian for big-endian target (bpfeb). It also adds new insn for negation "reg = -reg". Currently, for source code, e.g., b = -a LLVM still prefers to generate: b = 0 - a But "reg = -reg" format can be used in assembly code. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 314376	2017-09-28 02:46:11 +00:00
Yonghong Song	d2e0d1fa11	bpf: initial 32-bit ALU encoding support in assembler This patch adds instruction patterns for operations in BPF_ALU. After this, assembler could recognize some 32-bit ALU statement. For example, those listed int the unit test file. Separate MOV patterns are unnecessary as MOV is ALU operation that could reuse ALU encoding infrastructure, this patch removed those redundant patterns. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 313961	2017-09-22 04:36:36 +00:00
Yonghong Song	3c63b101de	bpf: add 32bit register set Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 313960	2017-09-22 04:36:35 +00:00
Yonghong Song	d03fef970b	bpf: refactor inst patterns with better inheritance Arithmetic and jump instructions, load and store instructions are sharing the same 8-bit code field encoding, A better instruction pattern implemention could be the following inheritance relationships, and each layer only encoding those fields which start to diverse from that layer. This avoids some redundant code. InstBPF -> TYPE_ALU_JMP -> ALU/JMP InstBPF -> TYPE_LD_ST -> Load/Store Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 313959	2017-09-22 04:36:34 +00:00
Yonghong Song	3bf1a8d04e	bpf: refactor inst patterns with more mnemonics Currently, eBPF backend is using some constant directly in instruction patterns, This patch replace them with mnemonics and removed some unnecessary temparary variables. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 313958	2017-09-22 04:36:32 +00:00
Yonghong Song	9ef85f0677	bpf: add inline-asm support Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 313593	2017-09-18 23:29:36 +00:00
Yonghong Song	06ff655e59	bpf: Add BPF AsmParser support in LLVM Reviewed-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 313055	2017-09-12 17:55:23 +00:00
Yonghong Song	be9c00347f	bpf: add " ll" in the LD_IMM64 asmstring This partially revert previous fix in commit f5858045aa0b ("bpf: proper print imm64 expression in inst printer"). In that commit, the original suffix "ll" is removed from LD_IMM64 asmstring. In the customer print method, the "ll" suffix is printed if the rhs is an immediate. For example, "r2 = 5ll" => "r2 = 5ll", and "r3 = varll" => "r3 = var". This has an issue though for assembler. Since assembler relies on asmstring to do pattern matching, it will not be able to distiguish between "mov r2, 5" and "ld_imm64 r2, 5" since both asmstring is "r2 = 5". In such cases, the assembler uses 64bit load for all "r = <val>" asm insts. This patch adds back " ll" suffix for ld_imm64 with one additional space for "#reg = #global_var" case. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 312978	2017-09-11 23:43:35 +00:00
Yonghong Song	093420f929	bpf: proper print imm64 expression in inst printer Fixed an issue in printImm64Operand where if the value is an expression, print out the expression properly. Currently, it will print r1 = <MCOperand Expr:(tx_port)>ll With the patch, the printout will be r1 = tx_port Suggested-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 312833	2017-09-08 23:32:38 +00:00
Yonghong Song	dc1dbf6ef3	bpf: add variants of -mcpu=# and support for additional jmp insns -mcpu=# will support: . generic: the default insn set . v1: insn set version 1, the same as generic . v2: insn set version 2, version 1 + additional jmp insns . probe: the compiler will probe the underlying kernel to decide proper version of insn set. We did not not use -mcpu=native since llc/llvm will interpret -mcpu=native as the underlying hardware architecture regardless of -march value. Currently, only x86_64 supports -mcpu=probe. Other architecture will silently revert to "generic". Also added -mcpu=help to print available cpu parameters. llvm will print out the information only if there are at least one cpu and at least one feature. Add an unused dummy feature to enable the printout. Examples for usage: $ llc -march=bpf -mcpu=v1 -filetype=asm t.ll $ llc -march=bpf -mcpu=v2 -filetype=asm t.ll $ llc -march=bpf -mcpu=generic -filetype=asm t.ll $ llc -march=bpf -mcpu=probe -filetype=asm t.ll $ llc -march=bpf -mcpu=v3 -filetype=asm t.ll 'v3' is not a recognized processor for this target (ignoring processor) ... $ llc -march=bpf -mcpu=help -filetype=asm t.ll Available CPUs for this target: generic - Select the generic processor. probe - Select the probe processor. v1 - Select the v1 processor. v2 - Select the v2 processor. Available features for this target: dummy - unused feature. Use +feature to enable a feature, or -feature to disable it. For example, llc -mcpu=mycpu -mattr=+feature1,-feature2 ... Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 311522	2017-08-23 04:25:57 +00:00
Alex Bradbury	080f6976c0	Use report_fatal_error for unsupported calling conventions The calling convention can be specified by the user in IR. Failing to support a particular calling convention isn't a programming error, and so relying on llvm_unreachable to catch and report an unsupported calling convention is not appropriate. Differential Revision: https://reviews.llvm.org/D36830 llvm-svn: 311435	2017-08-22 09:11:41 +00:00
Rafael Espindola	79e238afee	Delete Default and JITDefault code models IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911	2017-08-03 02:16:21 +00:00
Yonghong Song	fbfae984b4	bpf: fix a compilation bug due to unused variable for release build Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 308083	2017-07-15 06:08:08 +00:00
Yonghong Song	9276ef05c8	bpf: generate better lowering code for certain select/setcc instructions Currently, for code like below, === inner_map = bpf_map_lookup_elem(outer_map, &port_key); if (!inner_map) { inner_map = &fallback_map; } === the compiler generates (pseudo) code like the below: === I1: r1 = bpf_map_lookup_elem(outer_map, &port_key); I2: r2 = 0 I3: if (r1 == r2) I4: r6 = &fallback_map I5: ... === During kernel verification process, After I1, r1 holds a state map_ptr_or_null. If I3 condition is not taken (path [I1, I2, I3, I5]), supposedly r1 should become map_ptr. Unfortunately, kernel does not recognize this pattern and r1 remains map_ptr_or_null at insn I5. This will cause verificaiton failure later on. Kernel, however, is able to recognize pattern "if (r1 == 0)" properly and give a map_ptr state to r1 in the above case. LLVM here generates suboptimal code which causes kernel verification failure. This patch fixes the issue by changing BPF insn pattern matching and lowering to generate proper codes if the righthand parameter of the above condition is a constant. A test case is also added. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 308080	2017-07-15 05:41:42 +00:00
Rafael Espindola	1beb702ba2	Fully fix the movw/movt addend. The issue is not if the value is pcrel. It is whether we have a relocation or not. If we have a relocation, the static linker will select the upper bits. If we don't have a relocation, we have to do it. llvm-svn: 307730	2017-07-11 23:18:25 +00:00
Yonghong Song	5fbe01b12d	bpf: remove unnecessary truncate operation For networking-type bpf program, it often needs to access packet data. A context data structure is provided to the bpf programs with two fields: u32 data; u32 data_end; User can access these two fields with ctx->data and ctx->data_end. During program verification process, the kernel verifier modifies the bpf program with loading of actual pointer value from kernel data structure. r = ctx->data ===> r = actual data start ptr r = ctx->data_end ===> r = actual data end ptr A typical program accessing ctx->data like char data_ptr = (char )(long)ctx->data will result in a 32-bit load followed by a zero extension. Such an operation is combined into a single LDW in DAG combiner as bpf LDW does zero extension automatically. In cases like the below (which can be a result of global value numbering and partial redundancy elimination before insn selection): B1: u32 a = load-32-bit &ctx->data u64 pa = zext a ... B2: u32 b = load-32-bit &ctx->data u64 pb = zext b ... B3: u32 m = PHI(a, b) u64 pm = zext m In B3, "pm = zext m" cannot be removed, which although is legal from compiler perspective, will generate incorrect code after kernel verification. This patch recognizes this pattern and traces through PHI node to see whether the operand of "zext m" is defined with LDWs or not. If it is, the "zext m" itself can be removed. The patch also recognizes the pattern where the load and use of the load value not in the same basic block, where truncate operation may be removed as well. The patch handles 1-byte, 2-byte and 4-byte truncation. Two test cases are added to verify the transformation happens properly for the above code pattern. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 306685	2017-06-29 15:18:54 +00:00
Rafael Espindola	f351292141	Remove redundant argument. llvm-svn: 306189	2017-06-24 00:26:57 +00:00
Rafael Espindola	801b42de31	ARM: move some logic from processFixupValue to applyFixup. processFixupValue is called on every relaxation iteration. applyFixup is only called once at the very end. applyFixup is then the correct place to do last minute changes and value checks. While here, do proper range checks again for fixup_arm_thumb_bl. We used to do it, but dropped because of thumb2. We now do it again, but use the thumb2 range. llvm-svn: 306177	2017-06-23 22:52:36 +00:00
Rafael Espindola	88d9e37ec8	Use a MutableArrayRef. NFC. llvm-svn: 305968	2017-06-21 23:06:53 +00:00
Yonghong Song	a63178f756	bpf: fix a strict-aliasing issue Davide Italiano reported the following issue if llvm is compiled with gcc -Wstrict-aliasing -Werror: ..... lib/Target/BPF/CMakeFiles/LLVMBPFCodeGen.dir/BPFISelDAGToDAG.cpp.o ../lib/Target/BPF/BPFISelDAGToDAG.cpp: In member function ‘virtual void {anonymous}::BPFDAGToDAGISel::PreprocessISelDAG()’: ../lib/Target/BPF/BPFISelDAGToDAG.cpp:264:26: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] val = (uint16_t )new_val; ..... The error is caused by my previous commit (revision 305560). This patch fixed the issue by introducing an union to avoid type casting. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305608	2017-06-16 23:28:04 +00:00
Yonghong Song	ac2e25026f	bpf: avoid load from read-only sections If users tried to have a structure decl/init code like below struct test_t t = { .memeber1 = 45 }; It is very likely that compiler will generate a readonly section to hold up the init values for variable t. Later load of t members, e.g., t.member1 will result in a read from readonly section. BPF program cannot handle relocation. This will force users to write: struct test_t t = {}; t.member1 = 45; This is just inconvenient and unintuitive. This patch addresses this issue by implementing BPF PreprocessISelDAG. For any load from a global constant structure or an global array of constant struct, it attempts to translate it into a constant directly. The traversal of the constant struct and other constant data structures are similar to where the assembler emits read-only sections. Four different unit test cases are also added to cover different scenarios. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305560	2017-06-16 15:41:16 +00:00
Yonghong Song	5c65439617	bpf: set missing types in insn tablegen file o This is discovered during my study of 32-bit subregister support. o This is no impact on current functionality since we only support 64-bit registers. o Searching the web, looks like the issue has been discovered before, so fix it now. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305559	2017-06-16 15:30:55 +00:00
Yonghong Song	7e9d2cb553	bpf: clang-format on BPFAsmPrinter.cpp Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 305301	2017-06-13 16:17:20 +00:00
Zachary Turner	264b5d9e88	Move Object format code to lib/BinaryFormat. This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864	2017-06-07 03:48:56 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Craig Topper	fa5dc09292	[BPF] Correct the file name of the -gen-asm-matcher output file to not start with X86. llvm-svn: 304324	2017-05-31 19:01:05 +00:00
Matthias Braun	5e394c3d6f	TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247	2017-05-30 21:36:41 +00:00
Alexei Starovoitov	3c585d3a8f	[bpf] disallow global_addr+off folding Wrong assembly code is generated for a simple program with clang. If clang only produces IR and llc is used for IR lowering and optimization, correct assembly code is generated. The main reason is that clang feeds default Reloc::Static to llvm and llc feeds no RelocMode to llvm, where for llc case, BPF backend picks up Reloc::PIC_ mode. This leads different IR lowering behavior and clang permits global_addr+off folding while llc doesn't. This patch introduces isOffsetFoldingLegal function into BPF backend and the function always return false. This will make clang and llc behave the same for the lowering. Bug https://bugs.llvm.org//show_bug.cgi?id=33183 has more detailed explanation. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 304043	2017-05-26 22:32:41 +00:00
Serge Pavlov	d526b13e61	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527	2017-05-09 13:35:13 +00:00
Alexei Starovoitov	7bab73b1f8	[bpf] fix a bug which causes incorrect big endian reloc fixup o Add bpfeb support in BPF dwarfdump unit test case Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@fb.com> llvm-svn: 302265	2017-05-05 18:05:00 +00:00
Alexei Starovoitov	f7bd5ebd3b	[bpf] add bigendian support to disassembler . swap 4-bit register encoding, 16-bit offset and 32-bit imm to support big endian archs . add a test Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 301653	2017-04-28 16:51:01 +00:00
Konstantin Zhuravlyov	dc77b2e960	Distinguish between code pointer size and DataLayout::getPointerSize() in DWARF info generation llvm-svn: 300463	2017-04-17 17:41:25 +00:00
Alexei Starovoitov	56db145164	[bpf] Fix memory offset check for loads and stores If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit but LLVM considers them to be, for the matter of this check, to be 32-bit long. This causes the following program: int bpf_prog1(void ign) { volatile unsigned long t = 0x8983984739ull; return (unsigned long )((0xffffffff8fff0002ull) + t); } To generate the following (wrong) code: 0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll 2: 7b 1a f8 ff 00 00 00 00 (u64 )(r10 - 8) = r1 3: 79 a1 f8 ff 00 00 00 00 r1 = (u64 )(r10 - 8) 4: 79 10 02 00 00 00 00 00 r0 = (u64 *)(r1 + 2) 5: 95 00 00 00 00 00 00 00 exit Fix it by changing the offset check to 16-bit. Patch by Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D32055 llvm-svn: 300269	2017-04-13 22:24:13 +00:00
Alex Bradbury	866113c2ea	Add MCContext argument to MCAsmBackend::applyFixup for error reporting A number of backends (AArch64, MIPS, ARM) have been using MCContext::reportError to report issues such as out-of-range fixup values in their TgtAsmBackend. This is great, but because MCContext couldn't easily be threaded through to the adjustFixupValue helper function from its usual callsite (applyFixup), these backends ended up adding an MCContext* argument and adding another call to applyFixup to processFixupValue. Adding an MCContext parameter to applyFixup makes this unnecessary, and even better - applyFixup can take a reference to MCContext rather than a potentially null pointer. Differential Revision: https://reviews.llvm.org/D30264 llvm-svn: 299529	2017-04-05 10:16:14 +00:00
Matthias Braun	8c209aa877	Cleanup dump() functions. We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359	2017-01-28 02:02:38 +00:00

1 2 3 4 5 ...

264 Commits