llvm-project

Commit Graph

Author	SHA1	Message	Date
Jake Ehrlich	5049c3422d	[llvm-objcopy] Make .build-id linking atomic This change makes linking into .build-id atomic and safe to use. Some users under particular workflows are reporting that this races more than half the time under particular conditions. llvm-svn: 356404	2019-03-18 20:35:18 +00:00
Tim Renouf	cfdfba996b	[AMDGPU] Asm/disasm clamp modifier on vop3 int arithmetic Allow the clamp modifier on vop3 int arithmetic instructions in assembly and disassembly. This involved adding a clamp operand to the affected instructions in MIR and MC, and thus having to fix up several places in codegen and MIR tests. Differential Revision: https://reviews.llvm.org/D59267 Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e llvm-svn: 356399	2019-03-18 19:35:44 +00:00
Tim Renouf	2e94f6e584	[AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers This commit allows v_cndmask_b32_e64 with abs, neg source modifiers on src0, src1 to be assembled and disassembled. This does appear to be allowed, even though they are floating point modifiers and the operand type is b32. To do this, I added src0_modifiers and src1_modifiers to the MachineInstr, which involved fixing up several places in codegen and mir tests. Differential Revision: https://reviews.llvm.org/D59191 Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea llvm-svn: 356398	2019-03-18 19:25:39 +00:00
Amara Emerson	8627178d46	Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy() After review comments, it was preferred to not teach MachineIRBuilder about non-generic instructions beyond using buildInstr(). For AArch64 I've changed the buildCopy() calls to buildInstr() + a separate addReg() call. This also relaxes the MachineIRBuilder's COPY checking more because it may not always have a SrcOp given to it. llvm-svn: 356396	2019-03-18 19:20:10 +00:00
Alexandre Ganea	4aeea4cc42	[DebugInfo][PDB] Don't write empty debug streams Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count). With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention. Also fix the * Linker * contrib section which wasn't correctly emitted previously. Differential Revision: https://reviews.llvm.org/D59502 llvm-svn: 356395	2019-03-18 19:13:23 +00:00
Tim Renouf	8723a56551	[MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive checks bot This fixes a couple of unflushed raw_string_ostream bugs in recent commits that only show up on a bot building on windows with expensive checks. Differential Revision: https://reviews.llvm.org/D59396 Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf llvm-svn: 356394	2019-03-18 19:00:46 +00:00
Craig Topper	f07062a798	[X86] Rename imm8_su/imm16_su/imm32_su to relocImm8_su/relocImm16_su/relocImm32_su to accurately reflect what they are. llvm-svn: 356393	2019-03-18 18:54:06 +00:00
Warren Ristow	ad7d0ded2e	[SCEV] Guard movement of insertion point for loop-invariants This reinstates r347934, along with a tweak to address a problem with PHI node ordering that that commit created (or exposed). (That commit was reverted at r348426, due to the PHI node issue.) Original commit message: r320789 suppressed moving the insertion point of SCEV expressions with div/rem operations to the loop header in non-loop-invariant situations. This, and similar, hoisting is also unsafe in the loop-invariant case, since there may be a guard against a zero denominator. This is an adjustment to the fix of r320789 to suppress the movement even in the loop-invariant case. This fixes PR30806. Differential Revision: https://reviews.llvm.org/D57428 llvm-svn: 356392	2019-03-18 18:52:35 +00:00
Adhemerval Zanella	270249de2b	[AArch64] Small fix for getIntImmCost It uses the generic AArch64_IMM::expandMOVImm to get the correct number of instructions used in immediate materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58461 llvm-svn: 356391	2019-03-18 18:50:58 +00:00
Adhemerval Zanella	a3cefa5d64	[AArch64] Optimize floating point materialization This patch follows some ideas from r352866 to optimize the floating point materialization even further. It changes isFPImmLegal to consider up to 2 mov instructions or up to 5 in case subtarget has fused literals. The rationale is the cost is the same for mov+fmov vs. adrp+ldr; but the mov+fmov sequence is always better because of the reduced d-cache pressure. The timings are still the same if you consider movw+movk+fmov vs. adrp+ldr will be fused (although one instruction longer). Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58460 llvm-svn: 356390	2019-03-18 18:45:57 +00:00
Adhemerval Zanella	664c1ef528	[TargetLowering] Add code size information on isFPImmLegal. NFC This allows better code size for aarch64 floating point materialization in a future patch. Reviewers: evandro Differential Revision: https://reviews.llvm.org/D58690 llvm-svn: 356389	2019-03-18 18:40:07 +00:00
Adhemerval Zanella	8a595b1d2e	[AArch64] Refactor floating point materialization. NFC It splits the logic of actual instruction emission away from the logic that figures out the appropriate sequence on AArch64ExpandPseudo::expandMOVImm. The new function AArch64_IMM::expandMOVImm, which returns the list of the instructions to materialize the immediate constant, is implemented on a separate unit because it will be used in a subsequent patch to optimize floating point materialization. Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D58915 llvm-svn: 356387	2019-03-18 18:23:23 +00:00
Craig Topper	c2b35ebc1d	[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Similar to previous change done for VPCOM and VPCMP Differential Revision: https://reviews.llvm.org/D59468 llvm-svn: 356384	2019-03-18 17:59:59 +00:00
Nirav Dave	55c921f4bf	[DAG] Cleanup unused node in SimplifySelectCC. Delete temporarily constructed node uses for analysis after its use, holding onto original input nodes. Ideally this would be rewritten without making nodes, but this appears relatively complex. Reviewers: spatel, RKSimon, craig.topper Subscribers: jdoerfert, hiraditya, deadalnix, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57921 llvm-svn: 356382	2019-03-18 17:02:38 +00:00
Neil Henning	523dab0788	[AMDGPU] Add an experimental buffer fat pointer address space. Add an experimental buffer fat pointer address space that is currently unhandled in the backend. This commit reserves address space 7 as a non-integral pointer representing the 160-bit fat pointer (128-bit buffer descriptor + 32-bit offset) that is heavily used in graphics workloads using the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D58957 llvm-svn: 356373	2019-03-18 14:44:28 +00:00
Sanjay Patel	6063393536	[InstCombine] allow general vector constants for funnel shift to shift transforms Follow-up to: rL356338 rL356369 We can calculate an arbitrary vector constant minus the bitwidth, so there's no need to limit this transform to scalars and splats. llvm-svn: 356372	2019-03-18 14:27:51 +00:00
Sanjay Patel	84de8a30a0	[InstCombine] extend rotate-left-by-constant canonicalization to funnel shift Follow-up to: rL356338 Rotates are a special case of funnel shift where the 2 input operands are the same value, but that does not need to be a restriction for the canonicalization when the shift amount is a constant. llvm-svn: 356369	2019-03-18 14:10:11 +00:00
David Stenberg	8a2e4af7e7	[DebugInfo] Ignore bitcasts when lowering stack arg dbg.values Summary: Look past bitcasts when looking for parameter debug values that are described by frame-index loads in `EmitFuncArgumentDbgValue()`. In the attached test case we would be left with an undef `DBG_VALUE` for the parameter without this patch. A similar fix was done for parameters passed in registers in D13005. This fixes PR40777. Reviewers: aprantl, vsk, jmorse Reviewed By: aprantl Subscribers: bjope, javed.absar, jdoerfert, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D58831 llvm-svn: 356363	2019-03-18 11:27:32 +00:00
Christof Douma	8cfd91dcc7	[AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse Fixes https://bugs.llvm.org/show_bug.cgi?id=35094 The Dead register definition pass should leave alone the atomicrmw instructions on AArch64 (LSE extension). The reason is the following statement in the Arm ARM: "The ST<OP> instructions, and LD<OP> instructions where the destination register is WZR or XZR, are not regarded as doing a read for the purpose of a DMB LD barrier." A good example was given in the gcc thread by Will Deacon (linked in the bugzilla ticket 35094): P0 (atomic_int* y,atomic_int* x) { atomic_store_explicit(x,1,memory_order_relaxed); atomic_thread_fence(memory_order_release); atomic_store_explicit(y,1,memory_order_relaxed); } P1 (atomic_int* y,atomic_int* x) { atomic_fetch_add_explicit(y,1,memory_order_relaxed); // STADD atomic_thread_fence(memory_order_acquire); int r0 = atomic_load_explicit(x,memory_order_relaxed); } P2 (atomic_int* y) { int r1 = atomic_load_explicit(y,memory_order_relaxed); } My understanding is that it is forbidden for r0 == 0 and r1 == 2 after this test has executed. However, if the relaxed add in P1 compiles to STADD and the subsequent acquire fence is compiled as DMB LD, then we don't have any ordering guarantees in P1 and the forbidden result could be observed. Change-Id: I419f9f9df947716932038e1100c18d10a96408d0 llvm-svn: 356360	2019-03-18 09:21:06 +00:00
Craig Topper	ba898da132	[X86] Hopefully fix a tautological compare warning in printVecCompareInstr. llvm-svn: 356359	2019-03-18 07:05:01 +00:00
Craig Topper	b4c49255aa	[X86] Make ADD*_DB post-RA pseudos and expand them in expandPostRAPseudo. These are used to help convert OR->LEA when needed to avoid a copy. They aren't needed after register allocation. Happens to remove an ugly goto from X86MCCodeEmitter.cpp llvm-svn: 356356	2019-03-18 05:48:18 +00:00
Craig Topper	860a27208e	[X86] Add tab character to the custom printing of VPCMP and VPCOM instructions. All the other instructions are printed with a preceding tab. llvm-svn: 356355	2019-03-18 02:53:11 +00:00
Craig Topper	04cc28fe13	[X86] Merge printf32mem/printi32mem into a single printdwordmem. Do the same for all other printing functions. The only thing the print methods currently need to know is the string to print for the memory size in intel syntax. This patch merges the functions based on this string. If we ever need something else in the future, it's easy to split them back out. This reduces the number of cases in the assembly printers. It shrinks the intel printer to only use 7 bytes per instruction instead of 8. llvm-svn: 356352	2019-03-17 22:57:21 +00:00
Tim Renouf	c4e128e221	[CodeGen] Defined MVTs v3i32, v3f32, v5i32, v5f32 AMDGPU would like to use these MVTs. Differential Revision: https://reviews.llvm.org/D58901 Change-Id: I6125fea810d7cc62a4b4de3d9904255a1233ae4e llvm-svn: 356351	2019-03-17 22:56:38 +00:00
Tim Renouf	c302b9b5fe	[CodeGen] Prepare for introduction of v3 and v5 MVTs AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This commit does not add them, but makes preparatory changes: * Exclude non-legal non-power-of-2 vector types from ComputeRegisterProp mechanism in TargetLoweringBase::getTypeConversion. * Cope with SETCC and VSELECT for odd-width i1 vector when the other vectors are legal type. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58899 Change-Id: Ib5f23377dbef511be3a936211a0b9f94e46331f8 llvm-svn: 356350	2019-03-17 21:43:12 +00:00
David Green	baa94ef03b	[ARM] Check that CPSR does not have other uses Fix up rL356335 by checking that CPSR is not read between the compare and the branch. llvm-svn: 356349	2019-03-17 21:36:15 +00:00
Matt Arsenault	884a18d792	RegAllocFast: Add hint to debug printing llvm-svn: 356348	2019-03-17 21:31:40 +00:00
Matt Arsenault	e0c1f9e76d	AMDGPU: Partially fix default device for HSA There are a few different issues, mostly stemming from using generation based checks for anything instead of subtarget features. Stop adding flat-address-space as a feature for HSA, as it should only be a device property. This was incorrectly allowing flat instructions to select for SI. Increase the default generation for HSA to avoid the encoding error when emitting objects. This has some other side effects from various checks which probably should be separate subtarget features (in the cost model and for dealing with the DS offset folding issue). Partial fix for bug 41070. It should probably be an error to try using amdhsa without flat support. llvm-svn: 356347	2019-03-17 21:31:35 +00:00
Nikita Popov	5e7b62de05	[ConstantRange] Add assertion for KnownBits validity; NFC Following the suggestion in D59475. llvm-svn: 356346	2019-03-17 21:25:32 +00:00
Nikita Popov	322e2dbee1	[ValueTracking] Use ConstantRange overflow check for signed add; NFC This is the same change as rL356290, but for signed add. It replaces the existing ripple logic with the overflow logic in ConstantRange. This is NFC in that it should return NeverOverflow in exactly the same cases as the previous implementation. However, it does make computeOverflowForSignedAdd() more powerful by now also determining AlwaysOverflows conditions. As none of its consumers handle this yet, this has no impact on optimization. Making use of AlwaysOverflows in with.overflow folding will be handled as a followup. Differential Revision: https://reviews.llvm.org/D59450 llvm-svn: 356345	2019-03-17 21:25:26 +00:00
Craig Topper	affead9ad0	[X86] Remove the _alt forms of AVX512 VPCMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Similar to the previous patch for VPCOM. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356344	2019-03-17 21:21:40 +00:00
Craig Topper	12509d87f3	[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And an _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd, which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc., this didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343	2019-03-17 21:21:37 +00:00
Tim Renouf	e30aa6a136	[AMDGPU] Prepare for introduction of v3 and v5 MVTs AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This commit does not add them, but makes preparatory changes: * Fixed assumptions of power-of-2 vector type in kernel arg handling, and added v5 kernel arg tests and v3/v5 shader arg tests. * Added v5 tests for cost analysis. * Added vec3/vec5 arg test cases. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58928 Change-Id: I7279d6b4841464d2080eb255ef3c589e268eabcd llvm-svn: 356342	2019-03-17 21:04:16 +00:00
Tim Renouf	d1477e989c	[ARM] Fixed an assumption of power-of-2 vector MVT I am about to introduce some non-power-of-2 width vector MVTs. This commit fixes a power-of-2 assumption that my forthcoming change would otherwise break, as shown by test/CodeGen/ARM/vcvt_combine.ll and vdiv_combine.ll. Differential Revision: https://reviews.llvm.org/D58927 Change-Id: I56a282e365d3874ab0621e5bdef98a612f702317 llvm-svn: 356341	2019-03-17 20:48:54 +00:00
Nikita Popov	ef2d979943	[ConstantRange] Add fromKnownBits() method Following the suggestion in D59450, I'm moving the code for constructing a ConstantRange from KnownBits out of ValueTracking, which also allows us to test this code independently. I'm adding this method to ConstantRange rather than KnownBits (which would have been a bit nicer API wise) to avoid creating a dependency from Support to IR, where ConstantRange lives. Differential Revision: https://reviews.llvm.org/D59475 llvm-svn: 356339	2019-03-17 20:24:02 +00:00
Sanjay Patel	b3bcd95771	[InstCombine] canonicalize rotate right by constant to rotate left This was noted as a backend problem: https://bugs.llvm.org/show_bug.cgi?id=41057 ...and subsequently fixed for x86: rL356121 But we should canonicalize these in IR for the benefit of all targets and improve IR analysis such as CSE. llvm-svn: 356338	2019-03-17 19:08:00 +00:00
David Green	e0b48a8015	[ARM] Search backwards for CMP when combining into CBZ The constant island pass currently only looks at the instruction immediately before a branch for a CMP to fold into a CBZ/CBNZ. This extends it to search backwards for the instruction that defines CPSR. We need to ensure that the register is not overridden between the CMP and the branch. Differential Revision: https://reviews.llvm.org/D59317 llvm-svn: 356336	2019-03-17 16:11:22 +00:00
Nikita Popov	9a4453592b	[DAGCombine] Fold (x & ~y) | y patterns Fold (x & ~y) | y and its four commuted variants to x | y. This pattern can in particular appear when a vselect c, x, -1 is expanded to (x & ~c) | (-1 & c) and combined to (x & ~c) | c. This change has some overlap with D59066, which avoids creating a vselect of this form in the first place during uaddsat expansion. Differential Revision: https://reviews.llvm.org/D59174 llvm-svn: 356333	2019-03-17 15:45:38 +00:00
Sanjay Patel	6a6e808b69	[TargetLowering] improve the default expansion of uaddsat/usubsat This is a subset of what was proposed in: D59006 ...and may overlap with test changes from: D59174 ...but it seems like a good general optimization to turn selects into bitwise-logic when possible because we never know exactly what can happen at this stage of DAG combining depending on how the target has defined things. Differential Revision: https://reviews.llvm.org/D59066 llvm-svn: 356332	2019-03-17 14:57:40 +00:00
Alex Bradbury	997947961a	[RISCV][NFC] Factor out matchRegisterNameHelper in RISCVAsmParser.cpp Contains common logic to match a string to a register name. llvm-svn: 356330	2019-03-17 12:02:32 +00:00
Alex Bradbury	b18e314a7c	[RISCV] Fix RISCVAsmParser::ParseRegister and add tests RISCVAsmParser::ParseRegister is called from AsmParser::parseRegisterOrNumber, which in turn is called when processing CFI directives. The RISC-V implementation wasn't setting RegNo, and so was incorrect. This patch addresses that and adds cfi directive tests that demonstrate the fix. A follow-up patch will factor out the register parsing logic shared between ParseRegister and parseRegister. llvm-svn: 356329	2019-03-17 12:00:58 +00:00
Simon Pilgrim	3b0a6c69ee	[DAGCombine] combineShuffleOfScalars - handle non-zero SCALAR_TO_VECTOR indices (PR41097) rL356292 reduces the size of scalar_to_vector if we know the upper bits are undef - which means that shuffles may find they are suddenly referencing scalar_to_vector elements other than zero - so make sure we handle this as undef. llvm-svn: 356327	2019-03-16 17:36:26 +00:00
Yonghong Song	6db6b56a5c	[BPF] Add BTF Var and DataSec Support Two new kinds, BTF_KIND_VAR and BTF_KIND_DATASEC, are added. BTF_KIND_VAR has the following specification: btf_type.name: var name btf_type.info: type kind btf_type.type: var type // btf_type is followed by one u32 u32: varinfo (currently, only 0 - static, 1 - global allocated in elf sections) Not all globals are supported in this patch. The following globals are supported: . static variables with or without section attributes . global variables with section attributes The inclusion of globals with section attributes is for future potential extraction of key/value type id's from map definition. BTF_KIND_DATASEC has the following specification: btf_type.name: section name associated with variable or one of .data/.bss/.readonly btf_type.info: type kind and vlen for # of variables btf_type.size: 0 #vlen number of the following: u32: id of corresponding BTF_KIND_VAR u32: in-section offset of the var u32: the size of memory var occupied At the time of debug info emission, the data section size is unknown, so the btf_type.size = 0 for BTF_KIND_DATASEC. The loader can patch it during loading time. The in-section offset of the var is only available for static variables. For global variables, the loader needs to assign the global variable symbol value in symbol table to in-section offset. The size of memory is used to specify the amount of the memory a variable occupies. Typically, it equals the type size, but for certain structures, e.g., struct tt { int a; int b; char c[]; }; static volatile struct tt s2 = {3, 4, "abcdefghi"}; The static variable s2 has size of 20. Note that for BTF_KIND_DATASEC name, the section name does not contain object name. The compiler does have input module name. For example, two cases below: . clang -target bpf -O2 -g -c test.c The compiler knows the input file (module) is test.c and can generate sec name like test.data/test.bss etc. . clang -target bpf -O2 -g -emit-llvm -c test.c -o - | llc -march=bpf -filetype=obj -o test.o The llc compiler has the input file as stdin, and would generate something like stdin.data/stdin.bss etc. which does not really make sense. For any user specified section name, e.g., static volatile int a __attribute__((section("id1"))); static volatile const int b __attribute__((section("id2"))); The DataSec with name "id1" and "id2" does not contain information whether the section is readonly or not. The loader needs to check the corresponding elf section flags for such information. A simple example: -bash-4.4$ cat t.c int g1; int g2 = 3; const int g3 = 4; static volatile int s1; struct tt { int a; int b; char c[]; }; static volatile struct tt s2 = {3, 4, "abcdefghi"}; static volatile const int s3 = 4; int m __attribute__((section("maps"), used)) = 4; int test() { return g1 + g2 + g3 + s1 + s2.a + s3 + m; } -bash-4.4$ clang -target bpf -O2 -g -S t.c Checking t.s, 4 BTF_KIND_VAR's are generated (s1, s2, s3 and m). 4 BTF_KIND_DATASEC's are generated with names ".data", ".bss", ".rodata" and "maps". Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D59441 llvm-svn: 356326	2019-03-16 15:36:31 +00:00
Simon Pilgrim	f2c53b5d6c	[X86][SSE] Constant fold PEXTRB/PEXTRW/EXTRACT_VECTOR_ELT nodes. Replaces existing i1-only fold. llvm-svn: 356325	2019-03-16 15:02:00 +00:00
Simon Pilgrim	0f472e1d01	[X86] Add SimplifyDemandedBitsForTargetNode support for PEXTRB/PEXTRW Improved constant folding for PEXTRB/PEXTRW will be added in a future commit llvm-svn: 356324	2019-03-16 14:29:50 +00:00
Heejin Ahn	66ce419468	[WebAssembly] Make rethrow take an except_ref type argument Summary: In the new wasm EH proposal, `rethrow` takes an `except_ref` argument. This change was missing in r352598. This patch adds `llvm.wasm.rethrow.in.catch` intrinsic. This is an intrinsic that's gonna eventually be lowered to wasm `rethrow` instruction, but this intrinsic can appear only within a catchpad or a cleanuppad scope. Also this intrinsic needs to be invokable - otherwise EH pad successor for it will not be correctly generated in clang. This also adds lowering logic for this intrinsic in `SelectionDAGBuilder::visitInvoke`. This routine is basically a specialized and simplified version of `SelectionDAGBuilder::visitTargetIntrinsic`, but we can't use it because it is only for `CallInst`s. This deletes the previous `llvm.wasm.rethrow` intrinsic and related tests, which was meant to be used within a `__cxa_rethrow` library function. Turned out this needs some more logic, so the intrinsic for this purpose will be added later. LateEHPrepare takes a result value of `catch` and inserts it into matching `rethrow` as an argument. `RETHROW_IN_CATCH` is a pseudo instruction that serves as a link between `llvm.wasm.rethrow.in.catch` and the real wasm `rethrow` instruction. To generate a `rethrow` instruction, we need an `except_ref` argument, which is generated from `catch` instruction. But `catch` instructions are added in LateEHPrepare pass, so we use `RETHROW_IN_CATCH`, which takes no argument, until we are able to correctly lower it to `rethrow` in LateEHPrepare. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59352 llvm-svn: 356316	2019-03-16 05:38:57 +00:00
Heejin Ahn	b47a18cd4b	[WebAssembly] Method order change in LateEHPrepare (NFC) Summary: Currently the order of these methods does not matter, but the following CL needs to have this order changed. Merging the order change and the semantics change within a CL complicates the diff, so submitting the order change first. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59342 llvm-svn: 356315	2019-03-16 04:46:05 +00:00
Heejin Ahn	a41250c7be	[WebAssembly] Irreducible control flow rewrite Summary: Rewrite WebAssemblyFixIrreducibleControlFlow to a simpler and cleaner design, which directly computes reachability and other properties itself. This avoids previous complexity and bugs. (The new graph analyses are very similar to how the Relooper algorithm would find loop entries and so forth.) This fixes a few bugs, including where we had a false positive and thought fannkuch was irreducible when it was not, which made us much larger and slower there, and a reverse bug where we missed irreducibility. On fannkuch, we used to be 44% slower than asm2wasm and are now 4% faster. Reviewers: aheejin Subscribers: jdoerfert, mgrang, dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D58919 Patch by Alon Zakai (kripken) llvm-svn: 356313	2019-03-16 03:00:19 +00:00
Amara Emerson	7097e83dab	[GlobalISel] Make isel verification checks of vregs run under NDEBUG only. llvm-svn: 356309	2019-03-16 01:02:10 +00:00
Fedor Sergeev	6a9c2f4f98	[TimePasses] allow -time-passes reporting into a custom stream TimePassesHandler object (implementation of time-passes for new pass manager) gains ability to report into a stream customizable per-instance (per pipeline). Intended use is to specify separate time-passes output stream per each compilation, setting up TimePasses member of StandardInstrumentation during PassBuilder setup. That allows to get independent non-overlapping pass-times reports for parallel independent compilations (in JIT-like setups). By default it still puts timing reports into the info-output-file stream (created by CreateInfoOutputFile every time report is requested). Unit-test added for non-default case, and it also allowed to discover that print() does not work as declared - it did not reset the timers, leading to yet another report being printed into the default stream. Fixed print() to actually reset timers according to what was declared in print's comments before. Reviewed By: philip.pfaffe Differential Revision: https://reviews.llvm.org/D59366 llvm-svn: 356305	2019-03-15 22:15:23 +00:00
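
The [DAGCombine] entry above (9a4453592b) relies on the identity that (x & ~y) | y equals x | y in every operand order. The following is a minimal standalone sketch that checks the identity exhaustively. It is not LLVM code: the 8-bit width and the helper names `commuted_forms` and `fold_is_sound` are assumptions chosen for illustration only.

```python
from itertools import product

WIDTH = 8                    # assumed bit width, small enough to enumerate
MASK = (1 << WIDTH) - 1      # keeps ~y within WIDTH bits


def commuted_forms(x, y):
    """Return four operand orderings of (x & ~y) | y, truncated to WIDTH bits."""
    not_y = ~y & MASK
    return (
        (x & not_y) | y,
        (not_y & x) | y,
        y | (x & not_y),
        y | (not_y & x),
    )


def fold_is_sound():
    """True if every ordering equals x | y for all WIDTH-bit x and y."""
    return all(
        form == (x | y)
        for x, y in product(range(MASK + 1), repeat=2)
        for form in commuted_forms(x, y)
    )


if __name__ == "__main__":
    print(fold_is_sound())
```

The check mirrors the algebra: (x & ~y) | y distributes to (x | y) & (~y | y), and the second factor is all ones, so only x | y remains.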


121287 Commits