llvm-project

Commit Graph

Author	SHA1	Message	Date
Mircea Trofin	2c3ab66539	[llvm] Skip over empty line table entries. Summary: This is similar to how addr2line handles consecutive entries with the same address - pick the last one. Reviewers: dblaikie, friss, JDevlieghere Reviewed By: dblaikie Subscribers: eugenis, vitalybuka, echristo, JDevlieghere, probinson, aprantl, hiraditya, rupprecht, jdoerfert, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D58952 llvm-svn: 356265	2019-03-15 15:00:12 +00:00
Mikael Holmen	339daae806	[CodeGenPrepare] avoid crashing from replacing a phi twice Summary: This is a fix to bug 41052: https://bugs.llvm.org/show_bug.cgi?id=41052 While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi. Patch by: JesperAntonsson Reviewers: skatkov, aprantl Reviewed By: aprantl Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59358 llvm-svn: 356260	2019-03-15 13:51:05 +00:00
Sam Parker	f82d4ed771	[ARM] Remove EarlyCSE from backend There is an issue with early CSE hitting an assert, so temporarily remove the pass from the Arm backend. Bug: https://bugs.llvm.org/show_bug.cgi?id=41081 Differential Revision: https://reviews.llvm.org/D59410 llvm-svn: 356259	2019-03-15 13:36:37 +00:00
Michael Liao	6883d7e192	[AMDGPU] Fix SGPR fixing through SCC chaining Summary: - During the fixing of SGPR copying from VGPR, ensure users of SCC is properly propagated, i.e. * only propagate through live def of SCC, * skip the SCC-def inst itself, and * stop the propagation on the other SCC-def inst after checking its SCC-use first. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59362 llvm-svn: 356258	2019-03-15 12:42:21 +00:00
Florian Hahn	728293ac87	[LSR] Update test from rL356256 after rebase. llvm-svn: 356257	2019-03-15 12:37:50 +00:00
Florian Hahn	d9e88f7b7f	[LSR] Check for signed overflow in NarrowSearchSpaceByDetectingSupersets. We are adding a sign extended IR value to an int64_t, which can cause signed overflows, as in the attached test case, where we have a formula with BaseOffset = -1 and a constant with numeric_limits<int64_t>::min(). If the addition would overflow, skip the simplification for this formula. Note that the target triple is required to trigger the failure. Reviewers: qcolombet, gilr, kparzysz, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59211 llvm-svn: 356256	2019-03-15 12:17:36 +00:00
Simon Pilgrim	398f9bb434	[SPARC] Regenerate label test for D59363 llvm-svn: 356253	2019-03-15 11:24:17 +00:00
Simon Pilgrim	22bebcbbbf	[ARM] Remove icmp undef from reduced tests Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @efriedma (Eli Friedman) llvm-svn: 356252	2019-03-15 11:14:59 +00:00
Simon Pilgrim	918d0c2ba6	[WebAssembly] Remove icmp undef in stackify test Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC) Approved by @tlively (Thomas Lively) llvm-svn: 356251	2019-03-15 11:13:26 +00:00
Simon Pilgrim	0ad17402a9	[X86][SSE] Attempt to convert SSE shift-by-var to shift-by-imm. Prep work for PR40203 llvm-svn: 356249	2019-03-15 11:05:42 +00:00
James Henderson	b10f48bbb4	[yaml2obj]Allow explicit setting of p_filesz, p_memsz, and p_offset yaml2obj currently derives the p_filesz, p_memsz, and p_offset values of program headers from their sections. This makes writing tests for certain formats more complex, and sometimes impossible. This patch allows setting these fields explicitly, overriding the default value, when relevant. Reviewed by: jakehehrlich, Higuoxing Differential Revision: https://reviews.llvm.org/D59372 llvm-svn: 356247	2019-03-15 10:35:27 +00:00
Sam Parker	9e73020bfa	[ARM][ParallelDSP] Disable for big-endian Bail early when we don't have a preheader and also if the target is big endian because it's written with only little endian in mind! Differential Revision: https://reviews.llvm.org/D59368 llvm-svn: 356243	2019-03-15 10:19:32 +00:00
Petar Avramovic	3e0da146ac	[MIPS GlobalISel] Improve selection of constants Certain 32 bit constants can be generated with a single instruction instead of two. Implement materialize32BitImm function for MIPS32. Differential Revision: https://reviews.llvm.org/D59369 llvm-svn: 356238	2019-03-15 07:07:50 +00:00
Yonghong Song	cacac05aca	[BPF] do not generate unused local/global types The kernel currently has a limit for # of types to be 64KB and the size of string subsection to be 64KB. A simple bcc tool runqlat.py generates: . the size of ~33KB type section, roughly ~10K types . the size of ~17KB string section The majority type is from the types referenced by local variables in the bpf program. For example, the kernel "task_struct" itself recursively brings in ~900 other types. This patch did the following optimization to avoid generating unused types: . do not generate types for local variables unless they are function arguments. . do not generate types for external globals. If an external global is not used in the program, llvm already removes it from IR, so global variable saving is typical small. For runqlat.py, only one variable "llvm.used" is the external global. The types for locals and external globals can be added back once there is a usage for them. After the above optimization, the runqlat.py generates: . the size of ~1.5KB type section, roughtly 500 types . the size of ~0.7KB string section UPDATE: resubmitted the patch after previous revert with the following fix: use Global.hasExternalLinkage() to test "external" linkage instead of using Global.getInitializer(), which will assert on external variables. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 356234	2019-03-15 05:51:25 +00:00
Yonghong Song	bf3a279bce	Revert "[BPF] do not generate unused local/global types" This reverts commit r356232. Reason: test failure with ASSERT on enabled build. llvm-svn: 356233	2019-03-15 05:02:19 +00:00
Yonghong Song	5664d4c8ca	[BPF] do not generate unused local/global types The kernel currently has a limit for # of types to be 64KB and the size of string subsection to be 64KB. A simple bcc tool runqlat.py generates: . the size of ~33KB type section, roughly ~10K types . the size of ~17KB string section The majority type is from the types referenced by local variables in the bpf program. For example, the kernel "task_struct" itself recursively brings in ~900 other types. This patch did the following optimization to avoid generating unused types: . do not generate types for local variables unless they are function arguments. . do not generate types for external globals. If an external global is not used in the program, llvm already removes it from IR, so global variable saving is typical small. For runqlat.py, only one variable "llvm.used" is the external global. The types for locals and external globals can be added back once there is a usage for them. After the above optimization, the runqlat.py generates: . the size of ~1.5KB type section, roughtly 500 types . the size of ~0.7KB string section Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 356232	2019-03-15 04:42:01 +00:00
Matt Arsenault	1d83670dbd	AMDGPU: Remove intrinsic operand assert Before r355981, this was under LLVM_DEBUG. I don't think the assert is quite right, but this really should be a verifier check. Instcombine should not be asserting on this sort of thing. llvm-svn: 356219	2019-03-14 23:45:09 +00:00
Sanjay Patel	2c9275a790	[CGP] add another bailout for degenerate code (PR41064) This is almost the same as: rL355345 ...and should prevent any potential crashing from examples like: https://bugs.llvm.org/show_bug.cgi?id=41064 ...although the bug was masked by: rL355823 ...and I'm not sure how to repro the problem after that change. llvm-svn: 356218	2019-03-14 23:14:31 +00:00
Paul Robinson	96c1f2cd6c	Tighten up tests that use -debugify as a shortcut. NFC These now verify that a given instruction has a specific source location, rather than any old location. We want to make sure we propagate the correct locations from one instruction to another. llvm-svn: 356217	2019-03-14 23:09:17 +00:00
Eli Friedman	fb26c329af	[MC] Sort FDEs by the associated CIE before emitting them. This isn't necessary according to the DWARF standard, but it matches the .eh_frame sections emitted by other tools in practice, and the Android libunwindstack rejects .eh_frame sections where an FDE refers to a CIE other than the closest previous CIE. So match the other tools and also sort accordingly. I consider this a bug in libunwindstack, but it's easy enough to emit a compatible .eh_frame section for compatibility with installed operating systems. Differential Revision: https://reviews.llvm.org/D58266 llvm-svn: 356216	2019-03-14 23:08:19 +00:00
Matt Arsenault	bc6d07ca46	MIR: Allow targets to serialize MachineFunctionInfo This has been a very painful missing feature that has made producing reduced testcases difficult. In particular the various registers determined for stack access during function lowering were necessary to avoid undefined register errors in a large percentage of cases. Implement a subset of the important fields that need to be preserved for AMDGPU. Most of the changes are to support targets parsing register fields and properly reporting errors. The biggest sort-of bug remaining is for fields that can be initialized from the IR section will be overwritten by a default initialized machineFunctionInfo section. Another remaining bug is the machineFunctionInfo section is still printed even if empty. llvm-svn: 356215	2019-03-14 22:54:43 +00:00
Jessica Paquette	7d6784f522	[AArch64][GlobalISel] Add isel support for G_UADDO on s32s and s64s This adds instruction selection support for G_UADDO on s32s and s64s. Also - Add an instruction selection test - Update the arm64-xaluo.ll test to show that we generate the correct assembly Differential Revision: https://reviews.llvm.org/D58734 llvm-svn: 356214	2019-03-14 22:54:29 +00:00
Amara Emerson	d61b89be8d	[AArch64][GlobalISel] Implement selection for G_UNMERGE of vectors to vectors. This re-uses the previous support for extract vector elt to extract the subvectors. Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356213	2019-03-14 22:48:18 +00:00
Amara Emerson	2ff2298c3e	[AArch64][GlobalISel] Add some support for G_CONCAT_VECTORS. Handles concatenating 2 x v2s32 and 2 x v4s16 Differential Revision: https://reviews.llvm.org/D59390 llvm-svn: 356212	2019-03-14 22:48:15 +00:00
Jordan Rupprecht	12ed01dcf9	[llvm-strip] Hook up (unimplemented) --only-keep-debug For ELF, we accept but ignore --only-keep-debug. Do the same for llvm-strip. COFF does implement this, so update the test that it is supported. llvm-svn: 356207	2019-03-14 21:51:42 +00:00
Adrian Prantl	07b97492d4	Add test I forgot to git-add in r356163. llvm-svn: 356205	2019-03-14 21:23:52 +00:00
Nikita Popov	48eb21ee5f	[InstCombine] Add tests for range-based saturing math overflow; NFC Tests for cases where overflow can be determined, but not based on known bits. llvm-svn: 356203	2019-03-14 21:06:46 +00:00
Pete Couperus	9fd1848823	[ARC] Add more load/store variants. On ARC ISA, general format of load instruction is this: LD<zz><.x><.aa><.di> a, [b,c] And general format of store is this: ST<zz><.aa><.di> c, [b,s9] Where: <zz> is data size field and can be one of <empty> (bits 00) - Word (32-bit), default behavior B (bits 01) - Byte H (bits 10) - Half-word (16-bit) <.x> is data extend mode: <empty> (bit 0) - If size is not Word(32-bit), then data is zero extended X (bit 1) - If size is not Word(32-bit), then data is sign extended <.aa> is address write-back mode: <empty> (bits 00) - no write-back .AW (bits 01) - Preincrement, base register updated pre memory transaction .AB (bits 10) - Postincrement, base register updated post memory transaction <.di> is cache bypass mode: <empty> (bit 0) - Cached memory access, default mode .DI (bit 1) - Non-cached data memory access This patch adds these load/store instruction variants to the ARC backend. Patch By Denis Antrushin! <denis@synopsys.com> Differential Revision: https://reviews.llvm.org/D58980 llvm-svn: 356200	2019-03-14 20:50:54 +00:00
Sanjay Patel	38f07b1966	[InstCombine] remove duplicate tests These got accidentally doubled with rL356191. llvm-svn: 356195	2019-03-14 19:41:21 +00:00
Sanjay Patel	de1d5d3675	[InstCombine] canonicalize funnel shift constant shift amount to be modulo bitwidth The shift argument is defined to be modulo the bitwidth, so if that argument is a constant, we can always reduce the constant to its minimal form to allow better CSE and other follow-on transforms. We need to be careful to ignore constant expressions here, or we will likely infinite loop. I'm adding a general vector constant query for that case. Differential Revision: https://reviews.llvm.org/D59374 llvm-svn: 356192	2019-03-14 19:22:08 +00:00
Sanjay Patel	6e86216531	[InstCombine] add tests for funnel shift constant shift amount mod bitwidth; NFC llvm-svn: 356191	2019-03-14 19:22:00 +00:00
Philip Reames	81abc7fb0c	[Tests] Add tests to demonstrate hoisting of unordered invariant loads llvm-svn: 356184	2019-03-14 18:06:15 +00:00
Philip Reames	9616cf0510	[Tests] Revert an accident change to a test llvm-svn: 356183	2019-03-14 18:02:19 +00:00
Jessica Paquette	5aff1f475c	[GlobalISel][AArch64] Add partial selection support for G_INSERT_VECTOR_ELT This adds support for inserting elements into packed vectors. It also adds two tests: one for selection, and one for regbank select. Unpacked vectors will come in a follow-up. Differential Revision: https://reviews.llvm.org/D59325 llvm-svn: 356182	2019-03-14 18:01:30 +00:00
Philip Reames	c53f02a32a	Auto-generate an existing test to make it easier to update llvm-svn: 356181	2019-03-14 17:59:59 +00:00
Max Moroz	a80d9ce5cf	Speeding up llvm-cov export with multithreaded renderFiles implementation. Summary: CoverageExporterJson::renderFiles accounts for most of the execution time given a large profdata file with multiple binaries. Proposed solution is to generate JSON for each file in parallel and sort at the end to preserve deterministic output. Also added flags to skip generating parts of the output to trim the output size. Patch by Sajjad Mirza (@sajjadm). Reviewers: Dor1s, vsk Reviewed By: Dor1s, vsk Subscribers: liaoyuke, mgrang, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59277 llvm-svn: 356178	2019-03-14 17:49:27 +00:00
Sanjay Patel	43570a0a62	[InstCombine] add tests for funnel shift constant shift amount mod bitwidth; NFC llvm-svn: 356175	2019-03-14 17:39:40 +00:00
Philip Reames	af41b282c5	[Tests] Add tests for reordering of unordered atomics on invariant locations llvm-svn: 356172	2019-03-14 17:36:58 +00:00
Philip Reames	70d156991c	Allow code motion (and thus folding) for atomic (but unordered) memory operands Building on the work done in D57601, now that we can distinguish between atomic and volatile memory accesses, go ahead and allow code motion of unordered atomics. As seen in the diffs, this allows much better folding of memory operations into using instructions. (Mostly done by the PeepholeOpt pass.) Note: I have not reviewed all callers of hasOrderedMemoryRef since one of them - isSafeToMove - is very widely used. I'm relying on the documented semantics of each method to judge correctness. Differential Revision: https://reviews.llvm.org/D59345 llvm-svn: 356170	2019-03-14 17:20:59 +00:00
Philip Reames	8dd9b54d9b	[Tests] Add negative folding tests w/fences as requested in D59345 llvm-svn: 356165	2019-03-14 17:05:18 +00:00
Craig Topper	c747ac3f93	[X86] Fix the pattern changes from r356121 so that the RORr1/RORm1 pattern use the rotr opcode. These instructions used to use rotl with a bitwidth-1 immediate. I changed the immediate to 1, but failed to change the opcode. Thankfully this seems to have not caused a functional issue because we now had two rotl by 1 patterns, but the correct ones were earlier and took priority. So we just missed some optimization. llvm-svn: 356164	2019-03-14 16:53:24 +00:00
Adrian Prantl	e69917f166	Add IR debug info support for Elemental, Pure, and Recursive Procedures. Patch by Eric Schweitz! Differential Revision: https://reviews.llvm.org/D54043 llvm-svn: 356163	2019-03-14 16:29:54 +00:00
Sam Parker	0a833d0ad2	[NFC][ARM] Update test Change some regex to handle commutable instructions. llvm-svn: 356159	2019-03-14 15:36:54 +00:00
Sanjay Patel	5d1df114e8	[x86] prevent infinite looping from vselect commutation (PR41066) This is an immediate fix for: https://bugs.llvm.org/show_bug.cgi?id=41066 ...but as noted there and the code comments, we should do better by stubbing this out sooner. llvm-svn: 356158	2019-03-14 15:32:34 +00:00
Matt Arsenault	133716929c	GlobalISel: Use multiple returns for intrinsic structs This is consistent with what SelectionDAG does and is much easier to work with than the extract sequence with an artificial wide register. For the AMDGPU control flow intrinsics, this was producing an s128 for the i64, i1 tuple return. Any legalization that should apply to a real s128 value would badly obscure the direct values that need to be seen. llvm-svn: 356147	2019-03-14 14:18:56 +00:00
Matt Arsenault	4e3e4016bf	ARM: Add ImmArg to intrinsics I found these by asserting in clang for any GCCBuiltin that doesn't require mangling and requires a constant for the builtin. This means that intrinsics are missing which don't use GCCBuiltin, don't have builtins defined in clang, or were missing the constant annotation in the builtin definition. llvm-svn: 356144	2019-03-14 13:46:14 +00:00
Simon Pilgrim	238a94c4b6	[SystemZ] Remove icmp undef Prep-work for PR40800 (Add UNDEF handling to SelectionDAG::FoldSetCC) llvm-svn: 356138	2019-03-14 11:56:41 +00:00
Simon Pilgrim	5bcd59bc84	[SystemZ] Regenerate tests to make complete codegen more obvious llvm-svn: 356137	2019-03-14 11:54:46 +00:00
James Henderson	b5de5e25de	[llvm-objcopy]Don't implicitly strip sections in segments This patch changes llvm-objcopy's behaviour to not strip sections that are in segments, if they otherwise would be due to a stripping operation (--strip-all, --strip-sections, --strip-non-alloc). This preserves the segment contents. It does not change the behaviour of --strip-all-gnu (although we could choose to do so), because GNU objcopy's behaviour in this case seems to be to strip the section, nor does it prevent removing of sections in segments with --remove-section (if a user REALLY wants to remove a section, we should probably let them, although I could be persuaded that warning might be appropriate). Tests have been added to show this latter behaviour. This fixes https://bugs.llvm.org/show_bug.cgi?id=41006. Reviewed by: grimar, rupprecht, jakehehrlich Differential Revision: https://reviews.llvm.org/D59293 This is a reland of r356129, attempting to fix greendragon failures due to a suspected compatibility issue with od on the greendragon bots versus other versions. llvm-svn: 356136	2019-03-14 11:47:41 +00:00
James Henderson	e81f5f91b4	Revert r356129 due to greendragon bot failures llvm-svn: 356133	2019-03-14 11:23:04 +00:00
Sam Parker	4c4ff13d3c	[ARM][ParallelDSP] Enable multiple uses of loads When choosing whether a pair of loads can be combined into a single wide load, we check that the load only has a sext user and that sext also only has one user. But this can prevent the transformation in the cases when parallel macs use the same loaded data multiple times. To enable this, we need to fix up any other uses after creating the wide load: generating a trunc and a shift + trunc pair to recreate the narrow values. We also need to keep a record of which loads have already been widened. Differential Revision: https://reviews.llvm.org/D59215 llvm-svn: 356132	2019-03-14 11:14:13 +00:00
Sam Parker	3b2ba20afd	[ARM] Run ARMParallelDSP in the IRPasses phase Run EarlyCSE before ParallelDSP and do this in the backend IR opt phase. Differential Revision: https://reviews.llvm.org/D59257 llvm-svn: 356130	2019-03-14 10:57:40 +00:00
James Henderson	c03a95d465	[llvm-objcopy]Don't implicitly strip sections in segments This patch changes llvm-objcopy's behaviour to not strip sections that are in segments, if they otherwise would be due to a stripping operation (--strip-all, --strip-sections, --strip-non-alloc). This preserves the segment contents. It does not change the behaviour of --strip-all-gnu (although we could choose to do so), because GNU objcopy's behaviour in this case seems to be to strip the section, nor does it prevent removing of sections in segments with --remove-section (if a user REALLY wants to remove a section, we should probably let them, although I could be persuaded that warning might be appropriate). Tests have been added to show this latter behaviour. This fixes https://bugs.llvm.org/show_bug.cgi?id=41006. Reviewed by: grimar, rupprecht, jakehehrlich Differential Revision: https://reviews.llvm.org/D59293 llvm-svn: 356129	2019-03-14 10:20:27 +00:00
Alex Bradbury	d08ed38e08	[RISCV] Extend test/CodeGen/RISCV/callee-saved-* to test getCalleePreservedRegs Add a caller which exhausts regs then calls another function. This allows getCalleePreservedRegs to be tested. llvm-svn: 356122	2019-03-14 08:17:44 +00:00
Craig Topper	54a0b53308	[X86] Add patterns for rotr by immediate to fix PR41057. Prior to the introduction of funnel shift intrinsics we could count on rotate by immediates prefering to use rotl since that's what MatchRotate would check first. The or+shift pattern doesn't have a direction so one must be chosen arbitrarily. With funnel shift, there is a direction and fshr will try to use rotr first. While fshl will try to use rotl first. This patch adds the isel patterns for rotr to complement the rotl patterns. I've put the rotr by 1 patterns in the instruction patterns. And moved the rotl by bitwidth-1 patterns to separate Pat patterns. Fixes PR41057. llvm-svn: 356121	2019-03-14 07:07:26 +00:00
Craig Topper	c867847016	[X86] Add various test cases for PR41057. NFC llvm-svn: 356120	2019-03-14 07:07:24 +00:00
Quentin Colombet	e77e5f44b8	[GlobalISel][Utils] Add a getConstantVRegVal variant that looks through instrs getConstantVRegVal used to only look for G_CONSTANT when looking at unboxing the value of a vreg. However, constants are sometimes not directly used and are hidden behind trunc, s\|zext or copy chain of computation. In particular this may be introduced by the legalization process that doesn't want to simplify these patterns because it can lead to infine loop when legalizing a constant. To circumvent that problem, add a new variant of getConstantVRegVal, named getConstantVRegValWithLookThrough, that allow to look through extensions. Differential Revision: https://reviews.llvm.org/D59227 llvm-svn: 356116	2019-03-14 01:37:13 +00:00
Douglas Yung	591040adc2	Fixup tests to check for any MCInst number instead of a specific one. llvm-svn: 356115	2019-03-14 01:24:35 +00:00
Craig Topper	fad96a1588	[X86] Add 64-bit mode command lines to rot32.ll so that it will demonstrate PR41055 for 32 bit. NFC llvm-svn: 356112	2019-03-14 00:23:31 +00:00
Jordan Rupprecht	42bc1e241c	[llvm-objcopy] Cleanup errors from CopyConfig and remove llvm-objcopy.h dependency error() was previously cleaned up from CopyConfig, but new uses were introduced. This also tweaks the error message for --add-symbol to report all invalid flags. llvm-svn: 356105	2019-03-13 22:26:01 +00:00
Matt Arsenault	0253620f89	Verifier: Make sure masked load/store alignment is a power of 2 The same should also be done for scatter/gather, but the verifier doesn't check those at all now. llvm-svn: 356094	2019-03-13 19:46:34 +00:00
Matt Arsenault	24e249ec01	SystemZ: Add ImmArg to intrinsics I found these by asserting in clang for any GCCBuiltin that doesn't require mangling and requires a constant for the builtin. This means that intrinsics are missing which don't use GCCBuiltin, don't have builtins defined in clang, or were missing the constant annotation in the builtin definition. llvm-svn: 356091	2019-03-13 19:46:32 +00:00
Matt Arsenault	88dc015a92	Mips: Add ImmArg to intrinsics I found these by asserting in clang for any GCCBuiltin that doesn't require mangling and requires a constant for the builtin. This means that intrinsics are missing which don't use GCCBuiltin, don't have builtins defined in clang, or were missing the constant annotation in the builtin definition. I'm not sure what's going on with the immediates.ll test. It seems to be intended to test invalid cases like this, but then tries to handle some of them anyway. I've moved the cases that were inconsistent with the GCCBuiltin definition so they don't test the codegen anymore. llvm-svn: 356085	2019-03-13 19:07:59 +00:00
Simon Pilgrim	e15cd7909b	[X86] Remove icmp undef in more reduced tests llvm-svn: 356084	2019-03-13 19:07:54 +00:00
Simon Pilgrim	8f1b825068	[X86] Regenerate tail call tests llvm-svn: 356083	2019-03-13 19:04:45 +00:00
Tim Renouf	ed0b9af997	[AMDGPU] Switched HSA metadata to use MsgPackDocument Summary: MsgPackDocument is the lighter-weight replacement for MsgPackTypes. This commit switches AMDGPU HSA metadata processing to use MsgPackDocument instead of MsgPackTypes. Differential Revision: https://reviews.llvm.org/D57024 Change-Id: I0751668013abe8c87db01db1170831a76079b3a6 llvm-svn: 356081	2019-03-13 18:55:50 +00:00
Craig Topper	84abec2855	[X86] Check for 64-bit mode in X86Subtarget::hasCmpxchg16b() The feature flag alone can't be trusted since it can be passed via -mattr. Need to ensure 64-bit mode as well. We had a 64 bit mode check on the instruction to make the assembler work correctly. But we weren't guarding any of our lowering code or the hooks for the AtomicExpandPass. I've added 32-bit command lines to atomic128.ll with and without cx16. The tests there would all previously fail if -mattr=cx16 was passed to them. I had to move one test case for f128 to a new file as it seems to have a different 32-bit mode or possibly sse issue. Differential Revision: https://reviews.llvm.org/D59308 llvm-svn: 356078	2019-03-13 18:48:50 +00:00
Simon Pilgrim	e1be3403ff	[X86] Avoid icmp undef in reduced tests Because we don't currently simplify icmp with undef in DAG, bugpoint loves to introduce them during reduction. This is a small step towards re-adding non-undef values into some of the simpler tests so that they should still test correctly and emit similar/same codegen. Prep work for PR40800 ([SelectionDAG] Add UNDEF handling to SelectionDAG::FoldSetCC). llvm-svn: 356076	2019-03-13 18:36:59 +00:00
Alex Bradbury	bd1c56648f	[RISCV] Regenerate test/CodeGen/RISCV/legalize-fneg.ll after rL356068 rL356068 caused some minor re-orderings. Regenerate legalize-fneg.ll to reflect this, and remove the NOLIB check lines (they're redundant given that the RV32I and RV64I check lines generated by update_llc_test_checks.py already demonstrate there is no libcall). llvm-svn: 356074	2019-03-13 18:25:23 +00:00
Simon Pilgrim	510f26dca8	Regenerate test llvm-svn: 356071	2019-03-13 18:18:24 +00:00
Nirav Dave	d6351340bb	[DAGCombiner] If a TokenFactor would be merged into its user, consider the user later. Summary: A number of optimizations are inhibited by single-use TokenFactors not being merged into the TokenFactor using it. This makes we consider if we can do the merge immediately. Most tests changes here are due to the change in visitation causing minor reorderings and associated reassociation of paired memory operations. CodeGen tests with non-reordering changes: X86/aligned-variadic.ll -- memory-based add folded into stored leaq value. X86/constant-combiners.ll -- Optimizes out overlap between stores. X86/pr40631_deadstore_elision -- folds constant byte store into preceding quad word constant store. Reviewers: RKSimon, craig.topper, spatel, efriedma, courbet Reviewed By: courbet Subscribers: dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, eraman, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59260 llvm-svn: 356068	2019-03-13 17:07:09 +00:00
Simon Pilgrim	bef4fe056d	[X86][AVX] Add X86ISD::VTRUNC handling to SimplifyDemandedVectorEltsForTargetNode llvm-svn: 356067	2019-03-13 17:00:18 +00:00
Simon Pilgrim	d9aa879b67	[X86][AVX] Add combineConcatVectors support to improve subvector handling Attempt to combine CONCAT_VECTORS nodes, which we only really have pre-legalization. This encourages a lot of X86ISD::SUBV_BROADCAST generation, so I've added SimplifyDemandedVectorEltsForTargetNode handling for this at the same time. The X86ISD::VTRUNC regression in shuffle-vs-trunc-256-widen.ll will be handled in a future commit. llvm-svn: 356064	2019-03-13 16:37:30 +00:00
Alex Bradbury	8a70468a27	[RISCV] Only mark fp as reserved if the function has a dedicated frame pointer This follows similar logic in the ARM and Mips backends, and allows the free use of s0 in functions without a dedicated frame pointer. The changes in callee-saved-gprs.ll most clearly show the effect of this patch. llvm-svn: 356063	2019-03-13 16:33:45 +00:00
Alex Bradbury	7d546aba6c	[RISCV] Add tests for callee-saved GPRs, FPR32s, and FPR64s Note that s0 need not be marked reserved if the frame pointer isn't used. For the ILP32 and LP64 soft float ABIS that are currently support, all FPRs are always considered temporaries. llvm-svn: 356061	2019-03-13 16:14:16 +00:00
Sander de Smalen	72fc7b842c	[AArch64] Add test/CodeGen/AArch64/vecreduce-fadd.ll This test is added to see difference created by: https://reviews.llvm.org/D59259 llvm-svn: 356054	2019-03-13 15:18:27 +00:00
Sanjay Patel	0a251e4076	[x86] limit extractelement of setcc to pre-legalization A fuzzer found the crasher: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13700 The bug was introduced recently here: rL355741 This is the quick fix. If we need to do this transform later, then we'd have to extend/truncate the vector setcc element type to the scalar setcc type (i8). llvm-svn: 356053	2019-03-13 14:49:52 +00:00
Simon Atanasyan	9bfd140ddb	[mips] Fix encoding of the `mov.d` command for microMIPS R6 Before this change LLVM emits non-microMIPS variant of the `mov.d` command for microMIPS code. Differential Revision: http://reviews.llvm.org/D59045 llvm-svn: 356052	2019-03-13 14:23:12 +00:00
Clement Courbet	3bb5d0bb9b	Re-land r354244 "[DAGCombiner] Eliminate dead stores to stack." Always check candidates for hasOtherUses(), not only stores. llvm-svn: 356050	2019-03-13 13:56:23 +00:00
Simon Atanasyan	b9d9e0be3c	[mips] Map SW instruction to its microMIPS R6 variant To provide mapping between standard and microMIPS R6 variants of the `sw` command we have to rename SWSP_xxx commands from "sw" to "swsp". Otherwise `tablegen` starts to show the error `Multiple matches found for `SW'`. After that to restore printing SWSP command as `sw`, I add an appropriate `MipsInstAlias` instance. We also need to implement "size reduction" for microMIPS R6. But this task is for separate patch. After that the `micromips-lwsp-swsp.ll` test case will be extended. Differential Revision: http://reviews.llvm.org/D59046 llvm-svn: 356045	2019-03-13 13:09:30 +00:00
Alex Bradbury	192df587d1	[RISCV] Regenerate umulo-128-legalisation-lowering.ll Upstream changes have improved codegen, reducing stack usage. Regenerate the test. llvm-svn: 356044	2019-03-13 12:33:44 +00:00
Simon Pilgrim	7abbd70300	[X86][AVX] lowerShuffleAsBroadcast - improve load folding by avoiding bitcasts AVX1 broadcasts were failing as we were adding bitcasts that caused MayFoldLoad's hasOneUse to return false. This patch stops introducing bitcasts so early and also replaces the broadcast index scaling through bitcasts (which can't succeed in some cases) to instead just keep track of the bitoffset which can be converted back to the broadcast index later on. Differential Revision: https://reviews.llvm.org/D58888 llvm-svn: 356043	2019-03-13 12:20:39 +00:00
Simon Pilgrim	360ce82db2	[DAG] Move integer setcc %x, %x folding into FoldSetCC First step towards PR40800 - I intend to move the float case in a separate future patch. I had to tweak the (overly reduced) thumb2 test and the x86 widening test change is annoying (no longer rematerializable) but we should address this separately. Differential Revision: https://reviews.llvm.org/D59244 llvm-svn: 356040	2019-03-13 11:08:57 +00:00
Simon Atanasyan	c2b975a75c	[MIPS][microMIPS] Fix PseudoMTLOHI_MM matching and expansion On micromips MipsMTLOHI is always matched to PseudoMTLOHI_DSP regardless of +dsp argument. This patch checks is HasDSP predicate is present for PseudoMTLOHI_DSP so PseudoMTLOHI_MM can be matched when appropriate. Add expansion of PseudoMTLOHI_MM instruction into a mtlo/mthi pair. Patch by Mirko Brkusanin. Differential Revision: http://reviews.llvm.org/D59203 llvm-svn: 356039	2019-03-13 11:04:38 +00:00
Simon Atanasyan	c711002041	[mips] Fix CPU used in the test case to suppress warning. NFC The MSA ASE used in in the test case requires MIPS32 revision 5 or greater while the test uses MIPS32 revision 1. llvm-svn: 356038	2019-03-13 11:04:28 +00:00
Jonas Hahnfeld	c64d73cce2	[ELF] Fix GCC8 warnings about "fall through", NFCI Add break statements in Object/ELF.cpp since the code should consider the generic tags for Hexagon, MIPS, and PPC. Add a test (copied from llvm-readobj) to show that this works correctly (earlier versions of this patch would have asserted). The warnings in X86ELFObjectWriter.cpp are actually false-positives since the nested switch() handles all possible values and returns in all cases. Make this explicit by adding llvm_unreachable's. Differential Revision: https://reviews.llvm.org/D58837 llvm-svn: 356037	2019-03-13 10:38:17 +00:00
Philip Reames	21a50ccf9c	[ImplicitNullChecks] Support unordered atomic accesses Update the INC pass to allow folding unordered atomics. This is the first optimization unblocked by the changes landed from D57601. llvm-svn: 356006	2019-03-13 03:25:20 +00:00
Philip Reames	80ccc88869	[Tests] Expand implicit null check coverage llvm-svn: 356004	2019-03-13 03:17:58 +00:00
Evgeniy Stepanov	6e64a14804	Revert "[llvm] Skip over empty line table entries." This reverts commit r355972. See the discussion at https://reviews.llvm.org/D58952. llvm-svn: 356001	2019-03-13 01:37:58 +00:00
Craig Topper	750efba67c	[X86] Enable printAliasInstr for the Intel assembly printer so that AAM and AAD will print without an immediate when the immediate is 10. llvm-svn: 355997	2019-03-13 00:43:03 +00:00
Heejin Ahn	8b49b6bed6	[WebAssembly] Place 'try' and 'catch' correctly wrt EH_LABELs Summary: After instruction selection phase, possibly-throwing calls, which were previously invoke, are wrapped in `EH_LABEL` instructions. For example: ``` EH_LABEL <mcsymbol .Ltmp0> CALL_VOID @foo ... EH_LABEL <mcsymbol .Ltmp1> ``` `EH_LABEL` is placed also in the beginning of EH pads: ``` bb.1 (landing-pad): EH_LABEL <mcsymbol .Ltmp2> ... ``` And we'd like to maintian this relationship, so when we place a `try`, ``` TRY ... EH_LABEL <mcsymbol .Ltmp0> CALL_VOID @foo ... EH_LABEL <mcsymbol .Ltmp1> ``` When we place a `catch`, ``` bb.1 (landing-pad): EH_LABEL <mcsymbol .Ltmp2> %0:except_ref = CATCH ... ... ``` Previously we didn't treat EH_LABELs specially, so `try` was placed right before a call, and `catch` was placed in the beginning of an EH pad. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58914 llvm-svn: 355996	2019-03-13 00:37:31 +00:00
Craig Topper	9bae5ba076	[X86] Add ImmArg markings to intrinsics. Remove test cases that checked for not crashing when immediate operands were passed not an immediate. These are now considered ill-formed in IR. This was done by manually scanning the intrinsic file for llvm_i32_ty and llvm_i8_ty which are the predominant types we use for immediates. Most of them are on vector intrinsics. I might have missed some other intrinsics. Differential Revision: https://reviews.llvm.org/D58302 llvm-svn: 355993	2019-03-12 23:48:07 +00:00
Francis Visoiu Mistrih	dd42236c6c	Reland "[Remarks] Add -foptimization-record-passes to filter remark emission" Currently we have -Rpass for filtering the remarks that are displayed as diagnostics, but when using -fsave-optimization-record, there is no way to filter the remarks while generating them. This adds support for filtering remarks by passes using a regex. Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline` will only emit the remarks coming from the pass `inline`. This adds: * `-fsave-optimization-record` to the driver * `-opt-record-passes` to cc1 * `-lto-pass-remarks-filter` to the LTOCodeGenerator * `--opt-remarks-passes` to lld * `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2 * `-opt-remarks-passes` to gold-plugin Differential Revision: https://reviews.llvm.org/D59268 Original llvm-svn: 355964 llvm-svn: 355984	2019-03-12 21:22:27 +00:00
Philip Reames	b760558517	[Test] Add tests for implicit null checks on atomic/volatile instructions llvm-svn: 355983	2019-03-12 21:09:58 +00:00
Philip Reames	9134f84ba4	For faulting ops, include a comment w/the fault destination A faulting_op is one that has specified behavior when a fault occurs, generally redirecting control flow to another location. This change just adds a comment to the assembly output which makes it both human readable, and machine checkable w/o having to parse the FaultMap section. This is used to split a test file into two parts, so that I can (in a near future commit) easily extend the test file to demonstrate another case. llvm-svn: 355982	2019-03-12 21:05:31 +00:00
Matt Arsenault	caf1316f71	IR: Add immarg attribute This indicates an intrinsic parameter is required to be a constant, and should not be replaced with a non-constant value. Add the attribute to all AMDGPU and generic intrinsics that comments indicate it should apply to. I scanned other target intrinsics, but I don't see any obvious comments indicating which arguments are intended to be only immediates. This breaks one questionable testcase for the autoupgrade. I'm unclear on whether the autoupgrade is supposed to really handle declarations which were never valid. The verifier fails because the attributes now refer to a parameter past the end of the argument list. llvm-svn: 355981	2019-03-12 21:02:54 +00:00
Francis Visoiu Mistrih	1d6c47ad2b	Revert "[Remarks] Add -foptimization-record-passes to filter remark emission" This reverts commit `20fff32b7d`. llvm-svn: 355976	2019-03-12 20:54:18 +00:00
Mircea Trofin	0c29402eb4	[llvm] Skip over empty line table entries. Summary: This is similar to how addr2line handles consecutive entries with the same address - pick the last one. Reviewers: dblaikie, friss, JDevlieghere Reviewed By: dblaikie Subscribers: ormris, echristo, JDevlieghere, probinson, aprantl, hiraditya, rupprecht, jdoerfert, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D58952 llvm-svn: 355972	2019-03-12 20:48:45 +00:00
Francis Visoiu Mistrih	20fff32b7d	[Remarks] Add -foptimization-record-passes to filter remark emission Currently we have -Rpass for filtering the remarks that are displayed as diagnostics, but when using -fsave-optimization-record, there is no way to filter the remarks while generating them. This adds support for filtering remarks by passes using a regex. Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline` will only emit the remarks coming from the pass `inline`. This adds: * `-fsave-optimization-record` to the driver * `-opt-record-passes` to cc1 * `-lto-pass-remarks-filter` to the LTOCodeGenerator * `--opt-remarks-passes` to lld * `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2 * `-opt-remarks-passes` to gold-plugin Differential Revision: https://reviews.llvm.org/D59268 llvm-svn: 355964	2019-03-12 20:28:50 +00:00
Philip Reames	9b6b4fac83	[SROA] Fix a crash when trying to convert a memset to an non-integral pointer type The included test case currently crashes on tip of tree. Rather than adding a bailout, I chose to restructure the code so that the existing helper function could be used. Given that, the majority of the diff is NFC-ish, but the key difference is that canConvertValue returns false when only one side is a non-integral pointer. Thanks to Cherry Zhang for the test case. Differential Revision: https://reviews.llvm.org/D59000 llvm-svn: 355962	2019-03-12 20:15:05 +00:00
Sanjay Patel	737c27a9cd	[x86] scalarize extractelement 0 of FP vselect llvm-svn: 355955	2019-03-12 19:20:45 +00:00
Teresa Johnson	4ab0a9f0a4	[SCEV] Use depth limit for trunc analysis Summary: This fixes an extremely long compile time caused by recursive analysis of truncs, which were not previously subject to any depth limits unlike some of the other ops. I decided to use the same control used for sext/zext, since the routines analyzing these are sometimes mutually recursive with the trunc analysis. Reviewers: mkazantsev, sanjoy Subscribers: sanjoy, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58994 llvm-svn: 355949	2019-03-12 18:28:05 +00:00
Jinsong Ji	9dc2c1d564	Set useful flags for vector imm setting instructions Vector imm setting instructions like XXLXORz/XXLXORspz/XXLXORdpz Should behave like LI8. We should set corresponding flags to allow rematerialization and other opts in LICM, RA, Scheduling etc. Differential Revision: https://reviews.llvm.org/D58645 llvm-svn: 355948	2019-03-12 18:27:09 +00:00
Craig Topper	03e93f514a	[SanitizerCoverage] Avoid splitting critical edges when destination is a basic block containing unreachable This patch adds a new option to SplitAllCriticalEdges and uses it to avoid splitting critical edges when the destination basic block ends with unreachable. Otherwise if we split the critical edge, sanitizer coverage will instrument the new block that gets inserted for the split. But since this block itself shouldn't be reachable this is pointless. These basic blocks will stick around and generate assembly, but they don't end in sane control flow and might get placed at the end of the function. This makes it look like one function has code that flows into the next function. This showed up while compiling the linux kernel with clang. The kernel has a tool called objtool that detected the code that appeared to flow from one function to the next. https://github.com/ClangBuiltLinux/linux/issues/351#issuecomment-461698884 Differential Revision: https://reviews.llvm.org/D57982 llvm-svn: 355947	2019-03-12 18:20:25 +00:00
Eli Friedman	74b6aae4e8	[RISCV][MC] Find matching pcrel_hi fixup in more cases. If a symbol points to the end of a fragment, instead of searching for fixups in that fragment, search in the next fragment. Fixes spurious assembler error with subtarget change next to "la" pseudo-instruction, or expanded equivalent. Alternate proposal to fix the problem discussed in https://reviews.llvm.org/D58759. Testcase by Ana Pazos. Differential Revision: https://reviews.llvm.org/D58943 llvm-svn: 355946	2019-03-12 18:14:16 +00:00
Jinsong Ji	b6bfcfc847	[NFC][PowerPC] Update testcases using utils/update_llc_test_checks.py llvm-svn: 355945	2019-03-12 17:55:32 +00:00
Jason Liu	8cf8bb1313	Test commit: add a blank line in test case ppc64-dq-expr.s llvm-svn: 355942	2019-03-12 17:33:07 +00:00
James Henderson	9bc817a0ae	[yaml2obj]Allow explicit symbol indexes in relocations and emit error for bad names Prior to this change, the "Symbol" field of a relocation would always be assumed to be a symbol name, and if no such symbol existed, the relocation would reference index 0. This confused me when I tried to use a literal symbol index in the field: since "0x1" was not a known symbol name, the symbol index was set as 0. This change falls back to treating unknown symbol names as integers, and emits an error if the name is not found and the string is not an integer. Note that the Symbol field is optional, so if a relocation doesn't reference a symbol, it shouldn't be specified. The new error required a number of test updates. Reviewed by: grimar, ruiu Differential Revision: https://reviews.llvm.org/D58510 llvm-svn: 355938	2019-03-12 17:00:25 +00:00
Nikita Popov	149bc099f6	[SDAG] Expand pow2 mulo using shifts Expand MULO with constant power of two operand into a shift. The overflow is checked with (x << shift) >> shift == x, where the right shift will be logical for umulo and arithmetic for smulo (with exception for multiplications by signed_min). Differential Revision: https://reviews.llvm.org/D59041 llvm-svn: 355937	2019-03-12 16:57:25 +00:00
Simon Pilgrim	a6013c0286	Regenerate sign_extend.ll test. This will change as part of the fix for the regressions in D58017. llvm-svn: 355933	2019-03-12 16:00:59 +00:00
James Henderson	b69a50115b	[llvm-cxxfilt]Add test to show that empty lines can be handled I recently discovered a bug in llvm-cxxfilt introduced in r353743 but was fixed later incidentally due to r355031. Specifically, llvm-cxxfilt was attempting to call .back() on an empty string any time there was a new line in the input. This was causing a crash in my debug builds only. This patch simply adds a test that explicitly tests that llvm-cxxfilt handles empty lines correctly. It may pass under release builds under the broken behaviour, but it fails at least in debug builds. Reviewed by: mattd Differential Revision: https://reviews.llvm.org/D58785 llvm-svn: 355929	2019-03-12 15:42:38 +00:00
James Henderson	662c043628	[FileCheck]Remove assertions that prevent matching an empty string at file start before CHECK-NEXT/SAME This patch removes two assertions that were preventing writing of a test that checked an empty line followed by some text. For example: CHECK: {{^$}} CHECK-NEXT: foo() The assertion was because the current location the CHECK-NEXT was scanning from was the start of the buffer. A similar issue occurred with CHECK-SAME. These assertions don't protect against anything, as there is already an error check that checks that CHECK-NEXT/EMPTY/SAME don't appear first in the checks, and the following code works fine if the pointer is at the start of the input. Reviewed by: probinson, thopre, jdenny Differential Revision: https://reviews.llvm.org/D58784 llvm-svn: 355928	2019-03-12 15:37:34 +00:00
Tim Northover	8935aca9c7	CodeGenPrep: preserve inbounds attribute when sinking GEPs. Targets can potentially emit more efficient code if they know address computations never overflow. For example ILP32 code on AArch64 (which only has 64-bit address computation) can ignore the possibility of overflow with this extra information. llvm-svn: 355926	2019-03-12 15:22:23 +00:00
Xing GUO	eec3206a41	[llvm-readobj] Print symbol version when dumping relocations (PR31564) Summary: This helps resolve https://bugs.llvm.org/show_bug.cgi?id=31564 Reviewers: jhenderson, grimar Reviewed By: jhenderson Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59175 llvm-svn: 355922	2019-03-12 14:30:13 +00:00
Sam Parker	a7ae60ac93	[ARM][NFC] Delete original smlad tests Because I don't understand svn. llvm-svn: 355908	2019-03-12 11:06:15 +00:00
Sam Parker	28e46e58db	[ARM][NFC] Move smlad tests Created a test/CodeGen/ARM/ParallelDSP folder. llvm-svn: 355907	2019-03-12 11:01:11 +00:00
Fangrui Song	f260967055	[SimplifyLibCalls] Fix comments about fputs, memchr, and s[n]printf. NFC llvm-svn: 355905	2019-03-12 10:31:52 +00:00
Eugene Leviant	1e249caaec	[CGP] Fix UB when GEP is bound to trivial PHINode Differential revision: https://reviews.llvm.org/D59140 llvm-svn: 355904	2019-03-12 10:10:29 +00:00
David Stuttard	20ea21c6ed	[AMDGPU] Add support for immediate operand for S_ENDPGM Summary: Add support for immediate operand in S_ENDPGM Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6 Reviewers: alexshap Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59213 llvm-svn: 355902	2019-03-12 09:52:58 +00:00
Simon Tatham	cdb7c31f0a	[TableGen] Allow 2^63-1 and 2^63-2 as int literals. These two values correspond to the 'Empty' and 'Tombstone' special keys defined by DenseMapInfo<int64_t>, which means that neither one can be used as a key in DenseMap<int64_t, anything>. Hence, if you try to use either of those values as an int literal, IntInit::get() fails an assertion when it tries to insert them into its static cache of int-literal objects. Fixed by replacing the DenseMap with a std::map, which doesn't intrude on the space of legal values of the key type. Reviewers: nhaehnle, hfinkel, javedabsar, efriedma Reviewed By: efriedma Subscribers: fhahn, efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59016 llvm-svn: 355900	2019-03-12 09:28:19 +00:00
Alex Bradbury	c965d21f33	[RISCV] Add test cases for the lp64 ABI These are closely modeled on similar tests for the ilp32 ABI. Like those tests, we group together tests that should be common cross lp64, lp64+lp64f, and lp64+lp64f+lp64d ABIs. llvm-svn: 355899	2019-03-12 09:26:53 +00:00
Sanjoy Das	3f5ce18658	Reland "Relax constraints for reduction vectorization" Change from original commit: move test (that uses an X86 triple) into the X86 subdirectory. Original description: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355889	2019-03-12 01:31:44 +00:00
Nathan Lanza	cc51dc649a	Add Swift enumerator value for CodeView::SourceLanguage Summary: Swift now generates PDBs for debugging on Windows. llvm and lldb need a language enumerator value too properly handle the output emitted by swiftc. Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59231 llvm-svn: 355882	2019-03-11 23:27:59 +00:00
Sanjoy Das	2136a5bc49	Revert "Relax constraints for reduction vectorization" This reverts commit r355868. Breaks hexagon. llvm-svn: 355873	2019-03-11 22:37:31 +00:00
Jessica Paquette	607774c960	Recommit "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT" After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without running into any problematic intrinsics. Also add a fix for lane copies, which don't support index 0. llvm-svn: 355871	2019-03-11 22:18:01 +00:00
Evgeniy Stepanov	aedec3f684	Remove ASan asm instrumentation. Summary: It is incomplete and has no users AFAIK. Reviewers: pcc, vitalybuka Subscribers: srhines, kubamracek, mgorny, krytarowski, eraman, hiraditya, jdoerfert, #sanitizers, llvm-commits, thakis Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D59154 llvm-svn: 355870	2019-03-11 21:50:10 +00:00
Alex Bradbury	4d20cc21c7	[RISCV] Do a sign-extension in a compare-and-swap of 32 bit in RV64A AtomicCmpSwapWithSuccess is legalised into an AtomicCmpSwap plus a comparison. This requires an extension of the value which, by default, is a zero-extension. When we later lower AtomicCmpSwap into a PseudoCmpXchg32 and then expanded in RISCVExpandPseudoInsts.cpp, the lr.w instruction does a sign-extension. This mismatch of extensions causes the comparison to fail when the compared value is negative. This change overrides TargetLowering::getExtendForAtomicOps for RISC-V so it does a sign-extension instead. Differential Revision: https://reviews.llvm.org/D58829 Patch by Ferran Pallarès Roca. llvm-svn: 355869	2019-03-11 21:41:22 +00:00
Sanjoy Das	93f8cc186a	Relax constraints for reduction vectorization Summary: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355868	2019-03-11 21:36:41 +00:00
Alex Bradbury	b6d322bdc2	[RISCV] Allow fp as an alias of s0 The RISC-V Assembly Programmer's Manual defines fp as another alias of x8. However, our tablegen rules only recognise s0. This patch adds fp as another alias of x8. GCC also accepts fp. Differential Revision: https://reviews.llvm.org/D59209 Patch by Ferran Pallarès Roca. llvm-svn: 355867	2019-03-11 21:35:26 +00:00
Jessica Paquette	42d16501e6	[GlobalISel][AArch64] Always fall back on aarch64.neon.addp.* Overloaded intrinsics aren't necessarily safe for instruction selection. One such intrinsic is aarch64.neon.addp.*. This is a temporary workaround to ensure that we always fall back on that intrinsic. Eventually this will be replaced with a proper solution. https://bugs.llvm.org/show_bug.cgi?id=40968 Differential Revision: https://reviews.llvm.org/D59062 llvm-svn: 355865	2019-03-11 20:51:17 +00:00
Nico Weber	885b790f89	Remove esan. It hasn't seen active development in years, and it hasn't reached a state where it was useful. Remove the code until someone is interested in working on it again. Differential Revision: https://reviews.llvm.org/D59133 llvm-svn: 355862	2019-03-11 20:23:40 +00:00
Nikita Popov	aa7cfa75f9	[SDAG][AArch64] Legalize VECREDUCE Fixes https://bugs.llvm.org/show_bug.cgi?id=36796. Implement basic legalizations (PromoteIntRes, PromoteIntOp, ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes. There are more legalizations missing (esp float legalizations), but there's no way to test them right now, so I'm not adding them. This also includes a few more changes to make this work somewhat reasonably: * Add support for expanding VECREDUCE in SDAG. Usually experimental.vector.reduce is expanded prior to codegen, but if the target does have native vector reduce, it may of course still be necessary to expand due to legalization issues. This uses a shuffle reduction if possible, followed by a naive scalar reduction. * Allow the result type of integer VECREDUCE to be larger than the vector element type. For example we need to be able to reduce a v8i8 into an (nominally) i32 result type on AArch64. * Use the vector operand type rather than the scalar result type to determine the action, so we can control exactly which vector types are supported. Also change the legalize vector op code to handle operations that only have vector operands, but no vector results, as is the case for VECREDUCE. * Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE), explicitly specify for which vector types the reductions are supported. This does not handle anything related to VECREDUCE_STRICT_*. Differential Revision: https://reviews.llvm.org/D58015 llvm-svn: 355860	2019-03-11 20:22:13 +00:00
Jonas Paulsson	8b8dc50e79	[RegAlloc] Avoid compile time regression with multiple copy hints. As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile time building opencollada"), this patch makes sure that no phys reg is hinted more than once from getRegAllocationHints(). This handles the case were many virtual registers are assigned to the same physreg. The previous compile time fix (r343686) in weightCalcHelper() only made sure that physical/virtual registers are passed no more than once to addRegAllocationHint(). Review: Dimitry Andric, Quentin Colombet https://reviews.llvm.org/D59201 llvm-svn: 355854	2019-03-11 19:00:37 +00:00
Brian Gesiak	d7b68132d8	[coroutines][PR40979] Ignore unreachable uses across suspend points Summary: Depends on https://reviews.llvm.org/D59069. https://bugs.llvm.org/show_bug.cgi?id=40979 describes a bug in which the -coro-split pass would assert that a use was across a suspend point from a definition. Normally this would mean that a value would "spill" across a suspend point and thus need to be stored in the coroutine frame. However, in this case the use was unreachable, and so it would not be necessary to store the definition on the frame. To prevent the assert, simply remove unreachable basic blocks from a coroutine function before computing spills. This avoids the assert reported in PR40979. Reviewers: GorNishanov, tks2103 Reviewed By: GorNishanov Subscribers: EricWF, jdoerfert, llvm-commits, lewissbaker Tags: #llvm Differential Revision: https://reviews.llvm.org/D59068 llvm-svn: 355852	2019-03-11 18:31:28 +00:00
Michael Trent	76d66123b2	Detect malformed LC_LINKER_COMMANDs in Mach-O binaries Summary: llvm-objdump can be tricked into reading beyond valid memory and segfaulting if LC_LINKER_COMMAND strings are not null terminated. libObject does have code to validate the integrity of the LC_LINKER_COMMAND struct, but this validator improperly assumes linker command strings are null terminated. The solution is to report an error if a string extends beyond the end of the LC_LINKER_COMMAND struct. Reviewers: lhames, pete Reviewed By: pete Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59179 llvm-svn: 355851	2019-03-11 18:29:25 +00:00
Simon Pilgrim	06ae025345	[X86] Extend widening comparison test. Ensure we test both v2i16 unary and binary comparisons. llvm-svn: 355849	2019-03-11 18:08:20 +00:00
Jeremy Morse	90ede5f4bf	[SimplifyCFG] Retain debug info when threading jumps with critical edges Fixes bug 38023: https://bugs.llvm.org/show_bug.cgi?id=38023 The SimplifyCFG pass will perform jump threading in some cases where doing so is trivial and would simplify the CFG. When folding a series of blocks with redundant conditional branches into an unconditional "critical edge" block, it does not keep the debug location associated with the previous conditional branch. This patch fixes the bug described by copying the debug info from the old conditional branch to the new unconditional branch instruction, and adds a regression test for the SimplifyCFG pass that covers this case. Patch by Stephen Tozer! Differential Revision: https://reviews.llvm.org/D59206 llvm-svn: 355833	2019-03-11 16:23:59 +00:00
Petar Jovanovic	28e13eb098	[MIPS][microMIPS] Add a pattern to match TruncIntFP A pattern needed to match TruncIntFP was missing. This was causing multiple tests from llvm test suite to fail during compilation for micromips. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D58722 llvm-svn: 355825	2019-03-11 14:13:31 +00:00
Sam Parker	52760bf435	[CGP] Limit distance between overflow math and cmp Inserting an overflowing arithmetic intrinsic can increase register pressure by producing two values at a point where only one is needed, while the second use maybe several blocks away. This increase in pressure is likely to be more detrimental on performance than rematerialising one of the original instructions. So, check that the arithmetic and compare instructions are no further apart than their immediate successor/predecessor. Differential Revision: https://reviews.llvm.org/D59024 llvm-svn: 355823	2019-03-11 13:19:46 +00:00
Jeremy Morse	b60aea4131	[JumpThreading] Retain debug info when replacing branch instructions Fixes bug 37966: https://bugs.llvm.org/show_bug.cgi?id=37966 The Jump Threading pass will replace certain conditional branch instructions with unconditional branches when it can prove that only one branch can occur. Prior to this patch, it would not carry the debug info from the old instruction to the new one. This patch fixes the bug described by copying the debug info from the conditional branch instruction to the new unconditional branch instruction, and adds a regression test for the Jump Threading pass that covers this case. Patch by Stephen Tozer! Differential Revision: https://reviews.llvm.org/D58963 llvm-svn: 355822	2019-03-11 11:48:57 +00:00
George Rimar	d8a5c6cf19	[llvm-objcopy] - Fix --compress-debug-sections when there are relocations. When --compress-debug-sections is given, llvm-objcopy removes the uncompressed sections and adds compressed to the section list. This makes all the pointers to old sections to be outdated. Currently, code already has logic for replacing the target sections of the relocation sections. But we also have to update the relocations by themselves. This fixes https://bugs.llvm.org/show_bug.cgi?id=40885. Differential revision: https://reviews.llvm.org/D58960 llvm-svn: 355821	2019-03-11 11:01:24 +00:00
Petar Avramovic	5229f47f9f	[MIPS GlobalISel] NarrowScalar G_UMULH NarrowScalar G_UMULH in LegalizerHelper using multiplyRegisters helper function. NarrowScalar G_UMULH for MIPS32. Differential Revision: https://reviews.llvm.org/D58825 llvm-svn: 355815	2019-03-11 10:08:44 +00:00
Petar Avramovic	0b17e59b5c	[MIPS GlobalISel] NarrowScalar G_MUL Narrow Scalar G_MUL for MIPS32. Revisit NarrowScalar implementation in LegalizerHelper. Introduce new helper function multiplyRegisters. It performs generic multiplication of values held in multiple registers. Generated instructions use only types NarrowTy and i1. Destination can be same or two times size of the source. Differential Revision: https://reviews.llvm.org/D58824 llvm-svn: 355814	2019-03-11 10:00:17 +00:00
Craig Topper	00afa193f1	[X86] Enable sse2_cvtsd2ss intrinsic to use an EVEX encoded instruction. llvm-svn: 355810	2019-03-11 06:01:04 +00:00
Amaury Sechet	a5820cbd20	Add test case for add to sub post legalization. NFC llvm-svn: 355797	2019-03-11 01:25:48 +00:00
Sanjay Patel	26e06e859e	[x86] add x86-specific opcodes to extractelement scalarization list llvm-svn: 355792	2019-03-10 18:56:21 +00:00
Craig Topper	93e15dfacc	[X86] Make lowering of intrinsics with rounding mode stricter so that only valid rounding modes are lowered. Update tests accordingly Many of our tests were not using valid rounding mode immediates. Clang verifies this in the frontend when it creates the intrinsics from builtins, but the backend would still lower invalid immediates. With this change we will now leave them as intrinsics if the immediate is invalid. This will cause an isel selection failure. llvm-svn: 355789	2019-03-10 17:20:45 +00:00
Nikita Popov	bfec0d610c	[AArch64] Add tests for saddsat/ssubsat; NFC Signed versions of the existing unsigned tests. llvm-svn: 355787	2019-03-10 12:21:36 +00:00
Nikita Popov	506c1aba4d	[ARM] Use non-constant operand in umulo-32.ll; NFC Currently the store+load is folded and both operands of the umulo end up being constants. To avoid this getting folded away entirely, make sure at least one operand is non-constant. Also remove some allocas which don't seem relevant to the test. llvm-svn: 355776	2019-03-09 13:43:21 +00:00
Nikita Popov	74dde7e5a1	[ARM] Generate test checks for umulo-32.ll; NFC The second test case is going to be changed by D59041, so generate full baseline checks. llvm-svn: 355775	2019-03-09 13:21:15 +00:00
Alex Bradbury	fea4957177	[RISCV] Support -target-abi at the MC layer and for codegen This patch adds proper handling of -target-abi, as accepted by llvm-mc and llc. Lowering (codegen) for the hard-float ABIs will follow in a subsequent patch. However, this patch does add MC layer support for the hard float and RVE ABIs (emission of the appropriate ELF flags https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#-file-header). ABI parsing must be shared between codegen and the MC layer, so we add computeTargetABI to RISCVUtils. A warning will be printed if an invalid or unrecognized ABI is given. Differential Revision: https://reviews.llvm.org/D59023 llvm-svn: 355771	2019-03-09 09:28:06 +00:00
Thomas Lively	972d7d514b	[WebAssembly] Use named operands to identify loads and stores Summary: Uses the named operands tablegen feature to look up the indices of offset, address, and p2align operands for all load and store instructions. This replaces brittle, incorrect logic for identifying loads and store when eliminating frame indices, which previously crashed on bulk-memory ops. It also cleans up the SetP2Alignment pass. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59007 llvm-svn: 355770	2019-03-09 04:31:37 +00:00
Sanjay Patel	40bcc3de7d	[x86] add tests for extract of FP select; NFC llvm-svn: 355768	2019-03-09 02:11:05 +00:00
Craig Topper	69f8c1653d	[ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement. This saves needing to call getInt32 ourselves. Making the code a little shorter. The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue. This cleanup because I plan to write similar code for expandload/compressstore. llvm-svn: 355767	2019-03-09 02:08:41 +00:00
Ana Pazos	5254d1baae	[RISCV] Allow access to FP CSRs without F extension Summary: Floating-point CSRs should be accessible even when F extension is not enabled. But pseudo instructions that access floating point CSRs still require the F extension. GNU tools already implement this behavior. RISC-V spec is pending update to reflect this behavior and to extend it to pseudo instructions that access floating point CSRs. Reviewers: asb Reviewed By: asb Subscribers: asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, llvm-commits Differential Revision: https://reviews.llvm.org/D58932 llvm-svn: 355753	2019-03-08 23:01:08 +00:00
Rong Xu	ce3be45cac	[CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst r44412 fixed a huge compile time regression but it needed ModifiedDT flag to be maintained correctly in optimizations in optimizeBlock() and optimizeInst(). Function optimizeSelectInst() does not update the flag. This patch propagates the flag in optimizeSelectInst() back to optimizeBlock(). This patch also removes ModifiedDT in CodeGenPrepare class (which is not used). The property of ModifiedDT is now recorded in a ref parameter. Differential Revision: https://reviews.llvm.org/D59139 llvm-svn: 355751	2019-03-08 22:46:18 +00:00
Mitch Phillips	53d3994719	[Go / ASAN] Disable Go bindings for ASAN tests. Go binding tests fail under ASAN with the error at the bottom of this commit message. The reason the buildbots are not currently always failing on this test is that they selectively disable the bindings due to a Go binary not being present on their system. This change should allow users to build an asan-bootstrapped compiler and run asan-ified unit tests locally, similar to the way that sanitizer-* buildbots do. The error is: ``` FAIL: LLVM :: Bindings/Go/go.test (7050 of 30112) ****************** TEST 'LLVM :: Bindings/Go/go.test' FAILED ****************** Script: -- : 'RUN: at line 1'; /usr/local/google/home/mitchp/llvm-build/asan/sanitized-clang/bin/llvm-go go=/usr/lib/google-golang/bin/go test llvm.org/llvm/bindings/go/llvm -- Exit Code: 1 Command Output (stdout): -- FAIL llvm.org/llvm/bindings/go/llvm [build failed] -- Command Output (stderr): -- ld.lld: error: undefined symbol: std::allocator<char>::allocator() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const, std::allocator<char> const&) >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: std::allocator<char>::~allocator() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: std::allocator<char>::~allocator() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: llvm::createDataFlowSanitizerPass(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, void ()(), void ()()) >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(LLVMAddDataFlowSanitizerPass) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(void std::_Destroy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(void __gnu_cxx::new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&)) ld.lld: error: undefined symbol: std::__throw_length_error(char const) >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_check_len(unsigned long, char const) const) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(void std::_Construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&)) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() >>> referenced by InstrumentationBindings.cpp >>> $WORK/b048/_x018.o:(void __gnu_cxx::new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::destroy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string() >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(LLVMLoadLibraryPermanently2) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(LLVMLoadLibraryPermanently2) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::c_str() const >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(LLVMLoadLibraryPermanently2) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::size() const >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(LLVMLoadLibraryPermanently2) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(LLVMLoadLibraryPermanently2) ld.lld: error: undefined symbol: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(LLVMLoadLibraryPermanently2) ld.lld: error: undefined symbol: llvm::sys::DynamicLibrary::getPermanentLibrary(char const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) >>> referenced by SupportBindings.cpp >>> $WORK/b048/_x019.o:(llvm::sys::DynamicLibrary::LoadLibraryPermanently(char const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)) ld.lld: error: undefined symbol: __asan_option_detect_stack_use_after_return >>> referenced by MCJIT.cpp:45 (/usr/local/google/home/mitchp/llvm/llvm/lib/ExecutionEngine/MCJIT/MCJIT.cpp:45) >>> MCJIT.cpp.o:(llvm::MCJIT::createJIT(std::__1::unique_ptr<llvm::Module, std::__1::default_delete<llvm::Module> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<llvm::MCJITMemoryManager>, std::__1::shared_ptr<llvm::LegacyJITSymbolResolver>, std::__1::unique_ptr<llvm::TargetMachine, std::__1::default_delete<llvm::TargetMachine> >)) in archive /usr/local/google/home/mitchp/llvm-build/asan/sanitized-clang/lib/libLLVMMCJIT.a ld.lld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors) clang-9: error: linker command failed with exit code 1 (use -v to see invocation) -- ``` llvm-svn: 355749	2019-03-08 22:34:33 +00:00
Amara Emerson	7a05d1c1f1	[AArch64][GlobalISel] Fix i1 arguments not being zero-extended as required by ABI. Fixes PR41001. llvm-svn: 355745	2019-03-08 22:17:00 +00:00
Sunil Srivastava	ae8fe4e093	Improve "llvm-nm -f sysv" output for Elf files Specifically, compute and Print Type and Section columns. This is a re-commit of rL354833, after fixing the Asan problem found a a buildbot. Differential Revision: https://reviews.llvm.org/D59060 llvm-svn: 355742	2019-03-08 22:00:50 +00:00
Sanjay Patel	f84083b4db	[x86] scalarize extract element 0 of FP cmp An extension of D58282 noted in PR39665: https://bugs.llvm.org/show_bug.cgi?id=39665 This doesn't answer the request to use movmsk, but that's an independent problem. We need this and probably still need scalarization of FP selects because we can't do that as a target-independent transform (although it seems likely that targets besides x86 should have this transform). llvm-svn: 355741	2019-03-08 21:54:41 +00:00
Alexey Bataev	a8b3eb46b5	[NVPTX][DEBUGINFO]Temp workaround for crash of ptxas: disable packed bytes in debug sections. Summary: This patch works around the bug in the ptxas tool with the processing of bytes separated by the comma symbol. The emission of the packed string is temporarily disabled. Reviewers: tra Subscribers: jholewinski, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59148 llvm-svn: 355740	2019-03-08 21:29:17 +00:00
Mitch Phillips	790edbc16e	[HWASan] Save + print registers when tag mismatch occurs in AArch64. Summary: This change change the instrumentation to allow users to view the registers at the point at which tag mismatch occured. Most of the heavy lifting is done in the runtime library, where we save the registers to the stack and emit unwind information. This allows us to reduce the overhead, as very little additional work needs to be done in each __hwasan_check instance. In this implementation, the fast path of __hwasan_check is unmodified. There are an additional 4 instructions (16B) emitted in the slow path in every __hwasan_check instance. This may increase binary size somewhat, but as most of the work is done in the runtime library, it's manageable. The failure trace now contains a list of registers at the point of which the failure occured, in a format similar to that of Android's tombstones. It currently has the following format: Registers where the failure occurred (pc 0x0055555561b4): x0 0000000000000014 x1 0000007ffffff6c0 x2 1100007ffffff6d0 x3 12000056ffffe025 x4 0000007fff800000 x5 0000000000000014 x6 0000007fff800000 x7 0000000000000001 x8 12000056ffffe020 x9 0200007700000000 x10 0200007700000000 x11 0000000000000000 x12 0000007fffffdde0 x13 0000000000000000 x14 02b65b01f7a97490 x15 0000000000000000 x16 0000007fb77376b8 x17 0000000000000012 x18 0000007fb7ed6000 x19 0000005555556078 x20 0000007ffffff768 x21 0000007ffffff778 x22 0000000000000001 x23 0000000000000000 x24 0000000000000000 x25 0000000000000000 x26 0000000000000000 x27 0000000000000000 x28 0000000000000000 x29 0000007ffffff6f0 x30 00000055555561b4 ... and prints after the dump of memory tags around the buggy address. Every register is saved exactly as it was at the point where the tag mismatch occurs, with the exception of x16/x17. These registers are used in the tag mismatch calculation as scratch registers during __hwasan_check, and cannot be saved without affecting the fast path. As these registers are designated as scratch registers for linking, there should be no important information in them that could aid in debugging. Reviewers: pcc, eugenis Reviewed By: pcc, eugenis Subscribers: srhines, kubamracek, mgorny, javed.absar, krytarowski, kristof.beyls, hiraditya, jdoerfert, llvm-commits, #sanitizers Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58857 llvm-svn: 355738	2019-03-08 21:22:35 +00:00
Matt Arsenault	e8c03a2511	AMDGPU: Move d16 load matching to preprocess step When matching half of the build_vector to a load, there could still be a hidden dependency on the other half of the build_vector the pattern wouldn't detect. If there was an additional chain dependency on the other value, a cycle could be introduced. I don't think a tablegen pattern is capable of matching the necessary conditions, so move this into PreprocessISelDAG. Check isPredecessorOf for the other value to avoid a cycle. This has a warning that it's expensive, so this should probably be moved into an MI pass eventually that will have more freedom to reorder instructions to help match this. That is currently complicated by the lack of a computeKnownBits type mechanism for the selected function. llvm-svn: 355731	2019-03-08 20:58:11 +00:00
Matt Arsenault	26e76ef0e2	DAG: Don't try to cluster loads with tied inputs This avoids breaking possible value dependencies when sorting loads by offset. AMDGPU has some load instructions that write into the high or low bits of the destination register, and have a tied input for the other input bits. These can easily have the same base pointer, but be a swizzle so the high address load needs to come first. This was inserting glue forcing the opposite ordering, producing a cycle the InstrEmitter would assert on. It may be potentially expensive to look for the dependency between the other loads, so just skip any where this could happen. Fixes bug 40936 by reverting r351379, which added a hacky attempt to fix this by adding chains in this case, which I think was just working around broken glue before the InstrEmitter. The core of the patch is re-implementing the fix for that problem. llvm-svn: 355728	2019-03-08 20:46:15 +00:00
Sanjay Patel	43f098e719	[x86] add tests for extracted vector FP cmp; NFC llvm-svn: 355727	2019-03-08 20:45:27 +00:00
Matt Arsenault	74c9c305e0	AMDGPU: Add more tests for d16 loads Also fix a few cases that weren't testing what they were supposed to. llvm-svn: 355724	2019-03-08 20:30:51 +00:00
Matt Arsenault	07f904befb	AMDGPU: Correct DS implementation of areLoadsFromSameBasePtr This was checking the wrong operands for the base register and the offsets. The indexes are shifted by the number of output registers from the machine instruction definition, and the chain is moved to the end. llvm-svn: 355722	2019-03-08 20:30:50 +00:00
Alexey Bataev	78fcb8381f	[DEBUG_INFO][NVPTX]Emit empty .debug_loc section in presence of the debug option. Summary: If the LLVM module shows that it has debug info, but the file is actually empty and the real debug info is not emitted, the ptxas tool emits error 'Debug information not found in presence of .target debug'. We need at leas one empty debug section to silence this message. Section `.debug_loc` is not emitted for PTX and we can emit empty `.debug_loc` section if `debug` option was emitted. Reviewers: tra Subscribers: jholewinski, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D57250 llvm-svn: 355719	2019-03-08 20:08:04 +00:00
Amaury Sechet	782ac933b5	[DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a) Summary: This pattern is sometime created after legalization. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58874 llvm-svn: 355716	2019-03-08 19:39:32 +00:00
Sanjay Patel	b22f438df3	[x86] prevent infinite looping from inverse shuffle transforms llvm-svn: 355713	2019-03-08 19:20:28 +00:00
Simon Pilgrim	53652feab7	[X86] Add test case for PR22473 llvm-svn: 355712	2019-03-08 19:16:26 +00:00
Diogo N. Sampaio	c20c37ba7f	[ARM][FIX] Fix vfmal.f16 and vfmsl.f16 operand The indexed variant of vfmal.f16 and vfmsl.f16 instructions use the uppser bits of the indexed operand to store the index (1 bit for the double variant, 2 bits for the quad). This limits the usable registers to d0 - d7 or s0 - s15. This patch enforces this limitation. Differential Revision: https://reviews.llvm.org/D59021 llvm-svn: 355707	2019-03-08 17:11:20 +00:00
Simon Pilgrim	00ab0339ed	Fix typo in constant vector llvm-svn: 355699	2019-03-08 15:17:26 +00:00
James Henderson	b41130bedc	[llvm-readelf]Don't lose negative-ness of negative addends for no symbol relocations llvm-readelf prints relocation addends as: <symbol value>[+-]<absolute addend> where [+-] is determined from whether addend is less than zero or not. However, it does not print the +/- if there is no symbol, which meant that negative addends became their positive value with no indication that this had happened. This patch stops the absolute conversion when addends are negative and there is no associated symbol. Reviewed by: Higuoxing, mattd, MaskRay Differential Revision: https://reviews.llvm.org/D59095 llvm-svn: 355696	2019-03-08 13:22:05 +00:00
Clement Courbet	8e16d73346	[SelectionDAG] Allow the user to specify a memeq function. Summary: Right now, when we encounter a string equality check, e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a small compile-time constant, and fall back on calling `memcmp()` else. This is sub-optimal because memcmp has to compute much more than equality. This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms that support `bcmp`. `bcmp` can be made much more efficient than `memcmp` because equality compare is trivially parallel while lexicographic ordering has a chain dependency. Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits Differential Revision: https://reviews.llvm.org/D56593 llvm-svn: 355672	2019-03-08 09:07:45 +00:00
Carl Ritson	1a98dc1840	[AMDGPU] V_CVT_F32_UBYTE{0,1,2,3} are full rate instructions Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions. Reviewers: arsenm, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59091 llvm-svn: 355671	2019-03-08 09:03:11 +00:00
Craig Topper	4505c99e72	[X86] Improve the type checking in isLegalMaskedLoad and isLegalMaskedGather. We were just checking pointer size and type primitive size. But this caused unintended things like vectors of half being accepted by masked load/store. For FP we now explicitly check for only double and float. For pointers we now let any pointer through. Trusting that only 32 and 64 would be used to generate assembly. We only check bitwidth after checking that the type is an integer. llvm-svn: 355667	2019-03-08 07:33:43 +00:00
Steven Wu	ed98229286	[Bitcode] Fix bitcode compatibility issue with clang.arc.use intrinsic Summary: In r349534, objc arc implementation is switched to use intrinsics and at the same time, clang.arc.use is renamed to llvm.objc.clang.arc.use to make the naming more consistent. The side-effect of that is llvm no longer recognize it as intrinsics and codegen external references to it instead. Rather than upgrade the old intrinsics name to the new one and wait for the arc-contract pass to remove it, simply remove it in the bitcode upgrader. rdar://problem/48607063 Reviewers: pete, ahatanak, erik.pilkington, dexonsmith Reviewed By: pete, dexonsmith Subscribers: jkorous, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59112 llvm-svn: 355663	2019-03-08 05:27:53 +00:00
Sanjay Patel	5ed14ef1e4	[x86] add extract FP tests for target-specific nodes; NFC llvm-svn: 355655	2019-03-07 23:55:54 +00:00
Craig Topper	d0c2dba644	[X86] Correct scheduler information for rotate by constant for Haswell, Broadwell, and Skylake. Rotate with explicit immediate is a single uop from Haswell on. An immediate of 1 has a dependency on the previous writer of flags, but the other immediate values do not. The implicit rotate by 1 instruction is 2 uops. But the flags are merged after the rotate uop so the data result does not see the flag dependency. But I don't think we have any way of modeling that. RORX is 1 uop without the load. 2 uops with the load. We currently model these with WriteShift/WriteShiftLd. Differential Revision: https://reviews.llvm.org/D59077 llvm-svn: 355636	2019-03-07 21:22:56 +00:00
Craig Topper	b3af5d3e57	[X86] Model ADC/SBB with immediate 0 more accurately in the Haswell scheduler model Haswell and possibly Sandybridge have an optimization for ADC/SBB with immediate 0 to use a single uop flow. This only applies GR16/GR32/GR64 with an 8-bit immediate. It does not apply to GR8. It also does not apply to the implicit AX/EAX/RAX forms. Differential Revision: https://reviews.llvm.org/D59058 llvm-svn: 355635	2019-03-07 21:22:51 +00:00
Konstantin Zhuravlyov	47f0bf8f1f	AMDHSA: Code object v3 updates - Copy kernel symbol attributes into kernel descriptor attributes - Make sure kernel symbol's visibility is not "higher" than protected Differential Revision: https://reviews.llvm.org/D59057 llvm-svn: 355630	2019-03-07 19:58:29 +00:00
Matt Davis	6c5a49ccb9	[llvm-mca] Emit a message when no bottlenecks are identified. Summary: Since bottleneck hints are enabled via user request, it can be confusing if no bottleneck information is presented. Such is the case when no bottlenecks are identified. This patch emits a message in that case. Reviewers: andreadb Reviewed By: andreadb Subscribers: tschuett, gbedwell, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59098 llvm-svn: 355628	2019-03-07 19:34:44 +00:00
Vlad Tsyrklevich	2e1479e2f2	Delete x86_64 ShadowCallStack support Summary: ShadowCallStack on x86_64 suffered from the same racy security issues as Return Flow Guard and had performance overhead as high as 13% depending on the benchmark. x86_64 ShadowCallStack was always an experimental feature and never shipped a runtime required to support it, as such there are no expected downstream users. Reviewers: pcc Reviewed By: pcc Subscribers: mgorny, javed.absar, hiraditya, jdoerfert, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D59034 llvm-svn: 355624	2019-03-07 18:56:36 +00:00
Florian Hahn	6ca0985aa5	[InterleavedAccessAnalysis] Fix integer overflow in insertMember. Without checking for integer overflow, invalid members can be added e.g. if the calculated key overflows, becomes positive and the largest key. This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7560 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13128 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13229 Reviewers: Ayal, anna, hsaito, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55538 llvm-svn: 355613	2019-03-07 17:50:16 +00:00
Petar Jovanovic	95817d3641	[DebugInfo] Fix the type of the formated variable Change the format type of Personality and LSDAAddress to PRIx64 since they are of type uint64_t. The problem was detected on mips builds, where it was printing junk values and causing test failure. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D58451 llvm-svn: 355607	2019-03-07 16:31:08 +00:00
Xing GUO	eee6226c21	[llvm-readobj] Dump DT_USED value as string like GNU readelf does Reviewers: jhenderson Reviewed By: jhenderson Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59089 llvm-svn: 355600	2019-03-07 14:53:10 +00:00
David Green	ffc922ec35	[LSR] Attempt to increase the accuracy of LSR's setup cost In some loops, we end up generating loop induction variables that look like: {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1} As opposed to the simpler: {(zext i16 (%i0 * %i1) to i32),+,-1} i.e we count up from -limit to 0, not the simpler counting down from limit to 0. This is because the scores, as LSR calculates them, are the same and the second is filtered in place of the first. We end up with a redundant SUB from 0 in the code. This patch tries to make the calculation of the setup cost a little more thoroughly, recursing into the scev members to better approximate the setup required. The cost function for comparing LSR costs is: return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds, C1.ScaleCost, C1.ImmCost, C1.SetupCost) < std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds, C2.ScaleCost, C2.ImmCost, C2.SetupCost); So this will only alter results if none of the other variables turn out to be different. Differential Revision: https://reviews.llvm.org/D58770 llvm-svn: 355597	2019-03-07 13:44:40 +00:00
Petar Avramovic	3d3120dc9a	[MIPS GlobalISel] Fix mul operands Unsigned mul high for MIPS32 is selected into two PseudoInstructions: PseudoMULTu and PseudoMFHI that use accumulator register class ACC64 for some of its operands. Registers in this class have appropriate hi and lo register as subregisters: $lo0 and $hi0 are subregisters of $ac0 etc. mul instruction implicit-defs $lo0 and $hi0 according to MipsInstrInfo.td. In functions where mul and PseudoMULTu are present fastRegisterAllocator will "run out of registers during register allocation" because 'calcSpillCost' for $ac0 will return spillImpossible because subregisters $lo0 and $hi0 of $ac0 are reserved by mul instruction above. A solution is to mark implicit-defs of $lo0 and $hi0 as dead in mul instruction. Differential Revision: https://reviews.llvm.org/D58715 llvm-svn: 355594	2019-03-07 13:28:29 +00:00
George Rimar	a5a0a0f049	[yaml2obj] - Allow producing ELFDATANONE ELFs I need this to remove a binary from LLD test suite. The patch also simplifies the code a bit. Differential revision: https://reviews.llvm.org/D59082 llvm-svn: 355591	2019-03-07 12:09:19 +00:00
Craig Topper	3acc4236b8	[X86] Enable combineFMinNumFMaxNum for 512 bit vectors when AVX512 is enabled. Simplified by just checking if the vector type is legal rather than listing all combinations of types and features. Fixes PR40984. llvm-svn: 355582	2019-03-07 06:30:19 +00:00
Craig Topper	a0dd6e9a08	[X86] Add 512-bit fminnum/maxnum test cases for PR40984. Also add v8f32 minnum/maxnum tests. NFC llvm-svn: 355581	2019-03-07 05:56:52 +00:00
Aakanksha Patil	c56d2afc63	AMDGPU: Handle "uniform-work-group-size" attribute (fix for RADV) A previous patch for "uniform-work-group-size" attribute was found to break some RADV and possibly radeon SI tests and had to be retracted. This patch fixes that. Differential Revision: http://reviews.llvm.org/D58993 llvm-svn: 355574	2019-03-07 00:54:04 +00:00
Nick Desaulniers	212c8ac23f	[LoopRotate] fix crash encountered with callbr Summary: While implementing inlining support for callbr (https://bugs.llvm.org/show_bug.cgi?id=40722), I hit a crash in Loop Rotation when trying to build the entire x86 Linux kernel (drivers/char/random.c). This is a small fix up to r353563. Test case is drivers/char/random.c (with callbr's inlined), then ran through creduce, then `opt -opt-bisect-limit=<limit>`, then bugpoint. Thanks to Craig Topper for immediately spotting the fix, and teaching me how to fish. Reviewers: craig.topper, jyknight Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D58929 llvm-svn: 355564	2019-03-06 23:04:40 +00:00
Simon Atanasyan	83b88441ad	[mips] Replace assertion by error message while lowering `RETURNADDR` and `FRAMEADDR` MIPS target supports lowering `RETURNADDR` and `FRAMEADDR` for a current frame only. It's better to show an error message then crash on assertion if `__builtin_return_address` is invoked with non-zero argument. llvm-svn: 355558	2019-03-06 22:40:28 +00:00
Rong Xu	3ee1524afc	[PGO] Fix hexagon buildbot errors in r355541 Add "REQUIRES: x86-registered-target" to thinlto test cases. llvm-svn: 355556	2019-03-06 22:16:47 +00:00
Abderrazek Zaafrani	5ced596198	[AArch64] Improve FP16 instruction selection for vector round and vector conver from half instructions https://reviews.llvm.org/D58855 llvm-svn: 355545	2019-03-06 20:30:06 +00:00
Nikita Popov	e1012e1efb	[X86] Add vector mulo with power of two operand tests; NFC llvm-svn: 355544	2019-03-06 20:25:49 +00:00
Paul Robinson	05efe0fdc4	[PS4] Emit a trap after a stack-protector fail call. llvm-svn: 355542	2019-03-06 19:57:43 +00:00
Rong Xu	05c0afe842	[PGO] Context sensitive PGO (part 4) Part 4 of CSPGO changes: (1) add support in cmake for cspgo build. (2) fix an issue in big endian. (3) test cases. Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355541	2019-03-06 19:31:37 +00:00
Philip Reames	9549f7560f	[AtomicExpand] Allow libcall expansion for non-zero address spaces (try 2) Restore a reverted commit, with the silly mistake fixed. Sorry for the previous breakage. Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355540	2019-03-06 19:27:13 +00:00
Sanjay Patel	f941631875	[AArch64] add tests for uaddsat/usubsat; NFC The tests are copied from the sibling x86 test files. llvm-svn: 355535	2019-03-06 19:02:01 +00:00
Simon Pilgrim	417f8c5be4	[PowerPC] Use real pointers instead of undef The reduced test removed the pointer arguments, but to better survive D58017 and D58070 we need them back. llvm-svn: 355532	2019-03-06 18:49:39 +00:00
Nikita Popov	884feb1b69	[InstCombine] Fold add nsw + sadd.with.overflow Fold `add nsw` and `sadd.with.overflow` with constants if the addition does not overflow. Part of https://bugs.llvm.org/show_bug.cgi?id=38146. Patch by Dan Robertson. Differential Revision: https://reviews.llvm.org/D58881 llvm-svn: 355530	2019-03-06 18:30:00 +00:00
Guozhi Wei	11308bdb43	[PPC] Adjust the computed branch offset for the possible shorter distance In file PPCBranchSelector.cpp we tend to over estimate code size due to large alignment and inline assembly. Usually it causes larger computed branch offset, it is not big problem. But sometimes it may also causes smaller computed branch offset than actual branch offset. If the offset is close to the limit of encoding, it may cause problem at run time. Following is a simplified example. actual estimated address address ... bne Far 100 10c .p2align 4 Near: 110 110 ... Far: 8108 8108 Actual offset: 0x8108 - 0x100 = 0x8008 Computed offset: 0x8108 - 0x10c = 0x7ffc The computed offset is at most ((1 << alignment) - 4) bytes smaller than actual offset. So we add this number to the offset for safety. Differential Revision: https://reviews.llvm.org/D57718 llvm-svn: 355529	2019-03-06 18:22:22 +00:00
Francis Visoiu Mistrih	c01140ef1f	[MC][MachO] Emit an error for emitting relocations of the form -SYM + cst Emit an error for an unsupported relocation. mach-o relocations can't encode the form -SYM + cst. Differential Revision: https://reviews.llvm.org/D58944 llvm-svn: 355527	2019-03-06 18:10:41 +00:00
Krzysztof Parzyszek	9c005bbdd4	[Hexagon] Avoid creating 5-instruction packets with vgather pseudos Change the resource usage of the vgather pseudos from SLOT0+LD to SLOT0+SLOT1. llvm-svn: 355524	2019-03-06 17:43:50 +00:00
Ryan Taylor	67f36903ae	[AMDGPU] Add support for 64 bit buffer atomic artihmetic instructions Summary: This adds support for 64 bit buffer atomic arithmetic instructions but does not include cmpswap as that depends on a fix to the way the register pairs are handled Change-Id: Ib207ea65fb69487ccad5066ea647ae8ddfe2ce61 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58918 llvm-svn: 355520	2019-03-06 17:02:06 +00:00
Simon Pilgrim	cdf95f8f07	[DAGCombiner] Enable UADDO/USUBO vector combine support Differential Revision: https://reviews.llvm.org/D58965 llvm-svn: 355517	2019-03-06 16:11:03 +00:00
Alexander Kornienko	3d467a890e	Revert "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default" This reverts commit `2a0f2c5ef3` (r355490). The commit causes an assertion failure when compiling LLVM code: $ cat repro.cpp class QQQ { public: bool x() const; bool y() const; unsigned getSizeInBits() const { if (y() \|\| x()) return getScalarSizeInBits(); return getScalarSizeInBits() * 2; } unsigned getScalarSizeInBits() const; }; int f(const QQQ &Ty) { switch (Ty.getSizeInBits()) { case 1: case 8: return 0; case 16: return 1; case 32: return 2; case 64: return 3; default: __builtin_unreachable(); } } $ clang -O2 -o repro.o repro.cpp assert.h assertion failed at llvm/include/llvm/ADT/ilist_iterator.h:139 in llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, true, false>::operator() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = true, IsConst = false]: !NodePtr->isKnownSentinel() Check failure stack trace: * @ 0x558aab4afc10 __assert_fail @ 0x558aa885479b llvm::ilist_iterator<>::operator() @ 0x558aa8854715 llvm::MachineInstrBundleIterator<>::operator() @ 0x558aa92c33c3 llvm::X86InstrInfo::optimizeCompareInstr() @ 0x558aa9a9c251 (anonymous namespace)::PeepholeOptimizer::optimizeCmpInstr() @ 0x558aa9a9b371 (anonymous namespace)::PeepholeOptimizer::runOnMachineFunction() @ 0x558aa99a4fc8 llvm::MachineFunctionPass::runOnFunction() @ 0x558aab019fc4 llvm::FPPassManager::runOnFunction() @ 0x558aab01a3a5 llvm::FPPassManager::runOnModule() @ 0x558aab01aa9b (anonymous namespace)::MPPassManager::runOnModule() @ 0x558aab01a635 llvm::legacy::PassManagerImpl::run() @ 0x558aab01afe1 llvm::legacy::PassManager::run() @ 0x558aa5914769 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly() @ 0x558aa5910f44 clang::EmitBackendOutput() @ 0x558aa5906135 clang::BackendConsumer::HandleTranslationUnit() @ 0x558aa6d165ad clang::ParseAST() @ 0x558aa6a94e22 clang::ASTFrontendAction::ExecuteAction() @ 0x558aa590255d clang::CodeGenAction::ExecuteAction() @ 0x558aa6a94840 clang::FrontendAction::Execute() @ 0x558aa6a38cca clang::CompilerInstance::ExecuteAction() @ 0x558aa4e2294b clang::ExecuteCompilerInvocation() @ 0x558aa4df6200 cc1_main() @ 0x558aa4e1b37f ExecuteCC1Tool() @ 0x558aa4e1a725 main @ 0x7ff20d56abbd __libc_start_main @ 0x558aa4df51c9 _start llvm-svn: 355515	2019-03-06 15:23:50 +00:00
Strahinja Petrovic	94fccc93de	[PowerPC] Add secure plt support for TLS symbols This patch supports secure plt mode for TLS symbols. Differential Revision: https://reviews.llvm.org/D45520 llvm-svn: 355513	2019-03-06 15:00:10 +00:00
Simon Pilgrim	1bdc2d1874	[DAGCombiner] Add SADDO/SSUBO combine support Basic constant handling folds, for both scalars and vectors Differential Revision: https://reviews.llvm.org/D58967 llvm-svn: 355506	2019-03-06 14:22:21 +00:00
Simon Pilgrim	3f37538b86	[llvm-mca][X86] Add ADC/SBB with zero test cases Some targets have fast-path handling for these patterns that we should model. llvm-svn: 355498	2019-03-06 12:51:16 +00:00
Roman Lebedev	4764310505	[X86][NFC] Autogenerate check lines in cmovcmov.ll test Investigating 8-bit cmov promotion, this test comes up. llvm-svn: 355496	2019-03-06 11:47:43 +00:00
Simon Pilgrim	642f53d292	[DAGCombiner] Enable SMULO/UMULO vector combine support (PR40442) Differential Revision: https://reviews.llvm.org/D58968 llvm-svn: 355495	2019-03-06 11:04:21 +00:00
Simon Pilgrim	468bb2e601	[X86][SSE] VSELECT(XOR(Cond,-1), LHS, RHS) --> VSELECT(Cond, RHS, LHS) As noticed on D58965 DAGCombiner::visitSELECT has something similar, so we should be able to move this to DAGCombiner and support VSELECT as well at some point. Differential Revision: https://reviews.llvm.org/D58974 llvm-svn: 355494	2019-03-06 10:54:43 +00:00
Ayonam Ray	2a0f2c5ef3	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 355490	2019-03-06 10:01:02 +00:00
Ayonam Ray	af92b7a3b8	Reversing the commit of revision 355483 since it is giving a regression on a newly added test. llvm-svn: 355487	2019-03-06 07:51:28 +00:00
Craig Topper	c0e01d29a4	[X86] Enable the add with 128 -> sub with -128 encoding trick with X86ISD::ADD when the carry flag isn't used. This allows us to use an 8-bit sign extended immediate instead of a 16 or 32 bit immediate. Also do similar for 0x80000000 with 64-bit adds to avoid having to use a movabsq. llvm-svn: 355485	2019-03-06 07:36:38 +00:00
Craig Topper	97a1c4c340	[X86] Suppress load folding for add/sub with 128 immediate. 128 won't fit in a sign extended 8-bit immediate, but we can negate it to -128 and use the other operation. This results in a shorter encoding since the move would have used 16 or 32 bits for the immediate. llvm-svn: 355484	2019-03-06 07:36:36 +00:00
Ayonam Ray	6025fa8e30	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Differential Revision: https://reviews.llvm.org/D52002 Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 355483	2019-03-06 07:27:45 +00:00
Heejin Ahn	3c20b34d24	[WebAssembly] Remove trailing whitespaces in tests (NFC) Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58955 llvm-svn: 355472	2019-03-06 02:00:22 +00:00
Xing GUO	dd440675cf	[BinaryFormat] Add DT_USED tag into dynamic section. Summary: This tag is documented in https://docs.oracle.com/cd/E19253-01/817-1984/chapter6-42444/index.html Though I could not find some docs that describe it in detail, I found some code snippets. 1. ``` /* * Look up the string in the string table and get its offset. If * this succeeds, then it is possible that there is a DT_NEEDED * dynamic entry that references it. / have_string = elfedit_sec_findstr(argstate->str.sec, strpad_elt.dn_dyn.d_un.d_val, arg, &str_offset) != 0; if (have_string) { dyn = argstate->dynamic.data; for (ndx = 0; ndx < numdyn; dyn++, ndx++) { if (((dyn->d_tag == DT_NEEDED) \|\| (dyn->d_tag == DT_USED)) && (dyn->d_un.d_val == str_offset)) goto done; } } ``` `80192cd83b/usr/src/cmd/sgs/elfedit/modules/common/syminfo.c (L512)` 2. ``` case DT_USED: case DT_INIT_ARRAY: case DT_FINI_ARRAY: if (do_dynamic) { if (entry->d_tag == DT_USED && VALID_DYNAMIC_NAME (entry->d_un.d_val)) { char name = GET_DYNAMIC_NAME (entry->d_un.d_val); if (name) { printf (_("Not needed object: [%s]\n"), name); break; } } print_vma (entry->d_un.d_val, PREFIX_HEX); putchar ('\n'); } break; ``` http://web.mit.edu/freebsd/head/contrib/binutils/binutils/readelf.c 3. ``` #define DT_USED 0x7ffffffe / ignored - same as needed */ ``` https://github.com/switchbrew/switch-tools/blob/master/src/elf_common.h Reviewers: jhenderson, grimar Reviewed By: jhenderson, grimar Subscribers: emaste, krytarowski, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58762 llvm-svn: 355468	2019-03-06 01:28:40 +00:00
Mitch Phillips	f0c21e2ff5	Revert "[AtomicExpand] Allow libcall expansion for non-zero address spaces" for buildbot failures. llvm-svn: 355461	2019-03-06 00:25:40 +00:00
Florian Hahn	13bbcb3264	[ARM] Sink zext/sext operands for add and sub to enable vsubl generation. This uses the infrastructure added in rL353152 to sink zext and sexts to sub/add users, to enable vsubl/vaddl generation when NEON is available. See https://bugs.llvm.org/show_bug.cgi?id=40025. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D58063 llvm-svn: 355460	2019-03-06 00:10:03 +00:00
Jonas Devlieghere	4cc567bb9e	[DWARFFormValue] Don't consider DW_FORM_data4/8 to be section offsets. When dumping ToT clan's debug info with dwarfdump, we were seeing an error saying that that the location list overflows the debug_loc section. After reducing the testcase we figured out that we were interpreting the DW_FORM_data4 as a section offset. In DWARF3 DW_FORM_data4 and DW_FORM_data8 served also as a section offset. Until now we didn't check check for the DWARF version, because some producers (read old versions of clang) were still emitting this. The relevant code/comment was added in 2013, and I believe it's now reasonable to start checking the version. The FormValue class is a little bit of a mess because it cashes the DWARF unit and context when it extracted the value itself. Several methods of the class rely on it being present, or return an Optional for the code path that needs it. At the same time the FormValue class also used in places where there's no DWARF unit. For this patch I went with the least invasive change: checking the version from the CU when it's available. If it's not (because the form value was created from a value directly) we default to the old behavior. Differential revision: https://reviews.llvm.org/D58698 llvm-svn: 355456	2019-03-05 23:47:22 +00:00
Philip Reames	1e4c5d3611	[AtomicExpand] Allow libcall expansion for non-zero address spaces Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355453	2019-03-05 23:00:14 +00:00
Roman Lebedev	98d412ff13	[X86][NFC] Add proper test for promotion of i8 cmov's of trunc's There was no proper test for that code in X86TargetLowering::LowerSELECT(). Noticed accidentally while trying to modify the last branch in that function. llvm-svn: 355452	2019-03-05 22:43:53 +00:00
Heejin Ahn	ef9d6aea45	[WebAssembly] Disable MachineBlockPlacement pass Summary: This pass hurts code size for wasm and sometimes generates irreducible control flow. Context: https://github.com/emscripten-core/emscripten/pull/8233 Reviewers: kripken, dschuff Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58953 llvm-svn: 355437	2019-03-05 20:35:34 +00:00
Roman Lebedev	c38831e11d	[NFC][CodeGen][X86][AArch64] Add tests for C++ std::midpoint() pattern (PR40965) Tests only for integers, not floating point or pointers. The scalar 8-bit case uses branch instead of CMOV, because there is no no 8-bit CMOV. Vector tests are for consistency, since it can be vectorized. https://bugs.llvm.org/show_bug.cgi?id=40965 llvm-svn: 355436	2019-03-05 20:18:47 +00:00
Matt Arsenault	870397739e	AMDGPU: Preserve undef flag when expanding SI_IF Fixes undefined value verifier error. llvm-svn: 355426	2019-03-05 18:38:00 +00:00
Craig Topper	4a9dd7c39b	[X86] Enable 8-bit SHL to convert to LEA Differential Revision: https://reviews.llvm.org/D58870 llvm-svn: 355425	2019-03-05 18:37:41 +00:00
Craig Topper	216bf7f03b	[X86] Allow 8-bit INC/DEC to be converted to LEA. We already do this for 16/32/64 as well as 8-bit add with register/immediate. Might as well do it for 8-bit INC/DEC too. Differential Revision: https://reviews.llvm.org/D58869 llvm-svn: 355424	2019-03-05 18:37:37 +00:00
Craig Topper	572e94ca02	[X86] Enable 8-bit OR with disjoint bits to convert to LEA We already support 8-bits adds in convertToThreeAddress. But we can also support 8-bit OR if the bits are disjoint. We already do this for 16/32/64. Differential Revision: https://reviews.llvm.org/D58863 llvm-svn: 355423	2019-03-05 18:37:33 +00:00
Florian Hahn	add2d2e304	[SLP] Fix invalid triple in X86 tests x86-64 is an invalid architecture in triples. Changing it to the correct triple (x86_64) changes some tests, because SLP is not deemed profitable any more. Reviewers: ABataev, RKSimon, spatel Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D58931 llvm-svn: 355420	2019-03-05 17:56:35 +00:00
Javed Absar	34d3b80dba	TableGen: Allow lists to be concatenated through '#' Currently one can concatenate strings using hash(#), but not lists, although that would be a natural thing to do. This patch allows one to write something like: def : A<!listconcat([1,2], [3,4])>; simply as : def : A<[1,2] # [3,4]>; This was missing feature was highlighted by Nicolai at FOSDEM talk. Reviewed by: nhaehnle, hfinkel Differential Revision: https://reviews.llvm.org/D58895 llvm-svn: 355414	2019-03-05 17:16:07 +00:00
Simon Pilgrim	40441aa86a	[X86][SSE] Regenerate vector zero tests llvm-svn: 355412	2019-03-05 16:52:14 +00:00
Jessica Paquette	00d5847b5c	Revert "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT" This broke test-suite::aarch64_neon_intrinsics.test Reverting while I look into it. Example failure: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/17740 llvm-svn: 355408	2019-03-05 15:47:00 +00:00
Simon Pilgrim	f011e53a78	[X86] Add SMULO/UMULO combine tests Include scalar and vector test variants covering the folds in DAGCombiner (vector isn't currently supported - PR40442) llvm-svn: 355407	2019-03-05 15:36:45 +00:00
Simon Pilgrim	65676571e1	Fix typo in constant vector llvm-svn: 355405	2019-03-05 15:06:01 +00:00
Simon Pilgrim	a3d06ccd5e	[X86] Add SADDO/UADDO and SSUBO/USUBO combine tests Include scalar and vector test variants covering the folds in DAGCombiner (vector isn't currently supported - PR40442) llvm-svn: 355404	2019-03-05 14:52:42 +00:00
Simon Pilgrim	4d93b9c75c	[X86] Add test cases for D58874 Add scalar and vector test cases for missing (add (add (xor a, -1), b), 1) -> (sub b, a) fold llvm-svn: 355400	2019-03-05 13:52:09 +00:00
George Rimar	ade3c70537	[llvm-objcopy] - Simplify `isCompressable` and fix the issue relative. When --compress-debug-sections is given, llvm-objcopy do not compress sections that have "ZLIB" header in data. Normally this signature is used in zlib-gnu compression format. But if zlib-gnu used then the name of the compressed section should start from .z* (e.g .zdebug_info). If it does not, then it is not a zlib-gnu format and section should be treated as a normal uncompressed section. Differential revision: https://reviews.llvm.org/D58908 llvm-svn: 355399	2019-03-05 13:07:43 +00:00
Carl Ritson	9e3f7d8ad0	[AMDGPU] Fix DPP operand order in atomic optimizer Summary: Ensure order of operands in DPP atomic optimizer final WWM step is appropriate for sub instructions. Change-Id: I631d050e1c00a3b4bc7c11a90437064403c4cf30 Reviewers: sheredom, tpr Reviewed By: sheredom Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58900 llvm-svn: 355394	2019-03-05 12:21:44 +00:00
David Green	4511f3fa86	[SCEV] Ensure that isHighCostExpansion takes into account what is being divided A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-code expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393	2019-03-05 12:12:18 +00:00
David Green	3bcb0aa7f9	[SCEV] Add some extra tests for IndVarSimplifys loop exit values. NFC. Add some tests for various loops of the form: while(S >= 32) { S -= 32; something(); }; return S; llvm-svn: 355389	2019-03-05 11:18:55 +00:00
Oliver Stannard	4a9086b537	[ARM] Fix select_cc lowering for fp16 When lowering a select_cc node where the true and false values are of type f16, we can't use a general conditional move because the FP16 instructions do not support conditional execution. Instead, we must ensure that the condition code is one of the four supported by the VSEL instruction. Differential revision: https://reviews.llvm.org/D58813 llvm-svn: 355385	2019-03-05 10:42:34 +00:00
David Stuttard	81eec58a0d	[AMDGPU] Omit KILL instructions from hazard recognizer Summary: In some cases the KILL was causing a hazard to be introduced as these were scheduled into hazard slots, but don't result in an instruction. KILL shouldn't be considered for hazard recognition. Change-Id: Ib6d2a2160f8c94cd0ce611ab198c7e4f46aeffcf Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58898 llvm-svn: 355384	2019-03-05 10:25:16 +00:00
Chen Zheng	9cfe7e81f1	[PowerPC] fix killed/dead flag after convert x-form to d-form tranformation. Differential Revision: https://reviews.llvm.org/D58428 llvm-svn: 355378	2019-03-05 04:56:54 +00:00
Xing GUO	013e17f50e	[ARM][MC] Update one test case in 'test/MC/Disassembler/ARM/invalid-armv7.txt' Summary: Instruction `[0xfe 0xf0 0x20 0xe3]` is a valid instruction on ARM-v7, which is `dbg #14`. See: https://www.cl.cam.ac.uk/research/srg/han/ACS-P35/zynq/ARMv7-A-R-manual.pdf (Page: 377) ``` Encoding A1: DBG<c> #<option> \|31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16\|15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00\| \| cond \| 0 0 1 1 0\| 0\| 1 0\| 0 0 0 0\| 1 1 1 1\| 0 0 0 0\| 1 1 1 1\| option \| ``` Reviewers: fhahn, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58873 llvm-svn: 355374	2019-03-05 03:07:56 +00:00
Scott Linder	efec1396ac	[AMDGPU] Implement AMDGPUMCInstrAnalysis Implement MCInstrAnalysis for AMDGPU, with default implementations save for `evaluateBranch`. Differential Revision: https://reviews.llvm.org/D58400 llvm-svn: 355373	2019-03-05 03:02:00 +00:00
Yonghong Song	d82247cb80	[BPF] Do not generate BTF sections unnecessarily If There is no types/non-empty strings, do not generate .BTF section. If there is no func_info/line_info, do not generate .BTF.ext section. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D58936 llvm-svn: 355360	2019-03-05 01:01:21 +00:00
Florian Hahn	fd2d89f98b	Fix invalid target triples in tests. (NFC) llvm-svn: 355349	2019-03-04 23:37:41 +00:00
Evgeniy Stepanov	53d7c5cd44	[msan] Instrument x86 BMI intrinsics. Summary: They simply shuffle bits. MSan needs to do the same with shadow bits, after making sure that the shuffle mask is fully initialized. Reviewers: pcc, vitalybuka Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58858 llvm-svn: 355348	2019-03-04 22:58:20 +00:00
Sanjay Patel	3b2d0bc7c2	[CodeGenPrepare] avoid crashing on non-canonical/degenerate code The test is reduced from an example in the post-commit thread for: rL354746 http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html While we must avoid dying here, the real question should be: Why is non-canonical and/or degenerate code making it to CGP when using the new pass manager? llvm-svn: 355345	2019-03-04 22:47:13 +00:00
Jessica Paquette	caf62b1d47	[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT This adds instruction selection support for G_EXTRACT_VECTOR_ELT for cases where the index is defined by a G_CONSTANT. It also factos out the lane copy opcode selection part into its own function, `getLaneCopyOpcode`. This is used by both `selectUnmergeValues` and `selectExtractElt`. Differential Revision: https://reviews.llvm.org/D58469 llvm-svn: 355344	2019-03-04 22:35:32 +00:00
Jessica Paquette	0632e12f89	[GlobalISel][AArch64] Legalize vector G_SELECT Just scalarize it, and add a test showing it works. Differential Revision: https://reviews.llvm.org/D58747 llvm-svn: 355339	2019-03-04 21:12:46 +00:00
Sanjay Patel	6e32b46b1d	[ConstantHoisting] avoid hang/crash from unreachable blocks (PR40930) I'm not too familiar with this pass, so there might be a better solution, but this appears to fix the degenerate: PR40930 PR40931 PR40932 PR40934 ...without affecting any real-world code. As we've seen in several other passes, when we have unreachable blocks, they can contain semi-bogus IR and/or cause unexpected conditions. We would not typically expect these patterns to make it this far, but we have to guard against them anyway. llvm-svn: 355337	2019-03-04 20:57:14 +00:00
Nikita Popov	8670faf939	[InstCombine] Add tests for add nsw + sadd.with.overflow; NFC Baseline tests for D58881, which fixes part of PR38146. Patch by Dan Robertson. llvm-svn: 355328	2019-03-04 19:35:46 +00:00
Amara Emerson	8acb0d9c82	Re-commit r355104: "[AArch64][GlobalISel] Add support for 64 bit vector shuffle using TBL1." The code to materialize a mask from a constant pool load tried to use a 128 bit LDR to load a 64 bit constant pool entry, which was 8 byte aligned. This resulted in a link failure in the NEON tests in the test suite since the LDR address was unaligned. This change fixes that to instead emit a 64 bit LDR if the entry is 64 bit, before converting back to a 128 bit register for the TBL. llvm-svn: 355326	2019-03-04 19:16:00 +00:00
Nirav Dave	05e2335076	[MC] Teach ELFObjectWriter that parse-time variables do not appear in symbol table. llvm-svn: 355325	2019-03-04 19:12:56 +00:00
Craig Topper	509a8a3cf1	[DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast (build_vector constants)) between legalize types and legalize dag. This patch enables combining integer bitcasts of integer build vectors when the new scalar type is legal. I've avoided floating point because the implementation bitcasts float to int along the way and we would need to check the intermediate types for legality Differential Revision: https://reviews.llvm.org/D58884 llvm-svn: 355324	2019-03-04 19:12:16 +00:00
Wouter van Oortmerssen	f3feb6adb9	[WebAssembly] Add support for data sections in the assembler. Summary: This is quite minimal so far, introduce them with .section, fill them with .int8 or .asciz, end with .size Reviewers: dschuff, sbc100, aheejin Subscribers: jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58660 llvm-svn: 355321	2019-03-04 17:18:04 +00:00
Simon Pilgrim	eeb1144d27	[X86] Regenerate illegal type load test with non-undef load address. This would be affected by an upcoming patch without undoing some of the bugpoint reduction. llvm-svn: 355316	2019-03-04 14:49:02 +00:00
Dmitry Preobrazhensky	6023d5990d	[AMDGPU][MC] Enable lds_direct operand for v_readfirstlane_b32, v_readlane_b32 and v_writelane_b32 See bug 40662: https://bugs.llvm.org/show_bug.cgi?id=40662 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58713 llvm-svn: 355312	2019-03-04 12:48:32 +00:00
Andrea Di Biagio	be3281a281	[MCA] Highlight kernel bottlenecks in the summary view. This patch adds a new flag named -bottleneck-analysis to print out information about throughput bottlenecks. MCA knows how to identify and classify dynamic dispatch stalls. However, it doesn't know how to analyze and highlight kernel bottlenecks. The goal of this patch is to teach MCA how to correlate increases in backend pressure to backend stalls (and therefore, the loss of throughput). From a Scheduler point of view, backend pressure is a function of the scheduler buffer usage (i.e. how the number of uOps in the scheduler buffers changes over time). Backend pressure increases (or decreases) when there is a mismatch between the number of opcodes dispatched, and the number of opcodes issued in the same cycle. Since buffer resources are limited, continuous increases in backend pressure would eventually leads to dispatch stalls. So, there is a strong correlation between dispatch stalls, and how backpressure changed over time. This patch teaches how to identify situations where backend pressure increases due to: - unavailable pipeline resources. - data dependencies. Data dependencies may delay execution of instructions and therefore increase the time that uOps have to spend in the scheduler buffers. That often translates to an increase in backend pressure which may eventually lead to a bottleneck. Contention on pipeline resources may also delay execution of instructions, and lead to a temporary increase in backend pressure. Internally, the Scheduler classifies instructions based on whether register / memory operands are available or not. An instruction is marked as "ready to execute" only if data dependencies are fully resolved. Every cycle, the Scheduler attempts to execute all instructions that are ready to execute. If an instruction cannot execute because of unavailable pipeline resources, then the Scheduler internally updates a BusyResourceUnits mask with the ID of each unavailable resource. ExecuteStage is responsible for tracking changes in backend pressure. If backend pressure increases during a cycle because of contention on pipeline resources, then ExecuteStage sends a "backend pressure" event to the listeners. That event would contain information about instructions delayed by resource pressure, as well as the BusyResourceUnits mask. Note that ExecuteStage also knows how to identify situations where backpressure increased because of delays introduced by data dependencies. The SummaryView observes "backend pressure" events and prints out a "bottleneck report". Example of bottleneck report: ``` Cycles with backend pressure increase [ 99.89% ] Throughput Bottlenecks: Resource Pressure [ 0.00% ] Data Dependencies: [ 99.89% ] - Register Dependencies [ 0.00% ] - Memory Dependencies [ 99.89% ] ``` A bottleneck report is printed out only if increases in backend pressure eventually caused backend stalls. About the time complexity: Time complexity is linear in the number of instructions in the Scheduler::PendingSet. The average slowdown tends to be in the range of ~5-6%. For memory intensive kernels, the slowdown can be significant if flag -noalias=false is specified. In the worst case scenario I have observed a slowdown of ~30% when flag -noalias=false was specified. We can definitely recover part of that slowdown if we optimize class LSUnit (by doing extra bookkeeping to speedup queries). For now, this new analysis is disabled by default, and it can be enabled via flag -bottleneck-analysis. Users of MCA as a library can enable the generation of pressure events through the constructor of ExecuteStage. This patch partially addresses https://bugs.llvm.org/show_bug.cgi?id=37494 Differential Revision: https://reviews.llvm.org/D58728 llvm-svn: 355308	2019-03-04 11:52:34 +00:00
Jeremy Morse	09d8ea5282	[X86] Avoid codegen changes when DBG_VALUE appears between lowered selects X86TargetLowering::EmitLoweredSelect presently detects sequences of CMOV pseudo instructions without accounting for debug intrinsics. This leads to different codegen with and without option -g, if a DBG_VALUE instruction lands in the middle of several lowered selects. Work around this by skipping over debug instructions when looking for CMOV sequences, and sinking those debug insts into the EmitLoweredSelect sunk block. This might slightly shift where variables appear in the instruction sequence, but won't re-order assignments. Differential Revision: https://reviews.llvm.org/D58672 llvm-svn: 355307	2019-03-04 10:56:02 +00:00
Oliver Stannard	181afc7f3b	[ARM] Fix selection of VLDR.16 instruction with imm offset The isScaledConstantInRange function takes upper and lower bounds which are checked after dividing by the scale, so the bounds checks for half, single and double precision should all be the same. Previously, we had wrong bounds checks for half precision, so selected an immediate the instructions can't actually represent. Differential revision: https://reviews.llvm.org/D58822 llvm-svn: 355305	2019-03-04 09:17:38 +00:00
Eugene Leviant	daea28ab64	[DebugInfo] Construct nested types on behalf of owner CU Differential revision: https://reviews.llvm.org/D58786 llvm-svn: 355303	2019-03-04 07:15:36 +00:00
Davide Italiano	672bec223d	[InstCombine] Mark debug values as unavailable after DCE. Fixes PR40838. llvm-svn: 355301	2019-03-04 04:38:58 +00:00
Heejin Ahn	195a62e9ae	[WebAssembly] Delete ThrowUnwindDest map from WasmEHFuncInfo Summary: Before when we implemented the first EH proposal, 'catch <tag>' instruction may not catch an exception so there were multiple EH pads an exception can unwind to. That means a BB could have multiple EH pad successors. Now after we switched to the new proposal, every 'catch' instruction catches an exception, and there is only one catchpad per catchswitch, so we at most have one EH pad successor, making `ThrowUnwindDest` map in `WasmEHInfo` unnecessary. Keeping `ThrowUnwindDest` map in `WasmEHInfo` has its own problems, because other optimization passes can split a BB that contains possibly throwing calls (previously invokes), and we have to update the map every time that happens, which is not easy for common CodeGen passes. This also correctly updates successor info in LateEHPrepare when we add a rethrow instruction. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58486 llvm-svn: 355296	2019-03-03 22:35:56 +00:00
Craig Topper	e9e4a0f5b4	[X86] Regenerate test to get the full FP operands printed. NFC Missed when I updated the printer to print implicit %st operand on binops. llvm-svn: 355295	2019-03-03 20:28:52 +00:00
Sanjay Patel	e076491759	[InstCombine] remove stale FIXME comment from test; NFC llvm-svn: 355293	2019-03-03 19:08:54 +00:00
Sanjay Patel	2a70703770	[ValueTracking] do not try to peek through bitcasts in computeKnownBitsFromAssume() There are no tests for this case, and I'm not sure how it could ever work, so I'm just removing this option from the matcher. This should fix PR40940: https://bugs.llvm.org/show_bug.cgi?id=40940 llvm-svn: 355292	2019-03-03 18:59:33 +00:00
Amaury Sechet	d341a94261	Add extra ops in add to sub transform test in order to enforce proper operand ordering. NFC llvm-svn: 355291	2019-03-03 15:11:13 +00:00
Simon Pilgrim	d8e91a54c0	[X86] getShuffleScalarElt - peek through insert/extract subvector nodes. llvm-svn: 355288	2019-03-03 14:11:05 +00:00
Craig Topper	ce68659772	[X86] Prefer VPBLENDD for v2i64/v4i64 blends with AVX2. We were using VPBLENDW for v2i64 and VBLENDPD for v4i64. VPBLENDD has better throughput than VPBLENDW on some CPUs so it makes sense to use it when possible. VBLENDPD will probably become VBLENDD during execution domain fixing, but we might as well use integer in isel while we can. This should work around some issues with the domain fixing pass prefering PBLENDW when we start with PBLENDW. There may still be some v8i16 cases that could use PBLENDD. llvm-svn: 355281	2019-03-03 00:18:07 +00:00
Amaury Sechet	315d0bbb9c	Add test case for add to sub transformation. NFC llvm-svn: 355277	2019-03-02 20:12:25 +00:00
Sanjay Patel	1f65903dc1	[InstCombine] move add after smin/smax Follow-up to rL355221. This isn't specifically called for within PR14613, but we'll get there eventually if it's not already requested in some other bug report. https://rise4fun.com/Alive/5b0 Name: smax Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i8 %x, C0 %cond = icmp sgt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp sgt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nsw i8 %u2, C0 Name: smin Pre: WillNotOverflowSignedSub(C1,C0) %a = add nsw i32 %x, C0 %cond = icmp slt i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp slt i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nsw i32 %u2, C0 llvm-svn: 355272	2019-03-02 16:45:10 +00:00
Sanjay Patel	42ad8685c6	[InstCombine] add tests for add+smin/smax; NFC llvm-svn: 355271	2019-03-02 16:45:05 +00:00
Amaury Sechet	31291a403c	Add test case for add to sub transformation. NFC llvm-svn: 355269	2019-03-02 14:28:59 +00:00
Xing GUO	23b1dfe675	[Transforms] fix typo in test case. NFC. llvm-svn: 355265	2019-03-02 08:32:32 +00:00
Xing GUO	33649349c5	[Codegen] fix typos in test case llvm-svn: 355264	2019-03-02 08:03:59 +00:00
Xing GUO	b285878907	[llvm-objdump] Should print unknown d_tag in hex format Summary: Currently, `llvm-objdump` prints "unknown" instead of d_tag value in hex format. Because getDynamicTagAsString returns "unknown" rather than empty string. Reviewers: grimar, jhenderson Reviewed By: jhenderson Subscribers: rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58763 llvm-svn: 355262	2019-03-02 04:20:28 +00:00
Thomas Lively	43876ae7bc	[WebAssembly] Expand operations not supported by SIMD Summary: This prevents crashes in instruction selection when these operations are used. The tests check that the scalar version of the instruction is used where applicable, although some expansions do not use the scalar version. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58859 llvm-svn: 355261	2019-03-02 03:32:25 +00:00
Amaury Sechet	f24abf6511	[X86] Improve use of SHLD/SHRD Summary: This extends the variety of pattern that can generate a SHLD instead of using two shifts. This fixes a regression that would be introduced by D57367 or D33587 Reviewers: RKSimon, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57389 llvm-svn: 355260	2019-03-02 02:44:16 +00:00
Florian Hahn	98f11a7d75	[SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR. In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) \|\| !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259	2019-03-02 02:31:44 +00:00
Amaury Sechet	1cc0f6061f	Add test case for truncate funnel shifts. NFC llvm-svn: 355258	2019-03-02 02:24:36 +00:00
Caroline Tice	bcdb1f3d04	llvm-dwarfdump: Add new variable, parameter and inlining statistics; also function source location statistics. Add statistics for abstract origins, function, variable and parameter locations; break the 'variable' counts down into variables and parameters. Also update call site counting to check for DW_AT_call_{file,line} in addition to DW_TAG_call_site. Differential revision: https://reviews.llvm.org/D58849 llvm-svn: 355243	2019-03-01 23:51:54 +00:00
Craig Topper	35f55d72f6	[X86] Remove IntrArgMemOnly from target specific gather/scatter intrinsics IntrArgMemOnly implies that only memory pointed to by pointer typed arguments will be accessed. But these intrinsics allow you to pass null to the pointer argument and put the full address into the index argument. Other passes won't be able to understand this. A colleague found that ISPC was creating gathers like this and then dead store elimination removed some stores because it didn't understand what the gather was doing since the pointer argument was null. Differential Revision: https://reviews.llvm.org/D58805 llvm-svn: 355228	2019-03-01 21:02:40 +00:00
Craig Topper	e6bfb0919c	[X86] Add test case for D58805. NFC This demonstrates dead store elimination removing a store that may alias a gather that uses null as its base. llvm-svn: 355227	2019-03-01 21:02:34 +00:00
Paul Robinson	1ca25763f0	[DWARF] Make -g with empty assembler source work better. This was sometimes causing clang or llvm-mc to crash, and in other cases could emit a bogus DWARF line-table header. I did an interim patch in r352541; this patch should be a cleaner and more complete fix, and retains the test. Addresses PR40538. Differential Revision: https://reviews.llvm.org/D58750 llvm-svn: 355226	2019-03-01 20:58:04 +00:00
Philip Reames	cf0a978e1f	[InstCombine] Extend saturating idempotent atomicrmw transform to FP I'm assuming that the nan propogation logic for InstructonSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw. Differential Revision: https://reviews.llvm.org/D58836 llvm-svn: 355222	2019-03-01 19:50:36 +00:00
Sanjay Patel	6e1e7e1c3e	[InstCombine] move add after umin/umax In the motivating cases from PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 ...moving the add enables us to narrow the min/max which eliminates zext/trunc which enables signficantly better vectorization. But that bug is still not completely fixed. https://rise4fun.com/Alive/5KQ Name: umax Pre: C1 u>= C0 %a = add nuw i8 %x, C0 %cond = icmp ugt i8 %a, C1 %r = select i1 %cond, i8 %a, i8 C1 => %c2 = icmp ugt i8 %x, C1-C0 %u2 = select i1 %c2, i8 %x, i8 C1-C0 %r = add nuw i8 %u2, C0 Name: umin Pre: C1 u>= C0 %a = add nuw i32 %x, C0 %cond = icmp ult i32 %a, C1 %r = select i1 %cond, i32 %a, i32 C1 => %c2 = icmp ult i32 %x, C1-C0 %u2 = select i1 %c2, i32 %x, i32 C1-C0 %r = add nuw i32 %u2, C0 llvm-svn: 355221	2019-03-01 19:42:40 +00:00
Sanjay Patel	20292a0526	[InstCombine] add tests for umin/umax narrowing (PR14613); NFC llvm-svn: 355220	2019-03-01 19:42:34 +00:00
Vlad Tsyrklevich	8925138007	Revert "[MIPS GlobalISel] Fix mul operands" This reverts commit r355178, it is causing ASan failures on the sanitizer bots. llvm-svn: 355219	2019-03-01 18:58:22 +00:00
Philip Reames	2226e9a745	[LICM] Infer proper alignment from loads during scalar promotion This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is not to be guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load. For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an incredibly rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we may fail to codegen the operation, or we may simply generate a slow call to a library function. The part that makes this super hard to see in practice is that the memory location actually is well aligned, and instcombine knows that. So, to see a failure, you have to have a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit. Differential Revision: https://reviews.llvm.org/D58809 llvm-svn: 355217	2019-03-01 18:45:05 +00:00
Philip Reames	1648f95eb1	[Tests] More missing atomicrmw combines llvm-svn: 355215	2019-03-01 18:24:05 +00:00
Philip Reames	21f7c35df1	[Tests] Add tests for missed optimizations of saturating and idempotent FP atomicrmws llvm-svn: 355212	2019-03-01 18:10:37 +00:00
Philip Reames	77982868c5	[InstCombine] Extend "idempotent" atomicrmw optimizations to floating point An idempotent atomicrmw is one that does not change memory in the process of execution. We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR. Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load. As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future. Differential Revision: https://reviews.llvm.org/D58251 llvm-svn: 355210	2019-03-01 18:00:07 +00:00

... 4 5 6 7 8 ...

60324 Commits