llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	aa9bcd56b1	AMDGPU: Remove llvm.AMDGPU.kill This is the last of the old AMDGPU intrinsics. llvm-svn: 348615	2018-12-07 17:46:16 +00:00
Sanjay Patel	3af4ae9735	[DAGCombiner] disable truncation of binops by default As discussed in the post-commit thread of r347917, this transform is fighting with an existing transform causing an infinite loop or out-of-memory, so this is effectively reverting r347917 and its follow-up r348195 while we investigate the bug. llvm-svn: 348604	2018-12-07 15:47:52 +00:00
Nikita Popov	110cf05203	Reapply "[DemandedBits][BDCE] Support vectors of integers" DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. Unlike the previous iteration of this patch, getDemandedBits() can now again be called on arbirary (sized) instructions, even if they don't have integer or vector of integer type. (For vector types the size of the returned mask will now be the scalar size in bits though.) The added LoopVectorize test case shows a case which triggered an assertion failure with the previous attempt, because getDemandedBits() was called on a pointer-typed instruction. Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348602	2018-12-07 15:38:13 +00:00
Graham Sellers	b297379ef0	[AMDGPU] Shrink scalar AND, OR, XOR instructions This change attempts to shrink scalar AND, OR and XOR instructions which take an immediate that isn't inlineable. It performs: AND s0, s0, ~(1 << n) -> BITSET0 s0, n OR s0, s0, (1 << n) -> BITSET1 s0, n AND s0, s1, x -> ANDN2 s0, s1, ~x OR s0, s1, x -> ORN2 s0, s1, ~x XOR s0, s1, x -> XNOR s0, s1, ~x In particular, this catches setting and clearing the sign bit for fabs (and x, 0x7ffffffff -> bitset0 x, 31 and or x, 0x80000000 -> bitset1 x, 31). llvm-svn: 348601	2018-12-07 15:33:21 +00:00
Sanjay Patel	bb796cd61c	[DAGCombiner] remove explicit calls to AddToWorkList; NFCI As noted in the post-commit thread for rL347917: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608936.html ...we don't need to repeat these calls because the combiner does it automatically. llvm-svn: 348597	2018-12-07 15:00:56 +00:00
Max Kazantsev	b9e65cbddf	Introduce llvm.experimental.widenable_condition intrinsic This patch introduces a new instinsic `@llvm.experimental.widenable_condition` that allows explicit representation for guards. It is an alternative to using `@llvm.experimental.guard` intrinsic that does not contain implicit control flow. We keep finding places where `@llvm.experimental.guard` is not supported or treated too conservatively, and there are 2 reasons to that: - `@llvm.experimental.guard` has memory write side effect to model implicit control flow, and this sometimes confuses passes and analyzes that work with memory; - Not all passes and analysis are aware of the semantics of guards. These passes treat them as regular throwing call and have no idea that the condition of guard may be used to prove something. One well-known place which had caused us troubles in the past is explicit loop iteration count calculation in SCEV. Another example is new loop unswitching which is not aware of guards. Whenever a new pass appears, we potentially have this problem there. Rather than go and fix all these places (and commit to keep track of them and add support in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible. The only significant difference between guards and regular explicit branches is that guard's condition can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor, and it is always legal to go there no matter what the guard condition is. The other successor is a guarded block, and it is only legal to go there if the condition is true. This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard` intrinsic. Now a widenable guard can be represented in the CFG explicitly like this: %widenable_condition = call i1 @llvm.experimental.widenable.condition() %new_condition = and i1 %cond, %widenable_condition br i1 %new_condition, label %guarded, label %deopt guarded: ; Guarded instructions deopt: call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ] The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an `undef`, but the intrinsic prevents the optimizer from folding it early. This form should exploit all optimization boons provided to `br` instuction, and it still can be widened by replacing the result of `@llvm.experimental.widenable.condition()` with `and` with any arbitrary boolean value (as long as the branch that is taken when it is `false` has a deopt and has no side-effects). For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using implicit control flow in guards". This patch introduces this new intrinsic with respective LangRef changes and a pass that converts old-style guards (expressed as intrinsics) into the new form. The naming discussion is still ungoing. Merging this to unblock further items. We can later change the name of this intrinsic. Reviewed By: reames, fedor.sergeev, sanjoy Differential Revision: https://reviews.llvm.org/D51207 llvm-svn: 348593	2018-12-07 14:39:46 +00:00
Tim Northover	4bf394be3a	ARM: use correct offset from base pointer (r6) in call frame regions. When we had dynamic call frames (i.e. sp adjustment around each call) we were including that adjustment into offsets calculated based on r6, even though it's only sp that changes. This led to incorrect stack slot accesses. llvm-svn: 348591	2018-12-07 13:43:55 +00:00
David Green	ca29c271d2	[Targets] Add errors for tiny and kernel codemodel on targets that don't support them Adds fatal errors for any target that does not support the Tiny or Kernel codemodels by rejigging the getEffectiveCodeModel calls. Differential Revision: https://reviews.llvm.org/D50141 llvm-svn: 348585	2018-12-07 12:10:23 +00:00
Simon Pilgrim	74c371da7b	Fix gcc7.3 -Wparentheses warning. NFCI. llvm-svn: 348581	2018-12-07 11:10:03 +00:00
Simon Pilgrim	9c7d85bc62	[X86] Add ivybridge to llvm-exegesis PFM counter mappings llvm-svn: 348575	2018-12-07 09:27:35 +00:00
Simon Pilgrim	d498dee7a2	[SelectionDAG] Don't pass on DemandedElts when handling SCALAR_TO_VECTOR Fixes an assertion: llc: lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2200: llvm::KnownBits llvm::SelectionDAG::computeKnownBits(llvm::SDValue, const llvm::APInt&, unsigned int) const: Assertion `(!Op.getValueType().isVector() \|\| NumElts == Op.getValueType().getVectorNumElements()) && "Unexpected vector size"' failed. Committed on behalf of: @pendingchaos (Rhys Perry) Differential Revision: https://reviews.llvm.org/D55223 llvm-svn: 348574	2018-12-07 09:18:44 +00:00
Ranjeet Singh	7a7132603b	[IR] Don't assume all functions are 4 byte aligned In some cases different alignments for function might be used to save space e.g. thumb mode with -Oz will try to use 2 byte function alignment. Similar patch that fixed this in other areas exists here https://reviews.llvm.org/D46110 This was approved previously https://reviews.llvm.org/D55115 (r348215) but when committed it caused failures on the sanitizer buildbots when building llvm with clang (containing this patch). This is now fixed because I've added a check to see if getting the parent module returns null if it does then set the alignment to 0. Differential Revision: https://reviews.llvm.org/D55115 llvm-svn: 348571	2018-12-07 08:34:59 +00:00
Markus Lavin	4dc4ebd606	[PM] Port LoadStoreVectorizer to the new pass manager. Differential Revision: https://reviews.llvm.org/D54848 llvm-svn: 348570	2018-12-07 08:23:37 +00:00
Max Kazantsev	a523a21175	[LoopSimplifyCFG] Do not deal with loops with irreducible CFG inside The current algorithm that collects live/dead/inloop blocks relies on some invariants related to RPO and PO traversals. In particular, the important fact it requires is that the only loop's latch is the first block in PO traversal. It also relies on fact that during RPO we visit all prececessors of a block before we visit this block (backedges ignored). If a loop has irreducible non-loop cycle inside, both these assumptions may break. This patch adds detection for this situation and prohibits the terminator folding for loops with irreducible CFG. We can in theory support this later, for this some algorithmic changes are needed. Besides, irreducible CFG is not a frequent situation and we can just don't bother. Thanks @uabelho for finding this! Differential Revision: https://reviews.llvm.org/D55357 Reviewed By: skatkov llvm-svn: 348567	2018-12-07 05:44:45 +00:00
Zi Xuan Wu	cf4d477b0b	[PowerPC] Fix assert from machine verify pass that missing undef register flag Fix assert about using an undefined physical register in machine instruction verify pass. The reason is that register flag undef is missing when doing transformation from If Conversion Pass. ``` Bad machine code: Using an undefined physical register - function: func_65 - basic block: %bb.0 entry (0x10024740738) - instruction: BCLR killed $cr5lt, implicit $lr8, implicit $rm, implicit undef $x3 - operand 0: killed $cr5lt LLVM ERROR: Found 1 machine code errors. ``` There are also other existing testcases with same issue. So I add -verify-machineinstrs option to open verifying. Differential Revision: https://reviews.llvm.org/D55408 llvm-svn: 348566	2018-12-07 05:25:16 +00:00
Vedant Kumar	b2a6f8e505	[CodeExtractor] Store outputs at the first valid insertion point When CodeExtractor outlines values which are used by the original function, it must store those values in some in-out parameter. This store instruction must not be inserted in between a PHI and an EH pad instruction, as that results in invalid IR. This fixes the following verifier failure seen while outlining within ObjC methods with live exit values: The unwind destination does not have an exception handling instruction! %call35 = invoke i8* bitcast (i8* (i8, i8, ...)* @objc_msgSend to i8* (i8, i8))(i8 %exn.adjusted, i8* %1) to label %invoke.cont34 unwind label %lpad33, !dbg !4183 The unwind destination does not have an exception handling instruction! invoke void @objc_exception_throw(i8* %call35) #12 to label %invoke.cont36 unwind label %lpad33, !dbg !4184 LandingPadInst not the first non-PHI instruction in the block. %3 = landingpad { i8, i32 } catch i8 null, !dbg !1411 rdar://46540815 llvm-svn: 348562	2018-12-07 03:01:54 +00:00
Armando Montanez	c9488e4d89	Revert "[llvm-tapi] Don't override SequenceTraits for std::string" Revert r348551 since it triggered some warnings that don't appear to have a quick fix. llvm-svn: 348560	2018-12-07 01:31:28 +00:00
Nikita Popov	14ca9a8355	Revert "[DemandedBits][BDCE] Support vectors of integers" This reverts commit r348549. Causing assertion failures during clang build. llvm-svn: 348558	2018-12-07 00:42:03 +00:00
Sanjay Patel	c6441c8547	[DAGCombiner] use root SDLoc for all nodes created by logic fold If this is not a valid way to assign an SDLoc, then we get this wrong all over SDAG. I don't know enough about the SDAG to explain this. IIUC, theoretically, debug info is not supposed to affect codegen. But here it has clearly affected 3 different targets, and the x86 change is an actual improvement. llvm-svn: 348552	2018-12-07 00:01:57 +00:00
Armando Montanez	8c1cd213b7	[llvm-tapi] Don't override SequenceTraits for std::string Change the ELF YAML implementation of TextAPI so NeededLibs uses flow sequence vector correctly instead of overriding the YAML implementation for std::vector<std::string>>. This should fix the test failure with the LLVM_LINK_LLVM_DYLIB build mentioned in D55381. Still passes existing tests that cover this. Differential Revision: https://reviews.llvm.org/D55390 llvm-svn: 348551	2018-12-06 23:59:32 +00:00
Sanjay Patel	86cb679851	[DAGCombiner] don't bother saving a SDLoc for a node that's dead; NFCI We shouldn't care about the debug location for a node that we're creating, but attaching the root of the pattern should be the best effort. (If this is not true, then we are doing it wrong all over the SDAG). This is no-functional-change-intended, and there are no regression test diffs...and that's what I expected. But there's a similar line above this diff, where those assumptions apparently do not hold. llvm-svn: 348550	2018-12-06 23:53:58 +00:00
Nikita Popov	cf65b9207b	[DemandedBits][BDCE] Support vectors of integers DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. The getDemandedBits() method can now only be called on instructions that have integer or vector of integer type. Previously it could be called on any sized instruction (even if it was not particularly useful). The size of the return value is now always the scalar size in bits (while previously it was the type size in bits). Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348549	2018-12-06 23:50:32 +00:00
Sanjay Patel	276cef343c	[DAGCombiner] more clean up in hoistLogicOpWithSameOpcodeHands(); NFC This code can still misbehave. llvm-svn: 348547	2018-12-06 23:39:28 +00:00
Craig Topper	2c7a9476e0	[X86] Directly create ADC/SBB nodes instead of using ADD/SUB with (and SETCC_CARRY, 1) This addresses a FIXME and avoids depending on an isel pattern match I think. I've remove the isel patterns too since he have no lit tests left that cover them. Hopefully that really means they are unused. I'm trying to decide if we need SETCC_CARRY. This removes one of its usages. Differential Revision: https://reviews.llvm.org/D55355 llvm-svn: 348536	2018-12-06 22:26:59 +00:00
Sanjay Patel	70af85b0ac	[DAGCombiner] don't group bswap with casts in logic hoisting fold This was probably organized as it was because bswap is a unary op. But that's where the similarity to the other opcodes ends. We should not limit this transform to scalars, and we should not try it if either input has other uses. This is another step towards trying to clean this whole function up to prevent it from causing infinite loops and memory explosions. Earlier commits in this series: rL348501 rL348508 rL348518 llvm-svn: 348534	2018-12-06 22:10:44 +00:00
Sanjay Patel	03a3ef2a0c	[DAGCombiner] reduce indent; NFC Unlike some of the folds in hoistLogicOpWithSameOpcodeHands() above this shuffle transform, this has the expected hasOneUse() checks in place. llvm-svn: 348523	2018-12-06 20:02:47 +00:00
Andrea Di Biagio	52a2bac583	[DagCombiner][X86] Simplify a ConcatVectors of a scalar_to_vector with undef. This patch introduces a new DAGCombiner rule to simplify concat_vectors nodes: concat_vectors( bitcast (scalar_to_vector %A), UNDEF) --> bitcast (scalar_to_vector %A) This patch only partially addresses PR39257. In particular, it is enough to fix one of the two problematic cases mentioned in PR39257. However, it is not enough to fix the original test case posted by Craig; that particular case would probably require a more complicated approach (and knowledge about used bits). Before this patch, we used to generate the following code for function PR39257 (-mtriple=x86_64 , -mattr=+avx): vmovsd (%rdi), %xmm0 # xmm0 = mem[0],zero vxorps %xmm1, %xmm1, %xmm1 vblendps $3, %xmm0, %xmm1, %xmm0 # xmm0 = xmm0[0,1],xmm1[2,3] vmovaps %ymm0, (%rsi) vzeroupper retq Now we generate this: vmovsd (%rdi), %xmm0 # xmm0 = mem[0],zero vmovaps %ymm0, (%rsi) vzeroupper retq As a side note: that VZEROUPPER is completely redundant... I guess the vzeroupper insertion pass doesn't realize that the definition of %xmm0 from vmovsd is already zeroing the upper half of %ymm0. Note that on %-mcpu=btver2, we don't get that vzeroupper because pass vzeroupper insertion %pass is disabled. Differential Revision: https://reviews.llvm.org/D55274 llvm-svn: 348522	2018-12-06 19:55:38 +00:00
Sanjay Patel	bfc7ffa40f	[DAGCombiner] don't hoist logic op if operands have other uses, part 2 The PPC test with 2 extra uses seems clearly better by avoiding this transform. With 1 extra use, we also prevent an extra register move (although that might be an RA problem). The general rule should be to only make a change here if it is always profitable. The x86 diffs are all neutral. llvm-svn: 348518	2018-12-06 19:18:56 +00:00
Simon Pilgrim	845d5a0aa8	Fix Wdocumentation warning. NFCI. llvm-svn: 348517	2018-12-06 19:17:28 +00:00
Adrian Prantl	fbeeac0e1e	Reapply "Adapt gcov to changes in CFE." This reverts commit r348203 and reapplies D55085 with an additional GCOV bugfix to make the change NFC for relative file paths in .gcno files. Thanks to Ilya Biryukov for additional testing! Original commit message: Update Diagnostic handling for changes in CFE. The clang frontend no longer emits the current working directory for DIFiles containing an absolute path in the filename: and will move the common prefix between current working directory and the file into the directory: component. https://reviews.llvm.org/D55085 llvm-svn: 348512	2018-12-06 18:44:48 +00:00
Evandro Menezes	799b76eae2	[AArch64] Fix Exynos predicate Fix predicate for arithmetic instructions with shift and/or extend. llvm-svn: 348510	2018-12-06 18:25:37 +00:00
Sanjay Patel	c3717cd0d5	[DAGCombiner] don't hoist logic op if operands have other uses The AVX512 diffs are neutral, but the bswap test shows a clear overreach in hoistLogicOpWithSameOpcodeHands(). If we don't check for other uses, we can increase the instruction count. This could also fight with transforms trying to go in the opposite direction and possibly blow up/infinite loop. This might be enough to solve the bug noted here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608593.html I did not add the hasOneUse() checks to all opcodes because I see a perf regression for at least one opcode. We may decide that's irrelevant in the face of potential compiler crashing, but I'll see if I can salvage that first. llvm-svn: 348508	2018-12-06 18:16:32 +00:00
Zachary Turner	a93458b050	[PDB] Move some code around. NFC. llvm-svn: 348505	2018-12-06 17:49:15 +00:00
Sanjay Patel	e9bf78fa23	[DAGCombiner] refactor function that hoists bitwise logic; NFCI Added FIXME and TODO comments for lack of safety checks. This function is a suspect in out-of-memory errors as discussed in the follow-up thread to r347917: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608593.html llvm-svn: 348501	2018-12-06 17:08:03 +00:00
Zachary Turner	579264bd59	Support skewed stream arrays. VarStreamArray was built on the assumption that it is backed by a StreamRef, and offset 0 of that StreamRef is the first byte of the first record in the array. This is a logical and intuitive assumption, but unfortunately we have use cases where it doesn't hold. Specifically, a PDB module's symbol stream is prefixed by 4 bytes containing a magic value, and the first byte of record data in the array is actually at offset 4 of this byte sequence. Previously, we would just truncate the first 4 bytes and then construct the VarStreamArray with the resulting StreamRef, so that offset 0 of the underlying stream did correspond to the first byte of the first record, but this is problematic, because symbol records reference other symbol records by the absolute offset including that initial magic 4 bytes. So if another record wants to refer to the first record in the array, it would say "the record at offset 4". This led to extremely confusing hacks and semantics in loading code, and after spending 30 minutes trying to get some math right and failing, I decided to fix this in the underlying implementation of VarStreamArray. Now, we can say that a stream is skewed by a particular amount. This way, when we access a record by absolute offset, we can use the same values that the records themselves contain, instead of having to do fixups. Differential Revision: https://reviews.llvm.org/D55344 llvm-svn: 348499	2018-12-06 16:55:00 +00:00
Simon Pilgrim	bb650daeaf	[X86] Refactored IsSplatVector to use switch. NFCI. Initial step towards making the function more generic (and probably move into SelectionDAG). This is necessary to avoid massive codegen bloat for PR38243 (Add modulo rotate support to LowerRotate). llvm-svn: 348498	2018-12-06 16:29:14 +00:00
Alexey Bataev	2e1a782189	[DEBUGINFO, NVPTX] Disable emission of ',debug' option if only debug directives are allowed. Summary: If the output of debug directives only is requested, we should drop emission of ',debug' option from the target directive. Required for supporting of nvprof profiler. Reviewers: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46061 llvm-svn: 348497	2018-12-06 16:25:35 +00:00
Alexandros Lamprineas	e4c91f5c4c	[GVN] Don't perform scalar PRE on GEPs Partial Redundancy Elimination of GEPs prevents CodeGenPrepare from sinking the addressing mode computation of memory instructions back to its uses. The problem comes from the insertion of PHIs, which confuse CGP and make it bail. I've autogenerated the check lines of an existing test and added a store instruction to demonstrate the motivation behind this change. The store is now using the gep instead of a phi. Differential Revision: https://reviews.llvm.org/D55009 llvm-svn: 348496	2018-12-06 16:11:58 +00:00
Alexey Bataev	64ad0ad5ed	[DEBUGINFO, NVPTX]Emit last debugging directives. Summary: We may end up with not emitted debug directives at the end of the module emission. Patch fixes this problem emitting those last directives the end of the module emission. Reviewers: echristo Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D54320 llvm-svn: 348495	2018-12-06 16:02:09 +00:00
Simon Pilgrim	105a366254	DAGCombiner::visitINSERT_VECTOR_ELT - pull out repeated VT.getVectorNumElements(). NFCI. llvm-svn: 348494	2018-12-06 15:39:25 +00:00
Diogo N. Sampaio	9c9067316b	[NFC][AArch64] Split out backend features This patch splits backend features currently hidden behind architecture versions. For example, currently the only way to activate complex numbers extension is targeting an v8.3 architecture, where after the patch this extension can be added separately. This refactoring is required by the new command lines proposal: http://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html Reviewers: DavidSpickett, olista01, t.p.northover Subscribers: kristof.beyls, bryanpkc, javed.absar, pbarrio Differential revision: https://reviews.llvm.org/D54633 -- It was reverted in rL348249 due a build bot failure in one of the regression tests: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/14386 The problem seems to be that FileCheck behaves different in windows and linux. This new patch splits the test file in multiple, and does more exact pattern matching attempting to circumvent the issue. llvm-svn: 348493	2018-12-06 15:39:17 +00:00
Nicolai Haehnle	ca4a32945f	AMDGPU: Generate VALU ThreeOp Integer instructions Summary: Original patch by: Fabian Wahlster <razor@singul4rity.com> Change-Id: I148f692a88432541fad468963f58da9ddf79fac5 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, b-sumner, llvm-commits Differential Revision: https://reviews.llvm.org/D51995 llvm-svn: 348488	2018-12-06 14:33:40 +00:00
Valery Pykhtin	f479fbba5f	[AMDGPU] Partial revert of rL348371: Turn on the DPP combiner by default Turn the combiner back off as there're failures until the issue is fixed. Differential revision: https://reviews.llvm.org/D55314 llvm-svn: 348487	2018-12-06 14:20:02 +00:00
Ilya Biryukov	cb5331eb93	Revert "[LoopSimplifyCFG] Delete dead in-loop blocks" This reverts commit r348457. The original commit causes clang to crash when doing an instrumented build with a new pass manager. Reverting to unbreak our integrate. llvm-svn: 348484	2018-12-06 13:21:01 +00:00
Markus Lavin	8ba5ee57a0	Test commit: Removed trailing space in .txt file. llvm-svn: 348483	2018-12-06 13:20:27 +00:00
Diana Picus	1027249ec9	[ARM GlobalISel] Nothing is legal for Thumb ...yet! A lot of the current code should be shared for arm and thumb mode, but until we add tests and work out some of the details (e.g. checking the correct subtarget feature for G_SDIV) it's safer to bail out as early as possible for thumb targets. This should have arguably been part of r348347, which allowed Thumb functions to be handled by the IR Translator. llvm-svn: 348472	2018-12-06 09:26:14 +00:00
Roman Lebedev	98cb1216a6	[InstCombine] foldICmpWithLowBitMaskedVal(): don't miscompile -1 vector elts I was finally able to quantify what i thought was missing in the fix, it was vector constants. If we have a scalar (and %x, -1), it will be instsimplified before we reach this code, but if it is a vector, we may still have a -1 element. Thus, we want to avoid the fold if at least one element is -1. Or in other words, ignoring the undef elements, no sign bits should be set. Thus, m_NonNegative(). A follow-up for rL348181 https://bugs.llvm.org/show_bug.cgi?id=39861 llvm-svn: 348462	2018-12-06 08:14:24 +00:00
Craig Topper	6a6d77b851	[X86] Remove some leftover code for handling an i1 setcc type. NFC We should only need to handle i8 now. llvm-svn: 348460	2018-12-06 07:00:02 +00:00
Max Kazantsev	0b1d069d64	[LoopSimplifyCFG] Delete dead in-loop blocks This patch teaches LoopSimplifyCFG to delete loop blocks that have become unreachable after terminator folding has been done. Differential Revision: https://reviews.llvm.org/D54023 Reviewed By: anna llvm-svn: 348457	2018-12-06 05:45:02 +00:00
Matthias Braun	d041212c07	AArch64: Fix invalid CCMP emission The code emitting AND-subtrees used to check whether any of the operands was an OR in order to figure out if the result needs to be negated. However the OR could be hidden in further subtrees and not immediately visible. Change the code so that canEmitConjunction() determines whether the result of the generated subtree needs to be negated. Cleanup emission logic to use this. I also changed the code a bit to make all negation decisions early before we actually emit the subtrees. This fixes http://llvm.org/PR39550 Differential Revision: https://reviews.llvm.org/D54137 llvm-svn: 348444	2018-12-06 01:40:23 +00:00
Pete Cooper	e13d0992dc	Add objc.* ARC intrinsics and codegen them to their runtime methods. Reviewers: erik.pilkington, ahatanak Differential Revision: https://reviews.llvm.org/D55233 llvm-svn: 348441	2018-12-06 00:52:54 +00:00
Jessica Paquette	3cd70b385d	[MachineOutliner][NFC] Move yet another std::vector out of a loop Once again, following the wisdom of the LLVM Programmer's Manual. I think that's enough refactoring for today. :) llvm-svn: 348439	2018-12-06 00:26:21 +00:00
Jessica Paquette	d4e7d0749b	[MachineOutliner][NFC] Move std::vector out of loop See http://llvm.org/docs/ProgrammersManual.html#vector llvm-svn: 348433	2018-12-06 00:04:03 +00:00
Jessica Paquette	ca3ed964f1	[MachineOutliner][NFC] Remove IntegerInstructionMap from InstructionMapper Refactoring. This map was only used when we used a string of integers to output the outlined sequence. Since it's no longer used for anything, there's no reason to keep it around. llvm-svn: 348432	2018-12-06 00:01:51 +00:00
Amara Emerson	a0b15d8f3e	[GlobalISel] Introduce G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC and G_CONCAT_VECTOR opcodes. These opcodes are intended to subsume some of the capability of G_MERGE_VALUES, as it was too powerful and thus complex to add deal with throughout the GISel pipeline. G_BUILD_VECTOR creates a vector value from a sequence of uniformly typed scalar values. G_BUILD_VECTOR_TRUNC is a special opcode for handling scalar operands which are larger than the destination vector element type, and therefore does an implicit truncate. G_CONCAT_VECTOR creates a vector by concatenating smaller, uniformly typed, vectors together. These will be used in a subsequent commit. This commit just adds the initial infrastructure. Differential Revision: https://reviews.llvm.org/D53594 llvm-svn: 348430	2018-12-05 23:53:30 +00:00
Jessica Paquette	ce3a2dcf70	[MachineOutliner][NFC] Remove buildCandidateList and replace with findCandidates More refactoring. Since the pruning logic has changed, and the candidate list is gone, everything can be sunk into findCandidates. We no longer need to keep track of the length of the longest substring, so we can drop all of that logic as well. After this, we just find all of the candidates and move to outlining. llvm-svn: 348428	2018-12-05 23:39:07 +00:00
Jessica Paquette	e18d6ff036	[MachineOutliner][NFC] Candidates don't need to be shared_ptrs anymore More refactoring. After the changes to the pruning logic, and removing CandidateList, there's no reason for Candiates to be shared_ptrs (or pointers at all). std::shared_ptr<Candidate> -> Candidate. llvm-svn: 348427	2018-12-05 23:24:22 +00:00
David L. Jones	5ff7b8a04a	Revert r347934 "[SCEV] Guard movement of insertion point for loop-invariants" This change caused SEGVs in instcombine. (The r347934 change seems to me to be a precipitating cause, not a root cause. Details are on the llvm-commits thread for r347934.) llvm-svn: 348426	2018-12-05 23:13:50 +00:00
Sanjay Patel	998ececef0	[InstCombine] remove dead code from visitExtractElement Extracting from a splat constant is always handled by InstSimplify. Move the test for this from InstCombine to InstSimplify to make sure that stays true. llvm-svn: 348423	2018-12-05 23:09:33 +00:00
Jessica Paquette	4ae3b71df0	[MachineOutliner][NFC] Remove CandidateList, since it's now unused. After removing the pruning logic, there's no reason to populate a list of Candidates. Remove CandidateList and update comments. llvm-svn: 348422	2018-12-05 22:50:26 +00:00
Jessica Paquette	d9d9309bd4	Fix buildbot capture warning A bot didn't like my lambda. This ought to fix it. Example: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/30139/steps/build%20lld/logs/stdio error C3493: 'AlreadyRemoved' cannot be implicitly captured because no default capture mode has been specified llvm-svn: 348421	2018-12-05 22:47:25 +00:00
Jessica Paquette	235d877eea	[MachineOutliner][NFC] Simplify and unify pruning/outlining logic Since we're now performing outlining per OutlinedFunction rather than per Candidate, we can simply outline each candidate as it shows up. Instead of having a pruning phase, instead, we'll outline entire functions. Then we'll update the UnsignedVec we mapped to reflect the deletion. If any candidate is in a space that's marked dirty, then we'll drop it. This lets us remove the pruning logic entirely, and greatly simplifies the code. llvm-svn: 348420	2018-12-05 22:27:38 +00:00
Sanjay Patel	47b3b4b5aa	[InstCombine] reduce duplication in visitExtractElementInst; NFC llvm-svn: 348418	2018-12-05 21:57:51 +00:00
David Blaikie	01f1d9b589	ThinLTO: Do not import debug info for imported global constants It looks like this isn't necessary (in any tests I've done, it results in the global being described with no location or value in the imported side - while it's still fully described in the place it's imported from) & results in significant/pathological debug info growth to home these location-less global variable descriptions on the import side. This is a rather pressing/important issue to address - this regressed executable size for one example I'm looking at by 15%, object size is probably similar though I haven't measured it, and a 22x increase in the number of CUs in the cu_index in split DWARF DWP files, creating a similarly large regression in the time it takes llvm-symbolizer to run on such binaries. Reviewers: tejohnson, evgeny777 Differential Revision: https://reviews.llvm.org/D55309 llvm-svn: 348416	2018-12-05 21:42:17 +00:00
Jessica Paquette	962b3ae659	[MachineOutliner] Outline functions by order of benefit Mostly NFC, only change is the order of outlined function names. Loop over the outlined functions instead of walking the candidate list. This is a bit easier to understand. It's far more natural to create a function, then replace all of its occurrences with calls than the other way around. The functions outlined after this do not change, but their names will be decided by their benefit. E.g, OUTLINED_FUNCTION_0 will now always be the most beneficial function, rather than the first one seen. This makes it easier to enforce an ordering on the outlined functions. So, this also adds a test to make sure that the ordering works as expected. llvm-svn: 348414	2018-12-05 21:36:04 +00:00
Krzysztof Parzyszek	8eb394d764	[Hexagon] Add intrinsics for Hexagon V66 llvm-svn: 348413	2018-12-05 21:14:51 +00:00
Krzysztof Parzyszek	545a68ca4b	[Hexagon] Add instruction definitions for Hexagon V66 llvm-svn: 348411	2018-12-05 21:01:07 +00:00
Krzysztof Parzyszek	13a9cf28a1	[Hexagon] Foundation of support for Hexagon V66 llvm-svn: 348407	2018-12-05 20:18:09 +00:00
Aditya Nandakumar	f75d4f329c	[GISel]: Provide standard interface to observe changes in GISel passes https://reviews.llvm.org/D54980 This provides a standard API across GISel passes to observe and notify passes about changes (insertions/deletions/mutations) to MachineInstrs. This patch also removes the recordInsertion method in MachineIRBuilder and instead provides method to setObserver. Reviewed by: vkeles. llvm-svn: 348406	2018-12-05 20:14:52 +00:00
Vedant Kumar	09415a850e	[CodeExtractor] Do not marked outlined calls which may resume EH as noreturn Treat terminators which resume exception propagation as returning instructions (at least, for the purposes of marking outlined functions `noreturn`). This is to avoid inserting traps after calls to outlined functions which unwind. rdar://46129950 llvm-svn: 348404	2018-12-05 19:35:37 +00:00
Evandro Menezes	4b5707550c	[AArch64] Reword description of feature (NFC) Reword the description of the feature that enables custom handling of cheap instructions. llvm-svn: 348398	2018-12-05 18:42:57 +00:00
Jessica Paquette	34b618bf7e	[MachineOutliner][NFC] Don't create outlined sequence from integer mapping Some gardening/refactoring. It's cleaner to copy the instructions into the MachineFunction using the first candidate instead of going to the mapper. Also, by doing this we can remove the Seq member from OutlinedFunction entirely. llvm-svn: 348390	2018-12-05 17:57:33 +00:00
Sanjay Patel	33a448f935	[DAGCombiner] don't try to extract a fraction of a vector binop and crash (PR39893) Because we're potentially peeking through a bitcast in this transform, we need to use overall bitwidths rather than number of elements to determine when it's safe to proceed. Should fix: https://bugs.llvm.org/show_bug.cgi?id=39893 llvm-svn: 348383	2018-12-05 17:10:30 +00:00
Christian Bruel	4ead99b3ac	Allow norecurse attribute on functions that have debug infos. Summary: debug intrinsics might be marked norecurse to enable the caller function to be norecurse and optimized if needed. This avoids code gen optimisation differences when -g is used, as in globalOpt.cpp:processInternalGlobal checks. Reviewers: chandlerc, jmolloy, aprantl Reviewed By: aprantl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55187 llvm-svn: 348381	2018-12-05 16:48:00 +00:00
Chandler Carruth	71c14a36a2	[SLH] Fix a nasty bug in SLH. Whenever we effectively take the address of a basic block we need to manually update that basic block to reflect that fact or later passes such as tail duplication and tail merging can break the invariants of the code. =/ Sadly, there doesn't appear to be any good way of automating this or even writing a reasonable assert to catch it early. The change seems trivially and obviously correct, but sadly the only really good test case I have is 1000s of basic blocks. I've tried directly writing a test case that happens to make tail duplication do something that crashes later on, but this appears to require an amazingly complex set of conditions that I've not yet reproduced. The change is technically covered by the tests because we mark the blocks as having their address taken, but that doesn't really count as properly testing the functionality. llvm-svn: 348374	2018-12-05 15:42:11 +00:00
Valery Pykhtin	5b4db77b13	[AMDGPU]: Turn on the DPP combiner by default Differential revision: https://reviews.llvm.org/D55314 llvm-svn: 348371	2018-12-05 15:21:17 +00:00
Sanjay Patel	baffae91b2	[InstCombine] simplify icmps with same operands based on dominating cmp The tests here are based on the motivating cases from D54827. More background: 1. We don't get these cases in general with SimplifyCFG because the root of the pattern match is an icmp, not a branch. I'm not sure how often we encounter this pattern vs. the seemingly more likely case with branches, but I don't see evidence to leave the minimal pattern unoptimized. 2. This has a chance of increasing compile-time because we're using a ValueTracking call to handle the match. The motivating cases could be handled with a simpler pair of calls to isImpliedTrueByMatchingCmp/ isImpliedFalseByMatchingCmp, but I saw that we have a more comprehensive wrapper around those, so we might as well use it here unless there's evidence that it's significantly slower. 3. Ideally, we'd handle the fold to constants in InstSimplify, but as with the existing code here, we could extend this to handle cases where the result is not a constant, but a new combined predicate. That would mean splitting the logic across the 2 passes and possibly duplicating the pattern-matching cost. 4. As mentioned in D54827, this seems like the kind of thing that should be handled in Correlated Value Propagation, but that pass is currently limited to dealing with instructions with constant operands, so extending this bit of InstCombine is the smallest/easiest way to get these patterns optimized. llvm-svn: 348367	2018-12-05 15:04:00 +00:00
Simon Pilgrim	32483668d7	[X86][SSE] Begun adding modulo rotate support to LowerRotate Prep work for PR38243 - mainly adding comments on where we need to add modulo support (doing so at the moment causes massive codegen regressions). I've also consistently added support for modulo folding for uniform constants (although at the moment we have no way to trigger this) and removed the old assertions. llvm-svn: 348366	2018-12-05 14:46:37 +00:00
Simon Pilgrim	8fdaf5c915	[TargetLowering] Remove ISD::ANY_EXTEND/ANY_EXTEND_VECTOR_INREG opcodes from SimplifyDemandedVectorElts These have no test coverage and the KnownZero flags can't be guaranteed unlike SIGN/ZERO_EXTEND cases. llvm-svn: 348361	2018-12-05 12:20:05 +00:00
Simon Pilgrim	180639afe5	[SelectionDAG] Initial support for FSHL/FSHR funnel shift opcodes (PR39467) This is an initial patch to add a minimum level of support for funnel shifts to the SelectionDAG and to begin wiring it up to the X86 SHLD/SHRD instructions. Some partial legalization code has been added to handle the case for 'SlowSHLD' where we want to expand instead and I've added a few DAG combines so we don't get regressions from the existing DAG builder expansion code. Differential Revision: https://reviews.llvm.org/D54698 llvm-svn: 348353	2018-12-05 11:12:12 +00:00
George Rimar	a3d0d5fe68	[MC] - Fix build bot. Error was: /home/buildslave/slave_as-bldslv8/lld-perf-testsuite/llvm/lib/MC/MCFragment.cpp:241:22: error: field 'Offset' will be initialized after field 'LayoutOrder' [-Werror,-Wreorder] Atom(nullptr), Offset(~UINT64_C(0)), LayoutOrder(0) { http://lab.llvm.org:8011/builders/lld-perf-testsuite/builds/9628/steps/build-bin%2Flld/logs/stdio llvm-svn: 348351	2018-12-05 11:06:29 +00:00
Simon Pilgrim	cd8a152b18	Remove superfluous comments. NFCI. As requested in D54698. llvm-svn: 348350	2018-12-05 10:45:44 +00:00
George Rimar	79ace92fcd	Recommit r348243 - "[llvm-mc] - Do not crash when referencing undefined debug sections." The patch triggered an unrelated msan issue: LayoutOrder variable was not initialized. (http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/26794/steps/check-llvm%20msan/logs/stdio) It was fixed. Original commit message: MC has code that pre-creates few debug sections: https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCObjectFileInfo.cpp#L396 If users code has a reference to such section but does not redefine it, MC code currently asserts, because still thinks they are normally defined. The patch fixes the issue. Differential revision: https://reviews.llvm.org/D55173 ---- Modified : /llvm/trunk/lib/MC/ELFObjectWriter.cpp Added : /llvm/trunk/test/MC/ELF/undefined-debug.s llvm-svn: 348349	2018-12-05 10:43:58 +00:00
Simon Pilgrim	d24730cdda	[TargetLowering] SimplifyDemandedVectorElts - don't alter DemandedElts mask Fix potential issue with the ISD::INSERT_VECTOR_ELT case tweaking the DemandedElts mask instead of using a local copy - so later uses of the mask use the tweaked version..... Noticed while investigating adding zero/undef folding to SimplifyDemandedVectorElts and the altered DemandedElts mask was causing mismatches. llvm-svn: 348348	2018-12-05 10:37:45 +00:00
Diana Picus	8a1b4f57c9	[ARM GlobalISel] Implement call lowering for Thumb2 The only things that are different from arm are: * different opcodes for calls and returns * Thumb calls take predicate operands llvm-svn: 348347	2018-12-05 10:35:28 +00:00
Alina Sbirlea	0e216854f9	[LICM] Actually disable ControlFlowHoisting. Summary: The remaining code paths that ControlFlowHoisting introduced that were not disabled, increased compile time by 3x for some benchmarks. The time is spent in DominatorTree updates. Reviewers: john.brawn, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D55313 llvm-svn: 348345	2018-12-05 10:16:21 +00:00
Saleem Abdulrasool	efd2cb8a0d	AArch64: support funclets in fastcall and swift_call Functions annotated with `__fastcall` or `__attribute__((__fastcall__))` or `__attribute__((__swiftcall__))` may contain SEH handlers even on Win64. This matches the behaviour of cl which allows for `__try`/`__except` inside a `__fastcall` function. This was detected while trying to self-host clang on Windows ARM64. llvm-svn: 348337	2018-12-05 07:09:20 +00:00
Craig Topper	6934202dc0	[MachineLICM][X86][AMDGPU] Fix subtle bug in the updating of PhysRegClobbers in post-RA LICM It looks like MCRegAliasIterator can visit the same physical register twice. When this happens in this code in LICM we end up setting the PhysRegDef and then later in the same loop visit the register again. Now we see that PhysRegDef is set from the earlier iteration so now set PhysRegClobber. This patch splits the loop so we have one that uses the previous value of PhysRegDef to update PhysRegClobber and second loop that updates PhysRegDef. The X86 atomic test is an improvement. I had to add sideeffect to the two shrink wrapping tests to prevent hoisting from occurring. I'm not sure about the AMDGPU tests. It looks like the branch instruction changed at end the of the loops. And in the branch-relaxation test I think there is now "and vcc, exec, -1" instruction that wasn't there before. Differential Revision: https://reviews.llvm.org/D55102 llvm-svn: 348330	2018-12-05 03:41:26 +00:00
Vitaly Buka	8076c57fd2	[asan] Add clang flag -fsanitize-address-use-odr-indicator Reviewers: eugenis, m.ostapenko, ygribov Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55157 llvm-svn: 348327	2018-12-05 01:44:31 +00:00
Amara Emerson	814a6794ba	[SelectionDAG] Split very large token factors for loads into 64k chunks. There's a 64k limit on the number of SDNode operands, and some very large functions with 64k or more loads can cause crashes due to this limit being hit when a TokenFactor with this many operands is created. To fix this, create sub-tokenfactors if we've exceeded the limit. No test case as it requires a very large function. rdar://45196621 Differential Revision: https://reviews.llvm.org/D55073 llvm-svn: 348324	2018-12-05 00:41:30 +00:00
Peter Collingbourne	ff9aaa25e8	LTO: Don't internalize available_externally globals. This breaks C and C++ semantics because it can cause the address of the global inside the module to differ from the address outside of the module. Differential Revision: https://reviews.llvm.org/D55237 llvm-svn: 348321	2018-12-05 00:09:36 +00:00
Amara Emerson	8547f4fb7f	[AArch64][GlobalISel] Re-enable selection of volatile loads. We previously disabled this in r323371 because of a bug where we selected an extending load, but didn't delete the old G_LOAD, resulting in two loads being generated for volatile loads. Since we now have dedicated G_SEXTLOAD/G_ZEXTLOAD operations, and that the tablegen patterns should no longer be able to select (ext(load x)) patterns, it should be safe to re-enable it. The old test case should still work as expected. llvm-svn: 348320	2018-12-05 00:03:09 +00:00
Vitaly Buka	d6bab09b4b	[asan] Split -asan-use-private-alias to -asan-use-odr-indicator Reviewers: eugenis, m.ostapenko, ygribov Subscribers: mehdi_amini, kubamracek, hiraditya, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D55156 llvm-svn: 348316	2018-12-04 23:17:41 +00:00
Saleem Abdulrasool	a9248fdfab	AArch64: clean up some whitespace in Windows CC (NFC) Drive by clean up for Windows ARM64 variadic CC (NFC). llvm-svn: 348310	2018-12-04 22:19:29 +00:00
Zachary Turner	7c6b19f49b	[PDB] Emit S_UDT records in LLD. Previously these were dropped. We now understand them sufficiently well to start emitting them. From the debugger's perspective, this now enables us to have debug info about typedefs (both global and function-locally scoped) Differential Revision: https://reviews.llvm.org/D55228 llvm-svn: 348306	2018-12-04 21:48:46 +00:00
Nirav Dave	9c593e8676	[AVR] Silence fallthrough warning. NFC. llvm-svn: 348304	2018-12-04 21:41:52 +00:00
Stefan Pintilie	46f840f286	[PowerPC] Make no-PIC default to match GCC - LLVM Change the default for PowerPC LE to -fno-PIC. Differential Revision: https://reviews.llvm.org/D53383 llvm-svn: 348298	2018-12-04 20:14:57 +00:00
Sanjay Patel	3d5bb15a1d	[CmpInstAnalysis] fix function signature for ICmp code to predicate; NFC The old function underspecified the return type, took an unused parameter, and had a misleading name. llvm-svn: 348292	2018-12-04 18:53:27 +00:00
Nirav Dave	ce26c27b2a	[SelectionDAG] Redefine isGAPlusOffset in terms of unwrapAddress. NFCI. llvm-svn: 348288	2018-12-04 17:59:43 +00:00
Matt Arsenault	0f414b81cc	AMDGPU: Add f32 vectors to SGPR register classes llvm-svn: 348286	2018-12-04 17:51:36 +00:00

1 2 3 4 5 ...

118870 Commits