llvm-project

Commit Graph

Author	SHA1	Message	Date
James Molloy	c53b40b509	[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted. As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider: if (a) x(1); else if (b) x(2); This produces the following CFG: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ \| / [ end ] [end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch. We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects). We are now able to detect this case and split off the unconditional arcs to a common successor: [if] / \ [x(1)] [if] \| \| \ \| \| \ \| [x(2)] \| \ / \| [sink.split] \| \ / [ end ] Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases. llvm-svn: 280217	2016-08-31 10:46:33 +00:00
James Molloy	55bd04cd20	[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything. This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink. This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example: %a = load i32* %b %d = load i32* %b %c = gep i32* %a, i32 0 %e = gep i32* %d, i32 1 Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor). This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough. Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm. In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging. In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive. This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans. llvm-svn: 280216	2016-08-31 10:46:23 +00:00
James Molloy	923e98c232	[SimplifyCFG] Tail-merge calls with sideeffects This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep behaviour at least vaguely consistent with the previous version and keep it as close to NFC as I could. There's no real reason not to merge sideeffect calls though, so let's do it! Small fixup along the way to ensure we don't create indirect calls. Should fix PR28964. llvm-svn: 280215	2016-08-31 10:46:16 +00:00
Gor Nishanov	50d7fb974f	[Coroutines] Part 10: Add coroutine promise support. Summary: 1) CoroEarly now lowers llvm.coro.promise intrinsic that allows to obtain a coroutine promise pointer from a coroutine frame and vice versa. 2) CoroFrame now interprets Promise argument of llvm.coro.begin to place CoroutinPromise alloca at a deterministic offset from the coroutine frame. Now, the coroutine promise example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex4.ll). Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23993 llvm-svn: 280184	2016-08-31 00:35:41 +00:00
Alina Sbirlea	3f8f7840bf	[LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions. Resolves PR29148. Summary: LSV was using two vector sets (heads and tails) to track pairs of adjiacent position to vectorize. A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions in heads(H) and tails(T) match, which is not the case is there are multiple tails for the same head. e.g.: i1: store a[0] i2: store a[1] i3: store a[1] Leads to: H: i1 T: i2 i3 Instead of: H: i1 i1 T: i2 i3 So the positions for instructions that follow i3 will have different indexes in H/T. This patch resolves PR29148. This issue also surfaced the fact that if the chain is too long, and TLI returns a "not-fast" answer, the whole chain will be abandoned for vectorization, even though a smaller one would be beneficial. Added a testcase and FIXME for this. Reviewers: tstellarAMD, arsenm, jlebar Subscribers: mzolotukhin, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24057 llvm-svn: 280179	2016-08-30 23:53:59 +00:00
Sanjay Patel	ddb53dd080	[InstCombine] add tests to show type limitations of InsertRangeTest and callers llvm-svn: 280175	2016-08-30 23:16:59 +00:00
Michael Kuperstein	2954d1db77	[LoopVectorizer] Predicate instructions in blocks with several incoming edges We don't need to limit predication to blocks that have a single incoming edge, we just need to use the right mask. This fixes PR30172. Differential Revision: https://reviews.llvm.org/D24009 llvm-svn: 280148	2016-08-30 20:22:21 +00:00
Daniel Berlin	c943d72d94	IntrArgMemOnly is only defined (and current AA machinery only sanely supports) pointer arguments, and these intrinsics have vector of pointer arguments. Remove ArgMemOnly until we either have the machinery, define a new attribute, or something similar llvm-svn: 280143	2016-08-30 19:58:48 +00:00
Sanjay Patel	b37145712e	[InstCombine] replace divide-by-constant checks with asserts; NFC These folds already have tests for scalar and vector types, except for the vector div-by-0 case, so I'm adding tests for that. llvm-svn: 280115	2016-08-30 17:31:34 +00:00
James Molloy	d13b1239e4	[SimplifyCFG] Properly CSE metadata in SinkThenElseCodeToEnd This was missing, meaning the metadata in sunk instructions was potentially bogus and could cause miscompiles. llvm-svn: 280072	2016-08-30 10:56:08 +00:00
Teresa Johnson	8c1bc986b5	[ThinLTO] Indirect call promotion fixes for promoted local functions Summary: Fix a couple issues limiting the application of indirect call promotion in ThinLTO mode: - Invoke indirect call promotion before globalopt, since it may eliminate imported functions which appear unreferenced. - Invoke indirect call promotion with InLTO=true so that the PGOFuncName metadata is used to get the name for locals which would have been renamed during promotion. Reviewers: davidxl, mehdi_amini Subscribers: Prazek, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24004 llvm-svn: 280024	2016-08-29 22:46:56 +00:00
Matthew Simpson	df19502b16	[LV] Move insertelement sequence after scalar definitions After r279649 when getting a vector value from VectorLoopValueMap, we create an insertelement sequence on-demand if the value has been scalarized instead of vectorized. We previously inserted this insertelement sequence before the value's first vector user. However, this insert location is problematic if that user is the phi node of a first-order recurrence. With this patch, we move the insertelement sequence after the last scalar instruction we created when scalarizing the value. Thus, the value's vector definition in the new loop will immediately follow its scalar definitions. This should fix PR30183. Reference: https://llvm.org/bugs/show_bug.cgi?id=30183 llvm-svn: 280001	2016-08-29 20:14:04 +00:00
David Majnemer	e8fd5f9ffd	[SimplifyCFG] Hoisting invalidates metadata We forgot to remove optimization metadata when performing hosting during FoldTwoEntryPHINode. This fixes PR29163. llvm-svn: 279980	2016-08-29 17:14:08 +00:00
Anna Thomas	2bc129c5fd	[StatepointsForGC] Rematerialize in the presence of PHIs Summary: While walking the use chain for identifying rematerializable values in RS4GC, add the case where the current value and base value are the same PHI nodes. This will aid rematerialization of geps and casts instead of relocating. Reviewers: sanjoy, reames, igor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23920 llvm-svn: 279975	2016-08-29 15:41:59 +00:00
Sanjay Patel	25475bcc0c	[Constant] remove fdiv and frem from canTrap() Assuming the default FP env, we should not treat fdiv and frem any differently in terms of trapping behavior than any other FP op. Ie, FP ops do not trap with the default FP env. This matches how we treat the fdiv/frem in IR with isSafeToSpeculativelyExecute() and in the backend after: https://reviews.llvm.org/rL279970 llvm-svn: 279973	2016-08-29 15:27:17 +00:00
Sanjay Patel	20f02b271b	[SimplifyCFG] rename test file, regenerate checks, and add test The fdiv test shows a problem similar to: https://reviews.llvm.org/rL279970 llvm-svn: 279972	2016-08-29 14:57:53 +00:00
Gor Nishanov	dce9b02677	[Coroutines] Part 9: Add cleanup subfunction. Summary: [Coroutines] Part 9: Add cleanup subfunction. This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex3.ll) Intrinsic Changes: * coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine. * coro.id gets an extra parameter that points back to a coroutine function. This allows to check whether a coro.id describes the enclosing function or it belongs to a different function that was later inlined. CoroSplit now creates three subfunctions: # f$resume - resume logic # f$destroy - cleanup logic, followed by a deallocation code # f$cleanup - just the cleanup code CoroElide pass during devirtualization replaces coro.destroy with either f$destroy or f$cleanup depending whether heap elision is performed or not. Other fixes, improvements: * Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point. * Switched to using variable width suspend index field (no longer limited to 32 bit index field can be as little as i1 or as large as i<whatever-size_t-is>) Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23844 llvm-svn: 279971	2016-08-29 14:34:12 +00:00
Sanjay Patel	5c5311f4e5	[InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat constant vectors llvm-svn: 279937	2016-08-28 18:18:00 +00:00
Elena Demikhovsky	3622fbfc68	[Loop Vectorizer] Fixed memory confilict checks. Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930	2016-08-28 08:53:53 +00:00
Adam Nemet	cef3314156	[Inliner] Report when inlining fails because callee's def is unavailable Summary: This is obviously an interesting case because it may motivate code restructuring or LTO. Reporting this requires instantiation of ORE in the loop where the call sites are first gathered. I've checked compile-time overhead with -Rpass-with-hotness and the worst slow-down was 6% in mcf and quickly tailing off. As before without -Rpass-with-hotness there is no overhead. Because this could be a pretty noisy diagnostics, it is currently qualified as 'verbose'. As of this patch, 'verbose' diagnostics are only emitted with -Rpass-with-hotness, i.e. when the output is expected to be filtered. Reviewers: eraman, chandlerc, davidxl, hfinkel Subscribers: tejohnson, Prazek, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D23415 llvm-svn: 279860	2016-08-26 20:21:05 +00:00
Tim Shen	3ad8b43cc2	[MemCpy] Add comments for r279769 Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279778	2016-08-25 21:03:46 +00:00
Tim Shen	a3dbead2d6	[MemCpy] Check for alias in performMemCpyToMemSetOptzn, instead of the identity of two operands Summary: This fixes pr29105. The reason is that lifetime marks creates new aliasing pointers the original ones, but before this patch aliases were not checked in performMemCpyToMemSetOptzn. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279769	2016-08-25 19:27:26 +00:00
Sebastian Pop	5f0d0e60d1	GVN-hoist: fix hoistingFromAllPaths for loops (PR29034) It is invalid to hoist stores or loads if they are not executed on all paths from the hoisting point to the exit of the function. In the testcase, there are paths in the loop that do not execute the stores or the loads, and so hoisting them within the loop is unsafe. The problem is that the current implementation of hoistingFromAllPaths is incomplete: it walks all blocks dominated by the hoisting point, and does not return false when the loop contains a path on which the hoisted ld/st is not executed. Differential Revision: https://reviews.llvm.org/D23843 llvm-svn: 279732	2016-08-25 11:55:47 +00:00
Xinliang David Li	cad3a995a4	[Profile] Propagate branch metadata properly in instcombine Differential Revision: http://reviews.llvm.org/D23590 llvm-svn: 279693	2016-08-25 00:26:32 +00:00
Sanjay Patel	d398d4a39e	[InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat constant vectors llvm-svn: 279677	2016-08-24 22:22:06 +00:00
Matthew Simpson	abd2be1e2e	[LV] Unify vector and scalar maps This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap. The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand. A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not. The mapping has also be made less permissive. Entries can be added to VectorLoopValue map with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries. There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes. Differential Revision: https://reviews.llvm.org/D23169 llvm-svn: 279649	2016-08-24 18:23:17 +00:00
Sanjoy Das	ff855b6020	[SCCP] Don't delete side-effecting instructions I'm not sure if the `!isa<CallInst>(Inst) && !isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the case we know is broken. llvm-svn: 279647	2016-08-24 18:10:21 +00:00
Gil Rapaport	550148b2f6	[Loop Vectorizer] Support predication of div/rem div/rem instructions in basic blocks that require predication currently prevent vectorization. This patch extends the existing mechanism for predicating stores to handle other instructions and leverages it to predicate divs and rems. Differential Revision: https://reviews.llvm.org/D22918 llvm-svn: 279620	2016-08-24 11:37:57 +00:00
Gor Nishanov	241b041fba	[Coroutines] Part 8: Coroutine Frame Building algorithm Summary: This patch adds coroutine frame building algorithm. Now, simple coroutines such as ex0.ll and ex1.ll (first examples from docs\Coroutines.rst can be compiled). Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions. (https://reviews.llvm.org/D23461) 8. Coroutine Frame Building algorithm <= we are here 9. Add f.cleanup subfunction. 10+. The rest of the logic Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23586 llvm-svn: 279609	2016-08-24 04:44:35 +00:00
Matthew Simpson	df2ab917ad	[SLP] Avoid signed integer overflow The test case included with r279125 exposed an existing signed integer overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together with other costs, such as getReductionCost. This patch removes the possibility of assigning a cost of INT_MAX. Since we were previously using INT_MAX as an indicator for "should not vectorize", we now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable" before computing a cost. This patch adds a run-line to the test case used for r279125 that ensures we don't vectorize. Previously, this line would vectorize the test case by chance due to undefined behavior in the cost calculation. Differential Revision: https://reviews.llvm.org/D23723 llvm-svn: 279562	2016-08-23 20:48:50 +00:00
Sanjay Patel	6946e2ade3	[InstSimplify] allow icmp with constant folds for splat vectors, part 2 Completes the m_APInt changes for simplifyICmpWithConstant(). Other commits in this series: https://reviews.llvm.org/rL279492 https://reviews.llvm.org/rL279530 https://reviews.llvm.org/rL279534 https://reviews.llvm.org/rL279538 llvm-svn: 279543	2016-08-23 18:00:51 +00:00
Sanjay Patel	200e3cbfb0	[InstSimplify] allow icmp with constant folds for splat vectors, part 1 llvm-svn: 279538	2016-08-23 17:30:56 +00:00
Sanjay Patel	ada2bb3d5d	[InstSimplify] add tests to show missing vector icmp folds llvm-svn: 279534	2016-08-23 17:13:38 +00:00
Sanjay Patel	5c269d0b7a	[InstSimplify] move icmp with constant tests to another file; NFC ...because like the corresponding code, this is just too big to keep adding to. And the next step is to add a vector version of each of these tests to show missed folds. Also, auto-generate CHECK lines and add comments for the tests that correspond to the source code. llvm-svn: 279530	2016-08-23 16:46:53 +00:00
Sanjay Patel	a392049419	[InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat constant vectors llvm-svn: 279472	2016-08-22 20:45:06 +00:00
James Molloy	5bf2114265	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd [Recommitting now an unrelated assertion in SROA is sorted out] The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279460	2016-08-22 19:07:15 +00:00
Jun Bum Lim	ec8b8cc595	[InstCombine] Allow sinking from unique predecessor with multiple edges Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448	2016-08-22 18:21:56 +00:00
James Molloy	475f4a763f	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447	2016-08-22 18:13:12 +00:00
James Molloy	353052698a	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279443	2016-08-22 17:40:23 +00:00
Artur Pilipenko	bc76ecada0	Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers This change cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See https://reviews.llvm.org/D18777 for details. llvm-svn: 279433	2016-08-22 13:14:07 +00:00
Artur Pilipenko	b78ad9d41f	Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See comments on https://reviews.llvm.org/D18777 for details. llvm-svn: 279432	2016-08-22 13:12:07 +00:00
Balaram Makam	a927aa4ad0	[PM] Port LoopDataPrefetch AArch64 tests to new pass manager Reviewers: mcrosier, tejohnson Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23724 llvm-svn: 279431	2016-08-22 12:59:58 +00:00
Sanjay Patel	643d21a62c	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401	2016-08-21 17:10:07 +00:00
Sanjay Patel	163a5ab799	remove FIXME comment; fixed by previous commit llvm-svn: 279400	2016-08-21 16:40:42 +00:00
Sanjay Patel	7ffcde7422	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 3 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279399	2016-08-21 16:35:34 +00:00
Sanjay Patel	7e09f13fed	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 2 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279398	2016-08-21 16:28:22 +00:00
Matthew Simpson	235e479984	Reapply "[SLP] Initialize VectorizedValue when gathering" The test case included in r279125 exposed existing undefined behavior in the SLP vectorizer that it did not introduce. This patch reapplies the original patch, but modifies the test case to avoid hitting the undefined behavior. This allows us to close PR28330 while keeping the UBSan bot happy. The undefined behavior the original test uncovered will be addressed in a follow-on patch. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 llvm-svn: 279370	2016-08-20 14:49:02 +00:00
Vitaly Buka	cc7db13bf0	Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot. This reverts commit r279125. https://reviews.llvm.org/D23410 llvm-svn: 279363	2016-08-20 07:09:39 +00:00
Xinliang David Li	3bf2d58a21	[Profile] add test with large counts llvm-svn: 279361	2016-08-20 05:28:42 +00:00
Sanjay Patel	fa7de606c4	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 1 This is a partial enablement (move the ConstantInt guard down) because there are many different folds here and one of the later ones will require reworking 'isSignBitCheck'. llvm-svn: 279339	2016-08-19 22:33:26 +00:00
Reid Kleckner	98a48afa5d	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279229. It breaks intrinsic function calls in diamonds. llvm-svn: 279313	2016-08-19 20:22:39 +00:00
Reid Kleckner	a871d3872a	Fix regression in InstCombine introduced by r278944 The intended transform is: // Simplify icmp eq (or (ptrtoint P), (ptrtoint Q)), 0 // -> and (icmp eq P, null), (icmp eq Q, null). P and Q are both pointer types, but may have different types. We need two calls to getNullValue() to make the icmps. llvm-svn: 279271	2016-08-19 16:53:18 +00:00
David Majnemer	5554edabef	[CloneFunction] Don't remove unrelated nodes from the CGSSC CGSCC use a WeakVH to track call sites. RAUW a call within a function can result in that WeakVH getting confused about whether or not the call site is still around. llvm-svn: 279268	2016-08-19 16:37:40 +00:00
Sanjay Patel	a867afe094	[InstCombine] use m_APInt to allow icmp (shl 1, Y), C folds for splat constant vectors llvm-svn: 279266	2016-08-19 16:12:16 +00:00
Sanjay Patel	57b12d3876	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors Of course, we really need to refactor and fix all of the cmp predicates, but this one is interesting because without it, we later perform an information-losing transform of icmp (shl 1, Y), C, and we can't recover the better fold. llvm-svn: 279263	2016-08-19 15:40:44 +00:00
Sanjay Patel	78111a7617	[InstCombine] add tests for missing vector icmp folds llvm-svn: 279259	2016-08-19 15:27:28 +00:00
Sanjay Patel	14cdf1968f	[InstCombine] add missing tests for basic icmp folds These are implicitly included as part of larger test cases, but they don't exist stand-alone (and don't happen for vectors...). llvm-svn: 279257	2016-08-19 15:21:45 +00:00
James Molloy	11a1936b70	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 279229	2016-08-19 10:10:27 +00:00
Amaury Sechet	763c59dc9a	Make cltz and cttz zero undef when the operand cannot be zero in InstCombine Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23134 llvm-svn: 279141	2016-08-18 20:43:50 +00:00
Sanjay Patel	40e8ca46ad	[InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 https://reviews.llvm.org/rL279101 llvm-svn: 279133	2016-08-18 20:28:54 +00:00
Matthew Simpson	11db6b6b8c	[SLP] Initialize VectorizedValue when gathering We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need make sure we actually set VectorizedValue to the resulting vector. This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 Differential Revison: https://reviews.llvm.org/D23410 llvm-svn: 279125	2016-08-18 19:50:32 +00:00
Sanjay Patel	fa5ca2bf46	[InstCombine] use m_APInt to allow icmp (udiv X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 llvm-svn: 279101	2016-08-18 17:55:59 +00:00
Artur Pilipenko	615b820af6	CVP. Turn marking adds as no wrap (introduced by r278107) off by default It causes a regression on our internal benchmark. Introduce cvp-dont-process flag and set it off by default while investigating the regression. llvm-svn: 279082	2016-08-18 16:08:35 +00:00
Sanjay Patel	6347807f87	[InstCombine] use m_APInt to allow icmp (mul X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 llvm-svn: 279077	2016-08-18 15:44:44 +00:00
Sanjay Patel	4c5e60d95c	[InstCombine] use m_APInt to allow icmp (xor X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 llvm-svn: 279066	2016-08-18 14:10:48 +00:00
Chad Rosier	83f6bbc154	[Reassociate] Add test for PR28367. llvm-svn: 279063	2016-08-18 13:22:37 +00:00
Sanjay Patel	3c92db7560	[InstCombine] add test for missing vector icmp fold Also, add a scalar test to demonstrate one of the intermediate folds that is necessary to accomplish the existing, multi-step test. And simplify the vector tests to only check the final piece of that multi-step transform. llvm-svn: 278995	2016-08-17 22:18:57 +00:00
Sanjay Patel	84ff18ba92	[InstCombine] minimize tests and autogenerate checks llvm-svn: 278960	2016-08-17 19:56:10 +00:00
Sanjay Patel	63e14a07e8	[InstCombine] use m_APInt to allow icmp (or X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 llvm-svn: 278945	2016-08-17 16:38:57 +00:00
Sanjay Patel	f636d762ed	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278943	2016-08-17 16:23:15 +00:00
Chad Rosier	ea7e4647db	Revert "Reassociate: Reprocess RedoInsts after each inst". This reverts commit r258830, which introduced a bug described in PR28367. PR28367 llvm-svn: 278938	2016-08-17 15:54:39 +00:00
Sanjay Patel	4f7eb2aa95	[InstCombine] use m_APInt to allow icmp (add X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 llvm-svn: 278935	2016-08-17 15:24:30 +00:00
Chad Rosier	a6822f64f3	Revert "[Reassociate] Avoid iterator invalidation when negating value." This reverts commit r278928 due to lit test failures. llvm-svn: 278929	2016-08-17 14:31:34 +00:00
Chad Rosier	cf3e8121a6	[Reassociate] Avoid iterator invalidation when negating value. Differential Revision: https://reviews.llvm.org/D23464 PR28367 llvm-svn: 278928	2016-08-17 14:16:45 +00:00
Chandler Carruth	67fc52f067	[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is alse ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solveable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense ta have this complexity at O0 or not. One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896	2016-08-17 02:56:20 +00:00
Sanjay Patel	7ad324b396	[InstCombine] add tests for fold with no coverage and missing vector fold llvm-svn: 278867	2016-08-16 23:18:42 +00:00
Sanjay Patel	e47df1ac62	[InstCombine] use m_APInt to allow icmp (sub X, Y), C folds for splat constant vectors llvm-svn: 278859	2016-08-16 21:53:19 +00:00
David Majnemer	00940fb854	Make MDNode::intersect faster than O(n * m) It is pretty easy to get it down to O(nlogn + mlogm). This implementation has the added benefit of automatically deduplicating entries between the two sets. llvm-svn: 278837	2016-08-16 18:48:37 +00:00
David Majnemer	fa0f1e660b	Don't passively concatenate MDNodes I have audited all the callers of concatenate and none require duplicate entries to service concatenation. These duplicates serve no purpose but to needlessly embiggen the IR. N.B. Layering getMostGenericAliasScope on top of concatenate makes it O(nlogn + mlogm) instead of O(n*m). llvm-svn: 278836	2016-08-16 18:48:34 +00:00
Gor Nishanov	74309fa014	[Coroutines] Part 7: Split coroutine into subfunctions Summary: This patch adds simple coroutine splitting logic to CoroSplit pass. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions <= we are here 8. Coroutine Frame Building algorithm 9. Handle coroutine with unwinds 10+. The rest of the logic Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23461 llvm-svn: 278830	2016-08-16 18:04:14 +00:00
David Majnemer	5c5df6283a	[InstSimplify] Fold gep (gep V, C), (xor V, -1) to C-1 llvm-svn: 278779	2016-08-16 06:13:46 +00:00
Sanjay Patel	46a68ba618	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278768	2016-08-16 00:48:38 +00:00
Sanjay Patel	f1bf21c56b	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278765	2016-08-16 00:27:12 +00:00
Sanjay Patel	df77a4dbb0	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278757	2016-08-15 22:43:52 +00:00
Sanjay Patel	41520e1712	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278751	2016-08-15 21:47:50 +00:00
Sanjay Patel	638b613101	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278747	2016-08-15 21:37:24 +00:00
Sanjay Patel	55d87a88cc	update tests to use FileCheck and exact checking llvm-svn: 278741	2016-08-15 21:02:25 +00:00
Sanjay Patel	3e9acec2fa	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278737	2016-08-15 20:56:11 +00:00
Sanjay Patel	b37bd6d7b7	[InstCombine] add test for missing vector icmp fold llvm-svn: 278727	2016-08-15 20:02:40 +00:00
Sanjay Patel	7d98be81cc	[InstCombine] add tests for vector icmp folds llvm-svn: 278726	2016-08-15 19:58:21 +00:00
Sanjay Patel	b860859611	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278717	2016-08-15 19:16:33 +00:00
Sanjay Patel	6866b82a05	update test to use FileCheck and autogenerated checks llvm-svn: 278714	2016-08-15 18:56:10 +00:00
Sanjay Patel	2044a8eba9	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278709	2016-08-15 18:45:10 +00:00
Sanjay Patel	aaf34d1bfc	[InstCombine] add test for missing vector icmp fold llvm-svn: 278708	2016-08-15 18:39:54 +00:00
Sanjay Patel	195eb9340a	minimize test llvm-svn: 278707	2016-08-15 18:35:44 +00:00
Sanjay Patel	3f506daf8c	remove unnecessary IR comments about uses llvm-svn: 278705	2016-08-15 18:32:50 +00:00
Sanjay Patel	d391b0d69e	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278704	2016-08-15 18:26:56 +00:00
Sanjay Patel	cbd62a082c	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278689	2016-08-15 17:55:39 +00:00
Sanjay Patel	566b348987	[InstCombine] auto-generate exact checks Note that several of these tests belong in InstSimplify rather than InstCombine because they return existing operands or constants. llvm-svn: 278684	2016-08-15 17:19:07 +00:00
Sanjay Patel	a7b9bb3785	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278683	2016-08-15 17:10:35 +00:00
Reid Kleckner	70a600b8bb	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r278660. It causes downstream assertion failure in InstCombine on shuffle instructions. Comes up in __mm_swizzle_epi32. llvm-svn: 278672	2016-08-15 15:42:31 +00:00
James Molloy	9a3c82f5cf	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 278660	2016-08-15 08:04:56 +00:00
James Molloy	196ad0823e	[LSR] Don't try and create post-inc expressions on non-rotated loops If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs! Motivating testcase: void f(float a, float b, float c, int n) { while (n-- > 0) c++ = a++ + b++; } It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage. llvm-svn: 278658	2016-08-15 07:53:03 +00:00
Sanjay Patel	52fe9ae990	[InstCombine] add test for missing vector icmp fold llvm-svn: 278639	2016-08-14 22:56:46 +00:00
Sanjay Patel	7e57b00274	[InstCombine] add tests for vector icmp folds llvm-svn: 278637	2016-08-14 22:44:10 +00:00
Sanjay Patel	8554f70c07	[InstCombine] add test for potentially missing vector icmp fold llvm-svn: 278636	2016-08-14 22:30:07 +00:00
Sanjay Patel	beebe05af1	[InstCombine] add test for missing vector icmp fold llvm-svn: 278635	2016-08-14 22:29:27 +00:00
Sanjay Patel	ba1f9fbddc	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278634	2016-08-14 22:28:50 +00:00
Sanjay Patel	f6559404d5	[InstCombine] remove unnecessary function attributes from tests llvm-svn: 278633	2016-08-14 21:48:21 +00:00
Sanjay Patel	b44ca3bfa9	[InstCombine] add tests for missing vector icmp folds llvm-svn: 278632	2016-08-14 21:36:22 +00:00
Sanjay Patel	bbb3dffd0a	[InstCombine] add test for missing vector icmp fold llvm-svn: 278631	2016-08-14 21:05:08 +00:00
Sanjay Patel	66a3457a4c	[InstCombine] add test for missing vector icmp fold llvm-svn: 278630	2016-08-14 20:39:42 +00:00
Sanjoy Das	2143447c73	[IRCE] Create llvm::Loop instances for cloned out loops llvm-svn: 278618	2016-08-14 01:04:46 +00:00
Sanjoy Das	7a18a238c6	[IRCE] Don't iterate on loops that were cloned out IRCE has the ability to further version pre-loops and post-loops that it created, but this isn't useful at all. This change teaches IRCE to leave behind some metadata in the loops it creates (by cloning the main loop) so that these new loops are not re-processed by IRCE. Today this bug is hidden by another bug -- IRCE does not update LoopInfo properly so the loop pass manager does not re-invoke IRCE on the loops it split out. However, once the latter is fixed the bug addressed in this change causes IRCE to infinite-loop in some cases (e.g. it splits out a pre-loop, a pre-pre-loop from that, a pre-pre-pre-loop from that and so on). llvm-svn: 278617	2016-08-14 01:04:36 +00:00
Sanjoy Das	1b1272f515	[IRCE] Fix test case; NFC The (negative) test case is supposed to check that IRCE does not muck with range checks it cannot handle, not that it does the right thing in the absence of profiling information. llvm-svn: 278612	2016-08-13 23:36:40 +00:00
Sanjoy Das	2a2f14d7ab	[IRCE] Be resilient in the face of non-simplified loops Loops containing `indirectbr` may not be in simplified form, even after running LoopSimplify. Reject then gracefully, instead of tripping an assert. llvm-svn: 278611	2016-08-13 23:36:35 +00:00
Mehdi Amini	8c629ecf3a	Revert "Revert "Invariant start/end intrinsics overloaded for address space"" This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530. llvm-svn: 278609	2016-08-13 23:31:24 +00:00
Mehdi Amini	164ac651da	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276447. llvm-svn: 278608	2016-08-13 23:27:32 +00:00
Teresa Johnson	1eca6bc6a7	[PM] Port LoopDataPrefetch to new pass manager Summary: Refactor the existing support into a LoopDataPrefetch implementation class and a LoopDataPrefetchLegacyPass class that invokes it. Add a new LoopDataPrefetchPass for the new pass manager that utilizes the LoopDataPrefetch implementation class. Reviewers: mehdi_amini Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23483 llvm-svn: 278591	2016-08-13 04:11:27 +00:00
Sanjoy Das	3502511548	[IndVars] Ignore (s\|z)exts that don't extend the induction variable `IVVisitor::visitCast` used to have the invariant that if the instruction it was passed was a sext or zext instruction, the result of the instruction would be wider than the induction variable. This is no longer true after rL275037, so this change teaches `IndVarSimplify` s implementation of `IVVisitor::visitCast` to work with the relaxed invariant. A corresponding change to SimplifyIndVar to preserve the said invariant after rL275037 would also work, but given how `IVVisitor::visitCast` is spelled (no indication of said invariant), I figured the current fix is cleaner. Fixes PR28935. llvm-svn: 278584	2016-08-13 00:58:31 +00:00
Tim Shen	c9c0d2dcb5	[LoopVectorize] Detect loops in the innermost loop before creating InnerLoopVectorizer InnerLoopVectorizer shouldn't handle a loop with cycles inside the loop body, even if that cycle isn't a natural loop. Fixes PR28541. Differential Revision: https://reviews.llvm.org/D22952 llvm-svn: 278573	2016-08-12 22:47:13 +00:00
Reid Kleckner	6ee00a2602	[Inliner] Don't treat inalloca allocas as static They aren't static, and moving them to the entry block across something else will only result in tears. Root cause of http://crbug.com/636558. llvm-svn: 278571	2016-08-12 22:23:04 +00:00
Michael Kuperstein	31b8399beb	[PM] Port LowerInvoke to the new pass manager llvm-svn: 278531	2016-08-12 17:28:27 +00:00
Dehao Chen	c0a1e432c7	Fine tuning of sample profile propagation algorithm. Summary: The refined propagation algorithm is more accurate and robust. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23224 llvm-svn: 278522	2016-08-12 16:22:12 +00:00
Artur Pilipenko	2e8f82d962	[LVI] Take guards into account Teach LVI to gather control dependant constraints from guards. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23358 llvm-svn: 278518	2016-08-12 15:52:23 +00:00
Artur Pilipenko	b623088abe	[LVI] Fix potential memory corruption in getValueFromCondition Rewrite Visited[Cond] = getValueFromConditionImpl(..., Visited) statement which can lead to a memory corruption since getValueFromConditionImpl changes Visited map and invalidates the iterators. llvm-svn: 278514	2016-08-12 15:08:15 +00:00
Artur Pilipenko	6669f253d5	[LVI] Take range metadata into account while calculating icmp condition constraints Take range metadata into account for conditions like this: %length = load i32, i32* %length_ptr, !range !{i32 0, i32 2147483647} %cmp = icmp ult i32 %a, %length This is a common pattern for range checks where the length of the array is dynamically loaded. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23267 llvm-svn: 278496	2016-08-12 10:14:11 +00:00
Artur Pilipenko	635625855f	[LVI] Handle any predicate in comparisons like icmp <pred> (add Val, Offset), ... Currently LVI can only gather value constraints from comparisons like: * icmp <pred> Val, ... * icmp ult (add Val, Offset), ... In fact we can handle any predicate in latter comparisons. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23357 llvm-svn: 278493	2016-08-12 10:05:11 +00:00
Gor Nishanov	0f303accde	[Coroutines]: Part6b: Add coro.id intrinsic. Summary: 1. Make coroutine representation more robust against optimization that may duplicate instruction by introducing coro.id intrinsics that returns a token that will get fed into coro.alloc and coro.begin. Due to coro.id returning a token, it won't get duplicated and can be used as reliable indicator of coroutine identify when a particular coroutine call gets inlined. 2. Move last three arguments of coro.begin into coro.id as they will be shared if coro.begin will get duplicated. 3. doc + test + code updated to support the new intrinsic. Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23412 llvm-svn: 278481	2016-08-12 05:45:49 +00:00
Eli Friedman	a6707f56b5	[DSE] Don't remove stores made live by a call which unwinds. Issue exposed by noalias or more aggressive alias analysis. Fixes http://llvm.org/PR25422. Differential revision: https://reviews.llvm.org/D21007 llvm-svn: 278451	2016-08-12 01:09:53 +00:00
Piotr Padlewski	332b3b2210	Don't import variadic functions Summary: This patch adds IsVariadicFunction bit to summary in order to not import variadic functions. Inliner doesn't inline variadic functions because it is hard to reason about it. This one small fix improves Importer by about 16% (going from 86% to 100% of imported functions that are inlined anywhere) on some spec benchmarks like 'int' and others. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23339 llvm-svn: 278432	2016-08-11 22:13:57 +00:00
Ehsan Amiri	dbcfea9811	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence. This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on). https://reviews.llvm.org/D23075 llvm-svn: 278421	2016-08-11 21:31:40 +00:00
Geoff Berry	d01828096f	[SCEV] Update interface to handle SCEVExpander insert point motion. Summary: This is an extension of the fix in r271424. That fix dealt with builder insert points being moved by SCEV expansion, but only for the lifetime of the expand call. This change modifies the interface so that LSR can safely call expand multiple times at the same insert point and do the right thing if one of the expansions decides to move the original insert point. This is a fix for PR28719. Reviewers: sanjoy Subscribers: llvm-commits, mcrosier, mzolotukhin Differential Revision: https://reviews.llvm.org/D23342 llvm-svn: 278413	2016-08-11 21:05:17 +00:00
Daniel Berlin	2698cbb4f1	Move GVNHoist tests into their own directory since it is a separate pass llvm-svn: 278404	2016-08-11 20:35:07 +00:00
Daniel Berlin	f75fd1b58b	Fix PR 28933 Summary: This fixes PR 28933 by making sure GVNHoist does not try to recreate memory accesses when it has not actually moved them. Reviewers: sebpop Subscribers: llvm-commits, george.burgess.iv Differential Revision: https://reviews.llvm.org/D23411 llvm-svn: 278401	2016-08-11 20:32:43 +00:00
Ivan Krasin	f3403fd2c8	WholeProgramDevirt: generate more detailed and accurate remarks. Summary: Keep track of all methods for which we have devirtualized at least one call and then print them sorted alphabetically. That allows to avoid duplicates and also makes the order deterministic. Add optimization names into the remarks, so that it's easier to understand how has each method been devirtualized. Fix a bug when wrong methods could have been reported for tryVirtualConstProp. Reviewers: kcc, mehdi_amini Differential Revision: https://reviews.llvm.org/D23297 llvm-svn: 278389	2016-08-11 19:09:02 +00:00
Andrew Kaylor	7cdf01ef58	Target independent codesize heuristics for Loop Idiom Recognition Patch by Sunita Marathe Differential Revision: https://reviews.llvm.org/D21449 llvm-svn: 278378	2016-08-11 18:28:33 +00:00
Ehsan Amiri	3818f1b38a	revert 278334 llvm-svn: 278337	2016-08-11 14:51:14 +00:00
Ehsan Amiri	b9fcc2b171	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence. https://reviews.llvm.org/D23075 llvm-svn: 278334	2016-08-11 13:51:20 +00:00
Xinliang David Li	76a0108be4	[Profile] improve warning control option Change --no-pgo-warn-missing to -pgo-warn-missing-function and negate the default. /NFC Add more test to make sure the warning is off by default llvm-svn: 278314	2016-08-11 05:09:30 +00:00
Andrew Kaylor	498d3113c3	[IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18867 llvm-svn: 278269	2016-08-10 18:56:35 +00:00
Andrew Kaylor	b10f6876cd	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 278267	2016-08-10 18:47:19 +00:00
Gor Nishanov	b2a9c02521	[Coroutines] Part 6: Elide dynamic allocation of a coroutine frame when possible Summary: A particular coroutine usage pattern, where a coroutine is created, manipulated and destroyed by the same calling function, is common for coroutines implementing RAII idiom and is suitable for allocation elision optimization which avoid dynamic allocation by storing the coroutine frame as a static `alloca` in its caller. coro.free and coro.alloc intrinsics are used to indicate which code needs to be suppressed when dynamic allocation elision happens: ``` entry: %elide = call i8* @llvm.coro.alloc() %need.dyn.alloc = icmp ne i8* %elide, null br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc dyn.alloc: %alloc = call i8* @CustomAlloc(i32 4) br label %coro.begin coro.begin: %phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ] %hdl = call i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null, i8* bitcast ([2 x void (%f.frame)]* @f.resumers to i8)) ``` and ``` %mem = call i8 @llvm.coro.free(i8* %hdl) %need.dyn.free = icmp ne i8* %mem, null br i1 %need.dyn.free, label %dyn.free, label %if.end dyn.free: call void @CustomFree(i8* %mem) br label %if.end if.end: ... ``` If heap allocation elision is performed, we replace coro.alloc with a static alloca on the caller frame and coro.free with null constant. Also, we need to make sure that if there are any tail calls referencing the coroutine frame, we need to remote tail call attribute, since now coroutine frame lives on the stack. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization (https://reviews.llvm.org/D23229) 5.Add CGSCC restart trigger + tests. (https://reviews.llvm.org/D23234) 6.Add coroutine heap elision + tests. <= we are here 7.Add the rest of the logic (split into more patches) Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23245 llvm-svn: 278242	2016-08-10 16:40:39 +00:00
Artur Pilipenko	fd223d5d25	[LVI] Handle conditions in the form of (cond1 && cond2) Teach LVI how to gather information from conditions in the form of (cond1 && cond2). Our out-of-tree front-end emits range checks in this form. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23200 llvm-svn: 278231	2016-08-10 15:13:15 +00:00
Artur Pilipenko	e896325ca3	Add a test case for r278217 "[LVI] Relax the assertion about LVILatticeVal type in getConstantRange" llvm-svn: 278226	2016-08-10 13:51:01 +00:00
Artur Pilipenko	e171ea8a33	Teach CorrelatedValuePropagation to mark adds as no wrap This is a resubmission of previously reverted r277592. It was hitting overly strong assertion in getConstantRange which was relaxed in r278217. Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23059 llvm-svn: 278220	2016-08-10 13:08:34 +00:00
Davide Italiano	873219c406	[SimplifyLibCalls] Restore the old behaviour, emit a libcall. Hal pointed out that the semantic of our intrinsic and the libc call are slightly different. Add a comment while I'm here to explain why we can't emit an intrinsic. Thanks Hal! llvm-svn: 278200	2016-08-10 06:33:32 +00:00
Adam Nemet	896c09bd10	[Inliner,OptDiag] Add hotness attribute to opt diagnostics Summary: The inliner not being a function pass requires the work-around of generating the OptimizationRemarkEmitter and in turn BFI on demand. This will go away after the new PM is ready. BFI is only computed inside ORE if the user has requested hotness information for optimization diagnostitics (-pass-remark-with-hotness at the 'opt' level). Thus there is no additional overhead without the flag. Reviewers: hfinkel, davidxl, eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22694 llvm-svn: 278185	2016-08-10 00:44:44 +00:00
Michael Zolotukhin	aae168f993	[LoopSimplify] Rebuild LCSSA for the inner loop after separating nested loops. Summary: This hopefully fixes PR28825. The problem now was that a value from the original loop was used in a subloop, which became a sibling after separation. While a subloop doesn't need an lcssa phi node, a sibling does, and that's where we broke LCSSA. The most natural way to fix this now is to simply call formLCSSA on the original loop: it'll do what we've been doing before plus it'll cover situations described above. I think we don't need to run formLCSSARecursively here, and we have an assert to verify this (I've tried testing it on LLVM testsuite + SPECs). I'd be happy to be corrected here though. I also changed a run line in the test from '-lcssa -loop-unroll' to '-lcssa -loop-simplify -indvars', because it exercises LCSSA preservation to the same extent, but also makes less unrelated transformation on the CFG, which makes it easier to verify. Reviewers: chandlerc, sanjoy, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23288 llvm-svn: 278173	2016-08-09 22:44:56 +00:00
Anna Thomas	b2d12b81c3	[EarlyCSE] Teach about CSE'ing over invariant.start intrinsics Summary: Teach EarlyCSE about invariant.start intrinsic. Specifically, we can perform store-load, load-load forwarding over this call. Reviewers: majnemer, reames, dberlin, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23268 llvm-svn: 278153	2016-08-09 20:00:47 +00:00

1 2 3 4 5 ...

7560 Commits