llvm-project

Commit Graph

Author	SHA1	Message	Date
Serguei Katkov	edf3c8292b	[SCEV] Do not insert if it is already in cache This is fix for the crash caused by ScalarEvolution::getTruncateExpr. It expects that if it checked the condition that SCEV is not in UniqueSCEVs cache in the beginning that it will not be there inside this method. However during recursion and transformation/simplification for sub expression, it is possible that these modifications will end up with the same SCEV as we started from. So we must always check whether SCEV is in cache and do not insert item if it is already there. Reviewers: sanjoy, mkazantsev, craig.topper Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41380 llvm-svn: 321472	2017-12-27 07:15:23 +00:00
Max Kazantsev	9c08b7a053	[SCEV] Fix predicate usage in computeExitLimitFromICmp In this method, we invoke `SimplifyICmpOperands` which takes the `Cond` predicate by reference and may change it along with `LHS` and `RHS` SCEVs. But then we invoke `computeShiftCompareExitLimit` with Values from which the SCEVs have been derived, these Values have not been modified while `Cond` could be. One of possible outcomes of this is that we may falsely prove that an infinite loop ends within some finite number of iterations. In this patch, we save the original `Cond` and pass it along with original operands. This logic may be removed in future once `computeShiftCompareExitLimit` works with SCEVs instead of value operands. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D40953 llvm-svn: 320142	2017-12-08 12:19:45 +00:00
Max Kazantsev	23044fa639	[SCEV] Strengthen variance condition in calculateLoopDisposition Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively. When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if `L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are consecutive sibling loops within the parent loop. In this case, `AR2` is also varying w.r.t. `L1`, but we don't correctly identify it. It can lead, for exaple, to attempt of incorrect folding. Consider: AR1 = {a,+,b}<L1> AR2 = {c,+,d}<L2> EXAR2 = sext(AR1) MUL = mul AR1, EXAR2 If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect because `AR2` is not available on entrance of `L1`. Both situations "`L1` contains `L2`" and "`L1` preceeds sibling loop `L2`" can be handled with one check: "header of `L1` dominates header of `L2`". This patch replaces the old insufficient check with this one. Differential Revision: https://reviews.llvm.org/D39453 llvm-svn: 318819	2017-11-22 06:21:39 +00:00
Jatin Bhateja	c61ade1ca0	[SCEV] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: sanjoy, junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 318050	2017-11-13 16:43:24 +00:00
Sanjoy Das	8499ebf2e9	[SCEV] Fix an assertion failure in the max backedge taken count Max backedge taken count is always expected to be a constant; and this is usually true by construction -- it is a SCEV expression with constant inputs. However, if the max backedge expression ends up being computed to be a udiv with a constant zero denominator[0], SCEV does not fold the result to a constant since there is no constant it can fold it to (SCEV has no representation for "infinity" or "undef"). However, in computeMaxBECountForLT we already know the denominator is positive, and thus at least 1; and we can use this fact to avoid dividing by zero. [0]: We can end up with a constant zero denominator if the signed range of the stride is more precise than the unsigned range. llvm-svn: 316615	2017-10-25 21:41:00 +00:00
Sanjoy Das	2f27456c82	Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain." This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129	2017-10-18 22:00:57 +00:00
Jatin Bhateja	1fc49627e4	[ScalarEvolution] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054	2017-10-18 01:36:16 +00:00
Anna Thomas	a2ca902033	[SCEV] Teach SCEV to find maxBECount when loop endbound is variant Summary: This patch teaches SCEV to calculate the maxBECount when the end bound of the loop can vary. Note that we cannot calculate the exactBECount. This will only be done when both conditions are satisfied: 1. the loop termination condition is strictly LT. 2. the IV is proven to not overflow. This provides more information to users of SCEV and can be used to improve identification of finite loops. Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38825 llvm-svn: 315683	2017-10-13 14:30:43 +00:00
Alexandre Isoard	405728fd47	[SCEV] Add URem support to SCEV In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %a to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcoming of SCEVs: %div = udim i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. llvm-svn: 312329	2017-09-01 14:59:59 +00:00
Amara Emerson	56dca4e3ca	[SCEV] Preserve NSW information for sext(subtract). Pushes the sext onto the operands of a Sub if NSW is present. Also adds support for propagating the nowrap flags of the llvm.ssub.with.overflow intrinsic during analysis. Differential Revision: https://reviews.llvm.org/D35256 llvm-svn: 310117	2017-08-04 20:19:46 +00:00
Max Kazantsev	2cb3653404	[SCEV] Re-enable "Cache results of computeExitLimit" The patch rL309080 was reverted because it did not clean up the cache on "forgetValue" method call. This patch re-enables this change, adds the missing check and introduces two new unit tests that make sure that the cache is cleaned properly. Differential Revision: https://reviews.llvm.org/D36087 llvm-svn: 309925	2017-08-03 08:41:30 +00:00
Sanjoy Das	843ab57457	Revert "[SCEV] Cache results of computeExitLimit" This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults. I've replied on the commit thread for the patch with more details. llvm-svn: 309357	2017-07-28 03:25:07 +00:00
Max Kazantsev	f282aed428	[SCEV] Cache results of computeExitLimit This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug 33494. Differential Revision: https://reviews.llvm.org/D35827 llvm-svn: 309080	2017-07-26 04:55:54 +00:00
Max Kazantsev	0e9e0796f4	[SCEV] Limit max size of AddRecExpr during evolving When SCEV calculates product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every operand of the resulting SCEV is combined from operands of initial SCEV and has much higher complexity than they have. As result, if we try to calculate something like: %x1 = {a,+,b} %x2 = mul i32 %x1, %x1 %x3 = mul i32 %x2, %x1 %x4 = mul i32 %x3, %x2 ... The size of such SCEVs grows as `2^N`, and the arguments become more and more complex as we go forth. This leads to long compilation and huge memory consumption. This patch sets a limit after which we don't try to combine two `SCEVAddRecExpr`s into one. By default, max allowed size of the resulting AddRecExpr is set to 16. Differential Revision: https://reviews.llvm.org/D35664 llvm-svn: 308847	2017-07-23 15:40:19 +00:00
Max Kazantsev	b9edcbcb1d	Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differentian Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477	2017-07-08 17:17:30 +00:00
Max Kazantsev	98838527c6	Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why SaturatedMultiply test fails. llvm-svn: 307249	2017-07-06 10:47:13 +00:00
Max Kazantsev	c8db20b78c	Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244	2017-07-06 09:57:41 +00:00
Max Kazantsev	ebe56283bc	Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135	2017-07-05 09:44:41 +00:00
Max Kazantsev	80bc4a5554	[IndVars] Canonicalize comparisons between non-negative values and indvars -If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126	2017-07-05 06:38:49 +00:00
Max Kazantsev	8d0322e612	[SCEV] Use depth limit instead of local cache for SExt and ZExt In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785	2017-06-30 05:04:09 +00:00
Alexandre Isoard	41044876fc	Reverting r306695 while investigating failing test case. Failing test case: Transforms/LoopVectorize.iv_outside_user.ll llvm-svn: 306723	2017-06-29 18:48:56 +00:00
Alexandre Isoard	aa29afc756	ScalarEvolution: Add URem support In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to: %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort this way. Note: While SRem and SDiv are also related this way, SCEV does not provides SDiv yet. llvm-svn: 306695	2017-06-29 16:29:04 +00:00
Max Kazantsev	dc80366d52	[ScalarEvolution] Apply Depth limit to getMulExpr This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463	2017-06-15 11:48:21 +00:00
Max Kazantsev	41450329f7	Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The reason of this is that the patch makes a correctless fix that changes optimizer's behavior for this test. Without the change, LSR was making an overconfident simplification basing on a wrong SCEV. Apparently it did not need the IV analysis to do this. With the change, it chose a different way to simplify (that wasn't so confident), and this way required the IV analysis. Now, following the right execution path, LSR tries to make a transformation relying on IV Users analysis. This analysis is target-dependent due to this code: // LSR is not APInt clean, do not touch integers bigger than 64-bits. // Also avoid creating IVs of non-native types. For example, we don't want a // 64-bit IV in 32-bit code just because the loop has one 64-bit cast. uint64_t Width = SE->getTypeSizeInBits(I->getType()); if (Width > 64 \|\| !DL.isLegalInteger(Width)) return false; To make a proper transformation in this test case, the type i32 needs to be legal for the specified data layout. When the test runs on some non-X86 configuration (e.g. pure ARM 64), opt gets confused by the specified target and does not use it, rejecting the specified data layout as well. Instead, it uses some default layout that does not treat i32 as a legal type (currently the layout that is used when it is not specified does not have legal types at all). As result, the transformation we expect to happen does not happen for this test. This re-enabling patch does not have any source code changes compared to the original patch rL303730. The only difference is that the failing test is moved to X86 directory and now has requirement of running on x86 only to comply with the specified target triple and data layout. Differential Revision: https://reviews.llvm.org/D33543 llvm-svn: 303971	2017-05-26 06:47:04 +00:00
Diana Picus	183863fc3b	Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" This reverts commit r303730 because it broke all the buildbots. llvm-svn: 303747	2017-05-24 14:16:04 +00:00
Max Kazantsev	13e016bf48	[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that the loop of our base recurrency is the bottom-lost in terms of domination. This assumption may be broken by an expression which is treated as invariant, and which depends on a complex Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence. Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here: for (...) { // loop phi = {A,+,B} } X = load ... Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot exist while we are iterating in loop (this memory can be even not allocated and not filled by this moment). It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop> may be existant. This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr, if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer expect such behavior. llvm-svn: 303730	2017-05-24 08:52:18 +00:00
Sanjoy Das	036dda25a5	[SCEV] Clarify behavior around max backedge taken count This is a re-application of a r303497 that was reverted in r303498. I thought it had broken a bot when it had not (the breakage did not go away with the revert). This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303531	2017-05-22 06:46:04 +00:00
Sanjoy Das	8963650cfa	Revert "[SCEV] Clarify behavior around max backedge taken count" This reverts commit r303497 since it breaks the msan bootstrap bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/ llvm-svn: 303498	2017-05-21 05:02:12 +00:00
Sanjoy Das	5207168383	[SCEV] Clarify behavior around max backedge taken count This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303497	2017-05-21 01:47:50 +00:00
Max Kazantsev	b09b5db793	[SCEV] Fix sorting order for AddRecExprs The existing sorting order in defined CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in operand list of S the op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrency in the operands order, it will be the bottom-most in terms of domination tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with oldlogic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148	2017-05-16 07:27:06 +00:00
Michael Zolotukhin	37162adf3e	[SCEV] createAddRecFromPHI: Optimize for the most common case. Summary: The existing implementation creates a symbolic SCEV expression every time we analyze a phi node and then has to remove it, when the analysis is finished. This is very expensive, and in most of the cases it's also unnecessary. According to the data I collected, ~60-70% of analyzed phi nodes (measured on SPEC) have the following form: PN = phi(Start, OP(Self, Constant)) Handling such cases separately significantly speeds this up. Reviewers: sanjoy, pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32663 llvm-svn: 302096	2017-05-03 23:53:38 +00:00
Sanjoy Das	08989c7ecd	Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC Summary: programUndefinedIfPoison makes more sense, given what the function does; and I'm about to add a function with a name similar to isKnownNotFullPoison (so do the rename to avoid confusion). Reviewers: broune, majnemer, bjarke.roune Reviewed By: broune Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30444 llvm-svn: 301776	2017-04-30 19:41:19 +00:00
Sanjoy Das	bdbc4938f9	[SCEV] Fix exponential time complexity by caching llvm-svn: 301149	2017-04-24 00:09:46 +00:00
Eli Friedman	d0e6ae5678	Revert r300746 (SCEV analysis for or instructions). There have been multiple reports of this causing problems: a compile-time explosion on the LLVM testsuite, and a stack overflow for an opencl kernel. llvm-svn: 300928	2017-04-20 23:59:05 +00:00
Eli Friedman	e77d2b86b4	[SCEV] Make SCEV or modeling more aggressive. Use haveNoCommonBitsSet to figure out whether an "or" instruction is equivalent to addition. This handles more cases than just checking for a constant on the RHS. Differential Revision: https://reviews.llvm.org/D32239 llvm-svn: 300746	2017-04-19 20:19:58 +00:00
Max Kazantsev	2e44d2969a	[ScalarEvolution] Re-enable Predicate implication from operations The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build. The reason of the crash was type mismatch between either a or b and RHS in the following situation: LHS = sext(a +nsw b) > RHS. This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type. But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this situation we don't need to create any non-constant SCEVs. This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not go further into range analysis etc (because in some situations these analyzes succeed even when the passed arguments have wrong types, what should not normally happen). The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong usage of predicates in recursive invocations. The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll Reviewers: reames, apilipenko, anna, sanjoy Reviewed By: sanjoy Subscribers: mzolotukhin, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31238 llvm-svn: 299205	2017-03-31 12:05:30 +00:00
Max Kazantsev	7696a7edf9	Revert "[ScalarEvolution] Re-enable Predicate implication from operations" This reverts commit rL298690 Causes failures on clang. llvm-svn: 298693	2017-03-24 07:04:31 +00:00
Max Kazantsev	89554446e7	[ScalarEvolution] Re-enable Predicate implication from operations The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build. The reason of the crash was type mismatch between either a or b and RHS in the following situation: LHS = sext(a +nsw b) > RHS. This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type. But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this situation we don't need to create any non-constant SCEVs. This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not go further into range analysis etc (because in some situations these analyzes succeed even when the passed arguments have wrong types, what should not normally happen). The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong usage of predicates in recursive invocations. The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll llvm-svn: 298690	2017-03-24 06:19:00 +00:00
Zhaoshi Zheng	e3c9070f06	Model ashr(shl(x, n), m) as mul(x, 2^(n-m)) when n > m Given below case: %y = shl %x, n %z = ashr %y, m when n = m, SCEV models it as sext(trunc(x)). This patch tries to handle the case where n > m by using sext(mul(trunc(x), 2^(n-m)))) as the SCEV expression. llvm-svn: 298631	2017-03-23 18:06:09 +00:00
Max Kazantsev	c6effaa495	Revert "[ScalarEvolution] Predicate implication from operations" This reverts commit rL298481 Fails clang-with-lto-ubuntu build. llvm-svn: 298489	2017-03-22 07:50:33 +00:00
Max Kazantsev	15e76aa0f8	[ScalarEvolution] Predicate implication from operations This patch allows SCEV predicate analysis to prove implication of some expression predicates from context predicates related to arguments of those expressions. It introduces three new rules: For addition: (A >X && B >= 0) \|\| (B >= 0 && A > X) ===> (A + B) > X. For division: (A > X) && (0 < B <= X + 1) ===> (A / B > 0). (A > X) && (-B <= X < 0) ===> (A / B >= 0). Using these rules, SCEV is able to prove facts like "if X > 1 then X / 2 > 0". They can also be combined with the same context, to prove more complex expressions like "if X > 1 then X/2 + 1 > 1". Diffirential Revision: https://reviews.llvm.org/D30887 Reviewed by: sanjoy llvm-svn: 298481	2017-03-22 04:48:46 +00:00
Eli Friedman	b1578d3612	[SCEV] Fix trip multiple calculation If loop bound containing calculations like min(a,b), the Scalar Evolution API getSmallConstantTripMultiple returns 4294967295 "-1" as the trip multiple. The problem is that, SCEV use -1 * umax to represent umin. The multiple constant -1 was returned, and the logic of guarding against huge trip counts was skipped. Because -1 has 32 active bits. The fix attempt to factor more general cases. First try to get the greatest power of two divisor of trip count expression. In case overflow happens, the trip count expression is still divisible by the greatest power of two divisor returned. Returns 1 if not divisible by 2. Patch by Huihui Zhang <huihuiz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D30840 llvm-svn: 298301	2017-03-20 20:25:46 +00:00
Michael Zolotukhin	99de88d1f3	[SCEV] Compute affine range in another way to avoid bitwidth extending. Summary: This approach has two major advantages over the existing one: 1. We don't need to extend bitwidth in our computations. Extending bitwidth is a big issue for compile time as we often end up working with APInts wider than 64bit, which is a slow case for APInt. 2. When we zero extend a wrapped range, we lose some information (we replace the range with [0, 1 << src bit width)). Thus, avoiding such extensions better preserves information. Correctness testing: I ran 'ninja check' with assertions that the new implementation of getRangeForAffineAR gives the same results as the old one (this functionality is not present in this patch). There were several failures - I inspected them manually and found out that they all are caused by the fact that we're returning more accurate results now (see bullet (2) above). Without such assertions 'ninja check' works just fine, as well as SPEC2006. Compile time testing: CTMark/Os: - mafft/pairlocalalign -16.98% - tramp3d-v4/tramp3d-v4 -12.72% - lencod/lencod -11.51% - Bullet/bullet -4.36% - ClamAV/clamscan -3.66% - 7zip/7zip-benchmark -3.19% - sqlite3/sqlite3 -2.95% - SPASS/SPASS -2.74% - Average -5.81% Performance testing: The changes are expected to be neutral for runtime performance. Reviewers: sanjoy, atrick, pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30477 llvm-svn: 297992	2017-03-16 21:07:38 +00:00
Sanjoy Das	5cd6c5cacf	[ValueTracking] Make poison propagation more aggressive Summary: Motivation: fix PR31181 without regression (the actual fix is still in progress). However, the actual content of PR31181 is not relevant here. This change makes poison propagation more aggressive in the following cases: 1. poision * Val == poison, for any Val. In particular, this changes existing intentional and documented behavior in these two cases: a. Val is 0 b. Val is 2^k * N 2. poison << Val == poison, for any Val 3. getelementptr is poison if any input is poison I think all of these are justified (and are axiomatically true in the new poison / undef model): 1a: we need poison * 0 to be poison to allow transforms like these: A * (B + C) ==> A * B + A * C If poison * 0 were 0 then the above transform could not be allowed since e.g. we could have A = poison, B = 1, C = -1, making the LHS poison * (1 + -1) = poison * 0 = 0 and the RHS poison * 1 + poison * -1 = poison + poison = poison 1b: we need e.g. poison * 4 to be poison since we want to allow A * 4 ==> A + A + A + A If poison * 4 were a value with all of their bits poison except the last four; then we'd not be able to do this transform since then if A were poison the LHS would only be "partially" poison while the RHS would be "full" poison. 2: Same reasoning as (1b), we'd like have the following kinds transforms be legal: A << 1 ==> A + A Reviewers: majnemer, efriedma Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D30185 llvm-svn: 295809	2017-02-22 06:52:32 +00:00
Igor Laevsky	c11c1ed909	[SCEV] Cache results during GetMinTrailingZeros query Differential Revision: https://reviews.llvm.org/D29759 llvm-svn: 295060	2017-02-14 15:53:12 +00:00
Eli Friedman	10d1ff64fe	[SCEV] Simplify/generalize howFarToZero solving. Make SolveLinEquationWithOverflow take the start as a SCEV, so we can solve more cases. With that implemented, get rid of the special case for powers of two. The additional functionality probably isn't particularly useful, but it might help a little for certain cases involving pointer arithmetic. Differential Revision: https://reviews.llvm.org/D28884 llvm-svn: 293576	2017-01-31 00:42:42 +00:00
Daniil Fukalov	b09dac59fc	[SCEV] Introduce add operation inlining limit Inlining in getAddExpr() can cause abnormal computational time in some cases. New parameter -scev-addops-inline-threshold is intruduced with default value 500. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D28812 llvm-svn: 293176	2017-01-26 13:33:17 +00:00
Chandler Carruth	d501b18990	This test apparently requires an x86 target and is failing on numerous bots ever since d0k fixed the CHECK lines so that it did something at all. It isn't actually testing SCEV directly but LSR, so move it into LSR and the x86-specific tree of tests that already exists there. Target dependence is common and unavoidable with the current design of LSR. llvm-svn: 292774	2017-01-23 08:33:29 +00:00
Benjamin Kramer	1fd0d44e9b	Attempt to fix test in release builds. llvm-svn: 292762	2017-01-22 21:01:19 +00:00
Benjamin Kramer	db9e0b659d	Fix some broken CHECK lines. The colon is important. llvm-svn: 292761	2017-01-22 20:28:56 +00:00

1 2 3 4 5 ...

361 Commits