llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	f1e8136115	[SCEV] Bail out if URem operand cannot be zero-extended. In some cases, LHS is larger than the target expression type. Bail out in that case for now, to avoid crashing.	2021-02-01 13:50:54 +00:00
Mindong Chen	00fcc03687	[SCEV] Fix incorrect loop exit count analysis. In computeLoadConstantCompareExitLimit, the addrec used to compute the exit count should be from the loop which the exiting block belongs to. Reviewed by: mkazantsev Differential Revision: https://reviews.llvm.org/D92367	2021-01-27 19:36:05 +08:00
Arthur Eubanks	f374138058	[test] Make incorrect-exit-count.ll work under NPM	2021-01-21 21:45:32 -08:00
Mindong Chen	5d718374a6	[SCEV] Add a test with wrong exit counts. (NFC) This patch pre-commits a test case with wrong exit count analysis for D92367. Reviewed by: mkazantsev Differential Revision: https://reviews.llvm.org/D94657	2021-01-20 20:58:34 +08:00
Gil Rapaport	d9c0b128e3	[SCEV] Simplify trunc to zero based on known bits Let getTruncateExpr() short-circuit to zero when the value being truncated is known to have at least as many trailing zeros as the target type. Differential Revision: https://reviews.llvm.org/D93973	2021-01-03 13:57:12 +02:00
Juneyoung Lee	509fa8e02e	[SCEV] recognize logical and/or pattern This patch makes SCEV recognize 'select A, B, false' and 'select A, true, B'. This is a performance improvement that will be helpful after unsound select -> and/or transformation is removed, as discussed in D93065. SCEV's answers for the select form should be a bit more conservative than the equivalent `and A, B` / `or A, B`. Take this example: https://alive2.llvm.org/ce/z/NsP9ue . To check whether it is valid for SCEV's computeExitLimit to return min(n, m) as ExactNotTaken value, I put llvm.assume at tgt. It fails because the exit limit becomes poison if n is zero and m is poison. This is problematic if e.g. the exit value of i is replaced with min(n, m). If either n or m is constant, we can revive the analysis again. I added relevant tests and put alive2 links there. If and is used instead, this is okay: https://alive2.llvm.org/ce/z/K9rbJk . Hence the existing analysis is sound. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93882	2021-01-01 04:37:57 +09:00
Max Kazantsev	48d7cc6ae2	[SCEV] Fix incorrect treatment of max taken count. PR48225 SCEV makes a logical mistake when handling EitherMayExit in the case when both conditions must be met to exit the loop. The mistake is as follows: "if condition `A` fails within at most `X` first iterations, and `B` fails within at most `Y` first iterations, then `A & B` fails at most within `min (X, Y)` first iterations". This is wrong, because both of them must fail at the same time. A simple example illustrating this is the following: we have an IV with step 1, condition `A` = "IV is even", condition `B` = "IV is odd". Both `A` and `B` will fail within the first two iterations. But that doesn't mean that both of them will fail at the same time within the first two iterations, which would mean that the IV is neither even nor odd within the first 2 iterations. We can only do so for known exact BE counts, but not for max. Differential Revision: https://reviews.llvm.org/D91942 Reviewed By: nikic	2020-11-23 16:52:39 +07:00
Max Kazantsev	2290daa938	[Test] Auto-update checks in a test	2020-11-20 16:53:51 +07:00
Max Kazantsev	0c101c9cbc	[Test] Add tests demonstrating a bug in SCEV, PR48225 Slightly simplified version of original test reported by Congzhe Cao.	2020-11-20 15:59:22 +07:00
Nikita Popov	9ace4b337f	Revert "[SCEV] Factor out part of wrap flag detection logic [NFC-ish]" This reverts commit `1ec6e1eb8a`. This change causes a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=dd0b8b94d0796bd895cc998dd163b4fbebceb0b8&to=1ec6e1eb8a084bffae8a40236eb9925d8026dd07&stat=instructions I assume that this is due to the non-NFC part of the change, which now performs expensive nowrap inference even for nowrap flags that are not used by the particular code.	2020-11-15 10:19:44 +01:00
Philip Reames	1ec6e1eb8a	[SCEV] Factor out part of wrap flag detection logic [NFC-ish] In an effort to make code around flag determination more readable, and (possibly) prepare for a follow up change, factor out some of the flag detection logic. In the process, reduce the number of locations we mutate wrap flags by a couple. Note that this isn't NFC. The old code tried for NSW xor (NUW || NW). That is, two different paths computed different sets of wrap flags. The new code will try for all three. The result is that some expressions end up with a few extra flags set.	2020-11-14 19:21:05 -08:00
Nikita Popov	f3124a46c1	[SCEV] Fix nsw flags for GEP expressions The SCEV code for constructing GEP expressions currently assumes that the addition of the base and all the offsets is nsw if the GEP is inbounds. While the addition of the offsets is indeed nsw, the addition to the base address is not, as the base address is interpreted as an unsigned value. Fix the GEP expression code to not assume nsw for the base+offset calculation. However, do assume nuw if we know that the offset is non-negative. With this, we use the same behavior as the construction of GEP addrecs does. (Modulo the fact that we disregard SCEV unification, as the pre-existing FIXME points out). Differential Revision: https://reviews.llvm.org/D90648	2020-11-13 18:19:32 +01:00
Simon Pilgrim	88fe246a34	[ScalarEvolution] Remove unused check prefixes	2020-11-10 14:31:02 +00:00
Max Kazantsev	6022a8b7e8	[SCEV] Drop cached ranges of AddRecs after flag update Our range computation methods benefit from no-wrap flags. But if the ranges were first computed before the flags were set, the cached range will be too pessimistic. We need to drop cached ranges whenever we sharpen AddRec's no wrap flags. Differential Revision: https://reviews.llvm.org/D89847 Reviewed By: fhahn	2020-11-10 12:37:12 +07:00
Roman Lebedev	b4916918e5	[SCEV] SCEVPtrToIntExpr simplifications If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown, we want to sink the SCEVPtrToIntExpr into an operand, so that the operation is performed on integers, and eventually we end up with just an `SCEVPtrToIntExpr(SCEVUnknown)`. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89692	2020-10-30 11:13:35 +03:00
Roman Lebedev	81fc53a36a	[SCEV] Introduce SCEVPtrToIntExpr (PR46786) And use it to model LLVM IR's `ptrtoint` cast. This is essentially an alternative to D88806, but with no chance for all the problems it caused due to having the cast as implicit there. (see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3) As we've established by now, there are at least two reasons why we want this: * It will allow SCEV to actually model the `ptrtoint` casts and their operands, instead of treating them as `SCEVUnknown` * It should help with initial problem of PR46786 - this should eventually allow us to not lose pointer-ness of an expression in more cases As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle, we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()` should sink the cast as far down into the expression as possible, so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`. But I think that it isn't the best solution, because it doesn't really matter from memory consumption side - there probably won't be that many `SCEVPtrToIntExpr`s for it to matter, and it allows for much better discoverability. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89456	2020-10-30 11:13:35 +03:00
Max Kazantsev	5ef84688fb	Re-enable "[SCEV] Prove implications of different type via truncation" When we need to prove implication of expressions of different type width, the default strategy is to widen everything to wider type and prove in this type. This does not interact well with AddRecs with negative steps and unsigned predicates: such AddRec will likely not have a `nuw` flag, and its `zext` to wider type will not be an AddRec. In contrast, `trunc` of an AddRec in some cases can easily be proved to be an `AddRec` too. This patch introduces an alternative way of handling implications of different type widths. If we can prove that wider type values actually fit in the narrow type, we truncate them and prove the implication in narrow type. The return was due to revert of underlying patch that this one depends on. Unit test temporarily disabled because the required logic in SCEV is switched off due to compile time reasons. Differential Revision: https://reviews.llvm.org/D89548	2020-10-28 16:02:14 +07:00
Max Kazantsev	624fc63a05	[SCEV] Re-enable "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 3 We can sharpen the range of an AddRec if we know that it does not self-wrap and know the symbolic iteration count in the loop. If we can evaluate the value of AddRec on the last iteration and prove that at least one of its intermediate values lies between start and end, then no-wrap flag allows us to conclude that all of them also lie between start and end. So the estimate of range can be improved to union of ranges of start and end. Switched off by default, can be turned on by flag. Differential Revision: https://reviews.llvm.org/D89381 Reviewed By: lebedev.ri, nikic	2020-10-28 12:39:41 +07:00
Nikita Popov	ebeef022aa	[SCEV] Strengthen nowrap flags after constant folding for mul exprs Same change as `0dda633317`, but for mul expressions. We want to first fold any constant operands and then strengthen the nowrap flags, as we can compute more precise flags at that point.	2020-10-25 19:43:58 +01:00
Nikita Popov	1ff313f098	[SCEV] Always constant fold mul expression operands Establish parity with the handling of add expressions, by always constant folding mul expression operands before checking the depth limit (this is a non-recursive simplification). The code was already unconditionally constant folding the case where all operands were constants, but was not folding multiple constant operands together if there were also non-constant operands. This requires picking out a different demonstration for depth-based folding differences in the limit-depth.ll test.	2020-10-25 18:50:06 +01:00
Nikita Popov	0dda633317	[SCEV] Strengthen nowrap flags after constant folding We should first try to constant fold the add expression and only strengthen nowrap flags afterwards. This allows us to determine stronger flags if e.g. only two operands are left after constant folding (and thus "guaranteed no wrap region" code applies) or the resulting operands are non-negative and thus nsw->nuw strengthening applies.	2020-10-25 18:00:22 +01:00
Arthur Eubanks	1d1217c4ea	[test] Fix no-wrap-symbolic-becount.ll under NPM	2020-10-21 13:15:15 -07:00
Max Kazantsev	bed02fa8b0	Revert "[SCEV] Prove implications of different type via truncation" This reverts commit `80852a4f2f`. Test is now broken because underlying required patch was also reverted SUDDENLY.	2020-10-21 13:03:46 +07:00
Max Kazantsev	80852a4f2f	[SCEV] Prove implications of different type via truncation When we need to prove implication of expressions of different type width, the default strategy is to widen everything to wider type and prove in this type. This does not interact well with AddRecs with negative steps and unsigned predicates: such AddRec will likely not have a `nuw` flag, and its `zext` to wider type will not be an AddRec. In contrast, `trunc` of an AddRec in some cases can easily be proved to be an `AddRec` too. This patch introduces an alternative way of handling implications of different type widths. If we can prove that wider type values actually fit in the narrow type, we truncate them and prove the implication in narrow type. Differential Revision: https://reviews.llvm.org/D89548 Reviewed By: fhahn	2020-10-21 12:53:22 +07:00
Fangrui Song	d9f91a3d14	Revert D89381 "[SCEV] Recommit "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 2" This reverts commit `a10a64e7e3`. It broke polly/test/ScopInfo/NonAffine/non-affine-loop-condition-dependent-access_3.ll The difference suggests that this may be a serious issue.	2020-10-20 21:03:58 -07:00
Roman Lebedev	d1946469d6	[NFC][SCEV] Improve/rework test coverage for ptrtoint handling	2020-10-20 14:17:56 +03:00
Max Kazantsev	a10a64e7e3	[SCEV] Recommit "Use nw flag and symbolic iteration count to sharpen ranges of AddRecs", attempt 2 Fixed wrapping range case & proof methods reduced to constant range checks to save compile time. Differential Revision: https://reviews.llvm.org/D89381	2020-10-20 11:32:36 +07:00
Florian Hahn	3cbdae22b9	[SCEV] Add tests where assumes can be used to improve trip multiple. This patch adds a set of tests where information from assumes can be used to improve the trip multiple. See PR47904.	2020-10-19 18:26:09 +01:00
Max Kazantsev	c153d48b15	[Test] Add one more SCEV range test	2020-10-19 13:38:20 +07:00
Roman Lebedev	ec54867df5	[SCEV] Model `ashr exact x, C` as `(abs(x) EXACT/u (1<<C)) * signum(x)` It's not pretty, but probably better than modelling it as an opaque SCEVUnknown, i guess. It is relevant e.g. for the loop that was brought up in https://bugs.llvm.org/show_bug.cgi?id=46786#c26 as an example of what we'd be able to better analyze once SCEV handles `ptrtoint` (D89456). But as it is evident, even if we deal with `ptrtoint` there, we also fail to model such an `ashr`. Also, modeling of mul-of-exact-shr/div could use improvement. As per alive2: https://alive2.llvm.org/ce/z/tnfZKd ``` define i8 @src(i8 %0) { %2 = ashr exact i8 %0, 4 ret i8 %2 } declare i8 @llvm.abs(i8, i1) declare i8 @llvm.smin(i8, i8) declare i8 @llvm.smax(i8, i8) define i8 @tgt(i8 %x) { %abs_x = call i8 @llvm.abs(i8 %x, i1 false) %div = udiv exact i8 %abs_x, 16 %t0 = call i8 @llvm.smax(i8 %x, i8 -1) %t1 = call i8 @llvm.smin(i8 %t0, i8 1) %r = mul nsw i8 %div, %t1 ret i8 %r } ``` Transformation seems to be correct!	2020-10-17 21:22:24 +03:00
Roman Lebedev	bd6d41f52e	[NFC][SCEV] Add some more ptrtoint/PR46786 -related tests	2020-10-17 21:04:44 +03:00
Nikita Popov	74c8c2d903	Revert "Recommit "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs"" This reverts commit `32b72c3165`. While better than before, this change still introduces a large compile-time regression (>3% on mafft): https://llvm-compile-time-tracker.com/compare.php?from=fbd62fe60fb2281ca33da35dc25ca3c87ec0bb51&to=32b72c3165bf65cca2e8e6197b59eb4c4b60392a&stat=instructions Additionally, the logic here doesn't look quite right to me, I will comment in more detail on the differential revision.	2020-10-16 21:36:33 +02:00
Florian Hahn	f085b7cbc1	[SCEV] Add additional tests where the max BTC is limited by wrapping.	2020-10-16 20:36:02 +01:00
Max Kazantsev	32b72c3165	Recommit "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs" It was reverted because of negative compile time impact. In this version, less powerful proof methods are used (non-recursive reasoning only), and scope limited to constant End values to avoid explosion of complex proofs. Differential Revision: https://reviews.llvm.org/D89381	2020-10-16 17:35:13 +07:00
Florian Hahn	e034c3f704	[SCEV] Add a few test cases where the max BTC is limited by wrapping.	2020-10-16 09:53:32 +01:00
Nikita Popov	7d3b475810	Revert "[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs" This reverts commit `905101c360`. This causes a large compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=cc175c2cc8e638462bab74e0781e06f9b6eb5017&to=905101c36025fe1c8ecdf9a20cd59db036676073&stat=instructions	2020-10-16 09:47:38 +02:00
Max Kazantsev	905101c360	[SCEV] Use nw flag and symbolic iteration count to sharpen ranges of AddRecs We can sharpen the range of an AddRec if we know that it does not self-wrap and know the symbolic iteration count in the loop. If we can evaluate the value of AddRec on the last iteration and prove that at least one of its intermediate values lies between start and end, then no-wrap flag allows us to conclude that all of them also lie between start and end. So the estimate of range can be improved to union of ranges of start and end. Differential Revision: https://reviews.llvm.org/D89381 Reviewed By: efriedma	2020-10-16 12:00:39 +07:00
Roman Lebedev	b3d2df42f7	[NFC][SCEV] Autogenerate check lines in tests being affected by upcoming patch	2020-10-15 23:15:03 +03:00
Roman Lebedev	7ee6c40247	Revert "Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"" and its follow-ups While we haven't encountered an earth-shattering problem with this yet, by now it is pretty evident that trying to model the ptr->int cast implicitly leads to having to update every single place that assumed no such cast could be needed. That is of course the wrong approach. Let's back this out, and re-attempt with another approach, possibly one originally suggested by Eli Friedman in https://bugs.llvm.org/show_bug.cgi?id=46786#c20 which should hopefully spare us this pain and more. This reverts commits `1fb6104293`, `7324616660`, `aaafe350bb`, `e92a8e0c74`. I've kept and improved the tests though.	2020-10-14 16:09:18 +03:00
Max Kazantsev	fb2627d8d2	[Test] Add test showing that SCEV cannot compute IV's range	2020-10-13 17:52:39 +07:00
Roman Lebedev	aaafe350bb	[SCEV] BuildConstantFromSCEV(): properly handle SCEVSignExtend from ptr Much similar to the ZExt/Trunc handling. Thanks goes to Alexander Richardson for nudging towards noticing this one proactively. The appropriate (currently crashing) test coverage added.	2020-10-13 12:19:59 +03:00
Roman Lebedev	7324616660	[SCEV] BuildConstantFromSCEV(): properly handle SCEVZeroExtend from ptr As being reported in https://reviews.llvm.org/D88806#2326944, this is pretty much the sibling problem of https://reviews.llvm.org/D88806#2325340, with root cause being that SCEV now models `ptrtoint` as trunc/zext/self of unknown. The appropriate (currently crashing) test coverage added.	2020-10-13 11:47:44 +03:00
Roman Lebedev	1fb6104293	Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" This relands commit `1c021c64ca` which was reverted in commit `17cec6a11a` because an assertion was being triggered, since `BuildConstantFromSCEV()` wasn't updated to handle the case where the constant we want to truncate is actually a pointer. I was unsuccessful in coming up with a test case where we'd end there with constant zext/sext of a pointer, so i didn't handle those cases there until there is a test case. Original commit message: While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 23:02:55 +03:00
Roman Lebedev	73818f450e	[NFC][ScalarEvolution] Add tests with ptrtoint in constant context in loop Reduced from the https://reviews.llvm.org/D88806#2325340	2020-10-12 23:02:55 +03:00
Hans Wennborg	17cec6a11a	Revert `1c021c64c` "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" > While we indeed can't treat them as no-ops, i believe we can/should > do better than just modelling them as `unknown`. `inttoptr` story > is complicated, but for `ptrtoint`, it seems straight-forward > to model it just as a zext-or-trunc of unknown. > > This may be important now that we track towards > making inttoptr/ptrtoint casts not no-op, > and towards preventing folding them into loads/etc > (see D88979/D88789/D88788) > > Reviewed By: mkazantsev > > Differential Revision: https://reviews.llvm.org/D88806 It caused the following assert during Chromium builds: llvm/lib/IR/Constants.cpp:1868: static llvm::Constant llvm::ConstantExpr::getTrunc(llvm::Constant , llvm::Type *, bool): Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed. See code review for a link to a reproducer. This reverts commit `1c021c64ca`.	2020-10-12 18:39:35 +02:00
Roman Lebedev	1c021c64ca	[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 11:04:03 +03:00
Florian Hahn	d48b249b71	[SCEV] Add test cases where the max BTC is imprecise, due to step != 1. Add a test case where we fail to compute a tight max backedge taken count, due to the step being != 1. This is part of the issue with PR40961.	2020-10-10 16:39:48 +01:00
Florian Hahn	2e9fd754b4	[SCEV] Handle ULE in applyLoopGuards. Handle ULE predicate in similar fashion to ULT predicate in applyLoopGuards.	2020-10-10 16:26:28 +01:00
Florian Hahn	2c6fc28aba	[SCEV] Add a test case with ULE loop guard.	2020-10-10 15:58:26 +01:00
Roman Lebedev	027e7a7721	Reland "[NFC][SCEV] Improve tests for ptrtoint modelling (D88806)" I messed up runlines in the original commit.	2020-10-09 14:50:05 +03:00
Roman Lebedev	2aeae1617c	Revert "[NFC][SCEV] Improve tests for ptrtoint modelling (D88806)" Buildbots aren't happy, need to investigate. This reverts commit `32cc8f7998`.	2020-10-09 14:10:43 +03:00
Roman Lebedev	32cc8f7998	[NFC][SCEV] Improve tests for ptrtoint modelling (D88806)	2020-10-09 13:50:30 +03:00
Roman Lebedev	80ac6da98e	[NFC][SCEV] Add a test with some patterns where we could treat inttoptr/ptrtoint as semi-transparent	2020-10-05 00:05:39 +03:00
Florian Hahn	0ad793f321	[SCEV] Also use info from assumes in applyLoopGuards. Similar to collecting information from branches guarding a loop, we can also collect information from assumes dominating the loop header. Fixes PR47247. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87854	2020-09-28 13:14:24 +01:00
Florian Hahn	7d274aa9be	[SCEV] Add support for `x != 0` to CollectCondition. Add support for NE predicates with 0 constants. Those can be translated to UMaxExpr(x, 1).	2020-09-25 18:58:55 +01:00
Florian Hahn	3a69ebf0ad	[SCEV] Add another test using info from loop guards for BTC with NE.	2020-09-25 18:58:55 +01:00
Florian Hahn	b5a3b901c7	[SCEV] Add support for `x == constant` to CollectCondition. Add support for EQ predicates with constant operand. In that case, using the constant instead of an unknown expression should always be beneficial.	2020-09-25 16:56:49 +01:00
Florian Hahn	8858340bd3	[SCEV] Swap operands if LHS is not unknown. Currently we only use information from guards for unknown expressions. Swap LHS/RHS and predicate, if LHS is not unknown.	2020-09-25 15:50:01 +01:00
Florian Hahn	1fa06162c1	[SCEV] Add more tests using info from loop guards for BTC.	2020-09-25 14:18:58 +01:00
Florian Hahn	d4ddf63fc4	[SCEV] Use loop guard info when computing the max BE taken count in howFarToZero. For some expressions, we can use information from loop guards when we are looking for a maximum. This patch applies information from loop guards to the expression used to compute the maximum backedge taken count in howFarToZero. It currently replaces an unknown expression X with UMin(X, Y), if the loop is guarded by X ult Y. This patch is minimal in what conditions it applies, and there are a few TODOs to generalize. This partly addresses PR40961. We will also need an update to LV to address it completely. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D67178	2020-09-24 11:06:55 +01:00
Arthur Eubanks	2d0de5f9a4	[test][NewPM] Clean up ScalarEvolution tests to work under NPM	2020-09-22 19:31:10 -07:00
Fangrui Song	8fdac7cb7a	Revert D71539 "Recommit "[SCEV] Look through single value PHIs."" This reverts commit `11dccf8d3a`. A bootstrapped clang crashes (due to ArrayRef::front called on an empty ArrayRef) when compiling some files. Very strangely, this only reproduces with modules. ``` 13 0x0000564d3349e968 llvm::ArrayRef<llvm::BasicBlock>::front() const /proc/self/cwd/llvm/include/llvm/ADT/ArrayRef.h:160:7 14 0x0000564d3349e896 llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getHeader() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfo.h:104:50 15 0x0000564d3349fd9d llvm::LoopBase<llvm::BasicBlock, llvm::Loop>::getLoopLatch() const /proc/self/cwd/llvm/include/llvm/Analysis/LoopInfoImpl.h:210:11 16 0x0000564d33593c8a llvm::ScalarEvolution::computeBackedgeTakenCount(llvm::Loop const, bool) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6933:15 17 0x0000564d33592ebc llvm::ScalarEvolution::getBackedgeTakenInfo(llvm::Loop const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:0:30 18 0x0000564d33593a54 llvm::ScalarEvolution::getBackedgeTakenCount(llvm::Loop const, llvm::ScalarEvolution::ExitCountKind) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:6487:36 19 0x0000564d32be2402 llvm::ScalarEvolution::getConstantMaxBackedgeTakenCount(llvm::Loop const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:768:5 20 0x0000564d33590807 llvm::ScalarEvolution::getRangeRef(llvm::SCEV const, llvm::ScalarEvolution::RangeSignHint) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:5495:19 21 0x0000564d320abab7 llvm::ScalarEvolution::getSignedRange(llvm::SCEV const) /proc/self/cwd/llvm/include/llvm/Analysis/ScalarEvolution.h:840:12 22 0x0000564d335a03aa llvm::ScalarEvolution::isKnownPredicateViaConstantRanges(llvm::CmpInst::Predicate, llvm::SCEV const, llvm::SCEV const) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:9239:60 23 0x0000564d33586a80 llvm::ScalarEvolution::isKnownViaNonRecursiveReasoning(llvm::CmpInst::Predicate, llvm::SCEV const, 
llvm::SCEV const*) /proc/self/cwd/llvm/lib/Analysis/ScalarEvolution.cpp:10284:60 ```	2020-09-21 17:21:43 -07:00
Roman Lebedev	64e2cb7e96	[SCEV] Recognize @llvm.uadd.sat as `%y + umin(%x, (-1 - %y))` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = sub nsw nuw i32 4294967295, %y %t1 = umin i32 %x, %t0 %r = add nuw i32 %t1, %y ret i32 %r } Transformation seems to be correct! The alternative, naive, lowering could be the following, although I don't think it's better, though it will likely be needed for sadd/ssub/*shl: ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = uadd_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = zext i32 %x to i33 %t1 = zext i32 %y to i33 %t2 = add nuw i33 %t0, %t1 %t3 = zext i32 4294967295 to i33 %t4 = umin i33 %t2, %t3 %r = trunc i33 %t4 to i32 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	fedc9549d5	[SCEV] Recognize @llvm.usub.sat as `%x - (umin %x, %y)` ---------------------------------------- define i32 @src(i32 %x, i32 %y) { %0: %r = usub_sat i32 %x, %y ret i32 %r } => define i32 @tgt(i32 %x, i32 %y) { %0: %t0 = umin i32 %x, %y %r = sub nuw i32 %x, %t0 ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:54 +03:00
Roman Lebedev	0592de550f	[NFC][SCEV] Add tests for @llvm.*.sat intrinsics	2020-09-21 20:25:53 +03:00
Roman Lebedev	1bb7ab8c4a	[SCEV] Recognize @llvm.abs as smax(x, -x) As per alive2 (ignoring undef): ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 0 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct! ---------------------------------------- define i32 @src(i32 %x, i1 %y) { %0: %r = abs i32 %x, 1 ret i32 %r } => define i32 @tgt(i32 %x, i1 %y) { %0: %neg_x = mul nsw i32 %x, 4294967295 %r = smax i32 %x, %neg_x ret i32 %r } Transformation seems to be correct!	2020-09-21 20:25:53 +03:00
Roman Lebedev	83c2d10d3c	[NFC][SCEV] Add tests for @llvm.abs intrinsic	2020-09-21 20:25:53 +03:00
Florian Hahn	3cbdfe424f	[SCEV] Add additional max BTC tests with loop guards.	2020-09-21 17:41:24 +01:00
Florian Hahn	11dccf8d3a	Recommit "[SCEV] Look through single value PHIs." This commit was originally reverted because it was suspected to cause a crash, but a reproducer did not surface. A crash that was exposed by this change was fixed in `1d8f2e5292`. This reverts the revert commit `0581c0b0ee`.	2020-09-21 11:59:50 +01:00
Florian Hahn	51973a607d	[SCEV] Add test cases for max BTC with loop guard info. This adds test cases for PR40961 and PR47247. They illustrate cases in which the max backedge-taken count can be improved by information from the loop guards.	2020-09-17 20:27:48 +01:00
Nikita Popov	ac87480bd8	[SCEV] Recognize min/max intrinsics Recognize umin/umax/smin/smax intrinsics and convert them to the already existing SCEV nodes of the same name. In the future we'll want SCEVExpander to also produce the intrinsics, but we're not ready for that yet. Differential Revision: https://reviews.llvm.org/D87160	2020-09-05 16:30:11 +02:00
Nikita Popov	6b50ce3ac9	[SCEV] Add tests for min/max intrinsics (NFC)	2020-09-04 22:08:01 +02:00
Max Kazantsev	e7f53044e7	[Test] Move IndVars test to a proper place	2020-09-01 12:17:31 +07:00
Ali Tamur	0581c0b0ee	Revert "[SCEV] Look through single value PHIs." This reverts commit `e441b7a7a0`. This patch causes a compile error in tensorflow opensource project. The stack trace looks like: Point of crash: llvm/include/llvm/Analysis/LoopInfoImpl.h : line 35 (gdb) ptype this type = const class llvm::LoopBase<llvm::BasicBlock, llvm::Loop> [with BlockT = llvm::BasicBlock, LoopT = llvm::Loop] (gdb) p this $1 = {ParentLoop = 0x0, SubLoops = std::vector of length 0, capacity 0, Blocks = std::vector of length 0, capacity 1, DenseBlockSet = {<llvm::SmallPtrSetImpl<llvm::BasicBlock const>> = {<llvm::SmallPtrSetImplBase> = {<llvm::DebugEpochBase> = {Epoch = 3}, SmallArray = 0x1b2bf6c8, CurArray = 0x1b2bf6c8, CurArraySize = 8, NumNonEmpty = 0, NumTombstones = 0}, <No data fields>}, SmallStorage = {0xfffffffffffffffe, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, IsInvalid = true} (gdb) p this->DenseBlockSet->CurArray $2 = (const void *) 0xfffffffffffffffe I will try to get a case from tensorflow or use creduce to get a small case.	2020-08-12 23:13:24 -07:00
Florian Hahn	e441b7a7a0	[SCEV] Look through single value PHIs. Now that SCEVExpander can preserve LCSSA form, we do not have to worry about LCSSA form when trying to look through PHIs. SCEVExpander will take care of inserting LCSSA PHI nodes as required. This increases precision of the analysis in some cases. Reviewed By: mkazantsev, bmahjour Differential Revision: https://reviews.llvm.org/D71539	2020-08-12 10:03:42 +01:00
Florian Hahn	3483c28c5b	[SCEV] If RHS >= Start, simplify (Start smax RHS) to RHS for trip counts. This is the max version of D85046. This change causes binary changes in 44 out of 237 benchmarks (out of MultiSource/SPEC2000/SPEC2006) Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D85189	2020-08-11 13:20:24 +01:00
Max Kazantsev	da9e7b1ab0	[Test] Added test showing missing range check elimination opportunity in IndVars Seems that SCEV is not powerful enough to handle this.	2020-08-07 16:47:25 +07:00
Florian Hahn	b7856f9d8d	[SCEV] Consolidate some smin/smax folding tests into single test file. This patch moves a few spread out smin/smax tests to smin-smax-folds.ll and adds additional test cases that expose further potential for folds.	2020-08-04 10:24:11 +01:00
Florian Hahn	ee1c12708a	[SCEV] If Start>=RHS, simplify (Start smin RHS) = RHS for trip counts. In some cases, it seems like we can get rid of unnecessary s/umins by using information from the loop guards (unless I am missing something). One place where this seems to be helpful in practice is when computing loop trip counts. This patch just changes howManyGreaterThans for now. Note that this requires a loop for which we can check 'is guarded'. On SPEC2000/SPEC2006/MultiSource, there are some notable changes for some programs in the number of loops unrolled and trip counts computed. ``` Same hash: 179 (filtered out) Remaining: 58 Metric: scalar-evolution.NumTripCountsComputed Program base patch diff test-suite...langs-C/compiler/compiler.test 25.00 31.00 24.0% test-suite.../Applications/SPASS/SPASS.test 2020.00 2323.00 15.0% test-suite...langs-C/allroots/allroots.test 29.00 32.00 10.3% test-suite.../Prolangs-C/loader/loader.test 17.00 18.00 5.9% test-suite...fice-ispell/office-ispell.test 253.00 265.00 4.7% test-suite...006/450.soplex/450.soplex.test 3552.00 3692.00 3.9% test-suite...chmarks/MallocBench/gs/gs.test 453.00 470.00 3.8% test-suite...ngs-C/assembler/assembler.test 29.00 30.00 3.4% test-suite.../Benchmarks/Ptrdist/bc/bc.test 263.00 270.00 2.7% test-suite...rks/FreeBench/pifft/pifft.test 722.00 741.00 2.6% test-suite...count/automotive-bitcount.test 41.00 42.00 2.4% test-suite...0/253.perlbmk/253.perlbmk.test 1417.00 1451.00 2.4% test-suite...000/197.parser/197.parser.test 387.00 396.00 2.3% test-suite...lications/sqlite3/sqlite3.test 1168.00 1189.00 1.8% test-suite...000/255.vortex/255.vortex.test 173.00 176.00 1.7% Metric: loop-unroll.NumUnrolled Program base patch diff test-suite...langs-C/compiler/compiler.test 1.00 3.00 200.0% test-suite.../Applications/SPASS/SPASS.test 134.00 234.00 74.6% test-suite...count/automotive-bitcount.test 3.00 4.00 33.3% test-suite.../Prolangs-C/loader/loader.test 3.00 4.00 33.3% test-suite...langs-C/allroots/allroots.test 3.00 4.00 33.3% test-suite...Source/Benchmarks/sim/sim.test 10.00 12.00 20.0% test-suite...fice-ispell/office-ispell.test 21.00 25.00 19.0% test-suite.../Benchmarks/Ptrdist/bc/bc.test 32.00 38.00 18.8% test-suite...006/450.soplex/450.soplex.test 300.00 352.00 17.3% test-suite...rks/FreeBench/pifft/pifft.test 60.00 69.00 15.0% test-suite...chmarks/MallocBench/gs/gs.test 57.00 63.00 10.5% test-suite...ngs-C/assembler/assembler.test 10.00 11.00 10.0% test-suite...0/253.perlbmk/253.perlbmk.test 145.00 157.00 8.3% test-suite...000/197.parser/197.parser.test 43.00 46.00 7.0% test-suite...TimberWolfMC/timberwolfmc.test 205.00 214.00 4.4% Geomean difference 7.6% ``` Fixes https://bugs.llvm.org/show_bug.cgi?id=46939 Fixes https://bugs.llvm.org/show_bug.cgi?id=46924 on X86. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D85046	2020-08-03 17:22:42 +01:00
Florian Hahn	ffb4735200	[SCEV] Precommit tests with signed counting down loop. From PR46939.	2020-08-02 10:26:26 +01:00
Florian Hahn	be2ea29ee1	[SCEV] Add additional tests. Increase test coverage for upcoming changes to how SCEV deals with LCSSA phis.	2020-07-28 16:15:57 +01:00
Max Kazantsev	c1d8e39236	[Test] Add more simple tests for PR46786	2020-07-22 17:11:26 +07:00
Max Kazantsev	b96114c1e1	[SCEV] Remove premature assert. PR46786 This assert was added to verify the assumption that GEP's SCEV will be of pointer type, based on the fact that it should be a SCEVAddExpr with (at least) the last operand being a pointer. Two notes: - GEP's SCEV does not have to be a SCEVAddExpr after all simplifications; - In the current state, GEP's SCEV does not have to have at least one pointer operand (all of them can become int during the transforms). However, we might want to be at a point where it is true. We are currently removing this assert and will try to enumerate the cases where the "is pointer" notion might be lost during the transforms. When all of them are fixed, we can return it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri	2020-07-22 15:43:16 +07:00
Arthur Eubanks	9adbb5cb3a	[SCEV] Fix ScalarEvolution tests under NPM Many tests use opt's -analyze feature, which does not translate well to NPM and has better alternatives. The alternative here is to explicitly add a pass that calls ScalarEvolution::print(). The legacy pass manager RUNs aren't changing, but they are now pinned to the legacy pass manager. For each legacy pass manager RUN, I added a corresponding NPM RUN using the 'print<scalar-evolution>' pass. For compatibility with update_analyze_test_checks.py and existing test CHECKs, 'print<scalar-evolution>' now prints what -analyze prints per function. This was generated by the following Python script and failures were manually fixed up: import sys for i in sys.argv: with open(i, 'r') as f: s = f.read() with open(i, 'w') as f: for l in s.splitlines(): if "RUN:" in l and ' -analyze ' in l and '\\' not in l: f.write(l.replace(' -analyze ', ' -analyze -enable-new-pm=0 ')) f.write('\n') f.write(l.replace(' -analyze ', ' -disable-output ').replace(' -scalar-evolution ', ' "-passes=print<scalar-evolution>" ').replace(" | ", " 2>&1 | ")) f.write('\n') else: f.write(l) There are a couple failures still in ScalarEvolution under NPM, but those are due to other unrelated naming conflicts. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D83798	2020-07-16 11:24:07 -07:00
Arthur Eubanks	f413b53a67	[NPM][IVUsers] Rename ivusers -> iv-users LPM passes were named iv-users, which seems nicer than ivusers. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D83803	2020-07-15 09:38:21 -07:00
Roman Lebedev	a2619a60e4	Reland "[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`" This reverts commit `d3e3f36ff1`, which reverted the original commit `2c16100e6f`, but with polly tests now actually passing.	2020-07-06 18:00:22 +03:00
Mikhail Goncharov	d3e3f36ff1	Revert "[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem`" Summary: This reverts commit `2c16100e6f`. ninja check-polly fails: Polly :: Isl/CodeGen/MemAccess/generate-all.ll Polly :: ScopInfo/multidim_srem.ll Reviewers: kadircet, bollu Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83230	2020-07-06 16:41:59 +02:00
Arthur Eubanks	3d12e79094	[NewPM][LSR] Rename strength-reduce -> loop-reduce The legacy pass was called "loop-reduce". This lowers the number of check-llvm failures under NPM by 83. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D82925	2020-07-02 11:15:29 -07:00
Roman Lebedev	2c16100e6f	[ScalarEvolution] createSCEV(): recognize `udiv`/`urem` disguised as an `sdiv`/`srem` Summary: While InstCombine trivially converts that `srem` into a `urem`, it might happen later than wanted, in particular i'd like for that to happen on https://godbolt.org/z/bwuEmJ test case early in pipeline, before first instcombine run, just before `-mem2reg`. SCEV should recognize this case natively. Reviewers: mkazantsev, efriedma, nikic, reames Reviewed By: efriedma Subscribers: clementval, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82721	2020-07-02 13:22:12 +03:00
Roman Lebedev	e7da7d9428	[NFCI] Actually provide correct check lines in sdiv.ll	2020-07-02 02:00:02 +03:00
Roman Lebedev	51ff7642a3	[NFC][ScalarEvolution] Add udiv-disguised-as-sdiv test Much like `25521150d7`, but with division instead of remainder. See https://reviews.llvm.org/D82721	2020-07-02 01:44:19 +03:00
Roman Lebedev	25521150d7	[NFC][ScalarEvolution] Add a test showing SCEV failure to recognize 'urem' While InstCombine trivially converts that `srem` into a `urem`, it might happen later than wanted. SCEV should recognize this natively.	2020-06-28 20:35:02 +03:00
Roman Lebedev	141e845da5	[SCEV] Make SCEVAddExpr actually always return pointer type if there is pointer operand (PR46457) Summary: The added assertion fails on the added test without the fix. Reduced from test-suite/MultiSource/Benchmarks/MiBench/office-ispell/correct.c In IR, getelementptr, obviously, takes pointer as its base, and returns a pointer. When creating an SCEV expression, SCEV operands are sorted in hope that it increases folding potential, and at the same time SCEVAddExpr's type is the type of the last(!) operand. Which means, in some exceedingly rare cases, pointer operand may happen to end up not being the last operand, and as a result SCEV for GEP will suddenly have a non-pointer return type. We should ensure that does not happen. In the end, actually storing the `Type *`, at the cost of increasing memory footprint of `SCEVAddExpr`, appears to be the solution. We can't just store a 'is a pointer' bit and create pointer type on the fly since we don't have data layout in getType(). Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=46457 | PR46457 ]] Reviewers: efriedma, mkazantsev, reames, nikic Reviewed By: efriedma Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82633	2020-06-27 11:37:17 +03:00
Fangrui Song	4cd19a6e15	[BasicAA] Rename -disable-basicaa to -disable-basic-aa to be consistent with the canonical name "basic-aa"	2020-06-26 20:55:44 -07:00
Fangrui Song	f31811f2dc	[BasicAA] Rename deprecated -basicaa to -basic-aa Follow-up to D82607 Revert an accidental change (empty.ll) of D82683	2020-06-26 20:41:37 -07:00
Roman Lebedev	c868335e24	[SCEV] ScalarEvolution::createSCEV(): clarify no-wrap flag propagation for shift by bitwidth-1 Summary: There was this comment here previously: ``` - // It is currently not resolved how to interpret NSW for left - // shift by BitWidth - 1, so we avoid applying flags in that - // case. Remove this check (or this comment) once the situation - // is resolved. See - // http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html - // and http://reviews.llvm.org/D8890 . ``` But langref was fixed in rL286785, and the behavior is pretty obvious: http://volta.cs.utah.edu:8080/z/MM4WZP ^ nuw can always be propagated. nsw can be propagated if either nuw is specified, or the shift is by less than bitwidth-1. This mimics similar D81189 Reassociate change, alive2 is happy about that one. I'm not sure why `NUW` isn't being printed, but that seems unrelated. Reviewers: mkazantsev, reames, sanjoy, nlopes, craig.topper, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81243	2020-06-06 13:02:07 +03:00
Roman Lebedev	39e3683534	[NFC][SCEV] Add test with 'or' with no common bits set	2020-06-05 12:18:15 +03:00
Roman Lebedev	39e3c92410	[NFC][SCEV] Some tests for shifts by bitwidth-2/bitwidth-1 w/ no-wrap flags	2020-06-05 11:45:09 +03:00
Denis Antrushin	5451289aba	[SCEV] Constant fold MultExpr before applying depth limit. Summary: Users of SCEV reasonably assume that multiplication of two constant SCEVs will in turn be constant. However, that is not always the case: First, we can get here with reached depth limit, and will create MultExpr SCEV `C1 * C2` and cache it. Then, we can get here with the same operands, but with small depth level. But this time we will find existing MultExpr SCEV and return it, instead of expected constant SCEV. This patch changes getMultExpr to not apply depth limit to all constant operands expression, allowing them to be folded. Reviewers: reames, mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79893	2020-05-22 18:34:32 +03:00
Eli Friedman	4532a50899	Infer alignment of unmarked loads in IR/bitcode parsing. For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the IR later in the file. For optimization testcases that don't care about the datalayout, this is also really simple: we just use the default datalayout. The complexity here comes from the fact that some LLVM tools allow overriding the datalayout: some tools have an explicit flag for this, some tools will infer a datalayout based on the code generation target. Supporting this properly required plumbing through a bunch of new machinery: we want to allow overriding the datalayout after the datalayout is parsed from the file, but before we use any information from it. Therefore, IR/bitcode parsing now has a callback to allow tools to compute the datalayout at the appropriate time. Not sure if I covered all the LLVM tools that want to use the callback. (clang? lli? Misc IR manipulation tools like llvm-link?). But this is at least enough for all the LLVM regression tests, and IR without a datalayout is not something frontends should generate. This change had some sort of weird effects for certain CodeGen regression tests: if the datalayout is overridden with a datalayout with a different program or stack address space, we now parse IR based on the overridden datalayout, instead of the one written in the file (or the default one, if none is specified). This broke a few AVR tests, and one AMDGPU test. Outside the CodeGen tests I mentioned, the test changes are all just fixing CHECK lines and moving around datalayout lines in weird places. Differential Revision: https://reviews.llvm.org/D78403	2020-05-14 13:03:50 -07:00
Juneyoung Lee	e5f602d82c	[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. Summary: This patch makes propagatesPoison be more accurate by returning true on more bin ops/unary ops/casts/etc. The changed test in ScalarEvolution/nsw.ll was introduced by `a19edc4d15` . IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has no-overflow flags even if the loop isn't in the wanted form. It becomes more accurate with this patch, so I think this is okay. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy Reviewed By: spatel, nikic Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D78615	2020-05-13 02:51:42 +09:00
Juneyoung Lee	aca335955c	[ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 08:08:53 +09:00
Juneyoung Lee	5ceef26350	Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison" This reverts commit `80faa8c3af`.	2020-04-23 08:07:09 +09:00
Juneyoung Lee	80faa8c3af	RFC: [ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 07:57:12 +09:00
Eli Friedman	9b9454af8a	Require "target datalayout" to be at the beginning of an IR file. This will allow us to use the datalayout to disambiguate other constructs in IR, like load alignment. Split off from D78403. Differential Revision: https://reviews.llvm.org/D78413	2020-04-20 11:55:49 -07:00
Denis Antrushin	06c58f11a9	[SCEV] Use backedge SCEV of PHI only if its input is loop invariant For the PHI node %1 = phi [%A, %entry], [%X, %latch] it is incorrect to use SCEV of backedge val %X as an exit value of PHI unless %X is loop invariant. This is because exit value of %1 is value of %X at one-before-last iteration of the loop. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D73181	2020-03-31 18:39:24 +07:00
Eli Friedman	65fc706ddf	[SCEV] Add support for GEPs over scalable vectors. Because we have to use a ConstantExpr at some point, the canonical form isn't set in stone, but this seems reasonable. The pretty sizeof(<vscale x 4 x i32>) dumping is a relic of ancient LLVM; I didn't have to touch that code. :) Differential Revision: https://reviews.llvm.org/D75887	2020-03-13 16:12:45 -07:00
Roman Lebedev	9c801c48ee	[NFC][IndVarSimplify] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668)	2020-01-27 23:34:29 +03:00
Zheng Chen	a6342c247a	[SCEV] accurate range for addrecexpr with nuw flag If addrecexpr has nuw flag, the value should never be less than its start value and start value is not required to be a SCEVConstant. Reviewed By: nikic, sanjoy Differential Revision: https://reviews.llvm.org/D71690	2020-01-12 20:22:37 -05:00
Zheng Chen	569ccfc384	[SCEV] more accurate range for addrecexpr with nsw flag. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D72436	2020-01-11 23:26:35 -05:00
Zheng Chen	a701be8f03	[SCEV] [NFC] add more test cases for range of addrecexpr with nsw flag	2020-01-10 22:44:47 -05:00
Zheng Chen	4ebb589629	[SCEV] [NFC] add testcase for constant range for addrecexpr with nsw flag	2020-01-09 01:26:57 -05:00
Fangrui Song	a36ddf0aa9	Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351	2019-12-24 16:27:51 -08:00
czhengsz	7259f04dde	[SCEV] add testcase for get accurate range for addrecexpr with nuw flag	2019-12-22 20:58:19 -05:00
czhengsz	d588a00206	[SCEV] NFC - add testcase for get accurate range for AddExpr	2019-12-19 04:11:45 -05:00
Philip Reames	d9426c3360	[Tests] Autogenerate a bunch of SCEV trip count tests for readability. Will likely merge some of these files soon.	2019-11-21 10:46:16 -08:00
Philip Reames	70d173fb1f	[SCEV] Add a mode to skip classification when printing analysis For the various trip-count tests, the classification isn't useful and makes the auto-generated tests super verbose. By skipping it, we make the auto-gen tests closer to the manually written ones. Up next: auto-genning a bunch of the existing tests.	2019-11-21 10:24:19 -08:00
Philip Reames	f1a9a83232	[SCEV] Be robust against IR generated by simple-loop-unswitch Simple loop unswitch likes to leave around unsimplified and/or/xors. SCEV today bails out on these idioms which is unfortunate in general, and specifically for the unswitch interaction. Differential Revision: https://reviews.llvm.org/D70459	2019-11-21 09:53:43 -08:00
Philip Reames	3a8104a9ea	Precommit test showing opportunity when computing exit tests of unsimplified IR If we partially unswitch a loop, we leave around the (and i1 X, true) or (or i1 X, false) forms. At the moment, this inhibits SCEV's ability to compute trip counts, patch forthcoming.	2019-11-19 13:12:03 -08:00
Philip Reames	1d509201e2	[SCEV] Simplify umin/max of zext and sext of the same value This is a common idiom which arises after induction variables are widened, and we have two or more exit conditions. Interestingly, we don't have instcombine or instsimplify support for this either. Differential Revision: https://reviews.llvm.org/D69006 llvm-svn: 375349	2019-10-19 17:23:02 +00:00
Philip Reames	3266eac714	[Test] Precommit test for D69006 llvm-svn: 375190	2019-10-17 23:32:35 +00:00
Philip Reames	a40162d475	[Tests] Add a SCEV analysis test for llvm.widenable.condition Mostly because we don't appear to have one and a prototype patch I just saw would have broken the example committed. llvm-svn: 374835	2019-10-14 22:42:35 +00:00
Tim Northover	58e8c793d0	Revert "[SCEV] add no wrap flag for SCEVAddExpr." This reverts r366419 because the analysis performed is within the context of the loop and it's only valid to add wrapping flags to "global" expressions if they're always correct. llvm-svn: 373184	2019-09-30 07:46:52 +00:00
Shoaib Meenai	d89f2d872d	[Analysis] Allow -scalar-evolution-max-iterations more than once At present, `-scalar-evolution-max-iterations` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same value) which results in an error, after which compilation fails: ``` clang (LLVM option parsing): for the -scalar-evolution-max-iterations option: may only occur zero or one times! ``` It seems reasonable to allow -scalar-evolution-max-iterations to be passed more than once. Quoting the [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | documentation ]]: > The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times. > ... > If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained. Original patch by: Enrico Bern Hardy Tanuwidjaja <etanuwid@fb.com> Differential Revision: https://reviews.llvm.org/D67512 llvm-svn: 372346	2019-09-19 18:21:32 +00:00
Philip Reames	bdf608477e	[SCEV] Add smin support to getRangeRef We were failing to compute trip counts (both exact and maximum) for any loop which involved a comparison against either an umin or smin. It looks like this simply got missed when we added smin/umin to SCEV. (Note: umin was submitted separately earlier today. Turned out two folks hit this at the same time.) Differential Revision: https://reviews.llvm.org/D67514 llvm-svn: 371776	2019-09-12 21:32:27 +00:00
Florian Hahn	a31ee37624	[SCEV] Support SCEVUMinExpr in getRangeRef. This patch adds support for SCEVUMinExpr to getRangeRef, similar to the support for SCEVUMaxExpr. Reviewers: sanjoy.google, efriedma, reames, nikic Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D67177 llvm-svn: 371768	2019-09-12 20:03:32 +00:00
Philip Reames	a3d2737520	Precommit tests for D67514 llvm-svn: 371762	2019-09-12 19:34:27 +00:00
Chen Zheng	c38e3efe27	[SCEV] add no wrap flag for SCEVAddExpr. Differential Revision: https://reviews.llvm.org/D64868 llvm-svn: 366419	2019-07-18 09:23:19 +00:00
Chen Zheng	627095ec5b	[SCEV] teach SCEV symbolical execution about overflow intrinsics folding. Differential Revision: https://reviews.llvm.org/D64422 llvm-svn: 365726	2019-07-11 02:18:22 +00:00
Philip Reames	1cf9e72cbc	Update -analyze -scalar-evolution output for multiple exit loops w/computable exit values The previous output was next to useless if any exit was not computable. If we have more than one exit, show the exit count for each so that it's easier to see what's going on with SCEV analysis when debugging. llvm-svn: 364579	2019-06-27 19:22:43 +00:00
Florian Hahn	4c11b5268c	[LoopUnroll] Add support for loops with exiting headers and uncond latches. This patch generalizes the UnrollLoop utility to support loops that exit from the header instead of the latch. Usually, LoopRotate would take care of most of those cases, but in some cases (e.g. -Oz), LoopRotate does not kick in. Codesize impact looks relatively neutral on ARM64 with -Oz + LTO. Program master patch diff External/S.../CFP2006/447.dealII/447.dealII 629060.00 627676.00 -0.2% External/SPEC/CINT2000/176.gcc/176.gcc 1245916.00 1244932.00 -0.1% MultiSourc...Prolangs-C/simulator/simulator 86100.00 86156.00 0.1% MultiSourc...arks/Rodinia/backprop/backprop 66212.00 66252.00 0.1% MultiSourc...chmarks/Prolangs-C++/life/life 67276.00 67312.00 0.1% MultiSourc...s/Prolangs-C/compiler/compiler 69824.00 69788.00 -0.1% MultiSourc...Prolangs-C/assembler/assembler 86672.00 86696.00 0.0% Reviewers: efriedma, vsk, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D61962 llvm-svn: 364398	2019-06-26 09:16:57 +00:00
Nikita Popov	8550fb386a	[SCEV] Use unsigned/signed intersection type in SCEV Based on D59959, this switches SCEV to use unsigned/signed range intersection based on the sign hint. This will prefer non-wrapping ranges in the relevant domain. I've left the one intersection in getRangeForAffineAR() to use the smallest intersection heuristic, as there doesn't seem to be any obvious preference there. Differential Revision: https://reviews.llvm.org/D60035 llvm-svn: 363490	2019-06-15 09:15:52 +00:00
Keno Fischer	a1a4adf4b9	[SCEV] Add explicit representations of umin/smin Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express it as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159	2019-05-07 15:28:47 +00:00
Sanjoy Das	32fd32bc6f	[SCEV] Check the cache in get{S|U}MaxExpr before doing any work Summary: This lets us avoid e.g. checking if A >=s B in getSMaxExpr(A, B) if we've already established that (A smax B) is the best we can do. Fixes PR41225. Reviewers: asbirlea Subscribers: mcrosier, jlebar, bixia, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60010 llvm-svn: 357320	2019-03-29 22:00:12 +00:00
Teresa Johnson	4ab0a9f0a4	[SCEV] Use depth limit for trunc analysis Summary: This fixes an extremely long compile time caused by recursive analysis of truncs, which were not previously subject to any depth limits unlike some of the other ops. I decided to use the same control used for sext/zext, since the routines analyzing these are sometimes mutually recursive with the trunc analysis. Reviewers: mkazantsev, sanjoy Subscribers: sanjoy, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58994 llvm-svn: 355949	2019-03-12 18:28:05 +00:00
Florian Hahn	98f11a7d75	[SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR. In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259	2019-03-02 02:31:44 +00:00
Dmitri Gribenko	751c5fbf6a	Fixed typos in tests: s/CEHCK/CHECK/ Reviewers: ilya-biryukov Subscribers: sanjoy, sdardis, javed.absar, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58608 llvm-svn: 354781	2019-02-25 13:12:33 +00:00
Max Kazantsev	437ee05885	[SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we have a huge amount of such code, we will be littering SE with these nodes. We could just state that they all are undef and save some memory. Differential Revision: https://reviews.llvm.org/D57567 Reviewed By: sanjoy llvm-svn: 353017	2019-02-04 05:04:19 +00:00
Max Kazantsev	b37419ef66	[SCEV] Prohibit SCEV transformations for huge SCEVs Currently SCEV attempts to limit transformations so that they do not work with big SCEVs (that may take almost infinite compile time). But for this, it uses heuristics such as recursion depth and number of operands, which do not give us a guarantee that we don't actually have big SCEVs. This situation is still possible, though it is not likely to happen. However, the bug PR33494 showed a bunch of simple corner case tests where we still produce huge SCEVs, even not reaching big recursion depth etc. This patch introduces a concept of 'huge' SCEVs. A SCEV is huge if its expression size (introduced in D35989) exceeds some threshold value. We prohibit optimizing transformations if any of SCEVs we are dealing with is huge. This gives us a reliable check that we don't spend too much time working with them. As the next step, we can possibly get rid of old limiting mechanisms, such as recursion depth thresholds. Differential Revision: https://reviews.llvm.org/D35990 Reviewed By: reames llvm-svn: 352728	2019-01-31 06:19:25 +00:00
Max Kazantsev	468ad52213	[SCEV] Take correct loop in AddRec simplification. PR40420 The code of AddRec simplification is using wrong loop when it creates a new AddRecExpr. It should be using AddRecLoop which we have saved and against which all gate checks are made, and not calling AddRec->getLoop() over and over again because AddRec may change and become an AddRecurrency from outer loop during the transform iterations. Considering this change trivial, committing for postcommit review. llvm-svn: 352451	2019-01-29 05:37:59 +00:00
Max Kazantsev	d4de606ddb	[NFC] Merge failing test from PR40420 llvm-svn: 352450	2019-01-29 05:12:40 +00:00
Michal Gorny	014a6f930a	[test] Fix ScalarEvolution test to allow __func__ with prototype Fix ScalarEvolution/solve-quadratic.ll test to account for __func__ output listing the complete function prototype rather than just its name, as it does on NetBSD. Example Linux output: GetQuadraticEquation: addrec coeff bw: 4 GetQuadraticEquation: equation -2x^2 + -2x + -4, coeff bw: 5, multiplied by 2 Example NetBSD output: llvm::Optional<std::tuple<llvm::APInt, llvm::APInt, llvm::APInt, llvm::APInt, unsigned int> > GetQuadraticEquation(const llvm::SCEVAddRecExpr): addrec coeff bw: 4 llvm::Optional<std::tuple<llvm::APInt, llvm::APInt, llvm::APInt, llvm::APInt, unsigned int> > GetQuadraticEquation(const llvm::SCEVAddRecExpr): equation -2x^2 + -2x + -4, coeff bw: 5, multiplied by 2 Differential Revision: https://reviews.llvm.org/D55162 llvm-svn: 348096	2018-12-02 16:49:28 +00:00
Max Kazantsev	266c087b9d	Return "[IndVars] Smart hard uses detection" The patch has been reverted because it ended up prohibiting propagation of a constant to exit value. For such values, we should skip all checks related to hard uses because propagating a constant is always profitable. Differential Revision: https://reviews.llvm.org/D53691 llvm-svn: 346397	2018-11-08 11:54:35 +00:00
Max Kazantsev	e059f4452b	Revert "[IndVars] Smart hard uses detection" This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402. It seems that the check that we still should do the transform if we know the result is constant is missing in this code. So the logic that has been deleted by this change is still sometimes accidentally useful. I revert the change to see what can be done about it. The motivating case is the following: @Y = global [400 x i16] zeroinitializer, align 1 define i16 @foo() { entry: br label %for.body for.body: ; preds = %entry, %for.body %i = phi i16 [ 0, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i store i16 0, i16* %arrayidx, align 1 %inc = add nuw nsw i16 %i, 1 %cmp = icmp ult i16 %inc, 400 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body %inc.lcssa = phi i16 [ %inc, %for.body ] ret i16 %inc.lcssa } We should be able to figure out that the result is constant, but the patch breaks it. Differential Revision: https://reviews.llvm.org/D51584 llvm-svn: 346198	2018-11-06 02:02:05 +00:00
Max Kazantsev	3d347bf545	[IndVars] Smart hard uses detection When rewriting loop exit values, IndVars considers this transform not profitable if the loop instruction has a loop user which it believes cannot be optimized away. In the current implementation only calls that immediately use the instruction are considered as such. This patch extends the definition of "hard" users to any side-effecting instructions (which usually cannot be optimized away from the loop) and also allows handling of not just immediate users, but use chains. Differential Revision: https://reviews.llvm.org/D51584 Reviewed By: etherzhhb llvm-svn: 345814	2018-11-01 06:47:01 +00:00
Max Kazantsev	e0a2613aea	[SCEV] Avoid redundant computations when doing AddRec merge When we calculate a product of 2 AddRecs, we end up making quite massive computations to deduce the operands of resulting AddRec. This process can be optimized by computing all args of intermediate sum and then calling `getAddExpr` once rather than calling `getAddExpr` with intermediate result every time a new argument is computed. Differential Revision: https://reviews.llvm.org/D53189 Reviewed By: rtereshin llvm-svn: 345813	2018-11-01 06:18:27 +00:00
Max Kazantsev	fdfd98ceec	[SCEV] Limit AddRec "simplifications" to avoid combinatorial explosions SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into a single AddRec of size `2n+1` with complex combinatorial coefficients can easily trigger exponential growth of the SCEV (in case if nothing gets folded and simplified). We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`, but its default value seems to be insufficiently small: the test attached to this patch with default value of this option `16` has a SCEV of >3M symbols (when printed out). This patch reduces the simplification limit. It is not a cure to combinatorial explosions, but at least it reduces this corner case to something more or less reasonable. Differential Revision: https://reviews.llvm.org/D53282 Reviewed By: sanjoy llvm-svn: 344584	2018-10-16 05:26:21 +00:00
Krzysztof Parzyszek	90f3249ce2	[SCEV] Properly solve quadratic equations Differential Revision: https://reviews.llvm.org/D48283 llvm-svn: 338758	2018-08-02 19:13:35 +00:00
Roman Tereshin	1ba1f9310c	[SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw><nsw> transform if the top level addition in (D + (C-D + x + ...)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x + ...), ensuring homogeneous behaviour of the transformation and better canonicalization of such expressions. This enables better canonicalization of expressions like 1 + zext(5 + 20 * %x + 24 * %y) and zext(6 + 20 * %x + 24 * %y) which get both transformed to 2 + zext(4 + 20 * %x + 24 * %y) This pattern is common in address arithmetics and the transformation makes it easier for passes like LoadStoreVectorizer to prove that 2 or more memory accesses are consecutive and optimize (vectorize) them. Reviewed By: mzolotukhin Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337859	2018-07-24 21:48:56 +00:00
Max Kazantsev	d41faecc49	[SCEV] Fix buggy behavior in getAddExpr with truncs SCEV tries to constant-fold arguments of trunc operands in SCEVAddExpr, and when it does that, it passes wrong flags into the recursion. It is only valid to pass flags that are proved for narrow type into a computation in wider type if we can prove that trunc instruction doesn't actually change the value. If it did lose some meaningful bits, we may end up proving wrong no-wrap flags for sum of arguments of trunc. In the provided test we end up with `nuw` where it shouldn't be because of this bug. The solution is to conservatively pass `SCEV::FlagAnyWrap` which is always a valid thing to do. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D49471 llvm-svn: 337435	2018-07-19 01:46:21 +00:00
Max Kazantsev	6b12506200	[NFC] Make a test more neat llvm-svn: 337379	2018-07-18 11:03:40 +00:00
Tim Shen	a064622bd3	Re-apply "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)." llvm-svn: 337075	2018-07-13 23:58:46 +00:00
Tim Shen	2ed501d656	Revert "[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428)." This reverts commit r336140. Our tests shows that LSR assert fails with it. llvm-svn: 336473	2018-07-06 23:20:35 +00:00
Tim Shen	c7cef4bcc4	[SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428). Summary: Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change SCEV is able to prove that the loop doesn't wrap-self (due to zext i16 to i64), disabling the entire loop versioning pass. Removed the zext and just use i64. Reviewers: sanjoy Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D48409 llvm-svn: 336140	2018-07-02 20:01:54 +00:00
Roman Shirokiy	272eac85c7	Fix overconfident assert in ScalarEvolution::isImpliedViaMerge We can have AddRec with loops having many predecessors. This changes an assert to an early return. Differential Revision: https://reviews.llvm.org/D48766 llvm-svn: 335965	2018-06-29 11:46:30 +00:00
Tim Shen	63f244c4f4	[SCEV] Re-apply r335197 (with Polly fixes). Summary: This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338). I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output. All LLVM files are already reviewed in D48338. Reviewers: jdoerfert, bollu, efriedma Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia Differential Revision: https://reviews.llvm.org/D48453 llvm-svn: 335292	2018-06-21 21:29:54 +00:00
Tim Shen	433b9761ce	Revert "[SCEV] Improve zext(A /u B) and zext(A % B)" This reverts commit r335197, as some bots are not happy. llvm-svn: 335198	2018-06-21 02:15:32 +00:00
Tim Shen	5af61e0a28	[SCEV] Improve zext(A /u B) and zext(A % B) Summary: Try to match udiv and urem patterns, and sink zext down to the leaves. I'm not entirely sure why some unrelated tests change, but the added <nsw>s seem right. Reviewers: sanjoy Subscribers: jlebar, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D48338 llvm-svn: 335197	2018-06-21 01:49:07 +00:00
Roman Lebedev	42a1ff11fb	[NFC][SCEV] Add tests related to bit masking (PR37793) Summary: Related to https://bugs.llvm.org/show_bug.cgi?id=37793, https://reviews.llvm.org/D46760#1127287 We'd like to do this canonicalization https://rise4fun.com/Alive/Gmc But it is currently restricted by rL155136 / rL155362, which says: ``` // This is a constant shift of a constant shift. Be careful about hiding // shl instructions behind bit masks. They are used to represent multiplies // by a constant, and it is important that simple arithmetic expressions // are still recognizable by scalar evolution. // // The transforms applied to shl are very similar to the transforms applied // to mul by constant. We can be more aggressive about optimizing right // shifts. // // Combinations of right and left shifts will still be optimized in // DAGCombine where scalar evolution no longer applies. ``` I think these tests show that for constants, SCEV has no issues with that canonicalization. Reviewers: mkazantsev, spatel, efriedma, sanjoy Reviewed By: mkazantsev Subscribers: sanjoy, javed.absar, llvm-commits, stoklund, bixia Differential Revision: https://reviews.llvm.org/D48229 llvm-svn: 335101	2018-06-20 07:54:11 +00:00
Sanjoy Das	6e9b355cc9	Revert "[SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags" This reverts r334428. It incorrectly marks some multiplications as nuw. Tim Shen is working on a proper fix. Original commit message: [SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags where safe. Summary: Previously we would add them for adds, but not multiplies. llvm-svn: 335016	2018-06-19 04:09:44 +00:00
Justin Lebar	fe455464eb	[SCEV] Simplify zext/trunc idiom that appears when handling bitmasks. Summary: Specifically, we transform zext(2^K * (trunc X to iN)) to iM -> 2^K * (zext(trunc X to i{N-K}) to iM)<nuw> This is helpful because pulling the 2^K out of the zext allows further optimizations. Reviewers: sanjoy Subscribers: hiraditya, llvm-commits, timshen Differential Revision: https://reviews.llvm.org/D48158 llvm-svn: 334737	2018-06-14 17:13:48 +00:00
Justin Lebar	b326904dba	[SCEV] Simplify trunc-of-add/mul to add/mul-of-trunc under more circumstances. Summary: Previously we would do this simplification only if it did not introduce any new truncs (excepting new truncs which replace other cast ops). This change weakens this condition: If the number of truncs stays the same, but we're able to transform trunc(X + Y) to X + trunc(Y), that's still simpler, and it may open up additional transformations. While we're here, also clean up some duplicated code. Reviewers: sanjoy Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48160 llvm-svn: 334736	2018-06-14 17:13:35 +00:00
Max Kazantsev	0ed79620c6	[SimplifyIndVars] Ignore dead users IndVarSimplify sometimes makes transforms based on users that are trivially dead. In particular, if DCE wasn't run before it, there may be a dead `sext/zext` in the loop that will trigger widening transforms, even though it makes no sense to do so. This patch teaches IndVarSimplify to ignore the most trivial cases of that. Differential Revision: https://reviews.llvm.org/D47974 Reviewed By: sanjoy llvm-svn: 334567	2018-06-13 02:25:32 +00:00
Tim Shen	df2d6652c1	Fix incorrect CHECK-LABEL llvm-svn: 334434	2018-06-11 19:56:12 +00:00
Justin Lebar	4da41c13a5	[SCEV] Add transform zext((A * B * ...)<nuw>) --> (zext(A) * zext(B) * ...)<nuw>. Reviewers: sanjoy Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48041 llvm-svn: 334429	2018-06-11 18:57:58 +00:00
Justin Lebar	aa4fec94d8	[SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags where safe. Summary: Previously we would add them for adds, but not multiplies. Reviewers: sanjoy Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D48038 llvm-svn: 334428	2018-06-11 18:57:42 +00:00
Tim Shen	cc63761720	[SCEV] Canonicalize "A /u C1 /u C2" to "A /u (C1C2)". Summary: FWIW InstCombine already folds this. Also avoid the case where C1C2 overflows. Reviewers: sunfish, sanjoy Subscribers: hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D47965 llvm-svn: 334425	2018-06-11 18:44:58 +00:00
Krzysztof Parzyszek	b10ea39270	[SCEV] Look through zero-extends in howFarToZero An expression like (zext i2 {(trunc i32 (1 + %B) to i2),+,1}<%while.body> to i32) will become zero exactly when the nested value becomes zero in its type. Strip injective operations from the input value in howFarToZero to make the value simpler. Differential Revision: https://reviews.llvm.org/D47951 llvm-svn: 334318	2018-06-08 20:43:07 +00:00
Shiva Chen	2c864551df	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. In order to set breakpoints on labels and list source code around labels, we need collect debug information for labels, i.e., label name, the function label belong, line number in the file, and the address label located. In order to keep these information in LLVM IR and to allow backend to generate debug information correctly. We create a new kind of metadata for labels, DILabel. The format of DILabel is !DILabel(scope: !1, name: "foo", file: !2, line: 3) We hope to keep debug information as much as possible even the code is optimized. So, we create a new kind of intrinsic for label metadata to avoid the metadata is eliminated with basic block. The intrinsic will keep existing if we keep it from optimized out. The format of the intrinsic is llvm.dbg.label(metadata !1) It has only one argument, that is the DILabel metadata. The intrinsic will follow the label immediately. Backend could get the label metadata through the intrinsic's parameter. We also create DIBuilder API for labels to be used by Frontend. Frontend could use createLabel() to allocate DILabel objects, and use insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR. Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. llvm-svn: 331841	2018-05-09 02:40:45 +00:00
Max Kazantsev	58fce7e54b	Re-enable "[SCEV] Make computeExitLimit more simple and more powerful" This patch was temporarily reverted because it has exposed bug 37229 on PowerPC platform. The bug is unrelated to the patch and was just a general bug in the optimization done for PowerPC platform only. The bug was fixed by the patch rL331410. This patch returns the disabled commit since the bug was fixed. llvm-svn: 331427	2018-05-03 02:37:55 +00:00
Max Kazantsev	2c287ec9c5	Revert "[SCEV] Make computeExitLimit more simple and more powerful" This reverts commit 023c8be90980e0180766196cba86f81608b35d38. This patch triggers a miscompile of zlib on the PowerPC platform. Most likely it is caused by some pre-backend PPC-specific pass, but we don't clearly know the reason yet. So we temporarily revert this patch with the intention of returning it once the problem is resolved. See bug 37229 for details. llvm-svn: 330893	2018-04-26 02:07:40 +00:00
Max Kazantsev	c01e47b43f	[SCEV] Make computeExitLimit more simple and more powerful Current implementation of `computeExitLimit` has a big piece of code the only purpose of which is to prove that after the execution of this block the latch will be executed. What it currently checks is actually a subset of situations where the exiting block dominates latch. This patch replaces all these checks for simple particular cases with domination check over loop's latch which is the only necessary condition of taking the exiting block into consideration. This change allows to calculate exact loop taken count for simple loops like for (int i = 0; i < 100; i++) { if (cond) {...} else {...} if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44677 Reviewed By: efriedma llvm-svn: 329047	2018-04-03 05:57:19 +00:00
Max Kazantsev	7094c8deb2	[SCEV] Make exact taken count calculation more optimistic Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count. This patch makes the calculation more optimistic with enforcing our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like for (int i = 0; i < 100; i++) { if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44676 Reviewed By: fhahn llvm-svn: 328611	2018-03-27 07:30:38 +00:00
Serguei Katkov	529f42331e	[SCEV] Re-land: Fix isKnownPredicate This is a re-land of https://reviews.llvm.org/rL327362 with a fix and regression test. The crash occurred because, for the found MDL loop, LHS or RHS may contain an invariant unknown SCEV which does not dominate the MDL. Please see the regression test for an example. Reviewers: sanjoy, mkazantsev, reames Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44553 llvm-svn: 327822	2018-03-19 06:35:30 +00:00
Max Kazantsev	f8d2969abb	[SCEV] Smart range calculation for SCEVUnknown Phis The range of SCEVUnknown Phi which merges values `X1, X2, ..., XN` can be evaluated as `U(Range(X1), Range(X2), ..., Range(XN))`. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D43810 llvm-svn: 326418	2018-03-01 06:56:48 +00:00
Max Kazantsev	db3a9e0cfe	[SCEV] Make getPostIncExpr guaranteed to return AddRec The current implementation of `getPostIncExpr` invokes `getAddExpr` for two recurrences and expects that it always returns a recurrence. But this is not guaranteed to happen if we have reached max recursion depth or refused to make SCEV simplification for other reasons. This patch changes its implementation so that now it always returns SCEVAddRec without relying on `getAddExpr`. Differential Revision: https://reviews.llvm.org/D42953 llvm-svn: 324866	2018-02-12 05:09:38 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. 
s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])$(.), i32, i1$~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])$~call void 
@llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. 
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Serguei Katkov	edf3c8292b	[SCEV] Do not insert if it is already in cache This is fix for the crash caused by ScalarEvolution::getTruncateExpr. It expects that if it checked the condition that SCEV is not in UniqueSCEVs cache in the beginning that it will not be there inside this method. However during recursion and transformation/simplification for sub expression, it is possible that these modifications will end up with the same SCEV as we started from. So we must always check whether SCEV is in cache and do not insert item if it is already there. Reviewers: sanjoy, mkazantsev, craig.topper Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41380 llvm-svn: 321472	2017-12-27 07:15:23 +00:00
Max Kazantsev	9c08b7a053	[SCEV] Fix predicate usage in computeExitLimitFromICmp In this method, we invoke `SimplifyICmpOperands` which takes the `Cond` predicate by reference and may change it along with `LHS` and `RHS` SCEVs. But then we invoke `computeShiftCompareExitLimit` with Values from which the SCEVs have been derived, these Values have not been modified while `Cond` could be. One of possible outcomes of this is that we may falsely prove that an infinite loop ends within some finite number of iterations. In this patch, we save the original `Cond` and pass it along with original operands. This logic may be removed in future once `computeShiftCompareExitLimit` works with SCEVs instead of value operands. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D40953 llvm-svn: 320142	2017-12-08 12:19:45 +00:00
Max Kazantsev	23044fa639	[SCEV] Strengthen variance condition in calculateLoopDisposition Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively. When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if `L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are consecutive sibling loops within the parent loop. In this case, `AR2` is also varying w.r.t. `L1`, but we don't correctly identify it. It can lead, for example, to an attempt at incorrect folding. Consider: AR1 = {a,+,b}<L1> AR2 = {c,+,d}<L2> EXAR2 = sext(AR1) MUL = mul AR1, EXAR2 If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect because `AR2` is not available on entrance of `L1`. Both situations "`L1` contains `L2`" and "`L1` precedes sibling loop `L2`" can be handled with one check: "header of `L1` dominates header of `L2`". This patch replaces the old insufficient check with this one. Differential Revision: https://reviews.llvm.org/D39453 llvm-svn: 318819	2017-11-22 06:21:39 +00:00
Jatin Bhateja	c61ade1ca0	[SCEV] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: sanjoy, junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 318050	2017-11-13 16:43:24 +00:00
Sanjoy Das	8499ebf2e9	[SCEV] Fix an assertion failure in the max backedge taken count Max backedge taken count is always expected to be a constant; and this is usually true by construction -- it is a SCEV expression with constant inputs. However, if the max backedge expression ends up being computed to be a udiv with a constant zero denominator[0], SCEV does not fold the result to a constant since there is no constant it can fold it to (SCEV has no representation for "infinity" or "undef"). However, in computeMaxBECountForLT we already know the denominator is positive, and thus at least 1; and we can use this fact to avoid dividing by zero. [0]: We can end up with a constant zero denominator if the signed range of the stride is more precise than the unsigned range. llvm-svn: 316615	2017-10-25 21:41:00 +00:00
Sanjoy Das	2f27456c82	Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain." This reverts commit r316054. There was some confusion over the review process: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html llvm-svn: 316129	2017-10-18 22:00:57 +00:00
Jatin Bhateja	1fc49627e4	[ScalarEvolution] Handling for ICmp occuring in the evolution chain. Summary: If a compare instruction is same or inverse of the compare in the branch of the loop latch, then return a constant evolution node. Currently scope of evaluation is limited to SCEV computation for PHI nodes. This shall facilitate computations of loop exit counts in cases where compare appears in the evolution chain of induction variables. Will fix PR 34538 Reviewers: sanjoy, hfinkel, junryoungju Reviewed By: junryoungju Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38494 llvm-svn: 316054	2017-10-18 01:36:16 +00:00
Anna Thomas	a2ca902033	[SCEV] Teach SCEV to find maxBECount when loop endbound is variant Summary: This patch teaches SCEV to calculate the maxBECount when the end bound of the loop can vary. Note that we cannot calculate the exactBECount. This will only be done when both conditions are satisfied: 1. the loop termination condition is strictly LT. 2. the IV is proven to not overflow. This provides more information to users of SCEV and can be used to improve identification of finite loops. Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38825 llvm-svn: 315683	2017-10-13 14:30:43 +00:00
Alexandre Isoard	405728fd47	[SCEV] Add URem support to SCEV In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %s ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %t to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcoming of SCEVs: %div = udiv i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. llvm-svn: 312329	2017-09-01 14:59:59 +00:00
Amara Emerson	56dca4e3ca	[SCEV] Preserve NSW information for sext(subtract). Pushes the sext onto the operands of a Sub if NSW is present. Also adds support for propagating the nowrap flags of the llvm.ssub.with.overflow intrinsic during analysis. Differential Revision: https://reviews.llvm.org/D35256 llvm-svn: 310117	2017-08-04 20:19:46 +00:00
Max Kazantsev	2cb3653404	[SCEV] Re-enable "Cache results of computeExitLimit" The patch rL309080 was reverted because it did not clean up the cache on "forgetValue" method call. This patch re-enables this change, adds the missing check and introduces two new unit tests that make sure that the cache is cleaned properly. Differential Revision: https://reviews.llvm.org/D36087 llvm-svn: 309925	2017-08-03 08:41:30 +00:00
Sanjoy Das	843ab57457	Revert "[SCEV] Cache results of computeExitLimit" This reverts commit r309080. The patch needs to clear out the ScalarEvolution::ExitLimits cache in forgetMemoizedResults. I've replied on the commit thread for the patch with more details. llvm-svn: 309357	2017-07-28 03:25:07 +00:00
Max Kazantsev	f282aed428	[SCEV] Cache results of computeExitLimit This patch adds a cache for computeExitLimit to save compilation time. A lot of examples of tests that take extensive time to compile are attached to the bug 33494. Differential Revision: https://reviews.llvm.org/D35827 llvm-svn: 309080	2017-07-26 04:55:54 +00:00
Max Kazantsev	0e9e0796f4	[SCEV] Limit max size of AddRecExpr during evolving When SCEV calculates the product of two SCEVAddRecs from the same loop, it tries to combine them into one big AddRecExpr. If the sizes of the initial SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every operand of the resulting SCEV is combined from operands of the initial SCEVs and has much higher complexity than they have. As a result, if we try to calculate something like: %x1 = {a,+,b} %x2 = mul i32 %x1, %x1 %x3 = mul i32 %x2, %x1 %x4 = mul i32 %x3, %x2 ... The size of such SCEVs grows as `2^N`, and the arguments become more and more complex as we go forth. This leads to long compilation and huge memory consumption. This patch sets a limit after which we don't try to combine two `SCEVAddRecExpr`s into one. By default, max allowed size of the resulting AddRecExpr is set to 16. Differential Revision: https://reviews.llvm.org/D35664 llvm-svn: 308847	2017-07-23 15:40:19 +00:00
Max Kazantsev	b9edcbcb1d	Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differential Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477	2017-07-08 17:17:30 +00:00
Max Kazantsev	98838527c6	Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why the SaturatingMultiply test fails. llvm-svn: 307249	2017-07-06 10:47:13 +00:00
Max Kazantsev	c8db20b78c	Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244	2017-07-06 09:57:41 +00:00
Max Kazantsev	ebe56283bc	Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135	2017-07-05 09:44:41 +00:00
Max Kazantsev	80bc4a5554	[IndVars] Canonicalize comparisons between non-negative values and indvars -If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126	2017-07-05 06:38:49 +00:00
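The claim behind the canonicalization above (signed and unsigned comparisons agree when both operands are non-negative) can be checked with a short sketch. The helper names `icmpSlt`, `icmpUlt`, and `checkNonNegativePairs` are illustrative assumptions, not LLVM functions; they mimic the two `icmp` predicates on 32-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two IR predicates on i32 values.
inline bool icmpSlt(int32_t a, int32_t b) { return a < b; }
inline bool icmpUlt(int32_t a, int32_t b) {
  // Reinterpret both operands as unsigned, as `icmp ult` does.
  return static_cast<uint32_t>(a) < static_cast<uint32_t>(b);
}

// Exhaustively checks that both predicates agree for all pairs of
// non-negative values below `limit`.
inline void checkNonNegativePairs(int32_t limit) {
  for (int32_t iv = 0; iv < limit; ++iv)
    for (int32_t b = 0; b < limit; ++b)
      assert(icmpSlt(iv, b) == icmpUlt(iv, b));
}
```

The non-negativity requirement matters: for `iv = -1` and `b = 1` the signed comparison is true while the unsigned one is false, so the rewrite would be invalid there.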
Max Kazantsev	8d0322e612	[SCEV] Use depth limit instead of local cache for SExt and ZExt In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for these methods and add the recursion depth limit instead, as we do for arithmetic. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785	2017-06-30 05:04:09 +00:00
Alexandre Isoard	41044876fc	Reverting r306695 while investigating failing test case. Failing test case: Transforms/LoopVectorize.iv_outside_user.ll llvm-svn: 306723	2017-06-29 18:48:56 +00:00
Alexandre Isoard	aa29afc756	ScalarEvolution: Add URem support In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to: %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %s ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort this way. Note: While SRem and SDiv are also related this way, SCEV does not provide SDiv yet. llvm-svn: 306695	2017-06-29 16:29:04 +00:00
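The identity used by the URem commit above, `t % b = t - (t / b) * b`, can be sketched in plain C++. `uremViaUDiv` is a hypothetical helper for illustration only, not an LLVM API; the comments map each step to the IR sequence from the message, with the subtraction taking the product as its second operand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): computes t % b using only unsigned
// division, multiplication and subtraction.
inline uint32_t uremViaUDiv(uint32_t t, uint32_t b) {
  assert(b != 0 && "urem by zero is undefined");
  uint32_t q = t / b; // %q = udiv %t, %b
  uint32_t s = q * b; // %s = mul nuw %q, %b   (cannot wrap: q * b <= t)
  return t - s;       // %r = sub nuw %t, %s   (cannot wrap: s <= t)
}
```

For instance, `uremViaUDiv(17, 5)` yields 2, since 17 / 5 is 3 and 17 - 15 is 2.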
Max Kazantsev	dc80366d52	[ScalarEvolution] Apply Depth limit to getMulExpr This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As a result, a chain getAddExpr -> ... -> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463	2017-06-15 11:48:21 +00:00
Max Kazantsev	41450329f7	Re-enable "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" The patch rL303730 was reverted because test lsr-expand-quadratic.ll failed on many non-X86 configs with this patch. The reason for this is that the patch makes a correctness fix that changes optimizer's behavior for this test. Without the change, LSR was making an overconfident simplification based on a wrong SCEV. Apparently it did not need the IV analysis to do this. With the change, it chose a different way to simplify (that wasn't so confident), and this way required the IV analysis. Now, following the right execution path, LSR tries to make a transformation relying on IV Users analysis. This analysis is target-dependent due to this code: // LSR is not APInt clean, do not touch integers bigger than 64-bits. // Also avoid creating IVs of non-native types. For example, we don't want a // 64-bit IV in 32-bit code just because the loop has one 64-bit cast. uint64_t Width = SE->getTypeSizeInBits(I->getType()); if (Width > 64 || !DL.isLegalInteger(Width)) return false; To make a proper transformation in this test case, the type i32 needs to be legal for the specified data layout. When the test runs on some non-X86 configuration (e.g. pure ARM 64), opt gets confused by the specified target and does not use it, rejecting the specified data layout as well. Instead, it uses some default layout that does not treat i32 as a legal type (currently the layout that is used when it is not specified does not have legal types at all). As a result, the transformation we expect to happen does not happen for this test. This re-enabling patch does not have any source code changes compared to the original patch rL303730. The only difference is that the failing test is moved to X86 directory and now has requirement of running on x86 only to comply with the specified target triple and data layout. Differential Revision: https://reviews.llvm.org/D33543 llvm-svn: 303971	2017-05-26 06:47:04 +00:00
Diana Picus	183863fc3b	Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start" This reverts commit r303730 because it broke all the buildbots. llvm-svn: 303747	2017-05-24 14:16:04 +00:00
Max Kazantsev	13e016bf48	[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that the loop of our base recurrence is the bottom-most in terms of domination. This assumption may be broken by an expression which is treated as invariant, and which depends on a complex Phi for which SCEVUnknown was created. If such Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence. Another reason why it might be invalid to fold SCEVUnknown into Phi start value is that unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here: for (...) { // loop phi = {A,+,B} } X = load ... Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot exist while we are iterating in loop (this memory may not even be allocated and filled by this moment). It is only valid to make such folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop> may exist. This patch prohibits folding of SCEVUnknown (and those who use them) into the start value of an AddRecExpr, if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer expect such behavior. llvm-svn: 303730	2017-05-24 08:52:18 +00:00
Sanjoy Das	036dda25a5	[SCEV] Clarify behavior around max backedge taken count This is a re-application of a r303497 that was reverted in r303498. I thought it had broken a bot when it had not (the breakage did not go away with the revert). This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303531	2017-05-22 06:46:04 +00:00
Sanjoy Das	8963650cfa	Revert "[SCEV] Clarify behavior around max backedge taken count" This reverts commit r303497 since it breaks the msan bootstrap bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/ llvm-svn: 303498	2017-05-21 05:02:12 +00:00
Sanjoy Das	5207168383	[SCEV] Clarify behavior around max backedge taken count This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303497	2017-05-21 01:47:50 +00:00
Max Kazantsev	b09b5db793	[SCEV] Fix sorting order for AddRecExprs The existing sorting order defined in CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in operand list of S the op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrence in the operands order, it will be the bottom-most in terms of domination tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with the old logic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148	2017-05-16 07:27:06 +00:00
Michael Zolotukhin	37162adf3e	[SCEV] createAddRecFromPHI: Optimize for the most common case. Summary: The existing implementation creates a symbolic SCEV expression every time we analyze a phi node and then has to remove it, when the analysis is finished. This is very expensive, and in most of the cases it's also unnecessary. According to the data I collected, ~60-70% of analyzed phi nodes (measured on SPEC) have the following form: PN = phi(Start, OP(Self, Constant)) Handling such cases separately significantly speeds this up. Reviewers: sanjoy, pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32663 llvm-svn: 302096	2017-05-03 23:53:38 +00:00
Sanjoy Das	08989c7ecd	Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC Summary: programUndefinedIfPoison makes more sense, given what the function does; and I'm about to add a function with a name similar to isKnownNotFullPoison (so do the rename to avoid confusion). Reviewers: broune, majnemer, bjarke.roune Reviewed By: broune Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30444 llvm-svn: 301776	2017-04-30 19:41:19 +00:00
Sanjoy Das	bdbc4938f9	[SCEV] Fix exponential time complexity by caching llvm-svn: 301149	2017-04-24 00:09:46 +00:00
Eli Friedman	d0e6ae5678	Revert r300746 (SCEV analysis for or instructions). There have been multiple reports of this causing problems: a compile-time explosion on the LLVM testsuite, and a stack overflow for an opencl kernel. llvm-svn: 300928	2017-04-20 23:59:05 +00:00
Eli Friedman	e77d2b86b4	[SCEV] Make SCEV or modeling more aggressive. Use haveNoCommonBitsSet to figure out whether an "or" instruction is equivalent to addition. This handles more cases than just checking for a constant on the RHS. Differential Revision: https://reviews.llvm.org/D32239 llvm-svn: 300746	2017-04-19 20:19:58 +00:00
Max Kazantsev	2e44d2969a	[ScalarEvolution] Re-enable Predicate implication from operations The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build. The reason of the crash was type mismatch between either a or b and RHS in the following situation: LHS = sext(a +nsw b) > RHS. This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type. But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this situation we don't need to create any non-constant SCEVs. This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not go further into range analysis etc (because in some situations these analyzes succeed even when the passed arguments have wrong types, what should not normally happen). The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong usage of predicates in recursive invocations. The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll Reviewers: reames, apilipenko, anna, sanjoy Reviewed By: sanjoy Subscribers: mzolotukhin, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31238 llvm-svn: 299205	2017-03-31 12:05:30 +00:00
Max Kazantsev	7696a7edf9	Revert "[ScalarEvolution] Re-enable Predicate implication from operations" This reverts commit rL298690 Causes failures on clang. llvm-svn: 298693	2017-03-24 07:04:31 +00:00
Max Kazantsev	89554446e7	[ScalarEvolution] Re-enable Predicate implication from operations The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build. The reason of the crash was type mismatch between either a or b and RHS in the following situation: LHS = sext(a +nsw b) > RHS. This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type. But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this situation we don't need to create any non-constant SCEVs. This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not go further into range analysis etc (because in some situations these analyzes succeed even when the passed arguments have wrong types, what should not normally happen). The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong usage of predicates in recursive invocations. The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll llvm-svn: 298690	2017-03-24 06:19:00 +00:00
Zhaoshi Zheng	e3c9070f06	Model ashr(shl(x, n), m) as mul(x, 2^(n-m)) when n > m Given below case: %y = shl %x, n %z = ashr %y, m when n = m, SCEV models it as sext(trunc(x)). This patch tries to handle the case where n > m by using sext(mul(trunc(x), 2^(n-m)))) as the SCEV expression. llvm-svn: 298631	2017-03-23 18:06:09 +00:00
Max Kazantsev	c6effaa495	Revert "[ScalarEvolution] Predicate implication from operations" This reverts commit rL298481 Fails clang-with-lto-ubuntu build. llvm-svn: 298489	2017-03-22 07:50:33 +00:00
Max Kazantsev	15e76aa0f8	[ScalarEvolution] Predicate implication from operations This patch allows SCEV predicate analysis to prove implication of some expression predicates from context predicates related to arguments of those expressions. It introduces three new rules: For addition: (A > X && B >= 0) || (B >= 0 && A > X) ===> (A + B) > X. For division: (A > X) && (0 < B <= X + 1) ===> (A / B > 0). (A > X) && (-B <= X < 0) ===> (A / B >= 0). Using these rules, SCEV is able to prove facts like "if X > 1 then X / 2 > 0". They can also be combined with the same context, to prove more complex expressions like "if X > 1 then X/2 + 1 > 1". Differential Revision: https://reviews.llvm.org/D30887 Reviewed by: sanjoy llvm-svn: 298481	2017-03-22 04:48:46 +00:00
Eli Friedman	b1578d3612	[SCEV] Fix trip multiple calculation If a loop bound contains calculations like min(a,b), the Scalar Evolution API getSmallConstantTripMultiple returns 4294967295 ("-1") as the trip multiple. The problem is that SCEV uses -1 * umax to represent umin. The multiplicative constant -1 was returned, and the logic guarding against huge trip counts was skipped because -1 has 32 active bits. The fix attempts to factor more general cases. First try to get the greatest power-of-two divisor of the trip count expression. In case overflow happens, the trip count expression is still divisible by the greatest power-of-two divisor returned. Returns 1 if not divisible by 2. Patch by Huihui Zhang <huihuiz@codeaurora.org> Differential Revision: https://reviews.llvm.org/D30840 llvm-svn: 298301	2017-03-20 20:25:46 +00:00
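The "greatest power-of-two divisor" idea in the commit above can be illustrated on plain 32-bit constants. This is a sketch under stated assumptions, not the SCEV implementation: `greatestPowerOfTwoDivisor` is a hypothetical helper that works on a concrete value rather than a symbolic trip count expression.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the LLVM implementation): returns the largest
// power of two that divides a non-zero 32-bit value.
inline uint32_t greatestPowerOfTwoDivisor(uint32_t tripCount) {
  assert(tripCount != 0 && "zero is divisible by every power of two");
  // Two's-complement negation keeps only the lowest set bit under AND.
  return tripCount & (~tripCount + 1u);
}
```

For the problematic constant from the message, 4294967295 (that is, -1 as a 32-bit value), this yields 1, which matches the stated fallback for expressions not divisible by 2. Because arithmetic modulo 2^32 preserves divisibility by powers of two, a divisor found this way remains valid even if the expression wraps.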
Michael Zolotukhin	99de88d1f3	[SCEV] Compute affine range in another way to avoid bitwidth extending. Summary: This approach has two major advantages over the existing one: 1. We don't need to extend bitwidth in our computations. Extending bitwidth is a big issue for compile time as we often end up working with APInts wider than 64bit, which is a slow case for APInt. 2. When we zero extend a wrapped range, we lose some information (we replace the range with [0, 1 << src bit width)). Thus, avoiding such extensions better preserves information. Correctness testing: I ran 'ninja check' with assertions that the new implementation of getRangeForAffineAR gives the same results as the old one (this functionality is not present in this patch). There were several failures - I inspected them manually and found out that they all are caused by the fact that we're returning more accurate results now (see bullet (2) above). Without such assertions 'ninja check' works just fine, as well as SPEC2006. Compile time testing: CTMark/Os: - mafft/pairlocalalign -16.98% - tramp3d-v4/tramp3d-v4 -12.72% - lencod/lencod -11.51% - Bullet/bullet -4.36% - ClamAV/clamscan -3.66% - 7zip/7zip-benchmark -3.19% - sqlite3/sqlite3 -2.95% - SPASS/SPASS -2.74% - Average -5.81% Performance testing: The changes are expected to be neutral for runtime performance. Reviewers: sanjoy, atrick, pete Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30477 llvm-svn: 297992	2017-03-16 21:07:38 +00:00
Sanjoy Das	5cd6c5cacf	[ValueTracking] Make poison propagation more aggressive Summary: Motivation: fix PR31181 without regression (the actual fix is still in progress). However, the actual content of PR31181 is not relevant here. This change makes poison propagation more aggressive in the following cases: 1. poison * Val == poison, for any Val. In particular, this changes existing intentional and documented behavior in these two cases: a. Val is 0 b. Val is 2^k * N 2. poison << Val == poison, for any Val 3. getelementptr is poison if any input is poison I think all of these are justified (and are axiomatically true in the new poison / undef model): 1a: we need poison * 0 to be poison to allow transforms like these: A * (B + C) ==> A * B + A * C If poison * 0 were 0 then the above transform could not be allowed since e.g. we could have A = poison, B = 1, C = -1, making the LHS poison * (1 + -1) = poison * 0 = 0 and the RHS poison * 1 + poison * -1 = poison + poison = poison 1b: we need e.g. poison * 4 to be poison since we want to allow A * 4 ==> A + A + A + A If poison * 4 were a value with all of their bits poison except the last four; then we'd not be able to do this transform since then if A were poison the LHS would only be "partially" poison while the RHS would be "full" poison. 2: Same reasoning as (1b), we'd like to have the following kinds of transforms be legal: A << 1 ==> A + A Reviewers: majnemer, efriedma Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D30185 llvm-svn: 295809	2017-02-22 06:52:32 +00:00
Igor Laevsky	c11c1ed909	[SCEV] Cache results during GetMinTrailingZeros query Differential Revision: https://reviews.llvm.org/D29759 llvm-svn: 295060	2017-02-14 15:53:12 +00:00
Eli Friedman	10d1ff64fe	[SCEV] Simplify/generalize howFarToZero solving. Make SolveLinEquationWithOverflow take the start as a SCEV, so we can solve more cases. With that implemented, get rid of the special case for powers of two. The additional functionality probably isn't particularly useful, but it might help a little for certain cases involving pointer arithmetic. Differential Revision: https://reviews.llvm.org/D28884 llvm-svn: 293576	2017-01-31 00:42:42 +00:00
Daniil Fukalov	b09dac59fc	[SCEV] Introduce add operation inlining limit Inlining in getAddExpr() can cause abnormal computational time in some cases. New parameter -scev-addops-inline-threshold is introduced with default value 500. Reviewers: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D28812 llvm-svn: 293176	2017-01-26 13:33:17 +00:00
Chandler Carruth	d501b18990	This test apparently requires an x86 target and is failing on numerous bots ever since d0k fixed the CHECK lines so that it did something at all. It isn't actually testing SCEV directly but LSR, so move it into LSR and the x86-specific tree of tests that already exists there. Target dependence is common and unavoidable with the current design of LSR. llvm-svn: 292774	2017-01-23 08:33:29 +00:00
Benjamin Kramer	1fd0d44e9b	Attempt to fix test in release builds. llvm-svn: 292762	2017-01-22 21:01:19 +00:00
Benjamin Kramer	db9e0b659d	Fix some broken CHECK lines. The colon is important. llvm-svn: 292761	2017-01-22 20:28:56 +00:00
Eli Friedman	f1f49c8265	[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly. To avoid regressions, make ScalarEvolution::createSCEV a bit more clever. Also get rid of some useless code in ScalarEvolution::howFarToZero which was hiding this bug. No new testcase because it's impossible to actually expose this bug: we don't have any in-tree users of getUDivExactExpr besides the two functions I just mentioned, and they both dodged the problem. I'll try to add some interesting users in a followup. Differential Revision: https://reviews.llvm.org/D28587 llvm-svn: 292449	2017-01-18 23:56:42 +00:00
Chandler Carruth	0952750fae	[PM] Clean up the testing for IVUsers, especially with the new PM. First, I've moved a test of IVUsers from the LSR tree to a dedicated IVUsers test directory. I've also simplified its RUN line now that the new pass manager's loop PM is providing analyses on their own. No functionality changed, but it makes subsequent changes cleaner. llvm-svn: 292060	2017-01-15 09:29:27 +00:00
Chandler Carruth	2f19a324cb	[PM] The assumption cache is fundamentally designed to be self-updating, mark it as never invalidated in the new PM. The old PM already required this to work, and after a discussion with Hal this seems to really be the only sensible answer. The cache gracefully degrades as the IR is mutated, and most things which do this should already be incrementally updating the cache. This gets rid of a bunch of logic preserving and testing the invalidation of this analysis. llvm-svn: 292039	2017-01-15 00:26:18 +00:00
Eli Friedman	bd6dedaa7f	[SCEV] Make howFarToZero max backedge-taken count check for precondition. Refines max backedge-taken count if a loop like "for (int i = 0; i != n; ++i) { /* body */ }" is rotated. Differential Revision: https://reviews.llvm.org/D28536 llvm-svn: 291704	2017-01-11 21:07:15 +00:00
Eli Friedman	8396265655	[SCEV] Make howFarToZero use a simpler formula for max backedge-taken count. This is both easier to understand, and produces a tighter bound in certain cases. Differential Revision: https://reviews.llvm.org/D28393 llvm-svn: 291701	2017-01-11 20:55:48 +00:00
Chandler Carruth	082c183f06	[PM] Teach SCEV to invalidate itself when its dependencies become invalid. This fixes use-after-free bugs that will arise with any interesting use of SCEV. I've added a dedicated test that works diligently to trigger these kinds of bugs in the new pass manager and also checks for them explicitly as well as triggering ASan failures when things go squirrelly. llvm-svn: 291426	2017-01-09 07:44:34 +00:00
Daniel Jasper	aec2fa352f	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Hal Finkel	cb9f78e1c3	Make processing @llvm.assume more efficient by using operand bundles There was an efficiency problem with how we processed @llvm.assume in ValueTracking (and other places). The AssumptionCache tracked all of the assumptions in a given function. In order to find assumptions relevant to computing known bits, etc. we searched every assumption in the function. For ValueTracking, that means that we did O(#assumes * #values) work in InstCombine and other passes (with a constant factor that can be quite large because we'd repeat this search at every level of recursion of the analysis). Several of us discussed this situation at the last developers' meeting, and this implements the discussed solution: Make the values that an assume might affect operands of the assume itself. To avoid exposing this detail to frontends and passes that need not worry about it, I've used the new operand-bundle feature to add these extra call "operands" in a way that does not affect the intrinsic's signature. I think this solution is relatively clean. InstCombine adds these extra operands based on what ValueTracking, LVI, etc. will need and then those passes need only search the users of the values under consideration. This should fix the computational-complexity problem. At this point, no passes depend on the AssumptionCache, and so I'll remove that as a follow-up change. Differential Revision: https://reviews.llvm.org/D27259 llvm-svn: 289755	2016-12-15 02:53:42 +00:00
Li Huang	faa857dba7	[SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor. Summary: When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV multiple times if this SCEV is referenced by multiple other SCEVs. This has exponential time complexity in the worst case. Memoizing the results will avoid re-visiting the same SCEV. Add a map to save the results, and override the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each SCEV once and thus returns the same result for the same input SCEV. This patch fixes PR18606, PR18607. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25810 llvm-svn: 284868	2016-10-21 20:05:21 +00:00
John Brawn	84b21835f1	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that either the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Li Huang	fcfe8cd3ae	[SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV This is to avoid inlining too many multiplication operands into a SCEV, which could take exponential time in the worst case. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25794 llvm-svn: 284784	2016-10-20 21:38:39 +00:00
John Brawn	ecf79300dd	[SCEV] More accurate calculation of max backedge count of some less-than loops In loops that look something like i = n; do { ... } while(i++ < n+k); where k is a constant, the maximum backedge count is k (in fact the backedge count will be either 0 or k, depending on whether n+k wraps). More generally for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will iterate either 0 or that constant number of times. This allows for more loop unrolling with the recent upper bound loop unrolling changes, and I'm working on a patch that will let loop unrolling additionally make use of the loop being executed either 0 or k times (we need to retain the loop comparison only on the first unrolled iteration). Differential Revision: https://reviews.llvm.org/D25607 llvm-svn: 284465	2016-10-18 10:10:53 +00:00
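The "either 0 or k" behavior described in the commit above can be observed by simulating the loop shape directly. This is a hypothetical simulation, not LLVM code: `countBackedges` assumes unsigned 32-bit arithmetic (so that `n + k` wraps in a well-defined way) and counts how many times the loop condition sends control back to the top.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simulation (not LLVM code) of
//   i = n; do { ... } while (i++ < n + k);
// Returns the number of backedges taken, i.e. how often the condition holds.
inline uint32_t countBackedges(uint32_t n, uint32_t k) {
  const uint32_t bound = n + k; // wraps modulo 2^32 when n + k overflows
  uint32_t i = n;
  uint32_t backedges = 0;
  while (i++ < bound) // each true evaluation corresponds to one backedge
    ++backedges;
  // The "max-or-zero" property from the commit message.
  assert(backedges == 0 || backedges == k);
  return backedges;
}
```

Without wrapping, `countBackedges(10, 5)` is 5. When `n + k` wraps, as with `n = 4294967290` and `k = 10`, the first comparison already fails and the count is 0.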
David L Kreitzer	8bbabee21a	Reapplying r278731 after fixing the problem that caused it to be reverted. Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 281732	2016-09-16 14:38:13 +00:00
Wei Mi	24662395df	Create a getelementptr instead of sub expr for ValueOffsetPair if the value is a pointer. This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair, if the value is of pointer type, we can only create a getelementptr instead of sub expr. Differential Revision: https://reviews.llvm.org/D24088 llvm-svn: 281439	2016-09-14 04:39:50 +00:00
Wei Mi	59ca96636d	[UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 llvm-svn: 279748	2016-08-25 16:17:18 +00:00
Hans Wennborg	3879035e66	SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932) Differential Revision: https://reviews.llvm.org/D23594 llvm-svn: 278999	2016-08-17 22:50:18 +00:00
Reid Kleckner	b99b709068	Revert "Enhance SCEV to compute the trip count for some loops with unknown stride." This reverts commit r278731. It caused http://crbug.com/638314 llvm-svn: 278853	2016-08-16 21:02:04 +00:00
David L Kreitzer	7fe18251a5	Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 278731	2016-08-15 20:21:41 +00:00
Wei Mi	575435012c	Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The patch is to fix the bug in PR28705. It was caused by setting wrong return value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion have different meanings when the function is used in different ways so it is easy to make mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion, and specifies where each interface is expected to be used. Differential Revision: https://reviews.llvm.org/D22942 llvm-svn: 278161	2016-08-09 20:40:03 +00:00
Wei Mi	785858cf6c	Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The fix for PR28705 will be committed consecutively. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 llvm-svn: 278160	2016-08-09 20:37:50 +00:00
Sanjoy Das	d4c85af7fd	[SCEV] Un-grep'ify tests; NFC llvm-svn: 277861	2016-08-05 20:33:49 +00:00
Sanjoy Das	b0b4e86215	[SCEV] Don't infinitely recurse on unreachable code llvm-svn: 277848	2016-08-05 18:34:14 +00:00
Hans Wennborg	685e8ff953	Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion." It causes Clang tests to fail after Windows self-host (PR28705). (Also reverts follow-up r276139.) llvm-svn: 276822	2016-07-26 23:25:13 +00:00

738 Commits