Commit Graph

419 Commits

Author SHA1 Message Date
Michael Zolotukhin 99de88d1f3 [SCEV] Compute affine range in another way to avoid bitwidth extending.
Summary:
This approach has two major advantages over the existing one:
1. We don't need to extend bitwidth in our computations. Extending
bitwidth is a big issue for compile time as we often end up working with
APInts wider than 64bit, which is a slow case for APInt.
2. When we zero extend a wrapped range, we lose some information (we
replace the range with [0, 1 << src bit width)). Thus, avoiding such
extensions better preserves information.

Correctness testing:
I ran 'ninja check' with assertions that the new implementation of
getRangeForAffineAR gives the same results as the old one (this
functionality is not present in this patch). There were several failures
- I inspected them manually and found out that they all are caused by
the fact that we're returning more accurate results now (see bullet (2)
above).
Without such assertions 'ninja check' works just fine, as well as
SPEC2006.

Compile time testing:
CTMark/Os:
 - mafft/pairlocalalign	-16.98%
 - tramp3d-v4/tramp3d-v4	-12.72%
 - lencod/lencod	-11.51%
 - Bullet/bullet	-4.36%
 - ClamAV/clamscan	-3.66%
 - 7zip/7zip-benchmark	-3.19%
 - sqlite3/sqlite3	-2.95%
 - SPASS/SPASS	-2.74%
 - Average	-5.81%

Performance testing:
The changes are expected to be neutral for runtime performance.

Reviewers: sanjoy, atrick, pete

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30477

llvm-svn: 297992
2017-03-16 21:07:38 +00:00
Sanjoy Das 5cd6c5cacf [ValueTracking] Make poison propagation more aggressive
Summary:
Motivation: fix PR31181 without regression (the actual fix is still in
progress).  However, the actual content of PR31181 is not relevant
here.

This change makes poison propagation more aggressive in the following
cases:

 1. poision * Val == poison, for any Val.  In particular, this changes
    existing intentional and documented behavior in these two cases:
     a. Val is 0
     b. Val is 2^k * N
 2. poison << Val == poison, for any Val
 3. getelementptr is poison if any input is poison

I think all of these are justified (and are axiomatically true in the
new poison / undef model):

1a: we need poison * 0 to be poison to allow transforms like these:

  A * (B + C) ==> A * B + A * C

If poison * 0 were 0 then the above transform could not be allowed
since e.g. we could have A = poison, B = 1, C = -1, making the LHS

  poison * (1 + -1) = poison * 0 = 0

and the RHS

  poison * 1 + poison * -1 = poison + poison = poison

1b: we need e.g. poison * 4 to be poison since we want to allow

  A * 4 ==> A + A + A + A

If poison * 4 were a value with all of their bits poison except the
last four; then we'd not be able to do this transform since then if A
were poison the LHS would only be "partially" poison while the RHS
would be "full" poison.

2: Same reasoning as (1b), we'd like have the following kinds
transforms be legal:

  A << 1 ==> A + A

Reviewers: majnemer, efriedma

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30185

llvm-svn: 295809
2017-02-22 06:52:32 +00:00
Igor Laevsky c11c1ed909 [SCEV] Cache results during GetMinTrailingZeros query
Differential Revision: https://reviews.llvm.org/D29759

llvm-svn: 295060
2017-02-14 15:53:12 +00:00
Eli Friedman 10d1ff64fe [SCEV] Simplify/generalize howFarToZero solving.
Make SolveLinEquationWithOverflow take the start as a SCEV, so we can
solve more cases. With that implemented, get rid of the special case
for powers of two.

The additional functionality probably isn't particularly useful,
but it might help a little for certain cases involving pointer
arithmetic.

Differential Revision: https://reviews.llvm.org/D28884

llvm-svn: 293576
2017-01-31 00:42:42 +00:00
Daniil Fukalov b09dac59fc [SCEV] Introduce add operation inlining limit
Inlining in getAddExpr() can cause abnormal computational time in some cases.
New parameter -scev-addops-inline-threshold is intruduced with default value 500.

Reviewers: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28812

llvm-svn: 293176
2017-01-26 13:33:17 +00:00
Chandler Carruth d501b18990 This test apparently requires an x86 target and is failing on numerous
bots ever since d0k fixed the CHECK lines so that it did something at
all.

It isn't actually testing SCEV directly but LSR, so move it into LSR and
the x86-specific tree of tests that already exists there. Target
dependence is common and unavoidable with the current design of LSR.

llvm-svn: 292774
2017-01-23 08:33:29 +00:00
Benjamin Kramer 1fd0d44e9b Attempt to fix test in release builds.
llvm-svn: 292762
2017-01-22 21:01:19 +00:00
Benjamin Kramer db9e0b659d Fix some broken CHECK lines.
The colon is important.

llvm-svn: 292761
2017-01-22 20:28:56 +00:00
Eli Friedman f1f49c8265 [SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.

Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.

No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.

Differential Revision: https://reviews.llvm.org/D28587

llvm-svn: 292449
2017-01-18 23:56:42 +00:00
Chandler Carruth 0952750fae [PM] Clean up the testing for IVUsers, especially with the new PM.
First, I've moved a test of IVUsers from the LSR tree to a dedicated
IVUsers test directory. I've also simplified its RUN line now that the
new pass manager's loop PM is providing analyses on their own.

No functionality changed, but it makes subsequent changes cleaner.

llvm-svn: 292060
2017-01-15 09:29:27 +00:00
Chandler Carruth 2f19a324cb [PM] The assumption cache is fundamentally designed to be self-updating,
mark it as never invalidated in the new PM.

The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.

This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.

llvm-svn: 292039
2017-01-15 00:26:18 +00:00
Eli Friedman bd6dedaa7f [SCEV] Make howFarToZero max backedge-taken count check for precondition.
Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.

Differential Revision: https://reviews.llvm.org/D28536

llvm-svn: 291704
2017-01-11 21:07:15 +00:00
Eli Friedman 8396265655 [SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.
This is both easier to understand, and produces a tighter bound in certain
cases.

Differential Revision: https://reviews.llvm.org/D28393

llvm-svn: 291701
2017-01-11 20:55:48 +00:00
Chandler Carruth 082c183f06 [PM] Teach SCEV to invalidate itself when its dependencies become
invalid.

This fixes use-after-free bugs that will arise with any interesting use
of SCEV.

I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.

llvm-svn: 291426
2017-01-09 07:44:34 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Hal Finkel cb9f78e1c3 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

llvm-svn: 289755
2016-12-15 02:53:42 +00:00
Li Huang faa857dba7 [SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor.
Summary:
When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV
multiple times if this SCEV is referenced by multiple other SCEVs. This has
exponential time complexity in the worst case. Memoizing the results will
avoid re-visiting the same SCEV. Add a map to save the results, and override
the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each
SCEV once and thus returns the same result for the same input SCEV.

This patch fixes PR18606, PR18607.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25810

llvm-svn: 284868
2016-10-21 20:05:21 +00:00
John Brawn 84b21835f1 [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818
2016-10-21 11:08:48 +00:00
Li Huang fcfe8cd3ae [SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV
This is to avoid inlining too many multiplication operands into a SCEV, which could 
take exponential time in the worst case.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25794

llvm-svn: 284784
2016-10-20 21:38:39 +00:00
John Brawn ecf79300dd [SCEV] More accurate calculation of max backedge count of some less-than loops
In loops that look something like
 i = n;
 do {
  ...
 } while(i++ < n+k);
where k is a constant, the maximum backedge count is k (in fact the backedge
count will be either 0 or k, depending on whether n+k wraps). More generally
for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will
iterate either 0 or that constant number of times.

This allows for more loop unrolling with the recent upper bound loop unrolling
changes, and I'm working on a patch that will let loop unrolling additionally
make use of the loop being executed either 0 or k times (we need to retain the
loop comparison only on the first unrolled iteration).

Differential Revision: https://reviews.llvm.org/D25607

llvm-svn: 284465
2016-10-18 10:10:53 +00:00
David L Kreitzer 8bbabee21a Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Wei Mi 24662395df Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Wei Mi 59ca96636d [UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded
when unroll runtime iteration loop.

In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner
loop inside a loop nest, the scalar evolution needs to be dropped for its
parent loop which is done by ScalarEvolution::forgetLoop. However, we can
postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC
expansion can still reuse existing value.

Differential Revision: https://reviews.llvm.org/D23572

llvm-svn: 279748
2016-08-25 16:17:18 +00:00
Hans Wennborg 3879035e66 SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932)
Differential Revision: https://reviews.llvm.org/D23594

llvm-svn: 278999
2016-08-17 22:50:18 +00:00
Reid Kleckner b99b709068 Revert "Enhance SCEV to compute the trip count for some loops with unknown stride."
This reverts commit r278731. It caused http://crbug.com/638314

llvm-svn: 278853
2016-08-16 21:02:04 +00:00
David L Kreitzer 7fe18251a5 Enhance SCEV to compute the trip count for some loops with unknown stride.
Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 278731
2016-08-15 20:21:41 +00:00
Wei Mi 575435012c Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.

Differential Revision: https://reviews.llvm.org/D22942

llvm-svn: 278161
2016-08-09 20:40:03 +00:00
Wei Mi 785858cf6c Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The fix for PR28705 will be committed consecutively.

In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 278160
2016-08-09 20:37:50 +00:00
Sanjoy Das d4c85af7fd [SCEV] Un-grep'ify tests; NFC
llvm-svn: 277861
2016-08-05 20:33:49 +00:00
Sanjoy Das b0b4e86215 [SCEV] Don't infinitely recurse on unreachable code
llvm-svn: 277848
2016-08-05 18:34:14 +00:00
Hans Wennborg 685e8ff953 Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion."
It causes Clang tests to fail after Windows self-host (PR28705).

(Also reverts follow-up r276139.)

llvm-svn: 276822
2016-07-26 23:25:13 +00:00
Sanjoy Das a7d9ec8751 [SCEV] Make isImpliedCondOperandsViaRanges smarter
This change lets us prove things like

  "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow"

It does this by replacing replacing getConstantDifference by
computeConstantDifference (which is smarter) in
isImpliedCondOperandsViaRanges.

llvm-svn: 276505
2016-07-23 00:54:36 +00:00
Wei Mi 481232e991 Fix test/Analysis/ScalarEvolution/scev-expander-existing-value-offset.ll for rL276136.
The content in this testcase was accidentally duplicated. Fix the error.

llvm-svn: 276139
2016-07-20 16:54:58 +00:00
Wei Mi db80c0c77f Use ValueOffsetPair to enhance value reuse during SCEV expansion.
In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 276136
2016-07-20 16:40:33 +00:00
Keno Fischer 1efc3b70c5 Fix ScalarEvolutionExpander step scaling bug
The expandAddRecExprLiterally function incorrectly transforms
`[Start + Step * X]` into `Step * [Start + X]` instead of the correct
transform of `[Step * X] + Start`.

This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219
due to what appeared to be sufficiently complicated loop interactions.

Patch by Jameson Nash (jameson@juliacomputing.com).

Reviewers: sanjoy
Differential Revision: http://reviews.llvm.org/D16505

llvm-svn: 275239
2016-07-13 01:28:12 +00:00
Hal Finkel e186debb8b Teach SCEV to look through returned-argument functions
When building SCEVs, if a function is known to return its argument, then we can
build the SCEV using the corresponding argument value.

Differential Revision: http://reviews.llvm.org/D9381

llvm-svn: 275037
2016-07-11 02:48:23 +00:00
Sanjoy Das 0da2d14766 [SCEV] Compute max be count from shift operator only if all else fails
In particular, check to see if we can compute a precise trip count by
exhaustively simulating the loop first.

llvm-svn: 274199
2016-06-30 02:47:28 +00:00
Sanjoy Das e8fd9561cb [SCEV] Fix incorrect trip count computation
The way we elide max expressions when computing trip counts is incorrect
-- it breaks cases like this:

```
static int wrapping_add(int a, int b) {
  return (int)((unsigned)a + (unsigned)b);
}

void test() {
  volatile int end_buf = 2147483548; // INT_MIN - 100
  int end = end_buf;

  unsigned counter = 0;
  for (int start = wrapping_add(end,  200); start < end; start++)
    counter++;

  print(counter);
}
```

Note: the `NoWrap` variable that was being tested has little to do with
the values flowing into the max expression; it is a property of the
induction variable.

test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test
functionality I'm reverting in this change, so I've deleted the test
fully.

llvm-svn: 273079
2016-06-18 04:38:31 +00:00
Sanjoy Das c7f69b921f Be wary of abnormal exits from loop when exploiting UB
We can safely rely on a NoWrap add recurrence causing UB down the road
only if we know the loop does not have a exit expressed in a way that is
opaque to ScalarEvolution (e.g. by a function call that conditionally
calls exit(0)).

I believe with this change PR28012 is fixed.

Note: I had to change some llvm-lit tests in LoopReroll, since it looks
like they were depending on this incorrect behavior.

llvm-svn: 272237
2016-06-09 01:13:59 +00:00
Sanjoy Das 8598412e24 [SCEV] Track no-abnormal-exits instead of no-throw calls
Absence of may-unwind calls is not enough to guarantee that a
UB-generating use of an add-rec poison in the loop latch will actually
cause UB.  We also need to guard against calls that terminate the thread
or infinite loop themselves.

This partially addresses PR28012.

llvm-svn: 272181
2016-06-08 17:48:42 +00:00
Sanjoy Das 9a65cd214d Teach isGuarantdToTransferExecToSuccessor about debug info intrinsics
Calls to `@llvm.dbg.*` can be assumed to terminate.

llvm-svn: 272180
2016-06-08 17:48:36 +00:00
Sanjoy Das a19edc4d15 Fix a bug in SCEV's poison value propagation
The worklist algorithm introduced in rL271151 didn't check to see if the
direct users of the post-inc add recurrence propagates poison.  This
change fixes the problem and makes the code structure more obvious.

Note for release managers: correctness wise, this bug wasn't a
regression introduced by rL271151 -- the behavior of SCEV around
post-inc add recurrences was strictly improved (in terms of correctness)
in rL271151.

llvm-svn: 272179
2016-06-08 17:48:31 +00:00
Sanjoy Das f49ca52b9d [SCEV] See through op.with.overflow intrinsics (re-apply)
Summary:
This change teaches SCEV to see reduce `(extractvalue
0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if
possible).

This was first checked in at r265912 but reverted in r265950 because it
exposed some issues around how SCEV handled post-inc add recurrences.
Those issues have now been fixed.

Reviewers: atrick, regehr

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18684

llvm-svn: 271152
2016-05-29 00:34:42 +00:00
Sanjoy Das 7e4a64167d [SCEV] Don't always add no-wrap flags to post-inc add recs
Fixes PR27315.

The post-inc version of an add recurrence needs to "follow the same
rules" as a normal add or subtract expression.  Otherwise we miscompile
programs like

```
int main() {
  int a = 0;
  unsigned a_u = 0;
  volatile long last_value;
  do {
    a_u += 3;
    last_value = (long) ((int) a_u);
    if (will_add_overflow(a, 3)) {
      // Leave, and don't actually do the increment, so no UB.
      printf("last_value = %ld\n", last_value);
      exit(0);
    }
    a += 3;
  } while (a != 46);
  return 0;
}
```

This patch changes SCEV to put no-wrap flags on post-inc add recurrences
only when the poison from a potential overflow will go ahead to cause
undefined behavior.

To avoid regressing performance too much, I've assumed infinite loops
without side effects is undefined behavior to prove poison<->UB
equivalence in more cases.  This isn't ideal, but is not new to LLVM as
a whole, and far better than the situation I'm trying to fix.

llvm-svn: 271151
2016-05-29 00:32:17 +00:00
Sanjoy Das 70c2bbd29c [ValueTracking] ICmp instructions propagate poison
This is a stripped down version of D19211, leaving out the questionable
"branching in poison is UB" bit.

llvm-svn: 271150
2016-05-29 00:31:18 +00:00
Oleg Ranevskyy eb4eccae5c [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"
Summary:
**Description**

This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code.

When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call.
`getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`.

Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr.

This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction.

**Minimal reproducer:**
```
int foo(int a, int b, int c);
int baz();

void bar()
{
   int arr[20];
   int i = 0;

   for (i = 0; i < 4; ++i)
     arr[i] = baz();

   for (; i < 20; ++i)
     arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}
```

**Clang command line:**
```
clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir
```

**Expected result:**
The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated.

Reviewers: sanjoy

Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D20058

llvm-svn: 270695
2016-05-25 13:01:33 +00:00
Sanjoy Das f5d40d5350 [SCEV] Be more aggressive in proving NUW
... for AddRec's in loops for which SCEV is unable to compute a max
tripcount.  This is the NUW variant of r269211 and fixes PR27691.

(Note: PR27691 is not a correct or stability bug, it was created to
track a pending task).

llvm-svn: 269790
2016-05-17 17:51:14 +00:00
Sanjoy Das 4e8c80382f [SCEVExpander] Fix a failed cast<> assertion
SCEVExpander::replaceCongruentIVs assumes the backedge value of an
SCEV-analysable PHI to always be an instruction, when this is not
necessarily true.  For now address this by bailing out of the
optimization if the backedge value of the PHI is a non-Instruction.

llvm-svn: 269213
2016-05-11 17:41:41 +00:00
Sanjoy Das abb7b93eb9 [SCEVExpander] Don't break SSA in replaceCongruentIVs
`SCEVExpander::replaceCongruentIVs` bypasses `hoistIVInc` if both the
original and the isomorphic increments are PHI nodes.  Doing this can
break SSA if the isomorphic increment is not dominated by the original
increment.  Get rid of the bypass, and let `hoistIVInc` do the right
thing.

Fixes PR27232 (compile time crash/hang).

llvm-svn: 269212
2016-05-11 17:41:34 +00:00
Sanjoy Das 787c2460c2 [SCEV] Be more aggressive around proving no-wrap
... for AddRec's in loops for which SCEV is unable to compute a max
tripcount.  This is not a problem for "normal" loops[0] that don't have
guards or assumes, but helps in cases where we have guards or assumes in
the loop that can be used to constrain incoming values over the backedge.

This partially fixes PR27691 (we still don't handle the NUW case).

[0]: for "normal" loops, in the cases where we'd be able to prove
no-wrap via isKnownPredicate, we'd also be able to compute a max
tripcount.

llvm-svn: 269211
2016-05-11 17:41:26 +00:00
Sanjoy Das 2512d0c837 [SCEV] Use guards to prove predicates
We can use calls to @llvm.experimental.guard to prove predicates,
relying on the fact that in all locations domianted by a call to
@llvm.experimental.guard the predicate it is guarding is known to be
true.

llvm-svn: 268997
2016-05-10 00:31:49 +00:00
Sanjoy Das 013a4ac4aa [SCEV] Tweak the output format and content of -analyze
In the "LoopDispositions:" section:

 - Instead of printing out a list, print out a "dictionary" to make it
   obvious by inspection which disposition is for which loop.  This is
   just a cosmetic change.

 - Print dispositions for parent _and_ sibling loops.  I will use this
   to write a test case.

llvm-svn: 268405
2016-05-03 17:49:57 +00:00
Sanjoy Das f2f00fb11a [SCEV] When printing via -analysis, dump loop disposition
There are currently some bugs in tree around SCEV caching an incorrect
loop disposition.  Printing out loop dispositions will let us write
whitebox tests as those are fixed.

The dispositions are printed as a list in "inside out" order,
i.e. innermost loop first.

llvm-svn: 268177
2016-05-01 04:51:05 +00:00
Sanjoy Das a6155b659a Have isKnownNotFullPoison be smarter around control flow
Summary:
(... while still not using a PostDomTree)

The way we use isKnownNotFullPoison from SCEV today, the new CFG walking
logic will not trigger for any realistic cases -- it will kick in only
for situations where we could have merged the contiguous basic blocks
anyway[0], since the poison generating instruction dominates all of its
non-PHI uses (which are the only uses we consider right now).

However, having this change in place will allow a later bugfix to break
fewer llvm-lit tests.

[0]: i.e. cases where block A branches to block B and B is A's only
successor and A is B's only predecessor.

Reviewers: broune, bjarke.roune

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19212

llvm-svn: 267175
2016-04-22 17:41:06 +00:00
Sanjoy Das f9d88e650b This reverts commit r265913 and r265912
See PR27315

r265913: "[IndVars] Eliminate op.with.overflow when possible"

r265912: "[SCEV] See through op.with.overflow intrinsics"
llvm-svn: 265950
2016-04-11 15:26:18 +00:00
Sanjoy Das 3c529a40ca [SCEV] See through op.with.overflow intrinsics
Summary:
This change teaches SCEV to see reduce `(extractvalue
0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if
possible).

Reviewers: atrick, regehr

Subscribers: mcrosier, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D18684

llvm-svn: 265912
2016-04-10 22:50:26 +00:00
Silviu Baranga 6f444dfd55 Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
This re-commits r265535 which was reverted in r265541 because it
broke the windows bots. The problem was that we had a PointerIntPair
which took a pointer to a struct allocated with new. The problem
was that new doesn't provide sufficient alignment guarantees.
This pattern was already present before r265535 and it just happened
to work. To fix this, we now separate the PointerToIntPair from the
ExitNotTakenInfo struct into a pointer and a bool.

Original commit message:

Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Reviewers: anemet, mzolotukhin, hfinkel, sanjoy

Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17201

llvm-svn: 265786
2016-04-08 14:29:09 +00:00
Silviu Baranga a393baf1fd Revert r265535 until we know how we can fix the bots
llvm-svn: 265541
2016-04-06 14:06:32 +00:00
Silviu Baranga 72b4a4a330 [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Reviewers: anemet, mzolotukhin, hfinkel, sanjoy

Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17201

llvm-svn: 265535
2016-04-06 13:18:26 +00:00
Sanjoy Das 84216672da Remove trailing newline from test case; NFC
llvm-svn: 262980
2016-03-09 01:51:44 +00:00
Sanjoy Das 97d19bd95f [SCEV] Slightly generalize getRangeViaFactoring
Building on the previous change, this generalizes
ScalarEvolution::getRangeViaFactoring to work with
{Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign
extend or truncate operation, and k0 and k1 are constants.

llvm-svn: 262979
2016-03-09 01:51:02 +00:00
Sanjoy Das d3488c6060 [SCEV] Slightly generalize getRangeViaFactoring
This change generalizes ScalarEvolution::getRangeViaFactoring to work
with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign
extend or truncate operation.

llvm-svn: 262978
2016-03-09 01:50:57 +00:00
Sanjoy Das 724f5cf278 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das 11ef606f1d [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00
Sanjoy Das bf73098472 [SCEV] Make getRange smarter around selects
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.

The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).

llvm-svn: 262438
2016-03-02 00:57:54 +00:00
Chandler Carruth 2b3d0446f4 [PM/AA] Wire up SCEVAA to the new pass manager's registry and test it.
llvm-svn: 261409
2016-02-20 04:01:45 +00:00
Wei Mi fc1cab305f This patch is to fix PR26529 caused by r259736.
IndVarSimplify assumes scAddRecExpr to be expanded in literal form instead of
canonical form by calling disableCanonicalMode after it creates SCEVExpander.
When CanonicalMode is disabled, SCEVExpander::expand should always return PHI
node for scAddRecExpr. r259736 broke the assumption.

The fix is to let SCEVExpander::expand skip the reuse Value logic if
CanonicalMode is false.

In addition, Besides IndVarSimplify, LSR pass also calls disableCanonicalMode
before doing rewrite. We can remove the original check of LSRMode in reuse
Value logic and use CanonicalMode instead.

llvm-svn: 260174
2016-02-09 00:07:08 +00:00
Wei Mi a49559befb [SCEV] Try to reuse existing value during SCEV expansion
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.

This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.

The original commit triggered regressions in Polly tests. The regressions
exposed two problems which have been fixed in current version.

1. Polly will generate a new function based on the old one. To generate an
instruction for the new function, it builds SCEV for the old instruction,
applies some tranformation on the SCEV generated, then expands the transformed
SCEV and insert the expanded value into new function. Because SCEV expansion
may reuse value cached in ExprValueMap, the value in old function may be
inserted into new function, which is wrong.
   In SCEVExpander::expand, there is a logic to check the cached value to
be used should dominate the insertion point. However, for the above
case, the check always passes. That is because the insertion point is
in a new function, which is unreachable from the old function. However
for unreachable node, DominatorTreeBase::dominates thinks it will be
dominated by any other node.
   The fix is to simply add a check that the cached value to be used in
expansion should be in the same function as the insertion point instruction.

2. When the SCEV is of scConstant type, expanding it directly is cheaper than
reusing a normal value cached. Although in the cached value set in ExprValueMap,
there is a Constant type value, but it is not easy to find it out -- the cached
Value set is not sorted according to the potential cost. Existing reuse logic
in SCEVExpander::expand simply chooses the first legal element from the cached
value set.
   The fix is that when the SCEV is of scConstant type, don't try the reuse
logic. simply expand it.

Differential Revision: http://reviews.llvm.org/D12090

llvm-svn: 259736
2016-02-04 01:27:38 +00:00
Wei Mi 97de385868 Revert r259662, which caused regressions on polly tests.
llvm-svn: 259675
2016-02-03 18:05:57 +00:00
Wei Mi ed133978a0 [SCEV] Try to reuse existing value during SCEV expansion
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.

This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.

Differential Revision: http://reviews.llvm.org/D12090

llvm-svn: 259662
2016-02-03 17:05:12 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Sanjoy Das 52bfa0faa4 [SCEV] Fix PR25369
Have `getConstantEvolutionLoopExitValue` work correctly with multiple
entry loops.

As far as I can tell, `getConstantEvolutionLoopExitValue` never did the
right thing for multiple entry loops; and before r249712 it would
silently return an incorrect answer.  r249712 changed SCEV to fail an
assert on a multiple entry loop, and this change fixes the underlying
issue.

llvm-svn: 251770
2015-11-02 02:06:01 +00:00
Sanjoy Das 337d4786e1 [SCEV] Don't create SCEV expressions that break LCSSA
Prevent `createNodeFromSelectLikePHI` from creating SCEV expressions
that break LCSSA.

A better fix for the same issue is to teach SCEVExpander to not break
LCSSA by inserting PHI nodes at appropriate places.  That's planned for
the future.

Fixes PR25360.

llvm-svn: 251756
2015-10-31 23:21:40 +00:00
Silviu Baranga f91c80720d [SCEV] Generalize the SCEV algorithm for creating expressions for PHI nodes
Summary:
When forming expressions for phi nodes having an incoming value from
outside the loop A and a value coming from the previous iteration B
we were forming an AddRec if:
  - B was an AddRec
  - the value A was equal to the value for B at iteration -1 (or equal
    to the value of B shifted by one iteration, at iteration 0)

In this case, we were computing the expression to be the expression of
B, shifted by one iteration.

This changes generalizes the logic above by removing the restriction that
B needs to be an AddRec. For this we introduce two expression rewriters
that allow us to
  - shift an expression by one iteration
  - get the value of an expression at iteration 0

This allows us to get SCEV expressions for PHI nodes when these expressions
are not AddRecExprs.

Reviewers: sanjoy

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D14175

llvm-svn: 251700
2015-10-30 15:02:28 +00:00
Sanjoy Das c88f5d3c2c [SCEV] Compute max backedge count for loops with "shift ivs"
This teaches SCEV to compute //max// backedge taken counts for loops
like

    for (int i = k; i != 0; i >>>= 1)
      whatever();

SCEV yet cannot represent the exact backedge count for these loops, and
this patch does not change that.  This is really geared towards teaching
SCEV that loops like the above are *not* infinite.

llvm-svn: 251558
2015-10-28 21:27:14 +00:00
Sanjoy Das eeca9f6fd4 [SCEV] Commute zero extends through <nuw> additions
llvm-svn: 251052
2015-10-22 19:57:38 +00:00
Sanjoy Das a060e602fd [SCEV] Commute sign extends through nsw additions
Summary: Depends on D13613.

Reviewers: atrick, hfinkel, reames, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13685

llvm-svn: 251049
2015-10-22 19:57:25 +00:00
Sanjoy Das 8f27415c05 [SCEV] Mark AddExprs as nsw or nuw if legal
Summary:
This uses `ScalarEvolution::getRange` and not potentially control
dependent `nsw` and `nuw` bits on the arithmetic instruction.

Reviewers: atrick, hfinkel, nlewycky

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D13613

llvm-svn: 251048
2015-10-22 19:57:19 +00:00
Mehdi Amini 044cb34bdc Revert "Revert "This patch builds on top of D13378 to handle constant condition.""
This reverts commit r249528 and reapply r249431. The fix for the
fallout has been commited in r249575.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 249581
2015-10-07 18:14:25 +00:00
James Molloy 47efaeb36e Revert "This patch builds on top of D13378 to handle constant condition."
This reverts commit r249431. This caused failures in sqlite3: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/14453

llvm-svn: 249528
2015-10-07 09:03:34 +00:00
Mehdi Amini cf2513b352 This patch builds on top of D13378 to handle constant condition.
With this patch, clang -O3 optimizes correctly providing > 1000x speedup on this artificial benchmark):

for (a=0; a<n; a++)
    for (b=0; b<n; b++)
        for (c=0; c<n; c++)
            for (d=0; d<n; d++)
                for (e=0; e<n; e++)
                    for (f=0; f<n; f++)
                        x++;
From test-suite/SingleSource/Benchmarks/Shootout/nestedloop.c

Reviewers: sanjoyd

Differential Revision: http://reviews.llvm.org/D13390

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 249431
2015-10-06 17:19:20 +00:00
Sanjoy Das 55015d210f [SCEV] Recognize simple br-phi patterns
Summary:
Teach SCEV to match patterns like

```
  br %cond, label %left, label %right
 left:
  br label %merge
 right:
  br label %merge
 merge:
  V = phi [ %x, %left ], [ %y, %right ]
```

as "select %cond, %x, %y".  Before this SCEV would match PHI nodes
exclusively to add recurrences.

This addresses PR25005.

Reviewers: joker.eph, joker-eph, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13378

llvm-svn: 249211
2015-10-02 23:09:44 +00:00
Sanjoy Das b174f9a316 [SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts'
Summary:
If the trip count of a specific backedge is `N`, then we know that
backedge is effectively guarded by the condition `{0,+,1} u< N`.  This
change teaches SCEV to use this condition to prove things in
`isLoopBackedgeGuardedByCond`.

Depends on D12948
Depends on D12949

The original checkin, r248608 had to be backed out due to an issue with
a ObjCXX unit test.  That issue is now fixed, so re-landing.

Reviewers: atrick, reames, majnemer, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12950

llvm-svn: 248638
2015-09-25 23:53:50 +00:00
Sanjoy Das 4a39b97671 Revert two SCEV changes that caused test failures in clang.
r248606: "[SCEV] Exploit A < B => (A+K) < (B+K) when possible"
r248608: "[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts."
llvm-svn: 248614
2015-09-25 21:16:50 +00:00
Sanjoy Das d706fa8a0c [SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts.
Summary:
If the trip count of a specific backedge is `N`, then we know that
backedge is effectively guarded by the condition `{0,+,1} u< N`.  This
change teaches SCEV to use this condition to prove things in
`isLoopBackedgeGuardedByCond`.

Depends on D12948
Depends on D12949

Reviewers: atrick, reames, majnemer, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12950

llvm-svn: 248608
2015-09-25 19:59:57 +00:00
Sanjoy Das f3132d3b03 [ScalarEvolution] Fix PR24757.
Summary:
PR24757 was caused by some incorect math in
`ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X
in

  2^N * A = 2^N * X

is not necessarily A.

Reviewers: atrick, majnemer, meheff

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12721

llvm-svn: 247242
2015-09-10 05:27:38 +00:00
Piotr Padlewski 0dde00d239 ScalarEvolution assume hanging bugfix
http://reviews.llvm.org/D12719

llvm-svn: 247184
2015-09-09 20:47:30 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00
Bjarke Hammersholt Roune 9791ed4705 [SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl
Summary:
http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.

This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i - offset];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[i * stride];

  for (int i = 0; i < numIterations; ++i)
    sum += ptr[3 * (i << 7)];


Reviewers: atrick, sanjoy

Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben

Differential Revision: http://reviews.llvm.org/D11860

llvm-svn: 245118
2015-08-14 22:45:26 +00:00
Jingyue Wu 42f1d67a45 [SCEV] Apply NSW and NUW flags via poison value analysis
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.

There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.

This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:

  for (int i = 0; i < limit; ++i) {
    sum += ptr[i + offset];
  }

Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.


There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392

Patch by Bjarke Roune

Reviewers: eliben, atrick, sanjoy

Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits

Differential Revision: http://reviews.llvm.org/D11212

llvm-svn: 243460
2015-07-28 18:22:40 +00:00
David Blaikie 23af64846f [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction
See r230786 and r230794 for similar changes to gep and load
respectively.

Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.

When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.

This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.

This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).

No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.

This leaves /only/ the varargs case where the explicit type is required.

Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.

About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

llvm-svn: 235145
2015-04-16 23:24:18 +00:00
Sanjoy Das b864c1f76f [SCEV] Look at backedge dominating conditions (re-land r233447).
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch.  We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.

This re-lands r233447.  r233447 was reverted because it caused massive
compile-time regressions.  This change has a fix for the same issue.

llvm-svn: 233829
2015-04-01 18:24:06 +00:00
Daniel Jasper 87e848c7dc Revert "[SCEV] Look at backedge dominating conditions."
This leads to terribly slow compile times under MSAN. More discussion
on the commit thread of r233447.

llvm-svn: 233529
2015-03-30 09:30:02 +00:00
Sanjoy Das fe0e0fff92 [SCEV] Look at backedge dominating conditions.
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch.  We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8627

llvm-svn: 233447
2015-03-27 23:18:08 +00:00
Sanjoy Das 14598830fe [SCEV] Revert bailout added in r75511.
Summary:
With the introduction of MarkPendingLoopPredicates in r157092, I don't
think the bailout is needed anymore.

Reviewers: atrick, nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8624

llvm-svn: 233296
2015-03-26 17:28:26 +00:00
Nick Lewycky be8af48824 When simplifying a SCEV truncate by distributing, consider it a simplification to replace a cast, even if we end up with a trunc around the term. Fixes PR22960!
llvm-svn: 232794
2015-03-20 02:25:00 +00:00
Sanjoy Das cb8bca1777 [SCEV] Make isImpliedCond smarter.
Summary:
This change teaches isImpliedCond to infer things like "X sgt 0" => "X -
1 sgt -1".  The `ConstantRange` class has the logic to do the heavy
lifting, this change simply gets ScalarEvolution to exploit that when
reasonable.

Depends on D8345

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8346

llvm-svn: 232576
2015-03-18 00:41:29 +00:00
Sanjoy Das f1e9e1df25 [SCEV] Fix PR22856.
Summary:
ScalarEvolutionExpander assumes that the header block of a loop is a
legal place to have a use for a phi node.  This is true only for phis
that are either in the header or dominate the header block, but it is
not true for phi nodes that are strictly internal to the loop body.

This change teaches ScalarEvolutionExpander to place uses of PHI nodes
in the basic block the PHI nodes belong to.  This is always legal, and
`hoistIVInc` ensures that the said position dominates `IsomorphicInc`.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8311

llvm-svn: 232189
2015-03-13 18:31:19 +00:00
David Blaikie f72d05bc7b [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

llvm-svn: 232184
2015-03-13 18:20:45 +00:00
Nick Lewycky b6ef9a14de When forming an addrec out of a phi don't just look at the last computation and steal its flags for our own, there may be other computations in the middle. Check whether the LHS of the computation is the phi itself and then we know it's safe to steal the flags. Fixes PR22795.
There's a missed optimization opportunity where we could look at the full chain of computation and take the intersection of the flags instead of only looking one instruction deep.

llvm-svn: 232134
2015-03-13 01:37:52 +00:00
Sanjoy Das 91b5477aad [SCEV] Unify getUnsignedRange and getSignedRange
Summary:
This removes some duplicated code, and also helps optimization: e.g. in
the test case added, `%idx ULT 128` in `@x` is not currently optimized
to `true` by `-indvars` but will be, after this change.

The only functional change in ths commit is that for add recurrences,
ScalarEvolution::getRange will be more aggressive -- computing the
unsigned (resp. signed) range for a SCEVAddRecExpr will now look at the
NSW (resp. NUW) bits and check for signed (resp. unsigned) overflow.
This can be a strict improvement in some cases (such as the attached
test case), and should be no worse in other cases.

Reviewers: atrick, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8142

llvm-svn: 231709
2015-03-09 21:43:43 +00:00
Sanjoy Das f257452986 [SCEV] Add a `scalar-evolution-print-constant-ranges' option
Summary:
Unused in this commit, but will be used in a subsequent change (D8142)
by a FileCheck test.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8143

llvm-svn: 231708
2015-03-09 21:43:39 +00:00
Sanjoy Das 9e2c5010f6 [SCEV] make SCEV smarter about proving no-wrap.
Summary:
Teach SCEV to prove no overflow for an add recurrence by proving
something about the range of another add recurrence a loop-invariant
distance away from it.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7980

llvm-svn: 231305
2015-03-04 22:24:17 +00:00
Mehdi Amini 46a43556db Make DataLayout Non-Optional in the Module
Summary:
DataLayout keeps the string used for its creation.

As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().

Get rid of DataLayoutPass: the DataLayout is in the Module

The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.

Make DataLayout Non-Optional in the Module

Module->getDataLayout() will never returns nullptr anymore.

Reviewers: echristo

Subscribers: resistor, llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D7992

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
2015-03-04 18:43:29 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
David Blaikie 79e6c74981 [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

llvm-svn: 230786
2015-02-27 19:29:02 +00:00
Sanjoy Das dcc84db264 Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap
(The change was landed in r230280 and caused the regression PR22674.
This version contains a fix and a test-case for PR22674).
    
When emitting the increment operation, SCEVExpander marks the
operation as nuw or nsw based on the flags on the preincrement SCEV.
This is incorrect because, for instance, it is possible that {-6,+,1}
is <nuw> while {-6,+,1}+1 = {-5,+,1} is not.
    
This change teaches SCEV to mark the increment as nuw/nsw only if it
can explicitly prove that the increment operation won't overflow.
    
Apart from the attached test case, another (more realistic)
manifestation of the bug can be seen in
Transforms/IndVarSimplify/pr20680.ll.

Differential Revision: http://reviews.llvm.org/D7778

llvm-svn: 230533
2015-02-25 20:02:59 +00:00
Hans Wennborg 953d6fb84e Revert r230280: "Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap"
This caused PR22674, failing this assert:

Instructions.h:2281: llvm::Value* llvm::PHINode::getOperand(unsigned int) const: Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.

llvm-svn: 230341
2015-02-24 16:19:29 +00:00
Sanjoy Das b14010d28b Fix bug 22641
The bug was a result of getPreStartForExtend interpreting nsw/nuw
flags on an add recurrence more strongly than is legal.  {S,+,X}<nsw>
implies S+X is nsw only if the backedge of the loop is taken at least
once.

NOTE: I had accidentally committed an unrelated change with the commit
message of this change in r230275 (r230275 was reverted in r230279).
This is the correct change for this commit message.

Differential Revision: http://reviews.llvm.org/D7808

llvm-svn: 230291
2015-02-24 01:02:42 +00:00
Sanjoy Das 18c243b933 Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap
When emitting the increment operation, SCEVExpander marks the
operation as nuw or nsw based on the flags on the preincrement SCEV.
This is incorrect because, for instance, it is possible that {-6,+,1}
is <nuw> while {-6,+,1}+1 = {-5,+,1} is not.

This change teaches SCEV to mark the increment as nuw/nsw only if it
can explicitly prove that the increment operation won't overflow.

Apart from the attached test case, another (more realistic) manifestation
of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll.

NOTE: this change was landed with an incorrect commit message in
rL230275 and was reverted for that reason in rL230279.  This commit
message is the correct one.

Differential Revision: http://reviews.llvm.org/D7778

llvm-svn: 230280
2015-02-23 23:22:58 +00:00
Sanjoy Das c9cf0151cf Revert 230275.
230275 got committed with an incorrect commit message due to a mixup
on my side.  Will re-land in a few moments with the correct commit
message.

llvm-svn: 230279
2015-02-23 23:13:22 +00:00
Sanjoy Das 913dfd8f7f Fix bug 22641
The bug was a result of getPreStartForExtend interpreting nsw/nuw
flags on an add recurrence more strongly than is legal.  {S,+,X}<nsw>
implies S+X is nsw only if the backedge of the loop is taken at least
once.

Differential Revision: http://reviews.llvm.org/D7808

llvm-svn: 230275
2015-02-23 22:55:13 +00:00
Sanjoy Das 4153f47026 Generalize getExtendAddRecStart to work with both sign and zero
extensions.

This change also removes `DEBUG(dbgs() << "SCEV: untested prestart
overflow check\n");` because that case has a unit test now.

Differential Revision: http://reviews.llvm.org/D7645

llvm-svn: 229600
2015-02-18 01:47:07 +00:00
Sanjoy Das bf5d870dfa Bugfix: SCEV incorrectly marks certain add recurrences as nsw
When creating a scev for sext({X,+,Y}), scev checks if the expression
is equivalent to {sext X,+,zext Y}.  If it can prove that, it also
tags the original {X,+,Y} as <nsw>, which is not correct.

In the test case I run `-scalar-evolution` twice because the bug
manifests only once SCEV has run through and seen the `sext`
expressions (and then does a in-place mutation on {X,+,Y}).

Differential Revision: http://reviews.llvm.org/D7495

llvm-svn: 228586
2015-02-09 18:34:55 +00:00
Johannes Doerfert 2683e5676c Allow ScalarEvolution to catch more min/max cases
For the attached test case different types are used in the ICmpInst
  and SelectInst that represent the min/max expressions. However, if the
  ICmpInst type is smaller a comparison with the sign/zero extended
  operands would have yielded the same result. This situation might
  arise after the instruction combination pass was applied.

  Differential Revision: http://reviews.llvm.org/D7338

llvm-svn: 228572
2015-02-09 12:34:23 +00:00
Sanjoy Das f2e931cae9 Bugfix: ScalarEvolution incorrectly assumes that the start of certain
add recurrences don't overflow.

This change makes the optimization more restrictive.  It still assumes
that an overflowing `add nsw` is undefined behavior; and this change
will need revisiting once we have a consistent semantics for poison
values.

Differential Revision: http://reviews.llvm.org/D7331

llvm-svn: 228552
2015-02-08 22:52:17 +00:00
Sanjoy Das cb47366366 Make ScalarEvolution less aggressive with respect to no-wrap flags.
ScalarEvolution currently lowers a subtraction recurrence to an add
recurrence with the same no-wrap flags as the subtraction.  This is
incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y`
and `sub nuw X, Y` is not the same as `add nuw X, -Y`.  This patch
fixes the issue, and adds two test cases demonstrating the bug.

Differential Revision: http://reviews.llvm.org/D7081

llvm-svn: 226755
2015-01-22 00:48:47 +00:00
Sanjoy Das 81401d4b19 Fix PR22179.
We were incorrectly inferring nsw for certain SCEVs. We can be more
aggressive here (see Richard Smith's comment on
http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just
focuses on correctness.

Differential Revision: http://reviews.llvm.org/D6914

llvm-svn: 225591
2015-01-10 23:41:24 +00:00
Duncan P. N. Exon Smith be7ea19b58 IR: Make metadata typeless in assembly
Now that `Metadata` is typeless, reflect that in the assembly.  These
are the matching assembly changes for the metadata/value split in
r223802.

  - Only use the `metadata` type when referencing metadata from a call
    intrinsic -- i.e., only when it's used as a `Value`.

  - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode`
    when referencing it from call intrinsics.

So, assembly like this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata !{i32 %v}, metadata !0)
      call void @llvm.foo(metadata !{i32 7}, metadata !0)
      call void @llvm.foo(metadata !1, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{metadata !3}, metadata !0)
      ret void, !bar !2
    }
    !0 = metadata !{metadata !2}
    !1 = metadata !{i32* @global}
    !2 = metadata !{metadata !3}
    !3 = metadata !{}

turns into this:

    define @foo(i32 %v) {
      call void @llvm.foo(metadata i32 %v, metadata !0)
      call void @llvm.foo(metadata i32 7, metadata !0)
      call void @llvm.foo(metadata i32* @global, metadata !0)
      call void @llvm.foo(metadata !3, metadata !0)
      call void @llvm.foo(metadata !{!3}, metadata !0)
      ret void, !bar !2
    }
    !0 = !{!2}
    !1 = !{i32* @global}
    !2 = !{!3}
    !3 = !{}

I wrote an upgrade script that handled almost all of the tests in llvm
and many of the tests in cfe (even handling many `CHECK` lines).  I've
attached it (or will attach it in a moment if you're speedy) to PR21532
to help everyone update their out-of-tree testcases.

This is part of PR21532.

llvm-svn: 224257
2014-12-15 19:07:53 +00:00
David Majnemer 0df1d12476 ScalarEvolution: HowFarToZero was wrongly using signed division
HowFarToZero was supposed to use unsigned division in order to calculate
the backedge taken count.  However, SCEVDivision::divide performs signed
division.  Unless I am mistaken, no users of SCEVDivision actually want
signed arithmetic: switch to udiv and urem.

This fixes PR21578.

llvm-svn: 222093
2014-11-16 07:30:35 +00:00
Rafael Espindola 1c99344156 Use FileCheck in a few tests.
llvm-svn: 221459
2014-11-06 15:05:51 +00:00
Bradley Smith 9992b167ae [SCEV] Improve Scalar Evolution's use of no {un,}signed wrap flags
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation bewteen
the two, since we can statically determine which is greater.

This allows us to unroll loops such as:

 void testcase3(int v) {
   for (int i=v; i<=v+1; ++i)
     f(i);
 }

llvm-svn: 220960
2014-10-31 11:40:32 +00:00
Sanjoy Das 1f05c51e5e This patch teaches ScalarEvolution to pick and use !range metadata.
It also makes it more aggressive in querying range information by
adding a call to isKnownPredicateWithRanges to
isLoopBackedgeGuardedByCond and isLoopEntryGuardedByCond.

phabricator: http://reviews.llvm.org/D5638

Reviewed by: atrick, hfinkel

llvm-svn: 219532
2014-10-10 21:22:34 +00:00
Mark Heffernan 2beab5f0b4 This patch de-pessimizes the calculation of loop trip counts in
ScalarEvolution in the presence of multiple exits. Previously all
loops exits had to have identical counts for a loop trip count to be
considered computable. This pessimization was implemented by calling
getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock)
inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME
in the comments of that function). The pessimization was added to fix
a corner case involving undefined behavior (pr/16130). This patch more
precisely handles the undefined behavior case allowing the pessimization
to be removed.

ControlsExit replaces IsSubExpr to more precisely track the case where
undefined behavior is expected to occur. Because undefined behavior is
tracked more precisely we can remove MustExit from ExitLimit. MustExit
was used to track the case where the limit was computed potentially
assuming undefined behavior even if undefined behavior didn't necessarily
occur.

llvm-svn: 219517
2014-10-10 17:39:11 +00:00
Hal Finkel cebf0cc210 Make use @llvm.assume for loop guards in ScalarEvolution
This adds a basic (but important) use of @llvm.assume calls in ScalarEvolution.
When SE is attempting to validate a condition guarding a loop (such as whether
or not the loop count can be zero), this check should also include dominating
assumptions.

llvm-svn: 217348
2014-09-07 21:37:59 +00:00
Dinesh Dwivedi c0e6703360 Adding testcase for PR18886.
Differential Revision: http://reviews.llvm.org/D3837

llvm-svn: 209645
2014-05-27 06:44:25 +00:00
Andrew Trick b429083aff Test case comments. Fix sloppiness.
llvm-svn: 209551
2014-05-23 20:46:21 +00:00
Andrew Trick 839e30b2c0 Fix and improve SCEV ComputeBackedgeTankCount.
This is a follow-up to r209358: PR19799: Indvars miscompile due to an
incorrect max backedge taken count from SCEV.

That fix was incomplete as pointed out by Arnold and Michael Z. The
code was also too confusing. It needed a careful rewrite with more
unit tests. This version will also happen to optimize more cases.

<rdar://17005101> PR19799: Indvars miscompile...

llvm-svn: 209545
2014-05-23 19:47:13 +00:00
Andrew Trick e255359b57 Fix a bug in SCEV's backedge taken count computation from my prior fix in Jan.
This has to do with the trip count computation for loops with multiple
exits, which is quite subtle. Most passes just ask for a single trip
count number, so we must be conservative assuming any exit could be
taken.  Normally, we rely on the "exact" trip count, which was
correctly given as "unknown". However, SCEV also gives a "max"
back-edge taken count. The loops max BE taken count is conservatively
a maximum over the max of each exit's non-exiting iterations
count. Note that some exit tests can be skipped so the max loop
back-edge taken count can actually exceed the max non-exiting
iterations for some exits. However, when we know the loop *latch*
cannot be skipped, we can directly use its max taken count
disregarding other exits. I previously took the minimum here without
checking whether the other exit could be skipped. The correct, and
simpler thing to do here is just to directly use the loop latch's max
non-exiting iterations as the loops max back-edge count.

In the problematic test case, the first loop exit had a max of zero
non-exiting iterations, but could be skipped. The loop latch was known
not to be skipped but had max of one non-exiting iteration. We
incorrectly claimed the loop back-edge could be taken zero times, when
it is actually taken one time.

Fixes Loop %for.body.i: <multiple exits> Unpredictable backedge-taken count.
Loop %for.body.i: max backedge-taken count is 1.

llvm-svn: 209358
2014-05-22 00:37:03 +00:00
Benjamin Kramer e75eaca32f ScalarEvolution: Compute exit counts for loops with a power-of-2 step.
If we have a loop of the form
for (unsigned n = 0; n != (k & -32); n += 32) {}
then we know that n is always divisible by 32 and the loop must
terminate. Even if we have a condition where the loop counter will
overflow it'll always hold this invariant.

PR19183. Our loop vectorizer creates this pattern and it's also
occasionally formed by loop counters derived from pointers.

llvm-svn: 204728
2014-03-25 16:25:12 +00:00
Nico Rieck 35a237d4ed Actually call FileCheck in tests
llvm-svn: 201491
2014-02-16 13:27:39 +00:00
Benjamin Kramer 5a188549ad ScalarEvolution: Analyze trip count of loops with a switch guarding the exit.
llvm-svn: 201159
2014-02-11 15:44:32 +00:00
Nick Lewycky 629199ccb3 Fix crasher introduced in r200203 and caught by a libc++ buildbot. Don't assume that getMulExpr returns a SCEVMulExpr, it may have simplified it to something else!
llvm-svn: 200210
2014-01-27 10:47:44 +00:00
Nick Lewycky 31eaca5513 Teach SCEV to handle more cases of 'and X, CST', specifically where CST is any number of contiguous 1 bits in a row, with any number of leading and trailing 0 bits.
Unfortunately, this in turn led to some lower quality SCEVs due to some different paths through expression simplification, so add getUDivExactExpr and use it. This fixes all instances of the problems that I found, but we can make that function smarter as necessary.

Merge test "xor-and.ll" into "and-xor.ll" since I needed to update it anyways. Test 'nsw-offset.ll' analyzes a little deeper, %n now gets a scev in terms of %no instead of a SCEVUnknown.

llvm-svn: 200203
2014-01-27 10:04:03 +00:00
Alp Toker cb40291100 Fix known typos
Sweep the codebase for common typos. Includes some changes to visible function
names that were misspelt.

llvm-svn: 200018
2014-01-24 17:20:08 +00:00
Stepan Dyatkovskiy 431993b57b Fixed old typo in ScalarEvolution, that caused wrong SCEVs zext operation.
Detailed description is here:
http://llvm.org/bugs/show_bug.cgi?id=18000#c16

For participation in bugfix process special thanks to David Wiberg.

llvm-svn: 198863
2014-01-09 12:26:12 +00:00
Andrew Trick 34e2f0c4ea Rewrite SCEV's backedge taken count computation.
Patch by Michele Scandale!

Rewrite of the functions used to compute the backedge taken count of a
loop on LT and GT comparisons.

I decided to split the handling of LT and GT cases becasue the trick
"a > b == -a < -b" in some cases prevents the trip count computation
due to the multiplication by -1 on the two operands of the
comparison. This issue comes from the conservative computation of
value range of SCEVs: taking the negative SCEV of an expression that
have a small positive range (e.g. [0,31]), we would have a SCEV with a
fullset as value range.

Indeed, in the new rewritten function I tried to better handle the
maximum backedge taken count computation when MAX/MIN expression are
used to handle the cases where no entry guard is found.

Some test have been modified in order to check the new value correctly
(I manually check them and reasoning on possible overflow the new
values seem correct).

I finally added a new test case related to the multiplication by -1
issue on GT comparisons.

llvm-svn: 194116
2013-11-06 02:08:26 +00:00
Benjamin Kramer 6094f30da2 SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive.
We can't do this for the general case as saying a GEP with a negative index
doesn't have unsigned wrap isn't valid for negative indices.
  %gep = getelementptr inbounds i32* %p, i64 -1

But an inbounds GEP cannot run past the end of address space. So we check for
the very common case of a positive index and make GEPs derived from that NUW.
Together with Andy's recent non-unit stride work this lets us analyze loops
like

  void foo3(int *a, int *b) {
    for (; a < b; a++) {}
  }

PR12375, PR12376.

Differential Revision: http://llvm-reviews.chandlerc.com/D2033

llvm-svn: 193514
2013-10-28 07:30:06 +00:00
Matt Arsenault be18b8a3ca Fix creating bitcasts between address spaces in SCEV.
The test before wasn't successfully testing this
since it was missing the datalayout piece to change
the size of the second address space.

llvm-svn: 193102
2013-10-21 18:41:10 +00:00
Andrew Trick 768b917dc8 SCEV should use NSW to get trip count for positive nonunit stride loops.
SCEV currently fails to compute loop counts for nonunit stride
loops. This comes up frequently. It prevents loop optimization and
forces vectorization to insert extra loop checks.

For example:
void foo(int n, int *x) {
 for (int i = 0; i < n; i += 3) {
   x[i] = i;
   x[i+1] = i+1;
   x[i+2] = i+2;
 }
}

We need to properly handle the case in which limit > INT_MAX-stride. In
the above case: n > INT_MAX-3. In this case the loop counter will step
beyond the limit and overflow at the same time. However, knowing that
signed integer overlow in undefined, we can assume the loop test
behavior is arbitrary after overflow. This obeys both C undefined
behavior rules, and the more strict LLVM poison value rules.

I'm finally fixing this in response to Hal Finkel's persistence.
The most probable reason that we never optimized this before is that
we were being careful to handle case where the developer expected a
side-effect free infinite loop relying on overflow:

for (int i = 0; i < n; i += s) {
  ++j;
}
return j;

If INT_MAX+1 is a multiple of s and n > INT_MAX-s, then we might
expect an infinite loop. However there are plenty of ways to achieve
this effect without relying on undefined behavior of signed overflow.

llvm-svn: 193015
2013-10-18 23:43:53 +00:00
Matt Arsenault a90a18e0ea Teach ScalarEvolution about pointer address spaces
llvm-svn: 190425
2013-09-10 19:55:24 +00:00
Bill Wendling a08bb49c61 FileCheck-ize tests.
llvm-svn: 188971
2013-08-22 00:51:19 +00:00
Daniel Dunbar 9efbedfd35 [tests] Cleanup initialization of test suffixes.
- Instead of setting the suffixes in a bunch of places, just set one master
   list in the top-level config. We now only modify the suffix list in a few
   suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).

 - Aside from removing the need for a bunch of lit.local.cfg files, this enables
   4 tests that were inadvertently being skipped (one in
   Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
   CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
   XFAILED).

 - This commit also fixes a bunch of config files to use config.root instead of
   older copy-pasted code.

llvm-svn: 188513
2013-08-16 00:37:11 +00:00
Bill Wendling dc17270968 FileCheckize some of the testcases.
llvm-svn: 187756
2013-08-05 23:43:18 +00:00
Stephen Lin 6dd347b39f Add newlines at end of test files, no functionality change
llvm-svn: 186263
2013-07-13 22:00:58 +00:00
Andrew Trick 3c944ba2f0 Unit test for SCEV fix r182989, PR16130.
llvm-svn: 183017
2013-05-31 16:42:41 +00:00
Manman Ren 662ece49de TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180743
2013-04-29 22:42:01 +00:00
Andrew Trick 9093e15066 Fix SCEV forgetMemoizedResults should search and destroy backedge exprs.
Fixes PR15570: SEGV: SCEV back-edge info invalid after dead code removal.

Indvars creates a SCEV expression for the loop's back edge taken
count, then determines that the comparison is always true and
removes it.

When loop-unroll asks for the expression, it contains a NULL
SCEVUnknkown (as a CallbackVH).

forgetMemoizedResults should invalidate the loop back edges expression.

llvm-svn: 177986
2013-03-26 03:14:53 +00:00