Commit Graph

433 Commits

Author SHA1 Message Date
Philip Reames af11a4376c [Tests] Update a test to consistently use new pass manager and FileCheck the result
llvm-svn: 362518
2019-06-04 16:19:34 +00:00
Philip Reames 78e71c4d09 [Tests] Autogen tests so that diffs for a future change are understandable
llvm-svn: 362516
2019-06-04 16:15:19 +00:00
Philip Reames 83645d214d [Tests] Add LFTR tests for multiple exit loops (try 2)
(Recommit after fixing a keymash in the run line.  Sorry for breakage.)

This is preparation for D62625 <https://reviews.llvm.org/D62625>

llvm-svn: 362426
2019-06-03 17:41:12 +00:00
Dmitri Gribenko b46934eeb8 Revert "[Tests] Add LFTR tests for multiple exit loops"
This reverts commit r362417.  There's a syntax error in the RUN line.

llvm-svn: 362418
2019-06-03 16:58:11 +00:00
Philip Reames 2fcd2bd0df [Tests] Add LFTR tests for multiple exit loops
This is preparation for D62625

llvm-svn: 362417
2019-06-03 16:46:03 +00:00
Nikita Popov eb37509832 [IndVarSimplify] Add tests for saturating math on IV; NFC
These saturating math ops can be replaced with simple math.

llvm-svn: 362320
2019-06-02 08:49:35 +00:00
Nikita Popov 46d4dba6e6 [IndVarSimplify] Fixup nowrap flags during LFTR (PR31181)
Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix
for LFTR poison handling issues in general.

When LFTR moves a condition from pre-inc to post-inc, it may now
depend on value that is poison due to nowrap flags. To avoid this,
we clear any nowrap flag that SCEV cannot prove for the post-inc
addrec.
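
A rough model of the hazard described above, written as a Python sketch. It assumes an 8-bit IV and a loop whose exit test originally reads the pre-increment value; the width and the helper name are illustrative assumptions and do not reproduce SCEV's actual reasoning.

```python
WIDTH = 8              # assumed IV width for the illustration
LIMIT = 1 << WIDTH     # first value that no longer fits in WIDTH bits

def nuw_holds_for_post_inc(start, last):
    """Return True if i + 1 never wraps (unsigned) while the
    pre-increment IV i takes every value in start..last."""
    return all(i + 1 < LIMIT for i in range(start, last + 1))

# Pre-inc values 0..254: every post-inc value still fits, so nuw is provable.
print(nuw_holds_for_post_inc(0, 254))
# Pre-inc values 0..255: the last increment wraps, so a nuw flag on the
# post-inc add would make the moved condition depend on poison.
print(nuw_holds_for_post_inc(0, 255))
```

In the second case the flag is harmless while the exit test reads the pre-increment value, and becomes a problem only once the test is moved to the post-increment value.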

Additionally, LFTR may switch to a different IV that is dynamically
dead and as such may be arbitrarily poison. This patch will correct
nowrap flags in some but not all cases where this happens. This is
related to the adoption of IR nowrap flags for the pre-inc addrec.
(See some of the switch_to_different_iv tests, where flags are not
dropped or insufficiently dropped.)

Finally, there are likely similar issues with the handling of GEP
inbounds, but we don't have a test case for this yet.

Differential Revision: https://reviews.llvm.org/D60935

llvm-svn: 362292
2019-06-01 09:40:18 +00:00
Nikita Popov 2b1d799a59 [IndVarSimplify] Add additional PR33181 tests; NFC
Two more tests with a switch to a dynamically dead IV, with poison
occurring on the first or second iteration.

llvm-svn: 362291
2019-06-01 09:40:09 +00:00
Nikita Popov e1d38ec811 [LFTR] Add additional PR31181 test cases
One case where overflow happens in the first loop iteration, and
two cases where we switch to a dynamically dead IV with post/pre
increment, respectively.

llvm-svn: 361189
2019-05-20 19:13:04 +00:00
Philip Reames f0a0e8bb36 [Tests] Consolidate more lftr tests
These are all of the ones involving the same data layout string.  Remainder take a bit more consideration, but at least everything can be auto-updated now.

llvm-svn: 360961
2019-05-17 00:19:28 +00:00
Philip Reames 087a30d527 [Tests] Expand basic lftr coverage
Newly written tests to cover the simple cases.  We don't appear to have broad coverage of this transform anywhere.

llvm-svn: 360957
2019-05-16 23:41:28 +00:00
Philip Reames e7b680478c [Tests] More consolidation of lftr tests
llvm-svn: 360936
2019-05-16 20:42:00 +00:00
Philip Reames c37a86d479 [Test] Remove a bunch of cruft from a test
This test hadn't been fully reduced, so do so.

llvm-svn: 360935
2019-05-16 20:37:20 +00:00
Philip Reames fb70fbaba4 [Tests] Start consolidating lftr tests into a single file
llvm-svn: 360934
2019-05-16 20:33:41 +00:00
Philip Reames c8783798f4 [Tests] Autogen the last lftr test
llvm-svn: 360933
2019-05-16 20:24:57 +00:00
Philip Reames 082ec7a784 [Tests] Autogen a few more lftr tests for readability
llvm-svn: 360932
2019-05-16 20:19:02 +00:00
Philip Reames 12a8ea9876 [Tests] Autogen a few lftr tests in preparation for merging
llvm-svn: 360931
2019-05-16 20:15:25 +00:00
Philip Reames bd8d309111 [IndVars] Extend reasoning about loop invariant exits to non-header blocks
Noticed while glancing through the code for other reasons.  The extension is trivial enough, decided to just do it.

llvm-svn: 360694
2019-05-14 17:20:10 +00:00
Philip Reames bbe4ff10df [Test] Autogen a test for ease of later changing
llvm-svn: 360690
2019-05-14 16:37:29 +00:00
Keno Fischer a1a4adf4b9 [SCEV] Add explicit representations of umin/smin
Summary:
Currently we express umin as `~umax(~x, ~y)`. However, this becomes
a problem for operands in non-integral pointer spaces, because `~x`
is not something we can compute for `x` non-integral. However, since
comparisons are generally still allowed, we are actually able to
express `umin(x, y)` directly as long as we don't try to express it
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions were
largely identical.
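
The identity behind the old encoding can be checked by brute force. The sketch below models 8-bit unsigned values in Python; the bit width and helper names are illustrative assumptions, not LLVM code.

```python
WIDTH = 8                   # assumed bit width for the illustration
MASK = (1 << WIDTH) - 1

def bit_not(x):
    """Bitwise NOT restricted to WIDTH bits."""
    return ~x & MASK

def umin_via_umax(x, y):
    """The old encoding: umin(x, y) expressed as ~umax(~x, ~y)."""
    return bit_not(max(bit_not(x), bit_not(y)))

def identity_holds():
    """Exhaustively compare the encoding with a direct unsigned minimum."""
    values = range(MASK + 1)
    return all(umin_via_umax(x, y) == min(x, y) for x in values for y in values)
```

The encoding needs `~x` to be computable, which is exactly what is unavailable for non-integral pointers; a direct `min` only needs a comparison.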

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D50167

llvm-svn: 360159
2019-05-07 15:28:47 +00:00
Nikita Popov d89de3f7f4 [IndVarSimplify] Generate full checks for some LFTR tests; NFC
llvm-svn: 358813
2019-04-20 12:05:53 +00:00
Nikita Popov aa0c5a022f [IndVarSimplify] Add tests for PR31181; NFC
llvm-svn: 358812
2019-04-20 12:05:43 +00:00
Eric Christopher cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
David Green 4511f3fa86 [SCEV] Ensure that isHighCostExpansion takes into account what is being divided
A SCEV is not low-cost just because you can divide it by a power of 2. We need to also
check what we are dividing to make sure it too is not a high-cost expansion. This helps
to not expand the exit value of certain loops, helping not to bloat the code.

The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116,
and looks a lot like the other tests in replace-loop-exit-folds.ll.

Differential Revision: https://reviews.llvm.org/D58435

llvm-svn: 355393
2019-03-05 12:12:18 +00:00
David Green 3bcb0aa7f9 [SCEV] Add some extra tests for IndVarSimplify's loop exit values. NFC.
Add some tests for various loops of the form:
  while(S >= 32) {
    S -= 32;
    something();
  };
  return S;
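
The value returned by a loop of this shape can be written without the loop, which is what an exit-value rewrite would have to materialize. A Python sketch follows; the function names are hypothetical, and Python integers stand in for a signed machine integer with no overflow.

```python
def loop_exit_value(s):
    """Direct transcription of the loop: subtract 32 while s >= 32."""
    while s >= 32:
        s -= 32
    return s

def closed_form(s):
    """Loop-free equivalent: values below 32 (including negatives) are
    returned unchanged, larger values are reduced modulo 32."""
    return s % 32 if s >= 32 else s
```

The closed form involves a remainder (or a division by a power of two), so whether it is cheap depends on what is being divided.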

llvm-svn: 355389
2019-03-05 11:18:55 +00:00
Florian Hahn 98f11a7d75 [SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR.
In some cases, MaxBECount can be less precise than ExactBECount for AND
and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are
undef, but the MaxBECounts are different, so we hit the assertion below. This
patch uses the same solution the AND case already uses.

Assertion failed:
   ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken))
     && "Exact is not allowed to be less precise than Max"), function ExitLimit

This patch also consolidates test cases for both AND and OR in a single
test case.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245

Reviewers: sanjoy, efriedma, mkazantsev

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D58853

llvm-svn: 355259
2019-03-02 02:31:44 +00:00
Max Kazantsev 2a184af221 [IndVars] Fix corner case with unreachable Phi inputs. PR40454
Logic in `getInsertPointForUses` doesn't account for a corner case when `Def`
only comes to a Phi user from unreachable blocks. In this case, the incoming
value may be arbitrary (and not even available in the input block) and break
the loop-related invariants that are asserted below.

In fact, if we encounter this situation, no IR modification is needed. This
Phi will be simplified away with nearest cleanup.

Differential Revision: https://reviews.llvm.org/D58045
Reviewed By: spatel

llvm-svn: 353816
2019-02-12 09:59:44 +00:00
Max Kazantsev 0136e7a246 [TEST] Add missing opportunity test for PR39673
llvm-svn: 353693
2019-02-11 12:58:18 +00:00
Max Kazantsev 8ec0c5e02f [TEST] Add failing test from PR40454
llvm-svn: 353688
2019-02-11 10:44:57 +00:00
Max Kazantsev 266c087b9d Return "[IndVars] Smart hard uses detection"
The patch has been reverted because it ended up prohibiting propagation
of a constant to exit value. For such values, we should skip all checks
related to hard uses because propagating a constant is always profitable.

Differential Revision: https://reviews.llvm.org/D53691

llvm-svn: 346397
2018-11-08 11:54:35 +00:00
Max Kazantsev c210c65e77 [NFC] Add motivating test case for revert in rL346198
llvm-svn: 346199
2018-11-06 02:12:44 +00:00
Max Kazantsev e059f4452b Revert "[IndVars] Smart hard uses detection"
This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402.

It seems that the check that we still should do the transform if we
know the result is constant is missing in this code. So the logic that
has been deleted by this change is still sometimes accidentally useful.
I revert the change to see what can be done about it. The motivating
case is the following:

@Y = global [400 x i16] zeroinitializer, align 1

define i16 @foo() {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i = phi i16 [ 0, %entry ], [ %inc, %for.body ]

  %arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i
  store i16 0, i16* %arrayidx, align 1
  %inc = add nuw nsw i16 %i, 1
  %cmp = icmp ult i16 %inc, 400
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  %inc.lcssa = phi i16 [ %inc, %for.body ]
  ret i16 %inc.lcssa
}

We should be able to figure out that the result is constant, but the patch
breaks it.
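
To make the expected constant concrete, here is a Python transcription of the IR above. It is only an illustration of the loop's behavior, not of how IndVars computes the exit value.

```python
def foo():
    """Mirror of @foo: zero a 400-element array, return the final %inc."""
    y = [0] * 400            # @Y = global [400 x i16] zeroinitializer
    i = 0                    # %i starts at 0 on entry
    while True:
        y[i] = 0             # store i16 0 into Y[i]
        inc = i + 1          # %inc = add nuw nsw i16 %i, 1
        if not inc < 400:    # %cmp = icmp ult i16 %inc, 400
            return inc       # %inc.lcssa is returned from for.end
        i = inc
```

The loop always exits with `%inc` equal to 400, so the returned value can be replaced by that constant.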

Differential Revision: https://reviews.llvm.org/D51584

llvm-svn: 346198
2018-11-06 02:02:05 +00:00
Max Kazantsev 3d347bf545 [IndVars] Smart hard uses detection
When rewriting loop exit values, IndVars considers this transform not profitable if
the loop instruction has a loop user which it believes cannot be optimized away.
In current implementation only calls that immediately use the instruction are considered
as such.

This patch extends the definition of "hard" users to any side-effecting instructions
(which usually cannot be optimized away from the loop) and also allows handling
of not just immediate users, but use chains.

Differential Revision: https://reviews.llvm.org/D51584
Reviewed By: etherzhhb

llvm-svn: 345814
2018-11-01 06:47:01 +00:00
Max Kazantsev 541f824d32 [IndVars] Strengthen restriction in rewriteLoopExitValues
For some unclear reason rewriteLoopExitValues considers recalculation
after the loop profitable if it has some "soft uses" outside the loop (i.e. any
use other than call and return), even if we have proved that it has a user inside
the loop which we think will not be optimized away.

There is no existing unit test that would explain this. This patch provides an
example when rematerialisation of exit value is not profitable but it passes
this check due to presence of a "soft use" outside the loop.

It makes no sense to recalculate value on exit if we are going to compute it
due to some irremovable user within the loop. This patch disallows applying this
transform in the described situation.

Differential Revision: https://reviews.llvm.org/D51581
Reviewed By: etherzhhb

llvm-svn: 345708
2018-10-31 10:30:50 +00:00
Max Kazantsev b2e51090a4 [IndVars] Drop "exact" flag from lshr and udiv when substituting their args
There is a transform that may replace `lshr (x+1), 1` with `lshr x, 1` in case
if it can prove that the result will be the same. However the initial instruction
might have an `exact` flag set, and it now should be dropped unless we prove
that it may hold. Incorrectly set `exact` attribute may then produce poison.

Differential Revision: https://reviews.llvm.org/D53061
Reviewed By: sanjoy

llvm-svn: 344223
2018-10-11 07:22:26 +00:00
Max Kazantsev 0994abda3a [IndVars] Remove unreasonable checks in rewriteLoopExitValues
A piece of logic in rewriteLoopExitValues has a weird check on number of
users which allowed an unprofitable transform in case if an instruction has
more than 6 users.

Differential Revision: https://reviews.llvm.org/D51404
Reviewed By: etherzhhb

llvm-svn: 342444
2018-09-18 04:57:18 +00:00
Matt Arsenault 9de2fb58fa AMDGPU: Fix some outdated datalayouts in tests
llvm-svn: 342131
2018-09-13 11:56:28 +00:00
Max Kazantsev 9aacaffd98 [NFC] Specify test's option to reduce reliance on defaults
llvm-svn: 341904
2018-09-11 06:34:43 +00:00
Max Kazantsev 4d10ba37b9 [IndVars] Set Changed if sinkUnusedInvariants changes IR. PR38863
Currently, `sinkUnusedInvariants` does not set Changed flag even if it makes
changes in the IR. There is no clear evidence that it can cause a crash, but it
looks highly suspicious and likely invalid.

Differential Revision: https://reviews.llvm.org/D51777
Reviewed By: skatkov

llvm-svn: 341777
2018-09-10 06:32:00 +00:00
Abderrazek Zaafrani c30dfb2dfc [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand.
Differential Revision: https://reviews.llvm.org/D49151

llvm-svn: 341726
2018-09-07 22:41:57 +00:00
Max Kazantsev 9e6845d8e1 [IndVars] Set Changed when we delete dead instructions. PR38855
IndVars does not set `Changed` flag when it eliminates dead instructions. As a result,
it may make IR modifications and report that it has done nothing. It leads to inconsistent
preserved analyses results.

Differential Revision: https://reviews.llvm.org/D51770
Reviewed By: skatkov

llvm-svn: 341633
2018-09-07 07:23:39 +00:00
Max Kazantsev e157cea3ec [NFC] Add test on full IV widening
llvm-svn: 341456
2018-09-05 10:10:59 +00:00
Max Kazantsev 2cbba56337 [IndVars] Fix usage of SCEVExpander to not mess with SCEVConstant. PR38674
This patch removes the function `expandSCEVIfNeeded`, which does not behave as
it was intended. This function tries to make a lookup for exact existing expansion
and only goes to normal expansion via `expandCodeFor` if this lookup hasn't found
anything. As a result of this, if some instruction above the loop has a `SCEVConstant`
SCEV, this logic will return this instruction when asked for this `SCEVConstant` rather
than return a constant value. This is both non-profitable and in some cases leads to
breach of LCSSA form (as in PR38674).

Whether or not it is possible to break LCSSA with this algorithm and with some
non-constant SCEVs is still in question, this is still being investigated. I wasn't
able to construct such a test so far, so maybe this situation is impossible. If it is,
it will go as a separate fix.

Rather than doing that, it is always correct to just invoke `expandCodeFor` unconditionally:
it behaves smarter about insertion points, and as a side effect of this it will choose a
constant value for SCEVConstants. For other SCEVs it may end up finding a better insertion
point. So it should not be worse in any case.

NOTE: So far the only known case for which this transform may break LCSSA is mapping
of SCEVConstant to an instruction. However there is a suspicion that the entire algorithm
can compromise LCSSA form for other cases as well (yet not proved).

Differential Revision: https://reviews.llvm.org/D51286
Reviewed By: etherzhhb

llvm-svn: 341345
2018-09-04 05:01:35 +00:00
Kit Barton 7c80f98b69 [PPC] Remove Darwin support from POWER backend.
This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.

Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.

Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.

llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.

All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.

Phabricator Review: https://reviews.llvm.org/D50988

llvm-svn: 340795
2018-08-28 01:18:29 +00:00
Max Kazantsev 4d980515d2 [SimplifyIndVar] Canonicalize comparisons to unsigned while eliminating truncs
This is a follow-up for the patch rL335020. When we replace compares against
trunc with compares against wide IV, we can also replace signed predicates with
unsigned where it is legal.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D48763

llvm-svn: 338115
2018-07-27 09:43:39 +00:00
Roman Tereshin ed047b0184 [SCEV] Add [zs]ext{C,+,x} -> (D + [zs]ext{C-D,+,x})<nuw><nsw> transform
as well as sext(C + x + ...) -> (D + sext(C-D + x + ...))<nuw><nsw>
similar to the equivalent transformation for zext's

if the top level addition in (D + (C-D + x * n)) could be proven to
not wrap, where the choice of D also maximizes the number of trailing
zeroes of (C-D + x * n), ensuring homogeneous behaviour of the
transformation and better canonicalization of such AddRec's

(indeed, there are 2^(2w) different expressions in `B1 + ext(B2 + Y)` form for
the same Y, but only 2^(2w - k) different expressions in the resulting `B3 +
ext((B4 * 2^k) + Y)` form, where w is the bit width of the integral type)

This patch generalizes sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and
sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformations added in
r209568 relaxing the requirements the following way:

1. C2 doesn't have to be a power of 2, it's enough if it's divisible by 2
 a sufficient number of times;
2. C1 doesn't have to be less than C2, instead of extracting the entire
  C1 we can split it into 2 terms: (00...0XXX + YY...Y000), keep the
  second one that may cause wrapping within the extension operator, and
  move the first one that doesn't affect wrapping out of the extension
  operator, enabling further simplifications;
3. C1 and C2 don't have to be positive, splitting C1 like shown above
 produces a sum that is guaranteed to not wrap, signed or unsigned;
4. in AddExpr case there could be more than 2 terms, and in case of
  AddExpr the 2nd and following terms and in case of AddRecExpr the
  Step component don't have to be in the C2*X form or constant
  (respectively), they just need to have enough trailing zeros,
  which in turn could be guaranteed by means other than arithmetics,
  e.g. by a pointer alignment;
5. the extension operator doesn't have to be a sext, the same
  transformation works and is profitable for zext's as well.

Apparently, optimizations like SLPVectorizer currently fail to
vectorize even rather trivial cases like the following:

 double bar(double *a, unsigned n) {
   double x = 0.0;
   double y = 0.0;
   for (unsigned i = 0; i < n; i += 2) {
     x += a[i];
     y += a[i + 1];
   }
   return x * y;
 }

If compiled with `clang -std=c11 -Wpedantic -Wall -O3 main.c -S -o - -emit-llvm`
(!{!"clang version 7.0.0 (trunk 337339) (llvm/trunk 337344)"})

it produces scalar code with the loop not unrolled with the unsigned `n` and
`i` (like shown above), but vectorized and unrolled loop with signed `n` and
`i`. With the changes made in this commit the unsigned version will be
vectorized (though not unrolled for unclear reasons).

How it all works:

Let's say we have an AddExpr that looks like (C + x + y + ...), where C
is a constant and x, y, ... are arbitrary SCEVs. Let's compute the
minimum number of trailing zeroes guaranteed of that sum w/o the
constant term: (x + y + ...). If, for example, those terms look like
follows:

        i
XXXX...X000
YYYY...YY00
   ...
ZZZZ...0000

then the rightmost non-guaranteed-zero bit (a potential one at i-th
position above) can change the bits of the sum to the left (and at
i-th position itself), but it can not possibly change the bits to the
right. So we can compute the number of trailing zeroes by taking a
minimum between the numbers of trailing zeroes of the terms.

Now let's say that our original sum with the constant is effectively
just C + X, where X = x + y + .... Let's also say that we've got 2
guaranteed trailing zeros for X:

         j
CCCC...CCCC
XXXX...XX00  // this is X = (x + y + ...)

Any bit of C to the left of j may in the end cause the C + X sum to
wrap, but the rightmost 2 bits of C (at positions j and j - 1) do not
affect wrapping in any way. If the upper bits cause a wrap, it will be
a wrap regardless of the values of the 2 least significant bits of C.
If the upper bits do not cause a wrap, it won't be a wrap regardless
of the values of the 2 bits on the right (again).

So let's split C to 2 constants like follows:

0000...00CC  = D
CCCC...CC00  = (C - D)

and represent the whole sum as D + (C - D + X). The second term of
this new sum looks like this:

CCCC...CC00
XXXX...XX00
-----------  // let's add them up
YYYY...YY00

The sum above (let's call it Y) may or may not wrap, we don't know,
so we need to keep it under a sext/zext. Adding D to that sum though
will never wrap, signed or unsigned, if performed on the original bit
width or the extended one, because all that that final add does is
setting the 2 least significant bits of Y to the bits of D:

YYYY...YY00 = Y
0000...00CC = D
-----------  <nuw><nsw>
YYYY...YYCC

Which means we can safely move that D out of the sext or zext and
claim that the top-level sum neither sign wraps nor unsigned wraps.

Let's run an example, let's say we're working in i8's and the original
expression (zext's or sext's operand) is 21 + 12x + 8y. So it goes
like this:

0001 0101  // 21
XXXX XX00  // 12x
YYYY Y000  // 8y

0001 0101  // 21
ZZZZ ZZ00  // 12x + 8y

0000 0001  // D
0001 0100  // 21 - D = 20
ZZZZ ZZ00  // 12x + 8y

0000 0001  // D
WWWW WW00  // 21 - D + 12x + 8y = 20 + 12x + 8y

therefore zext(21 + 12x + 8y) = (1 + zext(20 + 12x + 8y))<nuw><nsw>
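
The worked example can be checked exhaustively. The Python sketch below assumes i8 operands as in the example; the helper names are hypothetical, and it only verifies the arithmetic identity without modelling SCEV itself.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def split_constant(c, trailing_zeros):
    """Split C into D (its low `trailing_zeros` bits) and C - D."""
    d = c & ((1 << trailing_zeros) - 1)
    return d, c - d

def wrapped_sum(c, x, y):
    """c + 12x + 8y evaluated in i8, i.e. with wrap-around."""
    return (c + 12 * x + 8 * y) & MASK

def zext(v):
    """Zero-extension keeps the unsigned value."""
    return v

def sext(v):
    """Sign-extension reinterprets the i8 bit pattern as signed."""
    return v - (1 << WIDTH) if v & (1 << (WIDTH - 1)) else v

def transform_is_valid(ext):
    """Check ext(21 + 12x + 8y) == 1 + ext(20 + 12x + 8y) for all i8 x, y."""
    d, rest = split_constant(21, 2)   # 12x + 8y has 2 guaranteed trailing zeros
    values = range(MASK + 1)
    return all(ext(wrapped_sum(21, x, y)) == d + ext(wrapped_sum(rest, x, y))
               for x in values for y in values)
```

Both `transform_is_valid(zext)` and `transform_is_valid(sext)` hold, because adding D only fills in the two low bits that are guaranteed to be zero in the inner sum.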

This approach could be improved if we move away from using trailing
zeroes and use KnownBits instead. For instance, with KnownBits we could
have the following picture:

    i
10 1110...0011  // this is C
XX X1XX...XX00  // this is X = (x + y + ...)

Notice that some of the bits of X are known ones, also notice that
known bits of X are interspersed with unknown bits and not grouped on
the right or left.

We can see at the position i that C(i) and X(i) are both known ones,
therefore the (i + 1)th carry bit is guaranteed to be 1 regardless of
the bits of C to the right of i. For instance, the C(i - 1) bit only
affects the bits of the sum at positions i - 1 and i, and does not
influence if the sum is going to wrap or not. Therefore we could split
the constant C the following way:

    i
00 0010...0011  = D
10 1100...0000  = (C - D)

Let's compute the KnownBits of (C - D) + X:

XX1 1            = carry bit, blanks stand for known zeroes
 10 1100...0000  = (C - D)
 XX X1XX...XX00  = X
--- -----------
 XX X0XX...XX00

Whether this add wraps or not essentially depends on bits of X. Adding D
to this sum, however, is guaranteed not to wrap:

0    X
 00 0010...0011  = D
 sX X0XX...XX00  = (C - D) + X
--- -----------
 sX XXXX   XX11

As could be seen above, adding D preserves the sign bit of (C - D) +
X, if any, and has a guaranteed 0 carry out, as expected.

The more bits of (C - D) we constrain, the better the transformations
introduced here canonicalize expressions as it leaves less freedom to
what values the constant part of ((C - D) + x + y + ...) can take.

Reviewed By: mzolotukhin, efriedma

Differential Revision: https://reviews.llvm.org/D48853

llvm-svn: 337943
2018-07-25 18:01:41 +00:00
Max Kazantsev f5ba37182e [IndVarSimplify] Ignore unreachable users of truncs
If a trunc has a user in a block which is not reachable from entry,
we can safely perform trunc elimination as if this user didn't exist.

llvm-svn: 335816
2018-06-28 08:20:03 +00:00
Max Kazantsev 37da4333a8 [SimplifyIndVars] Eliminate redundant truncs
This patch adds logic to deal with the following constructions:

  %iv = phi i64 ...
  %trunc = trunc i64 %iv to i32
  %cmp = icmp <pred> i32 %trunc, %invariant

Replacing it with
  %iv = phi i64 ...
  %cmp = icmp <pred> i64 %iv, sext/zext(%invariant)

In case if it is legal. Specifically, if `%iv` has signed comparison users, it is
required that `sext(trunc(%iv)) == %iv`, and if it has unsigned comparison
uses then we require `zext(trunc(%iv)) == %iv`. The current implementation
bails if `%trunc` has other uses than `icmp`, but in theory we can handle more
cases here (e.g. if the user of trunc is bitcast).
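
The legality condition for the signed case can be illustrated with a small brute-force model. This Python sketch uses an 8-bit "wide" IV and a 4-bit "narrow" type so that every value can be enumerated; these widths and the helper names are assumptions chosen for the illustration.

```python
WIDE, NARROW = 8, 4   # stand-ins for i64 and i32

def to_signed(v, width):
    """Interpret the low `width` bits of v as a two's-complement integer."""
    v &= (1 << width) - 1
    return v - (1 << width) if v >> (width - 1) else v

def narrow_slt(iv, inv):
    """icmp slt (trunc iv), inv, evaluated in the narrow type."""
    return to_signed(iv, NARROW) < to_signed(inv, NARROW)

def wide_slt(iv, inv):
    """icmp slt iv, sext(inv), evaluated in the wide type."""
    return to_signed(iv, WIDE) < to_signed(inv, NARROW)

def is_legal(iv):
    """The required condition: sext(trunc(iv)) == iv."""
    return to_signed(iv, NARROW) == to_signed(iv, WIDE)

def rewrite_is_sound():
    """Whenever the condition holds, both comparisons agree."""
    return all(narrow_slt(iv, inv) == wide_slt(iv, inv)
               for iv in range(1 << WIDE) if is_legal(iv)
               for inv in range(1 << NARROW))
```

When the condition fails the two comparisons can disagree: with `iv = 16` the truncated value is 0, so the narrow comparison against 1 is true while the wide one is false.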

Differential Revision: https://reviews.llvm.org/D47928
Reviewed By: reames

llvm-svn: 335020
2018-06-19 04:48:34 +00:00
Sanjoy Das 6e9b355cc9 Revert "[SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags"
This reverts r334428.  It incorrectly marks some multiplications as nuw.  Tim
Shen is working on a proper fix.

Original commit message:

[SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags where safe.

Summary:
Previously we would add them for adds, but not multiplies.

llvm-svn: 335016
2018-06-19 04:09:44 +00:00
Max Kazantsev 0ed79620c6 [SimplifyIndVars] Ignore dead users
IndVarSimplify sometimes makes transforms based on users that are trivially dead. In particular,
if DCE wasn't run before it, there may be a dead `sext/zext` in loop that will trigger widening
transforms, however it makes no sense to do it.

This patch teaches IndVarSimplify to ignore the most trivial cases of that.

Differential Revision: https://reviews.llvm.org/D47974
Reviewed By: sanjoy

llvm-svn: 334567
2018-06-13 02:25:32 +00:00
Justin Lebar aa4fec94d8 [SCEV] Add nuw/nsw to mul ops in StrengthenNoWrapFlags where safe.
Summary:
Previously we would add them for adds, but not multiplies.

Reviewers: sanjoy

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D48038

llvm-svn: 334428
2018-06-11 18:57:42 +00:00
Shiva Chen 2c864551df [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.
In order to set breakpoints on labels and list source code around
labels, we need to collect debug information for labels, i.e., the label
name, the function the label belongs to, the line number in the file, and the
address where the label is located. In order to keep this information in LLVM
IR and to allow the backend to generate debug information correctly,
we create a new kind of metadata for labels, DILabel. The format
of DILabel is

!DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep as much debug information as possible even when the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata being eliminated with the basic block.
The intrinsic will keep existing as long as we keep it from being optimized out.
The format of the intrinsic is

llvm.dbg.label(metadata !1)

It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.

We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

llvm-svn: 331841
2018-05-09 02:40:45 +00:00
Max Kazantsev 613af1f7ca [SCEV] Prove implications for SCEVUnknown Phis
This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis.
If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values
`L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also
prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient
to prove the predicate for values that come from the same predecessor block.

The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0`
given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and
non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot.

Differential Revision: https://reviews.llvm.org/D44001
Reviewed By: anna

llvm-svn: 329150
2018-04-04 05:46:47 +00:00
Max Kazantsev 7094c8deb2 [SCEV] Make exact taken count calculation more optimistic
Currently, `getExact` fails if it sees two exit counts in different blocks. There is
no solid reason to do so, given that we only calculate exact non-taken count
for exiting blocks that dominate latch. Using this fact, we can simply take min
out of all exits of all blocks to get the exact taken count.

This patch makes the calculation more optimistic with enforcing our assumption
with asserts. It allows us to calculate exact backedge taken count in trivial loops
like

  for (int i = 0; i < 100; i++) {
    if (i > 50) break;
    . . .
  }
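
For the loop above, taking the minimum over both exits gives the exact backedge-taken count. The Python sketch below simulates the loop and compares the result with that minimum; the function names and the per-exit counts are worked out by hand for this example and are not SCEV output.

```python
def count_backedges(header_limit=100, break_after=50):
    """Run the loop and count how many times the backedge is taken."""
    i = 0
    taken = 0
    while i < header_limit:      # header exit: i < 100
        if i > break_after:      # early exit: if (i > 50) break
            break
        i += 1                   # latch: i++
        taken += 1
    return taken

def exit_counts(header_limit=100, break_after=50):
    """Per-exit counts, each computed as if it were the only exit."""
    header_exit = header_limit       # i < limit first fails after `limit` backedges
    break_exit = break_after + 1     # i > break_after first holds at i == break_after + 1
    return header_exit, break_exit
```

With the default parameters the per-exit counts are 100 and 51, and the simulated loop takes the backedge 51 times.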

Differential Revision: https://reviews.llvm.org/D44676
Reviewed By: fhahn

llvm-svn: 328611
2018-03-27 07:30:38 +00:00
Max Kazantsev a63d333881 [SCEV] Add one more case in computeConstantDifference
This patch teaches `computeConstantDifference` to handle calculation of constant
difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`.

Differential Revision: https://reviews.llvm.org/D43759
Reviewed By: anna

llvm-svn: 328609
2018-03-27 04:54:00 +00:00
Serguei Katkov 529f42331e [SCEV] Re-land: Fix isKnownPredicate
This is re-land of https://reviews.llvm.org/rL327362 with a fix
and regression test.

The crash was possible because, for the found MDL loop,
LHS or RHS may contain an invariant unknown SCEV which
does not dominate the MDL. Please see regression
test for an example.

Reviewers: sanjoy, mkazantsev, reames
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44553

llvm-svn: 327822
2018-03-19 06:35:30 +00:00
Serguei Katkov bbfbf21ddc Revert [SCEV] Fix isKnownPredicate
It is a revert of rL327362 which causes build bot failures with assert like

Assertion `isAvailableAtLoopEntry(RHS, L) && "RHS is not available at Loop Entry"' failed.

llvm-svn: 327363
2018-03-13 06:36:00 +00:00
Serguei Katkov b05574c0d3 [SCEV] Fix isKnownPredicate
IsKnownPredicate is updated to implement the following algorithm
proposed by @sanjoy and @mkazantsev :
isKnownPredicate(Pred, LHS, RHS) {
  Collect set S all loops on which either LHS or RHS depend.
  If S is non-empty
    a. Let PD be the element of S which is dominated by all other elements of S
    b. Let E(LHS) be value of LHS on entry of PD.
       To get E(LHS), we should just take LHS and replace all AddRecs that
       are attached to PD on with their entry values.
       Define E(RHS) in the same way.
    c. Let B(LHS) be value of LHS on backedge of PD.
       To get B(LHS), we should just take LHS and replace all AddRecs that
       are attached to PD on with their backedge values.
       Define B(RHS) in the same way.
    d. Note that E(LHS) and E(RHS) are automatically available on entry of PD,
       so we can assert on that.
    e. Return true if isLoopEntryGuardedByCond(Pred, E(LHS), E(RHS)) &&
                      isLoopBackedgeGuardedByCond(Pred, B(LHS), B(RHS))
Return true if Pred, L, R is known from ranges, splitting etc.
}
This is follow-up for https://reviews.llvm.org/D42417.

Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy, mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43507

llvm-svn: 327362
2018-03-13 06:10:27 +00:00
Max Kazantsev 6e4ce23add [NFC] Fix metadata placement in test
llvm-svn: 325215
2018-02-15 07:13:18 +00:00
Max Kazantsev c5941d12f4 [SCEV] Favor isKnownViaSimpleReasoning over constant ranges check
There is a more powerful but still simple function `isKnownViaSimpleReasoning` that
does a constant range check and a few additional checks. We use it in some places (e.g.
when proving implications), and in other places we only check constant ranges.

Currently, the indvar simplifier fails to remove the check in the following loop:

  int inc = ...;
  for (int i = inc, j = inc - 1; i < 200; ++i, ++j)
    if (i > j) { ... }

This patch replaces all usages of `isKnownPredicateViaConstantRanges` with
`isKnownViaSimpleReasoning` to have smarter proofs. In particular, it fixes the
case above.
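
As a rough illustration of why the check is redundant, the loop can be modelled in Python. This is only a sketch under the assumption of unbounded integers (no overflow of `inc - 1`); the function name `check_always_true` is hypothetical and does not correspond to any LLVM API.

```python
def check_always_true(inc, bound=200):
    """Simulate the loop and report whether `i > j` held on every iteration."""
    i, j = inc, inc - 1
    while i < bound:
        if not (i > j):
            return False
        # Both variables advance by one, so the difference i - j stays 1.
        i += 1
        j += 1
    return True
```

Because `i - j` is the loop-invariant constant 1, the comparison never depends on the iteration number, which is the kind of fact simple reasoning beyond constant ranges can establish.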

Reviewed-By: sanjoy
Differential Revision: https://reviews.llvm.org/D43175

llvm-svn: 325214
2018-02-15 07:09:00 +00:00
Max Kazantsev b299ade2c5 Re-enable "[SCEV] Make isLoopEntryGuardedByCond a bit smarter"
The failures happened because of assert which was overconfident about
SCEV's proving capabilities and is generally not valid.

Differential Revision: https://reviews.llvm.org/D42835

llvm-svn: 324473
2018-02-07 11:16:29 +00:00
Serguei Katkov 69246ca787 Revert [SCEV] Make isLoopEntryGuardedByCond a bit smarter
Revert rL324453 commit which causes buildbot failures.

Differential Revision: https://reviews.llvm.org/D42835

llvm-svn: 324462
2018-02-07 09:10:08 +00:00
Max Kazantsev dd5ee6f5d9 [SCEV] Make isLoopEntryGuardedByCond a bit smarter
Sometimes `isLoopEntryGuardedByCond` cannot prove predicate `a > b` directly.
But it is a common situation when `a >= b` is known from ranges and `a != b` is
known from a dominating condition. This patch teaches SCEV to sum these facts
together and prove strict comparison via non-strict one.
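
The logical step being used can be checked exhaustively on a small domain. The sketch below is an illustration only; `strict_from_nonstrict` and `facts_agree` are hypothetical helper names, not SCEV functions.

```python
from itertools import product


def strict_from_nonstrict(a, b):
    """Combine the two known facts `a >= b` and `a != b` into `a > b`."""
    known_ge = a >= b  # e.g. known from ranges
    known_ne = a != b  # e.g. known from a dominating condition
    return known_ge and known_ne


def facts_agree(lo=-8, hi=8):
    """Check that the combined facts match `a > b` for every pair in [lo, hi)."""
    return all(
        strict_from_nonstrict(a, b) == (a > b)
        for a, b in product(range(lo, hi), repeat=2)
    )
```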

Differential Revision: https://reviews.llvm.org/D42835

llvm-svn: 324453
2018-02-07 07:56:26 +00:00
Serguei Katkov ec7029c286 Re-apply [SCEV] Fix isLoopEntryGuardedByCond usage
ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without checking
that the SCEV is available at the entry point of the loop. This is incorrect and is fixed by this patch.

Two bugs additionally fixed:
assert is moved after the check whether loop is not a nullptr.
Usage of isLoopEntryGuardedByCond in ScalarEvolution::isImpliedCondOperandsViaNoOverflow
is guarded by isAvailableAtLoopEntry.

Reviewers: sanjoy, mkazantsev, anna, dorit, reames
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42417

llvm-svn: 324204
2018-02-05 05:49:47 +00:00
Serguei Katkov f38041dc3e Revert [SCEV] Fix isLoopEntryGuardedByCond usage
It causes buildbot failures. The newly added assert fires.
It seems not all usages of isLoopEntryGuardedByCond are fixed.

llvm-svn: 323079
2018-01-22 07:47:02 +00:00
Serguei Katkov 50714a1cbc [SCEV] Fix isLoopEntryGuardedByCond usage
ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without checking
that the SCEV is available at the entry point of the loop. This is incorrect and is fixed by this patch.

Reviewers: sanjoy, mkazantsev, anna, dorit
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42165

llvm-svn: 323077
2018-01-22 07:31:41 +00:00
Serguei Katkov 67da7696a0 [SCEV] Fix the movement of insertion point in expander. PR35406.
We cannot move the insertion point to the header if the SCEV contains div/rem
operations, because they may be moved above the check for a zero denominator.

Reviewers: sanjoy, mkazantsev, sebpop
Reviewed By: sebpop
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41229

llvm-svn: 320789
2017-12-15 05:24:42 +00:00
Bjorn Pettersson 33c9d5535f [ScalarEvolution] Fix base condition in isNormalAddRecPHI.
Summary:
The function is meant to recurse until it comes upon the
phi it's looking for. However, with the current condition,
it will recurse until it finds anything _but_ the phi.

The function will even fail for simple cases like:
  %i = phi i32 [ %inc, %loop ], ...
  ...
  %inc = add i32 %i, 1

because the base condition will not happen when the phi
is recursed to, and the recursion will end with a 'false'
result since the previous instruction is a phi.

Reviewers: sanjoy, atrick

Reviewed By: sanjoy

Subscribers: Ka-Ka, bjope, llvm-commits

Committing on behalf of: Bevin Hansson (bevinh)

Differential Revision: https://reviews.llvm.org/D40946

llvm-svn: 320700
2017-12-14 14:47:52 +00:00
Philip Reames 6260cf71d3 [IndVars] Fix a bug introduced in r317012
Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant.  In this case, there is no loop invariant value contributing and we'd fail an assert.

The test case was found by a Java fuzzer and reduced.  It's a real corner case.  You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate.  To my knowledge, only the fuzzer has hit this case.

llvm-svn: 319583
2017-12-01 20:57:19 +00:00
Adrian Prantl fbb6fbf709 IndVarSimplify: preserve debug information attached to widened PHI nodes.
This fixes PR35015.

https://bugs.llvm.org/show_bug.cgi?id=35015

Differential Revision: https://reviews.llvm.org/D39345

llvm-svn: 317282
2017-11-02 23:17:06 +00:00
Philip Reames 59bf1e0548 [IndVarSimplify] Simplify code using preheader assumption
As noted in the nice block comment, the previous code didn't actually handle multi-entry loops correctly, it just assumed SCEV didn't analyze such loops.  Given SCEV has comments to the contrary, that seems a bit suspect.  More importantly, the pass actually requires loopsimplify form which ensures a loop-preheader is available.  Remove the excessive generality and shorten the code greatly.

Note that we do successfully analyze many multi-entry loops, but we do so by converting them to single entry loops.  See the added test case.

llvm-svn: 316976
2017-10-31 05:16:46 +00:00
Max Kazantsev 52d0a49046 Revert rL316568 because of sudden performance drop on ARM
llvm-svn: 316739
2017-10-27 04:17:44 +00:00
Max Kazantsev b6d40067af [SCEV] Enhance SCEVFindUnsafe for division
This patch allows the SCEVFindUnsafe algorithm to treat division by any non-zero
value as safe. Previously, it could only recognize non-zero constants.

Differential Revision: https://reviews.llvm.org/D39228

llvm-svn: 316568
2017-10-25 11:07:43 +00:00
Hongbin Zheng d36f2030e2 [SimplifyIndVar] Replace IVUsers with loop invariant whenever possible
Differential Revision: https://reviews.llvm.org/D38415

llvm-svn: 315551
2017-10-12 02:54:11 +00:00
Hongbin Zheng c8abdf5f25 [SimplifyIndVar] Do not fail when we constant fold an IV user to ConstantPointerNull
The type of a SCEVConstant may not match the corresponding LLVM Value.
In this case, we skip the constant folding for now.

TODO: Replace ConstantInt Zero by ConstantPointerNull
llvm-svn: 314531
2017-09-29 16:32:12 +00:00
Hongbin Zheng d1b7b2efba [SimplifyIndVar] Constant fold IV users
This patch tries to transform cases like:

for (unsigned i = 0; i < N; i += 2) {
  bool c0 = (i & 0x1) == 0;
  bool c1 = ((i + 1) & 0x1) == 1;
}
To

for (unsigned i = 0; i < N; i += 2) {
  bool c0 = true;
  bool c1 = true;
}

This commit also updates test/Transforms/IndVarSimplify/replace-srem-by-urem.ll to prevent constant folding.
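
The folded values can be sanity-checked with a direct simulation. This is a sketch in Python rather than the C of the example above, and `folded_conditions_hold` is a hypothetical name; it assumes `i` starts at 0 and only ever advances by 2, as in the loop shown.

```python
def folded_conditions_hold(n):
    """Return True if c0 and c1 are both true on every iteration of the loop."""
    for i in range(0, n, 2):
        c0 = (i & 0x1) == 0        # i is always even
        c1 = ((i + 1) & 0x1) == 1  # i + 1 is always odd
        if not (c0 and c1):
            return False
    return True
```

Unsigned wraparound does not change the parity of `i`, since the modulus is a power of two and the step is 2, so the same reasoning applies to the fixed-width case.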

Differential Revision: https://reviews.llvm.org/D38272

llvm-svn: 314266
2017-09-27 03:11:46 +00:00
Hongbin Zheng f0093e45c4 [SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative
Since SCEV can now handle 'urem', a 'urem' is a better canonical form than an 'srem' because it has well-defined behavior.
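
A small model shows why the replacement is only valid for non-negative operands. This is a sketch assuming 32-bit values; `srem` and `urem` here are plain Python stand-ins for the IR instructions, not LLVM APIs.

```python
def srem(a, b):
    """Signed remainder truncating toward zero: the sign follows the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def urem(a, b, bits=32):
    """Unsigned remainder of the operands reinterpreted as `bits`-wide values."""
    mask = (1 << bits) - 1
    return (a & mask) % (b & mask)


def agree_on_non_negative(limit=50):
    """Compare both remainders for all 0 <= a < limit and 1 <= b < limit."""
    return all(
        srem(a, b) == urem(a, b)
        for a in range(limit)
        for b in range(1, limit)
    )
```

With a negative dividend the two differ, for example `srem(-7, 2)` is -1 while `urem(-7, 2)` is 1, which is why both operands must be proven non-negative first.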

This is a follow up of D34598

Differential Revision: https://reviews.llvm.org/D38072

llvm-svn: 314125
2017-09-25 17:39:40 +00:00
Sanjoy Das 4cad61adb3 [SCEV/IndVars] Always compute loop exiting values if the backedge count is 0
If SCEV can prove that the backedge taken count for a loop is zero, it does not
need to "understand" a recursive PHI to compute its exiting value.

This should fix PR33885.

llvm-svn: 309758
2017-08-01 22:37:58 +00:00
Max Kazantsev b9edcbcb1d Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars"
The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp
instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate.
In this patch we use the original predicate to do the transformation.

Also added a test case that exercises this situation.

Differential Revision: https://reviews.llvm.org/D35107

llvm-svn: 307477
2017-07-08 17:17:30 +00:00
Max Kazantsev 98838527c6 Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"""
It appears that the problem is still there. Needs more analysis to understand why
the SaturatingMultiply test fails.

llvm-svn: 307249
2017-07-06 10:47:13 +00:00
Max Kazantsev c8db20b78c Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""
It seems that the patch was reverted by mistake. Clang testing showed failure of the
MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the
fresh code base and was able to confirm that the transformation introduced by the change
does not happen in the said test. This gives a strong confidence that the actual reason of
the failure of the initial patch was somewhere else, and that problem now seems to be
fixed. Re-submitting the change to confirm that.

llvm-svn: 307244
2017-07-06 09:57:41 +00:00
David Green b26a0a460c [IndVarSimplify] Add AShr exact flags using induction variables ranges.
This adds exact flags to AShr/LShr instructions where we can statically
prove it is valid using the range of induction variables. This
allows further optimisations to remove extra loads.

Differential Revision: https://reviews.llvm.org/D34207

llvm-svn: 307157
2017-07-05 13:25:58 +00:00
Max Kazantsev ebe56283bc Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"
This patch seems to cause failures of test MathExtras.SaturatingMultiply on
multiple buildbots. Reverting until the reason of that is clarified.

Differential Revision: https://reviews.llvm.org/rL307126

llvm-svn: 307135
2017-07-05 09:44:41 +00:00
Max Kazantsev 80bc4a5554 [IndVars] Canonicalize comparisons between non-negative values and indvars
If there is an IndVar which is known to be non-negative, and there is a value which is also non-negative,
then signed and unsigned comparisons between them produce the same result. Both of those can be
seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like

  %c = icmp slt i32 %iv, %b
to

  %c = icmp ult i32 %iv, %b

if both %iv and %b are known to be non-negative.
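
The equivalence can be illustrated with a small bit-level model. This is a sketch assuming 32-bit two's complement values; `icmp_slt`, `icmp_ult` and `agree_on` are hypothetical Python helpers that mimic the IR predicates.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret the low 32 bits of x as a two's complement signed value."""
    x &= MASK
    return x - (1 << BITS) if x >> (BITS - 1) else x


def icmp_slt(a, b):
    """Signed less-than on 32-bit patterns."""
    return to_signed(a) < to_signed(b)


def icmp_ult(a, b):
    """Unsigned less-than on 32-bit patterns."""
    return (a & MASK) < (b & MASK)


def agree_on(samples):
    """True if both predicates give the same answer for every pair of samples."""
    return all(icmp_slt(a, b) == icmp_ult(a, b) for a in samples for b in samples)
```

When the sign bit of either operand may be set, the predicates can disagree: for the bit pattern `0xFFFFFFFF` (signed -1) compared with 1, `icmp_slt` is true and `icmp_ult` is false.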

Differential Revision: https://reviews.llvm.org/D34979

llvm-svn: 307126
2017-07-05 06:38:49 +00:00
Max Kazantsev eac01d4c62 [SCEV] Make MulOpsInlineThreshold lower to avoid excessive compilation time
MulOpsInlineThreshold option of SCEV is defaulted to 1000, which is inadequately high.
When constructing SCEVs of expressions like:

  x1 = a * a
  x2 = x1 * x1
  x3 = x2 * x2
    ...

We actually have huge SCEVs with max allowed amount of operands inlined.
Such expressions are easy to get from unrolling of loops looking like

  x = a
  for (i = 0; i < n; i++)
    x = x * x

Or more tricky cases where big powers are involved. If some non-linear analysis
tries to work with a SCEV that has 1000 operands, it may lead to excessively long
compilation. The attached test does not pass within 1 minute with default threshold.

This patch decreases its default value to 32, which looks much more reasonable if we
use analyses with complexity O(N^2) or O(N^3) working with SCEV.
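
To give a feel for the growth, the sketch below counts the leaf factors of `x_k` when every product is fully flattened. It is only an operand count under that assumption, not a model of how SCEV applies the threshold; both function names are hypothetical.

```python
def inlined_mul_operands(squarings):
    """Number of leaf factors of x_k if each product is fully flattened."""
    operands = 1  # x0 = a
    for _ in range(squarings):
        operands *= 2  # x_{k+1} = x_k * x_k doubles the factor count
    return operands


def squarings_until(threshold):
    """Smallest k whose flattened x_k has at least `threshold` factors."""
    k = 0
    while inlined_mul_operands(k) < threshold:
        k += 1
    return k
```

Under this model, ten squarings are enough to reach a 1000-operand limit, while five reach a limit of 32.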

Differential Revision: https://reviews.llvm.org/D34397

llvm-svn: 305882
2017-06-21 07:28:13 +00:00
Serguei Katkov 38414b57f9 [IndVars] Add an option to be able to disable LFTR
This change adds an option, disable-lftr, to disable the Linear Function Test Replace optimization.
By default the option is off, so current behavior is not changed.

Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979

llvm-svn: 305055
2017-06-09 06:11:59 +00:00
Daniel Berlin 74ffa5c62f ConstantFold: Fold getelementptr (i32, i32* null, i64 undef) to null.
Transforms/IndVarSimplify/2011-10-27-lftrnull will fail if this regresses.
Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll has been changed to still test what it was
trying to test.

llvm-svn: 302446
2017-05-08 17:37:29 +00:00
Matt Arsenault f10061ec70 Add address space mangling to lifetime intrinsics
In preparation for allowing allocas to have non-0 addrspace.

llvm-svn: 299876
2017-04-10 20:18:21 +00:00
Hongbin Zheng bfd7c38de7 [SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its operands are non-negative
Since there is no sdiv in SCEV, a 'udiv' is a better canonical form than an 'sdiv' as the user of the induction variable.

Differential Revision: https://reviews.llvm.org/D31488

llvm-svn: 299118
2017-03-30 21:56:56 +00:00
Matt Arsenault 3dbeefa978 AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444
2017-03-21 21:39:51 +00:00
Sanjoy Das 39a684d117 [ValueTracking] Don't do an unchecked shift in ComputeNumSignBits
Summary:
Previously we used to return a bogus result, 0, for IR like `ashr %val,
-1`.

I've also added an assert checking that `ComputeNumSignBits` at least
returns 1.  That assert found an already checked in test case where we
were returning a bad result for `ashr %val, -1`.

Fixes PR32045.

Reviewers: spatel, majnemer

Reviewed By: spatel, majnemer

Subscribers: efriedma, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30311

llvm-svn: 296273
2017-02-25 20:30:45 +00:00
Peter Collingbourne 10c500ddc0 opt: Rename -default-data-layout flag to -data-layout and make it always override the layout.
There isn't much point in a flag that only works if the data layout is empty.

Differential Revision: https://reviews.llvm.org/D30014

llvm-svn: 295468
2017-02-17 17:36:52 +00:00
Wei Mi d2948cef70 [IndVars] Change the order to compute WidenAddRec in widenIVUse.
When both WidenIV::getWideRecurrence and WidenIV::getExtendedOperandRecurrence
return non-null but different WideAddRec, if getWideRecurrence is called
before getExtendedOperandRecurrence, we won't bother to call
getExtendedOperandRecurrence again. But as we know, it is possible that after
SCEV folding, we cannot prove the legality using the SCEVAddRecExpr returned
by getWideRecurrence. Meanwhile if getExtendedOperandRecurrence returns non-null
WideAddRec, we know for sure that it is legal to do widening for current instruction.
So it is better to put getExtendedOperandRecurrence before getWideRecurrence, which
will increase the chance of successful widening.

Differential Revision: https://reviews.llvm.org/D26059

llvm-svn: 286987
2016-11-15 17:34:52 +00:00
Artur Pilipenko 5c6ef75485 [IndVarSimplify] Teach calculatePostIncRange to take guards into account
Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D25739

llvm-svn: 284632
2016-10-19 19:43:54 +00:00
Artur Pilipenko f2d5dc5dc6 [IndVarSimplify] Use control-dependent range information to prove non-negativity
This change is motivated by the case when IndVarSimplify doesn't widen a comparison of IV increment because it can't prove IV increment being non-negative. We end up with a redundant trunc of the widened increment on this example.

for.body:
  %i = phi i32 [ %start, %for.body.lr.ph ], [ %i.inc, %for.inc ]
  %within_limits = icmp ult i32 %i, 64
  br i1 %within_limits, label %continue, label %for.end

continue:
  %i.i64 = zext i32 %i to i64
  %arrayidx = getelementptr inbounds i32, i32* %base, i64 %i.i64
  %val = load i32, i32* %arrayidx, align 4
  br label %for.inc

for.inc:
  %i.inc = add nsw nuw i32 %i, 1
  %cmp = icmp slt i32 %i.inc, %limit
  br i1 %cmp, label %for.body, label %for.end

There is a range check inside of the loop which guarantees the IV to be non-negative. NSW on the increment guarantees that the increment is also non-negative. Teach IndVarSimplify to use the range check to prove non-negativity of loop increments.
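
The reasoning can be spot-checked on 32-bit bit patterns. This is a sketch only; `passes_range_check` and `increment_non_negative` are hypothetical names, and the sample list is an assumption chosen to cover the boundaries.

```python
BITS = 32
MASK = (1 << BITS) - 1


def to_signed(x):
    """Interpret the low 32 bits of x as a two's complement signed value."""
    x &= MASK
    return x - (1 << BITS) if x >> (BITS - 1) else x


def passes_range_check(i_bits):
    """Model of `icmp ult i32 %i, 64`."""
    return (i_bits & MASK) < 64


def increment_non_negative(samples):
    """For every value passing the range check, %i and %i + 1 are non-negative."""
    for bits in samples:
        if not passes_range_check(bits):
            continue  # control flow leaves the loop before the increment
        i = to_signed(bits)
        inc = i + 1  # nsw: the signed addition does not wrap
        if i < 0 or inc < 0:
            return False
    return True
```

Any bit pattern with the sign bit set is at least 2^31 when read as unsigned, so it fails the `ult 64` check and never reaches the increment.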

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D25738

llvm-svn: 284629
2016-10-19 18:59:03 +00:00
Evgeny Stupachenko dc8a254663 Wisely choose sext or zext when widening IV.
Summary:
The patch fixes regression caused by two earlier patches D18777 and D18867.

Reviewers: reames, sanjoy

Differential Revision: http://reviews.llvm.org/D24280

From: Li Huang
llvm-svn: 282650
2016-09-28 23:39:39 +00:00
Artur Pilipenko b78ad9d41f Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative
This change needs to be reverted in order to revert -r278267 which causes performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other benchmarks.

See comments on https://reviews.llvm.org/D18777 for details.

llvm-svn: 279432
2016-08-22 13:12:07 +00:00
Sanjoy Das 3502511548 [IndVars] Ignore (s|z)exts that don't extend the induction variable
`IVVisitor::visitCast` used to have the invariant that if the
instruction it was passed was a sext or zext instruction, the result of
the instruction would be wider than the induction variable.  This is no
longer true after rL275037, so this change teaches `IndVarSimplify`'s
implementation of `IVVisitor::visitCast` to work with the relaxed
invariant.

A corresponding change to SimplifyIndVar to preserve the said invariant
after rL275037 would also work, but given how `IVVisitor::visitCast` is
spelled (no indication of said invariant), I figured the current fix is
cleaner.

Fixes PR28935.

llvm-svn: 278584
2016-08-13 00:58:31 +00:00
Ehsan Amiri dbcfea9811 Extend trip count instead of truncating IV in LFTR, when legal
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because

(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).

I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.

To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of both sides of (1) and apply the proven equivalence.
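
The legality argument can be checked by brute force on small bit widths. The sketch below assumes a 4-bit narrow type and an 8-bit wide type purely for illustration; `trunc`, `zext` and `rewrite_is_equivalent` are hypothetical Python helpers, not LLVM functions.

```python
def trunc(value, bits):
    """Keep only the low `bits` bits."""
    return value & ((1 << bits) - 1)


def zext(value, from_bits):
    """Zero-extend a `from_bits`-wide value; the numeric value is unchanged."""
    return value & ((1 << from_bits) - 1)


def rewrite_is_equivalent(narrow=4, wide=8):
    """Check trunc(IV) == ExitCnt <=> IV == zext(ExitCnt) under the precondition."""
    for iv in range(1 << wide):
        if zext(trunc(iv, narrow), narrow) != iv:
            continue  # precondition zext(trunc(IV)) == IV does not hold
        for exit_cnt in range(1 << narrow):
            before = trunc(iv, narrow) == exit_cnt
            after = iv == zext(exit_cnt, narrow)
            if before != after:
                return False
    return True
```

Without the precondition the rewrite is not valid: for IV = 19 and ExitCnt = 3 in this model, `trunc(19, 4) == 3` holds but `19 == zext(3, 4)` does not.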

This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on).

https://reviews.llvm.org/D23075

llvm-svn: 278421
2016-08-11 21:31:40 +00:00