For some reason MSVC seems to think I'm calling getConstant() from a
static context. Try to avoid this issue by explicitly specifying
'this->' (though I'm not confident that this will actually work).
llvm-svn: 262451
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.
The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).
llvm-svn: 262438
analyses in the new pass manager.
These just handle really basic stuff: turning a type name into a string
statically that is nice to print in logs, and getting a static unique ID
for each analysis.
Sadly, the format of passes in anonymous namespaces makes using their
names in tests really annoying so I've customized the names of the no-op
passes to keep tests sane to read.
This is the first of a few simplifying refactorings for the new pass
manager that should reduce boilerplate and confusion.
llvm-svn: 262004
Rename makeNoWrapRegion to a more obvious makeGuaranteedNoWrapRegion,
and add a comment about the counter-intuitive aspects of the function.
This is to help prevent cases like PR26628.
llvm-svn: 261532
Before this patch simplified SCEV expressions for PHI nodes were only returned
the very first time getSCEV() was called, but later calls to getSCEV always
returned the non-simplified value, which had "temporarily" been stored in the
ValueExprMap, but was never removed and consequently blocked the caching of the
simplified PHI expression.
llvm-svn: 261485
sanitizer issue. The PredicatedScalarEvolution's copy constructor
wasn't copying the Generation value, and was leaving it un-initialized.
Original commit message:
[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection
Summary:
This change adds no wrap SCEV predicates with:
- support for runtime checking
- support for expression rewriting:
(sext ({x,+,y}) -> {sext(x),+,sext(y)}
(zext ({x,+,y}) -> {zext(x),+,sext(y)}
Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.
We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
Reviewers: mzolotukhin, sanjoy, anemet
Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel
Differential Revision: http://reviews.llvm.org/D15412
llvm-svn: 260112
Summary:
This change adds no wrap SCEV predicates with:
- support for runtime checking
- support for expression rewriting:
(sext ({x,+,y}) -> {sext(x),+,sext(y)}
(zext ({x,+,y}) -> {zext(x),+,sext(y)}
Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.
We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
Reviewers: mzolotukhin, sanjoy, anemet
Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel
Differential Revision: http://reviews.llvm.org/D15412
llvm-svn: 260085
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.
This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.
The original commit triggered regressions in Polly tests. The regressions
exposed two problems which have been fixed in current version.
1. Polly will generate a new function based on the old one. To generate an
instruction for the new function, it builds SCEV for the old instruction,
applies some tranformation on the SCEV generated, then expands the transformed
SCEV and insert the expanded value into new function. Because SCEV expansion
may reuse value cached in ExprValueMap, the value in old function may be
inserted into new function, which is wrong.
In SCEVExpander::expand, there is a logic to check the cached value to
be used should dominate the insertion point. However, for the above
case, the check always passes. That is because the insertion point is
in a new function, which is unreachable from the old function. However
for unreachable node, DominatorTreeBase::dominates thinks it will be
dominated by any other node.
The fix is to simply add a check that the cached value to be used in
expansion should be in the same function as the insertion point instruction.
2. When the SCEV is of scConstant type, expanding it directly is cheaper than
reusing a normal value cached. Although in the cached value set in ExprValueMap,
there is a Constant type value, but it is not easy to find it out -- the cached
Value set is not sorted according to the potential cost. Existing reuse logic
in SCEVExpander::expand simply chooses the first legal element from the cached
value set.
The fix is that when the SCEV is of scConstant type, don't try the reuse
logic. simply expand it.
Differential Revision: http://reviews.llvm.org/D12090
llvm-svn: 259736
Current SCEV expansion will expand SCEV as a sequence of operations
and doesn't utilize the value already existed. This will introduce
redundent computation which may not be cleaned up throughly by
following optimizations.
This patch introduces an ExprValueMap which is a map from SCEV to the
set of equal values with the same SCEV. When a SCEV is expanded, the
set of values is checked and reused whenever possible before generating
a sequence of operations.
Differential Revision: http://reviews.llvm.org/D12090
llvm-svn: 259662
- ScalarEvolution::isKnownPredicateViaConstantRanges duplicates some
logic already present in ConstantRange, use ConstantRange for those
bits.
- In some cases ScalarEvolution::isKnownPredicateViaConstantRanges
returns `false` to mean "definitely false" (e.g. see the
`LHSRange.getSignedMin().sge(RHSRange.getSignedMax())` case for
`ICmpInst::ICMP_SLT`), but for `isKnownPredicateViaConstantRanges`,
`false` actually means "don't know". Get rid of this extra bit of
code to avoid confusion.
llvm-svn: 259401
Summary:
The previous form, taking opcode and type, is moved to an internal
helper and the new form, taking an instruction, is a wrapper around this
helper.
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.
Reviewers: eddyb
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D16383
llvm-svn: 258391
In some cases, the max backedge taken count can be more conservative
than the exact backedge taken count (for instance, because
ScalarEvolution::getRange is not control-flow sensitive whereas
computeExitLimitFromICmp can be). In these cases,
computeExitLimitFromCond (specifically the bit that deals with `and` and
`or` instructions) can create an ExitLimit instance with a
`SCEVCouldNotCompute` max backedge count expression, but a computable
exact backedge count expression. This violates an implicit SCEV
assumption: a computable exact BE count should imply a computable max BE
count.
This change
- Makes the above implicit invariant explicit by adding an assert to
ExitLimit's constructor
- Changes `computeExitLimitFromCond` to be more robust around
conservative max backedge counts
llvm-svn: 258184
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.
GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Reviewers: mjacob, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16275
llvm-svn: 258145
The way `getLoopBackedgeTakenCounts` is written right now isn't
correct. It will try to compute and store the BE counts of a Loop
#{child loop} number of times (which may be zero).
llvm-svn: 256338
Clang has better diagnostics in this case. It is not necessary therefore
to change the destructor to avoid what is effectively an invalid warning
in gcc. Instead, better handle the warning flags given to the compiler.
llvm-svn: 255905
ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform
and Analysis modules:
[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewritting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
P1: {a,+,b} has nsw
P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and the P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, and therefore
avoiding this issue.
The SCEVPredicatedLayer maintains a cache to remember the results of previous
SCEV rewritting results. This also has the benefit of reducing the overall number
of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
llvm-svn: 255122
Reduces the scope over which the struct is visible, making its usages
obvious. I did not move structs in cases where this wasn't a clear
win (the struct is too large, or is grouped in some other interesting
way).
llvm-svn: 255003
It is not enough to simply make the destructor virtual since there is a g++ 4.7
issue (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53613) that throws the
error "looser throw specifier for ... overridding ~SCEVPredicate() noexcept".
llvm-svn: 254592
The nuw constraint will not be satisfied unless <expr> == 0.
This bug has been around since r102234 (in 2010!), but was uncovered by
r251052, which introduced more aggressive optimization of nuw scev expressions.
Differential Revision: http://reviews.llvm.org/D14850
llvm-svn: 253627
The bug: I missed adding break statements in the switch / case.
Original commit message:
[SCEV] Teach SCEV some axioms about non-wrapping arithmetic
Summary:
- A s< (A + C)<nsw> if C > 0
- A s<= (A + C)<nsw> if C >= 0
- (A + C)<nsw> s< A if C < 0
- (A + C)<nsw> s<= A if C <= 0
Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13686
llvm-svn: 252236
Summary:
SCEV Predicates represent conditions that typically cannot be derived from
static analysis, but can be used to reduce SCEV expressions to forms which are
usable for different optimizers.
ScalarEvolution now has the rewriteUsingPredicate method which can simplify a
SCEV expression using a SCEVPredicateSet. The normal workflow of a pass using
SCEVPredicates would be to hold a SCEVPredicateSet and every time assumptions
need to be made a new SCEV Predicate would be created and added to the set.
Each time after calling getSCEV, the user will call the rewriteUsingPredicate
method.
We add two types of predicates
SCEVPredicateSet - implements a set of predicates
SCEVEqualPredicate - tests for equality between two SCEV expressions
We use the SCEVEqualPredicate to re-implement stride versioning. Every time we
version a stride, we will add a SCEVEqualPredicate to the context.
Instead of adding specific stride checks, LoopVectorize now adds a more
generic SCEV check.
We only need to add support for this in the LoopVectorizer since this is the
only pass that will do stride versioning.
Reviewers: mzolotukhin, anemet, hfinkel, sanjoy
Subscribers: sanjoy, hfinkel, rengolin, jmolloy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13595
llvm-svn: 251800
Have `getConstantEvolutionLoopExitValue` work correctly with multiple
entry loops.
As far as I can tell, `getConstantEvolutionLoopExitValue` never did the
right thing for multiple entry loops; and before r249712 it would
silently return an incorrect answer. r249712 changed SCEV to fail an
assert on a multiple entry loop, and this change fixes the underlying
issue.
llvm-svn: 251770
Prevent `createNodeFromSelectLikePHI` from creating SCEV expressions
that break LCSSA.
A better fix for the same issue is to teach SCEVExpander to not break
LCSSA by inserting PHI nodes at appropriate places. That's planned for
the future.
Fixes PR25360.
llvm-svn: 251756
Summary:
When forming expressions for phi nodes having an incoming value from
outside the loop A and a value coming from the previous iteration B
we were forming an AddRec if:
- B was an AddRec
- the value A was equal to the value for B at iteration -1 (or equal
to the value of B shifted by one iteration, at iteration 0)
In this case, we were computing the expression to be the expression of
B, shifted by one iteration.
This changes generalizes the logic above by removing the restriction that
B needs to be an AddRec. For this we introduce two expression rewriters
that allow us to
- shift an expression by one iteration
- get the value of an expression at iteration 0
This allows us to get SCEV expressions for PHI nodes when these expressions
are not AddRecExprs.
Reviewers: sanjoy
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D14175
llvm-svn: 251700
This teaches SCEV to compute //max// backedge taken counts for loops
like
for (int i = k; i != 0; i >>>= 1)
whatever();
SCEV yet cannot represent the exact backedge count for these loops, and
this patch does not change that. This is really geared towards teaching
SCEV that loops like the above are *not* infinite.
llvm-svn: 251558
The loop idiom creating a ConstantRange is repeated twice in the
codebase, time to give it a name and a home.
The loop is also repeated in `rangeMetadataExcludesValue`, but using
`getConstantRangeFromMetadata` there would not be an NFC -- the range
returned by `getConstantRangeFromMetadata` may contain a value that none
of the subranges did.
llvm-svn: 251180
Instead of checking `(FlagsPresent & ExpectedFlags) != 0`, check
`(FlagsPresent & ExpectedFlags) == ExpectedFlags`. Right now they're
equivalent since `ExpectedFlags` can only be either `FlagNUW` or
`FlagNSW`, but if we ever pass in `ExpectedFlags` as `FlagNUW | FlagNSW`
then checking `(FlagsPresent & ExpectedFlags) != 0` would be wrong.
llvm-svn: 251142
I could not come up a way to test this -- I think this bug is latent
today, and will not actually result in a miscompile.
In `getPreStartForExtend`, SCEV constructs `PreStart` as a sum of all of
`SA`'s operands except `Op`. It also uses `SA`'s no-wrap flags, and
this is problematic because removing an element from an add expression
can make it signed-wrap. E.g. if `SA` was `(127 + 1 + -1)`, then it
could safely be `<nsw>` (since `sext(127) + sext(1) + sext(-1)` ==
`sext(127 + 1 + -1)`), but `(127 + 1)` (== `PreStart` if `Op` is `-1`)
is not `<nsw>`.
Transferring `<nuw>` from `SA` to `PreStart` is safe, as far as I can
tell.
llvm-svn: 251097
Summary:
An unsigned comparision is equivalent to is corresponding signed version
if both the operands being compared are positive. Teach SCEV to use
this fact when profitable.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13687
llvm-svn: 251051
Summary:
- A s< (A + C)<nsw> if C > 0
- A s<= (A + C)<nsw> if C >= 0
- (A + C)<nsw> s< A if C < 0
- (A + C)<nsw> s<= A if C <= 0
Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.
Reviewers: atrick, hfinkel, reames, nlewycky
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13686
llvm-svn: 251050
Summary:
This uses `ScalarEvolution::getRange` and not potentially control
dependent `nsw` and `nuw` bits on the arithmetic instruction.
Reviewers: atrick, hfinkel, nlewycky
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D13613
llvm-svn: 251048
In a later commit, `SplitBinaryAdd` will be used outside `IsConstDiff`,
so lift that out. And lift out `IsConstDiff` as
`computeConstantDifference` to keep things clean and to avoid playing
C++ access specifier games.
NFC.
llvm-svn: 250143
This patch also allows the -delinearize pass to delinearize expressions that do
not have an outermost SCEVAddRec expression. The SCEV::delinearize
infrastructure allowed this since r240952, but the -delinearize pass was not
updated yet.
llvm-svn: 250018
The current implementation of `StrengthenNoWrapFlags` is agnostic to the
order of `Ops`, so this commit should not change anything semantic. An
upcoming change will make `StrengthenNoWrapFlags` sensitive to the order
of `Ops`.
llvm-svn: 249802
Summary:
`getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively`
assumed all phi nodes in the loop header have the same order of incoming
values. This is not correct, and this commit changes
`getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively`
to lookup the backedge value of a phi node using the loop's latch block.
Unfortunately, there is still some code duplication
`getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively`.
At some point in the future we should extract out a helper class /
method that can evolve constant evolution phi nodes across iterations.
Fixes 25060. Thanks to Mattias Eriksson for the spot-on analysis!
Depends on D13457.
Reviewers: atrick, hfinkel
Subscribers: materi, llvm-commits
Differential Revision: http://reviews.llvm.org/D13458
llvm-svn: 249712
Comparing `Pred` with `ICmpInst::ICMP_ULT` is cheaper that memory access
-- do that check before loading / storing `ProvingSplitPredicate`.
llvm-svn: 249654
This reverts commit r249528 and reapply r249431. The fix for the
fallout has been commited in r249575.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 249581
With this patch, clang -O3 optimizes correctly providing > 1000x speedup on this artificial benchmark):
for (a=0; a<n; a++)
for (b=0; b<n; b++)
for (c=0; c<n; c++)
for (d=0; d<n; d++)
for (e=0; e<n; e++)
for (f=0; f<n; f++)
x++;
From test-suite/SingleSource/Benchmarks/Shootout/nestedloop.c
Reviewers: sanjoyd
Differential Revision: http://reviews.llvm.org/D13390
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 249431
This time by lifting the lambda's in `createNodeFromSelectLikePHI` to
the file scope. Looks like there are differences in capture rules
between clang and MSVC?
llvm-svn: 249222
Summary:
This change teaches SCEV that to prove `A u< B` it is sufficient to
prove each of these facts individually:
- B >= 0
- A s< B
- A >= 0
In practice, SCEV sometimes finds it easier to prove these facts
individually than to prove `A u< B` as one atomic step.
Reviewers: reames, atrick, nlewycky, hfinkel
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13042
llvm-svn: 249168
`ScalarEvolution::isImpliedCondOperandsViaNoOverflow` tries to cast the
operand type of the comparison it is given to an `IntegerType`. This is
incorrect because it could actually be simplifying a comparison between
two pointers. Switch it to using `getTypeSizeInBits` instead, which
does the right thing for both pointers and integers.
Fixed PR24956.
llvm-svn: 248743
Before this change `HasSameValue` would return true for distinct
`alloca` instructions if they happened to be allocating the same
type (`alloca` instructions are not specified as reading memory). This
change adds an explicit whitelist of instruction types for which
"identical" instructions compute the same value.
Fixes PR24952.
llvm-svn: 248690
Summary:
If the trip count of a specific backedge is `N`, then we know that
backedge is effectively guarded by the condition `{0,+,1} u< N`. This
change teaches SCEV to use this condition to prove things in
`isLoopBackedgeGuardedByCond`.
Depends on D12948
Depends on D12949
The original checkin, r248608 had to be backed out due to an issue with
a ObjCXX unit test. That issue is now fixed, so re-landing.
Reviewers: atrick, reames, majnemer, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12950
llvm-svn: 248638
Summary:
This change teaches SCEV's `isImpliedCond` two new identities:
A u< B u< -C => (A + C) u< (B + C)
A s< B s< INT_MIN - C => (A + C) s< (B + C)
While these are useful on their own, they're really intended to support
D12950.
The original checkin, r248606 had to be backed out due to an issue with
a ObjCXX unit test. That issue is now fixed, so re-landing.
Reviewers: atrick, reames, majnemer, nlewycky, hfinkel
Subscribers: aadg, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12948
llvm-svn: 248637
Summary:
If the trip count of a specific backedge is `N`, then we know that
backedge is effectively guarded by the condition `{0,+,1} u< N`. This
change teaches SCEV to use this condition to prove things in
`isLoopBackedgeGuardedByCond`.
Depends on D12948
Depends on D12949
Reviewers: atrick, reames, majnemer, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12950
llvm-svn: 248608
Summary:
This new helper routine will be used in a subsequent change.
Reviewers: hfinkel
Subscribers: hfinkel, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12949
llvm-svn: 248607
Summary:
This change teaches SCEV's `isImpliedCond` two new identities:
A u< B u< -C => (A + C) u< (B + C)
A s< B s< INT_MIN - C => (A + C) s< (B + C)
While these are useful on their own, they're really intended to support
D12950.
Reviewers: atrick, reames, majnemer, nlewycky, hfinkel
Subscribers: aadg, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12948
llvm-svn: 248606
Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.
I've refactored the call sites I could find easily to use getZero /
getOne.
Reviewers: hfinkel, majnemer, reames
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12947
llvm-svn: 248362
Summary:
For loop destroyed current instance before invoking next.
Temporary variable added to prevent use-after-dtor when invoke
destructor on current instance.
Reviewers: eugenis
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D12912
Rename temp var.
llvm-svn: 247867
This patch addresses the issue of SCEV division asserting on some
input expressions (e.g., non-affine expressions) and quietly giving
up on others. When giving up, we set the quotient to be equal to
zero and the remainder to be equal to the numerator. With this
patch, we always quietly give up when we cannot perform the
division.
This patch also adds a test case for DependenceAnalysis that
previously caused an assertion.
Differential Revision: http://reviews.llvm.org/D11725
llvm-svn: 247314
Summary:
PR24757 was caused by some incorect math in
`ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X
in
2^N * A = 2^N * X
is not necessarily A.
Reviewers: atrick, majnemer, meheff
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D12721
llvm-svn: 247242
Rewrite some code to not use a lambda function. The non-lambda code is just
about as clean as the original, and not any longer. The lambda function causes
an internal compiler error in GCC 4.8.0, and it is not worth breaking support
for that compiler over this. NFC.
llvm-svn: 245466
Here we make ScalarEvolution::isKnownPredicate, indirectly, a little smarter.
Given some relational comparison operator OP, and two AddRec SCEVs, {I,+,S} OP
{J,+,T}, we can reduce this to the comparison I OP J when S == T, both AddRecs
are for the same loop, and both are known not to wrap.
As it turns out, because of the way that backedge-guard expressions can be
leveraged when computing known predicates, this allows indvars to simplify the
if-statement comparison in this loop:
void foo (int *a, int *b, int n) {
for (int i = 0; i < n; ++i) {
if (i > n)
a[i] = b[i] + 1;
}
}
which, somewhat surprisingly, we were not previously optimizing away.
llvm-svn: 245400
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.
I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.
But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.
To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.
To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.
With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.
Differential Revision: http://reviews.llvm.org/D12063
llvm-svn: 245193
Summary:
http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions.
This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit:
for (int i = 0; i < numIterations; ++i)
sum += ptr[i - offset];
for (int i = 0; i < numIterations; ++i)
sum += ptr[i * stride];
for (int i = 0; i < numIterations; ++i)
sum += ptr[3 * (i << 7)];
Reviewers: atrick, sanjoy
Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben
Differential Revision: http://reviews.llvm.org/D11860
llvm-svn: 245118
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst. This commit changes a bunch
of eligible loops to use it.
llvm-svn: 244260
Summary:
Make Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs in some cases. This is based on reasoning about when poison from instructions with these flags would trigger undefined behavior. This gives a 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.
There does not seem to be clear agreement about when poison should be considered to propagate through instructions. In this analysis, poison propagates only in cases where that should be uncontroversial.
This change makes LSR able to create induction variables for expressions like &ptr[i + offset] for loops like this:
for (int i = 0; i < limit; ++i) {
sum += ptr[i + offset];
}
Here ptr is a 64 bit pointer and offset is a 32 bit integer. For NVPTX, LSR currently creates an induction variable for i + offset instead, which is not as fast. Improving this situation is what brings the 13% speed-up on some Eigen3-based Google-internal microbenchmarks for NVPTX.
There are more details in this discussion on llvmdev.
June: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-June/thread.html#87234
July: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/thread.html#87392
Patch by Bjarke Roune
Reviewers: eliben, atrick, sanjoy
Subscribers: majnemer, hfinkel, jingyue, meheff, llvm-commits
Differential Revision: http://reviews.llvm.org/D11212
llvm-svn: 243460
Summary:
Was D9784: "Remove loop variant range check when induction variable is
strictly increasing"
This change re-implements D9784 with the two differences:
1. It does not use SCEVExpander and does not generate new
instructions. Instead, it does a quick local search for existing
`llvm::Value`s that it needs when modifying the `icmp`
instruction.
2. It is more general -- it deals with both increasing and decreasing
induction variables.
I've added all of the tests included with D9784, and two more.
As an example on what this change does (copied from D9784):
Given C code:
```
for (int i = M; i < N; i++) // i is known not to overflow
if (i < 0) break;
a[i] = 0;
}
```
This transformation produces:
```
for (int i = M; i < N; i++)
if (M < 0) break;
a[i] = 0;
}
```
Which can be unswitched into:
```
if (!(M < 0))
for (int i = M; i < N; i++)
a[i] = 0;
}
```
I went back and forth on whether the top level logic should live in
`SimplifyIndvar::eliminateIVComparison` or be put into its own
routine. Right now I've put it under `eliminateIVComparison` because
even though the `icmp` is not *eliminated*, it no longer is an IV
comparison. I'm open to putting it in its own helper routine if you
think that is better.
Reviewers: reames, nicholas, atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11278
llvm-svn: 243331
The expressions we delinearize do not necessarily have to have a SCEVAddRecExpr
at the outermost level. At this moment, the additional flexibility is not
exploited in LLVM itself, but in Polly we will soon soonish use this
functionality. For LLVM, this change should not affect existing functionality
(which is covered by test/Analysis/Delinearization/)
llvm-svn: 240952
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
Summary:
This allows other passes (such as SLSR) to compute the SCEV expression for an
imaginary GEP.
Test Plan: no regression
Reviewers: atrick, sanjoy
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9786
llvm-svn: 237589
An assert was triggered when attempting to create a new SCEV
with operands of different types in the visitAddRecExpr. In this
test case, the operand types of the numerator and denominator
are different. The SCEV division code should generate a
conservative answer when this happens.
Differential Revision: http://reviews.llvm.org/D9021
llvm-svn: 235511
n/1 generates a quotient equal to n and a remainder of 0.
If this case is not recognized, then the SCEV divide() function
can return a remainder that is greater than or equal to the
denominator, which means the delinearized subscripts for the
test case will be incorrect.
Differential Revision: http://reviews.llvm.org/D9003
llvm-svn: 235311
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.
llvm-svn: 233938
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch. We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.
This re-lands r233447. r233447 was reverted because it caused massive
compile-time regressions. This change has a fix for the same issue.
llvm-svn: 233829
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch. We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.
Reviewers: atrick, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8627
llvm-svn: 233447
Summary:
With the introduction of MarkPendingLoopPredicates in r157092, I don't
think the bailout is needed anymore.
Reviewers: atrick, nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8624
llvm-svn: 233296
Simplify boolean expressions using `true` and `false` with `clang-tidy`
Patch by Richard Thomson.
Reviewed By: nlewycky
Differential Revision: http://reviews.llvm.org/D8528
llvm-svn: 233091
Summary:
This change teaches isImpliedCond to infer things like "X sgt 0" => "X -
1 sgt -1". The `ConstantRange` class has the logic to do the heavy
lifting, this change simply gets ScalarEvolution to exploit that when
reasonable.
Depends on D8345
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8346
llvm-svn: 232576
There's a missed optimization opportunity where we could look at the full chain of computation and take the intersection of the flags instead of only looking one instruction deep.
llvm-svn: 232134
Summary:
Now that the DataLayout is a mandatory part of the module, let's start
cleaning the codebase. This patch is a first attempt at doing that.
This patch is not exactly NFC as for instance some places were passing
a nullptr instead of the DataLayout, possibly just because there was a
default value on the DataLayout argument to many functions in the API.
Even though it is not purely NFC, there is no change in the
validation.
I turned as many pointer to DataLayout to references, this helped
figuring out all the places where a nullptr could come up.
I had initially a local version of this patch broken into over 30
independant, commits but some later commit were cleaning the API and
touching part of the code modified in the previous commits, so it
seemed cleaner without the intermediate state.
Test Plan:
Reviewers: echristo
Subscribers: llvm-commits
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231740
Summary:
This removes some duplicated code, and also helps optimization: e.g. in
the test case added, `%idx ULT 128` in `@x` is not currently optimized
to `true` by `-indvars` but will be, after this change.
The only functional change in ths commit is that for add recurrences,
ScalarEvolution::getRange will be more aggressive -- computing the
unsigned (resp. signed) range for a SCEVAddRecExpr will now look at the
NSW (resp. NUW) bits and check for signed (resp. unsigned) overflow.
This can be a strict improvement in some cases (such as the attached
test case), and should be no worse in other cases.
Reviewers: atrick, nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8142
llvm-svn: 231709
Summary:
Unused in this commit, but will be used in a subsequent change (D8142)
by a FileCheck test.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8143
llvm-svn: 231708
Summary:
Teach SCEV to prove no overflow for an add recurrence by proving
something about the range of another add recurrence a loop-invariant
distance away from it.
Reviewers: atrick, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7980
llvm-svn: 231305
Summary:
DataLayout keeps the string used for its creation.
As a side effect it is no longer needed in the Module.
This is "almost" NFC, the string is no longer
canonicalized, you can't rely on two "equals" DataLayout
having the same string returned by getStringRepresentation().
Get rid of DataLayoutPass: the DataLayout is in the Module
The DataLayout is "per-module", let's enforce this by not
duplicating it more than necessary.
One more step toward non-optionality of the DataLayout in the
module.
Make DataLayout Non-Optional in the Module
Module->getDataLayout() will never returns nullptr anymore.
Reviewers: echristo
Subscribers: resistor, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D7992
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 231270
The bug was a result of getPreStartForExtend interpreting nsw/nuw
flags on an add recurrence more strongly than is legal. {S,+,X}<nsw>
implies S+X is nsw only if the backedge of the loop is taken at least
once.
NOTE: I had accidentally committed an unrelated change with the commit
message of this change in r230275 (r230275 was reverted in r230279).
This is the correct change for this commit message.
Differential Revision: http://reviews.llvm.org/D7808
llvm-svn: 230291
extensions.
This change also removes `DEBUG(dbgs() << "SCEV: untested prestart
overflow check\n");` because that case has a unit test now.
Differential Revision: http://reviews.llvm.org/D7645
llvm-svn: 229600
I could not come up with a test case for this one; but I don't think
`getPreStartForSignExtend` can assume `AR` is `nsw` -- there is one
place in scalar evolution that calls `getSignExtendAddRecStart(AR,
...)` without proving that `AR` is `nsw`
(line 1564)
OperandExtendedAdd =
getAddExpr(WideStart,
getMulExpr(WideMaxBECount,
getZeroExtendExpr(Step, WideTy)));
if (SAdd == OperandExtendedAdd) {
// If AR wraps around then
//
// abs(Step) * MaxBECount > unsigned-max(AR->getType())
// => SAdd != OperandExtendedAdd
//
// Thus (AR is not NW => SAdd != OperandExtendedAdd) <=>
// (SAdd == OperandExtendedAdd => AR is NW)
const_cast<SCEVAddRecExpr *>(AR)->setNoWrapFlags(SCEV::FlagNW);
// Return the expression with the addrec on the outside.
return getAddRecExpr(getSignExtendAddRecStart(AR, Ty, this),
getZeroExtendExpr(Step, Ty),
L, AR->getNoWrapFlags());
}
Differential Revision: http://reviews.llvm.org/D7640
llvm-svn: 229594
When creating a scev for sext({X,+,Y}), scev checks if the expression
is equivalent to {sext X,+,zext Y}. If it can prove that, it also
tags the original {X,+,Y} as <nsw>, which is not correct.
In the test case I run `-scalar-evolution` twice because the bug
manifests only once SCEV has run through and seen the `sext`
expressions (and then does a in-place mutation on {X,+,Y}).
Differential Revision: http://reviews.llvm.org/D7495
llvm-svn: 228586
For the attached test case different types are used in the ICmpInst
and SelectInst that represent the min/max expressions. However, if the
ICmpInst type is smaller a comparison with the sign/zero extended
operands would have yielded the same result. This situation might
arise after the instruction combination pass was applied.
Differential Revision: http://reviews.llvm.org/D7338
llvm-svn: 228572
add recurrences don't overflow.
This change makes the optimization more restrictive. It still assumes
that an overflowing `add nsw` is undefined behavior; and this change
will need revisiting once we have a consistent semantics for poison
values.
Differential Revision: http://reviews.llvm.org/D7331
llvm-svn: 228552