This patch applies the restrictions on the number of domain conjuncts
also to the domain parts of piecewise affine expressions we generate.
To this end the wording is change slightly. It was needed to support
complex additions featuring zext-instructions but it also fixes PR27045.
lnt profitable runs reports only little changes that might be noise:
Compile Time:
Polybench/[...]/2mm +4.34%
SingleSource/[...]/stepanov_container -2.43%
Execution Time:
External/[...]/186_crafty -2.32%
External/[...]/188_ammp -1.89%
External/[...]/473_astar -1.87%
llvm-svn: 264514
Index calculations can use the last value that come out of a loop.
Ideally, ScalarEvolution can compute that exit value directly without
depending on the loop induction variable, but not in all cases.
This changes isAffine to not consider such loop exit values as affine to
avoid that SCEVExpander adds uses of the original loop induction
variable.
This fix is analogous to r262404 that applies to general uses of loop
exit values instead of index expressions and loop bouds as in this
patch.
This reduces the number of LNT test-suite fails with
-polly-position=before-vectorizer -polly-unprofitable
from 10 to 8.
llvm-svn: 262665
The scope will be required in the following fix. This commit separates
the large changes that do not change behaviour from the small, but
functional change.
llvm-svn: 262664
The LNT test suite with -polly-process-unprofitable
-polly-position=before-vectorizer currenty fails 59 tests. With this
barrier added, only 16 keep failing. This is probably because Polly's
code generation currently does not correctly preserve all analyses it
promised to preserve. Temporarily add this barrier until further
investigation.
llvm-svn: 262488
Polly recognizes affine loops that ScalarEvolution does not, in
particular those with loop conditions that depend on hoisted invariant
loads. Check for SCEVAddRec dependencies on such loops and do not
consider their exit values as synthesizable because SCEVExpander would
generate them as expressions that depend on the original induction
variables. These are not available in generated code.
llvm-svn: 262404
So far we separated constant factors from multiplications, however,
only when they are at the outermost level of a parameter SCEV. Now,
we also separate constant factors from the parameter SCEV if the
outermost expression is a SCEVAddRecExpr. With the changes to the
SCEVAffinator we can now improve the extractConstantFactor(...)
function at will without worrying about any other code part. Thus,
if needed we can implement a more comprehensive
extractConstantFactor(...) function that will traverse the SCEV
instead of looking only at the outermost level.
Four test cases were affected. One did not change much and the other
three were simplified.
llvm-svn: 260859
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).
We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.
Differential Revision: http://reviews.llvm.org/D15706
llvm-svn: 259987
MemAccInst wraps the common members of LoadInst and StoreInst. Also use
of this class in:
- ScopInfo::buildMemoryAccess
- BlockGenerator::generateLocationAccessed
- ScopInfo::addArrayAccess
- Scop::buildAliasGroups
- Replace every use of polly::getPointerOperand
Reviewers: jdoerfert, grosser
Differential Revision: http://reviews.llvm.org/D16530
llvm-svn: 258947
Use it to print "null" if a MemoryAccess's access relation is not
available instead of printing nothing.
Suggested-by: Johannes Doerfert
llvm-svn: 255466
Re-run canonicalization passes after Polly's code generation.
The set of passes currently added here are nearly all the passes between
--polly-position=early and --polly-position=before-vectorizer, i.e. all
passes that would usually run after Polly.
In order to run these only if Polly actually modified the code, we add a
function attribute "polly-optimzed" to a function that contains
generated code. The cleanup pass is skipped if the function does not
have this attribute.
There is no support by the (legacy) PassManager to run passes only under
some conditions. One could have wrapped all transformation passes to run
only when CodeGeneration changed the code, but the analyses would run
anyway. This patch creates an independent pass manager. The
disadvantages are that all analyses have to re-run even if preserved and
it does not honor compiler switches like the PassManagerBuilder does.
Differential Revision: http://reviews.llvm.org/D14333
llvm-svn: 254150
If an llvm.assume dominates the SCoP entry block and the assumed condition
can be expressed as an affine inequality we will now add it to the context.
Differential Revision: http://reviews.llvm.org/D14413
llvm-svn: 252851
Basic blocks that are always executed can not be error blocks as their execution
can not possibly be an unlikely event. In this commit we tighten the check
if an error block to basic blcoks that do not dominate the exit condition, but
that dominate all exiting blocks of the scop.
llvm-svn: 252726
r252713 introduced a couple of regressions due to later basic blocks refering
to instructions defined in error blocks which have not yet been modeled.
This commit is currently just encoding limitations of our modeling and code
generation backends to ensure correctness. In theory, we should be able to
generate and optimize such regions, as everything that is dominated by an error
region is assumed to not be executed anyhow. We currently just lack the code
to make this happen in practice.
llvm-svn: 252725
Remove all the implicit ilist iterator conversions from polly, in
preparation for making them illegal in ADT. There was one oddity I came
across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a
post-increment `Builder.GetInsertPoint()++`.
Since it was a no-op, I removed it, but I admit I wonder if it might be
a bug (both before and after this change)? Perhaps it should be a
pre-increment?
llvm-svn: 252357
Polly can now be used as a analysis only tool as long as the code
generation is disabled. However, we do not have an alternative to the
independent blocks pass in place yet, though in the relevant cases
this does not seem to impact the performance much. Nevertheless, a
virtual alternative that allows the same transformations without
changing the input region will follow shortly.
llvm-svn: 250652
instead of llvm::PassManagerBuilder::EP_EarlyAsPossible. This will allow us
to run actual module passes in Polly's canonicalization sequence, but should
otherwise have only little impact.
llvm-svn: 250091
Helper functions in the BlockGenerators.h/cpp introduce dependences
from the frontend to the backend of Polly. As they are used in
ScopDetection, ScopInfo, etc. we move them to the ScopHelper file.
llvm-svn: 249919
This replaces the support for user defined error functions by a
heuristic that tries to determine if a call to a non-pure function
should be considered "an error". If so the block is assumed not to be
executed at runtime. While treating all non-pure function calls as
errors will allow a lot more regions to be analyzed, it will also
cause us to dismiss a lot again due to an infeasible runtime context.
This patch tries to limit that effect. A non-pure function call is
considered an error if it is executed only in conditionally with
regards to a cheap but simple heuristic.
llvm-svn: 249611
This patch allows invariant loads to be used in the SCoP description,
e.g., as loop bounds, conditions or in memory access functions.
First we collect "required invariant loads" during SCoP detection that
would otherwise make an expression we care about non-affine. To this
end a new level of abstraction was introduced before
SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
if we want a load inside the region to be optimistically assumed
invariant or not. If we do, it will be marked as required and in the
SCoP generation we bail if it is actually not invariant. If we don't
it will be a non-affine expression as before. At the moment we
optimistically assume all "hoistable" (namely non-loop-carried) loads
to be invariant. This causes us to expand some SCoPs and dismiss them
later but it also allows us to detect a lot we would dismiss directly
if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
allow potential aliases between optimistically assumed invariant loads
and other pointers as our runtime alias checks are sound in case the
loads are actually invariant. Together with the invariant checks this
combination allows to handle a lot more than LICM can.
The code generation of the invariant loads had to be extended as we
can now have dependences between parameters and invariant (hoisted)
loads as well as the other way around, e.g.,
test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
First, it is important to note that we cannot have real cycles but
only dependences from a hoisted load to a parameter and from another
parameter to that hoisted load (and so on). To handle such cases we
materialize llvm::Values for parameters that are referred by a hoisted
load on demand and then materialize the remaining parameters. Second,
there are new kinds of dependences between hoisted loads caused by the
constraints on their execution. If a hoisted load is conditionally
executed it might depend on the value of another hoisted load. To deal
with such situations we sort them already in the ScopInfo such that
they can be generated in the order they are listed in the
Scop::InvariantAccesses list (see compareInvariantAccesses). The
dependences between hoisted loads caused by indirect accesses are
handled the same way as before.
llvm-svn: 249607
There have been various places where llvm::DenseMap<const llvm::Value *,
llvm::Value *> types have been defined, but all types have been expected to be
identical. We make this more clear by consolidating the different types and use
BlockGenerator::ValueMapT wherever there is a need for types to match
BlockGenerator::ValueMapT.
llvm-svn: 249264
The user can provide function names with
-polly-error-functions=name1,name2,name3
that will be treated as error functions. Any call to them is assumed
not to be executed.
This feature is mainly for developers to play around with the new
"error block" feature.
llvm-svn: 249098
Because we handle more than SCEV does it is not possible to rewrite an
expression on the top-level using the SCEVParameterRewriter only. With
this patch we will do the rewriting on demand only and also
recursively, thus not only on the top-level.
llvm-svn: 248916
This patch allows switch instructions with affine conditions in the
SCoP. Also switch instructions in non-affine subregions are allowed.
Both did not require much changes to the code, though there was some
refactoring needed to integrate them without code duplication.
In the llvm-test suite the number of profitable SCoPs increased from
135 to 139 but more importantly we can handle more benchmarks and user
inputs without preprocessing.
Differential Revision: http://reviews.llvm.org/D13200
llvm-svn: 248701
If we encounter a <nsw> tagged AddRec for a loop we know the trip count of
that loop has to be bounded or the semantics is undefined anyway. Hence, we
only need to add unbounded assumptions if no such AddRec is known.
llvm-svn: 248128
This will allow to generate non-wrap assumptions for integer expressions
that are part of the SCoP. We compare the common isl representation of
the expression with one computed with modulo semantic. For all parameter
combinations they are not equal we can have integer overflows.
The nsw flags are respected when the modulo representation is computed,
nuw and nw flags are ignored for now.
In order to not increase compile time to much, the non-wrap assumptions
are collected in a separate boundary context instead of the assumed
context. This helps compile time as the boundary context can become
complex and it is therefor not advised to use it in other operations
except runtime check generation. However, the assumed context is e.g.,
used to tighten dependences. While the boundary context might help to
tighten the assumed context it is doubtful that it will help in practice
(it does not effect lnt much) as the boundary (or no-wrap assumptions)
only restrict the very end of the possible value range of parameters.
PET uses a different approach to compute the no-wrap context, though lnt runs
have shown that this version performs slightly better for us.
llvm-svn: 247732
Due to the new domain generation, the SCoP keeps track of the domain
for all blocks, thus the SCEVAffinator can now work with blocks to avoid
duplication of the domains.
llvm-svn: 247731
Hoist runtime checks in the loop nest if they guard an "error" like event.
Such events are recognized as blocks with an unreachable terminator or a call
to the ubsan function that deals with out of bound accesses. Other "error"
events can be added easily.
We will ignore these blocks when we detect/model/optmize and code generate SCoPs
but we will make sure that they would not have been executed using the assumption
framework.
llvm-svn: 247310
As we do not rely on ScalarEvolution any more we do not need to get
the backedge taken count. Additionally, our domain generation handles
everything that is affine and has one latch and our ScopDetection will
over-approximate everything else.
This change will therefor allow loops with:
- one latch
- exiting conditions that are affine
Additionally, it will not check for structured control flow anymore.
Hence, loops and conditionals are not necessarily single entry single
exit regions any more.
Differential Version: http://reviews.llvm.org/D12758
llvm-svn: 247289
The TempScopInfo (-polly-analyze-ir) pass is removed and its work taken
over by ScopInfo (-polly-scops). Several tests depend on
-polly-analyze-ir and use -polly-scops instead which for the moment
prints the output of both passes. This again is not expected by some
other tests, especially those with negative searches, which have been
adapted.
Differential Version: http://reviews.llvm.org/D12694
llvm-svn: 247288
The support for modulo expressions is not comlete and makes the new
domain generation harder. As the currently broken domain generation
needs to be replaced, we will first swap in the new, fixed domain
generation and make it compatible with the modulo expressions later.
llvm-svn: 247278
This prepares for a series of patches that merges TempScopInfo into ScopInfo to
reduce Polly's code complexity. Only ScopInfo.{cpp|h} will be left thereafter.
Moving the code of TempScopInfo in one commit makes the mains diffs simpler to
understand.
In detail, merging the following classes is planned:
TempScopInfo into ScopInfo
TempScop into Scop
IRAccess into MemoryAccess
Only moving code, no functional changes intended.
Differential Version: http://reviews.llvm.org/D12693
llvm-svn: 247274
The support for pointer expressions is broken as it can only handle
some patterns in the IslExprBuilder. We should to treat pointers in
expressions the same as integers at some point and revert this patch.
llvm-svn: 247147
Use ISL to compute the loop trip count when scalar evolution is unable to do
so.
Contributed-by: Matthew Simpson <mssimpso@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D9444
llvm-svn: 246142
The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds
of expressions. To this end we introduce a ScopExpander that handles
the additional expressions separatly and falls back to the
SCEVExpander for everything else.
Reviewers: grosser, Meinersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D12066
llvm-svn: 245288
While the compile time is not affected by this patch much it will
allow us to look at all translated expressions after the SCoP is build
in a convenient way. Additionally, bigger SCoPs or SCoPs with
repeating complicated expressions might benefit from the cache later
on.
Reviewers: grosser, Meinersbur
Subscribers: #polly
Differential Revision: http://reviews.llvm.org/D11975
llvm-svn: 244734
This change has three major advantages:
- The ScopInfo becomes smaller.
- It allows to use the SCEVAffinator from outside the ScopInfo.
- A member object allows state which in turn allows e.g., caching.
Differential Revision: http://reviews.llvm.org/D9099
llvm-svn: 244730