Summary:
Most changes are mechanical, but in one place I changed the program semantics
by fixing a likely bug:
In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the
error-case. Before, when the call to `addNonEmptyDomainConstraints()`
returned a null set, this (probably) accidentally worked because
isl_bool_error converts to true. I'm checking for nullptr now.
Reviewers: grosser, Meinersbur, bollu
Reviewed By: Meinersbur
Subscribers: nemanjai, kbarton, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39971
llvm-svn: 318632
Summary:
There is a potential use-after-free bug in Scop::buildSchedule(Region *,
LoopStackTy &, LoopInfo &). Before, we took a reference to LoopStack.back()
which is a use after free, since back is popped off further below. This didn't
crash before by pure chance, since LoopStack is actually a vector, and the
memory isn't freed upon pop. I turned this into an iterator-based algorithm.
Reviewers: grosser, bollu, Meinersbur
Reviewed By: Meinersbur
Subscribers: llvm-commits, pollydev
Differential Revision: https://reviews.llvm.org/D39979
llvm-svn: 318415
The option splits BasicBlocks into minimal statements such that no
additional scalar dependencies are introduced.
The algorithm is based on a union-find structure, and unites sets if
putting them into separate statements would introduce a scalar
dependencies. As a consequence, instructions may be split into separate
statements such their relative order is different than the statements
they are in. This is accounted for instructions whose relative order
matters (e.g. memory accesses).
The algorithm is generic in that heuristic changes can be made
relatively easily. We might relax the order requirement for read-reads
or accesses to different base pointers. Forwardable instructions can be
made to not cause a join.
This implementation gives us a speed-up of 82% in SPEC 2006 456.hmmer
benchmark by allowing loop-distribution in a hot loop such that one of
the loops can be vectorized.
Differential Revision: https://reviews.llvm.org/D38403
llvm-svn: 314983
The option is introduced with only one possible value
-polly-stmt-granularity=bb which represents the current behaviour, which
is outlined into the new function buildSequentialBlockStmts().
More options will be added in future commits.
llvm-svn: 314900
Iterate over statement instructions instead over basic block
instructions when creating MemoryAccesses. It allows making the creation
of MemoryAccesses independent of how the basic blocks are split into
multiple ScopStmts.
llvm-svn: 314665
Create the MemoryAccesses of invariant loads separately and before
all other MemoryAccesses.
Invariant loads are classified as synthesizable and therefore are not
contained in any statement. When iterating over all instructions of all
statements, the invariant loads are consequently not processed and
iterating over them separately becomes necessary.
This patch can change the order in which MemoryAccesses are created, but
otherwise has no functional change.
Some temporary code is introduced to ensure correctness, but will be
removed in the next commit.
llvm-svn: 314664
Instructions that compute escaping values might be synthesizable and
therefore not contained in any ScopStmt. When buildAccessFunctions is
changed to only iterate over the instruction list of statement,
"free" instructions still need to be written. We do this after the
main MemoryAccesses have been created.
This can change the order in which MemoryAccesses are created, but has
otherwise no functional change.
llvm-svn: 314663
Decouple handling of exit block PHIs and other MemoryAccesses. Exit PHIs
only need the PHI handling part of buildAccessFunctions but requires
code for skipping them in while creating other MemoryAcesses.
This change will make it easier to modify how statement MemoryAccesses
are created without considering the exit block special case.
llvm-svn: 314662
Loads before the SCoP are always invariant within the SCoP and
therefore are no "required invariant loads". An assertion failes in
ScopBuilder when it finds such an invariant load.
Fix by not adding such loads to the required invariant load list. This
likely will cause the region to be not considered a valid SCoP.
We may want to unconditionally accept instructions defined before
the region as valid invariant conditions instead of rejecting them.
This fixes a compilation crash of SPEC CPU2006 453.povray's
render.cpp.
llvm-svn: 314636
In case a PHI node follows an error block we can assume that the incoming value
can only come from the node that is not an error block. As a result, conditions
that seemed non-affine before are now in fact affine.
This is a recommit of r312663 after fixing
test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll
llvm-svn: 314075
Before this patch, ScopInfo::getValueDef(SAI) used
getStmtFor(Instruction*) to find the MemoryAccess that writes a
MemoryKind::Value. In cases where the value is synthesizable within the
statement that defines, the instruction is not added to the statement's
instruction list, which means getStmtFor() won't return anything.
If the synthesiable instruction is not synthesiable in a different
statement (due to being defined in a loop that and ScalarEvolution
cannot derive its escape value), we still need a MemoryKind::Value
and a write to it that makes it available in the other statements.
Introduce a separate map for this purpose.
This fixes MultiSource/Benchmarks/MallocBench/cfrac where
-polly-simplify could not find the writing MemoryAccess for a use. The
write was not marked as required and consequently was removed.
Because this could in principle happen as well for PHI scalars,
add such a map for PHI reads as well.
llvm-svn: 313881
This reverts commit
r312410 - [ScopDetect/Info] Look through PHIs that follow an error block
The commit caused generation of invalid IR due to accessing a parameter
that does not dominate the SCoP.
llvm-svn: 312663
In case a PHI node follows an error block we can assume that the incoming value
can only come from the node that is not an error block. As a result, conditions
that seemed non-affine before are now in fact affine.
llvm-svn: 312410
Mark scalar dependences for different statements belonging to same BB
as 'Inter'.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D37147
llvm-svn: 312324
By using statement lists in the entry blocks of region statements, instruction
level analyses also work on region statements.
We currently only model the entry block of a region statements, as this is
sufficient for most transformations the known-passes currently execute. Modeling
instructions in the presence of control flow (e.g. infinite loops) is left
out to not increase code complexity too much. It can be added when good use
cases are found.
This change set is reapplied, after a memory corruption issue had been fixed.
llvm-svn: 312210
By using statement lists in the entry blocks of region statements, instruction
level analyses also work on region statements.
We currently only model the entry block of a region statements, as this is
sufficient for most transformations the known-passes currently execute. Modeling
instructions in the presence of control flow (e.g. infinite loops) is left
out to not increase code complexity too much. It can be added when good use
cases are found.
llvm-svn: 312128
Reduction detection is only executed in the SCoP building phase.
Hence it fits better into ScopBuilder to separate
SCoP-construction from SCoP modeling.
llvm-svn: 312118
This method is only called in the SCoP building phase.
Therefore it fits better into ScopBuilder to separate
SCoP-construction from SCoP modeling.
llvm-svn: 312117
This method is only called in the SCoP building phase.
Therefore it fits better into ScopBuilder to separate
SCoP-construction from SCoP modeling.
llvm-svn: 312116
This method is only called in the SCoP building phase.
Therefore it fits better into ScopBuilder to separate
SCoP-construction from SCoP modeling.
This mostly mechanical change makes ScopBuilder directly access some of
ScopStmt/MemoryAccess private fields. We add ScopBuilder as a friend
class and will add proper accessor functions sometime later.
llvm-svn: 312115
This patch allows annotating of metadata in ir instruction
(with "polly_split_after"), which specifies where to split a particular
scop statement.
Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in>
Differential Revision: https://reviews.llvm.org/D36402
llvm-svn: 312107
Properly require and preserve the OptimizationRemarkEmitter for use in
ScopPass. Previously one had to get the ORE from ScopDetection because
CodeGeneration did not mark it as preserved. It would need to be
recomputed which results in the legacy PM to throw away all previous
SCoP analysis.
This also changes the implementation of ScopPass::getAnalysisUsage to
not unconditionally preserve all passes, but only those needed to be
preserved by any SCoP pass (at least when using the legacy PM). This
allows invalidating DependenceInfo (and IslAstInfo) in case the pass
would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion)
JSONImporter should also invalidate the DependenceInfo. In this patch
it marks DependenceInfo as preserved anyway because some regression
tests depend on it.
Differential Revision: https://reviews.llvm.org/D37010
llvm-svn: 311888
In cases where the entry block of a scop was not contained in a loop that was
part of the scop region and at the same time there was a loop surrounding the
scop, we missed to count the loops in the scop and consequently did not consider
the scop profitable. We correct this by only moving to the loop parent, in case
the current loop is loop contained in the scop.
This increases the number of loops in COSMO which we assume to be profitable
from 3974 to 4981.
llvm-svn: 311863
Add statistics about
- Which optimizations are applied
- Number of loops in Scops at various stages
- Number of scalar/singleton writes at various stages representative
for scalar false dependencies
- Number of parallel loops
These will be useful to find regressions due to moving Polly further
down of LLVM's pass pipeline.
Differential Revision: https://reviews.llvm.org/D37049
llvm-svn: 311553
Loop with zero iteration are, syntactically, loops. They have been
excluded from the loop counter even for the non-profitable counters.
This seems to be unintentially as the sentinel value of '0' minimal
iterations does exclude such loops.
Fix by never considering the iteration count when the sentinel
value of 0 is found.
This makes the recently added NumTotalLoops couter redundant
with NumLoopsOverall, which now is equivalent. Hence, NumTotalLoops
is removed as well.
Note: The test case 'ScopDetect/statistics.ll' effectively does not
check profitability, because -polly-process-unprofitable is passed
to all test cases.
llvm-svn: 311551
Summary:
ScopDetection used to check if a loop withing a region was infinite and emitted a diagnostic in such cases. After r310940 there's no point checking against that situation, as infinite loops don't appear in regions anymore.
The test failure was observed on these two polly buildbots:
http://lab.llvm.org:8011/builders/polly-arm-linux/builds/8368http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/10310
This patch XFAILs `ReportLoopHasNoExit.ll` and turns infinite loop detection into an assert.
Reviewers: grosser, sanjoy, bollu
Reviewed By: grosser
Subscribers: efriedma, aemerson, kristof.beyls, dberlin, llvm-commits
Tags: #polly
Differential Revision: https://reviews.llvm.org/D36776
llvm-svn: 311503
Dragonegg generates most function parameters as pointers to the actual
parameters. However, it does not mark these parameters with the
dereferencable attribute.
Polly is conservative when it comes to invariant load
hoisting, thus we add runtime checks to invariant load hoisted pointers
when we do not know that pointers are dereferencable. This is correct behaviour,
but is a performance penalty.
Add a flag that allows all pointer parameters to be dereferencable. That
way, polly can speculatively load-hoist paramters to functions without
runtime checks.
Differential Revision: https://reviews.llvm.org/D36461
llvm-svn: 311329
This avoid the construction of very large sets and in many cases also keeps the
number of parameters low. As a result, we see a compile time reduction from 5
minutes to only slightly above 1 minute for one of our larger test cases.
llvm-svn: 311127
We add a ScopInliner pass which inlines functions based on a simple heuristic:
Let `g` call `f`.
If we can model all of `f` as a Scop, we inline `f` into `g`.
This requires `-polly-detect-full-function` to be enabled. So, the pass
asserts that `-polly-detect-full-function` is enabled.
Differential Revision: https://reviews.llvm.org/D36832
llvm-svn: 311126