The command line flag -polly-annotate-alias-scopes controls whether or not
Polly annotates alias scopes in the new SCoP (default ON). This can improve
later optimizations as the new SCoP is basically an alias free environment for
them.
llvm-svn: 218877
This change allows to annotate all parallel loops with loop id metadata.
Furthermore, it will annotate memory instructions with
llvm.mem.parallel_loop_access metadata for all surrounding parallel loops.
This is especially usefull if an external paralleliser is used.
This also removes the PollyLoopInfo class and comments the
LoopAnnotator.
A test case for multiple parallel loops is attached.
llvm-svn: 218793
If there are multiple read only base addresses in an alias group
we can split it into multiple alias groups each with only one
read only access. This way we might reduce the number of
comparisons significantly as it grows linear in the number of
alias groups but exponential in their size.
Differential Revision: http://reviews.llvm.org/D5435
llvm-svn: 218757
This fixes two problems which are usualy caused together:
1) The elements of an isl AST access expression could be pointers
not only integers, floats and vectores thereof.
2) The runtime alias checks need to compare pointers but if they
are of a different type we need to cast them into a "max" type
similar to the non pointer case.
llvm-svn: 218113
This change will build all alias groups (minimal/maximal accesses
to possible aliasing base pointers) we have to check before
we can assume an alias free environment. It will also use these
to create Runtime Alias Checks (RTC) in the ISL code generation
backend, thus allow us to optimize SCoPs despite possibly aliasing
pointers when this backend is used.
This feature will be enabled for the isl code generator, e.g.,
--polly-code-generator=isl, but disabled for:
- The cloog code generator (still the default).
- The case delinearization is enabled.
- The case non-affine accesses are allowed.
llvm-svn: 218046
We use SplitEdge to split a conditional entry edge of the SCoP region.
However, SplitEdge can cause two different situations (depending on
whether or not the edge is critical). This patch tests
which one is present and deals with the former unhandled one.
It also refactors and unifies the case we have to change the basic
blocks of the SCoP to new ones (see replaceScopAndRegionEntry).
llvm-svn: 217802
During the IslAst parallelism check also compute the minimal dependency
distance and store it in the IstAst for node.
Reviewer: sebpop
Differential Revision: http://reviews.llvm.org/D4987
llvm-svn: 217729
This allows us to omit the GuardBB in front of created loops
if we can show the loop trip count is at least one. It also
simplifies the dominance relation inside the new created region.
A GuardBB (even with a constant branch condition) might trigger
false dominance errors during function verification.
Differential Revision: http://reviews.llvm.org/D5297
llvm-svn: 217525
Summary:
+ Refactor the runtime check (RTC) build function
+ Added helper function to create an PollyIRBuilder
+ Change the simplify region function to create not
only unique entry and exit edges but also enfore that
the entry edge is unconditional
+ Cleaned the IslCodeGeneration runOnScop function:
- less post-creation changes of the created IR
+ Adjusted and added test cases
Reviewers: grosser, sebpop, simbuerg, dpeixott
Subscribers: llvm-commits, #polly
Differential Revision: http://reviews.llvm.org/D5076
llvm-svn: 217508
There was a bug in the IslAst which caused that no more outermost
parallel loops were detected/checked after a parallel outermost loop
of depth 1.
+ Test case attached
llvm-svn: 217452
In Polly we used to have a mix of test cases, some that used 'opt %s' and others
that used 'opt < %s'. We now change all to use 'opt < %s'. Piping in test files
is preferable as it does prevent temporary files to be written to disk. This
brings us in line with what is usus in LLVM.
llvm-svn: 216816
This replaces the use of %defaultOpts = '-basicaa -polly-prepare' with the
minimal set of passes necessary for a test to succeed. Of the test cases that
previously used %defaultOpts 76 test cases require none of these passes, 42
need -basicaa and only 2 need -polly-prepare. Our change makes this requirement
explicit.
In Polly many test cases have been using a macro '%defaultOpts' which run a
couple of preparing passes before the actual Polly test case. This macro was
introduced very early in the development of Polly and originally contained a
large set of canonicalization passes. However, as the need for additional
canonicalization passes makes test cases harder to understand and also more
fragile in terms of changes in such passes, we aim since a longer time to only
include the minimal set of passes necessary. This patch removes the last
leftovers from of %defaultOpts and brings our tests cases more in line to what
is usus in LLVM itself.
llvm-svn: 216815
This reverts commit 215466 (and 215528, a trivial formatting fix).
The intention of these commits is a good one, but unfortunately they broke
our LNT buildbot:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-codegen-isl
Several of the cleanup changes that have been combined in this 'fixup' are
trivial and could probably be committed as obvious changes without risking to
break the build. The remaining changes are little and it should be easy to
figure out what went wrong.
llvm-svn: 215817
There is no needed for neither 1-dimensional nor higher dimensional arrays to
require positive offsets in the outermost array dimension.
We originally introduced this assumption with the support for delinearizing
multi-dimensional arrays.
llvm-svn: 214665
+ Remove the class IslGenerator which duplicates the functionality of
IslExprBuilder.
+ Use the IslExprBuilder to create code for memory access relations.
+ Also handle array types during access creation.
+ Enable scev codegen for one of the transformed memory access tests,
thus access creation without canonical induction variables available.
+ Update one test case to the new output.
llvm-svn: 214659
The updated tests use a different context than the old ones did.
Other than that only their path and the code generation we use
changed.
llvm-svn: 214657
Use the fact that if we visit a for node first in pre and next in post order
we know we did not visit any children, thus we found an innermost loop.
+ Test case for an innermost loop with a conditional inside
llvm-svn: 213870
+ Introduced dependency type TYPE_TC_RED to represent the transitive closure
(& the reverse) of reduction dependences. These are used when we check for
reduction parallel loops.
+ Test cases including loop reversals and modulo schedules which compute
reductions in a alternated order.
llvm-svn: 213019
As our delinearization works optimistically, we need in some cases run-time
checks that verify our optimistic assumptions. A simple example is the
following code:
void foo(long n, long m, long o, double A[n][m][o]) {
for (long i = 0; i < 100; i++)
for (long j = 0; j < 150; j++)
for (long k = 0; k < 200; k++)
A[i][j][k] = 1.0;
}
After clang linearized the access to A and we delinearized it again to
A[i][j][k] we need to ensure that we do not access the delinearized array
out of bounds (this information is not available in LLVM-IR). Hence, we
need to verify the following constraints at run-time:
CHECK: Assumed Context:
CHECK: [o, m] -> { : m >= 150 and o >= 200 }
llvm-svn: 212198
For complex examples it may happen that we do not compute dependences. In this
case we do not want to crash, but just not detect parallel loops.
llvm-svn: 204470
This patch enables vectorization of loops containing backward array
traversal (array stride is -1).
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>
llvm-svn: 204257
For now we only mark innermost loops for the loop vectorizer. We could later
also mark not-innermost loops to enable the introduction of openmp parallelism.
llvm-svn: 202854
We now skip the debug intrinsics which is a lot better than crashing due to
uncopied metadata references. We should step by step investigate which debug
intrinsics we can copy without trouble.
We still keep the debug location metadata.
llvm-svn: 201860
This is not only not necessary, but in case -03 changes this can actually
cause arbitrarily failing test cases such as, e.g., a recent change by Chandler
that caused -O3 to unroll the loop body, which made the loop we wanted to
detect disappear and consequently this test case fail.
llvm-svn: 200204
When constructing a scop sometimes the exact representation of a statement or
condition would be very complex, but there is a common case which is a lot
simpler, but which is only valid under certain assumptions. The assumed context
records the assumptions taken during the construction of this scop and that need
to be code generated as a run-time test.
At the moment, we do not yet model any assumptions, but only added the
AssumedContext as well as the isl-ast generation support. As a next step,
this needs to be hooked up with the isl code generation.
if (1) /* run-time condition */
{ /* optimized code */ }
else
{ /* original code */ }
llvm-svn: 193652
The original test case showed a problem with the independet blocks pass and
we decided to XFAIL it for now. Unfortunately the failure is not detected if
we build without asserts and the verification of the independent block pass
is not run. This change tests now for the actual reason of the failure and
should trigger even in a non asserts build. We did not yet solve the underlying
bug, but this should at least make the test suite behavior consistent.
llvm-svn: 183025
When the Polly code generation was written we did not correctly update the
LoopInfo data, but still claimed that the loop information is correct. This
does not only lead to missed optimizations, but it can also cause
miscompilations in case passes such as LoopSimplify are run after Polly.
Reported-by: Sergei Larin <slarin@codeaurora.org>
llvm-svn: 181987
BeforeBB
|
v
GuardBB
/ \
__ PreHeaderBB \
/ \ / |
latch HeaderBB |
\ / \ /
< \ /
\ /
ExitBB
This does not only remove the need for an explicit loop rotate pass, but it also
gives us the possibility to skip the construction of the guard condition in case
the loop is known to be executed at least once. We do not yet exploit this, but
by implementing this analysis in the isl code generator we should be able to
remove more guards than the generic loop rotate pass can. Another point is that
loop rotation can introduce additional PHI nodes, which may hide that a loop can
be executed in parallel. This change avoids this complication and will make it
easier to move the openmp code generation into a separate pass.
llvm-svn: 181986
In the classical (non -polly-codegen-scev) mode, we assume that we can always
recreate PHI nodes during code generation. This is not true. We can only
reconstruct them from the polyhedral information, in case the entire loop of the
PHI node is part of the SCoP and consequently the PHI node was translated in
the polyhedral description.
llvm-svn: 179674
We now support regions with multiple entries and multiple exits natively.
Regions are not needed to be simplified to single entry and single exit.
We need to XFAIL two test cases as this change increases the scop coverage
and uncoveres two failures in the independent blocks pass. The first failure
will be fixed in a subsequent commit, the second one is in the non-default
-polly-codegen-scev mode and still needs to be fixed.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179673
Regions that have multiple entry edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
for_region
This for_region contains two entry edges 'then' -> 'for_region' and 'else' -> 'for_region'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively when the region is
in -loop-simplify form, which means the entry block should not be a loop header.
Contributed by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179586
Regions that have multiple exit edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
after
Region: if -> after
This regions contains the bbs 'if', 'then', 'else', but not 'after'. It has
two exit edges 'then' -> 'after' and 'else' -> 'after'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179159
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When code generating S using scev based code generation, we need to retrieve
the scev of 'i' at the location of 'S'. If we do not do this the scev that
we obtain will be expressed as {0,+,1}_for and will reference loop iterators
that do not surround 'S' and that we consequently do not know how to code
generate. What we really want is the scev to be instantiated to the value of 'i'
after the loop. This value is {10} and it can be code generated without
troubles.
llvm-svn: 177777
When doing SCEV based code generation, we ignore instructions calculating values
that are fully defined by a SCEV expression. The values that are calculated by
this instructions are recalculated on demand.
This commit improves the check to verify if certain instructions can be ignored
and recalculated on demand.
llvm-svn: 177313
We need to remove one dimension. Any is correct as long as it exists. We have
choosen for whatever reason the dimension #dims - 2. This is incorrect if
there is just one dimension. For CLooG this case did never happen. For isl
however, the case can happen and causes undefined behavior including crashes.
We choose now always the last dimension #dims - 1. We could have choosen
dimension '0' but the last dimension is what we remove conceptionally in the
algorithm, so it seems better to actually program it that way.
While at it remove another piece of undefined behavior.
llvm-svn: 174894