SCoP invariant parameters with the different start value would deter parameter
sharing. For example, when compiling the following C code:
void foo(float *input) {
for (long j = 0; j < 8; j++) {
// SCoP begin
for (long i = 0; i < 8; i++) {
float x = input[j * 64 + i + 1];
input[j * 64 + i] = x * x;
}
}
}
Polly would creat two parameters for these memory accesses:
p_0: {0,+,256}
p_2: {4,+,256}
[j * 64 + i + 1] => MemRef_input[o0] : 4o0 = p_1 + 4i0
[j * 64 + i] => MemRef_input[o0] : 4o0 = p_0 + 4i0
These parameters only differ from start value. To enable parameter sharing,
we split the start value from SCEVAddRecExpr, so they would share a single
parameter that always has zero start value:
p0: {0,+,256}<%for.cond1.preheader>
[j * 64 + i + 1] => MemRef_input[o0] : 4o0 = 4 + p_1 + 4i0
[j * 64 + i] => MemRef_input[o0] : 4o0 = p_0 + 4i0
Such translation can make the polly-dependence much faster.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 187728
In case we detect that the schedule the user wants to import is invalid we
refuse it _and_ free the isl_maps containing it.
Another bug found thanks to Rafael.
llvm-svn: 187339
We now use __isl_take to annotate the uses of the isl_set where we got the
memory management wrong.
Thanks to Rafael! His pipefail work hardened our test environment and exposed
this bug nicely.
llvm-svn: 187338
Ensure that the scalar write access corresponds to the result of a load
instruction appears after the generic read access corresponds to the load
instruction.
llvm-svn: 186419
Previously this happend to work for integers up to i64, but we got it wrong
for larger numbers. Fix this and add test cases to verify this keeps working.
Reported by: Sven Verdoolaege <skimo at kotnet dot org>
llvm-svn: 183986
When a region header is part of a loop, then all entering edges of this region
should not come from the loop but outside the region. Otherwise, the loop may be
only partially part of the region, which would cause troubles in handling
induction variables.
Currently, we can only model induction variables that are either fully part of
the scop (loop induction variable) or induction variables that are scop-
invariant (parameter). A loop that is only partially part of the
scop causes troubles, as there is no good way to handle the induction
variable in the independent blocks pass.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 183800
The original test case showed a problem with the independet blocks pass and
we decided to XFAIL it for now. Unfortunately the failure is not detected if
we build without asserts and the verification of the independent block pass
is not run. This change tests now for the actual reason of the failure and
should trigger even in a non asserts build. We did not yet solve the underlying
bug, but this should at least make the test suite behavior consistent.
llvm-svn: 183025
When the Polly code generation was written we did not correctly update the
LoopInfo data, but still claimed that the loop information is correct. This
does not only lead to missed optimizations, but it can also cause
miscompilations in case passes such as LoopSimplify are run after Polly.
Reported-by: Sergei Larin <slarin@codeaurora.org>
llvm-svn: 181987
BeforeBB
|
v
GuardBB
/ \
__ PreHeaderBB \
/ \ / |
latch HeaderBB |
\ / \ /
< \ /
\ /
ExitBB
This does not only remove the need for an explicit loop rotate pass, but it also
gives us the possibility to skip the construction of the guard condition in case
the loop is known to be executed at least once. We do not yet exploit this, but
by implementing this analysis in the isl code generator we should be able to
remove more guards than the generic loop rotate pass can. Another point is that
loop rotation can introduce additional PHI nodes, which may hide that a loop can
be executed in parallel. This change avoids this complication and will make it
easier to move the openmp code generation into a separate pass.
llvm-svn: 181986
Use the new cl::OptionCategory support to move the Polly options into a separate
option category. The aim is to hide most options and show by default only the
options a user needs to influence '-O3 -polly'. The available options probably
need some care, but here is the current status:
Polly Options:
Configure the polly loop optimizer
-enable-polly-openmp - Generate OpenMP parallel code
-polly - Enable the polly optimizer (only at -O3)
-polly-no-tiling - Disable tiling in the scheduler
-polly-only-func=<function-name> - Only run on a single function
-polly-report - Print information about the activities
of Polly
-polly-vectorizer - Select the vectorization strategy
=none - No Vectorization
=polly - Polly internal vectorizer
=unroll-only - Only grouped unroll the vectorize
candidate loops
=bb - The Basic Block vectorizer driven by
Polly
llvm-svn: 181295
In the classical (non -polly-codegen-scev) mode, we assume that we can always
recreate PHI nodes during code generation. This is not true. We can only
reconstruct them from the polyhedral information, in case the entire loop of the
PHI node is part of the SCoP and consequently the PHI node was translated in
the polyhedral description.
llvm-svn: 179674
We now support regions with multiple entries and multiple exits natively.
Regions are not needed to be simplified to single entry and single exit.
We need to XFAIL two test cases as this change increases the scop coverage
and uncoveres two failures in the independent blocks pass. The first failure
will be fixed in a subsequent commit, the second one is in the non-default
-polly-codegen-scev mode and still needs to be fixed.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179673
Regions that have multiple entry edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
for_region
This for_region contains two entry edges 'then' -> 'for_region' and 'else' -> 'for_region'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively when the region is
in -loop-simplify form, which means the entry block should not be a loop header.
Contributed by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179586
We do not only need to understand that 'k * p' is a parameter expression, but
also need to store this expression in the set of parameters. Before this patch
we wrongly stored the two individual parameters %k and %p.
Reported by: Sebastian Pop <spop@codeaurora.org>
llvm-svn: 179485
Statements with an empty iteration domain may not have a schedule assigned by
the isl schedule optimizer. As Polly expects each statement to have a schedule,
we keep the old schedule for such statements.
This fixes http://llvm.org/PR15645`
Reported-by: Johannes Doerfert <johannesdoerfert@gmx.de>
llvm-svn: 179233
Regions that have multiple exit edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
after
Region: if -> after
This regions contains the bbs 'if', 'then', 'else', but not 'after'. It has
two exit edges 'then' -> 'after' and 'else' -> 'after'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179159
Fix inspired from c2d4a0627e95c34a819b9d4ffb4db62daa78dade.
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When translate the data reference A[i] in statement S using scev, we need to
retrieve the scev of 'i' at the location of 'S'. If we do not do this the
scev that we obtain will be expressed as {0,+,1}_for and will reference loop
iterators that do not surround 'S'. What we really want is the scev to be
instantiated to the value of 'i' after the loop. This value is {10}.
This used to crash in:
int loopDimension = getLoopDepth(Expr->getLoop());
isl_aff *LAff = isl_aff_set_coefficient_si(
isl_aff_zero_on_domain(LocalSpace), isl_dim_in, loopDimension, 1);
(gdb) p Expr->dump()
{8,+,8}<nw><%do.body>
(gdb) p getLoopDepth(Expr->getLoop())
$5 = 0
isl_space *Space = isl_space_set_alloc(Ctx, 0, NbLoopSpaces);
isl_local_space *LocalSpace = isl_local_space_from_space(Space);
As we are trying to create a memory access in a stmt that is outside all loops,
LocalSpace has 0 dimensions:
(gdb) p NbLoopSpaces
$12 = 0
(gdb) p Statement.BB->dump()
if.then: ; preds = %do.end
%0 = load float* %add.ptr, align 4
store float %0, float* %q.1.reg2mem, align 4
br label %if.end.single_exit
and so the scev for %add.ptr should be taken at the place where it is used,
i.e., it should be the value on the last iteration of the do.body loop, and not
"{8,+,8}<nw><%do.body>".
llvm-svn: 179148
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.
We currently have some not very nice formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweights
the not perfect layout of this code.
llvm-svn: 177796
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When code generating S using scev based code generation, we need to retrieve
the scev of 'i' at the location of 'S'. If we do not do this the scev that
we obtain will be expressed as {0,+,1}_for and will reference loop iterators
that do not surround 'S' and that we consequently do not know how to code
generate. What we really want is the scev to be instantiated to the value of 'i'
after the loop. This value is {10} and it can be code generated without
troubles.
llvm-svn: 177777
We now detect scops without a canonical induction variable and can generate a
polyhedral representation for them. There was no modification necessary to
code generate these scops.
llvm-svn: 177643
When using the scev based code generation, we now do not rely on the presence
of a canonical induction variable any more. This commit prepares the path to
(conditionally) disable the induction variable canonicalization pass.
llvm-svn: 177548