Further:
o ensure that the header is properly readable even on smaller screen sizes.
o Shorten the table of contents of the documentation section.
llvm-svn: 197794
We now report the following:
$ polly-clang -O3 -mllvm -polly -mllvm -polly-report test.c -c \
-gline-tables-only
note: Polly detected an optimizable loop region (scop) in function 'foo'
test.c:2: Start of scop
test.c:3: End of scop
note: Polly detected an optimizable loop region (scop) in function 'bar'
test.c:9: Start of scop
test.c:13: End of scop
llvm-svn: 197558
We still have troubles as clang is not properly built yet. I messed up a path
in the PollyBuilder and am waiting for another buildmaster restart.
llvm-svn: 195520
For for-nodes that are translated to a set of vector lanes, we already know the
overall number of iterations. Calculating the upper bound is consequently not
necessary. This change removes the code for upper bound calculation, which was
probably copy/pasted from the code generation for the normal for-loop.
This issue was found by Sylvestre's scan-build server.
llvm-svn: 193925
When constructing a scop sometimes the exact representation of a statement or
condition would be very complex, but there is a common case which is a lot
simpler, but which is only valid under certain assumptions. The assumed context
records the assumptions taken during the construction of this scop and that need
to be code generated as a run-time test.
At the moment, we do not yet model any assumptions, but only added the
AssumedContext as well as the isl-ast generation support. As a next step,
this needs to be hooked up with the isl code generation.
if (1) /* run-time condition */
{ /* optimized code */ }
else
{ /* original code */ }
llvm-svn: 193652
Instead of defining the relevant functions inline, we now just keep the
declarations in the class itself. This makes the class declaration a lot
easier to read as all functions can be seen at once. We also use this
opportunity to privatize all functions not used in the public interface of the
class.
llvm-svn: 190841
Use 0 >= 1 instead of 0 != 0 to represent 'false'. This might be slightly more
efficient as isl may create a union of sets for 0 != 0, whereas this is never
needed for the expression 0 >= 1.
Contributed-by: Alexandre Isoard <alexandre.isoard@gmail.com>
llvm-svn: 190384
The Makefile rule "polly-test" has been renamed to
"check-polly" in r182171. This CL updates the document and
the automatic build script.
llvm-svn: 188624
SCoP invariant parameters with the different start value would deter parameter
sharing. For example, when compiling the following C code:
void foo(float *input) {
for (long j = 0; j < 8; j++) {
// SCoP begin
for (long i = 0; i < 8; i++) {
float x = input[j * 64 + i + 1];
input[j * 64 + i] = x * x;
}
}
}
Polly would creat two parameters for these memory accesses:
p_0: {0,+,256}
p_2: {4,+,256}
[j * 64 + i + 1] => MemRef_input[o0] : 4o0 = p_1 + 4i0
[j * 64 + i] => MemRef_input[o0] : 4o0 = p_0 + 4i0
These parameters only differ from start value. To enable parameter sharing,
we split the start value from SCEVAddRecExpr, so they would share a single
parameter that always has zero start value:
p0: {0,+,256}<%for.cond1.preheader>
[j * 64 + i + 1] => MemRef_input[o0] : 4o0 = 4 + p_1 + 4i0
[j * 64 + i] => MemRef_input[o0] : 4o0 = p_0 + 4i0
Such translation can make the polly-dependence much faster.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 187728
In case we detect that the schedule the user wants to import is invalid we
refuse it _and_ free the isl_maps containing it.
Another bug found thanks to Rafael.
llvm-svn: 187339
We now use __isl_take to annotate the uses of the isl_set where we got the
memory management wrong.
Thanks to Rafael! His pipefail work hardened our test environment and exposed
this bug nicely.
llvm-svn: 187338
Split the old getNewValue into two parts:
1. The function "lookupAvailableValue" that return the new version of
the instruction which is already available.
2. The function calls "lookupAvailableValue", and tries to generate
the new version if it is not available yet.
llvm-svn: 187114
String operations resulted by raw_string_ostream in the INVALID macro can lead
to significant compile-time overhead when compiling large size source code.
This is because raw_string_ostream relies on TypeFinder class, whose
compile-time cost increases as the size of the module increases. This patch
targets to ensure that it only track detection failures if actually needed.
In this way, we can avoid expensive string operations in normal execution.
With this patch file, the relative compile-time cost of Polly-detect pass does
not increase even when compiling very large size source code.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 187102
Ensure that the scalar write access corresponds to the result of a load
instruction appears after the generic read access corresponds to the load
instruction.
llvm-svn: 186419
Orignally, we first test if a ValueMap contains a Value, and than use the
index operator to get the corresponding new value. This requires the ValueMap
to lookup the key (i.e. the old value) twice.
Now, we directly use the "lookup" function provided by DenseMap to implement
the same functionality.
llvm-svn: 185260
1. Do not allow creating new memory access record in the InstructionToAccess map
on the fly in function getAccessFor.
2. Do not allow user to modify the memory accesses returned by getAccessFor
during the code generation process.
llvm-svn: 185253
isl recently introduced isl_val as an abstract interface to represent arbitrary
precision numbers. This interface superseeds the old isl_int interface. In
contrast to the old interface which implemented arbitrary precision arithmetic
using macros that forward to the gmp library, the new library hides the math
library implementation in isl. This allows us to switch the math library used by
isl without affecting users such as Polly.
llvm-svn: 184529
Previously this happend to work for integers up to i64, but we got it wrong
for larger numbers. Fix this and add test cases to verify this keeps working.
Reported by: Sven Verdoolaege <skimo at kotnet dot org>
llvm-svn: 183986
When a region header is part of a loop, then all entering edges of this region
should not come from the loop but outside the region. Otherwise, the loop may be
only partially part of the region, which would cause troubles in handling
induction variables.
Currently, we can only model induction variables that are either fully part of
the scop (loop induction variable) or induction variables that are scop-
invariant (parameter). A loop that is only partially part of the
scop causes troubles, as there is no good way to handle the induction
variable in the independent blocks pass.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 183800
The latest version of isl includes a new data type isl_val, which properly hides
the multi precision math library used by isl. In Polly we would like to replace
all uses of isl_int with the corresponding isl_val interfaces. This will allow
us to switch the multi precision math library in isl. This is especially
interesting for people who would like to replace libgmp with a non-gpl licensed
library (e.g. imath).
llvm-svn: 183026
The original test case showed a problem with the independet blocks pass and
we decided to XFAIL it for now. Unfortunately the failure is not detected if
we build without asserts and the verification of the independent block pass
is not run. This change tests now for the actual reason of the failure and
should trigger even in a non asserts build. We did not yet solve the underlying
bug, but this should at least make the test suite behavior consistent.
llvm-svn: 183025
In GDB when "step" through generateScalarLoad and "finish" the call, the
returned value is non NULL, however when printing the value contained in
BBMap[Load] after this stmt:
BBMap[Load] = generateScalarLoad(...);
the value in BBMap[Load] is NULL, and the BBMap.count(Load) is 1.
The only intuitive idea that I have to explain this behavior is that we are
playing with the undefined behavior of eval order of the params for the function
standing for "BBMap[Load] = generateScalarLoad()". "BBMap[Load] = " may be
executed before generateScalarLoad is called.
Here are some other possible explanations from Will Dietz <w@wdtz.org>:
The error is likely due to BBMap[Load] being evaluated first (creating
a {Load -> uninitialized } entry in the DenseMap), then
generateScalarLoad eventually accesses the same element and finds it
to be NULL (DenseMap[Old]).. Offhand I'm not sure if this is
guaranteed to be NULL or if it's uninitialized and happens to be NULL.
The same issue can also go wrong in an even worse way: the second
DenseMap access can trigger a rehash and *invalidate* the an earlier
evaluated expression (for example LHS of the assignment), leading to a
crash when performing the assignment store.
llvm-svn: 182655
It was initially committed to allow people to get a list of the files used
or generated in the matmul tutorial. Since the documentation does now
point people to the directory in their git checkout, it is not necessary anymore
to make a directory listing available. Especially, as this never worked and
recently the LLVM web server does not deliver files in this directory at all
due to the unsupported .htaccess file.
llvm-svn: 182370
As the namings of the scops have changed, polly was not able to read in the user
given .jscop files. By renaming the provided files, polly now finds them again
and can use them to optimize the matmul function. We also update the generated
files to reflect the very latest version of Polly.
llvm-svn: 182265
When the Polly code generation was written we did not correctly update the
LoopInfo data, but still claimed that the loop information is correct. This
does not only lead to missed optimizations, but it can also cause
miscompilations in case passes such as LoopSimplify are run after Polly.
Reported-by: Sergei Larin <slarin@codeaurora.org>
llvm-svn: 181987
BeforeBB
|
v
GuardBB
/ \
__ PreHeaderBB \
/ \ / |
latch HeaderBB |
\ / \ /
< \ /
\ /
ExitBB
This does not only remove the need for an explicit loop rotate pass, but it also
gives us the possibility to skip the construction of the guard condition in case
the loop is known to be executed at least once. We do not yet exploit this, but
by implementing this analysis in the isl code generator we should be able to
remove more guards than the generic loop rotate pass can. Another point is that
loop rotation can introduce additional PHI nodes, which may hide that a loop can
be executed in parallel. This change avoids this complication and will make it
easier to move the openmp code generation into a separate pass.
llvm-svn: 181986
Use the new cl::OptionCategory support to move the Polly options into a separate
option category. The aim is to hide most options and show by default only the
options a user needs to influence '-O3 -polly'. The available options probably
need some care, but here is the current status:
Polly Options:
Configure the polly loop optimizer
-enable-polly-openmp - Generate OpenMP parallel code
-polly - Enable the polly optimizer (only at -O3)
-polly-no-tiling - Disable tiling in the scheduler
-polly-only-func=<function-name> - Only run on a single function
-polly-report - Print information about the activities
of Polly
-polly-vectorizer - Select the vectorization strategy
=none - No Vectorization
=polly - Polly internal vectorizer
=unroll-only - Only grouped unroll the vectorize
candidate loops
=bb - The Basic Block vectorizer driven by
Polly
llvm-svn: 181295
In the classical (non -polly-codegen-scev) mode, we assume that we can always
recreate PHI nodes during code generation. This is not true. We can only
reconstruct them from the polyhedral information, in case the entire loop of the
PHI node is part of the SCoP and consequently the PHI node was translated in
the polyhedral description.
llvm-svn: 179674
We now support regions with multiple entries and multiple exits natively.
Regions are not needed to be simplified to single entry and single exit.
We need to XFAIL two test cases as this change increases the scop coverage
and uncoveres two failures in the independent blocks pass. The first failure
will be fixed in a subsequent commit, the second one is in the non-default
-polly-codegen-scev mode and still needs to be fixed.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179673
Regions that have multiple entry edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
for_region
This for_region contains two entry edges 'then' -> 'for_region' and 'else' -> 'for_region'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively when the region is
in -loop-simplify form, which means the entry block should not be a loop header.
Contributed by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179586
We do not only need to understand that 'k * p' is a parameter expression, but
also need to store this expression in the set of parameters. Before this patch
we wrongly stored the two individual parameters %k and %p.
Reported by: Sebastian Pop <spop@codeaurora.org>
llvm-svn: 179485
Statements with an empty iteration domain may not have a schedule assigned by
the isl schedule optimizer. As Polly expects each statement to have a schedule,
we keep the old schedule for such statements.
This fixes http://llvm.org/PR15645`
Reported-by: Johannes Doerfert <johannesdoerfert@gmx.de>
llvm-svn: 179233
Regions that have multiple exit edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
after
Region: if -> after
This regions contains the bbs 'if', 'then', 'else', but not 'after'. It has
two exit edges 'then' -> 'after' and 'else' -> 'after'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179159
During code generation we split the original entry and exit basic blocks
of the scop to make room for the newly generated code. To keep the region tree
up to date, we need to update the region tree. This patch ensures that not only
the region of the scop is updated, but also all child regions that share the
same entry or exit block.
We have now test case here, as the bug is only exposed by the subsequent commit.
The test cases of that commit also cover this bug.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179158
Fix inspired from c2d4a0627e95c34a819b9d4ffb4db62daa78dade.
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When translate the data reference A[i] in statement S using scev, we need to
retrieve the scev of 'i' at the location of 'S'. If we do not do this the
scev that we obtain will be expressed as {0,+,1}_for and will reference loop
iterators that do not surround 'S'. What we really want is the scev to be
instantiated to the value of 'i' after the loop. This value is {10}.
This used to crash in:
int loopDimension = getLoopDepth(Expr->getLoop());
isl_aff *LAff = isl_aff_set_coefficient_si(
isl_aff_zero_on_domain(LocalSpace), isl_dim_in, loopDimension, 1);
(gdb) p Expr->dump()
{8,+,8}<nw><%do.body>
(gdb) p getLoopDepth(Expr->getLoop())
$5 = 0
isl_space *Space = isl_space_set_alloc(Ctx, 0, NbLoopSpaces);
isl_local_space *LocalSpace = isl_local_space_from_space(Space);
As we are trying to create a memory access in a stmt that is outside all loops,
LocalSpace has 0 dimensions:
(gdb) p NbLoopSpaces
$12 = 0
(gdb) p Statement.BB->dump()
if.then: ; preds = %do.end
%0 = load float* %add.ptr, align 4
store float %0, float* %q.1.reg2mem, align 4
br label %if.end.single_exit
and so the scev for %add.ptr should be taken at the place where it is used,
i.e., it should be the value on the last iteration of the do.body loop, and not
"{8,+,8}<nw><%do.body>".
llvm-svn: 179148
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.
We currently have some not very nice formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweights
the not perfect layout of this code.
llvm-svn: 177796
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When code generating S using scev based code generation, we need to retrieve
the scev of 'i' at the location of 'S'. If we do not do this the scev that
we obtain will be expressed as {0,+,1}_for and will reference loop iterators
that do not surround 'S' and that we consequently do not know how to code
generate. What we really want is the scev to be instantiated to the value of 'i'
after the loop. This value is {10} and it can be code generated without
troubles.
llvm-svn: 177777
Scev code generation can now handle scops with non canonical induction
variables. Hence there is no need to introduce canonical ones any more.
llvm-svn: 177644
We now detect scops without a canonical induction variable and can generate a
polyhedral representation for them. There was no modification necessary to
code generate these scops.
llvm-svn: 177643
When using the scev based code generation, we now do not rely on the presence
of a canonical induction variable any more. This commit prepares the path to
(conditionally) disable the induction variable canonicalization pass.
llvm-svn: 177548
This allows us to test Polly and the Polly optimizer without actually doing
code generation at the end. By enabling this option, we can also measure the
compile time overhead due to code generation and the cost of LLVM optimizing the
newly generated code.t
llvm-svn: 177516
When doing SCEV based code generation, we ignore instructions calculating values
that are fully defined by a SCEV expression. The values that are calculated by
this instructions are recalculated on demand.
This commit improves the check to verify if certain instructions can be ignored
and recalculated on demand.
llvm-svn: 177313
In my previous commits I failed to realise that my new requires lines fully
disabled these tests. We now properly check if we are in an asserts build and
only disable the tests if assertions are not available.
Reported-by: Sean Silva <silvas@purdue.edu>
llvm-svn: 176900
This fixes issues caused by the following commit:
r176733 | jvoung | 2013-03-08 17:56:31 -0500
Disable statistics on Release builds and move tests that depend on -stats.
Reported by: Jack Howarth <howarth@bromo.med.uc.edu>
llvm-svn: 176856