1. Do not allow creating new memory access record in the InstructionToAccess map
on the fly in function getAccessFor.
2. Do not allow user to modify the memory accesses returned by getAccessFor
during the code generation process.
llvm-svn: 185253
isl recently introduced isl_val as an abstract interface to represent arbitrary
precision numbers. This interface superseeds the old isl_int interface. In
contrast to the old interface which implemented arbitrary precision arithmetic
using macros that forward to the gmp library, the new library hides the math
library implementation in isl. This allows us to switch the math library used by
isl without affecting users such as Polly.
llvm-svn: 184529
Previously this happend to work for integers up to i64, but we got it wrong
for larger numbers. Fix this and add test cases to verify this keeps working.
Reported by: Sven Verdoolaege <skimo at kotnet dot org>
llvm-svn: 183986
When a region header is part of a loop, then all entering edges of this region
should not come from the loop but outside the region. Otherwise, the loop may be
only partially part of the region, which would cause troubles in handling
induction variables.
Currently, we can only model induction variables that are either fully part of
the scop (loop induction variable) or induction variables that are scop-
invariant (parameter). A loop that is only partially part of the
scop causes troubles, as there is no good way to handle the induction
variable in the independent blocks pass.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 183800
In GDB when "step" through generateScalarLoad and "finish" the call, the
returned value is non NULL, however when printing the value contained in
BBMap[Load] after this stmt:
BBMap[Load] = generateScalarLoad(...);
the value in BBMap[Load] is NULL, and the BBMap.count(Load) is 1.
The only intuitive idea that I have to explain this behavior is that we are
playing with the undefined behavior of eval order of the params for the function
standing for "BBMap[Load] = generateScalarLoad()". "BBMap[Load] = " may be
executed before generateScalarLoad is called.
Here are some other possible explanations from Will Dietz <w@wdtz.org>:
The error is likely due to BBMap[Load] being evaluated first (creating
a {Load -> uninitialized } entry in the DenseMap), then
generateScalarLoad eventually accesses the same element and finds it
to be NULL (DenseMap[Old]).. Offhand I'm not sure if this is
guaranteed to be NULL or if it's uninitialized and happens to be NULL.
The same issue can also go wrong in an even worse way: the second
DenseMap access can trigger a rehash and *invalidate* the an earlier
evaluated expression (for example LHS of the assignment), leading to a
crash when performing the assignment store.
llvm-svn: 182655
When the Polly code generation was written we did not correctly update the
LoopInfo data, but still claimed that the loop information is correct. This
does not only lead to missed optimizations, but it can also cause
miscompilations in case passes such as LoopSimplify are run after Polly.
Reported-by: Sergei Larin <slarin@codeaurora.org>
llvm-svn: 181987
BeforeBB
|
v
GuardBB
/ \
__ PreHeaderBB \
/ \ / |
latch HeaderBB |
\ / \ /
< \ /
\ /
ExitBB
This does not only remove the need for an explicit loop rotate pass, but it also
gives us the possibility to skip the construction of the guard condition in case
the loop is known to be executed at least once. We do not yet exploit this, but
by implementing this analysis in the isl code generator we should be able to
remove more guards than the generic loop rotate pass can. Another point is that
loop rotation can introduce additional PHI nodes, which may hide that a loop can
be executed in parallel. This change avoids this complication and will make it
easier to move the openmp code generation into a separate pass.
llvm-svn: 181986
Use the new cl::OptionCategory support to move the Polly options into a separate
option category. The aim is to hide most options and show by default only the
options a user needs to influence '-O3 -polly'. The available options probably
need some care, but here is the current status:
Polly Options:
Configure the polly loop optimizer
-enable-polly-openmp - Generate OpenMP parallel code
-polly - Enable the polly optimizer (only at -O3)
-polly-no-tiling - Disable tiling in the scheduler
-polly-only-func=<function-name> - Only run on a single function
-polly-report - Print information about the activities
of Polly
-polly-vectorizer - Select the vectorization strategy
=none - No Vectorization
=polly - Polly internal vectorizer
=unroll-only - Only grouped unroll the vectorize
candidate loops
=bb - The Basic Block vectorizer driven by
Polly
llvm-svn: 181295
In the classical (non -polly-codegen-scev) mode, we assume that we can always
recreate PHI nodes during code generation. This is not true. We can only
reconstruct them from the polyhedral information, in case the entire loop of the
PHI node is part of the SCoP and consequently the PHI node was translated in
the polyhedral description.
llvm-svn: 179674
We now support regions with multiple entries and multiple exits natively.
Regions are not needed to be simplified to single entry and single exit.
We need to XFAIL two test cases as this change increases the scop coverage
and uncoveres two failures in the independent blocks pass. The first failure
will be fixed in a subsequent commit, the second one is in the non-default
-polly-codegen-scev mode and still needs to be fixed.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179673
Regions that have multiple entry edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
for_region
This for_region contains two entry edges 'then' -> 'for_region' and 'else' -> 'for_region'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively when the region is
in -loop-simplify form, which means the entry block should not be a loop header.
Contributed by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179586
We do not only need to understand that 'k * p' is a parameter expression, but
also need to store this expression in the set of parameters. Before this patch
we wrongly stored the two individual parameters %k and %p.
Reported by: Sebastian Pop <spop@codeaurora.org>
llvm-svn: 179485
Statements with an empty iteration domain may not have a schedule assigned by
the isl schedule optimizer. As Polly expects each statement to have a schedule,
we keep the old schedule for such statements.
This fixes http://llvm.org/PR15645`
Reported-by: Johannes Doerfert <johannesdoerfert@gmx.de>
llvm-svn: 179233
Regions that have multiple exit edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
after
Region: if -> after
This regions contains the bbs 'if', 'then', 'else', but not 'after'. It has
two exit edges 'then' -> 'after' and 'else' -> 'after'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179159
During code generation we split the original entry and exit basic blocks
of the scop to make room for the newly generated code. To keep the region tree
up to date, we need to update the region tree. This patch ensures that not only
the region of the scop is updated, but also all child regions that share the
same entry or exit block.
We have now test case here, as the bug is only exposed by the subsequent commit.
The test cases of that commit also cover this bug.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179158
Fix inspired from c2d4a0627e95c34a819b9d4ffb4db62daa78dade.
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When translate the data reference A[i] in statement S using scev, we need to
retrieve the scev of 'i' at the location of 'S'. If we do not do this the
scev that we obtain will be expressed as {0,+,1}_for and will reference loop
iterators that do not surround 'S'. What we really want is the scev to be
instantiated to the value of 'i' after the loop. This value is {10}.
This used to crash in:
int loopDimension = getLoopDepth(Expr->getLoop());
isl_aff *LAff = isl_aff_set_coefficient_si(
isl_aff_zero_on_domain(LocalSpace), isl_dim_in, loopDimension, 1);
(gdb) p Expr->dump()
{8,+,8}<nw><%do.body>
(gdb) p getLoopDepth(Expr->getLoop())
$5 = 0
isl_space *Space = isl_space_set_alloc(Ctx, 0, NbLoopSpaces);
isl_local_space *LocalSpace = isl_local_space_from_space(Space);
As we are trying to create a memory access in a stmt that is outside all loops,
LocalSpace has 0 dimensions:
(gdb) p NbLoopSpaces
$12 = 0
(gdb) p Statement.BB->dump()
if.then: ; preds = %do.end
%0 = load float* %add.ptr, align 4
store float %0, float* %q.1.reg2mem, align 4
br label %if.end.single_exit
and so the scev for %add.ptr should be taken at the place where it is used,
i.e., it should be the value on the last iteration of the do.body loop, and not
"{8,+,8}<nw><%do.body>".
llvm-svn: 179148
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.
We currently have some not very nice formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweights
the not perfect layout of this code.
llvm-svn: 177796
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When code generating S using scev based code generation, we need to retrieve
the scev of 'i' at the location of 'S'. If we do not do this the scev that
we obtain will be expressed as {0,+,1}_for and will reference loop iterators
that do not surround 'S' and that we consequently do not know how to code
generate. What we really want is the scev to be instantiated to the value of 'i'
after the loop. This value is {10} and it can be code generated without
troubles.
llvm-svn: 177777
Scev code generation can now handle scops with non canonical induction
variables. Hence there is no need to introduce canonical ones any more.
llvm-svn: 177644
We now detect scops without a canonical induction variable and can generate a
polyhedral representation for them. There was no modification necessary to
code generate these scops.
llvm-svn: 177643
When using the scev based code generation, we now do not rely on the presence
of a canonical induction variable any more. This commit prepares the path to
(conditionally) disable the induction variable canonicalization pass.
llvm-svn: 177548
This allows us to test Polly and the Polly optimizer without actually doing
code generation at the end. By enabling this option, we can also measure the
compile time overhead due to code generation and the cost of LLVM optimizing the
newly generated code.t
llvm-svn: 177516
When doing SCEV based code generation, we ignore instructions calculating values
that are fully defined by a SCEV expression. The values that are calculated by
this instructions are recalculated on demand.
This commit improves the check to verify if certain instructions can be ignored
and recalculated on demand.
llvm-svn: 177313
We now show the all members of the alias set that may couse possible aliasing.
In case a alias set member is not a named instruction (unnamed instructions or
constant expressions), we show the expression itself.
This improves our error message
from:
Possible aliasing for value: .reg2mem
to:
Possible aliasing: ".reg2mem",
"[0 x double]* inttoptr (i64 47255179264 to [0 x double]*)
llvm-svn: 174329
We fix the following formatting problems found by clang-format:
- 80 cols violations
- Obvious problems with missing or too many spaces
- multiple new lines in a row
clang-format suggests many more changes, most of them falling in the following
two categories:
1) clang-format does not at all format a piece of code nicely
2) The style that clang-format suggests does not match the style used in
Polly/LLVM
I consider differences caused by reason 1) bugs, which should be fixed by
improving clang-format. Differences due to 2) need to be investigated closer
to understand the cause of the difference and the solution that should be taken.
llvm-svn: 171241
Recent changes in isl:
- Allow analysis of loops during code generation
This simplifies the detection of parallel loops.
- Simplify the way costumized ast printers are defined
This enables us to highlight parallel / vector loops in our debug output.
- Compile time improvements for codegen contexts that include parameters
- Various bug fixes
This update also gets us in sync for the isl 0.11 release.
llvm-svn: 169100
generation.
We don't use the exact same way to build loop body for GPGPU codegen as openmp
codegen and other transformations do currently, in which cases 'createLoop'
function is called recursively. GPGPU codegen may fail due to improper restore
of ValueMap and ClastVars .
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 168966