This fixes a crash that appeared when generating dotty graphs for functions
without loops (for which we do not calculate polyhedral information).
llvm-svn: 198364
We now report the following:
$ polly-clang -O3 -mllvm -polly -mllvm -polly-report test.c -c \
-gline-tables-only
note: Polly detected an optimizable loop region (scop) in function 'foo'
test.c:2: Start of scop
test.c:3: End of scop
note: Polly detected an optimizable loop region (scop) in function 'bar'
test.c:9: Start of scop
test.c:13: End of scop
llvm-svn: 197558
When constructing a scop sometimes the exact representation of a statement or
condition would be very complex, but there is a common case which is a lot
simpler, but which is only valid under certain assumptions. The assumed context
records the assumptions taken during the construction of this scop and that need
to be code generated as a run-time test.
At the moment, we do not yet model any assumptions, but only added the
AssumedContext as well as the isl-ast generation support. As a next step,
this needs to be hooked up with the isl code generation.
if (1) /* run-time condition */
{ /* optimized code */ }
else
{ /* original code */ }
llvm-svn: 193652
Instead of defining the relevant functions inline, we now just keep the
declarations in the class itself. This makes the class declaration a lot
easier to read as all functions can be seen at once. We also use this
opportunity to privatize all functions not used in the public interface of the
class.
llvm-svn: 190841
Use 0 >= 1 instead of 0 != 0 to represent 'false'. This might be slightly more
efficient as isl may create a union of sets for 0 != 0, whereas this is never
needed for the expression 0 >= 1.
Contributed-by: Alexandre Isoard <alexandre.isoard@gmail.com>
llvm-svn: 190384
SCoP invariant parameters with the different start value would deter parameter
sharing. For example, when compiling the following C code:
void foo(float *input) {
for (long j = 0; j < 8; j++) {
// SCoP begin
for (long i = 0; i < 8; i++) {
float x = input[j * 64 + i + 1];
input[j * 64 + i] = x * x;
}
}
}
Polly would creat two parameters for these memory accesses:
p_0: {0,+,256}
p_2: {4,+,256}
[j * 64 + i + 1] => MemRef_input[o0] : 4o0 = p_1 + 4i0
[j * 64 + i] => MemRef_input[o0] : 4o0 = p_0 + 4i0
These parameters only differ from start value. To enable parameter sharing,
we split the start value from SCEVAddRecExpr, so they would share a single
parameter that always has zero start value:
p0: {0,+,256}<%for.cond1.preheader>
[j * 64 + i + 1] => MemRef_input[o0] : 4o0 = 4 + p_1 + 4i0
[j * 64 + i] => MemRef_input[o0] : 4o0 = p_0 + 4i0
Such translation can make the polly-dependence much faster.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 187728
String operations resulted by raw_string_ostream in the INVALID macro can lead
to significant compile-time overhead when compiling large size source code.
This is because raw_string_ostream relies on TypeFinder class, whose
compile-time cost increases as the size of the module increases. This patch
targets to ensure that it only track detection failures if actually needed.
In this way, we can avoid expensive string operations in normal execution.
With this patch file, the relative compile-time cost of Polly-detect pass does
not increase even when compiling very large size source code.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 187102
Ensure that the scalar write access corresponds to the result of a load
instruction appears after the generic read access corresponds to the load
instruction.
llvm-svn: 186419
isl recently introduced isl_val as an abstract interface to represent arbitrary
precision numbers. This interface superseeds the old isl_int interface. In
contrast to the old interface which implemented arbitrary precision arithmetic
using macros that forward to the gmp library, the new library hides the math
library implementation in isl. This allows us to switch the math library used by
isl without affecting users such as Polly.
llvm-svn: 184529
When a region header is part of a loop, then all entering edges of this region
should not come from the loop but outside the region. Otherwise, the loop may be
only partially part of the region, which would cause troubles in handling
induction variables.
Currently, we can only model induction variables that are either fully part of
the scop (loop induction variable) or induction variables that are scop-
invariant (parameter). A loop that is only partially part of the
scop causes troubles, as there is no good way to handle the induction
variable in the independent blocks pass.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 183800
Use the new cl::OptionCategory support to move the Polly options into a separate
option category. The aim is to hide most options and show by default only the
options a user needs to influence '-O3 -polly'. The available options probably
need some care, but here is the current status:
Polly Options:
Configure the polly loop optimizer
-enable-polly-openmp - Generate OpenMP parallel code
-polly - Enable the polly optimizer (only at -O3)
-polly-no-tiling - Disable tiling in the scheduler
-polly-only-func=<function-name> - Only run on a single function
-polly-report - Print information about the activities
of Polly
-polly-vectorizer - Select the vectorization strategy
=none - No Vectorization
=polly - Polly internal vectorizer
=unroll-only - Only grouped unroll the vectorize
candidate loops
=bb - The Basic Block vectorizer driven by
Polly
llvm-svn: 181295
Regions that have multiple entry edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
for_region
This for_region contains two entry edges 'then' -> 'for_region' and 'else' -> 'for_region'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively when the region is
in -loop-simplify form, which means the entry block should not be a loop header.
Contributed by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179586
Regions that have multiple exit edges are very common. A simple if condition
yields e.g. such a region:
if
/ \
then else
\ /
after
Region: if -> after
This regions contains the bbs 'if', 'then', 'else', but not 'after'. It has
two exit edges 'then' -> 'after' and 'else' -> 'after'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179159
Fix inspired from c2d4a0627e95c34a819b9d4ffb4db62daa78dade.
Given the following code
for (i = 0; i < 10; i++) {
;
}
S: A[i] = 0
When translate the data reference A[i] in statement S using scev, we need to
retrieve the scev of 'i' at the location of 'S'. If we do not do this the
scev that we obtain will be expressed as {0,+,1}_for and will reference loop
iterators that do not surround 'S'. What we really want is the scev to be
instantiated to the value of 'i' after the loop. This value is {10}.
This used to crash in:
int loopDimension = getLoopDepth(Expr->getLoop());
isl_aff *LAff = isl_aff_set_coefficient_si(
isl_aff_zero_on_domain(LocalSpace), isl_dim_in, loopDimension, 1);
(gdb) p Expr->dump()
{8,+,8}<nw><%do.body>
(gdb) p getLoopDepth(Expr->getLoop())
$5 = 0
isl_space *Space = isl_space_set_alloc(Ctx, 0, NbLoopSpaces);
isl_local_space *LocalSpace = isl_local_space_from_space(Space);
As we are trying to create a memory access in a stmt that is outside all loops,
LocalSpace has 0 dimensions:
(gdb) p NbLoopSpaces
$12 = 0
(gdb) p Statement.BB->dump()
if.then: ; preds = %do.end
%0 = load float* %add.ptr, align 4
store float %0, float* %q.1.reg2mem, align 4
br label %if.end.single_exit
and so the scev for %add.ptr should be taken at the place where it is used,
i.e., it should be the value on the last iteration of the do.body loop, and not
"{8,+,8}<nw><%do.body>".
llvm-svn: 179148
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.
We currently have some not very nice formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweights
the not perfect layout of this code.
llvm-svn: 177796
We now detect scops without a canonical induction variable and can generate a
polyhedral representation for them. There was no modification necessary to
code generate these scops.
llvm-svn: 177643
When using the scev based code generation, we now do not rely on the presence
of a canonical induction variable any more. This commit prepares the path to
(conditionally) disable the induction variable canonicalization pass.
llvm-svn: 177548
We now show the all members of the alias set that may couse possible aliasing.
In case a alias set member is not a named instruction (unnamed instructions or
constant expressions), we show the expression itself.
This improves our error message
from:
Possible aliasing for value: .reg2mem
to:
Possible aliasing: ".reg2mem",
"[0 x double]* inttoptr (i64 47255179264 to [0 x double]*)
llvm-svn: 174329
We fix the following formatting problems found by clang-format:
- 80 cols violations
- Obvious problems with missing or too many spaces
- multiple new lines in a row
clang-format suggests many more changes, most of them falling in the following
two categories:
1) clang-format does not at all format a piece of code nicely
2) The style that clang-format suggests does not match the style used in
Polly/LLVM
I consider differences caused by reason 1) bugs, which should be fixed by
improving clang-format. Differences due to 2) need to be investigated closer
to understand the cause of the difference and the solution that should be taken.
llvm-svn: 171241
Instead of calculating exact value (flow) dependences, it is also possible to
calculate memory based dependences. Sometimes memory based dependences are a lot
easier to calculate. To evaluate the benefits, we add an option to calculate
memory based dependences (use -polly-value-dependences=false).
llvm-svn: 167251
If the flags '-polly-report -g' are given, we print file name and line numbers
for the beginning and end of all detected scops.
linear-algebra/kernels/gemm/gemm.c:23: Scop start
linear-algebra/kernels/gemm/gemm.c:42: Scop end
linear-algebra/kernels/gemm/gemm.c:77: Scop start
linear-algebra/kernels/gemm/gemm.c:82: Scop end
llvm-svn: 167235
This ensures that the isl sets/maps we operate on have the same parameter
dimensions. Operations on objects with different parameter dimensions are not
allow and trigger assertions.
llvm-svn: 163618
This includes:
- The isl_id of the domain of the scattering must be copied from the original
domain
- Remove outdated references to a 'FinalRead' statement
- Print of the Pocc output, if -debug is provided.
- Add line breaks to some error messages.
Reported and Debugged by: Dustin Feld <d3.feld@gmail.com>
llvm-svn: 162901
Cast instruction do not have side effects and can consequently be part of a
scop. We special cased them earlier, as they may be problematic within array
subscripts or loop bounds. However, the scalar evolution validator already
checks for them such that there is no need to also check the instructions within
the basic blocks. Checking them is actually overly conservative as the precence
of casts may invalidate a scop, even though scalar evolution is not influenced
by it.
llvm-svn: 160261
Store a pointer to each ScopStmt in the isl_id associated with the space of its
domain. This will later allow us to recover the statement during code
generation with isl.
llvm-svn: 157607
Derive the maximal and minimal values of a parameter from the type it has. Add
this information to the scop context. This information is needed, to derive
optimal types during code generation.
llvm-svn: 157245
There is no need for special code to handle SCEVUnknowns. SCEVUnkowns are always
parameters and will be handled by the generic parameter handling code in
visit().
llvm-svn: 157243
The FinalRead statement represented a virtual read that is executed after the
SCoP. It was used when we verified the correctness of a schedule by checking if
it yields the same FLOW dependences as the original code. This is only works, if
we have a final read that reads all memory at the end of the SCoP.
We now switched to just checking if a schedule does not introduce negative
dependences and also consider WAW WAR dependences. This restricts the schedules
a little bit more, but we do not have any optimizer that would calculate a more
complex schedule. Hence, for now final reads are obsolete.
llvm-svn: 152319
We now just check if the new scattering would create non-positive dependences.
This is a lot faster than recalculating dependences (which is especially slow
on tiled code).
llvm-svn: 152230
In case we can not analyze an access function, we do not discard the SCoP, but
assume conservatively that all memory accesses that can be derived from our base
pointer may be accessed.
Patch provided by: Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 146972
Parameters can be complex SCEV expressions, but they can also be single scalar
values. If a parameters is such a simple scalar value and the value is named,
use this name to name the isl parameter dimensions.
llvm-svn: 144641
This does not work reliable and is probably not needed. I accidentally changed
this in this recent commit:
commit a0bcd63c6ffa81616cf8c6663a87588803f7d91c
Author: grosser <grosser@91177308-0d34-0410-b5e6-96231b3b80d8>
Date: Thu Nov 10 12:47:21 2011 +0000
ScopDetect: Use INVALID macro to fail in case of aliasing
This simplifies the code and also makes the error message available to the
graphviz scop viewer.
git-svn-id: https://llvm.org/svn/llvm-project/polly/trunk@144284
llvm-svn: 144286
address is part of the access function. Also remove unused special cases that
were necessery when the base address was still contained in the access function
llvm-svn: 144280
Previously we allowed in access functions only a single SCEVUnknown, which later
became the base address. We now use getPointerBase() to derive the base address
and all remaining unknowns are handled as parameters. This allows us to handle
cases like A[b+c];
llvm-svn: 144278
This check was necessary because of the use AffineSCEVIterator in TempScopInfo.
As we removed this use recently it is not necessary any more.
llvm-svn: 144228
Instead of using TempScop to find parameters, we detect them directly
on the SCEV. This allows us to remove the TempScop parameter detection
in a subsequent commit.
This fixes a bug reported by Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 144087
Previously we built a context that contained already all parameter dimensions
from the start. We now build a context without any parameter dimensions and
extend the context as needed. All parameter dimensions are added during final
realignment.
llvm-svn: 144085
We currently run the old memory access checker in parallel, as we would
otherwise fail in TempScop because of currently unsupported functions. We will
remove the old memory access checker as soon as TempScop is fixed.
llvm-svn: 143654
The SCEV Validator is used to check if the bound of a loop can be translated
into a polyhedral constraint. The new validator is more general as the check
used previously and e.g. allows bounds like 'smax 1, %a'. At the moment, we
only allow signed comparisons. Also, the new validator is only used to verify
loop bounds. Memory accesses are still handled by the old validator.
llvm-svn: 143576
- Use __isl_give and __isl_take
- Convert variables to start with Uppercase letter
- Only assign the 'domain' after it is fully constructed
- Only name it after it is fully constructed
llvm-svn: 141361
Also take the chance and rename access functions to access relations. This is
because we do not only allow plain functions to describe an access, but we
can have any access relation that can be described with linear constraints.
llvm-svn: 141257
Polly should now be compiled with CLooG 0c252c88946b27b7b61a1a8d8fd7f94d2461dbfd
and isl 56b7d238929980e62218525b4b3be121af386edf. The most convenient way to
update is utils/checkout_cloog.sh.
llvm-svn: 141251
Due to the recent introduction of isl_id, parameters need now always to be
aligned. This was not yet taken care of in the code path of vectorization and
dependence analysis.
llvm-svn: 138555
I am planning to eliminate the TempScopInfo pass. To simplify this I remove
some features that may later be added to the ScopInfo pass.
The interchange pass is currently strongly tested and furthermore ment to be
replaced by the general scheduling optimizer. Reductions itself can later
be added easily.
llvm-svn: 138219
Because of me not understanding the LLVM pass structure well, I did not find a
good way to allocate isl_ctx and to free it later without getting issues with
reference counting. I now found this place, such that we can free isl_ctx. This
patch also fixes the memory leaks that were ignored beforehand.
llvm-svn: 138204
Until today, we compared two affine expressions by defining two maps describing
them, creating an union of those maps, adding constraints that do the comparison
and projecting out unneeded dimensions.
This was simplified to using the isl_pw_aff representation of the affine
expressions and using the relevant isl functions to compare them.
llvm-svn: 137932
At the moment, we still remove the ids after all data structures are created,
as later passes do not yet support ids. This limitation will be removed later.
llvm-svn: 137931
Do not use AffFunc to derive the affine expressions, but use isl_pw_aff to
analyze the original SCEV directly. This will allow several simplifications in
follow up patches, with the final goal of removing AffFunc completely.
llvm-svn: 137930
Instead of returning a pointer to the domain, we return a new copy of it. This
is safer, as we do not give access to internal objects. It is also not
expensive, as isl will just increment a reference counter.
llvm-svn: 131010