We currently do not support pointer types in affine expressions. Hence, we
disallow in the SCoP detection. Later we may decide to add support for them.
This fixes PR12277
Reported-By: Sebastian Pop <sebpop@gmail.com>
llvm-svn: 152928
This also fixes UMax where we did not correctly keep track of the parameters.
Fixes PR12275.
Reported-By: Sebastian Pop <sebpop@gmail.com>
llvm-svn: 152913
The FinalRead statement represented a virtual read that is executed after the
SCoP. It was used when we verified the correctness of a schedule by checking if
it yields the same FLOW dependences as the original code. This is only works, if
we have a final read that reads all memory at the end of the SCoP.
We now switched to just checking if a schedule does not introduce negative
dependences and also consider WAW WAR dependences. This restricts the schedules
a little bit more, but we do not have any optimizer that would calculate a more
complex schedule. Hence, for now final reads are obsolete.
llvm-svn: 152319
We now just check if the new scattering would create non-positive dependences.
This is a lot faster than recalculating dependences (which is especially slow
on tiled code).
llvm-svn: 152230
This change itself should not change functionality, but it will make it easier
to support use different dependence kinds in for validity and proximity
constraints.
llvm-svn: 150483
This allows us to enable -enable-iv-rewrite by default and releases LLVM from
the burdon to keep that feature. This is an intermediate step. We plan to soon
remove the need for rewritten induction variables entirely.
llvm-svn: 150481
When I first tried to commit this patch, the builder pointed after generation
of a loop still into the loop body. This means that code that was supposed to
be generated after the loop was generated right into the loop body. We fixed
this by pointing the builder to the BB after the loop, as soon as code
generation of the loop body itself is finished.
llvm-svn: 150480
Before this change we built the CFG such that it was only valid after code was
fully generated. During code generation itself, it was often incomplete. After
this change always maintain a valid CFG. This will later allow us to use the
SCEVExpander during code generation. This is the first step to get rid of the
independent blocks pass.
llvm-svn: 150339
Such a dead code elimination can remove redundant stores to arrays. It can also
eliminate calculations where the results are stored to memory but where they are
overwritten before ever being read. It may also fix bugs like:
http://llvm.org/bugs/show_bug.cgi?id=5117
This commit just adds a sceleton without any functionality.
If anybody is interested to learn about polyhedral optimizations this would be
a good task. Well definined, self contained and pretty simple. Ping me if you
want to start and you need some pointers to get going.
llvm-svn: 149386
It is only needed for PoCC. We may update our openscop support which is
expected to be wider used. If this is the case we could automatically build
openscop.
llvm-svn: 149372
As we now have a scheduler that works, I do not believe a lot of people need
PoCC right ahead. People who want to do an in depth investigation of the
different schedulers can install it as well manually.
llvm-svn: 149370
This has shown better results for 2mm, 3mm and a couple of other benchmarks.
After this we show consistenly better results as PoCC with maxfuse. We need
to see if PoCC can also give better results with another fusion strategy.
llvm-svn: 149267
maximise_band_depth does not seem to have any effect for now, but it may help to
increase the amount of tileable loops. We expose the flag to be able to analyze
its effects when looking into individual benchmarks.
llvm-svn: 149266
This speeds up the scheduler by orders of magnitude and in addition yields often
to a better schedule.
With this we can compile all polybench kernels with less than 5x compile time
overhead. In general the overhead is even less than 2-3x. This is still with
running a lot of redundant passes and no compile time tuning at all. There are
several obvious areas where we can improve here further.
There are also two test cases where we cannot find a schedule any more (cholesky
and another). I will look into them later on.
With this we have a very solid base line from which we can start to optimize
further.
llvm-svn: 149263
In case we can not analyze an access function, we do not discard the SCoP, but
assume conservatively that all memory accesses that can be derived from our base
pointer may be accessed.
Patch provided by: Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 146972
To extract a preoptimized LLVM-IR file from a C-file run:
clang -Xclang -load -Xclang LLVMPolly.so -O0 -mllvm -polly file.c -S -emit-llvm
On the generated file you can directly run passes such as:
'opt -view-scops file.s'
llvm-svn: 146560
Previously the scheduler was splitting bands at the level at which it detected
that the splitting of the band is necessary. This may introduce an additional
level of bands, that can be avoided by backtracking and splitting on a higher
level. Additional splits reduce the number of loops that can be tiled, such that
avoiding splits and maximizing the band depth seems preferable.
As a first data point we looked at 2mm and 3mm from the polybench test suite.
For both maximizing the tilable bands results in a significant (5-10x)
performance improvement.
This patch enables the isl scheduler option to maximize the band depth.
llvm-svn: 146557
If larger coefficients appear as part of the input dependences, the schedule
calculation can take a very long time. We observed that the main overhead in
this calculation is due to optimizing the constant coefficients. They are
misused to increase locality by merging several unrelated dimensions into a
single dimension. This unwanted optimization increases the complexity of the
generated code and furthermore slows it down.
We use a new isl scheduler option to bound the values in the constant dimension
by a user defined value (20 in our case). If the right value is choosen, costly
overoptimization is prevented.
This solution works, but requires a specific (here almost randomly choosen)
value by which the constants are bound. For the moment, this is our best
solution, but we hope to to find a more generic one later on.
After these patch the extremly long compile time for simple kernels like 2mm or
3mm is reduced to a reasonable amount of time (Not more than a couple of seconds
even in debug mode).
llvm-svn: 146556