Such a dead code elimination can remove redundant stores to arrays. It can also
eliminate calculations where the results are stored to memory but where they are
overwritten before ever being read. It may also fix bugs like:
http://llvm.org/bugs/show_bug.cgi?id=5117
This commit just adds a sceleton without any functionality.
If anybody is interested to learn about polyhedral optimizations this would be
a good task. Well definined, self contained and pretty simple. Ping me if you
want to start and you need some pointers to get going.
llvm-svn: 149386
It is only needed for PoCC. We may update our openscop support which is
expected to be wider used. If this is the case we could automatically build
openscop.
llvm-svn: 149372
As we now have a scheduler that works, I do not believe a lot of people need
PoCC right ahead. People who want to do an in depth investigation of the
different schedulers can install it as well manually.
llvm-svn: 149370
This has shown better results for 2mm, 3mm and a couple of other benchmarks.
After this we show consistenly better results as PoCC with maxfuse. We need
to see if PoCC can also give better results with another fusion strategy.
llvm-svn: 149267
maximise_band_depth does not seem to have any effect for now, but it may help to
increase the amount of tileable loops. We expose the flag to be able to analyze
its effects when looking into individual benchmarks.
llvm-svn: 149266
This speeds up the scheduler by orders of magnitude and in addition yields often
to a better schedule.
With this we can compile all polybench kernels with less than 5x compile time
overhead. In general the overhead is even less than 2-3x. This is still with
running a lot of redundant passes and no compile time tuning at all. There are
several obvious areas where we can improve here further.
There are also two test cases where we cannot find a schedule any more (cholesky
and another). I will look into them later on.
With this we have a very solid base line from which we can start to optimize
further.
llvm-svn: 149263
In case we can not analyze an access function, we do not discard the SCoP, but
assume conservatively that all memory accesses that can be derived from our base
pointer may be accessed.
Patch provided by: Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 146972
To extract a preoptimized LLVM-IR file from a C-file run:
clang -Xclang -load -Xclang LLVMPolly.so -O0 -mllvm -polly file.c -S -emit-llvm
On the generated file you can directly run passes such as:
'opt -view-scops file.s'
llvm-svn: 146560
Previously the scheduler was splitting bands at the level at which it detected
that the splitting of the band is necessary. This may introduce an additional
level of bands, that can be avoided by backtracking and splitting on a higher
level. Additional splits reduce the number of loops that can be tiled, such that
avoiding splits and maximizing the band depth seems preferable.
As a first data point we looked at 2mm and 3mm from the polybench test suite.
For both maximizing the tilable bands results in a significant (5-10x)
performance improvement.
This patch enables the isl scheduler option to maximize the band depth.
llvm-svn: 146557
If larger coefficients appear as part of the input dependences, the schedule
calculation can take a very long time. We observed that the main overhead in
this calculation is due to optimizing the constant coefficients. They are
misused to increase locality by merging several unrelated dimensions into a
single dimension. This unwanted optimization increases the complexity of the
generated code and furthermore slows it down.
We use a new isl scheduler option to bound the values in the constant dimension
by a user defined value (20 in our case). If the right value is choosen, costly
overoptimization is prevented.
This solution works, but requires a specific (here almost randomly choosen)
value by which the constants are bound. For the moment, this is our best
solution, but we hope to to find a more generic one later on.
After these patch the extremly long compile time for simple kernels like 2mm or
3mm is reduced to a reasonable amount of time (Not more than a couple of seconds
even in debug mode).
llvm-svn: 146556