Statements with an empty iteration domain may not have a schedule assigned by
the isl schedule optimizer. As Polly expects each statement to have a schedule,
we keep the old schedule for such statements.
This fixes http://llvm.org/PR15645`
Reported-by: Johannes Doerfert <johannesdoerfert@gmx.de>
llvm-svn: 179233
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.
We currently have some not very nice formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweights
the not perfect layout of this code.
llvm-svn: 177796
The bug was within isl. To fix it, we simply update the isl version that
is used by Polly. We still have some changes within Polly to be able to
write a proper test case.
Reported-by: Sameer Sahasrabuddhe <Sameer.Sahasrabuddhe@amd.com>
llvm-svn: 166021
The FinalRead statement represented a virtual read that is executed after the
SCoP. It was used when we verified the correctness of a schedule by checking if
it yields the same FLOW dependences as the original code. This is only works, if
we have a final read that reads all memory at the end of the SCoP.
We now switched to just checking if a schedule does not introduce negative
dependences and also consider WAW WAR dependences. This restricts the schedules
a little bit more, but we do not have any optimizer that would calculate a more
complex schedule. Hence, for now final reads are obsolete.
llvm-svn: 152319
This change itself should not change functionality, but it will make it easier
to support use different dependence kinds in for validity and proximity
constraints.
llvm-svn: 150483
This has shown better results for 2mm, 3mm and a couple of other benchmarks.
After this we show consistenly better results as PoCC with maxfuse. We need
to see if PoCC can also give better results with another fusion strategy.
llvm-svn: 149267
maximise_band_depth does not seem to have any effect for now, but it may help to
increase the amount of tileable loops. We expose the flag to be able to analyze
its effects when looking into individual benchmarks.
llvm-svn: 149266
This speeds up the scheduler by orders of magnitude and in addition yields often
to a better schedule.
With this we can compile all polybench kernels with less than 5x compile time
overhead. In general the overhead is even less than 2-3x. This is still with
running a lot of redundant passes and no compile time tuning at all. There are
several obvious areas where we can improve here further.
There are also two test cases where we cannot find a schedule any more (cholesky
and another). I will look into them later on.
With this we have a very solid base line from which we can start to optimize
further.
llvm-svn: 149263
Previously the scheduler was splitting bands at the level at which it detected
that the splitting of the band is necessary. This may introduce an additional
level of bands, that can be avoided by backtracking and splitting on a higher
level. Additional splits reduce the number of loops that can be tiled, such that
avoiding splits and maximizing the band depth seems preferable.
As a first data point we looked at 2mm and 3mm from the polybench test suite.
For both maximizing the tilable bands results in a significant (5-10x)
performance improvement.
This patch enables the isl scheduler option to maximize the band depth.
llvm-svn: 146557
If larger coefficients appear as part of the input dependences, the schedule
calculation can take a very long time. We observed that the main overhead in
this calculation is due to optimizing the constant coefficients. They are
misused to increase locality by merging several unrelated dimensions into a
single dimension. This unwanted optimization increases the complexity of the
generated code and furthermore slows it down.
We use a new isl scheduler option to bound the values in the constant dimension
by a user defined value (20 in our case). If the right value is choosen, costly
overoptimization is prevented.
This solution works, but requires a specific (here almost randomly choosen)
value by which the constants are bound. For the moment, this is our best
solution, but we hope to to find a more generic one later on.
After these patch the extremly long compile time for simple kernels like 2mm or
3mm is reduced to a reasonable amount of time (Not more than a couple of seconds
even in debug mode).
llvm-svn: 146556
- Use uppercase letters according to the LLVM coding style
- Rename functions to not include 'tiledSchedule', but just Schedule. This
is more correct as tiling might be disabled.
llvm-svn: 144899
Polly should now be compiled with CLooG 0c252c88946b27b7b61a1a8d8fd7f94d2461dbfd
and isl 56b7d238929980e62218525b4b3be121af386edf. The most convenient way to
update is utils/checkout_cloog.sh.
llvm-svn: 141251
Due to the recent introduction of isl_id, parameters need now always to be
aligned. This was not yet taken care of in the code path of vectorization and
dependence analysis.
llvm-svn: 138555