This fixes two crashes that appeared in case of:
- A load of a non vectorizable type (e.g. float**)
- An instruction that is not vectorizable (e.g. call)
llvm-svn: 154586
Grouped unrolling means that we unroll a loop such that the different instances
of a certain statement are scheduled right after each other, but we do
not generate any vector code. The idea here is that we can schedule the
bb vectorizer right afterwards and use it heuristics to decide when
vectorization should be performed.
llvm-svn: 154251
To avoid overflows we still use a larger type (i64) while calculating the value
of the old ivs. However, we truncate the result to the type of the old iv when
providing it to the new code.
A corresponding test case is added to the polly test suite. Also, a failing test
case is fixed.
This fixes PR12311.
Contributed by: Tsingray Liu <tsingrayliu@gmail.com>
llvm-svn: 153952
When deriving new values for the statements of a SCoP, we assumed that parameter
values are constant within the SCoP and consquently do not need to be rewritten.
For OpenMP code generation this assumption is wrong, as such values are not
available in the OpenMP subfunction and consequently also may need to be
rewritten.
Committed with some changes.
Contributed-By: Johannes Doerfert <s9jodoer@stud.uni-saarland.de>
llvm-svn: 153838
We create a new file LoopGenerators that provides utility classes for the
generation of OpenMP parallel and scalar loops. This means we move a lot
of the OpenMP generation out of the Polly specific code generator.
llvm-svn: 153325
This also adds support for modifiable write accesses (until now only read
accesses where supported). We currently do not derive an exact type for the
expression, but assume that i64 is good enough. This will be improved in future
patches.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 153319
This functionality is not available in LLVM trunk and breaks the compilation of
Polly. This patch fixes the compilation, but may not be enough to recover all
functionality.
llvm-svn: 153318
For boolean flags in Polly there is no problem if they are given more than once.
Hence, we can allow it to not fail for build systems that (acciently) add flags
several times.
This fixes: PR12278
Reported by: Sebastian Pop <sebpop@gmail.com>
llvm-svn: 152933
We currently do not support pointer types in affine expressions. Hence, we
disallow in the SCoP detection. Later we may decide to add support for them.
This fixes PR12277
Reported-By: Sebastian Pop <sebpop@gmail.com>
llvm-svn: 152928
This also fixes UMax where we did not correctly keep track of the parameters.
Fixes PR12275.
Reported-By: Sebastian Pop <sebpop@gmail.com>
llvm-svn: 152913
The FinalRead statement represented a virtual read that is executed after the
SCoP. It was used when we verified the correctness of a schedule by checking if
it yields the same FLOW dependences as the original code. This is only works, if
we have a final read that reads all memory at the end of the SCoP.
We now switched to just checking if a schedule does not introduce negative
dependences and also consider WAW WAR dependences. This restricts the schedules
a little bit more, but we do not have any optimizer that would calculate a more
complex schedule. Hence, for now final reads are obsolete.
llvm-svn: 152319
We now just check if the new scattering would create non-positive dependences.
This is a lot faster than recalculating dependences (which is especially slow
on tiled code).
llvm-svn: 152230
This change itself should not change functionality, but it will make it easier
to support use different dependence kinds in for validity and proximity
constraints.
llvm-svn: 150483
This allows us to enable -enable-iv-rewrite by default and releases LLVM from
the burdon to keep that feature. This is an intermediate step. We plan to soon
remove the need for rewritten induction variables entirely.
llvm-svn: 150481
When I first tried to commit this patch, the builder pointed after generation
of a loop still into the loop body. This means that code that was supposed to
be generated after the loop was generated right into the loop body. We fixed
this by pointing the builder to the BB after the loop, as soon as code
generation of the loop body itself is finished.
llvm-svn: 150480
Before this change we built the CFG such that it was only valid after code was
fully generated. During code generation itself, it was often incomplete. After
this change always maintain a valid CFG. This will later allow us to use the
SCEVExpander during code generation. This is the first step to get rid of the
independent blocks pass.
llvm-svn: 150339
Such a dead code elimination can remove redundant stores to arrays. It can also
eliminate calculations where the results are stored to memory but where they are
overwritten before ever being read. It may also fix bugs like:
http://llvm.org/bugs/show_bug.cgi?id=5117
This commit just adds a sceleton without any functionality.
If anybody is interested to learn about polyhedral optimizations this would be
a good task. Well definined, self contained and pretty simple. Ping me if you
want to start and you need some pointers to get going.
llvm-svn: 149386
This has shown better results for 2mm, 3mm and a couple of other benchmarks.
After this we show consistenly better results as PoCC with maxfuse. We need
to see if PoCC can also give better results with another fusion strategy.
llvm-svn: 149267
maximise_band_depth does not seem to have any effect for now, but it may help to
increase the amount of tileable loops. We expose the flag to be able to analyze
its effects when looking into individual benchmarks.
llvm-svn: 149266
This speeds up the scheduler by orders of magnitude and in addition yields often
to a better schedule.
With this we can compile all polybench kernels with less than 5x compile time
overhead. In general the overhead is even less than 2-3x. This is still with
running a lot of redundant passes and no compile time tuning at all. There are
several obvious areas where we can improve here further.
There are also two test cases where we cannot find a schedule any more (cholesky
and another). I will look into them later on.
With this we have a very solid base line from which we can start to optimize
further.
llvm-svn: 149263
In case we can not analyze an access function, we do not discard the SCoP, but
assume conservatively that all memory accesses that can be derived from our base
pointer may be accessed.
Patch provided by: Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 146972
To extract a preoptimized LLVM-IR file from a C-file run:
clang -Xclang -load -Xclang LLVMPolly.so -O0 -mllvm -polly file.c -S -emit-llvm
On the generated file you can directly run passes such as:
'opt -view-scops file.s'
llvm-svn: 146560
Previously the scheduler was splitting bands at the level at which it detected
that the splitting of the band is necessary. This may introduce an additional
level of bands, that can be avoided by backtracking and splitting on a higher
level. Additional splits reduce the number of loops that can be tiled, such that
avoiding splits and maximizing the band depth seems preferable.
As a first data point we looked at 2mm and 3mm from the polybench test suite.
For both maximizing the tilable bands results in a significant (5-10x)
performance improvement.
This patch enables the isl scheduler option to maximize the band depth.
llvm-svn: 146557
If larger coefficients appear as part of the input dependences, the schedule
calculation can take a very long time. We observed that the main overhead in
this calculation is due to optimizing the constant coefficients. They are
misused to increase locality by merging several unrelated dimensions into a
single dimension. This unwanted optimization increases the complexity of the
generated code and furthermore slows it down.
We use a new isl scheduler option to bound the values in the constant dimension
by a user defined value (20 in our case). If the right value is choosen, costly
overoptimization is prevented.
This solution works, but requires a specific (here almost randomly choosen)
value by which the constants are bound. For the moment, this is our best
solution, but we hope to to find a more generic one later on.
After these patch the extremly long compile time for simple kernels like 2mm or
3mm is reduced to a reasonable amount of time (Not more than a couple of seconds
even in debug mode).
llvm-svn: 146556
We forgot to include the unistd.h header file that defines the
functions mentioned above. This was not a problem with gnu C++ library,
however it did not work for libc++.
llvm-svn: 145790
We disable Polly by default and add a new option '-polly' that enables Polly.
This allows us to create an the alias
$ alias clang clang -Xclang -load -Xclang LLVMPolly.so
which loads Polly always into clang. It can now be enabled by running:
$ clang -O3 -mllvm -polly file.c
To enable it by default an alias pollycc can be create
$ alias pollycc clang -O3 -mllvm -polly
llvm-svn: 144917
- Use uppercase letters according to the LLVM coding style
- Rename functions to not include 'tiledSchedule', but just Schedule. This
is more correct as tiling might be disabled.
llvm-svn: 144899
The new isl_id support for parmeters created problems when importing new
access functions. Even though the parameters had the same names,
they were mapped to different ids and where therefore incompatible.
We copy the ids now from the old parameter dimensions. This fixes the
problem.
llvm-svn: 144642
Parameters can be complex SCEV expressions, but they can also be single scalar
values. If a parameters is such a simple scalar value and the value is named,
use this name to name the isl parameter dimensions.
llvm-svn: 144641
This does not work reliable and is probably not needed. I accidentally changed
this in this recent commit:
commit a0bcd63c6ffa81616cf8c6663a87588803f7d91c
Author: grosser <grosser@91177308-0d34-0410-b5e6-96231b3b80d8>
Date: Thu Nov 10 12:47:21 2011 +0000
ScopDetect: Use INVALID macro to fail in case of aliasing
This simplifies the code and also makes the error message available to the
graphviz scop viewer.
git-svn-id: https://llvm.org/svn/llvm-project/polly/trunk@144284
llvm-svn: 144286
address is part of the access function. Also remove unused special cases that
were necessery when the base address was still contained in the access function
llvm-svn: 144280
Previously we allowed in access functions only a single SCEVUnknown, which later
became the base address. We now use getPointerBase() to derive the base address
and all remaining unknowns are handled as parameters. This allows us to handle
cases like A[b+c];
llvm-svn: 144278
This check was necessary because of the use AffineSCEVIterator in TempScopInfo.
As we removed this use recently it is not necessary any more.
llvm-svn: 144228
Instead of using TempScop to find parameters, we detect them directly
on the SCEV. This allows us to remove the TempScop parameter detection
in a subsequent commit.
This fixes a bug reported by Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 144087
Previously we built a context that contained already all parameter dimensions
from the start. We now build a context without any parameter dimensions and
extend the context as needed. All parameter dimensions are added during final
realignment.
llvm-svn: 144085
We currently run the old memory access checker in parallel, as we would
otherwise fail in TempScop because of currently unsupported functions. We will
remove the old memory access checker as soon as TempScop is fixed.
llvm-svn: 143654
The SCEV Validator is used to check if the bound of a loop can be translated
into a polyhedral constraint. The new validator is more general as the check
used previously and e.g. allows bounds like 'smax 1, %a'. At the moment, we
only allow signed comparisons. Also, the new validator is only used to verify
loop bounds. Memory accesses are still handled by the old validator.
llvm-svn: 143576
These are remainders of the switch to the newer isl version. At the point of
switching I did not test with PoCC support. I should have done. ;-)
llvm-svn: 142777
- Use __isl_give and __isl_take
- Convert variables to start with Uppercase letter
- Only assign the 'domain' after it is fully constructed
- Only name it after it is fully constructed
llvm-svn: 141361
Also take the chance and rename access functions to access relations. This is
because we do not only allow plain functions to describe an access, but we
can have any access relation that can be described with linear constraints.
llvm-svn: 141257
Polly should now be compiled with CLooG 0c252c88946b27b7b61a1a8d8fd7f94d2461dbfd
and isl 56b7d238929980e62218525b4b3be121af386edf. The most convenient way to
update is utils/checkout_cloog.sh.
llvm-svn: 141251
It may happen that we generate the code of a basic block from the original
scop is code generated several times. The new naming scheme reduces confusing
that earlier appeared as the version numbers of the new basic blocks could
have been interpreted as part of the name of the original basic block.
llvm-svn: 139092
Due to the recent introduction of isl_id, parameters need now always to be
aligned. This was not yet taken care of in the code path of vectorization and
dependence analysis.
llvm-svn: 138555
Polly adds, after it is loaded into opt or clang, its passes to the default set
of -O3 passes. This means optimizing a program with clang and Polly becomes as
simple as executing.
clang -Xclang -load -Xclang lib/LLVMPolly.so -O3 program.c
The same should work for dragonegg powered gfortran, g++, ... or any other tool
that uses the PassManagerBuilder.
Warning: Even though using Polly became with this commit extremly easy, Polly
is still Pre-Alpha Quality. This means in most cases it will rather
destroy the world than doing anything positive. ;-)
llvm-svn: 138402
I am planning to eliminate the TempScopInfo pass. To simplify this I remove
some features that may later be added to the ScopInfo pass.
The interchange pass is currently strongly tested and furthermore ment to be
replaced by the general scheduling optimizer. Reductions itself can later
be added easily.
llvm-svn: 138219
Because of me not understanding the LLVM pass structure well, I did not find a
good way to allocate isl_ctx and to free it later without getting issues with
reference counting. I now found this place, such that we can free isl_ctx. This
patch also fixes the memory leaks that were ignored beforehand.
llvm-svn: 138204