If the flags '-polly-report -g' are given, we print file name and line numbers
for the beginning and end of all detected scops.
linear-algebra/kernels/gemm/gemm.c:23: Scop start
linear-algebra/kernels/gemm/gemm.c:42: Scop end
linear-algebra/kernels/gemm/gemm.c:77: Scop start
linear-algebra/kernels/gemm/gemm.c:82: Scop end
llvm-svn: 167235
In addition to the arrays and clast variables a SCoP statement may also refer to
values defined before the SCoP or to function arguments. Detect these values and
add them to the set of values passed to the function generated for OpenMP
parallel execution of a clast.
Committed with additional test cases and some refactoring.
Contributed by: Armin Groesslinger <armin.groesslinger@uni-passau.de>
llvm-svn: 167214
This change ensures that isl is only detected if it includes code generation
support. This allows us to remove a lot of conditional compilation and also
avoids missing test cases in case the feature is not available.
llvm-svn: 166403
Previously isl always generated '<=' or '>='. However, in many cases '<' or '>'
leads to simpler code. This commit updates isl and adds the relevant code
generation support to Polly.
llvm-svn: 166020
This pass implements a new code generator that uses the code generation
algorithm included in isl.
For the moment the new code generation is limited to sequential code.
llvm-svn: 165037
This includes:
- The isl_id of the domain of the scattering must be copied from the original
domain
- Remove outdated references to a 'FinalRead' statement
- Print of the Pocc output, if -debug is provided.
- Add line breaks to some error messages.
Reported and Debugged by: Dustin Feld <d3.feld@gmail.com>
llvm-svn: 162901
Translate the selected parallel loop body into a ptx string and run it with the
cuda driver API. We limit this preliminary implementation to target the
following special test cases:
- Support only 2-dimensional parallel loops with or without only one innermost
non-parallel loop.
- Support write memory access to only one array in a SCoP.
The patch was committed with smaller changes to the build system:
There is now a flag to enable gpu code generation explictly. This was required
as we need the llvm.codegen() patch applied on the llvm sources, to compile this
feature correctly. Also, enabling gpu code generation does not require cuda.
This requirement was removed to allow 'make polly-test' runs, even without an
installed cuda runtime.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 161239
This fixes a conflict between polly::createIndVarSimplifyPass() and
llvm::createIndVarSimplifyPass(), which causes problems on windows.
Reported by: Michael Kruse <MichaelKruse@meinersbur.de
llvm-svn: 161235
I did not take into account, that this patch fails to compile without the
llvm.codegen patch applied. This breaks buildbots.
I revert this until we found a solution to commit this without buildbots
complaining.
This reverts commit cb43ab80e94434e780a66be3b9a6ad466822fe33.
llvm-svn: 160165
Translate the selected parallel loop body into a ptx string and run it
with cuda driver API. We limit this preliminary implementation to
target the following special test cases:
- Support only 2-dimensional parallel loops with or without only one
innermost non-parallel loop.
- Support write memory access to only one array in a SCoP.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 160164
Derive the maximal and minimal values of a parameter from the type it has. Add
this information to the scop context. This information is needed, to derive
optimal types during code generation.
llvm-svn: 157245
This is an incomplete implementation of the SCEV based code generation.
When finished it will remove the need for -indvars -enable-iv-rewrite.
For the moment it is still disabled. Even though it passes 'make polly-test',
there are still loose ends especially in respect of OpenMP code generation.
llvm-svn: 155717
We create a new file LoopGenerators that provides utility classes for the
generation of OpenMP parallel and scalar loops. This means we move a lot
of the OpenMP generation out of the Polly specific code generator.
llvm-svn: 153325
The FinalRead statement represented a virtual read that is executed after the
SCoP. It was used when we verified the correctness of a schedule by checking if
it yields the same FLOW dependences as the original code. This is only works, if
we have a final read that reads all memory at the end of the SCoP.
We now switched to just checking if a schedule does not introduce negative
dependences and also consider WAW WAR dependences. This restricts the schedules
a little bit more, but we do not have any optimizer that would calculate a more
complex schedule. Hence, for now final reads are obsolete.
llvm-svn: 152319
We now just check if the new scattering would create non-positive dependences.
This is a lot faster than recalculating dependences (which is especially slow
on tiled code).
llvm-svn: 152230
This allows us to enable -enable-iv-rewrite by default and releases LLVM from
the burdon to keep that feature. This is an intermediate step. We plan to soon
remove the need for rewritten induction variables entirely.
llvm-svn: 150481
Such a dead code elimination can remove redundant stores to arrays. It can also
eliminate calculations where the results are stored to memory but where they are
overwritten before ever being read. It may also fix bugs like:
http://llvm.org/bugs/show_bug.cgi?id=5117
This commit just adds a sceleton without any functionality.
If anybody is interested to learn about polyhedral optimizations this would be
a good task. Well definined, self contained and pretty simple. Ping me if you
want to start and you need some pointers to get going.
llvm-svn: 149386