Scoplib only supports access functions, but not the more generic
access relations. This commit now also supports access functions
that where not directly expresses as A[sub] with sub = i + 5b,
but with A[sub] with -sub = -i + (-5b).
Test case to come.
Contributed by: Dustin Feld <d3.feld@gmail.com>
llvm-svn: 165379
This pass implements a new code generator that uses the code generation
algorithm included in isl.
For the moment the new code generation is limited to sequential code.
llvm-svn: 165037
Older versions of libpluto crashed, if no schedule was found. Recent
versions return NULL. We detect this and keep the original schedule.
llvm-svn: 164376
At the moment we can handle such arrays only by conservatively assuming that
each access to such an array may touch any element in the array. It would be
great if we could improve Polly/LLVM at some point, such that we can
recover the multi-dimensionality of the accesses.
llvm-svn: 163619
This ensures that the isl sets/maps we operate on have the same parameter
dimensions. Operations on objects with different parameter dimensions are not
allow and trigger assertions.
llvm-svn: 163618
The IndVarSimplify pass in Polly uses the intrinsics header. We need to ensure
that the header is generated, before we use it. This patch fixes the problem
for the cmake build (it did not show up in the autoconf one).
Contributed by: Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com>
llvm-svn: 163130
This includes:
- The isl_id of the domain of the scattering must be copied from the original
domain
- Remove outdated references to a 'FinalRead' statement
- Print of the Pocc output, if -debug is provided.
- Add line breaks to some error messages.
Reported and Debugged by: Dustin Feld <d3.feld@gmail.com>
llvm-svn: 162901
Before we defined GPGPU_CODEGEN to '0', which does not disable the relevant code
as we just check if that value is defined at all. We now follow the cmake
approach and only define GPGPU_CODEGEN, if the feature should be enabled.
Reported by: Sebastian Pop <spop@codeaurora.org>
llvm-svn: 162275
Added a file that explains how to load Polly in dragonegg.
Also fixed a typo in the document for clang.
Committed with a typo fix and a change to make this website available from
the documentation section.
Contributed by: Sameer Sahasrabuddhe <Sameer.Sahasrabuddhe@amd.com>
llvm-svn: 161928
Translate the selected parallel loop body into a ptx string and run it with the
cuda driver API. We limit this preliminary implementation to target the
following special test cases:
- Support only 2-dimensional parallel loops with or without only one innermost
non-parallel loop.
- Support write memory access to only one array in a SCoP.
The patch was committed with smaller changes to the build system:
There is now a flag to enable gpu code generation explictly. This was required
as we need the llvm.codegen() patch applied on the llvm sources, to compile this
feature correctly. Also, enabling gpu code generation does not require cuda.
This requirement was removed to allow 'make polly-test' runs, even without an
installed cuda runtime.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 161239
This fixes a conflict between polly::createIndVarSimplifyPass() and
llvm::createIndVarSimplifyPass(), which causes problems on windows.
Reported by: Michael Kruse <MichaelKruse@meinersbur.de
llvm-svn: 161235
The Apple linker fails by default, if some function calls can not be resolved at
link time. However, all functions that are part of LLVM itself will not be
linked into Polly, but will be provided by the compiler that Polly is loaded
into. Hence, during linking we need to ignore failures due to unresolved
function calls.
llvm-svn: 161234
Otherwise the script spams the home directory and, in case there are folders
of previous attempts lying around, it may fail in some unexpected way.
llvm-svn: 160677
Cast instruction do not have side effects and can consequently be part of a
scop. We special cased them earlier, as they may be problematic within array
subscripts or loop bounds. However, the scalar evolution validator already
checks for them such that there is no need to also check the instructions within
the basic blocks. Checking them is actually overly conservative as the precence
of casts may invalidate a scop, even though scalar evolution is not influenced
by it.
llvm-svn: 160261
I did not take into account, that this patch fails to compile without the
llvm.codegen patch applied. This breaks buildbots.
I revert this until we found a solution to commit this without buildbots
complaining.
This reverts commit cb43ab80e94434e780a66be3b9a6ad466822fe33.
llvm-svn: 160165
Translate the selected parallel loop body into a ptx string and run it
with cuda driver API. We limit this preliminary implementation to
target the following special test cases:
- Support only 2-dimensional parallel loops with or without only one
innermost non-parallel loop.
- Support write memory access to only one array in a SCoP.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 160164