6217e18a7d
Translate the selected parallel loop body into a PTX string and run it with the CUDA driver API. We limit this preliminary implementation to target the following special test cases: - Support only 2-dimensional parallel loops with or without only one innermost non-parallel loop. - Support write memory access to only one array in a SCoP. The patch was committed with smaller changes to the build system: There is now a flag to enable GPU code generation explicitly. This was required as we need the llvm.codegen() patch applied on the llvm sources to compile this feature correctly. Also, enabling GPU code generation does not require CUDA. This requirement was removed to allow 'make polly-test' runs, even without an installed CUDA runtime. Contributed by: Yabin Hu <yabin.hwu@gmail.com> llvm-svn: 161239 |
||
---|---|---|
autoconf | ||
cmake | ||
docs | ||
include | ||
lib | ||
test | ||
tools | ||
utils | ||
www | ||
CMakeLists.txt | ||
CREDITS.txt | ||
LICENSE.txt | ||
Makefile | ||
Makefile.common.in | ||
Makefile.config.in | ||
README | ||
configure |
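The commit message above restricts GPU code generation to 2-dimensional parallel loops with at most one innermost non-parallel loop and writes to a single array. As a rough illustration only, the following C sketch shows one loop nest that appears to fit that description as stated; the source does not confirm that this particular kernel is handled. All names and sizes (M, A, B, C, init_matrices, matmul) are hypothetical.

```c
#include <assert.h>

#define M 8  /* hypothetical problem size, chosen only for illustration */

static int A[M][M], B[M][M], C[M][M];

/* Fill A with a simple pattern and make B the identity matrix,
   so that the product A * B equals A. */
void init_matrices(void) {
    for (int i = 0; i < M; i++)
        for (int j = 0; j < M; j++) {
            A[i][j] = i + j;
            B[i][j] = (i == j) ? 1 : 0;
            C[i][j] = 0;
        }
}

/* The i and j loops are parallel: each (i, j) pair computes its own
   element of C independently. The innermost k loop accumulates into a
   single value and is therefore not parallel. C is the only array
   that is written. */
void matmul(void) {
    for (int i = 0; i < M; i++)
        for (int j = 0; j < M; j++) {
            int acc = 0;
            for (int k = 0; k < M; k++)
                acc += A[i][k] * B[k][j];
            C[i][j] = acc;
        }
}
```

Calling init_matrices() and then matmul() leaves C equal to A, because B is the identity matrix in this sketch.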
README
Polly - Polyhedral optimizations for LLVM

Polly uses a mathematical representation, the polyhedral model, to represent and transform loops and other control flow structures. Using an abstract representation it is possible to reason about transformations in a more general way and to use highly optimized linear programming libraries to figure out the optimal loop structure. These transformations can be used to do constant propagation through arrays, remove dead loop iterations, optimize loops for cache locality, optimize arrays, apply advanced automatic parallelization, drive vectorization, or they can be used to do software pipelining.
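To make "optimize loops for cache locality" concrete, here is a minimal hand-written C sketch of one classic transformation of this kind, loop interchange. It is not output produced by Polly, and every name in it (N, src, dst_before, dst_after, and the functions) is hypothetical. It only illustrates the sort of rewrite that reasoning over the iteration space can justify.

```c
#include <assert.h>

#define N 64  /* hypothetical problem size, chosen only for illustration */

static int src[N][N];
static int dst_before[N][N];
static int dst_after[N][N];

/* Fill the input with a deterministic pattern. */
void fill_input(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            src[i][j] = i * N + j;
}

/* Original nest: the inner loop walks down a column, so consecutive
   iterations touch elements that are N ints apart in memory. */
void scale_column_order(void) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            dst_before[i][j] = 2 * src[i][j];
}

/* Interchanged nest: the inner loop walks along a row, so consecutive
   iterations touch adjacent elements. The interchange is legal here
   because every iteration writes a distinct element and reads nothing
   that another iteration writes. */
void scale_row_order(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            dst_after[i][j] = 2 * src[i][j];
}

/* Return 1 if both loop orders produced identical results, else 0. */
int outputs_match(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (dst_before[i][j] != dst_after[i][j])
                return 0;
    return 1;
}

/* Run both versions and check that the interchange preserved the result. */
void run_demo(void) {
    fill_input();
    scale_column_order();
    scale_row_order();
    assert(outputs_match());
}
```

Both versions visit the same set of (i, j) iterations and differ only in the order of the visits. A description of the loop nest as a set of iterations plus their dependences makes that kind of reordering easy to check.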