forked from OSchip/llvm-project
1abd9ffa37
Summary: Add a sequence number that identifies a ptx_kernel's parent Scop within a function to it's name to differentiate it from other kernels produced from the same function, yet different Scops. Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops and each Scop being able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985, by differentiating kernels across Scops as well. Previously (even before D33985) while profiling kernels generated through JIT e.g. Julia, [[ https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ | kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name ]]. This patch prevents this grouping and the kernels are reported separately. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: mehdi_amini, nemanjai, pollydev, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35176 llvm-svn: 307814 |
||
---|---|---|
.. | ||
cmake | ||
docs | ||
include/polly | ||
lib | ||
test | ||
tools | ||
unittests | ||
utils | ||
www | ||
.arcconfig | ||
.arclint | ||
.gitattributes | ||
.gitignore | ||
CMakeLists.txt | ||
CREDITS.txt | ||
LICENSE.txt | ||
README |
README
Polly - Polyhedral optimizations for LLVM ----------------------------------------- http://polly.llvm.org/ Polly uses a mathematical representation, the polyhedral model, to represent and transform loops and other control flow structures. Using an abstract representation it is possible to reason about transformations in a more general way and to use highly optimized linear programming libraries to figure out the optimal loop structure. These transformations can be used to do constant propagation through arrays, remove dead loop iterations, optimize loops for cache locality, optimize arrays, apply advanced automatic parallelization, drive vectorization, or they can be used to do software pipelining.