Commit Graph

7 Commits

Author SHA1 Message Date
Alexey Bataev 8c5555c39a [OPENMP][NVPTX]Mark more functions as always_inline for better
performance.

Internally generated functions must be marked as always_inlines in most
cases. Patch marks some extra reduction function + outlined parallel
functions as always_inline for better performance, but only if the
optimization is requested.

llvm-svn: 361269
2019-05-21 15:11:58 +00:00
Alexey Bataev 1fc1f8e819 [OPENMP][NVPTX]Use __kmpc_data_sharing_coalesced_push_stack function.
Coalesced memory access requires use of the new function
`__kmpc_data_sharing_coalesced_push_stack` instead of the
`__kmpc_data_sharing_push_stack`.

llvm-svn: 345991
2018-11-02 16:08:31 +00:00
Alexey Bataev 9ea3c38597 [OPENMP][NVPTX] Support memory coalescing for globalized variables.
Added support for memory coalescing for better performance for
globalized variables. From now on all the globalized variables are
represented as arrays of 32 elements and each thread accesses these
elements using `tid & 31` as index.

llvm-svn: 344049
2018-10-09 14:49:00 +00:00
Alexey Bataev 8d8e1235ab [OPENMP][NVPTX] Add support for lightweight runtime.
If the target construct can be executed in SPMD mode + it is a loop
based directive with static scheduling, we can use lightweight runtime
support.

llvm-svn: 340953
2018-08-29 18:32:21 +00:00
Alexey Bataev b99dcb5f31 [OPENMP, NVPTX] Do not globalize local variables in parallel regions.
In generic data-sharing mode we are allowed to not globalize local
variables that escape their declaration context iff they are declared
inside of the parallel region. We can do this because L2 parallel
regions are executed sequentially and, thus, we do not need to put
shared local variables in the global memory.

llvm-svn: 336567
2018-07-09 17:43:58 +00:00
Alexey Bataev 9a70017537 [OPENMP, NVPTX] Fix linkage of the global entries.
The linkage of the global entries must be weak to enable support of
redefinition of the same target regions in multiple compilation units.

llvm-svn: 331768
2018-05-08 14:16:57 +00:00
Alexey Bataev c99042ba97 [OPENMP, NVPTX] Improve globalization of the variables captured by value.
If the variable is captured by value and the corresponding parameter in
the outlined function escapes its declaration context, this parameter
must be globalized. To globalize it we need to get the address of the
original parameter, load the value, store it to the global address and
use this global address instead of the original.

Patch improves globalization for parallel|teams regions + functions in
declare target regions.

llvm-svn: 327654
2018-03-15 18:10:54 +00:00