Added initial support for L2 parallelism in SPMD mode. Note, though,
that the orphaned parallel directives are not currently supported in
SPMD mode.
llvm-svn: 332016
It is required to emit unique names for offloading regions ids. Required
to support compilation and linking of several compilation units.
llvm-svn: 331899
If the global variables are marked as declare target and they need
ctors/dtors, these ctors/dtors are emitted and then invoked by the
offloading runtime library. They are not explicitly used in the emitted
code and thus can be optimized out. Patch marks these functions as used,
so the optimizer cannot remove these function during the optimization
phase.
llvm-svn: 331879
The linkage of the global entries must be weak to enable support of
redefinition of the same target regions in multiple compilation units.
llvm-svn: 331768
Added initial codegen for level 2, 3 etc. parallelism. Currently, all
the second, the third etc. parallel regions will run sequentially.
llvm-svn: 331642
enabled for the host.
If the compilation for the host enables C++ exceptions, but they are not
supported by the device, we still need to allow the code with the
exception handling constructs outside of the target regions.
llvm-svn: 331372
Some symbols are not allowed to be used as names on some targets. Patch
ries to unify the emission of the names of LLVM globals so they could be
used on different targets.
llvm-svn: 331358
devices.
If the function is an instantiation|specialization of the template and
is used in the device code, the definitions of such functions should be
emitted for the device.
llvm-svn: 331261
We should not emit warning that the parameters are not marked as declare
target, these declaration are local and cannot be marked as declare
target.
llvm-svn: 331211
Emit error messages instead of compiler crashing when the target region
does not exist in the device code + fix crash when the location comes
from macros.
llvm-svn: 331195
NVPTX target.
When generating the wrapper function for the offloading region, we need
to call the outlined function and cast the arguments correctly to follow
the ABI. Usually, variables captured by value are casted to `uintptr_t`
type. But this should not performed for the variables with pointer type.
llvm-svn: 330620
Global variables marked as declare target are allowed to be used in map
clauses. Patch fixes the crash of the compiler on the declare target
variables in map clauses.
llvm-svn: 330156
Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv
as we operate on unsigned values only (addresses, converted to integers)
llvm-svn: 329411
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
variables.
Added emission of the offloading data sections for the variables within
declare target regions + fixes emission of the declare target variables
marked as declare target not within the declare target region.
llvm-svn: 328888
When the declare target variables are emitted for the device,
constructors|destructors for these variables must emitted and registered
by the runtime in the offloading sections.
llvm-svn: 328705
If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.
llvm-svn: 328544
Summary: The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init() function is called by the master. This ensures that the per-team data structures of the runtime have been initialized.
Reviewers: ABataev, grokos, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D44749
llvm-svn: 328219
constructs in generic mode.
Fixed codegen for distribute parallel combined constructs. We have to
pass and read the shared lower and upper bound from the distribute
region in the inner parallel region. Patch is for generic mode.
llvm-svn: 327990
If the generic codegen is enabled and private copy of the original
variable escapes the declaration context, this private copy should be
globalized just like it was the original variable.
llvm-svn: 327985
If the variable is captured by value and the corresponding parameter in
the outlined function escapes its declaration context, this parameter
must be globalized. To globalize it we need to get the address of the
original parameter, load the value, store it to the global address and
use this global address instead of the original.
Patch improves globalization for parallel|teams regions + functions in
declare target regions.
llvm-svn: 327654
Added initial codegen for device side of declarations inside `omp
declare target` construct + codegen for implicit `declare target`
functions, which are used in the target regions.
llvm-svn: 327636
If initialization of the task reductions requires pointer to original
variable, which is stored in the threadprivate storage, we used the
address of this pointer instead.
llvm-svn: 327136
using.
We may emit the code in wrong order because of incorrect implementation
of the runtime functions for task reductions. Threadprivate storages may
be initialized after real initialization of the reduction items. Patch
fixes this problem.
llvm-svn: 327008
Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner.
Reviewers: ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43625
llvm-svn: 326948
We may emit incorrect lifetime info during codegen for loop counters in
OpenMP constructs because of automatic scope cleanup when we needed
temporarily locations for private loop counters.
llvm-svn: 326922
variables.
If the task has reduction construct and this construct for some variable
requires unique threadprivate storage, we may generate different names
for variables used in taskgroup task_reduction clause and in task
in_reduction clause. Patch fixes this problem.
llvm-svn: 326827
Patch fixes the problem with the functions marked as `declare simd`. If
the canonical declaration does not have associated `declare simd`
construct, we may not generate required code even if other
redeclarations are marked as `declare simd`.
llvm-svn: 326594
In CUDA mode all local variables are actually thread
local|threadprivate, not private, and, thus, they cannot be shared
between threads|lanes.
llvm-svn: 326590
Differential Revision: https://reviews.llvm.org/D43852
This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag.
llvm-svn: 326368
If several member expressions are mapped and they reference the same
address as a base, but access different members, this must be allowed.
llvm-svn: 326212