For some of the clauses the closing location erroneously points to the
beginning of the next clause rather than on the location of the closing
bracket of the clause.
llvm-svn: 336460
A reduction for an incomplete array type used to produce an assert
fail during codegen. Now it produces a diagnostic.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D48735
llvm-svn: 335911
Patch tries to make better analysis of the variables that should be
globalized. From now, instead of all parallel directives it will check
only distribute parallel .. directives and check only for
firstprivte/lastprivate variables if they must be globalized.
llvm-svn: 335632
If the shuffle is required for the reduced structures/big data type,
current code may cause compiler crash because of the loading of the
aggregate values. Patch fixes this problem.
llvm-svn: 335377
parallel region.
If the current construct requires sharing of the local variable in the
inner parallel region, this variable must be globalized to avoid
runtime crash.
llvm-svn: 335285
If orphaned parallel region is found, the next code must be emitted:
```
if(__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid))
Serialized execution.
else if (IsMasterThread())
Prepare and signal worker.
else
Outined function call.
```
llvm-svn: 333301
if `-fopenmp-simd` is specified alone, `_OPENMP` macro should not be
defined. If `-fopenmp-simd` is specified along with the `-fopenmp`,
`_OPENMP` macro should be defined with the value `201511`.
llvm-svn: 332852
functions.
If the combined construct is specified in the declare target function
and the device code is emitted, the compiler crashes because of the
incorrectly chosen captured stmt. We should choose the innermost
captured statement, not the outermost.
llvm-svn: 332477
If the orphaned directive is executed in SPMD mode, we need to emit the
check for the SPMD mode and run the orphaned parallel directive in
sequential mode.
llvm-svn: 332467
In generic data-sharing mode we do not need to globalize
variables/parameters of reference/pointer types. They already are placed
in the global memory.
llvm-svn: 332380
Added initial support for L2 parallelism in SPMD mode. Note, though,
that the orphaned parallel directives are not currently supported in
SPMD mode.
llvm-svn: 332016
It is required to emit unique names for offloading regions ids. Required
to support compilation and linking of several compilation units.
llvm-svn: 331899
If the global variables are marked as declare target and they need
ctors/dtors, these ctors/dtors are emitted and then invoked by the
offloading runtime library. They are not explicitly used in the emitted
code and thus can be optimized out. Patch marks these functions as used,
so the optimizer cannot remove these function during the optimization
phase.
llvm-svn: 331879
The linkage of the global entries must be weak to enable support of
redefinition of the same target regions in multiple compilation units.
llvm-svn: 331768
Added initial codegen for level 2, 3 etc. parallelism. Currently, all
the second, the third etc. parallel regions will run sequentially.
llvm-svn: 331642
enabled for the host.
If the compilation for the host enables C++ exceptions, but they are not
supported by the device, we still need to allow the code with the
exception handling constructs outside of the target regions.
llvm-svn: 331372
Some symbols are not allowed to be used as names on some targets. Patch
ries to unify the emission of the names of LLVM globals so they could be
used on different targets.
llvm-svn: 331358
devices.
If the function is an instantiation|specialization of the template and
is used in the device code, the definitions of such functions should be
emitted for the device.
llvm-svn: 331261
We should not emit warning that the parameters are not marked as declare
target, these declaration are local and cannot be marked as declare
target.
llvm-svn: 331211
Emit error messages instead of compiler crashing when the target region
does not exist in the device code + fix crash when the location comes
from macros.
llvm-svn: 331195
NVPTX target.
When generating the wrapper function for the offloading region, we need
to call the outlined function and cast the arguments correctly to follow
the ABI. Usually, variables captured by value are casted to `uintptr_t`
type. But this should not performed for the variables with pointer type.
llvm-svn: 330620
Global variables marked as declare target are allowed to be used in map
clauses. Patch fixes the crash of the compiler on the declare target
variables in map clauses.
llvm-svn: 330156
Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv
as we operate on unsigned values only (addresses, converted to integers)
llvm-svn: 329411
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
variables.
Added emission of the offloading data sections for the variables within
declare target regions + fixes emission of the declare target variables
marked as declare target not within the declare target region.
llvm-svn: 328888
When the declare target variables are emitted for the device,
constructors|destructors for these variables must emitted and registered
by the runtime in the offloading sections.
llvm-svn: 328705
If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.
llvm-svn: 328544
Summary: The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init() function is called by the master. This ensures that the per-team data structures of the runtime have been initialized.
Reviewers: ABataev, grokos, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D44749
llvm-svn: 328219
constructs in generic mode.
Fixed codegen for distribute parallel combined constructs. We have to
pass and read the shared lower and upper bound from the distribute
region in the inner parallel region. Patch is for generic mode.
llvm-svn: 327990
If the generic codegen is enabled and private copy of the original
variable escapes the declaration context, this private copy should be
globalized just like it was the original variable.
llvm-svn: 327985
If the variable is captured by value and the corresponding parameter in
the outlined function escapes its declaration context, this parameter
must be globalized. To globalize it we need to get the address of the
original parameter, load the value, store it to the global address and
use this global address instead of the original.
Patch improves globalization for parallel|teams regions + functions in
declare target regions.
llvm-svn: 327654
Added initial codegen for device side of declarations inside `omp
declare target` construct + codegen for implicit `declare target`
functions, which are used in the target regions.
llvm-svn: 327636
If initialization of the task reductions requires pointer to original
variable, which is stored in the threadprivate storage, we used the
address of this pointer instead.
llvm-svn: 327136
using.
We may emit the code in wrong order because of incorrect implementation
of the runtime functions for task reductions. Threadprivate storages may
be initialized after real initialization of the reduction items. Patch
fixes this problem.
llvm-svn: 327008
Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner.
Reviewers: ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43625
llvm-svn: 326948
We may emit incorrect lifetime info during codegen for loop counters in
OpenMP constructs because of automatic scope cleanup when we needed
temporarily locations for private loop counters.
llvm-svn: 326922
variables.
If the task has reduction construct and this construct for some variable
requires unique threadprivate storage, we may generate different names
for variables used in taskgroup task_reduction clause and in task
in_reduction clause. Patch fixes this problem.
llvm-svn: 326827
Patch fixes the problem with the functions marked as `declare simd`. If
the canonical declaration does not have associated `declare simd`
construct, we may not generate required code even if other
redeclarations are marked as `declare simd`.
llvm-svn: 326594
In CUDA mode all local variables are actually thread
local|threadprivate, not private, and, thus, they cannot be shared
between threads|lanes.
llvm-svn: 326590
Differential Revision: https://reviews.llvm.org/D43852
This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag.
llvm-svn: 326368
If several member expressions are mapped and they reference the same
address as a base, but access different members, this must be allowed.
llvm-svn: 326212
The tests that failed on a windows host have been fixed.
Original message:
Start setting dso_local for COFF.
With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.
llvm-svn: 325940
Differential Revision: https://reviews.llvm.org/D43513
This is a bug fix that removes the emission of reduction support for pragma 'distribute' when found alone or in combinations without simd.
Pragma 'distribute' does not have a reduction clause, but when combined with pragma 'simd' we need to emit the support for simd's reduction clause as part of code generation for distribute. This guard is similar to the one used for reduction support earlier in the same code gen function.
llvm-svn: 325822
constructs.
The compiler may emit some extra warnings for functions, that are
implicit specialization of the templates, declared in the target region.
llvm-svn: 325391
Codegen for ordered with doacross construct might produce incorrect code
because of missing cleanup scope for the construct. Without this scope
the final runtime function call could be emitted in the wrong order that
leads to incorrect codegen.
llvm-svn: 325304
Summary:
-ast-print prints omp pragmas with a trailing space. While this
behavior is likely of little concern to most users, surely it's
unintentional, and it's annoying for some source-level work I'm
pursuing. This patch focuses on omp pragmas, but it also fixes
init_seg and loop hint pragmas because they share implementation.
The testing strategy here is to add usually just one '{{$}}' per
relevant -ast-print test file. This seems to achieve good code
coverage. However, this strategy is probably easy to forget as the
tests evolve. That's probably fine as this fix is far from critical.
The main goal of the testing is to aid the initial review.
This patch also adds a fixme for "#pragma unroll", which prints as
"#pragma unroll (enable)", which is invalid syntax.
Reviewers: ABataev
Reviewed By: ABataev
Subscribers: guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43204
llvm-svn: 325145
Summary:
This patch also adds the 'DW_AT_artificial' flag to the generated variable.
Addresses the issues mentioned in http://llvm.org/PR30553.
Reviewers: CarlosAlbertoEnciso, probinson, aprantl
Reviewed By: aprantl
Subscribers: JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D43189
llvm-svn: 324988
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
This should implement:
Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553
Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie
Reviewed By: aprantl
Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D41698
llvm-svn: 323952
Summary:
This change is step three in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment()
and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
Reviewers: rjmccall
Subscribers: jyknight, nemanjai, nhaehnle, javed.absar, sbc100, aheejin, kbarton, fedor.sergeev, cfe-commits
Differential Revision: https://reviews.llvm.org/D41677
llvm-svn: 323617
Summary:
Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.
This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.
llvm-svn: 322964
Firstly, each offloading entry must have a unique name or the
linker will complain if there are multiple files with target
regions. Secondly, the compiler must not introduce padding so
mark the struct with a PackedAttr.
Differential Revision: https://reviews.llvm.org/D42168
llvm-svn: 322858
parallel for simd` directives.
Added codegen for `depend` clauses on `#pragma omp target teams
distribute parallel for simd` directives.
llvm-svn: 322587
simd`.
Added host codegen + codegen for devices with default codegen for
`#pragma omp target teams distribute parallel for simd` directive.
llvm-svn: 322515
getAssociatedStmt() returns the outermost captured statement for the
OpenMP directive. It may return incorrect region in case of combined
constructs. Reworked the code to reduce the number of calls of
getAssociatedStmt() and used getInnermostCapturedStmt() and
getCapturedStmt() functions instead.
In case of firstprivate variables it may lead to an extra allocas
generation for private copies even if the variable is passed by value
into outlined function and could be used directly as private copy.
llvm-svn: 322393
While updating clang tests for having clang set dso_local I noticed
that:
- There are *a lot* of tests to update.
- Many of the updates are redundant.
They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.
llvm-svn: 322318
Summary:
First, this patch fixes an assert failure when, for example, "omp for"
has num_teams.
Second, this patch prevents duplicate diagnostics when, for example,
"omp for" has uniform.
This patch makes the general assumption (even where it doesn't
necessarily fix an existing bug) that it is worthless to perform sema
for a clause that appears on a directive on which OpenMP does not
permit that clause. However, due to this assumption, this patch
suppresses some diagnostics that were expected in the test suite. I
assert that those diagnostics were likely just distracting to the
user.
Reviewers: ABataev
Reviewed By: ABataev
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D41841
llvm-svn: 322107
Patch fixes incorrect capturing of the expressions in clauses with
expressions that must be captured for the combined constructs. Incorrect
capturing may lead to compiler crash during codegen phase.
llvm-svn: 321820
If the reduction required shuffle in the NVPTX codegen, we may need to
cast the reduced value to the integer type. This casting was implemented
incorrectly and may cause compiler crash. Patch fixes this problem.
llvm-svn: 321818
only.
Added support for -fopenmp-simd option that allows compilation of
simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
If the clause is applied to the combined construct and has captured
expression, try to capture this expression by value rather than by
reference.
llvm-svn: 321386
This allows you to dump C++ code that spells bool instead of _Bool, leaves off the elaborated type specifiers when printing struct or class names, and other C-isms.
Fixes the -Wreorder issue and fixes the ast-dump-color.cpp test.
llvm-svn: 321310
This allows you to dump C++ code that spells bool instead of _Bool, leaves off the elaborated type specifiers when printing struct or class names, and other C-isms.
llvm-svn: 321223
Previously the attributes were emitted only for function definitions.
Patch adds emission of the attributes for function declarations.
llvm-svn: 320826
Summary:
The backend should only emit data sharing code for the cases where it is needed.
A new function attribute is used by Clang to enable data sharing only for the cases where OpenMP semantics require it and there are variables that need to be shared.
Reviewers: hfinkel, Hahnfeld, ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: cfe-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D41123
llvm-svn: 320527
This patch is to add diagnose when a function name is
specified on the link clause. According to the OpenMP
spec, only the list items that exclude the function
name are allowed on the link clause.
Differential Revision: https://reviews.llvm.org/D40968
llvm-svn: 320521
Private variables are completely redefined in the outlined regions, so
we don't need to capture them. Patch adds this behavior to the
target-based regions.
llvm-svn: 320078
The adjustment is calculated with CreatePtrDiff() which returns
the difference in (base) elements. This is passed to CreateGEP()
so make sure that the GEP base has the correct pointer type:
It needs to be a pointer to the base type, not a pointer to a
constant sized array.
Differential Revision: https://reviews.llvm.org/D40911
llvm-svn: 319931
If the error is generated during analysis of implicitly or explicitly
mapped variables, it may cause compiler crash because of incorrect
analysis.
llvm-svn: 319774
Though it is incorrect from point of view of OpenMP standard to have
dependent iteration space in OpenMP loops, compiler should not crash.
Patch fixes this problem.
llvm-svn: 319700
Previously we emitted `__tgt_target_teams` only for standalone teams
directives. This patch allows emit this function for all teams-based
directives.
llvm-svn: 319585
distribute directives.
OpenMP standard does not allow to mark the variables as firstprivate and lastprivate at the same time in distribute-based directives. Patch fixes this problem.
llvm-svn: 319560
Clang asserts on undeclared variables on the to or link clause in the declare
target directive. The patch is to properly diagnose the error.
// foo1 and foo2 are not declared
#pragma omp declare target to(foo1)
#pragma omp declare target link(foo2)
Differential Revision: https://reviews.llvm.org/D40588
llvm-svn: 319458
teams region.
If the inner teams region is not correct, it may cause an assertion when
processing outer target region. Patch fixes this problem.
llvm-svn: 319450
directives.
According to the OpenMP standard, only loop control variables can be
used in linear clauses of distribute-based simd directives.
llvm-svn: 319362
In the future the compiler will analyze whether the OpenMP
runtime needs to be (fully) initialized and avoid that overhead
if possible. The functions already take an argument to transfer
that information to the runtime, so pass in the default value 1.
(This is needed for binary compatibility with libomptarget-nvptx
currently being upstreamed.)
Differential Revision: https://reviews.llvm.org/D40354
llvm-svn: 318836
This clang patch changes the __tgt_* API function signatures in preparation for the new map interface.
Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits
Differential revision: https://reviews.llvm.org/D40281
llvm-svn: 318789
Summary:
This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team.
This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are:
- Patch D38978 to the LLVM NVPTX backend which ensures the lowering of shared variables to an device memory which allows the sharing of references;
- Patch (coming soon) is a patch to libomptarget runtime library which ensures that a list of references to shared variables is properly maintained.
A simple code snippet which illustrates an implicit data sharing situation is as follows:
```
#pragma omp target
{
// master thread only
int v;
#pragma omp parallel
{
// worker threads
// use v
}
}
```
Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master.
The code generated in this patch relies on the LLVM NVPTX patch (mentioned above) which prevents v from being lowered in the thread local memory of the master thread thus making the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct.
Since the parallel region is outlined the passing of arguments to the outlined regions must preserve the original order of arguments. The runtime therefore maintains a list of references to shared variables thus ensuring their passing in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin
Reviewed By: ABataev
Subscribers: cfe-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D38976
llvm-svn: 318773
https://reviews.llvm.org/D40187
This patch implements code gen for 'teams distribute parallel for' on the host, including all its clauses and related regression tests.
llvm-svn: 318692
Some target devices (e.g. Nvidia GPUs) don't support dynamic stack
allocation and hence no VLAs. Print errors with description instead
of failing in the backend or generating code that doesn't work.
This patch handles explicit uses of VLAs (local variable in target
or declare target region) or implicitly generated (private) VLAs
for reductions on VLAs or on array sections with non-constant size.
Differential Revision: https://reviews.llvm.org/D39505
llvm-svn: 318601
Summary:
[OpenMP] diagnose assign to firstprivate const
Clang does not diagnose assignments to const variables declared
firstprivate. Furthermore, codegen is broken such that, at run time,
such assignments simply have no effect. For example, the following
prints 0 not 1:
int main() {
const int i = 0;
#pragma omp parallel firstprivate(i)
{ i=1; printf("%d\n", i); }
return 0;
}
This commit makes these assignments a compile error, which is
consistent with other OpenMP compilers I've tried (pgcc 17.4-0, gcc
6.3.0).
Reviewers: ABataev
Reviewed By: ABataev
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D39859
llvm-svn: 317891
If the thread id is requested in windows mode within funclets, we may
generate incorrect function call that could lead to broken codegen.
llvm-svn: 317208
We can generate constant sized arrays whenever the array section has constant
length, even if the base expression itself is a VLA.
Differential Revision: https://reviews.llvm.org/D39504
llvm-svn: 317207
In some cases the compiler can deduce the length of an array section
as constants. With this information, VLAs can be avoided in place of
a constant sized array or even a scalar value if the length is 1.
Example:
int a[4], b[2];
pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }
For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
pragma omp parallel reduction(+: c[1:1][1:2])
{ }
This relands commit r316229 that I reverted in r316235 because it
failed on some bots. During investigation I found that this was because
Clang and GCC evaluate the two arguments to emplace_back() in
ReductionCodeGen::emitSharedLValue() in a different order, hence
leading to a different order of generated instructions in the final
LLVM IR. Fix this by passing in the arguments from temporary variables
that are evaluated in a defined order.
Differential Revision: https://reviews.llvm.org/D39136
llvm-svn: 316362
In some cases the compiler can deduce the length of an array section
as constants. With this information, VLAs can be avoided in place of
a constant sized array or even a scalar value if the length is 1.
Example:
int a[4], b[2];
pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }
For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
pragma omp parallel reduction(+: c[1:1][1:2])
{ }
Differential Revision: https://reviews.llvm.org/D39136
llvm-svn: 316229
If the variables is boolean and we generating inner function with real
types, the codegen may crash because of not loading boolean value from
memory.
llvm-svn: 316011
reduction.
If the reduction is an array or an array section and reduction operation
is declare reduction without initializer, it may lead to crash.
llvm-svn: 315611
in C.
If we try to get the lvalue for thread_id variables in inlined regions,
we did not use the correct version of function. Fixed this bug by adding
overrided version of the function getThreadIDVariableLValue for inlined
regions.
llvm-svn: 315578
If both taskloop and task directives are used at the same time in one
program, we may ran into the situation when the particular type for task
directive is reused for taskloop directives. Patch fixes this problem.
llvm-svn: 315464
Previously we may erroneously try to capture locally declared static
variables, which will lead to crash for target-based constructs.
Patch fixes this problem.
llvm-svn: 315076
In C++11 variable to global variables are considered as constant
expressions and these variables are not captured in the outlined
regions. Patch allows capturing of such variables in the OpenMP regions.
llvm-svn: 315074
If the `defaultmap(tofrom:scalar)` clause is specified, the scalars must
be mapped with 'tofrom' modifiers, otherwise they must be captured as
firstprivates.
llvm-svn: 314995
https://reviews.llvm.org/D38371
This patch implements codegen for the combined 'teams distribute" OpenMP pragma and adds regression tests for all its clauses.
llvm-svn: 314905
declaration.
Patch allows using of the `#pragma omp declare target`| `#pragma omp end
declare target` directives inside the structures if we need to mark as
declare target only some static members.
llvm-svn: 314833
directives.
The argument of the `device` clause in target-based executable
directives must be captured to support codegen for the `target`
directives with the `depend` clauses.
llvm-svn: 314686
Simplified and generalized codegen for non-offloading part that works if
offloading is failed or condition of the `if` clause is `false`.
llvm-svn: 314670
Summary: Test for checking if the mapping is performed correctly. This is a test initially included in Patch https://reviews.llvm.org/D29905
Reviewers: Hahnfeld, carlo.bertolli, caomhin, ABataev
Reviewed By: Hahnfeld
Subscribers: tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D38040
llvm-svn: 314303
Summary: Test for checking if the mapping is performed correctly. This is a test initially included in Patch https://reviews.llvm.org/D29905
Reviewers: Hahnfeld, carlo.bertolli, caomhin
Reviewed By: Hahnfeld
Subscribers: tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D38040
llvm-svn: 314228
Summary: Test for checking if the mapping is performed correctly. This is a test initially included in Patch https://reviews.llvm.org/D29905
Reviewers: Hahnfeld, carlo.bertolli, caomhin
Reviewed By: Hahnfeld
Subscribers: tra, cfe-commits
Differential Revision: https://reviews.llvm.org/D38040
llvm-svn: 314210
directives.
If the variable is used in the target-based region but is not found in
any private|mapping clause, then generate implicit firstprivate|map
clauses for these implicitly mapped variables.
llvm-svn: 314205
If the captured variable has re-declaration we may end up with the
situation where the captured variable is the re-declaration while the
referenced variable is the canonical declaration (or vice versa). In
this case we may generate wrong code. Patch fixes this situation.
llvm-svn: 313995
This is to fix PR31620. MaxAtomicInlineWidth is set to 128 for x86_64. However
for target without cx16 support, 128 atomic operation will generate __sync_*
libcalls. The patch set MaxAtomicInlineWidth to 64 if the target doesn't support
cx16.
Differential Revision: https://reviews.llvm.org/D38046
llvm-svn: 313992
If the captured variable has some redeclarations we may run into the
situation where the redeclaration is used instead of the canonical
declaration and we may consider this variable as one not captured
before.
llvm-svn: 313880
When the value specified for n in ordered(n) is larger than the number of loops a segmentation fault can occur in one of two ways when attempting to print out a diagnostic for an associated depend(sink : vec):
1) The iteration vector vec contains less than n items
2) The iteration vector vec contains a variable that is not a loop control variable
This patch addresses both of these issues.
Differential Revision: https://reviews.llvm.org/D38049
llvm-svn: 313675
__kmpc_for_static_fini().
Added special flags for calls of __kmpc_for_static_fini(), like previous
ly for __kmpc_for_static_init(). Added flag OMP_IDENT_WORK_DISTRIBUTE
for distribute cnstruct, OMP_IDENT_WORK_SECTIONS for sections-based
constructs and OMP_IDENT_WORK_LOOP for loop-based constructs in
location flags.
llvm-svn: 312642
move constructor.
Previously user-defined reduction initializer was considered as an
assignment expression, not as initializer. Fixed this by treating the
initializer expression as an initializer.
llvm-svn: 312638
step>1.
If the loop is a loot with random access iterators and the iteration
construct is represented it += n, then the compiler crashed because of
reusing of the same MaterializedTemporaryExpr around N. Patch fixes it
by using the expression as written, without any special kind of
wrappings.
llvm-svn: 312292
Capturing of the global variables occurs only in target regions. Patch
fixes it and allows capturing of globals in all target executable
directives.
llvm-svn: 312024
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.
This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.
See also PR22780, for making DIExpression not be an MDNode.
Reviewers: aprantl, dexonsmith, dblaikie
Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D37075
llvm-svn: 311594
of class fails to map class static variable.
If the global variable is captured and it has several redeclarations,
sometimes it may lead to a compiler crash. Patch fixes this by working
only with canonical declarations.
llvm-svn: 311479
If worksharing construct has at least one linear item, an implicit
synchronization point must be emitted to avoid possible conflict with
the loading/storing values to the original variables. Added implicit
barrier if the linear item is found before actual start of the
worksharing construct.
llvm-svn: 311013
If exceptions are enabled, there may be a problem with the codegen of
the finalization functions from OpenMP runtime. It happens because of
the problem with the getting of thread identifier value. Patch tries to
fix it by using the result of the call of function
__kmpc_global_thread_num() rather than loading of value of outlined
function parameter.
llvm-svn: 311007
When translating arguments for NVPTX target it is not taken into account
that function may have variable number of arguments. Patch fixes this
problem.
llvm-svn: 310920
__kmpc_for_static_init().
OpenMP 5.0 will include OpenMP Tools interface that requires distinguishing different worksharing constructs.
Since the same entry point (__kmp_for_static_init(ident_t *loc,
kmp_int32 global_tid,........)) is called in case static
loop/sections/distribute it is suggested using 'flags' field of the
ident_t structure to pass the type of the construct.
llvm-svn: 310865
name.
If the host code is compiled with the debug info, while the target
without, there is a problem that the compiler is unable to find the
debug wrapper. Patch fixes this problem by emitting special name for the
debug version of the code.
llvm-svn: 310511
Converting a _Complex type to a real one simply discards the imaginary part.
This can easily lead to loss of information so for safety (and GCC
compatibility) this patch disallows that when the conversion would be implicit.
The one exception is bool, which actually compares both real and imaginary
parts and so is safe.
llvm-svn: 310427
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310387
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310377
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310360
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310104
'#pragma pack (pop)' and suspicious uses of '#pragma pack' in included files
The second recommit (r309106) was reverted because the "non-default #pragma
pack value chages the alignment of struct or union members in the included file"
warning proved to be too aggressive for external projects like Chromium
(https://bugs.chromium.org/p/chromium/issues/detail?id=749197). This recommit
makes the problematic warning a non-default one, and gives it the
-Wpragma-pack-suspicious-include warning option.
The first recommit (r308441) caused a "non-default #pragma pack value might
change the alignment of struct or union members in the included file" warning
in LLVM itself. This recommit tweaks the added warning to avoid warnings for
#includes that don't have any records that are affected by the non-default
alignment. This tweak avoids the previously emitted warning in LLVM.
Original message:
This commit adds a new -Wpragma-pack warning. It warns in the following cases:
- When a translation unit is missing terminating #pragma pack (pop) directives.
- When entering an included file if the current alignment value as determined
by '#pragma pack' directives is different from the default alignment value.
- When leaving an included file that changed the state of the current alignment
value.
rdar://10184173
Differential Revision: https://reviews.llvm.org/D35484
llvm-svn: 309386
When an omp for loop is canceled the constructed objects are being destructed
twice.
It looks like the desired code is:
{
Obj o;
If (cancelled) branch-through-cleanups to cancel.exit.
}
[cleanups]
cancel.exit:
__kmpc_for_static_fini
br cancel.cont (*)
cancel.cont:
__kmpc_barrier
return
The problem seems to be the branch to cancel.cont is currently also going
through the cleanups calling them again. This change just does a direct branch
instead.
Patch By: michael.p.rice@intel.com
Differential Revision: https://reviews.llvm.org/D35854
llvm-svn: 309288
The warning fires on non-suspicious code in Chromium. Reverting until a
solution is figured out.
> Recommit r308327 2nd time: Add a warning for missing
> '#pragma pack (pop)' and suspicious uses of '#pragma pack' in included files
>
> The first recommit (r308441) caused a "non-default #pragma pack value might
> change the alignment of struct or union members in the included file" warning
> in LLVM itself. This recommit tweaks the added warning to avoid warnings for
> #includes that don't have any records that are affected by the non-default
> alignment. This tweak avoids the previously emitted warning in LLVM.
>
> Original message:
>
> This commit adds a new -Wpragma-pack warning. It warns in the following cases:
>
> - When a translation unit is missing terminating #pragma pack (pop) directives.
> - When entering an included file if the current alignment value as determined
> by '#pragma pack' directives is different from the default alignment value.
> - When leaving an included file that changed the state of the current alignment
> value.
>
> rdar://10184173
>
> Differential Revision: https://reviews.llvm.org/D35484
llvm-svn: 309186
'#pragma pack (pop)' and suspicious uses of '#pragma pack' in included files
The first recommit (r308441) caused a "non-default #pragma pack value might
change the alignment of struct or union members in the included file" warning
in LLVM itself. This recommit tweaks the added warning to avoid warnings for
#includes that don't have any records that are affected by the non-default
alignment. This tweak avoids the previously emitted warning in LLVM.
Original message:
This commit adds a new -Wpragma-pack warning. It warns in the following cases:
- When a translation unit is missing terminating #pragma pack (pop) directives.
- When entering an included file if the current alignment value as determined
by '#pragma pack' directives is different from the default alignment value.
- When leaving an included file that changed the state of the current alignment
value.
rdar://10184173
Differential Revision: https://reviews.llvm.org/D35484
llvm-svn: 309106
If the member declaration is captured in the OMPCapturedExprDecl, we may
loose data-sharing attribute info for this declaration. Patch fixes this
bug.
llvm-svn: 308629
This seems to have broken the sanitizer-x86_64-linux buildbot. Reverting until
it's fixed, especially since this landed just before the 5.0 branch.
> This commit adds a new -Wpragma-pack warning. It warns in the following cases:
>
> - When a translation unit is missing terminating #pragma pack (pop) directives.
> - When entering an included file if the current alignment value as determined
> by '#pragma pack' directives is different from the default alignment value.
> - When leaving an included file that changed the state of the current alignment
> value.
>
> rdar://10184173
>
> Differential Revision: https://reviews.llvm.org/D35484
llvm-svn: 308455
and suspicious uses of '#pragma pack' in included files
This commit adds a new -Wpragma-pack warning. It warns in the following cases:
- When a translation unit is missing terminating #pragma pack (pop) directives.
- When entering an included file if the current alignment value as determined
by '#pragma pack' directives is different from the default alignment value.
- When leaving an included file that changed the state of the current alignment
value.
rdar://10184173
Differential Revision: https://reviews.llvm.org/D35484
llvm-svn: 308441
of '#pragma pack' in included files
This commit adds a new -Wpragma-pack warning. It warns in the following cases:
- When a translation unit is missing terminating #pragma pack (pop) directives.
- When entering an included file if the current alignment value as determined
by '#pragma pack' directives is different from the default alignment value.
- When leaving an included file that changed the state of the current alignment
value.
rdar://10184173
Differential Revision: https://reviews.llvm.org/D35484
llvm-svn: 308327