Commit Graph

1174 Commits

Author SHA1 Message Date
Patrick Lyster e13b1e3299 [OpenMP] Added support for explicit mapping of classes using 'this' pointer. Differential revision: https://reviews.llvm.org/D55982
llvm-svn: 350252
2019-01-02 19:28:48 +00:00
Alexey Bataev db43f0696e [OPENMP]Fix processing of the clauses on target combined directives.
For constants with the predefined data-sharing clauses we may had
troubles with the target combined directives. It may cause compiler
crash in some corner cases.

llvm-svn: 350127
2018-12-28 17:27:32 +00:00
Nico Weber 170a55b893 Pass a concrete triple for two OpenMP tests that depend on TLS
Not all %itanium_abi_triple values support TLS. Makes
OpenMP/declare_reduction_codegen.cpp, OpenMP/parallel_copyin_codegen.cpp for
%itanium_abi_triples without TLS support.

Alternatively we could pass -fnoopenmp-use-tls and tweak some of the CHECK
lines, but this seems a bit simpler.

Fixes PR40156.

Differential Revision: https://reviews.llvm.org/D56086

llvm-svn: 350067
2018-12-26 16:06:26 +00:00
Michael Kruse 0535137e4a [CodeGen] Generate llvm.loop.parallel_accesses instead of llvm.mem.parallel_loop_access metadata.
Instead of generating llvm.mem.parallel_loop_access metadata, generate
llvm.access.group on instructions and llvm.loop.parallel_accesses on
loops. There is one access group per generated loop.

This is clang part of D52116/r349725.

Differential Revision: https://reviews.llvm.org/D52117

llvm-svn: 349823
2018-12-20 21:24:54 +00:00
Alexey Bataev ce90181751 [OPENMP]Mark the loop as started when initialized.
Need to mark the loop as started when the initialization statement is
found. It is required to prevent possible incorrect loop iteraton
variable detection during template instantiation and fix the compiler
crash during the codegen.

llvm-svn: 349657
2018-12-19 18:16:37 +00:00
Joel E. Denny 0fdf5a9acc [OpenMP] Fix data sharing analysis in nested clause
Without this patch, clang doesn't complain that X needs explicit data
sharing attributes in the following:

```
 #pragma omp target teams default(none)
 {
   #pragma omp parallel num_threads(X)
     ;
 }
```

However, clang does produce that complaint after the braces are
removed.  With this patch, clang complains in both cases.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D55861

llvm-svn: 349635
2018-12-19 15:59:47 +00:00
Kelvin Li ef57943e3f [OPENMP] parsing and sema support for 'close' map-type-modifier
A map clause with the close map-type-modifier is a hint to 
prefer that the variables are mapped using a copy into faster 
memory.

Patch by Ahsan Saghir (saghir)

Differential Revision: https://reviews.llvm.org/D55719

llvm-svn: 349551
2018-12-18 22:18:41 +00:00
Alexey Bataev 6a1b06bcd4 [OPENMP][NVPTX]Emit shared memory buffer for reduction as 128 bytes
buffer.

Seems to me, nvlink has a bug with the proper support of the weakly
linked symbols. It does not allow to define several shared memory buffer
with the different sizes even with the weak linkage. Instead we always
use 128 bytes buffer to prevent nvlink from the error message emission.

llvm-svn: 349540
2018-12-18 21:01:42 +00:00
Alexey Bataev 29d47fcb30 [OPENMP][NVPTX]Added extra sync point to the inter-warp copy function.
The parallel reduction operation requires an extra synchronization point
in the inter-warp copy function to avoid divergence.

llvm-svn: 349525
2018-12-18 19:20:15 +00:00
Alexey Bataev ae51b96f99 [OPENMP][NVPTX]Improved interwarp copy function.
Inlined runtime with the current implementation of the interwarp copy
function leads to the undefined behavior because of the not quite
correct implementation of the barriers. Start using generic
__kmpc_barier function instead of the custom made barriers.

llvm-svn: 349192
2018-12-14 21:00:58 +00:00
Alexey Bataev 6393eb7ec6 [OPENMP][NVPTX] Fix globalization of the mapped array sections.
If the array section is based on pointer and this sections is mapped in
target region + then it is used in the inner parallel region, it also
must be globalized as the pointer itself is passed by value, not by
reference.

llvm-svn: 348492
2018-12-06 15:35:13 +00:00
Alexey Bataev 2c1ff9dda7 [OPENMP][NVPTX]Fixed emission of the critical region.
Critical regions in NVPTX are the constructs, which, generally speaking,
are not supported by the NVPTX target. Instead we're using special
technique to handle the critical regions. Currently they are supported
only within the loop and all the threads in the loop must execute the
same critical region.
Inside of this special regions the regions still must be emitted as
critical, to avoid possible data races between the teams +
synchronization must use __kmpc_barrier functions.

llvm-svn: 348272
2018-12-04 15:25:01 +00:00
Alexey Bataev c3028cac24 [OPENMP][NVPTX]Mark __kmpc_barrier functions as convergent.
__kmpc_barrier runtime functions must be marked as convergent to prevent
some dangerous optimizations. Also, for NVPTX target all barriers must
be emitted as simple barriers.

llvm-svn: 348271
2018-12-04 15:03:25 +00:00
Aaron Ballman 4b5b0c0025 Move AST tests into their own test directory; NFC.
This moves everything primarily testing the functionality of -ast-dump and -ast-print into their own directory, rather than leaving the tests spread around the testing directory.

llvm-svn: 348017
2018-11-30 18:43:02 +00:00
Alexey Bataev 3ce5d827fc [OPENMP][NVPTX]Call get __kmpc_global_thread_num in worker after
initialization.

Function __kmpc_global_thread_num  should be called only after
initialization, not earlier.

llvm-svn: 347919
2018-11-29 21:21:32 +00:00
Gheorghe-Teodor Bercea 2b40470c61 [OpenMP] Add a new version of the SPMD deinit kernel function
Summary: This patch adds a new runtime for the SPMD deinit kernel function which replaces the previous function. The new function takes as argument the flag which signals whether the runtime is required or not. This enables the compiler to optimize out the part of the deinit function which are not needed.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D54970

llvm-svn: 347915
2018-11-29 20:53:49 +00:00
Alexey Bataev 719713ab7d [OPENMP]Fix emission of the target regions in virtual functions.
Fixed emission of the target regions found in the virtual functions.
Previously we may end up with the situation when those regions could be
skipped.

llvm-svn: 347793
2018-11-28 19:00:07 +00:00
Alexey Bataev a116602475 [OPENMP][NVPTX]Basic support for reductions across the teams.
Added basic codegen support for the reductions across the teams.

llvm-svn: 347715
2018-11-27 21:24:54 +00:00
Alexey Bataev e8ad4b7124 [OPENMP][NVPTX]Emit default locations with the correct Exec|Runtime
modes.

If the region is inside target|teams|distribute region, we can emit the
locations with the correct info for execution mode and runtime mode.
Patch adds this ability to the NVPTX codegen to help the optimizer to
produce better code.

llvm-svn: 347583
2018-11-26 18:37:09 +00:00
Alexey Bataev ceeaa48052 [OPENMP][NVPTX]Emit default locations as constant with undefined mode.
For the NVPTX target default locations should be emitted as constants +
additional info must be emitted in the reserved_2 field of the ident_t
structure. The 1st bit controls the execution mode and the 2nd bit
controls use of the lightweight runtime. The combination of the bits for
Non-SPMD mode + lightweight runtime represents special undefined mode,
used outside of the target regions for orphaned directives or functions.
Should allow and additional optimization inside of the target regions.

llvm-svn: 347425
2018-11-21 21:04:34 +00:00
Alexey Bataev 92b33652f6 [OPENMP]Fix handling of the LCVs in loop-based directives.
Loop-control variables with the default data-sharing attributes should
not be captured in the OpenMP region as they are private by default.
Also, default attributes should be emitted for such variables in the
inner OpenMP regions for the correct data sharing during codegen.

llvm-svn: 347409
2018-11-21 19:41:10 +00:00
Kelvin Li efbe4afbda [OPENMP] Support relational-op != (not-equal) as one of the canonical
forms of random access iterator
    
In OpenMP 4.5, only 4 relational operators are supported: <, <=, >, 
and >=.  This work is to enable support for relational operator 
!= (not-equal) as one of the canonical forms.

Patch by Anh Tuyen Tran
    
Differential Revision: https://reviews.llvm.org/D54441

llvm-svn: 347405
2018-11-21 19:10:48 +00:00
Joel E. Denny 435c739fbe [OpenMP] Update CHECK-DAG usage in target_parallel_codegen.cpp
This patch adjusts a test not to depend on deprecated FileCheck
behavior that permits overlapping matches within a block of CHECK-DAG
directives.  Thus, this patch also removes uses of FileCheck's
-allow-deprecated-dag-overlap command-line option.

There were two issues in this test:

1. There were sets of patterns for store instructions in which a
pattern X could match a superset of a pattern Y.  While X appeared
before Y, Y's intended match appeared before X's intended match.  The
result was that X matched Y's intended match.  Under the old
overlapping behavior, Y also matched Y's intended match.  Under the
new non-overlapping behavior, Y had nothing left to match.  This patch
fixes this by gathering these sets in one place and putting the most
specific patterns (Y) before the more general patterns (X).

2. The CHECK-DAG patterns involving the variables CBPADDR3 and
CBPADDR4 were the same, but there was only one match in the text, so
CBPADDR4 patterns had nothing to match under the new non-overlapping
behavior.  Moreover, a preceding related series of directives had
variables (SADDR0, BPADDR0, etc.) numbered only 0 through 4, but this
series had variables numbered 0 through 5.  Assuming CBPADDR4's
directives were not intended, this patch removes them.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D54765

llvm-svn: 347351
2018-11-20 22:05:23 +00:00
Joel E. Denny d586bc6db9 [OpenMP] Update CHECK-DAG usage in for_codegen.cpp
This patch adjusts a test not to depend on deprecated FileCheck
behavior that permits overlapping matches within a block of CHECK-DAG
directives.  Thus, this patch also removes uses of FileCheck's
-allow-deprecated-dag-overlap command-line option.

Specifically, the FileCheck variables DBG_LOC_START, DBG_LOC_END, and
DBG_LOC_CANCEL were all set to the same value.  As a result, three
TERM_DEBUG-DAG patterns, one for each variable, all matched the same
text under the old overlapping behavior.  Under the new
non-overlapping behavior, that's not permitted.  This patch's solution
is to replace these variables with one variable and replace these
patterns with one pattern.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D54764

llvm-svn: 347350
2018-11-20 22:04:45 +00:00
Patrick Lyster 8f7f586e53 [OpenMP] Check target architecture supports unified shared memory for requires directive. Differential Review: https://reviews.llvm.org/D54493
llvm-svn: 347214
2018-11-19 15:09:33 +00:00
Alexey Bataev d1840e5383 [OPENMP]Fix PR39694: do not capture `this` in non-`this` region.
If lambda is used inside of the OpenMP region and captures `this`, we
should recapture it in the OpenMP region also. But we should do this
only if the OpenMP region is used in the context of the same class, just
like the lambda.

llvm-svn: 347096
2018-11-16 21:13:33 +00:00
Alexey Bataev f2f39be9ed [OPENMP][NVPTX]Emit correct reduction code for teams/parallel
reductions.

Fixed previously committed code for the reduction support in
teams/parallel constructs taking into account new design of the NVPTX
support in the compiler. Teams reduction are not fully functional yet,
it is going to be fixed in the following patches.

llvm-svn: 347081
2018-11-16 19:38:21 +00:00
Alexey Bataev 8bcc69c054 [OPENMP][NVPTX]Extend number of constructs executed in SPMD mode.
If the statements between target|teams|distribute directives does not
require execution in master thread, like constant expressions, null
statements, simple declarations, etc., such construct can be xecuted in
SPMD mode.

llvm-svn: 346551
2018-11-09 20:03:19 +00:00
Alexey Bataev 09c9eea78f [OPENMP][NVPTX]Allow to use shared memory for the
target|teams|distribute variables.

If the total size of the variables, declared in target|teams|distribute
regions, is less than the maximal size of shared memory available, the
buffer is allocated in the shared memory.

llvm-svn: 346507
2018-11-09 16:18:04 +00:00
Alexey Bataev 969dbc02c2 [OPENMP]Make lambda mapping follow reqs for PTR_AND_OBJ mapping.
The base pointer for the lambda mapping must point to the lambda capture
placement and pointer must point to the captured variable itself. Patch
fixes this problem.

llvm-svn: 346408
2018-11-08 15:47:39 +00:00
Alexey Bataev 2a6f3f5fa2 [OPENMP]Fix handling of the globals during compilation for the device.
Fixed lookup for the target regions in unused virtual functions + fixed
processing of the global variables not marked as declare target but
emitted during debug info emission.

llvm-svn: 346343
2018-11-07 19:11:14 +00:00
Alexey Bataev 1fc1f8e819 [OPENMP][NVPTX]Use __kmpc_data_sharing_coalesced_push_stack function.
Coalesced memory access requires use of the new function
`__kmpc_data_sharing_coalesced_push_stack` instead of the
`__kmpc_data_sharing_push_stack`.

llvm-svn: 345991
2018-11-02 16:08:31 +00:00
Alexey Bataev 2dc07d0cf3 [OPENMP]Change the mapping type for lambda captures.
The previously used combination `PTR_AND_OBJ | PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ | LITERAL`.

llvm-svn: 345982
2018-11-02 15:25:06 +00:00
Alexey Bataev e40901806f [OPENMP][NVPTX]Improve emission of the globalized variables for
target/teams/distribute regions.

Target/teams/distribute regions exist for all the time the kernel is
executed. Thus, if the variable is declared in their context and then
escape it, we can allocate global memory statically instead of
allocating it dynamically.
Patch captures all the globalized variables in target/teams/distribute
contexts, merges them into the records, one per each target region.
Those records are then joined into the union, one per compilation unit
(to save the global memory). Those units are organized into
2 x dimensional arrays, where the first dimension is
the number of blocks per SM and the second one is the number of SMs.
Runtime functions manage this global memory space between the executing
teams.

llvm-svn: 345978
2018-11-02 14:54:07 +00:00
Patrick Lyster 7a2a27c4a4 Add support for 'atomic_default_mem_order' clause on 'requires' directive. Also renamed test files relating to 'requires'. Differntial review: https://reviews.llvm.org/D53513
llvm-svn: 345967
2018-11-02 12:18:11 +00:00
Alexey Bataev 6070542296 [OPENMP] Support for mapping of the lambdas in target regions.
Added support for mapping of lambdas in the target regions. It scans all
the captures by reference in the lambda, implicitly maps those variables
in the target region and then later reinstate the addresses of
references in lambda to the correct addresses of the captured|privatized
variables.

llvm-svn: 345609
2018-10-30 15:50:12 +00:00
Alexey Bataev f07946e101 [OPENMP]Fix PR39372: Does not complain about loop bound variable not
being shared.

According to the standard, the variables with unspecified data-sharing
attributes in presence of `default(none)` clause must be reported to
users. Compiler did not generate error reports for the variables used in
other OpenMP regions. Patch fixes this.

llvm-svn: 345533
2018-10-29 20:17:42 +00:00
Gheorghe-Teodor Bercea 9d6341ff51 [OpenMP] Fix condition.
Summary: Iteration variable must be strictly less than the number of iterations. This fixes a bug introduced by previous patch D53448.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53827

llvm-svn: 345527
2018-10-29 19:44:25 +00:00
Gheorghe-Teodor Bercea e92567601b [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for
Summary: This patch adds a new code generation path for bound sharing directives containing distribute parallel for. The new code generation scheme applies to chunked schedules on distribute and parallel for directives. The scheme simplifies the code that is being generated by eliminating the need for an outer for loop over chunks for both distribute and parallel for directives. In the case of distribute it applies to any sized chunk while in the parallel for case it only applies when chunk size is 1.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53448

llvm-svn: 345509
2018-10-29 15:45:47 +00:00
Gheorghe-Teodor Bercea 669dbde7a5 [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.
Summary: This patch enables the choosing of the default schedule for parallel for loops even in non-SPMD cases.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53443

llvm-svn: 345507
2018-10-29 15:23:23 +00:00
Alexey Bataev 6ab5bb115a [OPENMP] Do not capture private loop counters.
If the loop counter is not declared in the context of the loop and it is
private, such loop counters should not be captured in the outlined
regions.

llvm-svn: 345505
2018-10-29 15:01:58 +00:00
Gheorghe-Teodor Bercea a6cb25676e [NFC][OpenMP] Add new test for parallel for code generation.
Summary:
This is a simple test of the parallel for code generation. It will be used to showcase the change introduced by patch D53443.


Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53772

llvm-svn: 345417
2018-10-26 18:59:52 +00:00
Alexey Bataev 8fc7b5f922 [OPENMP]Fix PR39422: variables are not firstprivatized in task context.
According to the OpenMP standard, In a task generating construct, if no
default clause is present, a variable for which the data-sharing
attribute is not determined by the rules above is firstprivatized.
Compiler tries to implement this, but if the variable is not directly
used in the task context, this variable may not be firstprivatized.
Patch fixes this problem.

llvm-svn: 345277
2018-10-25 15:35:27 +00:00
Alexey Bataev ac6e4de714 Do not always request an implicit taskgroup region inside the kmpc_taskloop function
Summary:
For the following code:
```
    int i;
    #pragma omp taskloop
    for (i = 0; i < 100; ++i)
    {}

    #pragma omp taskloop nogroup
    for (i = 0; i < 100; ++i)
    {}
```

Clang emits the following LLVM IR:

```
 ...
  call void @__kmpc_taskgroup(%struct.ident_t* @0, i32 %0)
  %2 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates*)* @.omp_task_entry. to i32 (i32, i8*)*))
  ...
  call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %2, i32 1, i64* %8, i64* %9, i64 %13, i32 0, i32 0, i64 0, i8* null)
  call void @__kmpc_end_taskgroup(%struct.ident_t* @0, i32 %0)

  ...
  %15 = call i8* @__kmpc_omp_task_alloc(%struct.ident_t* @0, i32 %0, i32 1, i64 80, i64 8, i32 (i32, i8*)* bitcast (i32 (i32, %struct.kmp_task_t_with_privates.1*)* @.omp_task_entry..2 to i32 (i32, i8*)*))
  ...
  call void @__kmpc_taskloop(%struct.ident_t* @0, i32 %0, i8* %15, i32 1, i64* %21, i64* %22, i64 %26, i32 0, i32 0, i64 0, i8* null)

```

The first set of instructions corresponds to the first taskloop construct. It is important to note that the implicit taskgroup region associated with the taskloop construct has been materialized in our IR:  the `__kmpc_taskloop` occurs inside a taskgroup region. Note also that this taskgroup region does not exist in our second taskloop because we are using the `nogroup` clause.

The issue here is the 4th argument of the kmpc_taskloop call, starting from the end,  is always a zero. Checking the LLVM OpenMP RT implementation, we see that this argument corresponds to the nogroup parameter:

```
void __kmpc_taskloop(ident_t *loc, int gtid, kmp_task_t *task, int if_val,
                     kmp_uint64 *lb, kmp_uint64 *ub, kmp_int64 st, int nogroup,
                     int sched, kmp_uint64 grainsize, void *task_dup);
```

So basically we always tell to the RT to do another taskgroup region. For the first taskloop, this means that we create two taskgroup regions. For the second example, it means that despite the fact we had a nogroup clause we are going to have a taskgroup region, so we unnecessary wait until all descendant tasks have been executed.

Reviewers: ABataev

Reviewed By: ABataev

Subscribers: rogfer01, cfe-commits

Differential Revision: https://reviews.llvm.org/D53636

llvm-svn: 345180
2018-10-24 19:06:37 +00:00
Alexey Bataev b40e0520e1 [OPENMP]Fix PR39366: do not try to private field if it is not captured.
The compiler is crashing if we trying to post-capture the fields
implicitly captured inside of the task constructs. Seems, this kind of
processing is not supported and such fields should not be
firstprivatized.

llvm-svn: 345177
2018-10-24 18:53:12 +00:00
Sean Fertile d900dd0c23 Revert "[CodeGenCXX] Treat 'this' as noalias in constructors"
This reverts commit https://reviews.llvm.org/rL344150 which causes
MachineOutliner related failures on the ppc64le multistage buildbot.

llvm-svn: 344526
2018-10-15 15:43:00 +00:00
Alexey Bataev 4ac58d1a4b [OPENMP][NVPTX]Reduce memory usage in target region.
Additional reduction of the global memory usage in the target regions
without parallel regions.

llvm-svn: 344413
2018-10-12 20:19:59 +00:00
Alexey Bataev 9bfe91da3d [OPENMP][NVPTX]Reduce memory usage in orphaned functions.
if the function has globalized variables and called in context of
target/teams/distribute regions, it does not need to globalize 32
copies of the same variables for memory coalescing, it is enough to
have just one copy, because there is parallel region.
Patch does this by adding call for `__kmpc_parallel_level` function and
checking its return value. If the code sees that the parallel level is
0, then only one variable is allocated, not 32.

llvm-svn: 344356
2018-10-12 16:04:20 +00:00
Alexey Bataev ff23bb6622 [OPENMP][NVPTX]Reduce memory use for globalized vars in
target/teams/distribute regions.

Previously introduced globalization scheme that uses memory coalescing
scheme may increase memory usage fr the variables that are devlared in
target/teams/distribute contexts. We don't need 32 copies of such
variables, just 1. Patch reduces memory use in this case.

llvm-svn: 344273
2018-10-11 18:30:31 +00:00
Patrick Lyster 3fe9e396f4 Add support for 'dynamic_allocators' clause on 'requires' directive. Differential Revision: https://reviews.llvm.org/D53079
llvm-svn: 344249
2018-10-11 14:41:10 +00:00
Anton Bikineev cc7e74753a [CodeGenCXX] Treat 'this' as noalias in constructors
This is currently a clang extension and a resolution
of the defect report in the C++ Standard.

Differential Revision: https://reviews.llvm.org/D46441

llvm-svn: 344150
2018-10-10 16:14:51 +00:00
Alexey Bataev 9ea3c38597 [OPENMP][NVPTX] Support memory coalescing for globalized variables.
Added support for memory coalescing for better performance for
globalized variables. From now on all the globalized variables are
represented as arrays of 32 elements and each thread accesses these
elements using `tid & 31` as index.

llvm-svn: 344049
2018-10-09 14:49:00 +00:00
Alexey Bataev 6bc2732f71 [OPENMP][NVPTX] Fix emission of __kmpc_global_thread_num() for non-SPMD
mode.

__kmpc_global_thread_num() should be called before initialization of the
runtime.

llvm-svn: 343857
2018-10-05 15:27:47 +00:00
Alexey Bataev fd006c44ee [OPENMP] Fix emission of the __kmpc_global_thread_num.
Fixed emission of the __kmpc_global_thread_num() so that it is not
messed up with alloca instructions anymore. Plus, fixes emission of the
__kmpc_global_thread_num() functions in the target outlined regions so
that they are not called before runtime is initialized.

llvm-svn: 343856
2018-10-05 15:08:53 +00:00
Patrick Lyster 6bdf63bd32 [OPENMP] Add reverse_offload clause to requires directive
llvm-svn: 343711
2018-10-03 20:07:58 +00:00
Jonas Hahnfeld 3ca4701d35 [OpenMP][NVPTX] Simplify codegen for orphaned parallel, NFCI.
Worker threads fork off to the compiler generated worker function
directly after entering the kernel function. Hence, there is no
need to check whether the current thread is the master if we are
outside of a parallel region (neither SPMD nor parallel_level > 0).

Differential Revision: https://reviews.llvm.org/D52732

llvm-svn: 343618
2018-10-02 19:12:54 +00:00
Alexey Bataev e79451a517 [OPENMP][NVPTX] Handle `requires datasharing` flag correctly with
lightweight runtime.

The datasharing flag must be set to `1` when executing SPMD-mode compatible directive with reduction|lastprivate clauses.

llvm-svn: 343492
2018-10-01 16:20:57 +00:00
Patrick Lyster 4a370b9f63 Add support for unified_shared_memory clause on requires directive
llvm-svn: 343472
2018-10-01 13:47:43 +00:00
Alexey Bataev bc52967835 [OPENMP]Fix PR39084: Check datasharing attributes of reduction variables only.
According to OpenMP, the reduction item must be shared in parent region.
But the item can be an array section or array subscript. In this case,
we should not check for the datasharing of the base declaration.

llvm-svn: 343356
2018-09-28 19:33:14 +00:00
Gheorghe-Teodor Bercea 8233af90e1 [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing
Summary: Set default schedule for parallel for loops to schedule(static, 1) when using SPMD mode on the NVPTX device offloading toolchain to ensure coalescing.

Reviewers: ABataev, Hahnfeld, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52629

llvm-svn: 343260
2018-09-27 20:29:00 +00:00
Gheorghe-Teodor Bercea 02650d4c2c [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing
Summary: For the OpenMP NVPTX toolchain choose a default distribute schedule that ensures coalescing on the GPU when in SPMD mode. This significantly increases the performance of offloaded target code and reduces the number of registers used on the GPU side.

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52434

llvm-svn: 343253
2018-09-27 19:22:56 +00:00
Kelvin Li 1408f91a25 [OPENMP] Add support for OMP5 requires directive + unified_address clause
Add support for OMP5.0 requires directive and unified_address clause.
Patches to follow will include support for additional clauses.

Differential Revision: https://reviews.llvm.org/D52359

llvm-svn: 343063
2018-09-26 04:28:39 +00:00
Alexey Bataev f898d1de4a [OPENMP] Disable emission of the class with vptr if they are not used in
target constructs.

Prevent compilation of the classes with the virtual tables when
compiling for the device.

llvm-svn: 342741
2018-09-21 14:55:26 +00:00
Alexey Bataev 2adecff1aa [OPENMP][NVPTX] Enable support for lastprivates in SPMD constructs.
Previously we could not use lastprivates in SPMD constructs, patch
allows supporting lastprivates in SPMD with uninitialized runtime.

llvm-svn: 342738
2018-09-21 14:22:53 +00:00
Alexey Bataev e82445f5a9 [OPENMP] Add support for mapping memory pointed by member pointer.
Added support for map(s, s.ptr[0:1]) kind of mapping.

llvm-svn: 342648
2018-09-20 13:54:02 +00:00
Alexey Bataev e6aa4694de [OPENMP] Fix PR38903: Crash on instantiation of the non-dependent
declare reduction.

If the declare reduction construct with the non-dependent type is
defined in the template construct, the compiler might crash on the
template instantition. Reworked the whole instantiation scheme for the
declare reduction constructs to fix this problem correctly.

llvm-svn: 342151
2018-09-13 16:54:05 +00:00
Alexey Bataev 43b90b73e6 [OPENMP] Fix PR38902: support ADL for declare reduction constructs.
Added support for argument-dependent lookup when trying to find the
required declare reduction decl.

llvm-svn: 342062
2018-09-12 16:31:59 +00:00
Alexey Bataev 30a7821760 [OPENMP] Simplified checks for declarations in declare target regions.
Sema analysis should not mark functions as an implicit declare target,
it may break codegen. Simplified semantic analysis and removed extra
code for implicit declare target functions.

llvm-svn: 341939
2018-09-11 13:59:10 +00:00
Kelvin Li bc38e63718 [OpenMP] Add support for nested 'declare target' directives
Add the capability to nest multiple declare target directives 
- including header files within a declare target region.

Differential Revision: https://reviews.llvm.org/D51378

Patch by Patrick Lyster

llvm-svn: 341766
2018-09-10 02:07:09 +00:00
Alexey Bataev 47b2ed2e16 Revert "[OPENMP][NVPTX] Disable runtime-type info for CUDA devices."
Still need the RTTI for NVPTX target to pass sema checks.

llvm-svn: 341668
2018-09-07 14:50:25 +00:00
Alexey Bataev d2c1dd5902 [OPENMP] Fix PR38823: Do not emit vtable if it is not used in target
context.

If the explicit template instantiation definition defined outside of the
target context, its vtable should not be marked as used. This is true
for other situations where the compiler want to emit vtables
unconditionally.

llvm-svn: 341570
2018-09-06 17:56:28 +00:00
Alexey Bataev 33c137bf0c [OPENMP][NVPTX] Disable runtime-type info for CUDA devices.
RTTI is not supported by the NVPTX target.

llvm-svn: 341483
2018-09-05 17:10:30 +00:00
Alexey Bataev bd8ff9bd70 [OPENMP] Fix PR38710: static functions are not emitted as implicitly
'declare target'.

All the functions, referenced in implicit|explicit target regions must
be emitted during code emission for the device.

llvm-svn: 341093
2018-08-30 18:56:11 +00:00
Alexey Bataev 80a9a61ded [OPENMP][NVPTX] Add options -f[no-]openmp-cuda-force-full-runtime.
Added options -f[no-]openmp-cuda-force-full-runtime to [not] force use
of the full runtime for OpenMP offloading to CUDA devices.

llvm-svn: 341073
2018-08-30 14:45:24 +00:00
Alexey Bataev b4dd6d24d7 [OPENMP] Do not create offloading entry for declare target variables
declarations.

We should not create offloading entries for declare target var
declarations as it causes compiler crash.

llvm-svn: 340968
2018-08-29 20:41:37 +00:00
Alexey Bataev 8d8e1235ab [OPENMP][NVPTX] Add support for lightweight runtime.
If the target construct can be executed in SPMD mode + it is a loop
based directive with static scheduling, we can use lightweight runtime
support.

llvm-svn: 340953
2018-08-29 18:32:21 +00:00
Mike Rice e1ca7b614f [OPENMP] Create non-const ident_t objects.
Currently ident_t objects are created const when debug info is not
enabled, but the libittnotify libray in the OpenMP runtime writes to
the reserved_2 field (See __kmp_itt_region_forking in
openmp/runtime/src/kmp_itt.inl).  Now create ident_t objects non-const.

Differential Revision: https://reviews.llvm.org/D51331

llvm-svn: 340934
2018-08-29 15:45:11 +00:00
Alexey Bataev 7f792cab12 [OPENMP] Fix crash on the emission of the weak function declaration.
If the function is actually a weak reference, it should not be marked as
deferred definition as this is only a declaration. Patch adds checks for
the definitions if they must be emitted. Otherwise, only declaration is
emitted.

llvm-svn: 340191
2018-08-20 18:03:40 +00:00
Alexey Bataev d01b74974b [OPENMP] FIx processing of declare target variables.
The compiler may produce unexpected error messages/crashes when declare
target variables were used. Patch fixes problems with the declarations
marked as declare target to or link.

llvm-svn: 339805
2018-08-15 19:45:12 +00:00
Alexey Bataev 97b722121e [OPENMP] Fix processing of declare target construct.
The attribute marked as inheritable since OpenMP 5.0 supports it +
additional fixes to support new functionality.

llvm-svn: 339704
2018-08-14 18:31:20 +00:00
Alexey Bataev f138fda5ed [OPENMP] Fix emission of the loop doacross constructs.
The number of loops associated with the OpenMP loop constructs should
not be considered as the number loops to collapse.

llvm-svn: 339603
2018-08-13 19:04:24 +00:00
Alexey Bataev 23647171ea Revert "[OPENMP] Fix emission of the loop doacross constructs."
This reverts commit r339568 because of the problems with the buildbots.

llvm-svn: 339574
2018-08-13 14:42:18 +00:00
Alexey Bataev 0ce6360e0e [OPENMP] Fix emission of the loop doacross constructs.
The number of loops associated with the OpenMP loop constructs should
not be considered as the number loops to collapse.

llvm-svn: 339568
2018-08-13 14:05:43 +00:00
Alexey Bataev bf8fe71b91 [OPENMP] Mark variables captured in declare target region as implicitly
declare target.

According to OpenMP 5.0, variables captured in lambdas in declare target
regions must be considered as implicitly declare target.

llvm-svn: 339152
2018-08-07 16:14:36 +00:00
Sergey Dmitriev bde9cf942b [OpenMP] Encode offload target triples into comdat key for offload initialization code
Encoding offload target triples onto comdat group key for offload initialization
code guarantees that it will be executed once per each unique combination of
offload targets.

Differential Revision: https://reviews.llvm.org/D50218

llvm-svn: 338916
2018-08-03 20:19:28 +00:00
Alexey Bataev 62a4cb06a9 [OPENMP] Change linkage of offloading symbols to support dropping
offload targets.

Changed the linkage of omp_offloading.img_start.<triple> and omp_offloading.img_end.<triple> symbols from external to external weak to allow dropping of some targets during linking.

llvm-svn: 338413
2018-07-31 18:27:42 +00:00
Alexey Bataev 3823514b56 [OPENMP] Prevent problems with linking of the static variables.
No need to change the linkage, we can avoid the problem using special variable. That points to the original variable and, thus, prevent some of the optimizations that might break the compilation.

llvm-svn: 338399
2018-07-31 16:40:15 +00:00
Alexey Bataev 2f5b2671da [OPENMP] Static variables on device must be externally visible.
Do not mark static variable as internal on the device as they must be
visible from the host to be mapped correctly.

llvm-svn: 338139
2018-07-27 17:37:32 +00:00
Alexey Bataev 77403dee05 [OPENMP] Force OpenMP 4.5 when compiling for offloading.
If the user requested compilation for OpenMP with the offloading
support, force the version of the OpenMP standard to 4.5 by default.

llvm-svn: 338032
2018-07-26 15:17:38 +00:00
Alexey Bataev 8521ff6ec4 [OPENMP] ThreadId in serialized parallel regions is 0.
The first argument for the parallel outlined functions, called as
serialized parallel regions, should be a pointer to the global thread id
that always is 0.

llvm-svn: 337957
2018-07-25 20:03:01 +00:00
Alexey Bataev a36aec9f83 [OPENMP] Exclude service expressions/statements from the list of
the children.

Special internal helper expressions/statements for the OpenMP directives
should not be exposed as children, only the main substatement must be
represented as the child.

llvm-svn: 337941
2018-07-25 17:27:45 +00:00
Jonas Hahnfeld 3f659e8731 Revert "[OPENMP] Fix PR38026: Link -latomic when -fopenmp is used."
This reverts commit r336467: libatomic is not available on all Linux
systems and this commit completely breaks OpenMP on them, even if there
are no atomic operations or all of them can be lowered to hardware
instructions.

See http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180716/234816.html
for post-commit discussion.

llvm-svn: 337722
2018-07-23 18:27:09 +00:00
Alexey Bataev b363813543 The patch adds support for the new map interface between clang and libomptarget. The changes in the interface are the following:
device IDs are now 64-bit integers (as opposed to 32-bit)
map flags are 64-bit long (used to be 32-bit)
mappings for partially mapped structs are now calculated at compile time and members of partially mapped structs are flagged using the MEMBER_OF field
Support for is_device_ptr on struct members was dropped - this functionality is not supported by the OpenMP standard and its implementation is technically infeasible (however, use_device_ptr on struct members works as a non-standard extension of the compiler)

llvm-svn: 337468
2018-07-19 16:34:13 +00:00
Alexey Bataev c52f01d1d7 [OPENMP] Fix checks for declare target link entries.
If the declare target link entries are created but not used, the
compiler will produce an error message. Patch improves handling of such
situations + improves checks for possibly lost declare target variables.

llvm-svn: 337207
2018-07-16 20:05:25 +00:00
Alexey Bataev 7f01d20993 [OPENMP] Fix syntactic errors in error messages.
Fixed spelling of the offloading error messages.

llvm-svn: 337196
2018-07-16 18:12:18 +00:00
Alexey Bataev 3dd1f9d61d [OPENMP, NVPTX] Globalize only captured variables.
Sometimes we can try to globalize non-variable declarations, which may
lead to compiler crash.

llvm-svn: 337191
2018-07-16 16:49:20 +00:00
Gheorghe-Teodor Bercea ad4e579407 [OpenMP] Initialize data sharing stack for SPMD case
Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D49188

llvm-svn: 337015
2018-07-13 16:18:24 +00:00
Joel E. Denny 72c2783012 [FileCheck] Add -allow-deprecated-dag-overlap to failing clang tests
See https://reviews.llvm.org/D47106 for details.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D47172

llvm-svn: 336844
2018-07-11 20:26:20 +00:00
Alexey Bataev c1943e75c7 [OPENMP] Do not mark local variables as declare target.
When the parsing of the functions happens inside of the declare target
region, we may erroneously mark local variables as declare target
thought they are not. This attribute can be applied only to global
variables.

llvm-svn: 336592
2018-07-09 19:58:08 +00:00
Alexey Bataev b99dcb5f31 [OPENMP, NVPTX] Do not globalize local variables in parallel regions.
In generic data-sharing mode we are allowed to not globalize local
variables that escape their declaration context iff they are declared
inside of the parallel region. We can do this because L2 parallel
regions are executed sequentially and, thus, we do not need to put
shared local variables in the global memory.

llvm-svn: 336567
2018-07-09 17:43:58 +00:00
Alexey Bataev f5d8b841ce [OPENMP] Fix PR38026: Link -latomic when -fopenmp is used.
On Linux atomic constructs in OpenMP require libatomic library. Patch
links libatomic when -fopenmp is used.

llvm-svn: 336467
2018-07-06 21:13:41 +00:00
Alexey Bataev dbc72c9009 [OPENMP] Make clauses closing loc point to right bracket.
For some of the clauses the closing location erroneously points to the
beginning of the next clause rather than on the location of the closing
bracket of the clause.

llvm-svn: 336460
2018-07-06 19:35:42 +00:00
Joel E. Denny 3cabf73270 [OPENMP] Fix incomplete type check for array reductions
A reduction for an incomplete array type used to produce an assert
fail during codegen.  Now it produces a diagnostic.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D48735

llvm-svn: 335911
2018-06-28 19:54:49 +00:00
Joel E. Denny e0e7a8ae58 Revert r335907: [OPENMP] Fix incomplete type check for array reductions
Sorry, forgot to add commit log attributes again.

llvm-svn: 335910
2018-06-28 19:54:27 +00:00
Joel E. Denny 82e9ea5b49 [OPENMP] Fix incomplete type check for array reductions
A reduction for an incomplete array type used to produce an assert
fail during codegen.  Now it produces a diagnostic.

llvm-svn: 335907
2018-06-28 19:46:10 +00:00
Alexey Bataev 91433f6877 [OPENMP, NVPTX] Reduce the number of the globalized variables.
Patch tries to make better analysis of the variables that should be
globalized. From now, instead of all parallel directives it will check
only distribute parallel .. directives and check only for
firstprivte/lastprivate variables if they must be globalized.

llvm-svn: 335632
2018-06-26 17:24:03 +00:00
Alexey Bataev 96edb2e37e [OPENMP] Do not consider address constant vars as possibly
threadprivate.

Do not delay emission of the address constant variables in OpenMP mode
as they cannot be defined as threadprivate.

llvm-svn: 335483
2018-06-25 15:32:05 +00:00
Alexey Bataev 12c62908b5 [OPENMP, NVPTX] Fix reduction of the big data types/structures.
If the shuffle is required for the reduced structures/big data type,
current code may cause compiler crash because of the loading of the
aggregate values. Patch fixes this problem.

llvm-svn: 335377
2018-06-22 19:10:38 +00:00
Alexey Bataev 4065b9ae48 [OPENMP, NVPTX] Fix globalization of the variables passed to orphaned
parallel region.

If the current construct requires sharing of the local variable in the
inner parallel region, this variable must be globalized to avoid
runtime crash.

llvm-svn: 335285
2018-06-21 20:26:33 +00:00
Alexey Bataev 7b55d2d554 [OPENMP, NVPTX] Emit simple reduction if requested.
If simple reduction is requested, use the simple reduction instead of
the runtime functions calls.

llvm-svn: 334962
2018-06-18 17:11:45 +00:00
Alexey Bataev 0baba9e728 [OPENMP, NVPTX] Fixed codegen for orphaned parallel region.
If orphaned parallel region is found, the next code must be emitted:
```
if(__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid))
  Serialized execution.
else if (IsMasterThread())
  Prepare and signal worker.
else
  Outined function call.
```

llvm-svn: 333301
2018-05-25 20:16:03 +00:00
Alexey Bataev 66f9577f09 [OPENMP-SIMD] Fix PR37536: Fix definition of _OPENMP macro.
if `-fopenmp-simd` is specified alone, `_OPENMP` macro should not be
  defined. If `-fopenmp-simd` is specified along with the `-fopenmp`,
  `_OPENMP` macro should be defined with the value `201511`.

llvm-svn: 332852
2018-05-21 16:40:32 +00:00
Alexey Bataev 4ac68a210c [OPENMP] DO not crash on combined constructs in declare target
functions.

If the combined construct is specified in the declare target function
and the device code is emitted, the compiler crashes because of the
incorrectly chosen captured stmt. We should choose the innermost
captured statement, not the outermost.

llvm-svn: 332477
2018-05-16 15:08:32 +00:00
Alexey Bataev 673110d5d5 [OPENMP, NVPTX] Add check for SPMD mode in orphaned parallel directives.
If the orphaned directive is executed in SPMD mode, we need to emit the
check for the SPMD mode and run the orphaned parallel directive in
sequential mode.

llvm-svn: 332467
2018-05-16 13:36:30 +00:00
Alexey Bataev 2a3320a928 [OPENMP, NVPTX] Do not globalize variables with reference/pointer types.
In generic data-sharing mode we do not need to globalize
variables/parameters of reference/pointer types. They already are placed
in the global memory.

llvm-svn: 332380
2018-05-15 18:01:01 +00:00
Alexey Bataev df093e7b45 [OPENMP, NVPTX] Do not use SPMD mode for target simd and target teams
distribute simd directives.

Directives `target simd` and `target teams distribute simd` must be
executed in non-SPMD mode.

llvm-svn: 332129
2018-05-11 19:45:14 +00:00
Alexey Bataev bf5c84861c [OPENMP, NVPTX] Initial support for L2 parallelism in SPMD mode.
Added initial support for L2 parallelism in SPMD mode. Note, though,
that the orphaned parallel directives are not currently supported in
SPMD mode.

llvm-svn: 332016
2018-05-10 18:32:08 +00:00
Alexey Bataev c15ea70b1f [OPENMP] Generate unique names for offloading regions id.
It is required to emit unique names for offloading regions ids. Required
to support compilation and linking of several compilation units.

llvm-svn: 331899
2018-05-09 18:02:37 +00:00
Alexey Bataev e253f2f886 [OPENMP] Mark global tors/dtors as used.
If the global variables are marked as declare target and they need
ctors/dtors, these ctors/dtors are emitted and then invoked by the
offloading runtime library. They are not explicitly used in the emitted
code and thus can be optimized out. Patch marks these functions as used,
so the optimizer cannot remove these function during the optimization
phase.

llvm-svn: 331879
2018-05-09 14:15:18 +00:00
Alexey Bataev 9a70017537 [OPENMP, NVPTX] Fix linkage of the global entries.
The linkage of the global entries must be weak to enable support of
redefinition of the same target regions in multiple compilation units.

llvm-svn: 331768
2018-05-08 14:16:57 +00:00
Alexey Bataev c661186cca [OPENMP, NVPTX] Small test fix, NFC.
llvm-svn: 331654
2018-05-07 17:38:13 +00:00
Alexey Bataev 504fc2d0cd [OPENMP, NVPTX] Codegen for critical construct.
Added correct codegen for the critical construct on NVPTX devices.

llvm-svn: 331652
2018-05-07 17:23:05 +00:00
Alexey Bataev d7ff6d647f [OPENMP, NVPTX] Added support for L2 parallelism.
Added initial codegen for level 2, 3 etc. parallelism. Currently, all
the second, the third etc. parallel regions will run sequentially.

llvm-svn: 331642
2018-05-07 14:50:05 +00:00
Joel E. Denny 96fde08f3b [OPENMP] Fix test typos: CHECK-DAG-N -> CHECK-N-DAG
Reviewed by: ABataev

Differential Revision: https://reviews.llvm.org/D46370

llvm-svn: 331469
2018-05-03 17:22:04 +00:00
Joel E. Denny d110fe4a7e Revert r331466: [OPENMP] Fix test typos: CHECK-DAG-N -> CHECK-N-DAG"
Sorry, forgot to add commit log attributes.

llvm-svn: 331468
2018-05-03 17:22:01 +00:00
Joel E. Denny 3c14663740 [OPENMP] Fix test typos: CHECK-DAG-N -> CHECK-N-DAG
llvm-svn: 331466
2018-05-03 17:15:44 +00:00
Alexey Bataev fac26cf4ca [OPENMP] Add support for reductions on simd directives in target
regions.

Added codegen for `simd reduction()` constructs in target directives.

llvm-svn: 331393
2018-05-02 20:03:27 +00:00
Alexey Bataev 354df2eeab [OPENMP] Analyze the type of the mapped entity instead of its base.
If the mapped entity is a data member, we erroneously checked the type
of its base rather than the type of the mapped entity itself.

llvm-svn: 331385
2018-05-02 18:44:10 +00:00
Alexey Bataev dcc815d015 [OPENMP] Do not emit warning for implicitly declared target functions.
Since upcoming OpenMP 5.0 functions can be mapped implicitly as declare
target and we should not emit warnings for such functions.

llvm-svn: 331377
2018-05-02 17:39:00 +00:00
Alexey Bataev 1ab3457319 [OPENMP] Enable c++ exceptions outside of the target constructs iff they are
enabled for the host.

If the compilation for the host enables C++ exceptions, but they are not
supported by the device, we still need to allow the code with the
exception handling constructs outside of the target regions.

llvm-svn: 331372
2018-05-02 16:52:07 +00:00
Alexey Bataev 6d94410914 [OPENMP] Support C++ member functions in the device constructs.
Added correct emission of the C++ member functions for the device
function when they are used in the device constructs.

llvm-svn: 331365
2018-05-02 15:45:28 +00:00
Alexey Bataev 18fa2323b6 [OPENMP] Emit names of the globals depending on target.
Some symbols are not allowed to be used as names on some targets. Patch
ries to unify the emission of the names of LLVM globals so they could be
used on different targets.

llvm-svn: 331358
2018-05-02 14:20:50 +00:00
Alexey Bataev fb38828cb2 [OPENMP] Emit template instatiation|specialization functions for
devices.

If the function is an instantiation|specialization of the template and
is used in the device code, the definitions of such functions should be
emitted for the device.

llvm-svn: 331261
2018-05-01 14:09:46 +00:00
Alexey Bataev d0df36a79a [OPENMP] Do not emit warning about non-declared target function params.
We should not emit warning that the parameters are not marked as declare
target, these declaration are local and cannot be marked as declare
target.

llvm-svn: 331211
2018-04-30 18:28:08 +00:00
Alexey Bataev dadf2d1238 [OPENMP] Do not crash on codegen for CXX member functions.
Non-static member functions should not be emitted as a standalone
functions, this leads to compiler crash.

llvm-svn: 331206
2018-04-30 18:09:40 +00:00
Alexey Bataev 64e62dcfff [OPENMP] Do not crash on incorrect input data.
Emit error messages instead of compiler crashing when the target region
does not exist in the device code + fix crash when the location comes
from macros.

llvm-svn: 331195
2018-04-30 16:26:57 +00:00
Alexey Bataev 2091ca6c97 [OPENMP] Do not cast captured by value variables with pointer types in
NVPTX target.

When generating the wrapper function for the offloading region, we need
to call the outlined function and cast the arguments correctly to follow
the ABI. Usually, variables captured by value are casted to `uintptr_t`
type. But this should not performed for the variables with pointer type.

llvm-svn: 330620
2018-04-23 17:33:41 +00:00
Alexey Bataev e372710d30 [OPENMP] Code cleanup and code improvements.
llvm-svn: 330270
2018-04-18 15:57:46 +00:00
Alexey Bataev 2c1dffece6 [OPENMP] Allow to use declare target variables in map clauses
Global variables marked as declare target are allowed to be used in map
clauses. Patch fixes the crash of the compiler on the declare target
variables in map clauses.

llvm-svn: 330156
2018-04-16 20:34:41 +00:00
Alexey Bataev 9ff8083d98 [OPENMP] General code improvements.
llvm-svn: 330154
2018-04-16 20:16:21 +00:00
Alexey Bataev a4fa0b880a [OPENMP] General code improvements.
llvm-svn: 330140
2018-04-16 17:59:34 +00:00
Alexey Bataev c0f879bcec [OPENMP] Additional attributes for the pointer parameters.
Added attributes for better optimization of the OpenMP code.

llvm-svn: 329751
2018-04-10 20:10:53 +00:00
Alexey Bataev e290ec02c7 [OPENMP, NVPTX] Fix codegen for the teams reduction.
Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv
as we operate on unsigned values only (addresses, converted to integers)

llvm-svn: 329411
2018-04-06 16:03:36 +00:00
Alexander Kornienko 2a8c18d991 Fix typos in clang
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:

  archtype
  cas
  classs
  checkk
  compres
  definit
  frome
  iff
  inteval
  ith
  lod
  methode
  nd
  optin
  ot
  pres
  statics
  te
  thru

Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)

Differential revision: https://reviews.llvm.org/D44188

llvm-svn: 329399
2018-04-06 15:14:32 +00:00
Alexey Bataev 03f270c900 [OPENMP] Added emission of offloading data sections for declare target
variables.

Added emission of the offloading data sections for the variables within
declare target regions + fixes emission of the declare target variables
marked as declare target not within the declare target region.

llvm-svn: 328888
2018-03-30 18:31:07 +00:00
Alexey Bataev 34f8a7043b [OPENMP] Codegen for ctor|dtor of declare target variables.
When the declare target variables are emitted for the device,
constructors|destructors for these variables must emitted and registered
by the runtime in the offloading sections.

llvm-svn: 328705
2018-03-28 14:28:54 +00:00
Alexey Bataev 92327c50d3 [OPENMP] Codegen for declare target with link clause.
If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.

llvm-svn: 328544
2018-03-26 16:40:55 +00:00
Gheorghe-Teodor Bercea 36cdfad062 [OpenMP][Clang] Add call to global data sharing stack initialization on the workers side
Summary: The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init() function is called by the master. This ensures that the per-team data structures of the runtime have been initialized.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D44749

llvm-svn: 328219
2018-03-22 17:33:27 +00:00
Alexey Bataev 173142171e [OPENMP, NVPTX] Codegen for target distribute parallel combined
constructs in generic mode.

Fixed codegen for distribute parallel combined constructs. We have to
pass and read the shared lower and upper bound from the distribute
region in the inner parallel region. Patch is for generic mode.

llvm-svn: 327990
2018-03-20 15:41:05 +00:00
Alexey Bataev 63cc8e96c3 [OPENMP, NVPTX] Globalization of the private redeclarations.
If the generic codegen is enabled and private copy of the original
variable escapes the declaration context, this private copy should be
globalized just like it was the original variable.

llvm-svn: 327985
2018-03-20 14:45:59 +00:00
Alexey Bataev b7f3cba84c [OPENMP, NVPTX] Emit correct thread id.
We emitted fake thread id for the outined function in NVPTX codegen.
Patch adds emission of the real thread id.

llvm-svn: 327867
2018-03-19 17:04:07 +00:00
Reid Kleckner fb93154bf1 [MS] Don't escape MS C++ names with \01
It is not needed after LLVM r327734. Now it will be easier to copy-paste
IR symbol names from Clang.

llvm-svn: 327738
2018-03-16 20:36:49 +00:00
Alexey Bataev c99042ba97 [OPENMP, NVPTX] Improve globalization of the variables captured by value.
If the variable is captured by value and the corresponding parameter in
the outlined function escapes its declaration context, this parameter
must be globalized. To globalize it we need to get the address of the
original parameter, load the value, store it to the global address and
use this global address instead of the original.

Patch improves globalization for parallel|teams regions + functions in
declare target regions.

llvm-svn: 327654
2018-03-15 18:10:54 +00:00
Alexey Bataev 4f4bf7c348 [OPENMP] Codegen for `omp declare target` construct.
Added initial codegen for device side of declarations inside `omp
declare target` construct + codegen for implicit `declare target`
functions, which are used in the target regions.

llvm-svn: 327636
2018-03-15 15:47:20 +00:00
Gheorghe-Teodor Bercea d3dcf2f05d [OpenMP] Add OpenMP data sharing infrastructure using global memory
Summary:
This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure.

TODO: add a more detailed description.

Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43660

llvm-svn: 327513
2018-03-14 14:17:45 +00:00
Alexey Bataev 21dab12453 [OPENMP] Fix the address of the original variable in task reductions.
If initialization of the task reductions requires pointer to original
variable, which is stored in the threadprivate storage, we used the
address of this pointer instead.

llvm-svn: 327136
2018-03-09 15:20:30 +00:00
Alexey Bataev 2e0cbe5092 [OPENMP] Emit sizes/init ptrs etc. data for task reductions before
using.

We may emit the code in wrong order because of incorrect implementation
of the runtime functions for task reductions. Threadprivate storages may
be initialized after real initialization of the reduction items. Patch
fixes this problem.

llvm-svn: 327008
2018-03-08 15:24:08 +00:00
Gheorghe-Teodor Bercea 7d80da15a0 [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory
Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43625

llvm-svn: 326948
2018-03-07 21:59:50 +00:00
Alexey Bataev ab4ea225fe [OPENMP] Fix lifetime of the loop counters.
We may emit incorrect lifetime info during codegen for loop counters in
OpenMP constructs because of automatic scope cleanup when we needed
temporarily locations for private loop counters.

llvm-svn: 326922
2018-03-07 18:17:06 +00:00
Alexey Bataev 1c44e15f6d [OPENMP] Fix generation of the unique names for task reduction
variables.

If the task has reduction construct and this construct for some variable
requires unique threadprivate storage, we may generate different names
for variables used in taskgroup task_reduction clause and in task
  in_reduction clause. Patch fixes this problem.

llvm-svn: 326827
2018-03-06 18:59:43 +00:00
Alexey Bataev 20cf67c233 [OPENMP] Scan all redeclarations looking for `declare simd` attribute.
Patch fixes the problem with the functions marked as `declare simd`. If
the canonical declaration does not have associated `declare simd`
construct, we may not generate required code even if other
redeclarations are marked as `declare simd`.

llvm-svn: 326594
2018-03-02 18:07:00 +00:00
Alexey Bataev 852525de25 [OPENMP] Treat local variables in CUDA mode as thread local.
In CUDA mode all local variables are actually thread
local|threadprivate, not private, and, thus, they cannot be shared
between threads|lanes.

llvm-svn: 326590
2018-03-02 17:17:12 +00:00
Carlo Bertolli 79712097c7 [OpenMP] Extend NVPTX SPMD implementation of combined constructs
Differential Revision: https://reviews.llvm.org/D43852

This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag.

llvm-svn: 326368
2018-02-28 20:48:35 +00:00
Alexey Bataev 95c23e72da [OPENMP] Emit warning for non-trivial types in map clauses.
If the mapped type is non-trivial, the warning message is emitted for
better user experience.

llvm-svn: 326251
2018-02-27 21:31:11 +00:00
Alexey Bataev 2819260b35 [OPENMP] Allow multiple mappings for member expressions for pointers.
If several member expressions are mapped and they reference the same
address as a base, but access different members, this must be allowed.

llvm-svn: 326212
2018-02-27 17:42:00 +00:00
Rafael Espindola 922f2aa9b2 Bring r325915 back.
The tests that failed on a windows host have been fixed.

Original message:

Start setting dso_local for COFF.

With this there are still some GVs where we don't set dso_local
because setGVProperties is never called. I intend to fix that in
followup commits. This is just the bare minimum to teach
shouldAssumeDSOLocal what it should do for COFF.

llvm-svn: 325940
2018-02-23 19:30:48 +00:00
Carlo Bertolli beda214996 [OpenMP] Limit reduction support for pragma 'distribute' when combined with pragma 'simd'
Differential Revision: https://reviews.llvm.org/D43513

This is a bug fix that removes the emission of reduction support for pragma 'distribute' when found alone or in combinations without simd.
Pragma 'distribute' does not have a reduction clause, but when combined with pragma 'simd' we need to emit the support for simd's reduction clause as part of code generation for distribute. This guard is similar to the one used for reduction support earlier in the same code gen function.

llvm-svn: 325822
2018-02-22 19:38:14 +00:00
Alexey Bataev 8e39c3446f [OPENMP] Do not emit messages for templates in declare target
constructs.

The compiler may emit some extra warnings for functions, that are
implicit specialization of the templates, declared in the target region.

llvm-svn: 325391
2018-02-16 21:23:23 +00:00
Alexey Bataev 9a75738b2b [OPENMP] Fix PR35873: Fix data-sharing attributes for const variables.
Compiler erroneously returned wrong data-sharing attributes for the
constant variables if they have explictly specified attributes.

llvm-svn: 325373
2018-02-16 19:16:54 +00:00
Alexey Bataev 96dae81d7f [OPENMP] Fix parsing of the directives with inner directives.
The parsing may lead to compiler hanging because of the incorrect
processing of inner OpenMP pragmas.

llvm-svn: 325369
2018-02-16 18:36:44 +00:00
Alexey Bataev ea33dee38c [OPENMP] Fix PR36399: Crash on C code with ordered doacross construct.
Codegen for ordered with doacross construct might produce incorrect code
because of missing cleanup scope for the construct. Without this scope
the final runtime function call could be emitted in the wrong order that
leads to incorrect codegen.

llvm-svn: 325304
2018-02-15 23:39:43 +00:00
Alexey Bataev 17daedfd04 [OPENMP] Fix PR38398: compiler crash on standalone pragma ordered with depend sink|source clause.
Patch fixes compiler crash on standalone #pragmas ordered with
depend(sink|source) clauses.

llvm-svn: 325302
2018-02-15 22:42:57 +00:00
Alexey Bataev cbecfdfefe [OpenMP] Fix trailing space when printing pragmas, by Joel. E. Denny
Summary:
-ast-print prints omp pragmas with a trailing space.  While this
behavior is likely of little concern to most users, surely it's
unintentional, and it's annoying for some source-level work I'm
pursuing.  This patch focuses on omp pragmas, but it also fixes
init_seg and loop hint pragmas because they share implementation.

The testing strategy here is to add usually just one '{{$}}' per
relevant -ast-print test file.  This seems to achieve good code
coverage.  However, this strategy is probably easy to forget as the
tests evolve.  That's probably fine as this fix is far from critical.
The main goal of the testing is to aid the initial review.

This patch also adds a fixme for "#pragma unroll", which prints as
"#pragma unroll (enable)", which is invalid syntax.

Reviewers: ABataev

Reviewed By: ABataev

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43204

llvm-svn: 325145
2018-02-14 17:38:47 +00:00
Sander de Smalen 9084a3b118 [DebugInfo] Avoid name conflict of generated VLA expression variable.
Summary:
This patch also adds the 'DW_AT_artificial' flag to the generated variable.

Addresses the issues mentioned in http://llvm.org/PR30553.

Reviewers: CarlosAlbertoEnciso, probinson, aprantl

Reviewed By: aprantl

Subscribers: JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D43189

llvm-svn: 324988
2018-02-13 07:49:34 +00:00
Sander de Smalen 891af03a55 Recommit rL323952: [DebugInfo] Enable debug information for C99 VLA types.
Fixed build issue when building with g++-4.8 (specialization after instantiation).

llvm-svn: 324173
2018-02-03 13:55:59 +00:00
Sander de Smalen 4e9a1264dd Reverting patch rL323952 due to build errors that I
haven't encountered in local builds.

llvm-svn: 323956
2018-02-01 12:27:13 +00:00
Sander de Smalen 17c4633e7f [DebugInfo] Enable debug information for C99 VLA types
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
    
This should implement:
  Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553

Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie

Reviewed By: aprantl

Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D41698

llvm-svn: 323952
2018-02-01 11:25:10 +00:00
Daniel Neilson c8bdc8db73 Change memcpy/memove/memset to have dest and source alignment attributes.
Summary:
  This change is step three in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. Steps:

Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment()
and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

Reviewers: rjmccall

Subscribers: jyknight, nemanjai, nhaehnle, javed.absar, sbc100, aheejin, kbarton, fedor.sergeev, cfe-commits

Differential Revision: https://reviews.llvm.org/D41677

llvm-svn: 323617
2018-01-28 17:27:45 +00:00
Alexey Bataev a9b9cc0d79 [OPENMP] Remove more empty SourceLocations() from the code.
Removed more empty SourceLocations() from the OpenMP code and replaced
with the correct locations for better debug info emission.

llvm-svn: 323232
2018-01-23 18:12:38 +00:00
Daniel Neilson 6e938effaa Change memcpy/memove/memset to have dest and source alignment attributes (Step 1).
Summary:
  Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.

  The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.

 For example, code which used to read:
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
   call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.

llvm-svn: 322964
2018-01-19 17:12:54 +00:00
Jonas Hahnfeld 5e4df288e2 [OpenMP] Correct generation of offloading entries
Firstly, each offloading entry must have a unique name or the
linker will complain if there are multiple files with target
regions. Secondly, the compiler must not introduce padding so
mark the struct with a PackedAttr.

Differential Revision: https://reviews.llvm.org/D42168

llvm-svn: 322858
2018-01-18 15:38:03 +00:00
Rafael Espindola e0345b6e1f Update for llvm change.
llvm-svn: 322808
2018-01-18 02:08:38 +00:00
Alexey Bataev 9350fc3987 [OPENMP] Add support for `depend` clauses on `target teams distribute
parallel for simd` directives.

Added codegen for `depend` clauses on `#pragma omp target teams
distribute parallel for simd` directives.

llvm-svn: 322587
2018-01-16 19:18:24 +00:00
Alexey Bataev 9f9fb0ba35 [OPENMP] Add support for `depend` on `target teams distribute parallel
for` directives.

Added codegen for `depend` clauses on `#pragma omp target teams
distribute parallel for` directives.

llvm-svn: 322585
2018-01-16 19:02:33 +00:00
Alexey Bataev d60d1baadb [OPENMP] Add support for `depend` clauses on `target parallel for simd`
directives.

Added codegen for `depend` clauses on `#pragma omp target parallel for
simd` directives.

llvm-svn: 322578
2018-01-16 17:55:15 +00:00
Alexey Bataev 8ed89551e2 [OPENMP] Add support for `depend` clauses on `target parallel for`
directives.

Added codegen for `depend` clause on `#pragma omp target parallel for`
directives.

llvm-svn: 322577
2018-01-16 17:41:04 +00:00
Alexey Bataev 8d16a43416 [OPENMP] Add support for `depend` clauses on `target teams distribute
simd` directives.

Added codegen for `depend` clauses on `#pragma omp target teams
distribute simd` directives.

llvm-svn: 322575
2018-01-16 17:22:50 +00:00
Alexey Bataev 79df756d1f [OPENMP] Add support for `depend` clause on `target teams distribute`.
Added codegen for `depend` clauses on `#pragma omp target teams
distribute` directives.

llvm-svn: 322571
2018-01-16 16:46:46 +00:00
Alexey Bataev 54d5c7dc44 [OPENMP] Add support for `depend` clauses on `target parallel` directive.
Added codegen for `depend` clauses on `#pragma omp target parallel`
directives.

llvm-svn: 322570
2018-01-16 16:27:49 +00:00
Alexey Bataev 0c869ef21c [OPENMP] Add support for `depend` clauses on `target teams`.
Added codegen for `depend` clause on `#pragma omp target teams`
directives.

llvm-svn: 322569
2018-01-16 15:57:07 +00:00
Alexey Bataev f41c88fd50 [OPENMP] Add support for `depend` clauses on `target simd`.
Added codegen for `depend` clauses on `#pragma omp target simd`
directives.

llvm-svn: 322559
2018-01-16 15:05:16 +00:00
Alexey Bataev 647dd84422 [OPENMP] Initial codegen for `target teams distribute parallel for
simd`.

Added host codegen + codegen for devices with default codegen for
`#pragma omp target teams distribute parallel for simd` directive.

llvm-svn: 322515
2018-01-15 20:59:40 +00:00
Alexey Bataev 8451efad89 [OPENMP] Add codegen for `depend` clauses on `target` directive.
Added basic support for codegen of `depend` clauses on `target`
directive.

llvm-svn: 322501
2018-01-15 19:06:12 +00:00
Alexey Bataev 475a7440f1 [OPENMP] Replace calls of getAssociatedStmt().
getAssociatedStmt() returns the outermost captured statement for the
OpenMP directive. It may return incorrect region in case of combined
constructs. Reworked the code to reduce the number of calls of
getAssociatedStmt() and used getInnermostCapturedStmt() and
getCapturedStmt() functions instead.
In case of firstprivate variables it may lead to an extra allocas
generation for private copies even if the variable is passed by value
into outlined function and could be used directly as private copy.

llvm-svn: 322393
2018-01-12 19:39:11 +00:00
Rafael Espindola cbca487f49 Make internal/private GVs implicitly dso_local.
While updating clang tests for having clang set dso_local I noticed
that:

- There are *a lot* of tests to update.
- Many of the updates are redundant.

They are redundant because a GV is "obviously dso_local". This patch
starts formalizing that a bit by requiring that internal and private
GVs be dso_local too. Since they all are, we don't have to print
dso_local to the textual representation, making it a bit more compact
and easier to read.

llvm-svn: 322318
2018-01-11 22:15:12 +00:00
Alexey Bataev f3c832a970 [OpenMP] Fix handling of clause on wrong directive, by Joel. E. Denny
Summary:
First, this patch fixes an assert failure when, for example, "omp for"
has num_teams.

Second, this patch prevents duplicate diagnostics when, for example,
"omp for" has uniform.

This patch makes the general assumption (even where it doesn't
necessarily fix an existing bug) that it is worthless to perform sema
for a clause that appears on a directive on which OpenMP does not
permit that clause.  However, due to this assumption, this patch
suppresses some diagnostics that were expected in the test suite.  I
assert that those diagnostics were likely just distracting to the
user.

Reviewers: ABataev

Reviewed By: ABataev

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D41841

llvm-svn: 322107
2018-01-09 19:21:04 +00:00
Alexey Bataev aee9389b04 [OPENMP] Fix debug info for outlined functions in NVPTX + add more tests.
Fixed name of emitted outlined functions in NVPTX target + extra tests
for the debug info.

llvm-svn: 322022
2018-01-08 20:09:47 +00:00
Alexey Bataev fd9b2affc3 [OPENMP] Fix capturing of expressions in clauses.
Patch fixes incorrect capturing of the expressions in clauses with
expressions that must be captured for the combined constructs. Incorrect
capturing may lead to compiler crash during codegen phase.

llvm-svn: 321820
2018-01-04 20:50:08 +00:00
Alexey Bataev b2575930b3 [OPENMP] Fix casting in NVPTX support library.
If the reduction required shuffle in the NVPTX codegen, we may need to
cast the reduced value to the integer type. This casting was implemented
incorrectly and may cause compiler crash. Patch fixes this problem.

llvm-svn: 321818
2018-01-04 20:18:55 +00:00
Alexey Bataev 7cae94e74c [OPENMP] Add debug info for generated functions.
Most of the generated functions for the OpenMP were generated with
disabled debug info. Patch fixes this for better user experience.

llvm-svn: 321816
2018-01-04 19:45:16 +00:00