1484 Commits

Author SHA1 Message Date
Artem Belevich cd01980f30 [OpenMP] Split OpenMP/target_map_codegen test [NFC]
The test file is the single longest test among clang's tests and ends up about
doubling the wall time of clang tests on machines with a high number of cores.

The test appears to consist of multiple independent subtests and does not have
to be in one file. Splitting it into smaller parts reduces test time on my
machine from ~80s down to ~45.

Differential Revision: https://reviews.llvm.org/D85551
2020-08-07 13:47:53 -07:00
Alexey Bataev 4a7aedb843 [OPENMP]Simplify representation for atomic, critical, master and section
constructs.

Several constructs may be represented without relying on CapturedStmt.
This saves memory and improves compilation speed.
2020-08-07 09:58:23 -04:00
Alexey Bataev 0af7835eae [OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation.
Summary:
Introduced the OMPChildren class to handle all associated clauses, the
associated statement and child expressions/statements. It allows some
directives to be represented more correctly (like flush, depobj etc. with
pseudo clauses, ordered depend directives, which are standalone, and target
data directives).
It will also make it easier to avoid using CapturedStmt in directives,
if required (atomic, tile etc. directives).
It also simplifies serialization/deserialization of the
executable/declarative directives.
Reduces the number of allocation operations for mapper declarations.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83261
2020-08-06 12:25:19 -04:00
Joel E. Denny 002d61db2b [OpenMP] Fix `present` for exit from `omp target data`
Without this patch, the following example fails but shouldn't
according to OpenMP TR8:

```
 #pragma omp target enter data map(alloc:i)
 #pragma omp target data map(present, alloc: i)
 {
   #pragma omp target exit data map(delete:i)
 } // fails presence check here
```

OpenMP TR8 sec. 2.22.7.1 "map Clause", p. 321, L23-26 states:

> If the map clause appears on a target, target data, target enter
> data or target exit data construct with a present map-type-modifier
> then on entry to the region if the corresponding list item does not
> appear in the device data environment an error occurs and the
> program terminates.

There is no corresponding statement about the exit from a region.
Thus, the `present` modifier should:

1. Check for presence upon entry into any region, including a `target
   exit data` region.  This behavior is already implemented correctly.

2. Not check for presence upon exit from any region, including
   a `target` or `target data` region.  Without this patch, this
   behavior is not implemented correctly, breaking the above example.

In the case of `target data`, this patch fixes the latter behavior by
removing the `present` modifier from the map types Clang generates for
the runtime call at the end of the region.

In the case of `target`, we have not found a valid OpenMP program for
which such a fix would matter.  It appears that, if a program can
guarantee that data is present at the beginning of a `target` region
so that there's no error there, that data is also guaranteed to be
present at the end.  This patch adds a comment to the runtime to
document this case.
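
As a rough illustration of the `target data` fix described above, the sketch below derives the map types for the end-of-region runtime call by clearing a `present` bit from the types used on entry. It is not the Clang implementation: the function name `make_exit_map_types` is hypothetical, and the bit value `0x1000` is an assumption about how the runtime encodes `present`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the "present" map-type bit (illustrative value). */
#define MAP_PRESENT UINT64_C(0x1000)

/* Hypothetical helper: copy the entry map types, dropping the "present"
 * bit so that no presence check is requested at the end of the region. */
void make_exit_map_types(const uint64_t *entry_types, uint64_t *exit_types,
                         size_t n) {
  assert((entry_types != NULL && exit_types != NULL) || n == 0);
  for (size_t i = 0; i < n; ++i)
    exit_types[i] = entry_types[i] & ~MAP_PRESENT;
}
```

With this shape, an entry type of `present, alloc` (only the assumed `present` bit set) becomes a plain `alloc` (zero) on exit, while the other bits are left unchanged.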

Reviewed By: grokos, RaviNarayanaswamy, ABataev

Differential Revision: https://reviews.llvm.org/D84422
2020-08-05 10:03:31 -04:00
Saiyedul Islam 160ff83765 [OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3
Provides AMDGCN and NVPTX specific specialization of getGPUWarpSize,
getGPUThreadID, and getGPUNumThreads methods. Adds tests for AMDGCN
codegen for these methods in generic and simd modes. Also changes the
precondition in InitTempAlloca to be slightly more permissive. Useful for
AMDGCN OpenMP codegen where allocas are created with a cast to an
address space.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84260
2020-08-03 05:38:39 +00:00
Johannes Doerfert ebad64dfe1 [OpenMP][FIX] Consistently use OpenMPIRBuilder if requested
When we use the OpenMPIRBuilder for the parallel region we need to also
use it to get the thread ID (among other things) in the body. This is
because CGOpenMPRuntime::getThreadID() and
CGOpenMPRuntime::emitUpdateLocation implicitly assume that if they are
called from within a parallel region there is a certain structure to the
code and certain members of the OMPRegionInfo are initialized. It might
make sense to initialize them even if we use the OpenMPIRBuilder but we
would preferably get rid of such state instead.

Bug reported by Anchu Rajendran Sudhakumari.

Depends on D82470.

Reviewed By: anchu-rajendran

Differential Revision: https://reviews.llvm.org/D82822
2020-07-30 10:19:40 -05:00
Alexey Bataev 622e46156d [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D84767
2020-07-30 11:18:33 -04:00
Alexey Bataev b69357c2f4 Revert "[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region."
This reverts commit 142d0d3ed8 to
investigate undefined behavior revealed by buildbots.
2020-07-30 10:57:56 -04:00
Alexey Bataev 142d0d3ed8 [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.
Need to map the base pointer for all directives, not only target
data-based ones.
The base pointer is mapped for array sections, array subscript, array
shaping and other array-like constructs with the base pointer. Also,
codegen for use_device_ptr clause was modified to correctly handle
mapping combination of array like constructs + use_device_ptr clause.
The data for use_device_ptr clause is emitted as the last records in the
data mapping array.
It applies only for global pointers.

Differential Revision: https://reviews.llvm.org/D84767
2020-07-30 09:40:05 -04:00
Johannes Doerfert b08abf4c80 [OpenMP] Fix D83281 issue on windows by allowing `dso_local` in CHECK [2/1]
The problem with 8723280b68 was that the
`dso_local` is *before* the void not after. Hope this works.
2020-07-29 15:47:45 -05:00
Johannes Doerfert 8723280b68 [OpenMP] Fix D83281 issue on windows by allowing `dso_local` in CHECK 2020-07-29 15:18:20 -05:00
Joel E. Denny 9f2f3b9de6 [OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` motion modifier for `omp target update` directives.  The
next patch in this series implements OpenMP runtime support.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84711
2020-07-29 12:18:45 -04:00
Johannes Doerfert ee05167cc4 [OpenMP] Allow traits for the OpenMP context selector `isa`
It was unclear what `isa` was supposed to mean so we did not provide any
traits for this context selector. With this patch we will allow *any*
string or identifier. We use the target attribute and target info to
determine if the trait matches. In other words, we will check if the
provided value is a target feature that is available (at the call site).
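
A minimal sketch of how such a selector might be written in user code, assuming `"avx2"` is a feature name the target recognizes. The functions `dot` and `dot_avx2` are hypothetical, and the variant body is a plain loop standing in for a tuned implementation. When OpenMP is disabled the pragma is ignored and the base function is always called.

```c
#include <assert.h>

/* Hypothetical variant; a plain loop stands in for an AVX2-tuned body. */
int dot_avx2(const int *a, const int *b, int n) {
  assert(n >= 0);
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += a[i] * b[i];
  return sum;
}

/* Calls to dot() may be redirected to dot_avx2() when the `isa` trait
 * matches a target feature available at the call site. */
#pragma omp declare variant(dot_avx2) match(device = {isa("avx2")})
int dot(const int *a, const int *b, int n) {
  assert(n >= 0);
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += a[i] * b[i];
  return sum;
}
```

Both functions compute the same result here, so the program behaves identically whichever one the compiler selects.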

Fixes PR46338

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83281
2020-07-29 10:22:27 -05:00
Joel E. Denny 69fc33f0cd Revert "[OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)"
This reverts commit 3c3faae497.

It breaks a number of bots.
2020-07-28 20:30:05 -04:00
Joel E. Denny 3c3faae497 [OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` motion modifier for `omp target update` directives.  The
next patch in this series implements OpenMP runtime support.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84711
2020-07-28 19:15:18 -04:00
Alexey Bataev 9840208db6 [OPENMP] Fix PR46730: Fix compiler crash on taskloop over constructible loop counters.
Summary:
If the variable is constructible, its copy is created by calling a
constructor. Such variables are duplicated and thus, must be captured.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83909
2020-07-24 10:48:20 -04:00
Joel E. Denny aa82c40f0a [OpenMP] Implement TR8 `present` map type modifier in Clang (1/2)
This patch implements Clang front end support for the OpenMP TR8
`present` map type modifier.  The next patch in this series implements
OpenMP runtime support.

This patch does not attempt to implement TR8 sec. 2.22.7.1 "map
Clause", p. 319, L14-16:

> If a map clause with a present map-type-modifier is present in a map
> clause, then the effect of the clause is ordered before all other
> map clauses that do not have the present modifier.

Compare to L10-11, which Clang does not appear to implement yet:

> For a given construct, the effect of a map clause with the to, from,
> or tofrom map-type is ordered before the effect of a map clause with
> the alloc, release, or delete map-type.

This patch also does not implement the `present` implicit-behavior for
`defaultmap` or the `present` motion-modifier for `target update`.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83061
2020-07-22 10:15:32 -04:00
Pushpinder Singh a1b12a934d [OpenMP] Add missing RUN lines for OpenMP 4.5
Summary: This was missed when default version was upgraded to 5.0 (part of D81098)

Reviewers: saiislam, ABataev, jdoerfert

Reviewed By: saiislam

Subscribers: yaxunl, guansong, sstefan1, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84221
2020-07-22 01:08:06 -04:00
Alexey Bataev 13bfe4b226 [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.
Summary:
Need to avoid an optimization for base pointer mapping for target data
directives.

Reviewers: jdoerfert, ye-luo

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84182
2020-07-21 15:48:32 -04:00
Alexey Bataev 2875df0d56 [OPENMP50]Perform data mapping analysis only for explicitly mapped data.
Summary:
According to OpenMP 5.0, the restrictions for mapping of overlapped data
apply only for explicitly mapped data, there is no restriction for
implicitly mapped data just like in OpenMP 4.5.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83398
2020-07-20 13:01:15 -04:00
Joseph Huber 3bbbe4c4b6 [OpenMP] Add Additional Function Attribute Information to OMPKinds.def
Summary:
This patch adds more function attribute information to the runtime function definitions in OMPKinds.def. The goal is to provide sufficient information about OpenMP runtime functions to perform more optimizations on OpenMP code.

Reviewers: jdoerfert

Subscribers: aaron.ballman cfe-commits yaxunl guansong sstefan1 llvm-commits

Tags: #OpenMP #clang #LLVM

Differential Revision: https://reviews.llvm.org/D81031
2020-07-18 12:55:50 -04:00
Joel E. Denny cbf64b5834 [OpenMP] Fix map clause for unused var: don't ignore it
For example, without this patch:

```
 $ cat test.c
 int main() {
   int x[3];
   #pragma omp target map(tofrom:x[0:3])
 #ifdef USE
   x[0] = 1
 #endif
   ;
   return 0;
 }
 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -S -emit-llvm test.c
 $ grep '^@.offload_maptypes' test.ll
 $ echo $?
 1
 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -S -emit-llvm test.c \
         -DUSE
 $ grep '^@.offload_maptypes' test.ll
 @.offload_maptypes = private unnamed_addr constant [1 x i64] [i64 35]
```

With this patch, both greps produce the same result.
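
To read the `i64 35` in the output above, the sketch below decodes a map-type value into flag names. The flag values are assumptions modeled on the offload runtime's map-type bits (to = 0x1, from = 0x2, target parameter = 0x20, and so on), and `describe_map_type` is a hypothetical helper, not part of Clang or the runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a '|'-separated list of the flag names set
 * in `type` into `out`. The bit values are assumed, not authoritative. */
void describe_map_type(uint64_t type, char *out, size_t out_size) {
  static const struct {
    uint64_t bit;
    const char *name;
  } flags[] = {
      {0x001, "TO"},     {0x002, "FROM"},        {0x004, "ALWAYS"},
      {0x008, "DELETE"}, {0x010, "PTR_AND_OBJ"}, {0x020, "TARGET_PARAM"},
  };
  assert(out != NULL && out_size > 0);
  out[0] = '\0';
  for (size_t i = 0; i < sizeof flags / sizeof flags[0]; ++i) {
    if (type & flags[i].bit) {
      /* strncat always terminates; leave room for the terminator. */
      if (out[0] != '\0')
        strncat(out, "|", out_size - strlen(out) - 1);
      strncat(out, flags[i].name, out_size - strlen(out) - 1);
    }
  }
}
```

Under these assumed values, 35 is 0x23, which decodes to `TO|FROM|TARGET_PARAM` and is consistent with `map(tofrom:x[0:3])` on a `target` construct.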

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D83922
2020-07-17 21:37:27 -04:00
Eric Christopher 7bfaa40086 Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions"
due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753.

An SROA change soon may obviate some of these problems.

This reverts commit 8d09f20798.
2020-07-16 11:54:04 -07:00
George Rokos 537b16e9b8 [OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime
This patch implements the code generation to use OpenMP 5.0 declare mapper (a.k.a. user-defined mapper) constructs.
Patch written by Lingda Li.

Differential Revision: https://reviews.llvm.org/D67833
2020-07-15 18:11:43 -07:00
Akira Hatanaka ed6b578040 [CodeGen] Emit a call instruction instead of an invoke if the called
llvm function is marked nounwind

This fixes cases where an invoke is emitted, despite the called llvm
function being marked nounwind, because ConstructAttributeList failed to
add the attribute to the attribute list. llvm optimization passes turn
invokes into calls and optimize away the exception handling code, but
it's better to avoid emitting the code in the front-end if the called
function is known not to raise an exception.

Differential Revision: https://reviews.llvm.org/D83906
2020-07-15 14:47:45 -07:00
Alexey Bataev 41d0af0074 [OPENMP]Fix PR46593: Reduction initializer missing constructor call.
Summary:
If user-defined reductions with the initializer are used with classes,
the compiler misses the constructor call when trying to create a private
copy of the reduction variable.

Reviewers: jdoerfert

Subscribers: cfe-commits, yaxunl, guansong, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83334
2020-07-15 15:14:22 -04:00
Alexey Bataev 9dc327d1b7 [OPENMP]Fix PR46688: cast the type of the allocated variable to the initial one.
Summary:
If the original variable is marked for allocation in the different
address space using #pragma omp allocate, need to cast the allocated
variable to its original type with the original address space.
Otherwise, the compiler may crash trying to bitcast the type of the new
allocated variable to the original type in some cases, like passing this
variable as an argument in function calls.

Reviewers: jdoerfert

Subscribers: jholewinski, cfe-commits, yaxunl, guansong, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83696
2020-07-15 14:54:19 -04:00
Johannes Doerfert d87c92e5a2 [OpenMP][FIX] Check only for deterministic part of a generated function name 2020-07-14 22:48:22 -05:00
Johannes Doerfert 7af287d0d9 [OpenMP][IRBuilder] Support nested parallel regions
During code generation we might change/add basic blocks so keeping a
list of them is fairly easy to break. Nested parallel regions were
enough. The new scheme does recompute the list of blocks to be outlined
once it is needed.

Reviewed By: anchu-rajendran

Differential Revision: https://reviews.llvm.org/D82722
2020-07-14 22:39:06 -05:00
Johannes Doerfert fec1f2109f [OpenMP] Emit remarks during GPU state machine optimization
Since D83271 we can optimize the GPU state machine to avoid spurious
call edges that increase the register usage of kernels. With this patch
we inform the user why and if this optimization is happening and when it
is not.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D83707
2020-07-14 22:33:57 -05:00
Tyker 8d09f20798 [AssumeBundles] Use operand bundles to encode alignment assumptions
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complementary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignment
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as an "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.

Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71739
2020-07-14 01:05:58 +02:00
Alexey Bataev 7075c056e9 [OPENMP]Fix compiler crash for target data directive without actual target codegen.
Summary:
Need to privatize addresses of the captured variables when trying to
emit the body of the target data directive in no target codegen mode.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83478
2020-07-13 10:52:24 -04:00
Atmn Patel 78443666bc [OpenMP] Add firstprivate as a default data-sharing attribute to clang
This implements the default(firstprivate) clause as defined in OpenMP
Technical Report 8 (2.22.4).

Reviewed By: jdoerfert, ABataev

Differential Revision: https://reviews.llvm.org/D75591
2020-07-12 23:01:40 -05:00
Johannes Doerfert c98699582a [OpenMP][NFC] Remove unused (always fixed) arguments
There are various runtime calls in the device runtime with unused, or
always fixed, arguments. This is bad for all sorts of reasons. Clean up
two before as we match them in OpenMPOpt now.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D83268
2020-07-11 00:51:51 -05:00
Johannes Doerfert cd0ea03e6f [OpenMP][NFC] Remove unused and untested code from the device runtime
Summary:
We carried a lot of unused and untested code in the device runtime.
Among other reasons, we are planning major rewrites for which reduced
size is going to help a lot.

The number of code lines reduced by 14%!

Before:
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
CUDA                            13            489            841           2454
C/C++ Header                    14            322            493           1377
C                               12            117            124            559
CMake                            4             64             64            262
C++                              1              6              6             39
-------------------------------------------------------------------------------
SUM:                            44            998           1528           4691
-------------------------------------------------------------------------------

After:
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
CUDA                            13            366            733           1879
C/C++ Header                    14            317            484           1293
C                               12            117            124            559
CMake                            4             64             64            262
C++                              1              6              6             39
-------------------------------------------------------------------------------
SUM:                            44            870           1411           4032
-------------------------------------------------------------------------------

Reviewers: hfinkel, jhuber6, fghanim, JonChesterfield, grokos, AndreyChurbanov, ye-luo, tianshilei1992, ggeorgakoudis, Hahnfeld, ABataev, hbae, ronlieb, gregrodgers

Subscribers: jvesely, yaxunl, bollu, guansong, jfb, sstefan1, aaron.ballman, openmp-commits, cfe-commits

Tags: #clang, #openmp

Differential Revision: https://reviews.llvm.org/D83349
2020-07-10 19:09:41 -05:00
cchen 2da9572a9b [OPENMP50] extend array section for stride (Parsing/Sema/AST)
Reviewers: ABataev, jdoerfert

Reviewed By: ABataev

Subscribers: yaxunl, guansong, arphaman, sstefan1, cfe-commits, sandoval, dreachem

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82800
2020-07-09 13:28:51 -05:00
Joel E. Denny ed39becd27 [OpenMP][NFC] Remove hard-coded line numbers from more tests
This is a continuation of D82224.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D83057
2020-07-07 09:48:22 -04:00
Fangrui Song b0b5162fc2 [Driver] Pass -gno-column-info instead of -dwarf-column-info
Making -g[no-]column-info opt out reduces the length of a typical CC1 command line.
Additionally, in a non-debug compile, we won't see -dwarf-column-info.
2020-07-05 11:50:38 -07:00
Roman Lebedev 7ea46aee36 Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions"
Assume bundle can have more than one entry with the same name,
but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses
getOperandBundle("align"), which internally assumes that it isn't the
case, and happily crashes otherwise.

Minimal reduced reproducer: run `opt -alignment-from-assumptions` on

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%0 = type { i64, %1*, i8*, i64, %2, i32, %3*, i8* }
%1 = type opaque
%2 = type { i8, i8, i16 }
%3 = type { i32, i32, i32, i32 }

; Function Attrs: nounwind
define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 {
bb:
  call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ]
  ret i32 0
}

; Function Attrs: nounwind willreturn
declare void @llvm.assume(i1) #1

attributes #0 = { nounwind "reciprocal-estimates"="none" }
attributes #1 = { nounwind willreturn }


This is what we'd have with -mllvm -enable-knowledge-retention

This reverts commit c95ffadb24.
2020-07-04 23:49:23 +03:00
Alexey Bataev 32ea3397be [OPENMP]Dynamic globalization for parallel target regions.
Summary:
Added support for dynamic memory allocation for globalized variables
when target regions must be executed in parallel.

Reviewers: jdoerfert

Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82324
2020-06-25 08:25:24 -04:00
Tyker c95ffadb24 [AssumeBundles] Use operand bundles to encode alignment assumptions
Summary:
NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html

Complementary to the assumption outliner prototype in D71692, this patch
shows how we could simplify the code emitted for an alignment
assumption. The generated code is smaller, less fragile, and it makes it
easier to recognize the additional use as an "assumption use".

As mentioned in D71692 and on the mailing list, we could adopt this
scheme, and similar schemes for other patterns, without adopting the
assumption outlining.

Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71739
2020-06-25 12:59:44 +02:00
Saiyedul Islam 2bfce22a92 [OpenMP] Upgrade default version of OpenMP to 5.0
Summary:
When -fopenmp option is specified then version 5.0 will be set as
default.

Reviewers: gregrodgers, jdoerfert, ABataev

Reviewed By: ABataev

Subscribers: pdhaliwal, yaxunl, guansong, sstefan1, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81098
2020-06-25 07:13:05 +00:00
Joel E. Denny 01ddb2a7b0 [OpenMP][NFC] Remove hard-coded line numbers from test
Otherwise, it's painful to insert new code.  There are many existing
examples in the same test file where the line numbers are not
hard-coded.

I intend to do the same for several other OpenMP tests, but I want to
be sure there are no objections before I spend time on it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D82224
2020-06-24 14:35:01 -04:00
Alexey Bataev cb90e6a7c0 [OPENMP50]Codegen for scan directives in parallel for simd regions.
Summary:
Added codegen for scan directives in parallel for simd regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp parallel for simd reduction(inscan, op : ...)
for() {
  <input phase>;
  #pragma omp scan (in)exclusive(...)
  <scan phase>
}
```
is transformed to something:
```
 #pragma omp parallel
{
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --cnt)
  buffer[cnt] op= buffer[cnt-pow(2,k)];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
}
```
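
The log-step pass between the two worksharing loops can be sketched as runnable C. This is an illustration under stated assumptions, not the code Clang emits: the reduction operator is fixed to `+` on `int`, the pass runs serially, and the names `inclusive_scan_add` and `scan_result` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* In-place log-step inclusive prefix sum over buffer[0..n-1].
 * stride takes the values 1, 2, 4, ... (pow(2, k) in the pseudocode). */
void inclusive_scan_add(int *buffer, size_t n) {
  assert(buffer != NULL || n == 0);
  for (size_t stride = 1; stride < n; stride *= 2) {
    /* Walk downward so buffer[cnt - stride] still holds the value from
     * the previous pass when it is read. */
    for (size_t cnt = n - 1; cnt >= stride; --cnt)
      buffer[cnt] += buffer[cnt - stride];
  }
}

/* Value seen by the scan phase of iteration i. For an exclusive scan the
 * first iteration receives the identity of the operator (0 for +). */
int scan_result(const int *buffer, size_t i, int inclusive) {
  if (inclusive)
    return buffer[i];
  return i == 0 ? 0 : buffer[i - 1];
}
```

For the input `{1, 2, 3, 4, 5}` the buffer becomes `{1, 3, 6, 10, 15}` after three passes (strides 1, 2 and 4).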

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82115
2020-06-23 08:41:11 -04:00
Alexey Bataev 437cbad3b3 [OPENMP]Fix PR46357: Do not allow types declarations in pragmas.
Summary:
The compiler may erroneously treat the current context in OpenMP pragmas
as a context where a new type declaration/definition is allowed. The
declaration/definition of new types in OpenMP pragmas should not be
allowed.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82019
2020-06-18 13:17:03 -04:00
Alexey Bataev 4971d0b8ec [OPENMP50]Allow nonmonotonic modifier for all schedule kinds.
Summary:
According to OpenMP 5.0, nonmonotonic modifier can be used with all
schedule kinds, not only dynamic and guided as in OpenMP 4.5.
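
A short sketch of what this permits in user code, assuming a compiler that accepts OpenMP 5.0: the `nonmonotonic` modifier combined with a `static` schedule. The function `sum_of_squares` is hypothetical. When OpenMP is disabled the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Sum of i*i for i in [0, n). The nonmonotonic modifier tells the
 * implementation that a thread's chunks need not be executed in
 * increasing logical iteration order. */
long sum_of_squares(int n) {
  assert(n >= 0);
  long sum = 0;
#pragma omp parallel for schedule(nonmonotonic : static) reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += (long)i * i;
  return sum;
}
```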

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82026
2020-06-18 12:30:50 -04:00
Alexey Bataev 1ec469cf4c [OPENMP50]Codegen for scan directives in parallel for regions.
Summary:
Added codegen for scan directives in parallel for regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp parallel for reduction(inscan, op : ...)
 for() {
   <input phase>;
   #pragma omp scan (in)exclusive(...)
   <scan phase>
 }
```
is transformed to something:

```
 #pragma omp parallel
{
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --cnt)
  buffer[cnt] op= buffer[cnt-pow(2,k)];
 #pragma omp for
for (i: 0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81478
2020-06-18 11:56:55 -04:00
Alexey Bataev 08029595ca [OPENMP]Fix overflow during counting the number of iterations.
Summary:
OpenMP loops are normalized and transformed into loops from 0 to the
maximum number of iterations. In some cases, the original scheme may lead
to overflow when calculating the number of iterations. If it is unknown
whether overflow can occur (the bounds are not constant, so it cannot be
determined at compile time), cast the original type to its unsigned
counterpart.
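
The overflow in question can be illustrated with a small sketch. It assumes a loop of the form `for (int i = lb; i < ub; i += step)` with a positive `step`; the function `trip_count` is hypothetical and does not reproduce Clang's codegen. Computing `ub - lb` in `int` overflows for bounds such as `INT_MIN` and `INT_MAX`, while the same subtraction in `unsigned` is well defined.

```c
#include <assert.h>
#include <limits.h> /* INT_MIN / INT_MAX for callers probing the extremes */

/* Number of iterations of `for (int i = lb; i < ub; i += step)`,
 * assuming step > 0. The span is computed in unsigned arithmetic, which
 * wraps modulo 2^N instead of overflowing. */
unsigned trip_count(int lb, int ub, int step) {
  assert(step > 0);
  if (lb >= ub)
    return 0u;
  unsigned span = (unsigned)ub - (unsigned)lb; /* true distance ub - lb */
  return (span - 1u) / (unsigned)step + 1u;    /* ceil(span / step) */
}
```

For example, `trip_count(INT_MIN, INT_MAX, 1)` yields `UINT_MAX`, a value that the signed expression `ub - lb` could not represent.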

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits, cfe-commits, caomhin

Tags: #clang, #openmp

Differential Revision: https://reviews.llvm.org/D81881
2020-06-17 08:47:01 -04:00
Alexey Bataev 34ee2549a7 [OPENMP50]Codegen for scan directive in for simd regions.
Summary:
Added codegen for scan directives in for simd regions.

Emits the code for the directive with inscan reductions.
Original code:
```
 #pragma omp for simd reduction(inscan, op : ...)
for(...) {
  <input phase>;
  #pragma omp scan (in)exclusive(...)
  <scan phase>
}
```
is transformed to something:
```
size num_iters = <num_iters>;
<type> buffer[num_iters];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
 #pragma omp barrier
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --cnt)
  buffer[cnt] op= buffer[cnt-pow(2,k)];
 #pragma omp for simd
for (i: 0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
```

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81658
2020-06-17 08:43:17 -04:00
Mariya Podchishchaeva 0bdcd95bf2 [SYCL][OpenMP] Implement thread-local storage restriction
Summary:
SYCL and OpenMP prohibit thread-local storage in device code,
so this commit ensures that an error is emitted for device code and not
emitted for host code when the host target supports it.

Reviewers: jdoerfert, erichkeane, bader

Reviewed By: jdoerfert, erichkeane

Subscribers: guansong, riccibruno, ABataev, yaxunl, ebevhan, Anastasia, sstefan1, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D81641
2020-06-17 14:36:00 +03:00