Commit Graph

Alexey Bataev 504fc2d0cd [OPENMP, NVPTX] Codegen for critical construct.
Added correct codegen for the critical construct on NVPTX devices.
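An illustrative source-level example of the construct this codegen targets; a minimal sketch with hypothetical variable names, not taken from the patch's test cases:

```
// Minimal sketch of a critical construct inside a target region
// (illustrative example only).
#include <cstdio>

int main() {
  int counter = 0;
  #pragma omp target parallel map(tofrom: counter)
  {
    // On NVPTX the generated code must serialize entry into this region
    // so the read-modify-write below is race-free across device threads.
    #pragma omp critical
    counter += 1;
  }
  std::printf("counter = %d\n", counter);
  return 0;
}
```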

llvm-svn: 331652
2018-05-07 17:23:05 +00:00
Alexey Bataev 18fa2323b6 [OPENMP] Emit names of the globals depending on target.
Some symbols are not allowed to be used as names on some targets. The patch
tries to unify the emission of the names of LLVM globals so that they can be
used on different targets.
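A simplified sketch of the idea; the helper name, separator choices, and target check below are illustrative assumptions, not the code from the patch:

```
// Sketch: build the name of an OpenMP-related LLVM global from parts,
// using a separator that is legal on the current target. Some targets
// reject characters such as '.' in symbol names, so a different
// separator is substituted there. (Names and logic are illustrative.)
#include <string>
#include <vector>

std::string buildGlobalName(const std::vector<std::string> &Parts,
                            bool TargetAllowsDotInNames) {
  const char Sep = TargetAllowsDotInNames ? '.' : '$';
  std::string Name;
  for (const std::string &P : Parts) {
    if (!Name.empty())
      Name += Sep;
    Name += P;
  }
  return Name;
}
```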

llvm-svn: 331358
2018-05-02 14:20:50 +00:00
Alexey Bataev e290ec02c7 [OPENMP, NVPTX] Fix codegen for the teams reduction.
Added NUW flags to all the add/mul/sub operations and replaced sdiv with udiv,
since we operate only on unsigned values (addresses converted to integers).
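A source-level analogy of why unsigned arithmetic is the right model here; a sketch, not the IR the patch emits:

```
// Sketch: computing a per-thread chunk size from two addresses.
// Addresses converted to integers are unsigned, so the subtraction and
// division below map naturally onto NUW adds/subs and udiv in the
// generated IR (illustrative analogy, not the patch's code).
#include <cstdint>

uintptr_t chunk_size(const void *begin, const void *end, unsigned nthreads) {
  uintptr_t lo = reinterpret_cast<uintptr_t>(begin);
  uintptr_t hi = reinterpret_cast<uintptr_t>(end);
  uintptr_t bytes = hi - lo;  // hi >= lo by contract, so no sign handling
  return bytes / nthreads;    // unsigned division (udiv)
}
```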

llvm-svn: 329411
2018-04-06 16:03:36 +00:00
Gheorghe-Teodor Bercea d3dcf2f05d [OpenMP] Add OpenMP data sharing infrastructure using global memory
Summary:
This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure.

TODO: add a more detailed description.

Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43660

llvm-svn: 327513
2018-03-14 14:17:45 +00:00
Gheorghe-Teodor Bercea 7d80da15a0 [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory
Summary: Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an optimization for the generic data sharing scheme. Removing this completely and then adding it via future patches will make all future data sharing patches cleaner.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43625

llvm-svn: 326948
2018-03-07 21:59:50 +00:00
Jonas Hahnfeld 891c7fb19d [OpenMP] Adjust arguments of nvptx runtime functions
In the future the compiler will analyze whether the OpenMP
runtime needs to be (fully) initialized and avoid that overhead
if possible. The functions already take an argument to transfer
that information to the runtime, so pass in the default value 1.
(This is needed for binary compatibility with libomptarget-nvptx
currently being upstreamed.)
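A rough sketch of the kind of entry point being adjusted; the prototype below is approximated from the libomptarget-nvptx interface and the surrounding names are illustrative:

```
// Sketch of an nvptx device runtime entry point that takes a flag saying
// whether the full OpenMP runtime is required. Until the compiler
// analysis exists, clang conservatively passes 1 ("required").
// (Prototype approximated from libomptarget-nvptx; treat as illustrative.)
#include <cstdint>

extern "C" void __kmpc_kernel_init(int ThreadLimit,
                                   int16_t RequiresOMPRuntime);

void emit_target_init_example(int thread_limit) {
  // Compiler-generated call in the target region prologue:
  __kmpc_kernel_init(thread_limit, /*RequiresOMPRuntime=*/1);
}
```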

Differential Revision: https://reviews.llvm.org/D40354

llvm-svn: 318836
2017-11-22 14:46:49 +00:00
Gheorghe-Teodor Bercea eb89b1d46f [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading
Summary:
This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team.

This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are:
- Patch D38978 to the LLVM NVPTX backend, which ensures the lowering of shared variables to a device memory space that allows the sharing of references;
- A patch (coming soon) to the libomptarget runtime library, which ensures that a list of references to shared variables is properly maintained.

A simple code snippet which illustrates an implicit data sharing situation is as follows:

```
#pragma omp target
{
   // master thread only
   int v;
   #pragma omp parallel
   {
      // worker threads
      // use v
   }
}
```

Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master.

The code generated in this patch relies on the LLVM NVPTX patch (mentioned above), which prevents v from being lowered into the thread-local memory of the master thread; such lowering would make the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct.
Since the parallel region is outlined, the passing of arguments to the outlined region must preserve the original order of the arguments. The runtime therefore maintains a list of references to shared variables, ensuring that they are passed in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled.
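A conceptual C++ sketch of the outlining and argument-passing scheme described above; the function names and the shape of the shared-arguments list are assumptions for illustration, not the generated code itself:

```
// Conceptual sketch of the outlined parallel region and the wrapper that
// unpacks the runtime-maintained list of shared-variable references
// before calling it (names and signatures are illustrative only).
#include <cstdint>

// Outlined body of "#pragma omp parallel": expects &v among its arguments.
void outlined_parallel_fn(int32_t *global_tid, int32_t *bound_tid, int *v) {
  // ... worker threads use the latest value of *v ...
  (void)global_tid; (void)bound_tid; (void)v;
}

// Wrapper constructed by the data-sharing infrastructure: it receives the
// shared-variable references in the order the runtime recorded them and
// forwards them as ordinary arguments, preserving the original order.
void parallel_wrapper(int32_t *global_tid, int32_t *bound_tid,
                      void **shared_args) {
  int *v = static_cast<int *>(shared_args[0]);
  outlined_parallel_fn(global_tid, bound_tid, v);
}
```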

Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin

Reviewed By: ABataev

Subscribers: cfe-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D38976

llvm-svn: 318773
2017-11-21 15:54:54 +00:00
Arpith Chacko Jacob bb36fe8dba [OpenMP] Basic support for a parallel directive in a target region on an NVPTX device
Summary:

This patch introduces support for the execution of parallel constructs in a target
region on the NVPTX device.  Parallel regions must be in the lexical scope of the
target directive.

The master thread in the master warp signals parallel work for worker threads in worker
warps on encountering a parallel region.

Note: The patch does not yet support the capture of arguments in a parallel region,
so the test cases are simple.
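An example of the kind of simple case this patch handles; a sketch in the spirit of those tests, since nothing from the enclosing scope may be captured yet:

```
// Minimal parallel region lexically nested in a target region, with no
// captured variables (illustrative of the cases supported by this patch).
int main() {
  #pragma omp target
  {
    // The master thread reaches the parallel directive and signals the
    // worker warps to execute the region below.
    #pragma omp parallel
    {
      // Work that does not reference any enclosing locals.
      int local_only = 0;
      local_only += 1;
      (void)local_only;
    }
  }
  return 0;
}
```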

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28145

llvm-svn: 291565
2017-01-10 15:42:51 +00:00