[OPENMP] What's new for OpenMP in clang.

Updated ReleaseNotes + Status of the OpenMP support in clang.

llvm-svn: 338049
Alexey Bataev 2018-07-26 17:53:45 +00:00
parent 66d405d31f
commit 3bdd60095f
2 changed files with 84 additions and 7 deletions

@@ -10,13 +10,15 @@
.. role:: partial
.. role:: good
.. contents::
   :local:
==================
OpenMP Support
==================
Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64
and PPC64[LE], and has `basic support for Cuda devices`_.

The status of major OpenMP 4.5 features support in Clang:

Standalone directives
=====================
@@ -35,7 +37,7 @@ Standalone directives
* #pragma omp target: :good:`Complete`.
* #pragma omp declare target: :good:`Complete`.
* #pragma omp teams: :good:`Complete`.
@@ -64,5 +66,66 @@ Combined directives
* #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.
Clang does not support any constructs/updates from the upcoming OpenMP 5.0,
except for `reduction`-based clauses in the `task` and `target`-based
directives.
In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and
macOS.
.. _basic support for Cuda devices:
Cuda devices support
====================
Directives execution modes
--------------------------
Clang code generation for target regions supports two modes: the SPMD and
non-SPMD modes. Clang chooses one of these two modes automatically based on the
way directives and clauses on those directives are used. The SPMD mode uses a
simplified set of runtime functions, which increases performance at the cost of
not supporting some OpenMP features. The non-SPMD mode is the most generic and
supports all currently available OpenMP features. The compiler will always
attempt to use the SPMD mode wherever possible. SPMD mode will not be used if:
- The target region contains an `if()` clause that refers to a `parallel`
directive.
- The target region contains a `parallel` directive with a `num_threads()`
clause.
- The target region contains user code (other than OpenMP-specific
directives) in between the `target` and the `parallel` directives.
Data-sharing modes
------------------
Clang supports two data-sharing models for Cuda devices: `Generic` and `Cuda`
modes. The default mode is `Generic`. `Cuda` mode can give additional
performance and can be activated using the `-fopenmp-cuda-mode` flag. In
`Generic` mode, all local variables that can be shared in the parallel regions
are stored in global memory. In `Cuda` mode, local variables are not shared
between the threads, and it is the user's responsibility to share the required
data between the threads in the parallel regions.
Features not supported or with limited support for Cuda devices
---------------------------------------------------------------
- Reductions across the teams are not supported yet.
- Cancellation constructs are not supported.
- Doacross loop nests are not supported.
- User-defined reductions are supported only for trivial types.
- Nested parallelism: inner parallel regions are executed sequentially.
- Static linking of libraries containing device code is not supported yet.
- Automatic translation of math functions in target regions to device-specific
math functions is not implemented yet.
- Debug information for OpenMP target regions is not supported yet.

@@ -216,7 +216,21 @@ OpenCL C Language Changes in Clang
OpenMP Support in Clang
----------------------------------
- Clang gained basic support for OpenMP 4.5 offloading to NVPTX targets.
  To compile your program for an NVPTX target, use the following options:
  `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` for 64-bit platforms or
  `-fopenmp -fopenmp-targets=nvptx-nvidia-cuda` for 32-bit platforms.
- Passing options to the OpenMP device offloading toolchain can be done using
  the `-Xopenmp-target=<triple> -opt=val` flag. In this way the `-opt=val`
  option will be forwarded to the OpenMP device offloading toolchain described
  by the triple. For example, the compute capability can be passed to the
  OpenMP NVPTX offloading toolchain as follows:
  `-Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60`. When only one target
  offload toolchain is specified under the `-fopenmp-targets=<triples>`
  option, the triple can be omitted: `-Xopenmp-target -march=sm_60`.
- Other bugfixes.
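Putting the offloading options above together, a hypothetical build of a file `app.c` (name assumed for illustration) for an sm_60 device might look like this; both invocations require a clang build with the NVPTX/CUDA toolchain available:

```shell
# Offload to a 64-bit NVPTX device, passing the compute capability through.
clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
      -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60 app.c -o app

# With a single offload target, the triple after -Xopenmp-target can be omitted.
clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
      -Xopenmp-target -march=sm_60 app.c -o app
```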
CUDA Support in Clang
---------------------