Commit Graph

11 Commits

Author SHA1 Message Date
Shilei Tian ca999f7191 [OpenMP][Offloading] Use bitset to indicate execution mode instead of value
The execution mode of a kernel is stored in a global variable, whose value means:
- 0 - SPMD mode
- 1 - indicates generic mode
- 2 - SPMD mode execution with generic mode semantics

We are going to add support for SIMD execution mode. It will be come with another
execution mode, such as SIMD-generic mode. As a result, this value-based indicator
is not flexible.

This patch changes to bitset based solution to encode execution mode. Each
position is:
[0] - generic mode
[1] - SPMD mode
[2] - SIMD mode (will be added later)

In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode
execution with generic mode semantics. In the future after we add the support for
SIMD mode, `0b1xx` will be in SIMD mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110029
2021-09-22 11:40:52 -04:00
Johannes Doerfert e2cfbfcc0c [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 17:53:56 -05:00
Joseph Huber 68d133a3e8 [OpenMP] Simplify GPU memory globalization
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.

This patch replaces this functinality with a single allocation command, effectively mimicing an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D97680
2021-06-22 10:52:46 -04:00
Pushpinder Singh 3a12ff0dac [OpenMP][RTL] Remove dead code
RequiresDataSharing was always 0, resulting dead code in device runtime library.

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D88829
2020-10-06 05:43:47 -04:00
Saiyedul Islam a1bdf8f545 [OpenMP] Ensure testing for versions 4.5 and default - Part 2
Many OpenMP Clang tests do not RUN for version 4.5 and the default
version. This second patch in the series handles test cases which
require updation in CHECK lines along with adding RUN lines for
the default version. It involves updating line number of pragmas.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D85150
2020-08-27 18:50:52 +00:00
Gheorghe-Teodor Bercea 2b40470c61 [OpenMP] Add a new version of the SPMD deinit kernel function
Summary: This patch adds a new runtime for the SPMD deinit kernel function which replaces the previous function. The new function takes as argument the flag which signals whether the runtime is required or not. This enables the compiler to optimize out the part of the deinit function which are not needed.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D54970

llvm-svn: 347915
2018-11-29 20:53:49 +00:00
Alexey Bataev 8bcc69c054 [OPENMP][NVPTX]Extend number of constructs executed in SPMD mode.
If the statements between target|teams|distribute directives does not
require execution in master thread, like constant expressions, null
statements, simple declarations, etc., such construct can be xecuted in
SPMD mode.

llvm-svn: 346551
2018-11-09 20:03:19 +00:00
Alexey Bataev 8d8e1235ab [OPENMP][NVPTX] Add support for lightweight runtime.
If the target construct can be executed in SPMD mode + it is a loop
based directive with static scheduling, we can use lightweight runtime
support.

llvm-svn: 340953
2018-08-29 18:32:21 +00:00
Gheorghe-Teodor Bercea ad4e579407 [OpenMP] Initialize data sharing stack for SPMD case
Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D49188

llvm-svn: 337015
2018-07-13 16:18:24 +00:00
Carlo Bertolli 79712097c7 [OpenMP] Extend NVPTX SPMD implementation of combined constructs
Differential Revision: https://reviews.llvm.org/D43852

This patch extends the SPMD implementation to all target constructs and guards this implementation under a new flag.

llvm-svn: 326368
2018-02-28 20:48:35 +00:00
Arpith Chacko Jacob 2cd6eeabfd [OpenMP] Support for the proc_bind-clause on 'target parallel' on the NVPTX device.
This patch adds support for the proc_bind clause on the Spmd construct
'target parallel' on the NVPTX device.  Since the parallel region is created
upon kernel launch, this clause can be safely ignored on the NVPTX device at
codegen time for level 0 parallelism.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29128

llvm-svn: 293069
2017-01-25 16:55:10 +00:00