Commit Graph

1563 Commits

Author SHA1 Message Date
AndreyChurbanov dab5d6c2eb [OpenMP] fix race condition in test 2021-02-18 02:27:49 +03:00
Jon Chesterfield 53d7fd3762 [libomptarget][amdgcn] Remove lookup of .language msgpack field 2021-02-17 23:02:16 +00:00
AndreyChurbanov cf1ddae7e3 [OpenMP][NFC] replaced 'dependencies' with 'dependences' in comments and debug prints 2021-02-18 00:38:18 +03:00
Alexey Bataev 60d71a286b [OPENMP50]Allow overlapping mapping in target constructs.
OpenMP 5.0 removed a lot of restriction for overlapped mapped items
comparing to OpenMP 4.5. Patch restricts the checks for overlapped data
mappings only for OpenMP 4.5 and less and reorders mapping of the
arguments so, that present and alloc mappings are processed first and
then all others.

Differential Revision: https://reviews.llvm.org/D86119
2021-02-16 14:42:08 -08:00
Johannes Doerfert 2518cc65d2 [OpenMP][FIX] Avoid use of stack allocations in asynchronous calls
As reported by Guilherme Valarini [0], we used to pass stack allocations
to calls that can nowadays be asynchronous. This is arguably a problem
and it will inevitably result in UB. To remedy the situation we
allocate the locations as part of the AsyncInfoTy object. The lifetime
of that object matches what we need for now. If the synchronization is
not tied to the AsyncInfoTy object anymore we might need to have a
different buffer construct in global space.

This should be back-ported to LLVM 12 but needs slight modifications as
it is based on refactoring patches we do not need to backport.

[0] https://lists.llvm.org/pipermail/openmp-dev/2021-February/003867.html

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D96667
2021-02-16 15:38:11 -06:00
Johannes Doerfert 758b849931 [OpenMP] Unify omptarget API and usage wrt. `__tgt_async_info`
This patch unifies our libomptarget API in two ways:
  - always pass a `__tgt_async_info` object, the Queue member decides if
    it is in use or not.
  - (almost) always synchronize in the interface layer and not in the
    omptarget layer.

A side effect is that we now put all constructor and static initializer
kernels in a stream too, if the device utilizes `__tgt_async_info`.

The patch contains a TODO which can be addressed as we add support for
asynchronous malloc and free in the plugin API. This is the only
`synchronizeAsyncInfo` left in the omptarget layer.

Site note: On a V100 system the GridMini performance for small sizes
more than doubled.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D96379
2021-02-16 15:38:06 -06:00
Johannes Doerfert a2fc0d34db [OpenMP] Move synchronization into `__tgt_async_info`
The AsyncInfo should be passed everywhere and it should offer a way to
ensure synchronization, given a libomptarget Device.

This replaces D96431.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D96438
2021-02-16 15:38:01 -06:00
Johannes Doerfert 942728763b [OpenMP][NFC] Unify `target` API with other by passing a `__tgt_async_info` pointer
Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D96430
2021-02-16 15:37:56 -06:00
Johannes Doerfert 44f3022cdf [OpenMP][NFC] Pass a DeviceTy, not the device number to `target`
This unifies the API of `target` relative to `targetUpdateData` and
such.

Reviewed By: tianshilei1992, grokos

Differential Revision: https://reviews.llvm.org/D96429
2021-02-16 15:37:51 -06:00
Johannes Doerfert ea9395716e [OpenMP][NFC] Clang format the libomptarget plugins
Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D96445
2021-02-16 15:37:46 -06:00
Johannes Doerfert ad94fce845 [OpenMP][NFC] Eliminate sign comparison warning via explicit casts
Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D96812
2021-02-16 15:37:41 -06:00
Johannes Doerfert 9cd1e2228c [OpenMP][NFC] Clang format libomptarget code (src & include)
The struct and enum alignments are kept by disabling clang-format for
that code region.

Reviewed By: tianshilei1992, JonChesterfield, grokos

Differential Revision: https://reviews.llvm.org/D96428
2021-02-16 15:37:35 -06:00
AndreyChurbanov 5631842d18 [OpenMP] NFC: fix test removing the target construct 2021-02-13 04:49:52 +03:00
AndreyChurbanov 091e8daa24 [OpenMP] fix test adding mapping of shared variables 2021-02-13 04:13:54 +03:00
Martin Storsjö 496ca4127e [OpenMP] Silence more warning flags
This silences warnings like these, in mingw builds with clang:

runtime/src/kmp_atomic.h:1021:13: warning: '__kmpc_atomic_cmplx8_rd' has C-linkage specified, but returns user-defined type 'kmp_cmplx64' (aka '__kmp_cmplx64_t') which is incompatible with C [-Wreturn-type-c-linkage]

runtime/src/z_Windows_NT_util.cpp:479:17: warning: cast from 'volatile void *' to 'type-parameter-0-0 *' drops volatile qualifier [-Wcast-qual]
    flag = (C *)th->th.th_sleep_loc;

runtime/src/z_Windows_NT_util.cpp:1321:14: warning: cast to 'void *' from smaller integer type 'DWORD' (aka 'unsigned long') [-Wint-to-void-pointer-cast]
  } else if ((void *)exit_val != (void *)th) {

Differential Revision: https://reviews.llvm.org/D96585
2021-02-12 21:55:32 +02:00
Martin Storsjö 16428a8d91 [OpenMP] Avoid warnings about unused static functions on windows
Add ifdefs around one function that only is used in unix build
configurations.

Add a void cast for a windows specific function that currently is
unused but may be intended to be used at some point.

Differential Revision: https://reviews.llvm.org/D96584
2021-02-12 21:55:31 +02:00
Martin Storsjö b388c84c09 [OpenMP] Remove two entirely unused variables
Differential Revision: https://reviews.llvm.org/D96583
2021-02-12 21:55:31 +02:00
Martin Storsjö b3d84790fa [OpenMP] Add void casts to silence unused variable warnings
These variables are used only in certain build configurations,
or marked with a todo comment indicating that they should be
used/checked/reported.

Differential Revision: https://reviews.llvm.org/D96582
2021-02-12 21:55:31 +02:00
Martin Storsjö 3f9519b768 [OpenMP] Only use #pragma comment(lib, ...) in MSVC build configurations
MinGW build configurations don't support this pragma (unless
compiling with clang, with -fms-extensions, and linking with
lld), and at least clang warns about it.

This library does end up linked by the cmake files anyway (as
long as the check works properly).

Differential Revision: https://reviews.llvm.org/D96581
2021-02-12 21:55:31 +02:00
Martin Storsjö 77632422bc [OpenMP] Fix the check for libpsapi for i386
check_library_exists fails for stdcall functions, because that
check doesn't include the necessary headers (and thus fails with
an undefined reference to _EnumProcessModules, when the import
library symbol actually is called _EnumProcessModules@16).

Merge the two previous checks check_include_files and
check_library_exists into one with check_c_source_compiles, and
merge the variables that indicate whether it succeeded.

Differential Revision: https://reviews.llvm.org/D96580
2021-02-12 21:55:30 +02:00
Jon Chesterfield 6f04addc8b [libomptarget][amdgcn] Build amdgcn devicertl as openmp
[libomptarget][amdgcn] Build amdgcn devicertl as openmp

Change cmake to build as openmp and fix up some minor errors in the code.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96533
2021-02-12 09:51:21 +00:00
AndreyChurbanov 838dcdb5fc [OpenMP] libomp: minor changes to improve library performance
Three minor changes in this patch:
- added UNLIKELY hint to few rarely executed branches;
- replaced couple of run time checks with debug assertions;
- moved check of presence of ittnotify tool from inside the function call.

Differential Revision: https://reviews.llvm.org/D95816
2021-02-12 00:43:13 +03:00
Hansang Bae ffb21e7f05 [OpenMP] Enable omp_get_num_devices() on Windows
This patch enables omp_get_num_devices() and omp_get_initial_device() on
Windows by providing an alternative to dlsym on Windows, and proposes to
add a new libomptarget entry, __tgt_get_num_devices().

Differential Revision: https://reviews.llvm.org/D96182
2021-02-11 14:53:48 -06:00
Nawrin Sultana 4692bb4a8a [OpenMP] Add lower and upper bound in num_teams clause
This patch adds lower-bound and upper-bound to num_teams clause
according to OpenMP 5.1 specification. The initial number of teams
created is implementation defined, but it will be greater than or
equal to lower-bound and less than or equal to upper-bound. If
num_teams clause is not specified, the number of teams created is
implementation defined, but it will be greater or equal to 1.

Differential Revision: https://reviews.llvm.org/D95820
2021-02-10 13:58:50 -06:00
Jon Chesterfield 56c446a878 [libomptarget][amdgcn] Tolerate deadstripped device_state variable
[libomptarget][amdgcn] Tolerate deadstripped device_state variable

The device_state variable may have been deadstripped. Similar to
device_environment, leave detection of missing but used symbol to loader.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D96330
2021-02-09 16:29:53 +00:00
Jon Chesterfield 4756f76bce [libomptarget][amdgcn] Tolerate deadstripped env variable
[libomptarget][amdgcn] Tolerate deadstripped env variable

Discovered by Pushpinder. If the device_environment variable is unused
it can be deadstripped, in which case we should not abort due to it
missing. This change is safe in that a missing symbol which is actually
used can be reported by both linker and loader, and a missing unused
symbol is better deadstripped than left in the image.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D96329
2021-02-09 11:58:37 +00:00
Jon Chesterfield 2fa4186d4e [libomptarget][amdgcn] Fix language linkage post D95300, drop use of assert 2021-02-08 20:07:51 +00:00
Shilei Tian b68a6b09e6 [OpenMP][libomptarget] Fixed an issue that device sync is skipped if the kernel doesn't have any argument
Currently if there is not kernel argument, device synchronization will
be skipped. This can lead to two issues:
1. If there is any device error, it will not be captured;
2. The target region might end before the kernel is done, which is not spec
   conformant.

The test added in this patch only runs on NVPTX platform, although it will not
be executed by Phab at all. It also requires `not` which is not available on most
systems.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D96067
2021-02-04 20:14:24 -05:00
Shilei Tian 567b3f8841 [OpenMP][deviceRTLs] Drop `assert` in common parts of `deviceRTLs`
The header `assert.h` needs to be included in order to use `assert` in the code.
When building NVPTX `deviceRTLs` on a CUDA free system, it requires headers from
`gcc-multilib`, which some systems don't have. This patch drops the use of
`assert` in common parts of `deviceRTLs`. In light of
`openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h`, a code block
```
if (!cond)
  __builtin_trap();
```
is being used. The builtin will be translated to `call void @llvm.trap()`, and
the corresponding PTX is `trap;`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95986
2021-02-04 12:39:43 -05:00
Shilei Tian 0f0ce3c12e [OpenMP][NVPTX] Take functions in `deviceRTLs` as `convergent`
OpenMP device compiler (similar to other SPMD compilers) assumes that
functions are convergent by default to avoid invalid transformations, such as
the bug (https://bugs.llvm.org/show_bug.cgi?id=49021).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95971
2021-02-03 20:58:12 -05:00
Shilei Tian 3c31b78455 [OpenMP] Fixed an issue that taskwait doesn't work on detachable task
D77609 mistakenly changed the bebavior of task waiting on detachable task that a detachable task is not waited, based on https://lists.llvm.org/pipermail/openmp-dev/2021-February/003836.html. This patch fixed it. Thank Raúl for the report.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95798
2021-02-03 13:12:43 -05:00
Peyton, Jonathan L ffca74b8b8 [OpenMP] Fix sign comparison warnings from GCC
New affinity patch introduced legitimate sign-compare warnings that
clang doesn't report but GCC-10 does. This removes the warnings by
changing two variables types to unsigned.

Differential Revision: https://reviews.llvm.org/D95818
2021-02-02 10:52:16 -06:00
Joseph Huber ed8943c087 [OpenMP][NFC] Adding FAQ Entry for errors with static libraries 2021-02-02 10:50:22 -05:00
Atmn Patel b545667d0a [OpenMP][Libomptarget] Remove possible harmful copy constructor call for RTLsTy
From https://bugs.llvm.org/show_bug.cgi?id=48973, we know that
`std::call_once(PM->RTLs.initFlag, &RTLsTy::LoadRTLs, PM->RTLs)` causes compile
time problems in libstdc++v3 5.3.1. This is because there was a defect in the
standard regarding the `call_once` (LWG 2442). This was fixed in libstdc++ soon
thereafter, but there are likely other standard libraries where this will fail.

By matching this function call with the other one, we fix this bug.

Differential Revision: https://reviews.llvm.org/D95769
2021-02-01 20:13:03 -05:00
AndreyChurbanov d7b12004bd [OpenMP] libomp: implement nteams-var and teams-thread-limit-var ICVs
The change includes OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT env variables,
omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit,
omp_get_teams_thread_limit routines.

Differential Revision: https://reviews.llvm.org/D95003
2021-02-01 22:54:11 +03:00
Shilei Tian f0129cc35e [OpenMP] Disable tests if FileCheck is not available in in-tree building
FileCheck is required for OpenMP tests. The current detection can fail
if building OpenMP in-tree when user sets `LLVM_INSTALL_TOOLCHAIN_ONLY=ON`. As a
result, CMake will raise an error and the compilation will be broken. This patch
fixed the issue. When `FileCheck` is not a target, tests will just be skipped.

Reviewed By: jdoerfert, JonChesterfield

Differential Revision: https://reviews.llvm.org/D95689
2021-02-01 13:14:55 -05:00
Joseph Huber fda4853998 [OpenMP] Fix seg fault in libomptarget when using Info with multiple threads
Summary:
One option for the LIBOMPTARGET_INFO environment variable is to print the current status of the device's data mappings. These are a shared resource among threads so this needs to be protected when using multiple streams.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95786
2021-02-01 11:21:57 -05:00
xgupta 94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Tobias Hieta c3c02d0d5a [OpenMP] Fix python3 compatibility in openmp's lit.cfg
Differential Revision: https://reviews.llvm.org/D95669
2021-02-01 08:20:26 +01:00
Shilei Tian 26d38f6d20 [OpenMP][NVPTX] Refined CMake logic to choose compute capabilites
This patch refines the logic to choose compute capabilites via the
environment variable `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES`. It supports the
following values (all case insensitive):
- "all": Build `deviceRTLs` for all supported compute capabilites;
- "auto": Only build for the compute capability auto detected. Note that this
  requires CUDA. If CUDA is not found, a CMake fatal error will be raised.
- "xx,yy" or "xx;yy": Build for compute capabilities `xx` and `yy`.

If `LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES` is not set, it is equivalent to set
it to `all`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95687
2021-01-30 15:14:48 -05:00
Jonathan Peyton 67773681c0 [OpenMP] Add environment variable to force monotonic dynamic scheduling
This patch introduces a new environment variable to force monotonic
behavior for users that absolutely need it.  This is in anticipation
of 5.0 change that uses non-monotonic behavior for dynamic scheduling
by default. Fixes for that and the actual switch are coming soon.

Differential Revision: https://reviews.llvm.org/D95263
2021-01-29 12:23:27 -06:00
Shilei Tian 7bc31018f7 [OpenMP][NFC] Added release note for new `deviceRTLs` and hidden helper task
Added release note for new `deviceRTLs` and hidden helper task for LLVM
12.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95584
2021-01-29 13:13:03 -05:00
AndreyChurbanov 7f5ad0e071 [OpenMP] libomp: fix build by cl with vs2019
Replace VLA with dynamic allocation using alloca().
This fixes https://bugs.llvm.org/show_bug.cgi?id=48919.

Differential Revision: https://reviews.llvm.org/D95627
2021-01-29 13:16:41 +03:00
AndreyChurbanov ac70a53653 [OpenMP] NFC: disabled two flakey tests as the bug in libomp not fixed yet 2021-01-29 00:54:13 +03:00
Shilei Tian 1b19c42302 [OpenMP][deviceRTLs] Separate declaration of target dependent functions from `target_impl.h`
This patch created a new header file `target_interface.h` for declarations of all target dependent functions. All future targets can get things work by simply implementing all functions declared in the header and macros/data same as each `target_impl.h`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95300
2021-01-28 08:14:33 -05:00
Shilei Tian 5a64794bba [OpenMP][NVPTX] Added the missing -O1 when building NVPTX bitcode libraries
In the past `-O1` was used when building NVPTX bitcode libraries. After
we switched to OpenMP, `-O1` was missing by mistake, leading to a huge performance
regression.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95545
2021-01-28 08:13:38 -05:00
Shilei Tian 19248d30e4 [OpenMP][deviceRTLs] Added `[[clang::loader_uninitialized]]` explicitly
`[[clang::loader_uninitialized]]` is in macro `SHARED` but it doesn't
work for array like `parallelLevel`, so the variable will be zero initialized.
There is also a similar issue for `omptarget_nvptx_device_State` which is in
global address space. Its c'tor is also generated, which was not in the past when
building the `deviceRTLs` with CUDA. In this patch, we added the attribute to
the two variables explicitly.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95550
2021-01-28 08:12:49 -05:00
Shilei Tian c571b16834 [OpenMP] Disabled profiling in `libomp` by default to unblock link errors
Link error occurred when time profiling in libomp is enabled by default
because `libomp` is assumed to be a C library but the dependence on
`libLLVMSupport` for profiling is a C++ library. Currently the issue blocks all
OpenMP tests in Phabricator.

This patch set a new CMake option `OPENMP_ENABLE_LIBOMP_PROFILING` to
enable/disable the feature. By default it is disabled. Note that once time
profiling is enabled for `libomp`, it becomes a C++ library.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95585
2021-01-28 07:24:32 -05:00
Vyacheslav Zakharin 0fc90873b2 [libomptarget][NFC] Link plugins with threads support library due to std::call_once usage.
Differential Revision: https://reviews.llvm.org/D95572
2021-01-27 19:26:18 -08:00
Atmn Patel 8a77056256 [OpenMP][Libomptarget] Fix conditional in CMake for remote plugin
The remote offloading plugin's CMakeLists was trying to build if its
flag was enabled even if it didn't find gRPC/protobuf. The conditional
was wrong, it's fixed by this.

Differential Revision: https://reviews.llvm.org/D95574
2021-01-27 21:28:25 -05:00