1162 Commits

Author SHA1 Message Date
Yuanfang Chen c2c4f1c120 [openmp][cmake] passing option argument correctly
From the context, it looks like the test should not be run with `check-all`,
but it is. It turns out the option argument resolves to True/False, which
cannot be passed down as is. There is one such example in
AddLLVM.cmake.
2020-02-13 09:33:58 -08:00
Alexey Bataev 578c13d13c [OPENMP]Fix the test, NFC. 2020-02-13 10:40:06 -05:00
Ethan Stewart 190a11148b Changed omp_get_max_threads() implementation to more closely match spec description.
Summary: The 5.0 spec states, "The omp_get_max_threads routine returns an upper bound on the number of threads that could be used to form a new team if a parallel construct without a num_threads clause were encountered after execution returns from this routine." The attached test shows Max Threads: 96, Num Threads: 128 without the proposed change. The number of threads should not exceed the (max) nthreads ICV, hence we should return the higher SPMD thread number even when omp_get_max_threads() is called in a generic kernel. This change does fail the api test, max_threads.c, because now it would return 64 instead of 32.

Reviewers: jdoerfert, ABataev, grokos, JonChesterfield

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D74092
2020-02-12 23:29:34 +00:00
JonChesterfield c2ce9ea4e3 [libomptarget][nfc] Change enum values to match those in cuda/rtl
Summary:
[libomptarget][nfc] Change enum values to match those in cuda/rtl

support.h and cuda/rtl.cpp (and downstream hsa/rtl.cpp) have enums for execution
mode. These are actually independent - the numbers that are used within support, or
within the plugin, are never passed across the boundary.

Nevertheless, trying to work out why the values are different between the two
has generated a reasonable amount of confusion. This patch changes support to
match the values in plugin, on the basis that the plugin also has some comments
which I'd have to update if I changed that one instead. Credit to Ron for
working through this in our own fork. See rocm-developer-tools/aomp/issues/7
for that earlier diagnostic write up.

Also happy with generic = 0, spmd = 1 - provided it's the same in both places.
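
One way to keep two independent enum definitions from drifting apart again is a compile-time check. The following is a minimal sketch only: the namespace names, enumerator names, and numeric values are hypothetical stand-ins, not the definitions in support.h or cuda/rtl.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the device-side (support.h) enum.
namespace device_support_sketch {
enum ExecutionMode : uint32_t { Spmd = 0, Generic = 1 }; // illustrative values
} // namespace device_support_sketch

// Hypothetical stand-in for the plugin-side (rtl.cpp) enum.
namespace plugin_sketch {
enum ExecutionMode : uint32_t { Spmd = 0, Generic = 1 }; // illustrative values
} // namespace plugin_sketch

// True when both definitions assign the same number to each mode.
constexpr bool modesAgree() {
  return static_cast<uint32_t>(device_support_sketch::Spmd) ==
             static_cast<uint32_t>(plugin_sketch::Spmd) &&
         static_cast<uint32_t>(device_support_sketch::Generic) ==
             static_cast<uint32_t>(plugin_sketch::Generic);
}

// Fails the build as soon as the two enums disagree.
static_assert(modesAgree(), "execution mode enums are out of sync");
```

Since the values never cross the boundary, such a check is about readability rather than correctness.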

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D74503
2020-02-12 23:27:08 +00:00
Kelvin Li 4f1f2b7a5b [OpenMP] update strings output of libomp.so [NFC]
Change the string from "Intel(R) OMP" to "LLVM OMP" in libomp.so

Differential Revision: https://reviews.llvm.org/D74462
2020-02-12 15:45:55 -05:00
Johannes Doerfert a5153dbc36 [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D74145
2020-02-11 22:07:14 -06:00
Johannes Doerfert 3ff4e2eee8 [OpenMP] Switch default C++ standard to C++ 14
Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D74258
2020-02-11 17:11:54 -06:00
Jonas Devlieghere 4fe839ef3a [CMake] Rename EXCLUDE_FROM_ALL and make it an argument to add_lit_testsuite
EXCLUDE_FROM_ALL means something else for add_lit_testsuite as it does
for something like add_executable. Distinguish between the two by
renaming the variable and making it an argument to add_lit_testsuite.

Differential revision: https://reviews.llvm.org/D74168
2020-02-06 15:33:18 -08:00
Jon Chesterfield 6a82f0f0b9 [libomptarget] Implement wavefront functions for amdgcn
Summary: [libomptarget] Implement wavefront functions for amdgcn

Reviewers: jdoerfert, ABataev, grokos, arsenm

Reviewed By: arsenm

Subscribers: saiislam, wdng, arsenm, jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73077
2020-02-04 21:55:29 +00:00
protze@itc.rwth-aachen.de 90e4ebdce5 [OpenMP][OMPT] fix reduction test for 32-bit x86
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44733 | TEST 'libomp :: ompt/synchronization/reduction/tree_reduce.c' FAILED on 32-bit x86 ]]

For 32-bit we need at least 3 variables to avoid atomic reduction being
chosen by runtime function `__kmp_determine_reduction_method`.
This patch adds reduction variables to the testcase.

Reviewers: mgorny, Hahnfeld

Differential Revision: https://reviews.llvm.org/D73850
2020-02-04 12:19:10 +01:00
Jon Chesterfield ab9762a9f5 Revert "[nfc][libomptarget] Remove SHARED annotation from local variables"
This reverts commit 0e9374e374.
Revert D73239. It fails some local testing, cause presently unknown
2020-01-27 20:05:17 +00:00
Michał Górny 3c545e4b73 [openmp] Disable archer if LIBOMP_OMPT_SUPPORT is off
This fixed build failures due to missing ompt headers.

See https://bugs.gentoo.org/700762.

Differential Revision: https://reviews.llvm.org/D73249
2020-01-23 19:26:18 +01:00
Kelvin Li ad24cf2a94 [OpenMP] change omp_atk_* and omp_atv_* enumerators to lowercase [NFC]
The OpenMP spec defines the OMP_ATK_* and OMP_ATV_* to be lowercase.

Differential Revision: https://reviews.llvm.org/D73248
2020-01-23 11:15:44 -05:00
Jon Chesterfield 0e9374e374 [nfc][libomptarget] Remove SHARED annotation from local variables
Summary:
[nfc][libomptarget] Remove SHARED annotation from local variables

A few local variables in reduction.cu were marked SHARED. This patch leaves
all per-kernel global state localised in omp_data.cu.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D73239
2020-01-23 00:00:23 +00:00
Alexey Bataev 9148b8b734 [OpenMP][Offloading] Fix the issue that omp_get_num_devices returns wrong number of devices, by Shiley Tian.
Summary:
This patch is to fix issue in the following simple case:

  #include <omp.h>
  #include <stdio.h>

  int main(int argc, char *argv[]) {
    int num = omp_get_num_devices();
    printf("%d\n", num);

    return 0;
  }

Currently it returns 0 even if devices exist. Since this file doesn't contain any
target region, the host entry is empty, so further actions like initialization
do not take place, leading to the wrong device number being returned by the
runtime function call.

Reviewers: jdoerfert, ABataev, protze.joachim

Reviewed By: ABataev

Subscribers: protze.joachim

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72576
2020-01-21 13:25:18 -05:00
David Carlier ea99c09963 [OpenMP] affinity little fix for FreeBSD
- pthread affinity np has different semantics than its sched affinity counterpart: on success it returns strictly 0.

Reviewers: chandlerc, AndreyChurbanov, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72132
2020-01-20 18:52:10 +00:00
Jon Chesterfield 03c2a59cd6 [libomptarget] Implement smid for amdgcn
Summary:
[libomptarget] Implement smid for amdgcn

Implementation is in a new file as it uses an intrinsic with
complicated encoding that warranted substantial comments.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72956
2020-01-20 14:52:17 +00:00
Joachim Protze 39f746d8de [OpenMP][Tool] Fix memory leak and double-allocation
Fix the memory leak pointed out in https://reviews.llvm.org/D70412.
And a second one due to double-allocation.

Reviewed by: Hahnfeld

Differential revision: https://reviews.llvm.org/D72779
2020-01-16 10:05:06 -10:00
George Rokos e244145ab0 [LIBOMPTARGET] Do not increment/decrement the refcount for "declare target" objects
The reference counter for global objects marked with declare target is INF. This patch prevents the runtime from incrementing/decrementing INF refcounts. Without it, the map(delete: global_object) directive actually deallocates the global on the device. With this patch, such a directive becomes a no-op.
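
The behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the libomptarget code: `MappingSketch`, `kInfRefCount`, `retain`, and `release` are hypothetical names, and representing INF as the maximum 64-bit value is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical sentinel standing in for the "INF" reference count.
constexpr uint64_t kInfRefCount = std::numeric_limits<uint64_t>::max();

// Hypothetical host-to-device mapping record.
struct MappingSketch {
  uint64_t refCount;
};

// Increment the count unless it is infinite.
inline void retain(MappingSketch &m) {
  if (m.refCount != kInfRefCount)
    ++m.refCount;
}

// Decrement the count (or force it to zero for a delete map type) unless it
// is infinite. Returns true when the device copy should be deallocated.
inline bool release(MappingSketch &m, bool forceDelete = false) {
  if (m.refCount == kInfRefCount)
    return false; // delete on a declare target global becomes a no-op
  assert(m.refCount > 0 && "release without a matching retain");
  m.refCount = forceDelete ? 0 : m.refCount - 1;
  return m.refCount == 0;
}
```

With this shape, a forced delete on an entry whose count is `kInfRefCount` leaves the entry untouched, while ordinary entries are still freed once their count reaches zero.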

Differential Revision: https://reviews.llvm.org/D72525
2020-01-14 16:30:38 -08:00
Joachim Protze 2d4571bf30 [OpenMP][Tool] Runtime warning for missing TSan-option
TSan spuriously reports for any OpenMP application a race on the initialization
of a runtime internal mutex:

```
Atomic read of size 1 at 0x7b6800005940 by thread T4:
  #0 pthread_mutex_lock <null> (a.out+0x43f39e)
  #1 __kmp_resume_64 <null> (libomp.so.5+0x84db4)

Previous write of size 1 at 0x7b6800005940 by thread T7:
  #0 pthread_mutex_init <null> (a.out+0x424793)
  #1 __kmp_suspend_initialize_thread <null> (libomp.so.5+0x8422e)
```

According to @AndreyChurbanov this is a false positive report, as the control
flow of the runtime guarantees the ordering of the mutex initialization and
the lock:
https://software.intel.com/en-us/forums/intel-open-source-openmp-runtime-library/topic/530363

To suppress this report, I suggest the use of
TSAN_OPTIONS='ignore_uninstrumented_modules=1'.
With this patch, a runtime warning is provided in case an OpenMP application
is built with TSan and executed without this TSan option.
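
A check of this kind could look roughly like the sketch below. It is not the Archer implementation: the function names and the warning text are hypothetical, and a plain substring search is assumed to be good enough for detecting the option.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Returns true if the given TSAN_OPTIONS value enables the option.
// A null pointer means the environment variable is not set.
inline bool hasIgnoreUninstrumented(const char *tsanOptions) {
  return tsanOptions != nullptr &&
         std::strstr(tsanOptions, "ignore_uninstrumented_modules=1") != nullptr;
}

// Hypothetical startup hook: prints a warning to stderr when the option is
// missing. Returns true if the warning was printed.
inline bool warnIfOptionMissing() {
  if (hasIgnoreUninstrumented(std::getenv("TSAN_OPTIONS")))
    return false;
  std::fprintf(stderr,
               "Warning: set TSAN_OPTIONS='ignore_uninstrumented_modules=1' "
               "to avoid false positive reports from the OpenMP runtime.\n");
  return true;
}
```

Splitting the string test from the environment lookup keeps the predicate easy to exercise without touching the process environment.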

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70412
2020-01-14 09:58:05 -10:00
Jon Chesterfield 2a43688a0a [nfc][libomptarget] Refactor nvptx/target_impl.cu
Summary:
[nfc][libomptarget] Refactor nvptx/target_impl.cu

Use __kmpc_impl_atomic_add instead of atomicAdd to match the rest of the file.
Alternatively, target_impl.cu could use the cuda functions directly. Using a mixture in this
file was an oversight, happy to resolve in either direction.

Removed some comments that look outdated.

Call __kmpc_impl_unset_lock directly to avoid a redundant diagnostic and remove an implicit
dependency on interface.h.

Reviewers: ABataev, grokos, jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72719
2020-01-14 19:27:45 +00:00
Jon Chesterfield 2d287bec3c [nfc][libomptarget] Refactor amdgcn target_impl
Summary:
[nfc][libomptarget] Refactor amdgcn target_impl

Removes references to internal libraries from the header
Standardises on C++ mangling for all the target_impl functions
Update comment block
clang-format
Move some functions into a new target_impl.hip source file

This lays the groundwork for implementing the remaining unresolved
symbols in the target_impl.hip source.

Reviewers: jdoerfert, grokos, ABataev, ronlieb

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72712
2020-01-14 19:27:07 +00:00
Joachim Protze ed810da732 [OpenMP][Tool] Improving stack trace for Archer
The OpenMP runtime is not instrumented, so entering the runtime leaves no hint
on the source line of the pragma on ThreadSanitizer's function stack.

This patch adds function entry/exit annotations for OpenMP parallel regions,
and synchronization regions (barrier, taskwait, taskgroup).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D70408
2020-01-13 22:14:06 -10:00
Joachim Protze 84637408f2 [OpenMP][Tool] Make tests for archer dependent on TSan
If the openmp project is built standalone, the test compiler is feature tested for an available -fsanitize=thread flag.
If the openmp project is built as part of llvm, the target tsan is needed to test archer.

An additional line (requires tsan) was introduced to the tests; this patch updates the line numbers for the race.

Follow-up for 77ad98c

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D71914
2020-01-13 21:47:58 -10:00
Alexey Bataev b19c0810e5 [LIBOMPTARGET]Ignore empty target descriptors.
Summary:
If the dynamically loaded module has been compiled with -fopenmp-targets
and has no target regions, it has empty target descriptor. It leads to a
crash at the runtime if another module has at least one target region
and at least one entry in its descriptor. The runtime library is unable
to load the empty binary descriptor and terminates the execution.
Caused by a clang-offload-wrapper.
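
The idea of skipping an empty descriptor during registration can be sketched as below. The struct layout and every name here (`OffloadEntrySketch`, `BinaryDescSketch`, `registerLib`) are hypothetical and only loosely modelled on a begin/end entry range; the actual patch may place the check elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical offload entry: one per target region or global.
struct OffloadEntrySketch {
  const char *name;
};

// Hypothetical binary descriptor: a half-open range of host entries.
struct BinaryDescSketch {
  const OffloadEntrySketch *entriesBegin;
  const OffloadEntrySketch *entriesEnd;
};

// A descriptor with no entries comes from a module without target regions.
inline bool isEmpty(const BinaryDescSketch &desc) {
  return desc.entriesBegin == desc.entriesEnd;
}

// Records the entries of a descriptor in the table and returns how many were
// added. Empty descriptors are ignored instead of being treated as an error.
inline std::size_t registerLib(std::vector<const OffloadEntrySketch *> &table,
                               const BinaryDescSketch &desc) {
  if (isEmpty(desc))
    return 0;
  std::size_t added = 0;
  for (const OffloadEntrySketch *e = desc.entriesBegin; e != desc.entriesEnd;
       ++e, ++added)
    table.push_back(e);
  return added;
}
```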

Reviewers: grokos, jdoerfert

Subscribers: caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72472
2020-01-10 09:45:27 -05:00
Kazuaki Ishizaki 4c6a098ad5 [OpenMP] NFC: Fix trivial typos in comments
Reviewers: jdoerfert, Jim

Reviewed By: Jim

Subscribers: Jim, mgorny, guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72285
2020-01-07 14:05:03 +08:00
Kelvin Li 19433b199d [OpenMP] Fix incorrect property of __has_attribute() macro
__has_attribute(fallthough) -> __has_attribute(fallthrough)

Submitted by: kiszk (Kazuaki Ishizaki <ishizaki@jp.ibm.com>)

Differential Revision: https://reviews.llvm.org/D72287
2020-01-06 15:00:10 -05:00
Kelvin Li ed5fe64581 [OpenMP] NFC: Fix trivial typos in comments
Submitted by: kiszk

Differential Revision: https://reviews.llvm.org/D72171
2020-01-03 22:03:42 -05:00
Jon Chesterfield bc48af8c57 [libomptarget][nfc] Change unintentional target_impl prefix to kmpc_impl 2019-12-30 20:50:23 +00:00
protze@itc.rwth-aachen.de 3356e268f6 [OpenMP] Implementation of OMPT reduction callbacks
Including two tests
These callbacks were added late to the 5.0 specification; an implementation is missing.

Reviewed By: jdoerfert

Differential Review: https://reviews.llvm.org/D70395
2019-12-27 15:30:51 +01:00
Jon Chesterfield 63e2aa5658 [libomptarget][nfc] Provide target_impl malloc/free
Summary:
[libomptarget][nfc] Provide target_impl malloc/free

Sufficient to build support.cu for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71685
2019-12-19 16:54:28 +00:00
JonChesterfield b40822fc14 [libomptarget][nvptx] Fix build, second symbol reordering 2019-12-19 02:02:44 +00:00
Jon Chesterfield 89a2bef27a [libomptarget][nvptx] Fix build, symbol ordering in target_impl.h 2019-12-19 01:50:06 +00:00
JonChesterfield 9aefe5f65e [libomptarget][amdgcn] Correct return type of extern __clock64 to unsigned 2019-12-19 00:11:21 +00:00
Jon Chesterfield 2caeaf2f45 [libomptarget][nfc] Introduce atomic wrapper function
Summary:
[libomptarget][nfc] Introduce atomic wrapper function

Wraps atomic functions in a template prefixed __kmpc_atomic that
dispatches to cuda or hip atomic functions. Intended to be easily extended
to dispatch to OpenCL or C++ atomics for a third target.
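
As an illustration of the "third target" mentioned above, a wrapper layer could forward to C++ atomics on the host. This is a sketch under assumptions: the `sketch::atomic_*` names and signatures are hypothetical and differ from the device runtime's wrappers, and only integral types are considered.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace sketch {

// Adds val to *addr and returns the previous value.
template <typename T> T atomic_add(std::atomic<T> *addr, T val) {
  return addr->fetch_add(val);
}

// Stores val in *addr and returns the previous value.
template <typename T> T atomic_exchange(std::atomic<T> *addr, T val) {
  return addr->exchange(val);
}

// Raises *addr to val if val is larger; returns the value observed before.
template <typename T> T atomic_max(std::atomic<T> *addr, T val) {
  T old = addr->load();
  // compare_exchange_weak refreshes old on failure, so the loop re-tests it.
  while (old < val && !addr->compare_exchange_weak(old, val)) {
  }
  return old;
}

} // namespace sketch
```

Call sites written against such wrappers do not need to change when the backing implementation is swapped for a device-specific one.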

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: Anastasia, jvesely, mgrang, dexonsmith, llvm-commits, mgorny, jfb, openmp-commits

Tags: #openmp, #llvm

Differential Revision: https://reviews.llvm.org/D71404
2019-12-18 20:06:17 +00:00
JonChesterfield 8adae6027c [libomptarget][nfc] Extract function from data_sharing, move to common
Summary:
[libomptarget][nfc] Extract function from data_sharing, move to common

Finding the first active thread in the warp is different on nvptx and amdgcn,
mostly due to warp size and the desire for efficiency.
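
To make the warp-size difference concrete, here is a host-side sketch that is generic over the mask width: a 32-bit mask models a 32-lane warp and a 64-bit mask a 64-lane wavefront. The function names are hypothetical, and a real device implementation would more likely use a find-first-set intrinsic than a loop.

```cpp
#include <cassert>
#include <cstdint>

// Returns the index of the lowest set bit in the active-lane mask.
// Mask is uint32_t for a 32-lane warp or uint64_t for a 64-lane wavefront.
template <typename Mask> int firstActiveLane(Mask active) {
  assert(active != 0 && "at least one lane must be active");
  int lane = 0;
  while ((active & Mask{1}) == 0) {
    active >>= 1;
    ++lane;
  }
  return lane;
}

// True if myLane is the first active lane in the mask.
template <typename Mask> bool isFirstActive(Mask active, int myLane) {
  return firstActiveLane(active) == myLane;
}
```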

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71643
2019-12-18 19:39:35 +00:00
Alexey Bataev 15d47deedd [LIBOPENMP][NVPTX]Fix the build error in the runtime. 2019-12-17 14:46:04 -05:00
JonChesterfield 0c83f8ccc7 [libomptarget][nfc] Move three files under common, build them for amdgcn
Summary:
[libomptarget][nfc] Move three files under common, build them for amdgcn

Change to reduction.cu to remove two dead includes, otherwise no code change.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71601
2019-12-17 18:02:49 +00:00
JonChesterfield 3d3e4076cd [libomptarget][nfc] Move omp locks under target_impl
Summary:
[libomptarget][nfc] Move omp locks under target_impl

These are likely to be target specific, even down to the lock_t which is
correspondingly moved out of interface.h. The alternative is to include
interface.h in target_impl which substantially increases the scope of
those symbols.

The current nvptx implementation deadlocks on amdgcn. The preferred
implementation for that arch is still under discussion - this change
leaves declarations in target_impl.

The functions could be inline for nvptx. I'd prefer to keep the internals
hidden in the target_impl translation unit, but will add the (possibly renamed)
macros to target_impl.h if preferred.

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71574
2019-12-17 12:18:57 +00:00
Jon Chesterfield ce12a523b0 [libomptarget][nfc] Move timer functions behind target_impl
Summary: [libomptarget][nfc] Move timer functions behind target_impl

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71584
2019-12-17 02:22:29 +00:00
Jon Chesterfield 53bcd1e141 [libomptarget][nfc] Wrap cuda min() in target_impl
Summary:
[libomptarget][nfc] Wrap cuda min() in target_impl

nvptx forwards to cuda min, amdgcn implements directly.
Sufficient to build parallel.cu for amdgcn, added to CMakeLists.

All call sites are homogenous except one that passes a uint32_t and an
int32_t. This could be smoothed over by taking two type parameters
and some care over the return type, but overall I think the inline
<uint32_t> calling attention to what was an implicit sign conversion
is cleaner.
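
The trade-off described above can be shown with a single-type-parameter wrapper. This is a sketch only: `minSketch` and `clampThreads` are hypothetical names, not the functions in target_impl.

```cpp
#include <cassert>
#include <cstdint>

// Single type parameter: both arguments must have the same type.
template <typename T> constexpr T minSketch(T a, T b) { return a < b ? a : b; }

// Hypothetical call site that mixes an unsigned and a signed argument.
inline uint32_t clampThreads(uint32_t requested, int32_t limit) {
  // minSketch(requested, limit) would not compile because T would be deduced
  // as two different types. Spelling out <uint32_t> makes the conversion of
  // limit to unsigned visible at the call site.
  return minSketch<uint32_t>(requested, limit);
}
```

Note that a negative `limit` converts to a large unsigned value, so `clampThreads(5, -1)` yields 5; the explicit template argument draws attention to exactly that conversion.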

Reviewers: ABataev, jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71580
2019-12-17 01:30:04 +00:00
JonChesterfield 69fcc6ecc1 Revert "Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn""
Summary:
This reverts commit dd8a7fcdd7.

Alexey reports undefined symbols for the new inline functions defined in target_impl.h
This does not reproduce for me for nvptx, or amdgcn, under release or debug builds.

I believe the patch is fine, based on:
 - the semantics of an inline function in C++ (the cuda INLINE functions end
   up as linkonce_odr in IR), which are only legal to drop if they have no uses
 - the code generated from a debug build of clang 9 does not show these undef symbols
 - the tests pass
 - the code is trivial

To progress from here I either need:
 - A tie break - someone to play the role of CI in determining whether the patch works
 - Alexey to provide sufficient information about his build for me to reproduce the failure
 - Alexey to debug why the symbols are disappearing for him and report back

Reviewers: ABataev, jdoerfert, grokos

Subscribers: jvesely, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71502
2019-12-16 16:16:14 +00:00
Alexey Bataev dd8a7fcdd7 Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn"
This reverts commit dbb3fec8ad since it
breaks the NVPTX tests.
2019-12-13 16:36:06 -05:00
Jon Chesterfield 40d72134fd [libomptarget] Build most of common/src for amdgcn
Summary:
[libomptarget] Build most of common/src for amdgcn

Excluding parallel.cu, which uses an integer min() from cuda,
Excluding support.cu, which calls malloc that is not yet available for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: gregrodgers, ronlieb, jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71446
2019-12-13 17:48:19 +00:00
Jon Chesterfield 56adcebfda [libomptarget][nfc] Add nop syncwarp function for amdgcn 2019-12-13 14:27:52 +00:00
Jon Chesterfield 479868646a [libomptarget][nfc] Add declarations of atomic functions for amdgcn
Summary:
[libomptarget][nfc] Add declarations of atomic functions for amdgcn

This enables building more source for amdgcn. The functions are usually available
in a hip runtime header, but are duplicated here to decouple the implementation

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71412
2019-12-12 22:56:14 +00:00
Jon Chesterfield dbb3fec8ad [libomptarget] Move resource id functions into target specific code, implement for amdgcn
Summary: [libomptarget] Move resource id functions into target specific code, implement for amdgcn

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71382
2019-12-12 22:49:02 +00:00
Jon Chesterfield b399252028 [libomptarget][nfc] Add missing header for amdgcn/target_impl 2019-12-12 09:36:57 +00:00
David Carlier 27535a1449 [OpenMP] Fix linkage issue on FreeBSD
needs kmp_set_thread_affinity_mask_initial implementation.
2019-12-06 15:47:50 +00:00
JonChesterfield 0dd62c5c2e [libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl
Summary:
[libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl

Part of building code under common/ without requiring a cuda compiler

Reviewers: ABataev, jdoerfert, grokos

Reviewed By: ABataev

Subscribers: jvesely, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D71102
2019-12-06 15:41:18 +00:00