Commit Graph

1154 Commits

Author SHA1 Message Date
Jon Chesterfield 327aa81123 [libomptarget] Refactor shfl_down_sync macro to inline function
Summary:
[libomptarget] Refactor shfl_down_sync macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66853

llvm-svn: 370146
2019-08-28 01:47:41 +00:00
Jon Chesterfield b9b712df82 [libomptarget] Refactor shfl_sync macro to inline function
Summary:
[libomptarget] Refactor shfl_sync macro to inline function
See also abandoned D66846, split into this diff and others.

Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66852

llvm-svn: 370144
2019-08-28 01:31:04 +00:00
Alexey Bataev da8b5cc9f1 [OPENMP][NVPTX]Add __kmpc_syncwarp(int32_t) function.
Summary:
Added function void __kmpc_syncwarp(int32_t) to expose it to the
compiler. It is required to fix the problem with the critical regions in
Cuda9.0+. We cannot use barrier in the critical region, but still need
to reconverge the threads in the warp after. This function allows to do
this.

Reviewers: grokos, jdoerfert

Subscribers: guansong, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D66672

llvm-svn: 369933
2019-08-26 17:32:45 +00:00
Alexey Bataev 0366168f3a [OPENMP][NVPTX]Use __syncwarp() to reconverge the threads.
Summary:
In Cuda 9.0 it is not guaranteed that threads in the warps are
convergent. We need to use __syncwarp() function to reconverge
the threads and to guarantee the memory ordering among threads in the
warps.
This is the first patch to fix the problem with the test
libomptarget/deviceRTLs/nvptx/src/sync.cu on Cuda9+.
This patch just replaces calls to __shfl_sync() function with the call
of __syncwarp() function where we need to reconverge the threads when we
try to modify the value of the parallel level counter.

Reviewers: grokos

Subscribers: guansong, jfb, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65013

llvm-svn: 369796
2019-08-23 18:34:48 +00:00
Jonathan Peyton 57ae6b8e37 Force honoring nthreads-var and thread-limit-var inside teams construct on host
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=42906, via adding
adjustment of number of threads on enter to the teams construct on host
according to user settings. This allows to pass checks and avoid assertions
at time of team of threads creation.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D66351

llvm-svn: 369430
2019-08-20 19:39:17 +00:00
Jonas Hahnfeld d2ae0c4f44 [OpenMP] Enable warning about "implicit fallthrough"
Fix last warned location in ittnotify_static.cpp using the defined
macro KMP_FALLTHROUGH().

Differential Revision: https://reviews.llvm.org/D65871

llvm-svn: 369003
2019-08-15 13:26:55 +00:00
Jonas Hahnfeld 4d77e50e6e [OpenMP] Remove 'unnecessary parentheses'
The variables in kmp_lock.cpp are really arrays of function pointers
that return void or int, not pointers to functions that return void*
or int*. The other changes are only cosmetic.

Differential Revision: https://reviews.llvm.org/D65870

llvm-svn: 369002
2019-08-15 13:26:41 +00:00
Jonas Hahnfeld fb72a03f85 [OMPT] Resolve warnings because of ints in if conditions
The implementation status can only be one of
ompt_event_UNIMPLEMENTED = ompt_set_never = 1
ompt_event_MAY_ALWAYS = ompt_set_always = 5

In both cases, the condition was already true, so just remove
the check.

Differential Revision: https://reviews.llvm.org/D65869

llvm-svn: 369001
2019-08-15 13:26:29 +00:00
Jonas Hahnfeld dc23c832f4 [OpenMP] Turn on -Wall compiler warnings by default
Instead, maintain a list of disabled options to still build libomp and
libomptarget without warnings. This includes -Wno-error and -Wno-pedantic
to silence warnings that LLVM enables when building in-tree.

I tested the following compilers:
 * Clang 6.0, 7.0, 8.0
 * GCC 4.8.5 (CentOS 7), GCC 6, 7, 8, 9
 * Intel Compiler 16, 17, 18, 19

RFC thread on openmp-dev mailing list:
http://lists.llvm.org/pipermail/openmp-dev/2019-August/002668.html

Differential Revision: https://reviews.llvm.org/D65867

llvm-svn: 368999
2019-08-15 13:11:50 +00:00
Jon Chesterfield ed3324f6b6 Factor architecture dependent code out of loop.cu
Summary:
[libomptarget] Factor architecture dependent code out of loop.cu

Related to the patch series starting D64217. Added subscribers to said series as reviewers. This effort is smaller in scope.

This patch factors out just enough architecture dependent code from loop.cu to allow the same source to be used with amdgcn, given a different target_impl.h. Testing is that the same bitcode (modulo variable names) is generated for libomptarget before and after the refactor, for nvptx and the out of tree amdgcn.

Reviewers: jdoerfert, ABataev, bollu, jfb, tra, grokos, Hahnfeld, guansong, xtian, gregrodgers, ronlieb, hfinkel, gtbercea, guraypp, arpith-jacob

Reviewed By: jdoerfert, ABataev

Subscribers: dexonsmith, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65836

llvm-svn: 368751
2019-08-13 21:41:47 +00:00
Andrey Churbanov 5eec1a9d32 Cleanup unused variable.
This patch fixes problem raised in post-review comments of the
https://reviews.llvm.org/D65285. Developers of ittnotify confirmed
that dll_path_ptr field of the __itt_global structure is never used
by ittnotify library, so it is safe to remove the dll_path array.

Differential Revision: https://reviews.llvm.org/D65885

llvm-svn: 368559
2019-08-12 12:37:30 +00:00
Gheorghe-Teodor Bercea 6c7b882e52 [OpenMP][libomptarget] Add support for close map modifier
Summary:
This patch adds support for the close map modifier.

The close map modifier will overwrite the unified shared memory requirement and create a device copy of the data.

Reviewers: ABataev, Hahnfeld, caomhin, grokos, jdoerfert, AlexEichenberger

Reviewed By: Hahnfeld, AlexEichenberger

Subscribers: guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65340

llvm-svn: 368488
2019-08-09 21:32:57 +00:00
Jonas Hahnfeld 7a0f2dc5a4 [libomptarget] Remove duplicate RTLRequiresFlags per device
We have one global RTLs.RequiresFlags, I don't see a need to make a
copy per device that the runtime manages. This was problematic anyway
because the copy happened during the first __tgt_register_lib(). This
made it impossible to call __tgt_register_requires() from normal user
funtions for testing.
Hence, this change also fixes unified_shared_memory/shared_update.c for
older versions of Clang that don't call __tgt_register_requires() before
__tgt_register_lib().

Differential Revision: https://reviews.llvm.org/D66019

llvm-svn: 368465
2019-08-09 19:20:39 +00:00
Gheorghe-Teodor Bercea a1d20506e7 [OpenMP][libomptarget] Add support for unified memory for regular maps
Summary:
This patch adds support for using unified memory in the case of regular maps that happen when a target region is offloaded to the device.

For cases where only a single version of the data is required then the host address can be used. When variables need to be privatized in any way or globalized, then the copy to the device is still required for correctness.

Reviewers: ABataev, jdoerfert, Hahnfeld, AlexEichenberger, caomhin, grokos

Reviewed By: Hahnfeld

Subscribers: mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65001

llvm-svn: 368192
2019-08-07 17:29:45 +00:00
Jon Chesterfield ae0178bee7 Use forceinline. Necessary for nvcc to inline small functions within the bitcode library
Summary:
[libomptarget] Use forceinline. Necessary for nvcc to inline small functions within the bitcode library
Suggested in D65836

Reviewers: ABataev, jdoerfert, grokos, gregrodgers

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D65876

llvm-svn: 368177
2019-08-07 15:24:12 +00:00
Alexey Bataev c10180ed8e [OPENMP][OFFLOADING]Fix the test, NFC.
llvm-svn: 368068
2019-08-06 18:13:39 +00:00
Jonathan Peyton 73d5abd809 [OpenMP] Add support for GOMP_*_nonmonotonic_* functions
Patch by Isuru Fernando

Differential Revision: https://reviews.llvm.org/D65714

llvm-svn: 367949
2019-08-05 23:23:52 +00:00
Hansang Bae dcdbe6515b [OpenMP] Fix broken build due to new OMPT tests
New OMPT tests with teams construct should be disabled for GCC as it
emits code with a GOMP entry not supported in the LLVM runtime.

Differential Revision: https://reviews.llvm.org/D65757

llvm-svn: 367939
2019-08-05 21:46:13 +00:00
Michael Kruse 78769ec403 [libomptarget] Harmonize emitting CUDA errors and general debug messages.
Ensures that CUDA fail reasons (such as "No CUDA-capable device detected")
are printed together with libomptarget's debug message
(e.g. "Error when setting CUDA context"). Previously, the former was
printed only in CMAKE_BUILD_TYPE=Debug builds while the latter was
enabled by LIBOMPTARGET_ENABLE_DEBUG.

With this change, also only call cuGetErrorString when the error will be
printed.

Suggested-by: Ye Luo <xw111luoye@gmail.com>

Differential Revision: https://reviews.llvm.org/D65687

llvm-svn: 367910
2019-08-05 19:12:10 +00:00
Michael Kruse 2c7a8eaf3d [OpenMP 5.0] libomptarget interface for declare mapper functions.
This patch implements the libomptarget runtime interface for OpenMP 5.0
declare mapper functions. The declare mapper functions generated by
Clang will call them to complete the mapping of members.
kmpc_mapper_num_components gets the current number of components for a
user-defined mapper; kmpc_push_mapper_component pushes back one
component for a user-defined mapper.

The design slides can be found at
https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx

Patch by Lingda Li <lildmh@gmail.com>

Differential Revision: https://reviews.llvm.org/D60972

llvm-svn: 367772
2019-08-04 04:18:28 +00:00
Hansang Bae 67e93a1ae0 Add OMPT support for teams construct
This change adds OMPT support for events from teams construct.

Differential Revision: https://reviews.llvm.org/D64025

llvm-svn: 367746
2019-08-03 02:38:53 +00:00
Jonas Hahnfeld 52b87ac32f [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS
All other files are already C++ and the build system has always
passed '-x c++' for C files, effectively compiling them as C++.

To stay warning free we need one fix in ittnotify_static.{c,cpp}:
The variable dll_path can be written to, so it must not be const.
GCC complained with -Wcast-qual and I think it's right.

Differential Revision: https://reviews.llvm.org/D65285

llvm-svn: 367343
2019-07-30 18:37:28 +00:00
Yi Kong 3d21a3af87 [openmp] Workaround bug in old Android pthread_attr_setstacksize
Round the stack size to a multiple of the page size. Older versions of
Android (until KitKat) would fail pthread_attr_setstacksize with
EINVAL if the stack size was not a multiple of the page size.

Patch by Dan Albert <danalbert@google.com>.

Test: Build, copied into the NDK, passed openmp test on ICS.
Bug: https://github.com/android-ndk/ndk/issues/9
llvm-svn: 367070
2019-07-25 22:29:55 +00:00
Jonas Hahnfeld baeab1fc44 [OpenMP] Fix build of stubs library, NFC.
Both Clang and GCC complained that they cannot initialize a return
object of type 'kmp_proc_bind_t' with an 'int'. While at it, also
fix a warning about missing parentheses thrown by Clang.

Differential Revision: https://reviews.llvm.org/D65284

llvm-svn: 367041
2019-07-25 17:51:24 +00:00
Alexey Bataev ca424d100c [OPENMP][NVPTX]Perform memory flush if number of threads to sync is 1 or less.
Summary:
According to the OpenMP standard, barrier operation must perform
implicit flush operation. Currently, if there is only one thread in the
team, barrier does not flush the memory. Patch fixes this problem.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62398

llvm-svn: 367024
2019-07-25 15:02:28 +00:00
Jonas Hahnfeld 2488ae9df1 [OpenMP] RISCV64 port
This is a port of libomp for the RISC-V 64-bit Linux target.

We have tested this port on a HiFive Unleashed development board
using a downstream LLVM that has support for the missing bits in
upstream. As of now, all tests are passing, including OMPT.

Patch by Ferran Pallarès!

Differential Revision: https://reviews.llvm.org/D59880

llvm-svn: 367021
2019-07-25 14:36:20 +00:00
Jonas Hahnfeld 6e40ae8f3d [libomptarget] Handle offload policy in push_tripcount
If the first target region in a program calls the push_tripcount
function, libomptarget didn't handle the offload policy correctly.
This could lead to unexpected error messages as seen in
http://lists.llvm.org/pipermail/openmp-dev/2019-June/002561.html

To solve this, add a check calling IsOffloadDisabled() as all other
entry points already do. If this method returns false, libomptarget
is effectively disabled.

Differential Revision: https://reviews.llvm.org/D64626

llvm-svn: 366810
2019-07-23 14:20:48 +00:00
Jonas Hahnfeld a2748c74d6 [OMPT] Cleanup reset of exit_frame pointer
This is done at call-site and does not need to be handled in
__kmp_invoke_microtask. It was already absent from the x86
and x86_64 assembly, this patch removes it from the generic
implementation in z_Linux_util.cpp and adds documentation for
AArch64 and PPC64 that it's actually not needed. I can't test
on these architectures, so I don't want to change the code just
because it looks right :)

While at it, rename some variables for consistency and add a
check in test/ompt/parallel/normal.c that the pointer was reset
before entering the barrier.

Differential Revision: https://reviews.llvm.org/D64442

llvm-svn: 366721
2019-07-22 18:46:02 +00:00
Jonas Hahnfeld 4138b2f167 Delete empty file
This is a left-over from r356288 which was reviewed in D58989.

llvm-svn: 366716
2019-07-22 18:11:06 +00:00
Alexey Bataev da43861b4a [OpenMP][libomptarget] Suppress C++ 11 related warnings when building libomptarget-nvptx bitcode library, by Doru Bercea.
Summary: Pass -std=c++11 flag to compiler to suppress C++ 11 related warnings when building NVPTX bitcode library.

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: jdoerfert, Hahnfeld, jholewinski, mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55772

llvm-svn: 366438
2019-07-18 13:54:01 +00:00
Ron Lieberman 59532488b1 [OPENMP] Resolve lost LoopTripCnt for subsequent loops in same thread.
Remove loopTripCnt from threaded device stack after consuming it.
Added a libomptarget DP message to aid in future debugging and to
validate the added testcase, which only runs in Debug build.

Differential Revision: https://reviews.llvm.org/D64808

llvm-svn: 366349
2019-07-17 17:07:52 +00:00
Jonathan Peyton aa5cdafa40 Remove REQUIRES OMP spec version within lit tests
This is a follow up patch to D64534 (r365963) which removed all OMP
spec versioning within the OpenMP runtime codebase.  This patch removes
REQUIRES: openmp-x.y lines from lit tests.

llvm-svn: 366341
2019-07-17 15:41:00 +00:00
Jonas Hahnfeld 1ff5535785 [OpenMP] Move header inclusion out of 'extern "C"'
This leads to problems when compiling C++ code with libc++ for Nvidia GPUs
because Clang now uses wrappers for math functions that might include
C++ templates not allowed in 'extern "C"'.

Differentiel Revision: https://reviews.llvm.org/D64625

llvm-svn: 366229
2019-07-16 17:16:43 +00:00
Alexey Bataev 85b9651edd [OPENMP][NVPTX]Fixed checks for cuda versions.
Summary:
We used CUDART_VERSION macro to check for the installed cuda version
but this macro is defined in cuda_runtime_api.h, which is not used by
project. Better to use CUDA_VERSION macro, which is defined in cuda.h.
Also, added the check if this macro is defined. If macro is undefined,
there is something wrong with the cuda configuration and we should not
continue the compilation.
This also fixes problems with runtime building in cuda 10+.

Reviewers: grokos

Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64648

llvm-svn: 366224
2019-07-16 16:07:10 +00:00
Alexey Bataev 42816107f7 [OPENMP]Fix threadid in __kmpc_omp_taskwait call for dependent target calls.
Summary:
We used to call __kmpc_omp_taskwait function with global threadid set to
0. It may crash the application at the runtime if the thread executing
 target region is not a master thread.

Reviewers: grokos, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64571

llvm-svn: 366220
2019-07-16 15:51:32 +00:00
Jonathan Peyton e4b4f994d2 [OpenMP] Remove OMP spec versioning
Remove all older OMP spec versioning from the runtime and build system.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D64534

llvm-svn: 365963
2019-07-12 21:45:36 +00:00
Jonas Hahnfeld aca476b296 [libomptarget] Fix typos and grammar in error messages, NFC.
llvm-svn: 365890
2019-07-12 10:21:55 +00:00
Jonas Hahnfeld 2dfc5179f6 [libomptarget-nvptx] Remove dead functions
These entry points are never called by Clang trunk nor clang-ykt. If
XL doesn't use them either, they can finally go away.

Differential Revision: https://reviews.llvm.org/D52700

llvm-svn: 365817
2019-07-11 20:12:51 +00:00
Andrey Churbanov 28f44040cc NFC: fixed typo #ifdef --> #if to allow macro set to 0 work correctly
llvm-svn: 365642
2019-07-10 15:09:37 +00:00
Alexey Bataev 4ad9286a57 [OPENMP]Rename loopTripCnt member data to LoopTripCnt, NFC.
Rename variable to follow LLVM coding standard.

llvm-svn: 365368
2019-07-08 18:45:48 +00:00
Alexey Bataev 060921dee7 [OPENMP]Make __kmpc_push_tripcount thread safe.
Summary:
__kmpc_push_tripcount function is not thread safe and may lead to data
race when the target regions are executed in parallel threads. The patch
makes loopTripCnt counter thread aware and stores the tripcount value
per thread in the map. Access to map is guarded by mutex to prevent
data race in the map itself.
Test is for NVPTX target because it does not work correctly on the
host. Seems to me, there is a problem in libomp with target regions in
the parallel threads.

Reviewers: grokos

Subscribers: guansong, jfb, jdoerfert, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D64080

llvm-svn: 365332
2019-07-08 15:30:23 +00:00
Andrey Churbanov a23806e67a Create a runtime option to disable task throttling.
Patch by viroulep (Philippe Virouleau)

Differential Revision: https://reviews.llvm.org/D63196

llvm-svn: 364934
2019-07-02 15:10:20 +00:00
Andrey Churbanov e7b2c64a6e Cleanup of unused code
Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D63891

llvm-svn: 364925
2019-07-02 13:45:40 +00:00
Alexey Bataev bb55ece269 [OPENMP][NVPTX]Relax flush directive.
Summary:
According to the OpenMP standard, flush  makes a thread’s temporary view of memory consistent with memory and enforces an order on the memory operations of the variables explicitly specified or implied.

According to the Cuda toolkit documentation (https://docs.nvidia.com/cuda/archive/8.0/cuda-c-programming-guide/index.html#memory-fence-functions), __threadfence() functions provides required functionality.

__threadfence_system() also provides required functionality, but it also
includes some extra functionality, like synchronization of page-locked
host memory, synchronization for the host, etc. It is not required per
the standard and we can use more relaxed version of memory fence
operation.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62397

llvm-svn: 364572
2019-06-27 18:33:09 +00:00
Andrey Churbanov b7e6c37efe Fixed memory use-after-free problem.
Bug reported in https://bugs.llvm.org/show_bug.cgi?id=42269.
Freeing of the contention group (CG) stucture by master thread looks wrong,
because workers can leave the CG later on. Intead the freeing
is now done by the last thread leaving the CG.

Differential Revision: https://reviews.llvm.org/D63599

llvm-svn: 364456
2019-06-26 18:11:26 +00:00
Gheorghe-Teodor Bercea aace6d285d [OpenMP][libomptarget] Add support for declare target to clause under unified memory
Summary:
This patch adds support for handling variables under the:

```
#pragma omp declare target to()
```

clause when the 

```
#pragma omp requires unified_shared_memory
```

is used.

The address of the host variable is copied into the device pointer just like for the declare target link case.

Reviewers: ABataev, caomhin, grokos, AlexEichenberger

Reviewed By: grokos

Subscribers: jcownie, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63106

llvm-svn: 363825
2019-06-19 15:48:10 +00:00
Alexey Bataev 8a2bd361eb [OPENMP][CUDA]Use __syncthreads when compiled by nvcc and clang >= 9.0.
Summary:
The problems with __syncthreads() were fixed in clang >= 9.0 and the
original __syncthreads() can be used instead of the ptx instruction.

Reviewers: grokos

Subscribers: guansong, jdoerfert, openmp-commits, kkwli0, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63515

llvm-svn: 363807
2019-06-19 14:20:34 +00:00
Andrey Churbanov 405037c4e6 New implementation of OpenMP 5.0 detached tasks.
Patch by Alex Duran

Differential Revision: https://reviews.llvm.org/D62485

llvm-svn: 363799
2019-06-19 13:23:28 +00:00
Gheorghe-Teodor Bercea b48e44a65c [OpenMP] Add task alloc function
Summary: Add the target task allocation function to the interface.

Reviewers: ABataev, AlexEichenberger, caomhin, jlpeyton, AndreyChurbanov, RaviNarayanaswamy, hbae

Reviewed By: AlexEichenberger, hbae

Subscribers: hbae, RaviNarayanaswamy, cfe-commits, Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp, #clang

Differential Revision: https://reviews.llvm.org/D63010

llvm-svn: 363449
2019-06-14 20:15:15 +00:00
Andrey Churbanov d47f5488cf Added propagation of not big initial stack size of master thread to workers.
Currently implemented only for non-Windows 64-bit platforms.

Differential Revision: https://reviews.llvm.org/D62488

llvm-svn: 362618
2019-06-05 16:14:47 +00:00
Gheorghe-Teodor Bercea c5fe030c16 [OpenMP][libomptarget] Enable usage of unified memory for declare target link variables
Summary: This patch enables the usage of a host variable on the device for declare target link variables when unified memory is available.

Reviewers: ABataev, caomhin, grokos

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60884

llvm-svn: 362505
2019-06-04 15:05:53 +00:00
Andrey Churbanov 3f786dab0e Fixed build warning with -DLIBOMP_USE_HWLOC=1
Made type of depth of hwloc object to correapond with
change from unsigned in hwloc 1,x to int in hwloc 2.x.
This eliminates the warning on signed-unsigned comparison.

Differential Revision: https://reviews.llvm.org/D62332

llvm-svn: 362401
2019-06-03 14:21:59 +00:00
Hansang Bae ec1b4d1f6f Fix OMP_TARGET_OFFLOAD parsing
Current parsing allows trailing string after the permitted value,
MANDATORY|DISABLED|DEFAULT -- e.g., "mandatorynot" is also recognized
as "MANDATORY". Such cases should be recognized as incorrect/unknown
value.

Differential Revision: https://reviews.llvm.org/D62431

llvm-svn: 362125
2019-05-30 18:35:07 +00:00
Hansang Bae 7c75ac0c60 Add checks before pointer dereferencing
This change adds checks before dereferencing a pointer returned from a
function.

Differential Revision: https://reviews.llvm.org/D62224

llvm-svn: 362111
2019-05-30 16:32:20 +00:00
Michal Gorny a815cbb010 [openmp] [test] Skip kernel-breaking tests on NetBSD
The omp_taskloop_num_tasks and omp_taskwait have deadlooped
on the NetBSD buildbot previously, practically hanging the host running
it.  Disable them until we can find a good solution, or make the kernel
less fragile.

llvm-svn: 361825
2019-05-28 14:10:47 +00:00
Alexey Bataev e1947b84c1 Revert "[OPENMP][NVPTX]Fix barriers and parallel level counters, NFC."
This reverts commit r361421 to split the patch into 3 parts.

llvm-svn: 361638
2019-05-24 14:06:47 +00:00
Alexey Bataev 9d9e406684 [OPENMP][NVPTX]Fix barriers and parallel level counters, NFC.
Summary:
Parallel level counter should be volatile to prevent some dangerous
optimiations by the ptxas. Otherwise, ptxas optimizations lead to
undefined behaviour in some cases.
Also, use __threadfence() for #pragma omp flush and if the barrier
should not be used (we have only one thread in the team), still perform
flush operation since the standard requires implicit flush when
executing barriers.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D62199

llvm-svn: 361421
2019-05-22 19:50:32 +00:00
Andrey Churbanov 184ef0a0a6 Fixed third issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.
Removed wrong debug assertion.

Differential Revision: https://reviews.llvm.org/D62251

llvm-svn: 361408
2019-05-22 16:48:05 +00:00
Jonathan Peyton 3057c3a092 [OpenMP] Add implementation to two OMPT API routines
This change adds implementation to ompt_finalize_tool() and
ompt_get_task_memory().

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D61657

llvm-svn: 361309
2019-05-21 20:51:05 +00:00
Gheorghe-Teodor Bercea 9e9c918259 [OpenMP][libomptarget] Enable requires flags for target libraries.
Summary:
Target link variables are currently implemented by creating a copy of the variables on the device side and unified memory never gets exploited.

When the prgram uses the:

```
#pragma omp requires unified_shared_memory
```

directive in conjunction with a declare target link, the linked variable is no longer allocated on the device and the host version is used instead.

This behavior is overridden by performing an explicit mapping.

A Clang side patch is required.

Reviewers: ABataev, AlexEichenberger, grokos, Hahnfeld

Reviewed By: AlexEichenberger, grokos, Hahnfeld

Subscribers: Hahnfeld, jfb, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60223

llvm-svn: 361294
2019-05-21 19:35:02 +00:00
Joachim Protze 4109d5606e [OpenMP][OMPT] Fix locking testcases for 32 bit architectures
https://reviews.llvm.org/D58454 did not fix the problem for a typical use
case of building LLVM with gcc or icc and then testing with the newly built
clang compiler.
The compilers do not agree on how to extend a 32-bit pointer to uint64, so
make the pointer unsigned first, before adjusting the size.

Patch by Joachim Protze

Differential Revision: https://reviews.llvm.org/D58506

llvm-svn: 361158
2019-05-20 14:21:42 +00:00
Joachim Protze 48b8a4b519 [OMPT] Handling of the events of initial-task-begin and initial-task-end
OpenMP 5.0 says that the callback for the events initial-task-begin and
initial-task-end has to be ompt_callback_implicit_task.

Patch by Tim Cramer

Differential Revision: https://reviews.llvm.org/D58776

llvm-svn: 361157
2019-05-20 14:21:36 +00:00
Andrey Churbanov f8f788b205 Fixed second issue reported in https://bugs.llvm.org/show_bug.cgi?id=41584.
Added synchronization for possible concurrent initialization of mutexes
by multiple threads. The need of synchronization caused by commit r357927
which added the use of mutexes at threads movement to/from common pool
(earlier the mutexes were used only at suspend/resume).

Patch by Johnny Peyton.

Differential Revision: https://reviews.llvm.org/D61995

llvm-svn: 360919
2019-05-16 17:52:53 +00:00
Paul Osmialowski 0732fcc7d5 Fix hwloc topology traversal code unable to handle situation where L2 cache is common for the packages
Currently cores within package that share the same L2 cache are grouped together.
The current logic behind this assumes that the L2 cache is always at deeper
(or the same) level than the package itself. In case when L2 cache is common
for all packages (and the packages are at deeper level than L2 cache) the whole of
the further topology discovery fails to find any computational units resulting in
following assertion:

Assertion failure at kmp_affinity.cpp(715): nActiveThreads == __kmp_avail_proc.
OMP: Error #13: Assertion failure at kmp_affinity.cpp(715).

This patch adds a bit of a logic that prevents such situation from occurring.

Differential Revision: https://reviews.llvm.org/D61796

llvm-svn: 360890
2019-05-16 13:16:24 +00:00
Andrey Churbanov 6ebb785bb1 Fixed https://bugs.llvm.org/show_bug.cgi?id=41584.
Removed unconditional and unsafe decrement of counter 
of active threads in pool at shutdown time.

Differential Revision: https://reviews.llvm.org/D61944

llvm-svn: 360784
2019-05-15 16:53:45 +00:00
Andrey Churbanov 22405f3097 Introduce new OpenMP 5.0 depend object type.
The implementation should be done by compiler, user can only declare
objects of this type and use them in OpenMP directives.

Differential Revision: https://reviews.llvm.org/D61860

llvm-svn: 360774
2019-05-15 13:45:36 +00:00
Eli Friedman 025df3b827 [OpenMP][AArch64] Fix compile with LLVM trunk.
The code is currently using the ambiguous instruction
"sub sp, sp, w9, lsl #4". The ARM reference manual says this isn't
valid, and it's not clear whether it's supposed to mean uxtw or uxtx.

It doesn't matter which instruction we use here, since the high
bits of the operand are zero anyway, so I arbitrarily choose uxtw, to
preserve the register name.

See https://reviews.llvm.org/D60840 for the LLVM patch.

Differential Revision: https://reviews.llvm.org/D61770

llvm-svn: 360711
2019-05-14 21:44:54 +00:00
Andrey Churbanov 1aaf2a3c18 fixed typo made by commit r360595
llvm-svn: 360602
2019-05-13 17:04:32 +00:00
Andrey Churbanov 7f63e8c0a6 Fixed creation of aliases in Windows build.
Changed file extension of the destination of the copy of libomp.lib
(it was mistakely .dll, now it is .lib) in installation on Windows.

Differential Revision: https://reviews.llvm.org/D61673

llvm-svn: 360595
2019-05-13 16:07:37 +00:00
Alexey Bataev f9e00db818 [OPENMP][NVPTX]Simplify handling of thread limit, NFC.
Summary:
Patch improves performance of the full runtime mode by moving
threads limit counter to the shared memory. It also allows to save
global memory.

Reviewers: grokos, kkwli0, gtbercea

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61801

llvm-svn: 360584
2019-05-13 14:21:46 +00:00
Alexey Bataev f62c266de7 [OPENMP][NVPTX]Improve number of threads counter, NFC.
Summary:
Patch improves performance of the full runtime mode by moving
number-of-threads counter to the shared memory. It also allows to save
global memory.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61785

llvm-svn: 360457
2019-05-10 18:56:05 +00:00
Jonathan Peyton c107332583 [OpenMP] Workaround gfortran bugzilla build bug 41755
This patch provides workaround to allow gfortran to compile the
OpenMP Fortran modules.

From the gfortran manual:
https://gcc.gnu.org/onlinedocs/gcc-9.1.0/gfortran/BOZ-literal-constants.html

"Note that initializing an INTEGER variable with a statement such as
DATA i/Z'FFFFFFFF'/ will give an integer overflow error rather than the desired
result of -1 when i is a 32-bit integer on a system that supports 64-bit
integers. The -fno-range-check option can be used as a workaround for legacy
code that initializes integers in this manner."

Bug filed: https://bugs.llvm.org/show_bug.cgi?id=41755

Differential Revision: https://reviews.llvm.org/D61603

llvm-svn: 360299
2019-05-08 23:12:31 +00:00
Dimitry Andric 181aff63fb Add non-SSE wrapper for __kmp_{load,store}_mxcsr
Summary:
To be able to successfully build OpenMP on FreeBSD/i386, which still
uses i486 as its default processor, I had to provide wrappers for the
`__kmp_load_mxcsr` and `__kmp_store_mxcsr` functions.

If the compiler signals that SSE is not available, loading and storing
mxcsr does not make sense anway, so in that case the inline functions
are empty.  This gives the minimum amount of code churn.

See also https://svnweb.freebsd.org/changeset/base/345283

Reviewers: emaste, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: hfinkel, krytarowski, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60916

llvm-svn: 360062
2019-05-06 17:58:03 +00:00
Alexey Bataev a857e31011 [OPENMP][NVPTX]Improve thread limit counter, NFC.
Summary:
Patch improves performance of the full runtime mode by moving
thread-limit counter to the shared memory. It also allows to save
global memory.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61526

llvm-svn: 359922
2019-05-03 20:00:38 +00:00
Alexey Bataev e031e17919 [OPENMP][NVPTX]Improved several standard OpenMP functions, NFC.
Summary:
Used parallelLevel[] counter to simplify and improve implementation of
the existing standard OpenMP functions. Functions are tested already in
several tests, the patch is NFC.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61459

llvm-svn: 359892
2019-05-03 14:47:20 +00:00
Alexey Bataev 8ccb8f8647 [OPENMP][NVPTX]Improve code by using parallel level counter.
Summary:
Previously for the different purposes we need to get the active/common
parallel level and with full runtime we iterated over all the records to
calculate this level. Instead, we can used the warp-based parallel level
counters used in no-runtime mode.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61395

llvm-svn: 359822
2019-05-02 20:05:01 +00:00
Alexey Bataev 4ad6dbc5fd [OPENMP][NVPTX]Improve omp_get_max_threads() function.
Summary:
Function omp_get_max_threads() can always return 1 if current execution
mode is SPMD.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, caomhin, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61379

llvm-svn: 359792
2019-05-02 14:52:52 +00:00
Alexey Bataev 8e6bf88cf7 [OPENMP][NVPTX]Improved omp_get_thread_limit() function.
Summary:
Function omp_get_thread_limit() in SPMD mode can return the maximum
available number of threads as a result.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D61378

llvm-svn: 359790
2019-05-02 14:46:32 +00:00
Dimitry Andric 147ce2334c Enable OpenMP build for 32-bit FreeBSD
Summary:
To be able to successfully build OpenMP on 32-bit FreeBSD, such as
FreeBSD/i386, I first had to provide a few wrappers (see D60916), and
then add `KMP_OS_FREEBSD` to the list of defines checked for 32-bit
architectures in `kmp_runtime.cpp`.

I have successfully built libomp.so and ran a bunch of test programs on
FreeBSD/i386 with this.

See also https://svnweb.freebsd.org/changeset/base/345283

Reviewers: emaste, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: krytarowski, guansong, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60917

llvm-svn: 359716
2019-05-01 19:32:58 +00:00
Jonathan Peyton a8426ac8c2 [OpenMP] Implement task modifier for reduction clause
Implemented task modifier in two versions - one without taking into account
omp_orig variable (the omp_orig still can be processed by compiler without help
of the library, but each reduction object will need separate initializer with
global access to omp_orig), another with omp_orig variable included into
interface (single initializer can be used for multiple reduction objects of
the same type). Second version can be used when the omp_orig is not globally
accessible, or to optimize code in case of multiple reduction objects
of the same type.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D60976

llvm-svn: 359710
2019-05-01 17:54:01 +00:00
Jonathan Peyton 71abe28e81 [OpenMP] Add OpenMP 5.0 nonmonotonic code
This patch adds:
* New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime
* Parsing of monotonic/nonmonotonic in OMP_SCHEDULE
* Tests for the monotonic flag and envirable parsing
* Logic to force monotonic when hierarchical scheduling is used

Differential Revision: https://reviews.llvm.org/D60979

llvm-svn: 359601
2019-04-30 19:20:35 +00:00
Jonathan Peyton 1ca746170b [OpenMP] Eliminate some compiler warnings
* Remove accidental == for =
* Assign values to variables to appease compiler
* Surround debug code with KMP_DEBUG
* Remove unused local typedefs

Differential Revision: https://reviews.llvm.org/D60983

llvm-svn: 359599
2019-04-30 19:13:37 +00:00
Alexey Bataev c03fe73176 [OPENMP][NVPTX]Correctly handle L2 parallelism in SPMD mode.
Summary:
The parallelLevel counter must be on per-thread basis to fully support
L2+ parallelism, otherwise we may end up with undefined behavior.
Introduce the parallelLevel on per-warp basis using shared memory. It
allows to avoid the problems with the synchronization and allows fully
support L2+ parallelism in SPMD mode with no runtime.

Reviewers: gtbercea, grokos

Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60918

llvm-svn: 359341
2019-04-26 19:30:34 +00:00
Dimitry Andric 87e7f895bb Use correct way to test for MIPS arch after rOMP355687
Summary:
I ran into some issues after rOMP355687, where __atomic_fetch_add was
being used incorrectly on x86, and this turns out to be caused by the
following added conditionals:

```
#if defined(KMP_ARCH_MIPS)
```

The problem is, these macros are always defined, and are either 0 or 1
depending on the architecture.  E.g. the correct way to test for MIPS
is:

```
#if KMP_ARCH_MIPS
```

Reviewers: petarj, jlpeyton, Hahnfeld, AndreyChurbanov

Reviewed By: petarj, AndreyChurbanov

Subscribers: AndreyChurbanov, sdardis, arichardson, atanasyan, jfb, jdoerfert, openmp-commits, llvm-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60938

llvm-svn: 358911
2019-04-22 19:20:46 +00:00
Alexey Bataev 5de5d74c8d [OPENMP][NVPTX] Fix the test, NFC.
Fix the test to run it really in SPMD mode without runtime. Previously
it was run in SPMD + full runtime mode and does not allow to cehck the
functionality correctly.

llvm-svn: 358902
2019-04-22 17:25:31 +00:00
Andrey Churbanov cf5bdb83b0 Fixed memory leak reported in Bugzilla:
https://bugs.llvm.org/show_bug.cgi?id=41494

Freed th_cg_roots structure at exit from uber thread.

Differential Revision: https://reviews.llvm.org/D60729

llvm-svn: 358572
2019-04-17 10:44:28 +00:00
Alexey Bataev 13532ea623 [OPENMP][NVPTX]Fix dynamic scheduling in L2+ SPMD parallel regions.
Summary:
If the kernel is executed in SPMD mode and the L2+ parallel for region
with the dynamic scheduling is executed, dynamic scheduling functions
are called. They expect full runtime support, but SPMD kernels may be
executed without the full runtime. It leads to the runtime crash of the
compiled program. Patch fixes this problem + fixes handling of the
parallelism level in SPMD mode, which is required as part of this patch.

Reviewers: gtbercea, kkwli0, grokos

Subscribers: guansong, jdoerfert, openmp-commits, caomhin

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60578

llvm-svn: 358442
2019-04-15 20:15:20 +00:00
Jonathan Peyton 4f21f5f5ce [OpenMP] Exchange code in asm file for inline assembly
This change replaces some of the assembly functions in z_Linux_asm.S
for inline asm in kmp.h. This allows better interaction with compiler
tools and sanitizers.

Differential Revision: https://reviews.llvm.org/D60423

llvm-svn: 358438
2019-04-15 19:19:57 +00:00
Andrey Churbanov 705384be97 Fixed possible out of bound array access.
The check of index value moved to before the write to the array.

Differential Revision: https://reviews.llvm.org/D60471

llvm-svn: 358181
2019-04-11 15:03:44 +00:00
Jonathan Peyton ebf1830bb1 [OpenMP] Implement 5.0 memory management
* Replace HBWMALLOC API with more general MEMKIND API, new functions
  and variables added.
* Have libmemkind.so loaded when accessible.
* Redirect memspaces to default one except for high bandwidth which
  is processed separately.
* Ignore some allocator traits e.g., sync_hint, access, pinned, while
  others are processed normally e.g., alignment, pool_size, fallback,
  fb_data, partition.
* Add tests for memory management

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D59783

llvm-svn: 357929
2019-04-08 17:59:28 +00:00
Jonathan Peyton feac33ebb0 [OpenMP] Clean up load balancing dynamic mode
This patch cleans up the bookkeeping code for the load balancing dynamic mode.

When a thread is moved to or from the thread pool, the th_active_in_pool flag
and the __kmp_thread_pool_active_nth global counter are both updated. This
removes the need for the corrective code in the main wait loop. Another global
counter, __kmp_thread_pool_nth, was removed completely, as it was only used for
debugging, but was not under KMP_DEBUG.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D59508

llvm-svn: 357927
2019-04-08 17:50:02 +00:00
Dimitry Andric 8f2d1eb9e8 After rL357618, quote ${CMAKE_THREAD_LIBS_INIT} so CMake does not
complain when the variable is empty.  Fixes PR 41401.

llvm-svn: 357828
2019-04-05 22:19:40 +00:00
Jonathan Peyton b727d384a3 [OpenMP] Fix hang on Windows
Debug dump on large machine shows when many OpenMP threads (401 in total)
sleep on a barrier, one of the innermost nesting levels sleeps
on a child's b_arrived flag whose value is equal to 4 and is equal to
checker value. i.e., (1) sleep bit is 0, and (2) done_check() would
return true if called.

It is unclear how this might happen. It could be Windows Server 2016's
error of EnterCriticalSection / LeaveCriticalSection, or
error of WaitForSingleObject / SetEvent / ResetEvent, or
error in the library which is very difficult to find.

As a workaround, change INFINITE wait to timed wait, so that each
thread awakens each 5 seconds (the timeout was chosen arbitrary to not
disturb other threads much), check flag condition under the lock, and
either go to sleep again or stop sleeping as a result of the check.

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D59793

llvm-svn: 357722
2019-04-04 20:35:29 +00:00
Jonathan Peyton d2b53cad18 [OpenMP][Stats] Fix stats gathering for distribute and team clause
The distribute clause needs an explicit push of a timer. The teams
clause needs a timer added and also, similarly to parallel, exchanged
with the serial timer when encountered so that serial regions are
counted properly.

Differential Revision: https://reviews.llvm.org/D59801

llvm-svn: 357621
2019-04-03 18:53:26 +00:00
Dimitry Andric 956168c802 Ensure correct pthread flags and libraries are used
On most platforms, certain compiler and linker flags have to be passed
when using pthreads, otherwise linking against libomp.so might fail with
undefined references to several pthread functions.

Use CMake's `find_package(Threads)` to determine these for standalone
builds, or take them (and optionally modify them) from the top-level
LLVM cmake files.

Also, On FreeBSD, ensure that libomp.so is linked against libm.so,
similar to NetBSD.

Adjust test cases with hardcoded `-lpthread` flag to use the common
build flags, which should now have the required pthread flags.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim, Hahnfeld

Reviewed By: Hahnfeld

Subscribers: AndreyChurbanov, tra, EricWF, Hahnfeld, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59451

llvm-svn: 357618
2019-04-03 18:11:36 +00:00
Michael Kruse d97d5ebcfa [libomptarget] Introduce LIBOMPTARGET_ENABLE_DEBUG cmake option.
At the moment, support for runtime debug output using the
OMPTARGET_DEBUG=1 environment variable is only available with
CMAKE_BUILD_TYPE=Debug builds. The patch allows setting it independently
using the LIBOMPTARGET_ENABLE_DEBUG option, which is enabled by default
depending on CMAKE_BUILD_TYPE. That is, unless this option is set
explicitly, nothing changes. This is the same mechanism used by LLVM for
LLVM_ENABLE_ASSERTIONS.

This patch also removes adding -g -O0 in debug builds, it should be
handled by cmake's CMAKE_{C|CXX}_FLAGS_DEBUG configuration option.

Idea by Hal Finkel

Differential Revision: https://reviews.llvm.org/D55952

llvm-svn: 356998
2019-03-26 15:19:15 +00:00
Jonathan Peyton 3bc703d538 [OpenMP] Add LLVM license header to file
This file was missing the LLVM license header

llvm-svn: 356962
2019-03-25 22:36:31 +00:00
Jonathan Peyton 7ca09056c7 [OpenMP] Add Intel 19.0 to list of compilers in kmp_version.cpp
llvm-svn: 356961
2019-03-25 22:31:00 +00:00
Dimitry Andric a70da7f29f Fix interoperability test compilation on FreeBSD
Summary:
While building the 8.0 releases on FreeBSD, I encountered the following
error in the regression tests, where ompt/misc/interoperability.cpp
failed to compile, with:

```
projects/openmp/runtime/test/ompt/misc/interoperability.cpp:7:10: fatal error: 'alloca.h' file not found
#include <alloca.h>
         ^~~~~~~~~~
```

Like on NetBSD, alloca(3) is defined in <stdlib.h> instead.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim

Reviewed By: jlpeyton

Subscribers: jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59736

llvm-svn: 356936
2019-03-25 18:37:49 +00:00
Dimitry Andric dab9ed87c6 Fix gettid warnings on FreeBSD
Summary:
[Split off from D59451 to get this fix in separately]

While building the 8.0 releases on FreeBSD, I encountered the following
warnings in openmp quite a few times:

```
In file included from projects/openmp/runtime/src/kmp_settings.cpp:27:
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: #warning is a language extension [-Wpedantic]
#warning No gettid found, use getpid instead
 ^
projects/openmp/runtime/src/kmp_wrapper_getpid.h:35:2: warning: No gettid found, use getpid instead [-W#warnings]
2 warnings generated.
```

I added a gettid wrapper that uses FreeBSD's pthread_getthreadid_np(3)
function for this.

Reviewers: emaste, jlpeyton, krytarowski, mgorny, protze.joachim

Reviewed By: jlpeyton

Subscribers: jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D59735

llvm-svn: 356934
2019-03-25 18:37:14 +00:00
Jonathan Peyton 61708b1e94 [OpenMP] Fix pause check with version info
Add 5.0 guard to pause code for now.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D59428

llvm-svn: 356933
2019-03-25 18:17:55 +00:00
Jonathan Peyton 6622732d9a [OpenMP] Fix OMPT cancellation test for GOMP
The GOMP sections interface uses schedule(dynamic) dispatch so it cannot
be assumed which thread executes the cancel and which thread executes
the cancellation point.  This patch allows either thread to execute either
section.

llvm-svn: 356302
2019-03-15 21:24:45 +00:00
Jonathan Peyton 5af1c22d0b [OpenMP] Add missing parenthesis in Perl module
llvm-svn: 356289
2019-03-15 18:27:14 +00:00
Jonathan Peyton 44b476c141 [OpenMP] Remove deprecated taskq
Remove very old, unused, and deprecated taskq code.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58989

llvm-svn: 356288
2019-03-15 18:24:59 +00:00
Jonathan Peyton 529e0d2ea4 [OpenMP][stats] Update stats gathering macros
llvm-svn: 355739
2019-03-08 21:23:34 +00:00
Petar Jovanovic bc3cda1526 [mips] Use libatomic instead of GCC intrinsics for 64bit
The following GCC intrinsics are not available on MIPS32:

__sync_fetch_and_add_8
__sync_fetch_and_and_8
__sync_fetch_and_or_8
__sync_val_compare_and_swap_8

Replace these with appropriate libatomic implementation.

Patch by Miodrag Dinic.

Differential Revision: https://reviews.llvm.org/D45691

llvm-svn: 355687
2019-03-08 10:53:19 +00:00
Shoaib Meenai 5be71faf4b [build] Rename clang-headers to clang-resource-headers
Summary:
The current install-clang-headers target installs clang's resource
directory headers. This is different from the install-llvm-headers
target, which installs LLVM's API headers. We want to introduce the
corresponding target to clang, and the natural name for that new target
would be install-clang-headers. Rename the existing target to
install-clang-resource-headers to free up the install-clang-headers name
for the new target, following the discussion on cfe-dev [1].

I didn't find any bots on zorg referencing install-clang-headers. I'll
send out another PSA to cfe-dev to accompany this rename.

[1] http://lists.llvm.org/pipermail/cfe-dev/2019-February/061365.html

Reviewers: beanz, phosek, tstellar, rnk, dim, serge-sans-paille

Subscribers: mgorny, javed.absar, jdoerfert, #sanitizers, openmp-commits, lldb-commits, cfe-commits, llvm-commits

Tags: #clang, #sanitizers, #lldb, #openmp, #llvm

Differential Revision: https://reviews.llvm.org/D58791

llvm-svn: 355340
2019-03-04 21:19:53 +00:00
Stefan Pintilie a908829bf5 [OPENMP] Deal with additional store inserted by Clang under -fno-PIC for PowerPC.
Changing the default from -fPIC to -fno-PIC on PowerPC exposed an issue in
OpenMP for PowerPC.
The issue is reported here:
https://bugs.llvm.org/show_bug.cgi?id=40082

This is a fix for that issue.
Also removed the XFAIL from the two tests that were failing under -fno-PIC.

Differential Revision: https://reviews.llvm.org/D56286

llvm-svn: 355229
2019-03-01 21:16:45 +00:00
Jonathan Peyton ad1ad7ae8b [OpenMP][OMPT] Distinguish different barrier kinds
This change makes the runtime decide the intended use of each barrier
invocation, for the OMPT synchronization tool callbacks.  The OpenMP 5.0
specification defines four possible barrier kinds -- implicit, explicit,
implementation, and just normal barrier.

Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D58247

llvm-svn: 355140
2019-02-28 20:55:39 +00:00
Jonathan Peyton 76b45e874d [OpenMP 5.0] Deprecate nest-var and associated features
Nest-var, OMP_NESTED, omp_set_nested()., and omp_get_nested() have been
deprecated in the 5.0 spec. Initial nesting info is now derived from
OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND.

This patch deprecates the internal ICV that corresponds to nest-var, and
replaces it with the max-active-levels-var ICV to determine nesting. The
change still allows for use of OMP_NESTED (according to 5.0 changes),
omp_get_nested, and omp_set_nested, which have had deprecation messages
added to them. The change allows certain settings of OMP_NUM_THREADS,
OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but
OMP_NESTED=0 will still force nesting to be off.

The runtime now prints informative messages about deprecation of
OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those
environment variables or routines are used. It also prints deprecated
message in output for KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED.
This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58408

llvm-svn: 355138
2019-02-28 20:47:21 +00:00
Jonathan Peyton e47d32f165 [OpenMP] Make use of sched_yield optional in runtime
This patch cleans up the yielding code and makes it optional. An
environment variable, KMP_USE_YIELD, was added. Yielding is still
on by default (KMP_USE_YIELD=1), but can be turned off completely
(KMP_USE_YIELD=0), or turned on only when oversubscription is detected
(KMP_USE_YIELD=2). Note that oversubscription cannot always be detected
by the runtime (for example, when the runtime is initialized and the
process forks, oversubscription cannot be detected currently over
multiple instances of the runtime).

Because yielding can be controlled by user now, the library mode
settings (from KMP_LIBRARY) for throughput and turnaround have been
adjusted by altering blocktime, unless that was also explicitly set.

In the original code, there were a number of places where a double yield
might have been done under oversubscription. This version checks
oversubscription and if that's not going to yield, then it does
the spin check.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58148

llvm-svn: 355120
2019-02-28 19:11:29 +00:00
Jonas Hahnfeld db3025ad57 [OpenMP] Fix check-openmp after r354553
Calling add_openmp_testsuite will add the tests to check-openmp unless
EXCLUDE_FROM_ALL is set. This is problematic because the tests for OMPT
will be included twice which doesn't work if the same test is executed
concurrently by multiple threads.

See:
http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/163
http://lab.llvm.org:8011/builders/openmp-clang-x86_64-linux-debian/builds/184

http://lab.llvm.org:8011/builders/openmp-clang-ppc64le-linux-rhel/builds/133
(On PPC some failures are unrelated to r354553, the bot has been red before
and this commit is not expected to fix that. For a proper patch please see
https://reviews.llvm.org/D56286.)

llvm-svn: 354572
2019-02-21 12:00:57 +00:00
Joachim Protze 8b96fad85c [OpenMP][OMPT] Fix locking testcases for 32 bit architectures
Fix for the bug reported in:
https://bugs.llvm.org/show_bug.cgi?id=40531

The address is now casted the same way as in the runtime code.

Differential Revision: https://reviews.llvm.org/D58454

llvm-svn: 354553
2019-02-21 08:50:49 +00:00
Gheorghe-Teodor Bercea 06e08f0b0a [OpenMP][libomptarget] New reduction scheme for team reductions
Summary:
This patch adds a more sophisticated team reduction scheme to the OpenMP libomptarget-nvptx runtime.

The scheme uses a fixed size global memory buffer whose length can be adjusted via compiler flag:
```
-fopenmp-cuda-teams-reduction-recs-num=1024
```
The global buffer is a structure of arrays (with default size of 1024 each and controlled by the above flag), one array for each reduction variable.

Values in the buffer are processed by the last team to finish executing the body of the target region.

In addition to adding support for the new flag, the compiler also emits special functions used for the reduction of the intermediate reduction values. These changes will be added in a separate compiler patch following this one.




Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D58409

llvm-svn: 354471
2019-02-20 14:55:55 +00:00
Jonathan Peyton 7d2cfa1fd5 [OpenMP] Remove XFAIL for cancellation tests using gcc
llvm-svn: 354370
2019-02-19 19:00:29 +00:00
Jonathan Peyton 154ac075cd [OpenMP 5.0] Add omp_get_supported_active_levels()
This patch adds the new 5.0 API function omp_get_supported_active_levels().

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D58211

llvm-svn: 354368
2019-02-19 18:51:11 +00:00
Jonathan Peyton 4fe5271fa0 [OpenMP] Adding GOMP compatible cancellation
Remove fatal error messages from the cancellation API for GOMP
Add __kmp_barrier_gomp_cancel() to implement cancellation of parallel regions.
This new function uses the linear barrier algorithm with a cancellable
nonsleepable wait loop.

Differential Revision: https://reviews.llvm.org/D57969

llvm-svn: 354367
2019-02-19 18:47:57 +00:00
Jonathan Peyton 511092cab0 [OpenMP] Fix broken link to browse sources
llvm-svn: 353858
2019-02-12 17:00:57 +00:00
Jonathan Peyton 2f744592a0 [OpenMP] Remove accidental commit to config-ix.cmake in r353747
llvm-svn: 353748
2019-02-11 21:09:15 +00:00
Jonathan Peyton 65ebfeecf8 [OpenMP] Fix thread_limits to work properly for teams construct
The thread-limit-var and omp_get_thread_limit API was not perfectly handled for
teams construct. Now, when modified by thread_limit clause, omp_get_thread_limit
reports the correct value. In addition, the value is restored when leaving the
teams construct to what it was in the encountering context.

This is done partly by creating the notion of a Contention Group root (CG root)
that keeps track of the thread at the root of each separate CG, the
thread-limit-var associated with the CG, and associated counter of active
threads within the contention group.

thread-limits are passed from master to worker threads via an entry in the ICV
data structure. When a "contention group switch" occurs, a new CG root record is
made and passed from master to worker. A thread could potentially have several
CG root records if it encounters multiple nested teams constructs (but at the
moment the spec doesn't allow for nested teams, so the most one could have
currently is 2). The master of the teams masters gets the thread-limit clause
value stored to its local ICV structure, and the other teams masters copy it
from the master. The thread-limit is set from that ICV copy and restored to the
ICV copy when entering and leaving the teams construct.

This change also fixes a bug when the top-level teams construct team gets
reused, and OMP_DYNAMIC was true, which can cause the expected size of this team
to be smaller than what was actually allocated. The fix updates the size of the
team after its threads were reserved.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D56804

llvm-svn: 353747
2019-02-11 21:04:23 +00:00
Jonas Hahnfeld f26d3e7185 [OMPT] Remove test output from source tree
%s refers to the test file in the source tree. This was accidentally added in
r351197 / 2b46d30 ("[OMPT] Second chunk of final OMPT 5.0 interface updates").

Differential Revision: https://reviews.llvm.org/D58002

llvm-svn: 353715
2019-02-11 16:14:51 +00:00
Taewook Oh 91c32fd8c8 Guard a feature that unsupported by old GCC
Summary:
As @david2050 commented, changes introduced by https://reviews.llvm.org/D56397 break builds for older compilers
which don't support `__has(_cpp)_attribute`. This is a fix for the break.

Reviewers: protze.joachim, jlpeyton, AndreyChurbanov, Hahnfeld, david2050

Subscribers: openmp-commits, david2050

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D57851

llvm-svn: 353538
2019-02-08 17:15:50 +00:00
Joachim Protze 0c599c388d [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
The three switch fallthrough generate a warning with -Wimplicit-fallthrough.
Two are documented as fallthrough, one is not, but I think the intention is to also fallthrough in kmp_tasking.cpp.

Not sure whether kmp.h is the best place to define the macro.

Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld

Reviewed By: jlpeyton

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D56397

llvm-svn: 353052
2019-02-04 15:59:42 +00:00
Joachim Protze 32959e683a [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
Redo after revert by hans. The wrong include in one test is fixed.

Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropiate value indicating an error or that
the data is not available.

Patch provided by @sconvent

Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze

Reviewed By: joachim.protze

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 352611
2019-01-30 08:41:06 +00:00
James Y Knight 5d71fc5d7b Adjust documentation for git migration.
This fixes most references to the paths:
 llvm.org/svn/
 llvm.org/git/
 llvm.org/viewvc/
 github.com/llvm-mirror/
 github.com/llvm-project/
 reviews.llvm.org/diffusion/

to instead point to https://github.com/llvm/llvm-project.

This is *not* a trivial substitution, because additionally, all the
checkout instructions had to be migrated to instruct users on how to
use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of
checking out various projects into various subdirectories.

I've attempted to not change any scripts here, only documentation. The
scripts will have to be addressed separately.

Additionally, I've deleted one document which appeared to be outdated
and unneeded:
  lldb/docs/building-with-debug-llvm.txt

Differential Revision: https://reviews.llvm.org/D57330

llvm-svn: 352514
2019-01-29 16:37:27 +00:00
Arnaud A. de Grandmaison f185823668 Remove no longer needed Arm specific words in the LICENSE.txt file.
As the codebase is now under the Apache 2.0 license with LLVM
Exceptions, and all Arm's contributions, past or future, are under that
new license, this Arm specific words in LICENSE.txt are no longer
needed.

llvm-svn: 352377
2019-01-28 15:42:58 +00:00
Andrey Churbanov efa6b826b4 NFC: fixed formatting to be consistent across the file
llvm-svn: 351748
2019-01-21 16:11:43 +00:00
Andrey Churbanov b8e3643506 Fixed https://reviews.llvm.org/D55078 broken Fortran fixed form.
Long lines split in order to obey Fortran fixed form compilation.

Differential Revision: https://reviews.llvm.org/D57017

llvm-svn: 351745
2019-01-21 15:30:31 +00:00
Chandler Carruth 4a1b95bda0 Fix typos throughout the license files that somehow I and my reviewers
all missed!

Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.

llvm-svn: 351731
2019-01-21 09:52:34 +00:00
Chandler Carruth 57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Chandler Carruth 469bdefd44 Install new LLVM license structure and new developer policy.
This installs the new developer policy and moves all of the license
files across all LLVM projects in the monorepo to the new license
structure. The remaining projects will be moved independently.

Note that I've left odd formatting and other idiosyncracies of the
legacy license structure text alone to make the diff easier to read.
Critically, note that we do not in any case *remove* the old license
notice or terms, as that remains necessary until we finish the
relicensing process.

I've updated a few license files that refer to the LLVM license to
instead simply refer generically to whatever license the LLVM project is
under, basically trying to minimize confusion.

This is really the culmination of so many people. Chris led the
community discussions, drafted the policy update and organized the
multi-year string of meeting between lawyers across the community to
figure out the strategy. Numerous lawyers at companies in the community
spent their time figuring out initial answers, and then the Foundation's
lawyer Heather Meeker has done *so* much to help refine and get us ready
here. I could keep going on, but I just want to make sure everyone
realizes what a huge community effort this has been from the begining.

Differential Revision: https://reviews.llvm.org/D56897

llvm-svn: 351631
2019-01-19 06:14:24 +00:00
Hans Wennborg 799b5dcbda Revert r351311 "[OMPT] Make sure that OMPT is enabled when accessing internals of the runtime"
and also the follow-up r351315.

The new test is failing on the buildbots.

> Make sure that OMPT is enabled in runtime entry points that access internals
> of the runtime. Else, return an appropiate value indicating an error or that
> the data is not available.
>
> Patch provided by @sconvent
>
> Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze
>
> Reviewed By: joachim.protze
>
> Tags: #openmp, #ompt
>
> Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 351431
2019-01-17 11:31:03 +00:00
Jonathan Peyton 9b8bb323c9 [OpenMP] Add omp_pause_resource* API
Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for
internal implementation. Implemented callable helper function to do local pause,
and added basic functionality for hard and soft pause.

Patch by Terry Wilmarth

Differential Revision: https://reviews.llvm.org/D55078

llvm-svn: 351372
2019-01-16 20:07:39 +00:00
Joachim Protze c46bd682ac [OpenMP] Output written by tests should go to build directory
llvm-svn: 351332
2019-01-16 13:06:10 +00:00
Joachim Protze 6b840ccea9 [OpenMP] Remove compiler warning about unused value
The compiler warns about an unused variable/statement:

    runtime/src/kmp_affinity.cpp:4958:18: warning: statement has no effect [-Wunused-value]
       KA_TRACE(1000, ; {
                      ^
    runtime/src/kmp_debug.h:84:24: note: in definition of macro 'KA_TRACE'
         __kmp_debug_printf x;                                                      \
                            ^

Instead of the unused reference to this function, this patch now calls the function
with an empty string. The call to this function should have no effect.

Patch provided by joachim.protze

Reviewers: jlpeyton, hbae, AndreyChurbanov

Reviewed By: AndreyChurbanov

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D56775

llvm-svn: 351323
2019-01-16 11:35:11 +00:00
Joachim Protze c3716617df Fix compiler error in r351311
llvm-svn: 351315
2019-01-16 09:39:42 +00:00
Joachim Protze 582b183dda [OMPT] Make sure that OMPT is enabled when accessing internals of the runtime
Make sure that OMPT is enabled in runtime entry points that access internals
of the runtime. Else, return an appropiate value indicating an error or that
the data is not available.

Patch provided by @sconvent

Reviewers: jlpeyton, omalyshe, hbae, Hahnfeld, joachim.protze

Reviewed By: joachim.protze

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D47717

llvm-svn: 351311
2019-01-16 08:58:17 +00:00
Jonathan Peyton 9355d0dc13 [OpenMP] Fix for nested proc_bind affinity bug
Using proc_bind clause on a nested #pragma omp parallel region
with KMP_AFFINITY set causes an assertion error. This assertion occurs because
the place-partition-var is not properly initialized in the nested master threads.
Trying to get an intuitive result with KMP_AFFINITY + proc_bind is difficult
because of how the KMP_AFFINITY gtid-to-place mapping occurs. This
patch creates an initial place list no matter what affinity mechanism is used.
For KMP_AFFINITY, the place-partition-var is initialized to all the places.

Differential Revision: https://reviews.llvm.org/D55795

llvm-svn: 351227
2019-01-15 19:39:32 +00:00
Jonathan Peyton fce3972553 [OpenMP] Add lock function definitions to fix Bug 40042
This change fixes the sanity issue reported in Bug 40042.
Lock function definitions for the three lock kinds were added
to disambiguate calls to the lock functions done directly and indirectly.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40042
Patch by Hansang Bae

Differential Revision: https://reviews.llvm.org/D56103

llvm-svn: 351224
2019-01-15 19:14:00 +00:00
Jonathan Peyton 1c268554ba [OpenMP][Cmake] Allowed OpenMP testing detect test compiler with same generator
Fix ninja build detect test compiler failed under windows.

Patch by Peiyuan Song

Differential Revision: https://reviews.llvm.org/D53479

llvm-svn: 351223
2019-01-15 19:08:26 +00:00
Jonathan Peyton dc375486b0 [OpenMP] Fix performance regression in SPEC kdtree test
Make __ompt_implicit_task_end a static function and remove the inline part.  Remove
pId variable that is unused.  This fixes small regression in SPEC kdtree benchmark.
Also reformat some of __ompt_implicit_task_end.

Differential Revision: https://reviews.llvm.org/D55788

llvm-svn: 351221
2019-01-15 18:57:24 +00:00
Joachim Protze 2b46d30fc7 [OMPT] Second chunk of final OMPT 5.0 interface updates
The omp-tools.h file is generated from the OpenMP spec to ensure that the interface
is implemented as specified.
The other changes are necessary to update the interface implementation to the
final version as published in 5.0.
The omp-tools.h header was previously called ompt.h, currently a copy under this name
is installed for legacy tools.

Patch partially perpared by @sconvent

Reviewers: AndreyChurbanov, hbae, Hahnfeld

Reviewed By: hbae

Tags: #openmp, #ompt

Differential Revision: https://reviews.llvm.org/D55579

llvm-svn: 351197
2019-01-15 15:36:53 +00:00
Hans Wennborg eb60fbfdb4 Update year in license files
In last year's update (D48219) it was suggested that the release manager
might want to do this, so here we go.

llvm-svn: 351194
2019-01-15 15:10:32 +00:00
Roman Lebedev 06e3950561 [OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612)
Summary:
Two things:
1. Those two variables had the wrong sigdness, which was resulting in "sign mismatch in comparison" warning.
2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off,
   thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`.
   Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly.
   It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block..
   I *think* this is the only source file with this issue,
   everything else seem to `#include` either `kmp.h` or `kmp_config.h`.
   The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake.

I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 | PR38612 ]].

Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld

Reviewed By: jlpeyton

Subscribers: guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D55783

llvm-svn: 351019
2019-01-13 12:54:34 +00:00
Gheorghe-Teodor Bercea 1653633a1c [OpenMP][libomptarget] Use shared memory variable for tracking parallel level
Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, openmp-commits

Differential Revision: https://reviews.llvm.org/D55773

llvm-svn: 350747
2019-01-09 18:30:14 +00:00
Andrey Churbanov b7a8ab3417 Doc: fixed description of a parameter of the __kmpc_taskloop
Patch by sergi.mateo.bellido@gmail.com

Differential Revision: https://reviews.llvm.org/D56432

llvm-svn: 350713
2019-01-09 13:06:23 +00:00
Alexey Bataev 26e6c86b79 [OPENMP][NVPTX]Fix dynamic scheduling.
Summary:
Previous implementation may cause the runtime crash when the number of
teams is > 1024. Patch fixes this problem + reduces number of the atomic
operations by 32 times.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56332

llvm-svn: 350524
2019-01-07 14:25:25 +00:00
Alexey Bataev 6b3153ada0 [OPENMP][NVPTX]General formatting/code improvement, NFC.
Summary: Formatting.

Reviewers: gtbercea, grokos, kkwli0

Subscribers: guansong, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56290

llvm-svn: 350431
2019-01-04 20:16:54 +00:00
Alexey Bataev dcf2edcdf5 [OPENMP][NVPTX]Improve performance + reduce number of used registers.
Summary:
Reduced number of the used register + improved performance propagating
the information about current execution/data sharing mode directly from
the compiler, where it is possible.
In some cases, it requires new/reworked interfaces of the runtime
external functions. Old functions are marked as deprecated.

Reviewers: grokos, gtbercea, kkwli0

Subscribers: guansong, jfb, openmp-commits, caomhin

Differential Revision: https://reviews.llvm.org/D56278

llvm-svn: 350405
2019-01-04 17:09:12 +00:00
Joel E. Denny f17f7a5d4d [OpenMP] Fix nvidia-cuda-toolkit detection on Debian/Ubuntu
The OpenMP runtime's cmake scripts do not correctly locate the
libdevice that the Debian/Ubuntu package nvidia-cuda-toolkit currently
includes, at least on my Ubuntu 18.04.1 installation.  This patch
fixes that for me.

This problem was discussed at length in D55269.  D40453 added a
similar adjustment in clang, but reviewers of D55269 concluded that,
for the OpenMP runtime, the right place to address this problem is in
cmake's CUDA support.  However, it was also suggested we could add a
workaround to OpenMP's cmake scripts now.  This patch contains such a
workaround, which I've tried to design so that it will have no harmful
effect if cmake improves in the future.

nvidia-cuda-toolkit also needs improvements because its intended
monolithic CUDA tree shim, /usr/lib/cuda, has many empty directories,
such as bin.  I reported that at:

<https://bugs.launchpad.net/ubuntu/+source/nvidia-cuda-toolkit/+bug/1808999>

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D55588

llvm-svn: 350377
2019-01-04 02:07:13 +00:00