Commit Graph

1162 Commits

Author SHA1 Message Date
Peyton, Jonathan L 9982f33e2c [OpenMP] Refactor/Rework topology discovery code
This patch does the following:

1) Introduce kmp_topology_t as the runtime-friendly structure (the
corresponding global variable is __kmp_topology) to determine the
exact machine topology which can vary widely among current and future
architectures. The current design is not easy to expand beyond the assumed
three layer topology: sockets, cores, and threads so a rework capable of
using the existing KMP_AFFINITY mechanisms is required.

This new topology structure has:
* The depth and types of the topology
* Ratio count for each consecutive level (e.g., number of cores per
   socket, number of threads per core)
* Absolute count for each level (e.g., 2 sockets, 16 cores, 32 threads)
* Equivalent topology layer map (e.g., Numa domain is equivalent to
   socket, L1/L2 cache equivalent to core)
* Whether it is uniform or not

The hardware threads are represented with the kmp_hw_thread_t
structure. This structure contains the ids (e.g., socket 0, core 1,
thread 0) and other information grabbed from the previous Address
structure. The kmp_topology_t structure contains an array of these.

2) Generalize the KMP_HW_SUBSET envirable for the new
kmp_topology_t structure. The algorithm doesn't assume any order with
tiles,numa domains,sockets,cores,threads. Instead it just parses the
envirable, makes sure it is consistent with the detected topology
(including taking into account equivalent layers) and then trims away
the unneeded subset of hardware threads. To enable this, a new
kmp_hw_subset_t structure is introduced which contains a vector of
items (hardware type, number user wants, offset). Any keyword within
__kmp_hw_get_keyword() can be used as a name and can be shortened as
well. e.g.,
KMP_HW_SUBSET=1s,2numa,4tile,2c,3t can be used on the KNL SNC-4 machine.

3) Simplify topology detection functions so they only do the singular
task of detecting the machine's topology. Printing, and all
canonicalizing functionality is now done afterwards. So many lines of
duplicated code are eliminated.

4) Add new ll_caches and numa_domains to OMP_PLACES, and
consequently, KMP_AFFINITY's granularity setting. All the names within
__kmp_hw_get_keyword() are available for use in OMP_PLACES or
KMP_AFFINITY's granularity setting.

5) Simplify and future-proof code where explicit lists of allowed
affinity settings keywords inside if() conditions.

6) Add x86 CPUID leaf 4 cache detection to existing x2apic id method
so equivalent caches could be detected (in particular for the ll_caches
place).

Differential Revision: https://reviews.llvm.org/D100997
2021-05-03 18:00:24 -05:00
Martin Storsjö 01d27fc408 [OpenMP] Fix warnings due to redundant semicolons. NFC. 2021-05-02 21:51:06 +03:00
Kevin Athey bc9120047b Correct tiny misspelling (readlef -> readelf).
Getting my feet wet here as a new committer.

Correct misspelling in check-depends.pl.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101552
2021-04-30 17:20:35 -07:00
Peyton, Jonathan L 4457565757 [OpenMP] Implement GOMP task reductions
Implement the remaining GOMP_* functions to support task reductions
in taskgroup, parallel, loop, and taskloop constructs.  The unused mem
argument to many of the work-sharing constructs has to do with the
scan() directive/ inscan() modifier.  If mem is set, each function
will call KMP_FATAL() and tell the user scan/inscan is unsupported.  The
GOMP reduction implementation is kept separate from our implementation
because of how GOMP presents reduction data and computes the reductions.
GOMP expects the privatized copies to be present even after a #pragma
omp parallel reduction(task:...) region has ended so the data is stored
inside GOMP's uintptr_t* data pseudo-structure.  This style is tightly
coupled with GCC compiler codegen.  There also isn't any init(),
combiner(), fini() functions in GOMP's codegen so the two
implementations were to disparate to try to wrap GOMP's around our own.

Differential Revision: https://reviews.llvm.org/D98806
2021-04-16 16:36:31 -05:00
Peyton, Jonathan L 5ebbb366c4 [OpenMP] Allow affinity to re-detect for child processes
Current atfork() handler for child processes does not reset
the affinity masks array which prevents users from setting their own
affinity in child processes.

Differential Revision: https://reviews.llvm.org/D99218
2021-04-16 16:34:02 -05:00
Hansang Bae 9b98497b44 [OpenMP] Add omp_target_is_accessible() to header files
-- Added omp_target_is_accessible to the header files
-- Added missing const qualifier to device memory routines

Differential Revision: https://reviews.llvm.org/D100420
2021-04-16 07:54:15 -05:00
Hansang Bae 77dc7b4653 [OpenMP] Fix printing routine for OMP_TOOL_VERBOSE_INIT
Also fixed typo in the verbose message.

Differential Revision: https://reviews.llvm.org/D100414
2021-04-14 07:55:26 -05:00
Hansang Bae 3da61ddae7 [OpenMP] Define omp_is_initial_device() variants in omp.h
omp_is_initial_device() is marked as a built-in function in the current
compiler, and user code guarded by this call may be optimized away,
resulting in undesired behavior in some cases. This patch provides a
possible fix for such cases by defining the routine as a variant
function and removing it from builtin list.

Differential Revision: https://reviews.llvm.org/D99447
2021-04-06 16:58:01 -05:00
Peyton, Jonathan L 2aebb7cb3c [OpenMP] Fix incorrect KMP_STRLEN() macro
The second argument to the strnlen_s(str, size) function should be
sizeof(str) when str is a true array of characters with known size
(instead of just a char*). Use type traits to determine if first
parameter is a character array and use the correct size based on that
trait.

Differential Revision: https://reviews.llvm.org/D98209
2021-04-05 09:03:09 -05:00
Hansang Bae 467f39249d [OpenMP] Misc. changes that add or remove pointer/bound checks
-- Added or moved checks to appropriate places.
-- Removed ineffective null check where the pointer is already being
   dereferenced around the code.
-- Initialized variables that can be used without definitions.
-- Added call to dlclose/FreeLibrary in OMPT tool activation.
-- Added a new build compiler definition.

Differential Revision: https://reviews.llvm.org/D98584
2021-03-23 18:55:08 -05:00
Shilei Tian 2df65f87c1 [OpenMP] Fixed a crash in hidden helper thread
It is reported that after enabling hidden helper thread, the program
can hit the assertion `new_gtid < __kmp_threads_capacity` sometimes. The root
cause is explained as follows. Let's say the default `__kmp_threads_capacity` is
`N`. If hidden helper thread is enabled, `__kmp_threads_capacity` will be offset
to `N+8` by default. If the number of threads we need exceeds `N+8`, e.g. via
`num_threads` clause, we need to expand `__kmp_threads`. In
`__kmp_expand_threads`, the expansion starts from `__kmp_threads_capacity`, and
repeatedly doubling it until the new capacity meets the requirement. Let's
assume the new requirement is `Y`.  If `Y` happens to meet the constraint
`(N+8)*2^X=Y` where `X` is the number of iterations, the new capacity is not
enough because we have 8 slots for hidden helper threads.

Here is an example.
```
#include <vector>

int main(int argc, char *argv[]) {
  constexpr const size_t N = 1344;
  std::vector<int> data(N);

#pragma omp parallel for
  for (unsigned i = 0; i < N; ++i) {
    data[i] = i;
  }

#pragma omp parallel for num_threads(N)
  for (unsigned i = 0; i < N; ++i) {
    data[i] += i;
  }

  return 0;
}
```
My CPU is 20C40T, then `__kmp_threads_capacity` is 160. After offset,
`__kmp_threads_capacity` becomes 168. `1344 = (160+8)*2^3`, then the assertions
hit.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D98838
2021-03-18 18:25:36 -04:00
Hansang Bae a6f9cb6adc [OpenMP] Add runtime interface for OpenMP 5.1 error directive
The proposed new interface is for supporting `at(execution)` clause in the
error directive.

Differential Revision: https://reviews.llvm.org/D98448
2021-03-16 08:55:25 -05:00
Peyton, Jonathan L 7085f04573 [OpenMP] Remove unused cpu_stackoffset member 2021-03-15 16:52:04 -05:00
AndreyChurbanov aaf16b80dd [OpenMP] libomp: eliminate pause from atomic CAS loops
For clang this change is NFC cleanup, because clang
never calls atomic functions from runtime library.

Basically, pause is good in spin-loops waiting for something.
Atomic CAS loops do not wait for anything,
each CAS failure means some other thread progressed.

Performance experiments show that the pause only causes unnecessary slowdown
on CPUs with slow pause instruction, no difference on CPUs with fast pause
instruction, removal of the pause gives lesser binary size which is good.

Differential Revision: https://reviews.llvm.org/D97079
2021-03-09 18:30:08 +03:00
AndreyChurbanov e4492b6f31 [OpenMP] NFC: temporarily disable assertion until the bug with dependences is fixed 2021-03-08 22:18:30 +03:00
Peyton, Jonathan L e2738b3758 [OpenMP] Fix potential integer overflow in dynamic schedule code
Restrict the chunk_size * chunk_num to only occur for valid
chunk_nums and reimplement calculating the limit to avoid overflow.

Differential Revision: https://reviews.llvm.org/D96747
2021-03-08 09:43:05 -06:00
tlwilmar 97d000cfc6 Added API for "masked" construct via two entrypoints: __kmpc_masked,
and __kmpc_end_masked. The "master" construct is deprecated. Changed
proc-bind keyword from "master" to "primary". Use of both master
construct and master as proc-bind keyword is still allowed, but
deprecated.

Remove references to "master" in comments and strings, and replace
with "primary" or "primary thread". Function names and variables were
not touched, nor were references to deprecated master construct. These
can be updated over time. No new code should refer to master.
2021-03-05 09:29:57 -06:00
Hansang Bae b6c2f538b2 [OpenMP] Add allocator support for target memory
This is a preview of allocator support for target memory that depends on the
offload runtime API which allocates memory as described below.

llvm_omp_target_alloc_host(size_t size, int device_num);
-- Returns non-migratable memory owned by host.
-- Memory is accessible by host and device(s).

llvm_omp_target_alloc_shared(size_t size, int device_num);
-- Returns migratable memory owned by host and device.
-- Memory is accessible by host and device.

llvm_omp_target_alloc_device(size_t size, int device_num);
-- Returns memory owned by device.
-- Memory is only accessible by device.

New memory space and predefined allocator names are
-- llvm_omp_target_host_mem_space
-- llvm_omp_target_shared_mem_space
-- llvm_omp_target_device_mem_space
-- llvm_omp_target_host_mem_alloc
-- llvm_omp_target_shared_mem_alloc
-- llvm_omp_target_device_mem_alloc

Differential Revision: https://reviews.llvm.org/D96669
2021-03-02 16:45:12 -06:00
Peyton, Jonathan L e83380fccc [OpenMP] Fix clang-cl build error regarding TSX intrinsics
Fix for https://bugs.llvm.org/show_bug.cgi?id=49339

The CMake check for the RTM intrinsics needs the -mrtm flag to be set
during the test. This way clang-cl correctly detects it has the
_xbegin() intrinsic. Otherwise, the CMake check fails.

Differential Revision: https://reviews.llvm.org/D97413
2021-03-02 07:47:42 -06:00
AndreyChurbanov 1df6e58e55 [OpenMP] libomp minor cleanup
Cleanup changes:
- check value read from file;
- remove dead code;
- make unsigned variable to read hexadecimal number to;
- add debug assertion to check ref count.

Differential Revision: https://reviews.llvm.org/D96893
2021-02-26 00:44:51 +03:00
AndreyChurbanov 4932101177 [OpenMP] libomp: fix ittnotify stack stitching for teams construct
Stitching id could be overridden causing reference of destroyed object
when number of teams is 1. The patch separates stitching id store
location for teams and parallel nested in teams.

Differential Revision: https://reviews.llvm.org/D96562
2021-02-26 00:23:24 +03:00
Peyton, Jonathan L d12ae7db99 [OpenMP] Fix accidental addition of use omp_lib_kinds
Fortran header accidentally had use omp_lib_kinds added inside a
subroutine and function. This patch removes the lines.
2021-02-25 12:49:56 -06:00
Harmen Stoppels a54f160b3a Prefer /usr/bin/env xxx over /usr/bin/xxx where xxx = perl, python, awk
Allow users to use a non-system version of perl, python and awk, which is useful
in certain package managers.

Reviewed By: JDevlieghere, MaskRay

Differential Revision: https://reviews.llvm.org/D95119
2021-02-25 11:32:27 +01:00
Shilei Tian e5da63d5a9 [OpenMP] Fixed a crash when offloading to x86_64 with target nowait
PR#49334 reports a crash when offloading to x86_64 with `target nowait`,
which is caused by referencing a nullptr. The root cause of the issue is, when
pushing a hidden helper task in `__kmp_push_task`, it also maps the gtid to its
shadow gtid, which is wrong.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D97329
2021-02-24 12:37:30 -05:00
Joachim Protze 35ab6d6390 [OpenMP][Tests][NFC] rename macro to avoid naming clash
When including <ostream>, the register_callback macro of the OMPT callback.h
clashes with a function defined in ostream. This patch renames the macro
and includes ompt into the macro name.
2021-02-24 18:03:54 +01:00
Peyton, Jonathan L 56223b1e91 [OpenMP] Help static loop code avoid over/underflow
This code alleviates some pathological loop parameters (lower,
upper, stride) within calculations involved in the static loop code.  It
bounds the chunk size to the trip count if it is greater than the trip
count and also minimizes problematic code for when trip count < nth.

Differential Revision: https://reviews.llvm.org/D96426
2021-02-22 13:22:01 -06:00
Peyton, Jonathan L 1b968467c0 [OpenMP] Remove shutdown attempt on Windows process detach
Only attempt shutdown if lpReserved is NULL. The Windows documentation
states:
When handling DLL_PROCESS_DETACH, a DLL should free resources such as
heap memory only if the DLL is being unloaded dynamically (the
lpReserved parameter is NULL). If the process is terminating (the
lpReserved parameter is non-NULL), all threads in the process except the
current thread either have exited already or have been explicitly
terminated by a call to the ExitProcess function, which might leave some
process resources such as heaps in an inconsistent state. In this case,
it is not safe for the DLL to clean up the resources. Instead, the DLL
should allow the operating system to reclaim the memory.

Differential Revision: https://reviews.llvm.org/D96750
2021-02-22 13:15:06 -06:00
Peyton, Jonathan L 8c73be9d86 [OpenMP] Limit number of dispatch buffers
This patch limits the number of dispatch buffers (used for
loop worksharing construct) to between 1 and 4096.

Differential Revision: https://reviews.llvm.org/D96749
2021-02-22 13:14:28 -06:00
Peyton, Jonathan L 55dff8b2e4 [OpenMP] Update HWLOC code for die level detection
Differential Revision: https://reviews.llvm.org/D96748
2021-02-22 13:05:55 -06:00
AndreyChurbanov 1611e5473c [OpenMP] libomp: cleanup some resource leaks
Close mutexattr and condattr local objects to eliminate resource leaks.

Differential Revision: https://reviews.llvm.org/D96892
2021-02-20 23:27:37 +03:00
Shilei Tian 309b00a42e [OpenMP][NFC] clang-format the whole openmp project
Same script as D95318. Test files are excluded.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D97088
2021-02-20 12:46:32 -05:00
AndreyChurbanov dab5d6c2eb [OpenMP] fix race condition in test 2021-02-18 02:27:49 +03:00
AndreyChurbanov cf1ddae7e3 [OpenMP][NFC] replaced 'dependencies' with 'dependences' in comments and debug prints 2021-02-18 00:38:18 +03:00
AndreyChurbanov 5631842d18 [OpenMP] NFC: fix test removing the target construct 2021-02-13 04:49:52 +03:00
AndreyChurbanov 091e8daa24 [OpenMP] fix test adding mapping of shared variables 2021-02-13 04:13:54 +03:00
Martin Storsjö 496ca4127e [OpenMP] Silence more warning flags
This silences warnings like these, in mingw builds with clang:

runtime/src/kmp_atomic.h:1021:13: warning: '__kmpc_atomic_cmplx8_rd' has C-linkage specified, but returns user-defined type 'kmp_cmplx64' (aka '__kmp_cmplx64_t') which is incompatible with C [-Wreturn-type-c-linkage]

runtime/src/z_Windows_NT_util.cpp:479:17: warning: cast from 'volatile void *' to 'type-parameter-0-0 *' drops volatile qualifier [-Wcast-qual]
    flag = (C *)th->th.th_sleep_loc;

runtime/src/z_Windows_NT_util.cpp:1321:14: warning: cast to 'void *' from smaller integer type 'DWORD' (aka 'unsigned long') [-Wint-to-void-pointer-cast]
  } else if ((void *)exit_val != (void *)th) {

Differential Revision: https://reviews.llvm.org/D96585
2021-02-12 21:55:32 +02:00
Martin Storsjö 16428a8d91 [OpenMP] Avoid warnings about unused static functions on windows
Add ifdefs around one function that only is used in unix build
configurations.

Add a void cast for a windows specific function that currently is
unused but may be intended to be used at some point.

Differential Revision: https://reviews.llvm.org/D96584
2021-02-12 21:55:31 +02:00
Martin Storsjö b388c84c09 [OpenMP] Remove two entirely unused variables
Differential Revision: https://reviews.llvm.org/D96583
2021-02-12 21:55:31 +02:00
Martin Storsjö b3d84790fa [OpenMP] Add void casts to silence unused variable warnings
These variables are used only in certain build configurations,
or marked with a todo comment indicating that they should be
used/checked/reported.

Differential Revision: https://reviews.llvm.org/D96582
2021-02-12 21:55:31 +02:00
Martin Storsjö 3f9519b768 [OpenMP] Only use #pragma comment(lib, ...) in MSVC build configurations
MinGW build configurations don't support this pragma (unless
compiling with clang, with -fms-extensions, and linking with
lld), and at least clang warns about it.

This library does end up linked by the cmake files anyway (as
long as the check works properly).

Differential Revision: https://reviews.llvm.org/D96581
2021-02-12 21:55:31 +02:00
Martin Storsjö 77632422bc [OpenMP] Fix the check for libpsapi for i386
check_library_exists fails for stdcall functions, because that
check doesn't include the necessary headers (and thus fails with
an undefined reference to _EnumProcessModules, when the import
library symbol actually is called _EnumProcessModules@16).

Merge the two previous checks check_include_files and
check_library_exists into one with check_c_source_compiles, and
merge the variables that indicate whether it succeeded.

Differential Revision: https://reviews.llvm.org/D96580
2021-02-12 21:55:30 +02:00
AndreyChurbanov 838dcdb5fc [OpenMP] libomp: minor changes to improve library performance
Three minor changes in this patch:
- added UNLIKELY hint to few rarely executed branches;
- replaced couple of run time checks with debug assertions;
- moved check of presence of ittnotify tool from inside the function call.

Differential Revision: https://reviews.llvm.org/D95816
2021-02-12 00:43:13 +03:00
Hansang Bae ffb21e7f05 [OpenMP] Enable omp_get_num_devices() on Windows
This patch enables omp_get_num_devices() and omp_get_initial_device() on
Windows by providing an alternative to dlsym on Windows, and proposes to
add a new libomptarget entry, __tgt_get_num_devices().

Differential Revision: https://reviews.llvm.org/D96182
2021-02-11 14:53:48 -06:00
Nawrin Sultana 4692bb4a8a [OpenMP] Add lower and upper bound in num_teams clause
This patch adds lower-bound and upper-bound to num_teams clause
according to OpenMP 5.1 specification. The initial number of teams
created is implementation defined, but it will be greater than or
equal to lower-bound and less than or equal to upper-bound. If
num_teams clause is not specified, the number of teams created is
implementation defined, but it will be greater or equal to 1.

Differential Revision: https://reviews.llvm.org/D95820
2021-02-10 13:58:50 -06:00
Shilei Tian 3c31b78455 [OpenMP] Fixed an issue that taskwait doesn't work on detachable task
D77609 mistakenly changed the bebavior of task waiting on detachable task that a detachable task is not waited, based on https://lists.llvm.org/pipermail/openmp-dev/2021-February/003836.html. This patch fixed it. Thank Raúl for the report.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95798
2021-02-03 13:12:43 -05:00
Peyton, Jonathan L ffca74b8b8 [OpenMP] Fix sign comparison warnings from GCC
New affinity patch introduced legitimate sign-compare warnings that
clang doesn't report but GCC-10 does. This removes the warnings by
changing two variables types to unsigned.

Differential Revision: https://reviews.llvm.org/D95818
2021-02-02 10:52:16 -06:00
AndreyChurbanov d7b12004bd [OpenMP] libomp: implement nteams-var and teams-thread-limit-var ICVs
The change includes OMP_NUM_TEAMS, OMP_TEAMS_THREAD_LIMIT env variables,
omp_set_num_teams, omp_get_max_teams, omp_set_teams_thread_limit,
omp_get_teams_thread_limit routines.

Differential Revision: https://reviews.llvm.org/D95003
2021-02-01 22:54:11 +03:00
Tobias Hieta c3c02d0d5a [OpenMP] Fix python3 compatibility in openmp's lit.cfg
Differential Revision: https://reviews.llvm.org/D95669
2021-02-01 08:20:26 +01:00
Jonathan Peyton 67773681c0 [OpenMP] Add environment variable to force monotonic dynamic scheduling
This patch introduces a new environment variable to force monotonic
behavior for users that absolutely need it.  This is in anticipation
of 5.0 change that uses non-monotonic behavior for dynamic scheduling
by default. Fixes for that and the actual switch are coming soon.

Differential Revision: https://reviews.llvm.org/D95263
2021-01-29 12:23:27 -06:00
AndreyChurbanov 7f5ad0e071 [OpenMP] libomp: fix build by cl with vs2019
Replace VLA with dynamic allocation using alloca().
This fixes https://bugs.llvm.org/show_bug.cgi?id=48919.

Differential Revision: https://reviews.llvm.org/D95627
2021-01-29 13:16:41 +03:00
AndreyChurbanov ac70a53653 [OpenMP] NFC: disabled two flakey tests as the bug in libomp not fixed yet 2021-01-29 00:54:13 +03:00
Shilei Tian c571b16834 [OpenMP] Disabled profiling in `libomp` by default to unblock link errors
Link error occurred when time profiling in libomp is enabled by default
because `libomp` is assumed to be a C library but the dependence on
`libLLVMSupport` for profiling is a C++ library. Currently the issue blocks all
OpenMP tests in Phabricator.

This patch set a new CMake option `OPENMP_ENABLE_LIBOMP_PROFILING` to
enable/disable the feature. By default it is disabled. Note that once time
profiling is enabled for `libomp`, it becomes a C++ library.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95585
2021-01-28 07:24:32 -05:00
Peyton, Jonathan L 8e67134364 [OpenMP] Fix misleading warning for OMP_PLACES
When OMP_PLACES contains an invalid value, the warning informs the user
that the fallback is OMP_PLACES=threads, but the actual internal setting
is OMP_PLACES=cores and is detected as such with KMP_SETTINGS=1.
This patch informs the user that OMP_PLACES=cores is being used instead
of OMP_PLACES=threads.

Differential Revision: https://reviews.llvm.org/D95170
2021-01-27 14:27:24 -06:00
Peyton, Jonathan L 598c590b3c [OpenMP] Add cpuid leaf 1f topology discovery
This patch adds the new algorithm for topology discovery using cpuid
leaf 1f.  Only the new die level is detected and integrated into the
current affinity mechanisms including KMP_AFFINITY (granularity level
and compact/scatter algorithm), OMP_PLACES=dies, and KMP_HW_SUBSET.

Differential Revision: https://reviews.llvm.org/D95157
2021-01-27 14:27:23 -06:00
Peyton, Jonathan L 9f87c6b47d [OpenMP] Fix HWLOC topology detection for 2.0.x
HWLOC 2.0 has numa nodes as separate children and are not in the main
parent/child topology tree anymore.  This change takes this into
account.  The main topology detection loop in the create_hwloc_map()
routine starts at a hardware thread within the initial affinity mask and
goes up the topology tree setting the socket/core/thread labels
correctly.

This change also introduces some of the more generic changes that the
future kmp_topology_t structure will take advantage of including a
generic ratio & count array (finding all ratios of topology layers like
threads/core cores/socket and finding all counts of each topology
layer), generic radix1 reduction step, generic uniformity check, and
generic printing of topology (en_US.txt)

Differential Revision: https://reviews.llvm.org/D95156
2021-01-27 14:27:23 -06:00
Giorgis Georgakoudis bb40e67318 [OpenMP] Fix building using LLVM_ENABLE_RUNTIMES
Fix when time profiling is enabled.

Related to: D94855

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D95398
2021-01-27 06:43:57 -08:00
AndreyChurbanov 498c4b6fc4 [OpenMP] libomp: fix build by clang-cl with vs2019
Problem reported by Joseph Shen <joseph.smeng@gmail.com>.
The patch changes *(&<atomic-var>) to (&<atomic-var>)->load().

Differential Revision: https://reviews.llvm.org/D95485
2021-01-27 12:18:15 +03:00
Nawrin Sultana 927af4b3c5 [OpenMP] Modify OMP_ALLOCATOR environment variable
This patch sets the def-allocator-var ICV based on the environment variables
provided in OMP_ALLOCATOR. Previously, only allowed value for OMP_ALLOCATOR
was a predefined memory allocator. OpenMP 5.1 specification allows predefined
memory allocator, predefined mem space, or predefined mem space with traits in
OMP_ALLOCATOR. If an allocator can not be created using the provided environment
variables, the def-allocator-var is set to omp_default_mem_alloc.

Differential Revision: https://reviews.llvm.org/D94985
2021-01-26 18:27:39 -06:00
Shilei Tian 9d64275ae0 [OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks.  We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want.

Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.

Here are some open issues to be discussed:
1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here?

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D77609
2021-01-25 22:16:17 -05:00
Hansang Bae 480cbed31e [OpenMP] Remove unnecessary pointer checks in a few locations
Also, return NULL from unsuccessful OMPT function lookup.

Differential Revision: https://reviews.llvm.org/D95277
2021-01-22 19:18:50 -06:00
Joseph Schuchart edbcc17b7a [OpenMP] libomp: properly initialize buckets in __kmp_dephash_extend
The buckets are initialized in __kmp_dephash_create but when they are extended
the memory is allocated but not NULL'd, potentially leaving some buckets
uninitialized after all entries have been copied into the new allocation.
This commit makes sure the buckets are properly initialized with NULL before
copying the entries.

Differential Revision: https://reviews.llvm.org/D95167
2021-01-22 20:29:46 +03:00
Giorgis Georgakoudis 6b7645dd31 [OpenMP] Add time profiling support in libomp
Profiling has been recently implemented in libomptarget (D93055). This patch enables time profiling support for libomptarget in libomp, to support profiling of multi-threaded execution of offloaded regions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94855
2021-01-21 09:15:14 -08:00
Hansang Bae 2d911f7c72 [OpenMP] Fix atomic entries for captured logical operation
Added missing code for the captured atomic operation.

Differential Revision: https://reviews.llvm.org/D94848
2021-01-19 09:59:28 -06:00
AndreyChurbanov a60bc55c69 [OpenMP] libomp: cleanup parsing of OMP_ALLOCATOR env variable.
Differential Revision: https://reviews.llvm.org/D94932
2021-01-19 16:21:22 +03:00
AndreyChurbanov aa3a59e0c6 [OpenMP][NFC] Fix test
The test fails if memkind library is accessible.
2021-01-19 00:05:34 +03:00
Shilei Tian 9bf843bdc8 Revert "[OpenMP] Added the support for hidden helper task in RTL"
This reverts commit ed939f853d.
2021-01-18 06:57:52 -05:00
Chandler Carruth f855751c12 Fix openmp CMake build on non-Linux AArch64 systems.
This just checks for `/proc/cpuinfo` existing before reading it.

Tested on an ARM macOS machine.
2021-01-17 16:18:31 -08:00
Shilei Tian ed939f853d [OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks.  We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want.

Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.

Here are some open issues to be discussed:
1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here?

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D77609
2021-01-16 14:13:35 -05:00
Terry Wilmarth 4fe17ada55 [OpenMP] Fix hierarchical barrier
Hierarchical barrier is an experimental barrier algorithm that uses aspects
of machine hierarchy to define the barrier tree structure. This patch fixes
offset calculation in hierarchical barrier. The offset is used to store info
on a flag about sleeping threads waiting on a location stored in the flag.
This commit also fixes a potential deadlock in hierarchical barrier when
using infinite blocktime by adjusting the offset value of leaf kids so that
it matches the value of leaf state. It also adds testing of default barriers
with infinite blocktime, and also tests hierarchical barrier algorithm with
both default and infinite blocktime.

Patch by Terry Wilmarth and Nawrin Sultana.

Differential Revision: https://reviews.llvm.org/D94241
2021-01-13 10:22:57 -06:00
Hansang Bae bba3a82b56 [OpenMP] Use persistent memory for omp_large_cap_mem
This change enables volatile use of persistent memory for omp_large_cap_mem*
on supported systems. It depends on libmemkind's support for persistent memory,
and requirements/details can be found at the following url.

https://pmem.io/2020/01/20/memkind-dax-kmem.html

Differential Revision: https://reviews.llvm.org/D94353
2021-01-12 20:35:27 -06:00
Hansang Bae 6f0f022038 [OpenMP] Update allocator trait key/value definitions
Use new definitions introduced in 5.1 specification.

Differential Revision: https://reviews.llvm.org/D94277
2021-01-12 20:09:45 -06:00
Shilei Tian 676c7cb0c0 [OpenMP] Added the support for cache line size 256 for A64FX
Fugaku supercomputer is built with the Fujitsu A64FX microprocessor, whose cache line is 256. In current libomp, we only have cache line size 128 for PPC64 and otherwise 64. This patch added the support of cache line 256 for A64FX. It's worth noting that although A64FX is a variant of AArch64, this property is not shared. As a result, in light of UCX source code (392443ab92/src/ucs/arch/aarch64/cpu.c (L17)), we can only determine by checking whether the CPU is FUJITSU A64FX.

Reviewed By: jdoerfert, Hahnfeld

Differential Revision: https://reviews.llvm.org/D93169
2021-01-09 11:58:47 -05:00
Hansang Bae fb1c528526 [OpenMP] Use c_int/c_size_t in Fortran target memory routine interface
The Fortran interface is now in line with 5.1 specification.

Differential Revision: https://reviews.llvm.org/D94042
2021-01-06 16:28:30 -06:00
Hansang Bae 82a29a62ab [OpenMP] Add definition/interface for target memory routines
The change includes new routines introduced in 5.1 and Fortran
interface.

Differential Revision: https://reviews.llvm.org/D93505
2021-01-04 08:12:57 -06:00
Terry Wilmarth 6b316febb4 [OpenMP] libomp: Handle implicit conversion warnings
This patch partially prepares the runtime source code to be built with
-Wconversion, which should trigger warnings if any implicit conversions
can possibly change a value. For builds done with icc or gcc, all such
warnings are handled in this patch. clang gives a much longer list of
warnings, particularly for sign conversions, which the other compilers
don't report. The -Wconversion flag is commented into cmake files, but
I'm not going to turn it on. If someone thinks it is important, and wants
to fix all the clang warnings, they are welcome to.

Types of changes made here involve either improving the consistency of types
used so that no conversion is needed, or else performing careful explicit
conversions, when we're sure a problem won't arise.

Patch is a combination of changes by Terry Wilmarth and Johnny Peyton.

Differential Revision: https://reviews.llvm.org/D92942
2020-12-31 00:39:57 +03:00
Hansang Bae e1fd202489 [OpenMP] Add definitions for 5.1 interop to omp.h 2020-12-17 13:03:59 -06:00
Peyton, Jonathan L 5aafdd7b88 [OpenMP] Introduce new file wrapper class for runtime
Introduce new kmp_safe_raii_file_t class with RAII semantics for file
open/close. It is essentially a wrapper around the C-style FILE* object.
This also unifies the way we error report if a file can't be opened.

Differential Revision: https://reviews.llvm.org/D92604
2020-12-15 14:46:30 -06:00
Hansang Bae 171ca93c54 [OpenMP] Initialize runtime in the forked child process
This patch enables serial initialization in the forked child process
to fix unstable runtime behavior when used with Python-based AI tools.

Differential Revision: https://reviews.llvm.org/D93230
2020-12-15 07:29:28 -06:00
Hansang Bae c3b5009aa7 [OpenMP] Use RTM lock for OMP lock with synchronization hint
This patch introduces a new RTM lock type based on spin lock which is
used for OMP lock with speculative hint on supported architecture.

Differential Revision: https://reviews.llvm.org/D92615
2020-12-09 19:14:53 -06:00
Nawrin Sultana 540007b427 [OpenMP] Add strict mode in num_tasks and grainsize
This patch adds new API __kmpc_taskloop_5 to accomadate strict
modifier (introduced in OpenMP 5.1) in num_tasks and grainsize
clause.

Differential Revision: https://reviews.llvm.org/D92352
2020-12-09 16:46:30 -06:00
Peyton, Jonathan L fe3b244ef7 [OpenMP] Fix norespect affinity bug for Windows
KMP_AFFINITY=norespect was triggering an error because the underlying
process affinity mask was not updated to include the entire machine.
The Windows documentation states that the thread affinities must be
subsets of the process affinity. This patch also moves the printing
(for KMP_AFFINITY=verbose) of whether the initial mask was respected
out of each topology detection function and to one location where the
initial affinity mask is read.

Differential Revision: https://reviews.llvm.org/D92587
2020-12-09 14:32:48 -06:00
Peyton, Jonathan L 9b7d6a6bff [OpenMP] Fix too long name for shm segment on macOS
Remove the user id component to the shm segment name and just use
the pid like before.

Differential Revision: https://reviews.llvm.org/D92660
2020-12-09 14:31:15 -06:00
AndreyChurbanov fff1abc406 [OpenMP] NFC: comment adjusted 2020-12-07 19:50:14 +03:00
AndreyChurbanov 22558c8501 [OpenMP] libomp: Fix possible NULL dereferences
Check pointer returned by strchr, as it can be NULL in case of broken
format of input string. Introduced new function __kmp_str_loc_numbers
for fast parsing of numbers only in the location string.
Also made some cleanup of __kmp_str_loc_init declaration and usage:
- changed type of init_fname parameter to bool;
- changed input from true to false in places where fname is not used.

Differential Revision: https://reviews.llvm.org/D90962
2020-12-07 19:09:07 +03:00
Joachim Protze a148216b31 [OpenMP][OMPT] Fix OMPT return address guard for gomp interface
D91692 missed various locations in kmp_gsupport, where the scope for
OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT
callback is called in some nested function.

This patch fixes the scoping issue, so that all OMPT tests pass, when the
tests are built with gcc.

Differential Revision: https://reviews.llvm.org/D92121
2020-12-05 19:06:28 +01:00
Joachim Protze d3ec512b1d [OpenMP][OMPT] Make sure that 0 is never used as ID in tests (NFC) 2020-12-04 18:41:56 +01:00
Hansang Bae c4a22224d9 [OpenMP] Add __kmpc_omp_target_task_alloc to dllexport
This patch enables use of the entry on Windows.

Differential Revision: https://reviews.llvm.org/D92618
2020-12-04 08:11:14 -06:00
Terry Wilmarth e0665a9050 [OpenMP] Add support for Intel's umonitor/umwait
These changes add support for Intel's umonitor/umwait usage in wait
code, for architectures that support those intrinsic functions. Usage of
umonitor/umwait is off by default, but can be turned on by setting the
KMP_USER_LEVEL_MWAIT environment variable.

Differential Revision: https://reviews.llvm.org/D91189
2020-12-01 14:07:46 -06:00
AndreyChurbanov 6bf84871e9 [OpenMP] libomp: add UNLIKELY hints to rarely executed branches
Added UNLIKELY hint to one-time or rarely executed branches.
This improves performance of the library on some tasking benchmarks.

Differential Revision: https://reviews.llvm.org/D92322
2020-12-01 16:53:21 +03:00
Joachim Protze fd3d1b09c1 [OpenMP][Tests][NFC] Use FileCheck from cmake config 2020-11-30 23:16:56 +01:00
Todd Erdner 9615890db5 [OpenMP] libomp: change shm name to include UID, call unregister_lib on SIGTERM
With the change to using shared memory, there were a few problems that need to be fixed.
- The previous filename that was used for SHM only used process id. Given that process is
  usually based on 16bit number, this was causing some conflicts on machines. Thus we add
  UID to the name to prevent this.
- It appears under some conditions (SIGTERM, etc) the shared memory files were not getting
  cleaned up. Added a call to clean up the shm files under those conditions. For this user
  needs to set envirable KMP_HANDLE_SIGNALS to true.

Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D91869
2020-12-01 00:40:47 +03:00
AndreyChurbanov f6f28b44ad [OpenMP] libomp: fix mutexinoutset dependence for proxy tasks
Once __kmp_task_finish is not executed for proxy tasks,
move mutexinoutset dependency code to __kmp_release_deps
which is executed for all task kinds.

Differential Revision: https://reviews.llvm.org/D92326
2020-12-01 00:13:31 +03:00
Joachim Protze 723be4042a [OpenMP][OMPT][NFC] Fix failing test
The test would fail for gcc, when built with debug flag.
2020-11-29 19:07:42 +01:00
Joachim Protze cdf9401df8 [OpenMP][OMPT][NFC] Fix flaky test
The test had a chance to finish the first task before the second task is
created. In this case, the dependences-pair event would not trigger.
2020-11-29 19:07:41 +01:00
Martin Storsjö 6b429668de [OpenMP][OMPT] Fix building with OMPT disabled after 6d3b81664a 2020-11-26 10:09:32 +02:00
AndreyChurbanov 9e3e332d27 [OpenMP] libomp: fix non-X86, non-AARCH64 builds
Commit https://reviews.llvm.org/rG7b5254223acbf2ef9cd278070c5a84ab278d7e5f
broke the build for some architectures, because macro KMP_PREFIX_UNDERSCORE
was defined only for x86, x86_64 and aarch64. This patch defines it for other
architectures (as a no-op).

Differential Revision: https://reviews.llvm.org/D92027
2020-11-25 20:40:23 +03:00
Joachim Protze 6d3b81664a [OpenMP][OMPT] Introduce a guard to handle OMPT return address
This is an alternative approach to address inconsistencies pointed out in: D90078
This patch makes sure that the return address is reset, when leaving the scope.
In some cases, I had to move the macro out of an if-statement to have it in the
right scope, in some cases I added an additional block to restrict the scope.

This patch does not handle inconsistencies, which might occur if the return
address is still set when we call into the application.

Test case (repeated_calls.c) provided by @hbae

Differential Revision: https://reviews.llvm.org/D91692
2020-11-25 18:17:44 +01:00
Isabel Thärigen b281a05dac [OpenMP][OMPT] Implement verbose tool loading
OpenMP 5.1 introduces the new env variable
OMP_TOOL_VERBOSE_INIT=(disabled|stdout|stderr|<filename>) to enable verbose
loading and initialization of OMPT tools.
This env variable helps to understand the cause when loading of a tool fails
(e.g., undefined symbols or dependency not in LD_LIBRARY_PATH)
Output of OMP_TOOL_VERBOSE_INIT is added for OMP_DISPLAY_ENV

Tests for this patch are integrated into the different existing tool loading
tests, making these tests more verbose. An Archer specific verbose test is
integrated into an existing Archer test.

Patch prepared by: Isabel Thärigen

Differential Revision: https://reviews.llvm.org/D91464
2020-11-25 18:17:44 +01:00
AndreyChurbanov 7b5254223a [OpenMP] fix asm code for for arm64 (AARCH64) for Darwin/macOS
Adjusted external reference for Darwin/AARCH64 link compatibility.
Made size directive conditional only if __ELF__ defined.

Patch by Michael_Pique <mpique@icloud.com>

Differential Revision: https://reviews.llvm.org/D88252
2020-11-24 13:08:24 +03:00
AndreyChurbanov 5644f734d6 Revert "[OpenMP] Add support for Intel's umonitor/umwait"
This reverts commit 9cfad5f9c5.
2020-11-20 12:16:34 +03:00
AndreyChurbanov 9cfad5f9c5 [OpenMP] Add support for Intel's umonitor/umwait
Patch by tlwilmar (Terry Wilmarth)

Differential Revision: https://reviews.llvm.org/D91189
2020-11-19 22:04:21 +03:00
Hansang Bae 44a11c342c [OpenMP] Use explicit type casting in kmp_atomic.cpp
Differential Revision: https://reviews.llvm.org/D91105
2020-11-17 14:31:13 -06:00
Nawrin Sultana 5439db05e7 [OpenMP] Add omp_realloc implementation
This patch adds omp_realloc function implementation according to
OpenMP 5.1 specification.

Differential Revision: https://reviews.llvm.org/D90971
2020-11-17 13:43:00 -06:00
Peyton, Jonathan L 8647c669a4 [OpenMP] NFC: remove tabs in message catalog file 2020-11-17 10:15:04 -06:00
Peyton, Jonathan L 0454154efd [OpenMP][stats] reset serial state when re-entering serial region
Differential Revision: https://reviews.llvm.org/D90867
2020-11-17 10:09:56 -06:00
Martin Storsjö 9bcef58b63 [OpenMP] Fix building for windows after adding omp_calloc
Differential Revision: https://reviews.llvm.org/D91478
2020-11-15 21:32:38 +02:00
Nawrin Sultana 938f1b8581 [OpenMP] Add omp_calloc implementation
This patch adds omp_calloc implementation according to OpenMP 5.1
specification.

Differential Revision: https://reviews.llvm.org/D90967
2020-11-13 14:35:46 -06:00
Shilei Tian 24d0ef0f50 [OpenMP] Fixed a bug when displaying affinity
Currently the affinity format string has initial value. When users set
the format via OMP_AFFINITY_FORMAT, it will overwrite the format string. However,
when copying the format, the tailing null is missing. As a result, if the user
format string is shorter than default value, the remaining part in the default
value still makes effort. This bug is not exposed because the test case doesn't
check the end of a string. It only checks whether given output "contains" the
check string.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D91309
2020-11-12 22:27:32 -05:00
Peyton, Jonathan L dd8723d348 [OpenMP] Fix shutdown hang/race bug
The deadlock/race happens when primary thread gets initz lock and tries to join
the worker thread which waits for the same lock in TLS key destructor.
The patch removes the lock and the code of setting TLS value which needed
the lock. Also removed setting TLS from __kmp_unregister_root_current_thread.

Differential Revision: https://reviews.llvm.org/D90647
2020-11-11 13:47:23 -06:00
Joachim Protze ce0911b3e9 [OpenMP][Tests] Fix compiler warnings in OpenMP runtime tests
This patch allows to pass the OpenMP runtime tests after configuring with
`cmake . -DOPENMP_TEST_FLAGS:STRING="-Werror"`.
The warnings for OMPT tests are addressed in D90752.

Differential Revision: https://reviews.llvm.org/D91280
2020-11-11 20:13:21 +01:00
Joachim Protze 6213ed062b [OpenMP][OMPT] Update the omp-tools header file to reflect 5.1 changes
This doesn't add functionality, but just adds the new types and renames the
master callback to masked callback.

Differential Revision: https://reviews.llvm.org/D90752
2020-11-11 20:13:21 +01:00
AndreyChurbanov 33da6bd7f5 [OpenMP] Fixes for shared memory cleanup when aborts occur
Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D90974
2020-11-11 00:16:23 +03:00
Hansang Bae ef7738240c [OpenMP] Remove obsolete Fortran module file
Modern Fortran compilers support Fortran 90, so we do not need to use
the source code for Fortran compilers that do not support Fortran 90.

Differential Revision: https://reviews.llvm.org/D90077
2020-11-09 15:26:38 -06:00
Nawrin Sultana 082031949c [OpenMP] Fix potential division by 0
This patch fixes potential division by 0 in case hwloc does not
recognize cores (or architecture has no cores).

Patch by Andrey Churbanov

Differential Revision: https://reviews.llvm.org/D90954
2020-11-06 11:52:19 -06:00
Peyton, Jonathan L 5e34877480 [OpenMP] Add ident_t flags for compiler OpenMP version
This patch adds the mask and ident_t function to get the
openmp version. It also adds logic to force monotonic:dynamic
behavior when OpenMP version less than 5.0.

The OpenMP version is stored in the format:
major*10+minor e.g., OpenMP 5.0 = 50

Differential Revision: https://reviews.llvm.org/D90632
2020-11-05 11:14:25 -06:00
Joachim Protze 7b0ca32b62 [OpenMP] avoid warning: equality comparison with extraneous parentheses
The macros are used in several places with an if(macro) pattern. This results
in several warnings about extraneous parenteses in equality comparison.

Having the constant at the lhs of the comparison, avoids this warning.

Differential Revision: https://reviews.llvm.org/D90756
2020-11-05 12:13:08 +01:00
Joachim Protze b0eb19bf8a [OpenMP][OMPT][NFC] Fix flaky test
As reported by @ronlieb, the test shows intermittent fails.
The test failed, if the dependent task was already finished, when the depending
task was to be created. We have other tests to check for the dependences pair.
2020-11-03 13:15:32 +01:00
Peyton, Jonathan L 771f0fb92d [OpenMP] Add NULL check in dispatcher debug output
Patch by Nawrin Sultana

Differential Revision: https://reviews.llvm.org/D90403
2020-10-29 14:08:03 -05:00
AndreyChurbanov d6a0957467 [OpenMP] changing OMP rtl to use shared memory instead of env variable
Patch by Erdner, Todd <todd.erdner@intel.com>

Differential Revision: https://reviews.llvm.org/D89898
2020-10-26 19:02:21 +03:00
Joachim Protze 34b34e90fc [OpenMP][Tests] NFC: fix flaky test failure caused by rare scheduling
The worker thread can start execution of the task before creation of the second task
Fixes the spurious failure reported in https://reviews.llvm.org/D61657
2020-10-05 16:55:32 +02:00
Joachim Protze 6104b30446 [OpenMP][OMPT] Update OMPT tests for newly added GOMP interface patches
This patch updates the expected results for the GOMP interface patches: D87267, D87269, and D87271.
The taskwait-depend test is changed to really use taskwait-depend and copied to an task_if0-depend test.

To pass the tests, the handling of the return address was fixed.

Differential Revision: https://reviews.llvm.org/D87680
2020-10-01 00:53:41 +02:00
Joachim Protze 55cff5b288 [OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant
OpenMP 5.1 defines omp_get_initial_device to return the same value as omp_get_num_devices.
Since this change is also 5.0 compliant, no versioning is needed.

Differential Revision: https://reviews.llvm.org/D88149
2020-10-01 00:51:11 +02:00
Peyton, Jonathan L ee1c04a926 [OpenMP] Fix if0 task with dependencies in the runtime
The current GOMP interface for serialized tasks does not take into
account task dependencies. Add the check and wait for dependencies.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=46573

Differential Revision: https://reviews.llvm.org/D87271
2020-09-24 09:47:53 -05:00
Peyton, Jonathan L 9089b4a5c5 [OpenMP] Introduce GOMP taskwait depend in the runtime
This change introduces the GOMP_taskwait_depend() function. It implements
the OpenMP 5.0 feature of #pragma omp taskwait with depend() clause by
wrapping around __kmpc_omp_wait_deps().

Differential Revision: https://reviews.llvm.org/D87269
2020-09-24 09:45:14 -05:00
Peyton, Jonathan L 72ada5ae6c [OpenMP] Introduce GOMP mutexinoutset in the runtime
Encapsulate GOMP task dependencies in separate class and introduce the
new mutexinoutset dependency type. This separate class allows
future GOMP task APIs easier access to the task dependency functionality
and better ability to propagate new dependency types to all existing GOMP
task APIs which use task dependencies.

Differential Revision: https://reviews.llvm.org/D87267
2020-09-24 09:45:13 -05:00
Peyton, Jonathan L ea34d95e0a [OpenMP] Introduce GOMP teams support in runtime
Implement GOMP_teams_reg() function which enables GOMP support of the
standalone teams construct. The GOMP_parallel* functions were modified
to call __kmp_fork_call() unconditionally so that the teams-specific
code could be reused within __kmp_fork_call() instead of reproduced
inside the GOMP_* functions.

Differential Revision: https://reviews.llvm.org/D87167
2020-09-24 09:45:13 -05:00
Raul Tambre 21c0e74c9e [CMake][OpenMP] Remove old dead CMake code
LLVM requires CMake 3.13.4 so remove code behind checks for an older version.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D87191
2020-09-07 10:56:56 +03:00
AndreyChurbanov 1596ea80fd [OpenMP] Fix import library installation with MinGW
Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D86552
2020-08-26 21:56:01 +03:00
AndreyChurbanov 09af378f49 [OpenMP] Fix build on macOS sdk 10.12 and newer
Patch by nihui (Ni Hui)

Differential Revision: https://reviews.llvm.org/D76755
2020-08-26 16:52:46 +03:00
Dimitry Andric 47b0262d3f Add <stdarg.h> include to kmp_os.h, to get the va_list type, required
after cde8f4c164. Sort system includes, while here.
2020-08-24 22:45:02 +02:00
Dimitry Andric cde8f4c164 Move special va_list handling to kmp_os.h
Instead of copying and pasting the same `#ifdef` expressions in multiple
places, define a type and a pair of macros in `kmp_os.h`, to handle
whether `va_list` is pointer-like or not:

* `kmp_va_list` is the type to use for `__kmp_fork_call()`
* `kmp_va_deref()` dereferences a `va_list`, if necessary
* `kmp_va_addr_of()` takes the address of a `va_list`, if necessary

Also add FreeBSD to the list of OSes that has a non pointer-like
va_list. This can now be easily extended to other OSes too.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D86397
2020-08-24 22:31:56 +02:00
AndreyChurbanov d0f4f5a182 [OpenMP] Check if _MSC_VER is defined before using it
Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D86448
2020-08-24 17:50:38 +03:00
Joachim Protze 66a3575c28 [OpenMP] Fix releasing of stack memory
Starting with 787eb0c637 I got spurious segmentation faults for some testcases. I could nail it down to `brel` trying to release the "memory" of the node allocated on the stack of __kmpc_omp_wait_deps. With this patch, you will see the assertion triggering for some of the tests in the test suite.

My proposed solution for the issue is to just patch __kmpc_omp_wait_deps:
```
  __kmp_init_node(&node);
-  node.dn.on_stack = 1;
+  // the stack owns the node
+  __kmp_node_ref(&node);
```

What do you think?

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D84472
2020-08-14 10:32:53 +02:00
Adrian Pop bf2aa74e51 [OpenMP] support build on msys2/mingw with clang or gcc
RTM Adaptive Locks are supported on msys2/mingw for clang and gcc.

Differential Revision: https://reviews.llvm.org/D81776
2020-08-04 23:15:36 +03:00
AndreyChurbanov 4a04bc8995 [OpenMP] Don't use MSVC workaround with MinGW
Patch by mati865@gmail.com

Differential Revision: https://reviews.llvm.org/D85210
2020-08-04 18:48:25 +03:00
David Blaikie 0c938a8dd8 OpenMP: Fix typo variabls -> variables 2020-08-03 17:00:15 -07:00
Joachim Protze 03116a9f8c [OpenMP] Use weak attribute in interface only for static library
This is to address the issue reported at:
https://bugs.llvm.org/show_bug.cgi?id=46863

Since weak is meaningless for a shared library interface function, this patch
disables the attribute, when the OpenMP library is built as shared library.

ompt_start_tool is not an interface function, but a internally called function
possibly implemented by an OMPT tool.
This function needs to be weak if possible to allow overwriting ompt_start_tool
with a function implementation built into the application.

Differential Revision: https://reviews.llvm.org/D84871
2020-07-31 12:29:05 +02:00
Jinsong Ji d28f86723f Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit bf544fa1c3.

Fixed the typo in PPCInstrInfo.cpp.
2020-07-28 14:00:11 +00:00
Jinsong Ji bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce7153.

This is breaking test-suite, revert while investigation.
2020-07-27 21:07:00 +00:00
Jinsong Ji adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
David Truby bb099c87ab [openmp] Don't copy exports into the source folder by default.
Additionally fix the copy if enabled on multi-config targets.

Summary:
This changes the copy command for libomp.so to use the output of the target as
the source of the copy, rather than trying to find it based on
${LIBOMP_LIBRARY_DIR}, which appears to be incorrect in multi-config generator
builds.

Reviewers: jdoerfert

Subscribers: mgorny, yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D84148
2020-07-24 14:34:50 +01:00
Louis Dionne afa1afd410 [CMake] Bump CMake minimum version to 3.13.4
This upgrade should be friction-less because we've already been ensuring
that CMake >= 3.13.4 is used.

This is part of the effort discussed on llvm-dev here:

  http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html

Differential Revision: https://reviews.llvm.org/D78648
2020-07-22 14:25:07 -04:00
Saiyedul Islam 741e55aeed [OpenMP] Temporarily disable failing runtime tests for clang-12
Following tests were disabled for clang-11 after upgrading to
version 5.0 in D82963:

1. openmp/runtime/test/env/kmp_set_dispatch_buf.c
2. openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c

They are also failing for clang-12. Thus this temporary disabling
until they are fixed.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D84241
2020-07-21 15:32:46 +00:00
AndreyChurbanov 617787ea77 [OpenMP] add missed REQUIRES:ompt for 2 OMPT tests 2020-07-21 16:31:17 +03:00
AndreyChurbanov 5a8779169e [OpenMP] libomp build fix without OMPT_SUPPORT 2020-07-21 16:03:17 +03:00
AndreyChurbanov 917f842159 [OpenMP] libomp cleanup: add checks of bad memory access
Add check of frm to prevent array out-of-bound access;
add check of new_nproc to prevent access of unallocated hot_teams array;
add check of location info pointer to prevent NULL dereference;
add check of d_tn pointer to prevent NULL dereference in release build.
These checks make static analyzers happier.

This is second part of the patch from https://reviews.llvm.org/D84062.
2020-07-21 00:12:46 +03:00
AndreyChurbanov 787eb0c637 [OpenMP] libomp cleanup: add check of input global tid parameter
Add check of negative gtid before indexing __kmp_threads.
This makes static analyzers happier.
This is the first part of the patch split in two parts.

Differential Revision: https://reviews.llvm.org/D84062
2020-07-20 23:49:58 +03:00
Joachim Protze f226171429 [OpenMP][Tests][NFC] Mark compatibility with older versions of clang 2020-07-20 13:53:29 +02:00
AndreyChurbanov 86fb2db49b [OpenMP] libomp cleanup: check presence of hwloc objects CORE, PACKAGE
hwloc documentation guarantees the only object that is always present
in the topology is PU. We can check the presence of other objects
in the topology, just in case.

Differential Revision: https://reviews.llvm.org/D84065
2020-07-18 01:15:37 +03:00
AndreyChurbanov 62d88a1c79 [OpenMP] libomp: add itt notifications for teams construct on host
Add barrier/region notification for parallel inside teams construct
when number of teams is 1, as VTune only shows outer level regions for
simplicity.

Differential Revision: https://reviews.llvm.org/D84024
2020-07-17 21:10:25 +03:00