Commit Graph

1204 Commits

Author SHA1 Message Date
Vladimir Inđić f2410bfb1c [OpenMP][OMPT][clang] task frame support fixed in __kmpc_fork_call
__kmp_fork_call sets the enter_frame of the active task (th_curren_task)
before new parallel region begins. After the region is finished, the
enter_frame is cleared.

The old implementation of __kmpc_fork_call didn’t clear the enter_frame of
active task.

Also, the way of initializing the enter_frame of the active task was wrong.
Consider the following two OpenMP programs.

The first program: Let R1 be the serialized parallel region that encloses
another serialized parallel region R2. Assume that thread that executes R2 is
going to create a new serialized parallel region R3 by executing
__kmpc_fork_call. This thread is responsible to set enter_frame of R2's
implicit task. Note that the information about R2's implicit task is present
inside master_th->th.th_current_task at this moment, while lwt represents the
information about R1's implicit task. The old implementation uses lwt and
resets enter_frame of R1's implicit task instead of R2's implicit task. The
new implementation uses master_th->th.th_current_task instead.

The second program: Consider the OpenMP program that contains parallel region
R1 which encloses an explicit task T. Assume that thread should create another
parallel region R2 during the execution of the T. The __kmpc_fork_call is
responsible to create R2 and set enter frame of T whose information is present
inside the master_th->th.th_current_task.
Old implementation tries to set the frame of
parent_team->t.t_implicit_task_taskdata[tid] which corresponds to the implicit
task of the R1, instead of T.

Differential Revision: https://reviews.llvm.org/D112419
2021-10-25 18:21:19 +02:00
Joachim Protze 7368227965 [OpenMP][Tests] Test omp_get_wtime for invariants
As discussed in D108488, testing for invariants of omp_get_wtime would be more
reliable than testing for duration of sleep, as return from sleep might be
delayed due to system load.

Alternatively/in addition, we could compare the time measured by omp_get_wtime
 to time measured with C++11 chrono (for portability?).

Differential Revision: https://reviews.llvm.org/D112458
2021-10-25 18:20:59 +02:00
Joachim Protze 3f229f42b7 [OpenMP][Tests][NFC] Actually check for test outcome
The CHECK: line in the test had no effect, because the test does not
pipe to FileCheck. Since the test only checks for a single value,
encode the result in the return value of the test.
2021-10-25 18:20:12 +02:00
Joachim Protze 047890bc3f [OpenMP][Tests][NFC] Mark tests trying to link COI as unsupported
For some tests with target-related functionality icc 18/19 tries to link
libioffload_target.so.5, which fails for missing COI symbols.
2021-10-25 18:20:12 +02:00
Joachim Protze d7fdd236d5 [OpenMP][Tests][NFC] Replace atomic increment by reduction
Also mark the test as unsupported by intel-21, because the test does
not terminate
2021-10-25 18:20:12 +02:00
Joachim Protze 38f78dd2e2 [OpenMP][Tools][NFC] Fix C99-style declaration of iteration variables
Where possible change to declare the variable before the loop.
Where not possible, specifically request -std=c99 (could be limited to
specific compilers like icc).
2021-10-25 18:20:12 +02:00
Vladimir Inđić ba02586fbe [OpenMP][OMPT][GOMP] task frame support in KMP_API_NAME_GOMP_PARALLEL_SECTIONS
KMP_API_NAME_GOMP_PARALLEL_SECTIONS function was missing the task frame support.
This patch introduced a fix responsible to set properly the exit_frame of
the innermost implicit task that corresponds to the parallel section construct,
as well as the enter_frame of the task that encloses the mentioned implicit task.

This patch also introduced a simple test case sections_serialized.c that contains
serialized parallel section construct and validates whether the mentioned
task frames are set correctly.

Differential Revision: https://reviews.llvm.org/D112205
2021-10-22 11:01:10 -05:00
AndreyChurbanov 52f4922ebb [OpenMP][NFC] skip atomic tests for non-x86 arch 2021-10-21 21:51:33 +03:00
Nawrin Sultana 99d1ce4a62 [OpenMP] Add GOMP allocator functions
This patch adds GOMP_alloc and GOMP_free functions of LIBGOMP.

Differential revision: https://reviews.llvm.org/D111673
2021-10-20 11:37:29 -05:00
AndreyChurbanov 63f8099e23 [OpenMP] libomp: add check of task function pointer for NULL.
This patch allows to simplify compiler implementation on "taskwait nowait"
construct. The "taskwait nowait" is semantically equivalent to the empty task.
Instead of creating an empty routine as a task entry, compiler can just send
NULL pointer to the runtime. Then the runtime will make all the work with
dependences and return because of the absent task routine.

Differential Revision: https://reviews.llvm.org/D112015
2021-10-18 19:48:30 +03:00
@vladaindjic 59a994e8da [OpenMP][OMPT] thread_num determination for programs with explicit tasks
__ompt_get_task_info_internal is now able to determine the right value of the
“thread_num” argument during the execution of an explicit task.

During the execution of a while loop that iterates over the ancestor tasks
hierarchy, the “prev_team” variable was always set to “team” variable at the
beginning of each loop iteration.

Assume that the program contains a parallel region which encloses an explicit
task executed by the worker thread of the region. Also assume that the tool
inquires the “thread_num” of a worker thread for the implicit task that
corresponds to the region (task at “ancestor_level == 1”) and expects to
receive the value of “thread_num > 0”.
After the loop finishes, both “team” and “prev_team” variables are equal and
point to the team information of the parallel region.
The “thread_num” is set to “prev_team->t.t_master_tid”, that is equal to
“team->t.t_master_tid”. In this case, “team->t.t_master_tid” is 0, since
the master thread of the region is the initial master thread of the program.
This leads to a contradiction.

To prevent this, “prev_team” variable is set to “team” variable only at the
time when the loop that has already encountered the implicit task (“taskdata”
variable contains the information about an implicit task) continues iterating
over the implicit task’s ancestors, if any.

After the mentioned loop finishes, the “prev_team” variable might be equal to
NULL. This means that the task at requested “ancestor_level” belongs to the
innermost parallel region, so the “thread_num” will be determined by calling
the “__kmp_get_tid”.

To prove that this patch works, the test case “explicit_task_thread_num.c” is
provided.
It contains the example of the program explained earlier in the summary.

Differential Revision: https://reviews.llvm.org/D110473
2021-10-18 13:54:22 +02:00
Joachim Protze c93fb143b9 [OpenMP][Tests][NFC] Work around ICC bug
Older intel compilers miss the privatization of nested loop variables for
doacross loops. Declaring the variable in the loop makes the test more
robust.
2021-10-18 13:54:15 +02:00
Joachim Protze 5918688248 [OpenMP][Tests][NFC] Flagging OMPT tests as XFAIL for Intel compilers
With Intel 19 compiler the teams tests fail to link while trying to link
liboffload.
2021-10-18 13:50:03 +02:00
Peyton, Jonathan L acb3b187c4 [OpenMP][host runtime] Add initial hybrid CPU support
Detect, through CPUID.1A, and show user different core types through
KMP_AFFINITY=verbose mechanism. Offer future runtime optimizations
 __kmp_is_hybrid_cpu() to know whether running on a hybrid system or not.

Differential Revision: https://reviews.llvm.org/D110435
2021-10-14 16:49:42 -05:00
Peyton, Jonathan L b840d3ab0d [OpenMP][host runtime] small fixup of RTM CPUID bit check 2021-10-14 16:49:42 -05:00
Peyton, Jonathan L 50b68a3d03 [OpenMP][host runtime] Add support for teams affinity
This patch implements teams affinity on the host.
The default is spread. A user can specify either spread, close, or
primary using KMP_TEAMS_PROC_BIND environment variable. Unlike
OMP_PROC_BIND, KMP_TEAMS_PROC_BIND is only a single value and is not a
list of values. The values follow the same semantics under the OpenMP
specification for parallel regions except T is the number of teams in
a league instead of the number of threads in a parallel region.

Differential Revision: https://reviews.llvm.org/D109921
2021-10-14 16:30:28 -05:00
AndreyChurbanov 621d7a75b1 [OpenMP] libomp: add atomic functions for new OpenMP 5.1 atomics.
Added functions those implement "atomic compare".
Though clang does not use library interfaces to implement OpenMP atomics,
the functions added for consistency.
Also added missed functions for 80-bit floating min/max atomics.

Differential Revision: https://reviews.llvm.org/D110109
2021-10-13 21:02:18 +03:00
AndreyChurbanov 6e98ec9b20 [OpenMP] libomp: fix ittnotify usage.
Replaced storing of ittnotify domain array index into
location info structure (which is now read-only) with storing of
(location info address + ittnotify domain + team size) into hash map.
Replaced __kmp_itt_barrier_domains and __kmp_itt_imbalance_domains arrays with
__kmp_itt_barrier_domains hash map; __kmp_itt_region_domains and
__kmp_itt_region_team_size arrays with __kmp_itt_region_domains hash map.
Basic functionality did not change (at least tried to not change).

The patch fixes https://bugs.llvm.org/show_bug.cgi?id=48644.

Differential Revision: https://reviews.llvm.org/D111580
2021-10-13 20:49:05 +03:00
AndreyChurbanov 5e58b63b28 [OpenMP] libomp: fix warning on comparison of integer expressions of different signedness
Replaced macro with global variable of correspondent type.

Differential Revision: https://reviews.llvm.org/D111562
2021-10-13 20:11:47 +03:00
AndreyChurbanov f5c0c9179f [OpenMP] libomp: add OpenMP 5.1 memory allocation routines.
Aligned allocation routines added.
Fortran interfaces added for all allocation routines.

Differential Revision: https://reviews.llvm.org/D110923
2021-10-11 19:25:00 +03:00
Martin Storsjö dec2257f35 [openmp] Fix a typo in a test REQUIRES line
Differential Revision: https://reviews.llvm.org/D110963
2021-10-03 23:51:11 +03:00
Peyton, Jonathan L 343b9e8590 [OpenMP][host runtime] Introduce kmp_cpuinfo_flags_t to replace integer flags
Store CPUID support flags as bits instead of using entire integers.

Differential Revision: https://reviews.llvm.org/D110091
2021-10-01 11:08:39 -05:00
Peyton, Jonathan L 957b4c5750 [OpenMP][testing] increase threshold for omp_get_wtime test 2021-10-01 11:07:41 -05:00
@vladaindjic 5357a98c82 [OpenMP] libomp: Usage of TASK_TIED constant inside kmp_gsupport.cpp
The minor code refactorization introduces the TASK_TIED constant inside
kmp_gsupprot.cpp as a replacement for the literal value 1.
The mentioned constant is now used in both kmp_tasking.cpp and
kmp_gsupport.cpp files.

Differential Revision: https://reviews.llvm.org/D110441
2021-09-27 19:45:56 +03:00
Usman Nadeem 248342b7c7 [OpenMP][OMPD] Fix compile error when OMPD is not supported
Differential Revision: https://reviews.llvm.org/D110120

Change-Id: I9d39dacfab5b7fbab37ee4b4d960d51e0892b24d
2021-09-21 12:45:15 -07:00
Peyton, Jonathan L 1e45cd75df [OpenMP][host runtime] Fix indirect lock table race condition
The indirect lock table can exhibit a race condition during initializing
and setting/unsetting locks. This occurs if the lock table is
resized by one thread (during an omp_init_lock) and accessed (during an
omp_set|unset_lock) by another thread.

The test runtime/test/lock/omp_init_lock.c test exposed this issue and
will fail if run enough times.

This patch restructures the lock table so pointer/iterator validity is
always kept. Instead of reallocating a single table to a larger size, the
lock table begins preallocated to accommodate 8K locks. Each row of the
table is allocated as needed with each row allowing 1K locks. If the 8K
limit is reached for the initial table, then another table, capable of
holding double the number of locks, is allocated and linked
as the next table. The indices stored in the user's locks take this
linked structure into account when finding the lock within the table.

Differential Revision: https://reviews.llvm.org/D109725
2021-09-20 13:01:58 -05:00
AndreyChurbanov 59b877d001 [OpenMP] NFC: add type casts to silence gcc warnings 2021-09-17 19:49:40 +03:00
AndreyChurbanov 7f1a6d891e [OpenMP] libomp: Update third-party sources of ittnotify client code.
The third-party ittnotify sources updated from https://github.com/intel/ittapi.
Changes applied:
- llvm license aded to all files; initial BSD license saved in LICENSE.txt;
- clang-formatted;
- renamed *.c to *.cpp, similar to what we did with all our sources;
- added #include "kmp_config.h" with definition of INTEL_ITTNOTIFY_PREFIX macro
  into ittnotify_static.cpp.

Differential Revision: https://reviews.llvm.org/D109333
2021-09-17 19:38:34 +03:00
Peyton, Jonathan L 258e27aae1 [OpenMP] Add support for GOMP depobj
GOMP depobjs are represented as a two intptr_t array. The first
element is the base address of the dependency and the second element
is the flag indicating the type the depobj represents.

Differential Revision: https://reviews.llvm.org/D108790
2021-09-15 12:47:08 -05:00
Hansang Bae 3976035d68 [OpenMP] Fix line truncation in omp_lib.h
Fixed code that exceeds 72-column.

Differential Revision: https://reviews.llvm.org/D109469
2021-09-09 09:33:45 -05:00
AndreyChurbanov d40108e0af [OpenMP] libomp: runtime part of omp_all_memory task dependence implementation.
New omp_all_memory task dependence type is implemented.
Library recognizes the new type via either
(dependence_address == NULL && dependence_flag == 0x80)
or
(dependence_address == SIZE_MAX).
A task with new dependence type depends on each preceding task
with any dependence type (kind of a dependence barrier).

Differential Revision: https://reviews.llvm.org/D108574
2021-09-08 16:55:32 +03:00
Hansang Bae 224f51d879 [OpenMP] Add interface for 5.1 scope construct
The new interface only marks begin/end of a scope construct for
corresponding OMPT events, and we can use existing interfaces for
reduction operations.

Differential Revision: https://reviews.llvm.org/D108062
2021-09-07 11:22:21 -05:00
Nawrin Sultana c24da72fa4 [OpenMP] Change monotonicity of dynamic schedule
This patch changes the default monotonicity of dynamic schedule from
monotonic to non-monotonic when no modifier is specified.

Differential Revision: https://reviews.llvm.org/D109026
2021-09-07 08:18:46 -05:00
Shilei Tian 8442967fe3 [OpenMP] Fix task wait doesn't work as expected in serialized team
As discussed in D107121, task wait doesn't work when a regular task T depends on
a detached task or a hidden helper task T' in a serialized team. The root cause is,
since the team is serialized, the last task will not be tracked by
`td_incomplete_child_tasks`. When T' is finished, it first releases its
dependences, and then decrements its parent counter. So far so good. For the thread
that is running task wait, if at the moment it is still spinning and trying to
execute tasks, it is fine because it can detect the new task and execute it.
However, if it happends to finish the function `flag.execute_tasks(...)`, it will
be broken because `td_incomplete_child_tasks` is 0 now.

In this patch, we update the rule to track children tasks a little bit. If the
task team encounters a proxy task or a hidden helper task, all following tasks
will be tracked.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D107496
2021-08-31 12:15:46 -04:00
Peyton, Jonathan L d39d3a327b [OpenMP][test] fix omp_get_wtime.c test to be more accommodating
The omp_get_wtime.c test fails intermittently if the recorded times are
off by too much which can happen when many tests are run in parallel.

Instead of failing if one timing is a little off, take average of 100
timings minus the 10 worst.

Differential Revision: https://reviews.llvm.org/D108488
2021-08-23 08:13:42 -05:00
Vignesh Balasubramanian 589519b9ab [OpenMP][OMPD]Code movement required for OMPD
These changes don't come under OMPD guard as it is a movement of existing code to capture parallel behavior correctly.
"Runtime Entry Points for OMPD" like "ompd_bp_parallel_begin" and "ompd_bp_parallel_begin" should be placed at the correct execution point for the debugging tool to access proper handles/data.
Without the below changes, in certain cases, debugging tool will pick the wrong parallel and task handle.

Reviewed By: @hbae
Differential Revision: https://reviews.llvm.org/D100366
2021-08-20 14:36:22 +05:30
Shilei Tian 1d8d43ae61 [OpenMP] Use `__kmpc_give_task` in `__kmp_push_task` when encountering a hidden helper task
This patch replaces the current implementation, overwrites `gtid` and `thread`,
with `__kmpc_give_task`.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D106977
2021-08-19 20:49:29 -04:00
Martin Storsjö f5616a981c [OpenMP] Fix the usage of sscanf on MinGW
KMP_SSCANF only evaluates to sscanf_s within
    #if KMP_OS_WINDOWS && KMP_MSVC_COMPAT
so we need to pass the sscanf_s specific parameters within a similar
condition.

Differential Revision: https://reviews.llvm.org/D108196
2021-08-17 21:36:09 +03:00
Peyton, Jonathan L b4a1f441d9 [OpenMP] Add a few small fixes
* Add comment to help ensure new construct data are added in two places
* Check for division by zero in the loop worksharing code
* Check for syntax errors in parrange parsing

Differential Revision: https://reviews.llvm.org/D105929
2021-08-16 10:02:49 -05:00
Peyton, Jonathan L 6eeb4c1f32 [OpenMP] Fix incorrect parameters to sscanf_s call
On Windows, the documentation states that when using sscanf_s,
each %c and %s specifier must also have additional size parameter.
This patch adds the size parameter in the one place where %c is
used.

Differential Revision: https://reviews.llvm.org/D105931
2021-08-16 09:59:21 -05:00
AndreyChurbanov 52cac541d4 [OpenMP] libomp: cleanup: minor fixes to silence static analyzer.
Added couple more checks to silence KlocWork static code analyzer.

Differential Revision: https://reviews.llvm.org/D107348
2021-08-16 13:39:23 +03:00
AndreyChurbanov f94da67f49 [OpenMP][NFC] libomp: reduced timeouts in the test from 50 to 2 sec. 2021-08-11 17:58:52 +03:00
Pirama Arumuga Nainar 49fabd9d76 [openmp] Do not use shared memory on Android
Android provides ashmem/ASharedMemory support on newer releases, which
we can use if requested by openmp users on Android.

Also refactor the preprocessor check for using shared memory to
kmp_config.h.cmake.

Differential Revision: https://reviews.llvm.org/D107181
2021-08-09 09:41:32 -07:00
Ye Luo 262289c103 [OpenMP] mark target task untied
OpenMP specification Tasking Terminology
target task :A mergeable and untied task that ...

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D107686
2021-08-07 12:31:20 -04:00
Shilei Tian 680c71b127 [OpenMP] Clean up for hidden helper task
This patch makes some clean up for code of hidden helper task.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D107008
2021-08-04 12:36:44 -04:00
Shilei Tian 9f5d6ea52e [OpenMP] Fix performance regression reported in bug #51235
This patch fixes the "performance regression" reported in https://bugs.llvm.org/show_bug.cgi?id=51235. In fact it has nothing to do with performance. The root cause is, the stolen task is not allowed to execute by another thread because by default it is tied task. Since hidden helper task will always be executed by hidden helper threads, it should be untied.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D107121
2021-08-04 12:34:49 -04:00
Lechen Yu 3bc8ce5dd7 [openmp] Add OMPT initialization in libomptarget
When loading libomptarget, the init function in libomptarget/src/rtl.cpp
will search for the libomptarget_start_tool function using libdl.
libomptarget_start_tool will pass those OMPT callbacks related to target
constructs to libomptarget

Differential Revision: https://reviews.llvm.org/D99803
2021-08-04 18:00:11 +02:00
AndreyChurbanov 8e29b4b323 [OpenMP] libomp: taskwait depend implementation fixed.
Fix for https://bugs.llvm.org/show_bug.cgi?id=49723.
Eliminated references from task dependency hash to node allocated on stack,
thus eliminated accesses to stale memory. So the node now never freed.
Uncommented assertion which triggered when stale memory accessed.
Removed unneeded ref count increment for stack allocated node.

Differential Revision: https://reviews.llvm.org/D106705
2021-08-03 15:45:20 +03:00
AndreyChurbanov 8b81524c6d [OpenMP][NFC] libomp: silence warnings on unused variables.
Put declarations/definitions of unused variables under corresponding macros
to silence clang build warnings.

Differential Revision: https://reviews.llvm.org/D106608
2021-07-30 17:04:42 +03:00
Terry Wilmarth d8e4cb9121 [OpenMP] libomp: Add new experimental barrier: two-level distributed barrier
Two-level distributed barrier is a new experimental barrier designed
for Intel hardware that has better performance in some cases than the
default hyper barrier.

This barrier is designed to handle fine granularity parallelism where
barriers are used frequently with little compute and memory access
between barriers. There is no need to use it for codes with few
barriers and large granularity compute, or memory intensive
applications, as little difference will be seen between this barrier
and the default hyper barrier. This barrier is designed to work
optimally with a fixed number of threads, and has a significant setup
time, so should NOT be used in situations where the number of threads
in a team is varied frequently.

The two-level distributed barrier is off by default -- hyper barrier
is used by default. To use this barrier, you must set all barrier
patterns to use this type, because it will not work with other barrier
patterns. Thus, to turn it on, the following settings are required:

KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist

Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER,
and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed
barrier.

Patch fixed for ITTNotify disabled builds and non-x86 builds

Co-authored-by: Jonathan Peyton <jonathan.l.peyton@intel.com>
Co-authored-by: Vladislav Vinogradov <vlad.vinogradov@intel.com>

Differential Revision: https://reviews.llvm.org/D103121
2021-07-29 14:09:26 -05:00
Joachim Protze e32e1dae61 [OpenMP][Tests] Fix test compatibility
gcc and clang disagree in how the event handle needs to be handled.
According to OpenMP LC, gcc is right. Will open clang bug report
2021-07-28 00:08:32 +02:00
Joachim Protze 3c76e99291 [OpenMP] Fix deadlock for detachable task with child tasks
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=49066.

For detachable tasks, the assumption breaks that the proxy task cannot have
remaining child tasks when the proxy completes.
In stead of increment/decrement the incomplete task count, a high-order bit
is flipped to mark and wait for the incomplete proxy task.

Differential Revision: https://reviews.llvm.org/D101082
2021-07-28 00:01:35 +02:00
Vignesh Balasubramanian 23eced9ead Convert the error to warning for enabling OMPD in non-Linux platform
OMPD is enabled by default on Linux machines and disabled on others.
However, if explicitly enabled it throws an error and exit while configuring.

It is mentioned in Bug: https://bugs.llvm.org/show_bug.cgi?id=51121

This patch, instead of throwing error, disables OMPD support with a warning message,
so configuration can continue.

Reviewed By: @protze.joachim
Differential Revision: https://reviews.llvm.org/D106682
2021-07-27 17:25:27 +05:30
Joachim Protze c46ccb8538 [OpenMP][tests][NFC] Update test status for gcc 11 and 12
gcc 11 introduced support for depend clause, but the gomp interface of libomp
does not yet handle the information.
Also remove -fopenmp-version=50, which is no longer needed for clang, but not
supported by gcc.
2021-07-25 18:56:36 +02:00
Shilei Tian c2c43132f6 [OpenMP] Fix bug 50022
Bug 50022 [0] reports target nowait fails in certain case, which is added in this
patch. The root cause of the failure is, when the second task is created, its
parent's `td_incomplete_child_tasks` will not be incremented because there is no
parallel region here thus its team is serialized. Therefore, when the initial
thread is waiting for its unfinished children tasks, it thought there is only
one, the first task, because it is hidden helper task, so it is tracked. The
second task will only be pushed to the queue when the first task is finished.
However, when the first task finishes, it first decrements the counter of its
parent, and then release dependences. Once the counter is decremented, the thread
will move on because its counter is reset, but actually, the second task has not
been executed at all. As a result, since in this case, the main function finishes,
then `libomp` starts to destroy. When the second task is pushed somewhere, all
some of the structures might already have already been destroyed, then anything
could happen.

This patch simply moves `__kmp_release_deps` ahead of decrement of the counter.
In this way, we can make sure that the initial thread is aware of the existence
of another task(s) so it will not move on. In addition, in order to tackle
dependence chain starting with hidden helper thread, when hidden helper task is
encountered, we force the task to release dependences.

Reference:
[0] https://bugs.llvm.org/show_bug.cgi?id=50022

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D106519
2021-07-23 16:54:11 -04:00
Shilei Tian ea452353c0 [OpenMP] Refined the logic to give a regular task from a hidden helper task
In current implementation, if a regular task depends on a hidden helper task,
and when the hidden helper task is releasing its dependences, it directly calls
`__kmp_omp_task`. This could cause a problem that if `__kmp_push_task` returns
`TASK_NOT_PUSHED`, the task will be executed immediately. However, the hidden
helper threads are assumed to only execute hidden helper tasks. This could cause
problems because when calling `__kmp_omp_task`, the encountering gtid, which is
not the real one of the thread, is passed.

This patch uses `__kmp_give_task`, but because it is a static function, a new
wrapper `__kmpc_give_task` is added.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D106572
2021-07-22 19:21:29 -04:00
Shilei Tian 996baa58a4 [OpenMP] Fixed a segmentation fault when using taskloop and target nowait
The synchronization of task loop misses hidden helper tasks, causing segmentation
fault reported in https://bugs.llvm.org/show_bug.cgi?id=50002.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D106220
2021-07-19 21:09:05 -04:00
Peyton, Jonathan L 424f14f0d2 [OpenMP] Fix one sign-compare warning from GCC 2021-07-13 12:36:12 -05:00
Peyton, Jonathan L 405eefe464 [OpenMP][NFC] Change comment style to eliminate warnings from GCC
Standalone build for OpenMP runtime using GCC is giving -Wcomment
warnings where a backslash newline is encountered in the // style
comment. This switches the // style for /* style to silence the
warnings.
2021-07-13 12:27:08 -05:00
Hansang Bae db635a28e6 [OpenMP] Minor improvement in task allocation
This patch includes a few changes to improve task allocation
performance slightly. These changes are enough to restore performance
drop observed after introducing hidden helper.

Differential Revision: https://reviews.llvm.org/D105715
2021-07-13 09:07:14 -05:00
Roman Lebedev 4709d9d5be
[libomp] ompd_init(): fix heap-buffer-overflow when constructing libompd.so path
There is no guarantee that the space allocated in `libname`
is enough to accomodate the whole `dl_info.dli_fname`,
because it could e.g. have an suffix  - `.5`,
and that highlights another problem - what it should do about suffxies,
and should it do anything to resolve the symlinks before changing the filename?

```
$ LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"  ./src/utilities/rstest/rstest -c /tmp/f49137920.NEF
dl_info.dli_fname "/usr/local/lib/libomp.so.5"
strlen(dl_info.dli_fname) 26
lib_path_length 14
lib_path_length + 12 26
=================================================================
==30949==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000002a at pc 0x000000548648 bp 0x7ffdfa0aa780 sp 0x7ffdfa0a9f40
WRITE of size 27 at 0x60300000002a thread T0
    #0 0x548647 in strcpy (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647)
    #1 0x7fb9e3e3d234 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:102:5
    #2 0x7fb9e3dcb446 in __kmp_do_serial_initialize() /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6742:3
    #3 0x7fb9e3dcb40b in __kmp_get_global_thread_id_reg /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:251:7
    #4 0x59e035 in main /home/lebedevri/rawspeed/build-Clang-SANITIZE/../src/utilities/rstest/rstest.cpp:491
    #5 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16
    #6 0x4df449 in _start (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x4df449)

0x60300000002a is located 0 bytes to the right of 26-byte region [0x603000000010,0x60300000002a)
allocated by thread T0 here:
    #0 0x55cc5d in malloc (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x55cc5d)
    #1 0x7fb9e3e3d224 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:101:17
    #2 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647) in strcpy
Shadow bytes around the buggy address:
  0x0c067fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c067fff8000: fa fa 00 00 00[02]fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30949==ABORTING
Aborted
```
2021-07-13 15:36:46 +03:00
Joachim Protze 681055ea69 [OpenMP] Remove TSAN annotations from libomp
The annotations in libomp were never built by default. The annotations are
also superseded by the annotations which the OMPT tool libarcher.so provides.
With respect to libarcher, libomp behaves as if libarcher would be the last
element of OMP_TOOL_LIBARARIES. I.e., if no other OMPT tool gets active,
libarcher will check if an OpenMP application is built with TSan.

Since libarcher gets loaded by default, enabling LIBOMP_TSAN_SUPPORT would
result in redundant annotations for TSan, which slightly differ in details
and coverage (e.g. task dependencies are not handled well by the annotations
in libomp).

This patch removes all TSan annotations from the OpenMP runtime code.

Differential Revision: https://reviews.llvm.org/D103767
2021-07-12 18:49:11 +02:00
Michał Górny 2b0d95fb58 [openmp] [test] Add missing <limits> include to capacity_nthreads
Differential Revision: https://reviews.llvm.org/D105474
2021-07-06 20:39:53 +02:00
Hansang Bae f1b9ce2736 [OpenMP] Fix a few issues with hidden helper task
This patch includes the following changes to address a few issues when
using hidden helper task.

- Assertion is triggered when there are inadvertent calls to hidden
  helper functions on non-Linux OS
- Added deinit code in __kmp_internal_end_library function to fix random
  shutdown crashes
- Moved task data access into the lock-guarded region in __kmp_push_task

Differential Revision: https://reviews.llvm.org/D105308
2021-07-01 17:10:32 -05:00
Johannes Doerfert 4eb90e893f Revert "[OpenMP] Add Two-level Distributed Barrier"
This reverts commit 25073a4ecf.

This breaks non-x86 OpenMP builds for a while now. Until a solution is
ready to be upstreamed we revert the feature and unblock those builds.
See:
  https://reviews.llvm.org/rG25073a4ecfc9b2e3cb76776185e63bfdb094cd98#1005821
and
  https://reviews.llvm.org/rG25073a4ecfc9b2e3cb76776185e63bfdb094cd98#1005821

The currently proposed fix (D104788) seems not to be ready yet:
  https://reviews.llvm.org/D104788#2841928
2021-06-29 09:38:27 -05:00
Johannes Doerfert bc8bb3df35 Revert "[omp] Fix build without ITT after D103121 changes"
This reverts commit eab1fd389b.

This commit fixed a problem with 25073a4ecf (D103121) which is the one
we actually need to revert to unblock non-X86 builds of OpenMP. Can be
reapplied, or merged into, D103121 as it goes in again.
2021-06-29 09:38:27 -05:00
AndreyChurbanov b2787945f9 [OpenMP][NFC] libomp: fix wrong debug assertion.
Normalized bounds of chunk of iterations to steal from are inclusive,
so upper bound should not be decremented in expression to check.
Problem was in attempt to steal iterations 0:0, that caused assertion after
wrong decrement. Reported in comment to https://reviews.llvm.org/D103648.

Differential Revision: https://reviews.llvm.org/D104880
2021-06-25 02:02:14 +03:00
AndreyChurbanov 5dd4d0d46f [OpenMP] libomp: fix dynamic loop dispatcher
Restructured dynamic loop dispatcher code.
Fixed use of dispatch buffers for nonmonotonic dynamic (static_steal) schedule:
- eliminated possibility of stealing iterations of the wrong loop when victim
  thread changed its buffer to work on another loop;
- fixed race when victim thread changed its buffer to work in nested parallel;
- eliminated "static" property of the schedule, that is now a single thread can
  execute whole loop.

Differential Revision: https://reviews.llvm.org/D103648
2021-06-22 16:29:01 +03:00
Vladislav Vinogradov eab1fd389b [omp] Fix build without ITT after D103121 changes
Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D104638
2021-06-21 18:17:52 +03:00
Terry Wilmarth 25073a4ecf [OpenMP] Add Two-level Distributed Barrier
Two-level distributed barrier is a new experimental barrier designed
for Intel hardware that has better performance in some cases than the
default hyper barrier.

This barrier is designed to handle fine granularity parallelism where
barriers are used frequently with little compute and memory access
between barriers.  There is no need to use it for codes with few
barriers and large granularity compute, or memory intensive
applications, as little difference will be seen between this barrier
and the default hyper barrier. This barrier is designed to work
optimally with a fixed number of threads, and has a significant setup
time, so should NOT be used in situations where the number of threads
in a team is varied frequently.

The two-level distributed barrier is off by default -- hyper barrier
is used by default. To use this barrier, you must set all barrier
patterns to use this type, because it will not work with other barrier
patterns.  Thus, to turn it on, the following settings are required:

KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist

Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER,
and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed
barrier.

Differential Revision: https://reviews.llvm.org/D103121
2021-06-16 15:34:55 -05:00
AndreyChurbanov 610fea65e2 [OpenMP] libomp: fixed implementation of OMP 5.1 inoutset task dependence type
Refactored code of dependence processing and added new inoutset dependence type.
Compiler can set dependence flag to 0x8 when call __kmpc_omp_task_with_deps.
All dependence flags library gets so far and corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.

Differential Revision: https://reviews.llvm.org/D97085
2021-06-16 14:47:29 +03:00
Joachim Protze d2a7871b5e [OpenMP][NFC] Add back suppression of warning
Commit cff215565e did not fix all unused variables in different builds,
so adding back the suppression for now.
2021-06-16 10:14:59 +02:00
Joachim Protze cff215565e [OpenMP] Remove unused variables from libomp code
Several variables were left unused as a result of different patches removing
their use.

Two variables have some use:
`poll_count` is used by the KMP_BLOCKING macro only under certain conditions.
Adding (void) to tell the compiler to ignore the unused variable.

`padding` is a dummy stack allocation with no intent to be used. Also adding
(void) to make the compiler ignore the unused variable.

Differential Revision: https://reviews.llvm.org/D104303
2021-06-16 09:33:46 +02:00
Peyton, Jonathan L 56da28240f [OpenMP] Add GOMP 5.0 version symbols to API
* Add GOMP versioned pause functions
* Add GOMP versioned affinity format functions

To do the affinity format functions, only attach versioned symbols
to the APPEND Fortran entries (e.g., omp_set_affinity_format_) since
GOMP only exports two symbols (one for Fortran, one for C). Our
affinity format functions have three symbols.
e.g., with omp_set_affinity_format:
1) omp_set_affinity_format (Fortran interface)
2) omp_set_affinity_format_ (Fortran interface)
3) ompc_set_affinity_format (C interface)

Have the GOMP version of the C symbol alias the ompc_* 3) version
instead of the Fortran unappended version 1).

Differential Revision: https://reviews.llvm.org/D103647
2021-06-15 16:25:00 -05:00
Peyton, Jonathan L 92baf414db [OpenMP] Fix affinity determine capable algorithm on Linux
Remove strange checks for syscall() arguments where mask is NULL.
Valgrind reports these as error usages for the syscall.
Instead, just check if CACHE_LINE bytes is long enough. If not, then
search for the size. Also, by limiting the first size detection
attempt to CACHE_LINE bytes, instead of 1MB, we don't use more than one
cache line for the mask size. Before this patch, sometimes the returned
mask size was 640 bytes (10 cache lines) because the initial call to
getaffinity() was limited only by the internal kernel mask size
which can be very large.

Differential Revision: https://reviews.llvm.org/D103637
2021-06-15 16:21:30 -05:00
Peyton, Jonathan L 0ddde4d865 [OpenMP] Lazily assign root affinity
Lazily set affinity for root threads. Previously, the root thread
executing middle initialization would attempt to assign affinity
to other existing root threads. This was not working properly as the
set_system_affinity() function wasn't setting the affinity for the
target thread. Instead, the middle init thread was resetting the
its own affinity using the target thread's affinity mask.

Differential Revision: https://reviews.llvm.org/D103625
2021-06-15 16:21:06 -05:00
AndreyChurbanov 9ce2e5e700 Revert "[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type"
This reverts commit a1f550e052.

Revert in order to fix backwards compatibility breakage
caused by type size change for task dependence flag.
2021-06-09 17:38:38 +03:00
Vignesh Balasubramanian f61602b0d3 [OpenMP][OMPD] Implementation of OMPD debugging library - libompd.
This is the first of seven patches that implements OMPD, a debugging interface to support debugging of OpenMP programs.
It contains support code required in "openmp/runtime" for OMPD implementation.

Reviewed By: @hbae
Differential Revision: https://reviews.llvm.org/D100181
2021-06-08 16:44:22 +05:30
Peyton, Jonathan L d70e1f1276 [OpenMP][runtime] add .clang-tidy file
Use same checks as compiler-rt which removes checks for readability-*
and llvm-header style.

Differential Revision: https://reviews.llvm.org/D103711
2021-06-07 13:56:39 -05:00
AndreyChurbanov a1f550e052 [OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type
Refactored code of dependence processing and added new inoutset dependence type.
Compiler can set dependence flag to 0x8 when call __kmpc_omp_task_with_deps.
Size of type of the dependence flag changed from 1 to 4 bytes in clang.
All dependence flags library gets so far and corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.

Differential Revision: https://reviews.llvm.org/D97085
2021-06-07 21:42:51 +03:00
Bryan Chan 54f059c900 [OpenMP] Check loc for NULL before dereferencing it
The ident_t * argument in __kmp_get_monotonicity was being used without
a customary NULL check, causing the function to crash in a Debug build.
Release builds were not affected thanks to dead store elimination.
2021-06-07 10:45:48 -04:00
Terry Wilmarth 8ec9aa236e [OpenMP] Add experimental nesting mode feature
Nesting mode is a new experimental feature in the OpenMP
runtime. It allows a user to set up nesting for an application in a
way that corresponds to the hardware topology levels on the machine an
application is being run on.  For example, if a machine has 2 sockets,
each with 12 cores, then use of nesting mode could set up an outer
level of nesting that uses 2 threads per parallel region, and an inner
level of nesting that uses 12 threads per parallel region.

Nesting mode is controlled with the KMP_NESTING_MODE environment
variable as follows:

1) KMP_NESTING_MODE = 0: Nesting mode is off (default); max-active-levels-var
is set to 1 (the default -- nesting is off, nested parallel regions
are serialized).

2) KMP_NESTING_MODE = 1: Nesting mode is on, and a number of threads
will be assigned for each level discovered in the machine topology;
max-active-levels-var is set to the number of levels discovered.

3) KMP_NESTING_MODE = n, n>1: [Note: this option is experimental and may change
or be removed in the future.] Nesting mode is on, and a number of
threads will be assigned for each topology level discovered on the
machine, up to k<=n levels (since there may be fewer than n levels
discovered in the topology), and beyond the kth level, nested parallel
regions will be serialized; NOTE: max-active-levels-var is 1 (the default --
nesting is off, and nested parallel regions are serialized until the
user changes max-active-levels-var.

If the user sets OMP_NUM_THREADS or OMP_MAX_ACTIVE_LEVELS, they will
override KMP_NESTING_MODE settings for the associated environment
variables. The detected topology may be limited by an affinity mask
setting on the initial thread, or if the user sets KMP_HW_SUBSET. See
also: KMP_HOT_TEAMS_MAX_LEVEL for controlling use of hot teams for
nested parallel regions. Note that this feature only sets numbers of
threads used at nesting levels.  The user should make use of
OMP_PLACES and OMP_PROC_BIND or KMP_AFFINITY for affinitizing those
threads, if desired.

Differential Revision: https://reviews.llvm.org/D102188
2021-06-04 16:01:11 -05:00
Peyton, Jonathan L 56dd158c32 [OpenMP] fix spelling error in message-converter.pl 2021-06-04 11:20:32 -05:00
Peyton, Jonathan L f7655f3df3 [OpenMP] Fix improper printf format specifier 2021-06-02 11:04:48 -05:00
Hansang Bae 7ba4e96ede [OpenMP] Use new task type/flag for taskwait depend events.
Differential Revision: https://reviews.llvm.org/D103464
2021-06-02 10:16:38 -05:00
Peyton, Jonathan L 2020c981fa [OpenMP] Add L2-Tile equivalence for KNL
When on KNL and L2 or Tile layer is detected, manually add
the corresponding layer which is equivalent.

Differential Revision: https://reviews.llvm.org/D102865
2021-06-01 14:17:13 -05:00
Hansang Bae cf5c94ef08 [OpenMP] Define named constants for interop's foreign runtime ID
Also added missing Fortran definitions for interop support.

Differential Revision: https://reviews.llvm.org/D102883
2021-06-01 13:06:59 -05:00
Hansang Bae 95cefacfe1 [OpenMP] Fix crashing critical section with hint clause
Runtime was using the default lock type without using the hint.

Differential Revision: https://reviews.llvm.org/D102955
2021-05-24 17:25:01 -05:00
AndreyChurbanov aa6e7e8da8 [OpenMP] libomp: move warnings to after library initialization
Warnings on deprecated api cannot be suppressed if the library is not initialized.
With this change it is possible to set KMP_WARNINGS=false to suppress the warnings.

Differential Revision: https://reviews.llvm.org/D102676
2021-05-21 23:47:23 +03:00
Shilei Tian af6511d730 [OpenMP] Fixed Bug 49356
Bug 49356 (https://bugs.llvm.org/show_bug.cgi?id=49356) reports crash in
the test case `tasking/bug_taskwait_detach.cpp`, which is caused by the wrong
function declaration. `gtid` in `__kmpc_omp_task` should be `kmp_int32`.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D102584
2021-05-17 12:14:54 -04:00
Christopher Pulido 4fb0aaf033 [OpenMP] Changes to enable MSVC ARM64 build of libomp
This is the first in a series of changes to the OpenMP runtime
that have been done internally by Microsoft. This patch makes
the necessary changes to enable libomp.dll to build with
the MSVC compiler targeting ARM64.

Differential Revision: https://reviews.llvm.org/D101173
2021-05-11 23:03:12 +03:00
Peyton, Jonathan L c765d140fe [OpenMP] Fix hidden helper + affinity
When KMP_AFFINITY is set, each worker thread's gtid value is used as an
index into the place list to determine the thread's placement. With hidden
helpers enabled, this gtid value is shifted down leading to unexpected
shifted thread placement. This patch restores the previous behavior by
adjusting the mask index to take the number of hidden helper threads
into account.

Hidden helper threads are given the full initial mask and do not
participate in any of the other affinity mechanisms (place partitioning,
balanced affinity). Their affinity is only printed for debug builds.

Differential Revision: https://reviews.llvm.org/D101882
2021-05-11 08:54:22 -05:00
Peyton, Jonathan L 9982f33e2c [OpenMP] Refactor/Rework topology discovery code
This patch does the following:

1) Introduce kmp_topology_t as the runtime-friendly structure (the
corresponding global variable is __kmp_topology) to determine the
exact machine topology which can vary widely among current and future
architectures. The current design is not easy to expand beyond the assumed
three layer topology: sockets, cores, and threads so a rework capable of
using the existing KMP_AFFINITY mechanisms is required.

This new topology structure has:
* The depth and types of the topology
* Ratio count for each consecutive level (e.g., number of cores per
   socket, number of threads per core)
* Absolute count for each level (e.g., 2 sockets, 16 cores, 32 threads)
* Equivalent topology layer map (e.g., Numa domain is equivalent to
   socket, L1/L2 cache equivalent to core)
* Whether it is uniform or not

The hardware threads are represented with the kmp_hw_thread_t
structure. This structure contains the ids (e.g., socket 0, core 1,
thread 0) and other information grabbed from the previous Address
structure. The kmp_topology_t structure contains an array of these.

2) Generalize the KMP_HW_SUBSET envirable for the new
kmp_topology_t structure. The algorithm doesn't assume any order with
tiles,numa domains,sockets,cores,threads. Instead it just parses the
envirable, makes sure it is consistent with the detected topology
(including taking into account equivalent layers) and then trims away
the unneeded subset of hardware threads. To enable this, a new
kmp_hw_subset_t structure is introduced which contains a vector of
items (hardware type, number user wants, offset). Any keyword within
__kmp_hw_get_keyword() can be used as a name and can be shortened as
well. e.g.,
KMP_HW_SUBSET=1s,2numa,4tile,2c,3t can be used on the KNL SNC-4 machine.

3) Simplify topology detection functions so they only do the singular
task of detecting the machine's topology. Printing, and all
canonicalizing functionality is now done afterwards. So many lines of
duplicated code are eliminated.

4) Add new ll_caches and numa_domains to OMP_PLACES, and
consequently, KMP_AFFINITY's granularity setting. All the names within
__kmp_hw_get_keyword() are available for use in OMP_PLACES or
KMP_AFFINITY's granularity setting.

5) Simplify and future-proof code where explicit lists of allowed
affinity settings keywords inside if() conditions.

6) Add x86 CPUID leaf 4 cache detection to existing x2apic id method
so equivalent caches could be detected (in particular for the ll_caches
place).

Differential Revision: https://reviews.llvm.org/D100997
2021-05-03 18:00:24 -05:00
Martin Storsjö 01d27fc408 [OpenMP] Fix warnings due to redundant semicolons. NFC. 2021-05-02 21:51:06 +03:00
Kevin Athey bc9120047b Correct tiny misspelling (readlef -> readelf).
Getting my feet wet here as a new committer.

Correct misspelling in check-depends.pl.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101552
2021-04-30 17:20:35 -07:00
Peyton, Jonathan L 4457565757 [OpenMP] Implement GOMP task reductions
Implement the remaining GOMP_* functions to support task reductions
in taskgroup, parallel, loop, and taskloop constructs.  The unused mem
argument to many of the work-sharing constructs has to do with the
scan() directive/ inscan() modifier.  If mem is set, each function
will call KMP_FATAL() and tell the user scan/inscan is unsupported.  The
GOMP reduction implementation is kept separate from our implementation
because of how GOMP presents reduction data and computes the reductions.
GOMP expects the privatized copies to be present even after a #pragma
omp parallel reduction(task:...) region has ended so the data is stored
inside GOMP's uintptr_t* data pseudo-structure.  This style is tightly
coupled with GCC compiler codegen.  There also isn't any init(),
combiner(), fini() functions in GOMP's codegen so the two
implementations were to disparate to try to wrap GOMP's around our own.

Differential Revision: https://reviews.llvm.org/D98806
2021-04-16 16:36:31 -05:00
Peyton, Jonathan L 5ebbb366c4 [OpenMP] Allow affinity to re-detect for child processes
Current atfork() handler for child processes does not reset
the affinity masks array which prevents users from setting their own
affinity in child processes.

Differential Revision: https://reviews.llvm.org/D99218
2021-04-16 16:34:02 -05:00
Hansang Bae 9b98497b44 [OpenMP] Add omp_target_is_accessible() to header files
-- Added omp_target_is_accessible to the header files
-- Added missing const qualifier to device memory routines

Differential Revision: https://reviews.llvm.org/D100420
2021-04-16 07:54:15 -05:00
Hansang Bae 77dc7b4653 [OpenMP] Fix printing routine for OMP_TOOL_VERBOSE_INIT
Also fixed typo in the verbose message.

Differential Revision: https://reviews.llvm.org/D100414
2021-04-14 07:55:26 -05:00
Hansang Bae 3da61ddae7 [OpenMP] Define omp_is_initial_device() variants in omp.h
omp_is_initial_device() is marked as a built-in function in the current
compiler, and user code guarded by this call may be optimized away,
resulting in undesired behavior in some cases. This patch provides a
possible fix for such cases by defining the routine as a variant
function and removing it from builtin list.

Differential Revision: https://reviews.llvm.org/D99447
2021-04-06 16:58:01 -05:00