Commit Graph

1162 Commits

Author SHA1 Message Date
Quinn Pham c3b15b71ce [NFC] Inclusive Language: change master to main for .chm files
[NFC] As part of using inclusive language within the llvm project,
this patch replaces master with main when referring to `.chm` files.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D113299
2021-11-08 08:23:04 -06:00
@t-msn 0808d956c4 [OpenMP] libomp: Fix handling of barrier pattern environment variables
It is better to set all barrier patterns to use "dist" when at least
one environment variable specifies "dist". Otherwise, if only one
environment variable is set to "dist" and the others are inadvertently
left blank, the dist barrier would be mixed with the default hyper
barrier pattern.

Differential Revision: https://reviews.llvm.org/D112597
2021-11-08 15:01:26 +03:00
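
The rule described in the commit above (one "dist" request switches every barrier to "dist") can be sketched as follows. This is a minimal illustration, not the libomp code: the enum, the function name `normalize_barrier_patterns`, and the array layout are all hypothetical.

```c
#include <string.h>

/* Hypothetical barrier kinds; these names are illustrative and are not the
   identifiers used inside libomp. */
enum { BARRIER_PLAIN, BARRIER_FORKJOIN, BARRIER_REDUCTION, BARRIER_KINDS };

/* patterns[i] holds the pattern requested for barrier kind i, or NULL when
   the corresponding environment variable was left unset. */
void normalize_barrier_patterns(const char *patterns[BARRIER_KINDS]) {
  int any_dist = 0;
  for (int i = 0; i < BARRIER_KINDS; ++i)
    if (patterns[i] != NULL && strcmp(patterns[i], "dist") == 0)
      any_dist = 1;

  for (int i = 0; i < BARRIER_KINDS; ++i) {
    if (any_dist)
      patterns[i] = "dist"; /* one "dist" request switches every barrier */
    else if (patterns[i] == NULL)
      patterns[i] = "hyper"; /* default pattern named in the commit message */
  }
}
```

With this rule, a configuration that sets only one variable to "dist" can no longer end up mixing dist and hyper barriers.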
Med Ismail Bennani 797b50d4be Revert "Use `GNUInstallDirs` to support custom installation dirs. -- LLVM"
This reverts commit 6fd2db04d0 since it
broke GreenDragon LLDB-Incremental bot:

https://green.lab.llvm.org/green/job/lldb-cmake/37560/console

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2021-11-02 19:11:44 +01:00
John Ericson 6fd2db04d0 Use `GNUInstallDirs` to support custom installation dirs. -- LLVM
This is a new draft of D28234. I previously did the unorthodox thing of
pushing to it when I wasn't the original author, but since this version

- Uses `GNUInstallDirs`, rather than mimics it, as the original author
  was hesitant to do but others requested.

- Is much broader, affecting many more projects than LLVM itself.

I figured it was time to make a new revision.

I am using this patch (and many back-ports) as the basis of
https://github.com/NixOS/nixpkgs/pull/111487 for my distro (NixOS). It
looked like people were generally on board in D28234, but I make note of
this here in case extra motivation is useful.

---

As pointed out in the original issue, a central tension is that LLVM
already has some partial support for these sorts of things. For example
`LLVM_LIBDIR_SUFFIX`, or `COMPILER_RT_INSTALL_PATH`. Because it's not
quite clear yet what to do about those, we are holding off on changing
libdirs and `compiler-rt` for this initial PR.

---

On the advice of @lebedev.ri, I am splitting this up a bit per
subproject, starting with LLVM, to allow it to be more easily reviewed.
This and the subsequent patch must be landed together, as this will not
build alone. But the rest can be landed on their own.

Reviewed By: compnerd

Differential Revision: https://reviews.llvm.org/D100810
2021-11-02 10:23:30 -04:00
AndreyChurbanov a64797b5b8 [OpenMP][NFC] disable test on power because of -mlong-double-80 option 2021-10-27 16:54:44 +03:00
AndreyChurbanov c704b25b44 [OpenMP] libomp: Fix possible NULL dereference.
According to dlsym description, the value of symbol could be NULL,
and there is no error in this case. Thus dlerror will also return NULL in
this case. We need to check the value returned by dlerror before printing it.

Differential Revision: https://reviews.llvm.org/D112174
2021-10-27 16:54:44 +03:00
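
The dlsym/dlerror pattern that the commit above relies on might look like the sketch below. The function name `lookup_symbol` and its return convention are assumptions made for illustration; only `dlsym` and `dlerror` come from the commit message. On older C libraries the program may need to be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Looks up `name` in `handle`. Returns 0 on success (the symbol's value,
   which may legitimately be NULL, is stored in *value) and -1 if dlsym
   reported an error. */
int lookup_symbol(void *handle, const char *name, void **value) {
  dlerror(); /* clear any stale error state */
  *value = dlsym(handle, name);
  const char *err = dlerror();
  if (err != NULL) { /* print only when there really is a message */
    fprintf(stderr, "dlsym(%s) failed: %s\n", name, err);
    return -1;
  }
  return 0;
}
```

Checking the result of `dlerror` rather than the result of `dlsym` is what avoids passing a NULL string to the print routine.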
AndreyChurbanov e38a1deb66 [OpenMP] libomp: disable definitions of 5.1 atomics for non-x86 arch.
Declarations of 5.1 atomic entries were added under
"#if KMP_ARCH_X86 || KMP_ARCH_X86_64" in kmp_atomic.h,
but definitions of the functions missed architecture guard in kmp_atomic.cpp.
As a result mangled symbols were available on non-x86 architecture.
The patch eliminates these unexpected symbols from the library.

Differential Revision: https://reviews.llvm.org/D112261
2021-10-25 21:17:26 +03:00
Vladimir Inđić f41d08540b [OpenMP][OMPT] thread_num determination during execution of nested serialized parallel regions
__ompt_get_task_info_internal function is adapted to support thread_num
determination during the execution of multiple nested serialized
parallel regions enclosed by a regular parallel region.

Consider the following program that contains parallel region R1 executed
by two threads. Let the worker thread T of region R1 execute a serialized
parallel region R2 that encloses another serialized parallel region R3.
Note that the thread T is the master thread of both R2 and R3 regions.

Assume that __ompt_get_task_info_internal function is called with the
argument "ancestor_level == 1" during the execution of region R3.
The function should determine the "thread_num" of the thread T inside
the team of region R2, whose implicit task is at level 1 inside the
hierarchy of active tasks. Since the thread T is the master thread of
region R2, one should expect that "thread_num" takes the value 0.
After the while loop finishes, the following holds: "lwt != NULL",
"prev_lwt == NULL", "prev_team" represents the team information about
the innermost serialized parallel region R3. This results in executing
the assignment "thread_num = prev_team->t.t_master_tid". Note that
"prev_team->t.t_master_tid" was initialized at the moment of
R2’s creation and represents the "thread_num" of the thread T inside
the region R1 which encloses R2. Since the thread T is the worker thread
of the region R1, the "thread_num" takes value 1, which is a contradiction.

This patch proposes to use "lwt" instead of "prev_lwt" when determining
the "thread_num". If "lwt" exists, the task at the requested level belongs
to the serialized parallel region. Since the serialized parallel region
is executed by one thread only, the "thread_num" takes value 0.

Similarly, assume that __ompt_get_task_info_internal function is called
with the argument "ancestor_level == 2" during the execution of region R3.
The function should determine the "thread_num" of the thread T inside the
team of region R1. Since the thread is the worker inside the region R1,
one should expect that "thread_num" takes value 1. After the loop finishes,
the following holds: "lwt == NULL", "prev_lwt != NULL", "prev_team" represents
the team information about the innermost serialized parallel region R3.
This leads to execution of the assignment "thread_num = 0", which causes
a contradiction.

Ignoring the "prev_lwt" leads to executing the assignment
"thread_num = prev_team->t.t_master_tid" instead. From the previous explanation,
it is obvious that "thread_num" takes value 1.

Note that the "prev_lwt" variable is marked as unnecessary and thus removed.

This patch introduces the test case which represents the OpenMP program
described earlier in the summary.

Differential Revision: https://reviews.llvm.org/D110699
2021-10-25 18:21:20 +02:00
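
The program shape discussed in the commit above (R1 with two threads enclosing the serialized regions R2 and R3) can be sketched as below. Note the assumptions: the sketch uses the user-level routine `omp_get_ancestor_thread_num` rather than the OMPT entry point the patch fixes, so it only mirrors the expected values, and its levels count inward from the outermost region (1 = R1), the reverse of the OMPT "ancestor_level". The function name `count_thread_num_mismatches` is hypothetical. The pragmas take effect only when the compiler has OpenMP enabled (for example with `-fopenmp`).

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_ancestor_thread_num(int level) { (void)level; return 0; }
#endif

/* Returns how many threads observed an unexpected thread number. */
int count_thread_num_mismatches(void) {
  int mismatches = 0;
#pragma omp parallel num_threads(2) /* R1: regular parallel region */
  {
    int tid_r1 = omp_get_thread_num();
#pragma omp parallel num_threads(1) /* R2: serialized */
    {
#pragma omp parallel num_threads(1) /* R3: serialized */
      {
        /* Inside R2 and R3 the thread is alone, so it is thread 0 there;
           inside R1 it keeps the number it had in that team. */
        int bad = (omp_get_ancestor_thread_num(3) != 0) +
                  (omp_get_ancestor_thread_num(2) != 0) +
                  (omp_get_ancestor_thread_num(1) != tid_r1);
#pragma omp atomic
        mismatches += bad;
      }
    }
  }
  return mismatches;
}
```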
Vladimir Inđić f2410bfb1c [OpenMP][OMPT][clang] task frame support fixed in __kmpc_fork_call
__kmp_fork_call sets the enter_frame of the active task (th_current_task)
before a new parallel region begins. After the region is finished, the
enter_frame is cleared.

The old implementation of __kmpc_fork_call didn’t clear the enter_frame of
the active task.

Also, the way of initializing the enter_frame of the active task was wrong.
Consider the following two OpenMP programs.

The first program: Let R1 be the serialized parallel region that encloses
another serialized parallel region R2. Assume that the thread that executes R2
is going to create a new serialized parallel region R3 by executing
__kmpc_fork_call. This thread is responsible for setting the enter_frame of R2's
implicit task. Note that the information about R2's implicit task is present
inside master_th->th.th_current_task at this moment, while lwt represents the
information about R1's implicit task. The old implementation uses lwt and
resets enter_frame of R1's implicit task instead of R2's implicit task. The
new implementation uses master_th->th.th_current_task instead.

The second program: Consider an OpenMP program that contains a parallel region
R1 which encloses an explicit task T. Assume that a thread creates another
parallel region R2 during the execution of T. __kmpc_fork_call is
responsible for creating R2 and setting the enter_frame of T, whose information
is present inside master_th->th.th_current_task.
The old implementation tries to set the frame of
parent_team->t.t_implicit_task_taskdata[tid], which corresponds to the implicit
task of R1 instead of T.

Differential Revision: https://reviews.llvm.org/D112419
2021-10-25 18:21:19 +02:00
Joachim Protze 7368227965 [OpenMP][Tests] Test omp_get_wtime for invariants
As discussed in D108488, testing for invariants of omp_get_wtime would be more
reliable than testing for duration of sleep, as return from sleep might be
delayed due to system load.

Alternatively/in addition, we could compare the time measured by omp_get_wtime
to time measured with C++11 chrono (for portability?).

Differential Revision: https://reviews.llvm.org/D112458
2021-10-25 18:20:59 +02:00
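
The commit above does not list which invariants the test checks. As a sketch under the assumption that the key invariant is "successive readings never decrease", a check could look like this. The names `wall_seconds` and `timer_is_monotonic` are hypothetical, and `wall_seconds` is a C11 stand-in for omp_get_wtime so that the sketch builds without OpenMP.

```c
#include <time.h>

/* Stand-in timer: seconds from the C11 calendar clock. In an OpenMP
   program one would pass omp_get_wtime instead. */
double wall_seconds(void) {
  struct timespec ts;
  timespec_get(&ts, TIME_UTC);
  return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Returns 1 if `samples` consecutive readings of `now` never decrease,
   0 as soon as one reading is smaller than the previous one. */
int timer_is_monotonic(double (*now)(void), int samples) {
  double prev = now();
  for (int i = 1; i < samples; ++i) {
    double cur = now();
    if (cur < prev)
      return 0; /* time went backwards */
    prev = cur;
  }
  return 1;
}
```

Unlike a sleep-duration check, this property does not depend on how long the system delays the thread between readings.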
Joachim Protze 3f229f42b7 [OpenMP][Tests][NFC] Actually check for test outcome
The CHECK: line in the test had no effect, because the test does not
pipe to FileCheck. Since the test only checks for a single value,
encode the result in the return value of the test.
2021-10-25 18:20:12 +02:00
Joachim Protze 047890bc3f [OpenMP][Tests][NFC] Mark tests trying to link COI as unsupported
For some tests with target-related functionality icc 18/19 tries to link
libioffload_target.so.5, which fails for missing COI symbols.
2021-10-25 18:20:12 +02:00
Joachim Protze d7fdd236d5 [OpenMP][Tests][NFC] Replace atomic increment by reduction
Also mark the test as unsupported by intel-21, because the test does
not terminate.
2021-10-25 18:20:12 +02:00
Joachim Protze 38f78dd2e2 [OpenMP][Tools][NFC] Fix C99-style declaration of iteration variables
Where possible change to declare the variable before the loop.
Where not possible, specifically request -std=c99 (could be limited to
specific compilers like icc).
2021-10-25 18:20:12 +02:00
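
For illustration of the change described in the commit above, a loop written with the iteration variable declared before the loop looks like this (the function `sum_first_n` is a made-up example, not code from the patch):

```c
/* The iteration variable is declared before the loop, so the code does not
   depend on C99 loop-scoped declarations. */
int sum_first_n(int n) {
  int i;
  int sum = 0;
  for (i = 1; i <= n; ++i)
    sum += i;
  return sum;
}
```

The C99 form `for (int i = 1; ...)` is equivalent here but, as the commit notes, may require explicitly requesting `-std=c99` with some compilers.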
Vladimir Inđić ba02586fbe [OpenMP][OMPT][GOMP] task frame support in KMP_API_NAME_GOMP_PARALLEL_SECTIONS
KMP_API_NAME_GOMP_PARALLEL_SECTIONS function was missing the task frame support.
This patch introduced a fix that properly sets the exit_frame of
the innermost implicit task that corresponds to the parallel section construct,
as well as the enter_frame of the task that encloses the mentioned implicit task.

This patch also introduced a simple test case sections_serialized.c that contains
serialized parallel section construct and validates whether the mentioned
task frames are set correctly.

Differential Revision: https://reviews.llvm.org/D112205
2021-10-22 11:01:10 -05:00
AndreyChurbanov 52f4922ebb [OpenMP][NFC] skip atomic tests for non-x86 arch 2021-10-21 21:51:33 +03:00
Nawrin Sultana 99d1ce4a62 [OpenMP] Add GOMP allocator functions
This patch adds GOMP_alloc and GOMP_free functions of LIBGOMP.

Differential revision: https://reviews.llvm.org/D111673
2021-10-20 11:37:29 -05:00
AndreyChurbanov 63f8099e23 [OpenMP] libomp: add check of task function pointer for NULL.
This patch simplifies the compiler implementation of the "taskwait nowait"
construct. The "taskwait nowait" is semantically equivalent to an empty task.
Instead of creating an empty routine as a task entry, the compiler can just pass
a NULL pointer to the runtime. The runtime will then do all the work on
dependences and return because the task routine is absent.

Differential Revision: https://reviews.llvm.org/D112015
2021-10-18 19:48:30 +03:00
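
The idea in the commit above (handle dependences, then return if there is no task routine) can be sketched with a toy task descriptor. Everything here is hypothetical: `demo_task_t`, `resolve_dependences`, and `run_task` are illustrative names, and libomp's real task structures differ.

```c
#include <stddef.h>

/* Hypothetical task descriptor; not libomp's actual layout. */
typedef struct demo_task {
  int (*routine)(void *); /* task entry point, or NULL for an empty task */
  void *arg;
  int deps_resolved; /* set once the dependence bookkeeping has run */
} demo_task_t;

/* Placeholder for the dependence handling the runtime performs. */
static void resolve_dependences(demo_task_t *task) { task->deps_resolved = 1; }

/* Returns the routine's result, or 0 when there is no routine to run. */
int run_task(demo_task_t *task) {
  resolve_dependences(task); /* always done, even for an empty task */
  if (task->routine == NULL)
    return 0; /* "taskwait nowait": nothing to execute */
  return task->routine(task->arg);
}
```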
@vladaindjic 59a994e8da [OpenMP][OMPT] thread_num determination for programs with explicit tasks
__ompt_get_task_info_internal is now able to determine the right value of the
“thread_num” argument during the execution of an explicit task.

During the execution of a while loop that iterates over the ancestor tasks
hierarchy, the “prev_team” variable was always set to “team” variable at the
beginning of each loop iteration.

Assume that the program contains a parallel region which encloses an explicit
task executed by the worker thread of the region. Also assume that the tool
inquires the “thread_num” of a worker thread for the implicit task that
corresponds to the region (task at “ancestor_level == 1”) and expects to
receive the value of “thread_num > 0”.
After the loop finishes, both “team” and “prev_team” variables are equal and
point to the team information of the parallel region.
The “thread_num” is set to “prev_team->t.t_master_tid”, that is equal to
“team->t.t_master_tid”. In this case, “team->t.t_master_tid” is 0, since
the master thread of the region is the initial master thread of the program.
This leads to a contradiction.

To prevent this, “prev_team” variable is set to “team” variable only at the
time when the loop that has already encountered the implicit task (“taskdata”
variable contains the information about an implicit task) continues iterating
over the implicit task’s ancestors, if any.

After the mentioned loop finishes, the “prev_team” variable might be equal to
NULL. This means that the task at requested “ancestor_level” belongs to the
innermost parallel region, so the “thread_num” will be determined by calling
the “__kmp_get_tid”.

To prove that this patch works, the test case “explicit_task_thread_num.c” is
provided.
It contains the example of the program explained earlier in the summary.

Differential Revision: https://reviews.llvm.org/D110473
2021-10-18 13:54:22 +02:00
Joachim Protze c93fb143b9 [OpenMP][Tests][NFC] Work around ICC bug
Older intel compilers miss the privatization of nested loop variables for
doacross loops. Declaring the variable in the loop makes the test more
robust.
2021-10-18 13:54:15 +02:00
Joachim Protze 5918688248 [OpenMP][Tests][NFC] Flagging OMPT tests as XFAIL for Intel compilers
With Intel 19 compiler the teams tests fail to link while trying to link
liboffload.
2021-10-18 13:50:03 +02:00
Peyton, Jonathan L acb3b187c4 [OpenMP][host runtime] Add initial hybrid CPU support
Detect, through CPUID.1A, and show the user the different core types through
the KMP_AFFINITY=verbose mechanism. Offer __kmp_is_hybrid_cpu() so that future
runtime optimizations can tell whether they are running on a hybrid system.

Differential Revision: https://reviews.llvm.org/D110435
2021-10-14 16:49:42 -05:00
Peyton, Jonathan L b840d3ab0d [OpenMP][host runtime] small fixup of RTM CPUID bit check 2021-10-14 16:49:42 -05:00
Peyton, Jonathan L 50b68a3d03 [OpenMP][host runtime] Add support for teams affinity
This patch implements teams affinity on the host.
The default is spread. A user can specify either spread, close, or
primary using KMP_TEAMS_PROC_BIND environment variable. Unlike
OMP_PROC_BIND, KMP_TEAMS_PROC_BIND is only a single value and is not a
list of values. The values follow the same semantics under the OpenMP
specification for parallel regions except T is the number of teams in
a league instead of the number of threads in a parallel region.

Differential Revision: https://reviews.llvm.org/D109921
2021-10-14 16:30:28 -05:00
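
A parser for the single-valued setting described in the commit above might look like the sketch below. The enum, the function name `parse_teams_proc_bind`, the case-sensitive matching, and the choice to reject unknown values are assumptions of this sketch, not documented libomp behavior.

```c
#include <string.h>

typedef enum {
  TEAMS_BIND_SPREAD, /* the default, per the commit message */
  TEAMS_BIND_CLOSE,
  TEAMS_BIND_PRIMARY
} teams_bind_t;

/* Parses one KMP_TEAMS_PROC_BIND value. Returns 0 on success and -1 for an
   unrecognized value. An unset (NULL) or empty value selects the default. */
int parse_teams_proc_bind(const char *value, teams_bind_t *out) {
  *out = TEAMS_BIND_SPREAD;
  if (value == NULL || value[0] == '\0')
    return 0;
  if (strcmp(value, "spread") == 0)
    return 0;
  if (strcmp(value, "close") == 0) {
    *out = TEAMS_BIND_CLOSE;
    return 0;
  }
  if (strcmp(value, "primary") == 0) {
    *out = TEAMS_BIND_PRIMARY;
    return 0;
  }
  return -1; /* lists such as "spread,close" are not accepted */
}
```

A caller would typically pass `getenv("KMP_TEAMS_PROC_BIND")` as the first argument.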
AndreyChurbanov 621d7a75b1 [OpenMP] libomp: add atomic functions for new OpenMP 5.1 atomics.
Added functions that implement "atomic compare".
Though clang does not use library interfaces to implement OpenMP atomics,
the functions are added for consistency.
Also added missing functions for 80-bit floating min/max atomics.

Differential Revision: https://reviews.llvm.org/D110109
2021-10-13 21:02:18 +03:00
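
To illustrate what an "atomic compare" operation does, here is a sketch built on C11 `<stdatomic.h>` rather than on the libomp entry points added by the commit above. The function names `atomic_cas_int` and `atomic_min_int` are hypothetical.

```c
#include <stdatomic.h>

/* Atomically performs `if (*x == expected) *x = desired;` and returns the
   value that was observed in *x before the operation. */
int atomic_cas_int(atomic_int *x, int expected, int desired) {
  /* On failure the exchange stores the observed value into `expected`;
     on success `expected` already equals the old value. */
  atomic_compare_exchange_strong(x, &expected, desired);
  return expected;
}

/* Atomically performs `if (*x > value) *x = value;` (an atomic min). */
void atomic_min_int(atomic_int *x, int value) {
  int seen = atomic_load(x);
  while (seen > value && !atomic_compare_exchange_weak(x, &seen, value)) {
    /* `seen` was refreshed by the failed exchange; retry with it */
  }
}
```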
AndreyChurbanov 6e98ec9b20 [OpenMP] libomp: fix ittnotify usage.
Replaced storing of ittnotify domain array index into
location info structure (which is now read-only) with storing of
(location info address + ittnotify domain + team size) into hash map.
Replaced __kmp_itt_barrier_domains and __kmp_itt_imbalance_domains arrays with
__kmp_itt_barrier_domains hash map; __kmp_itt_region_domains and
__kmp_itt_region_team_size arrays with __kmp_itt_region_domains hash map.
Basic functionality did not change (at least tried to not change).

The patch fixes https://bugs.llvm.org/show_bug.cgi?id=48644.

Differential Revision: https://reviews.llvm.org/D111580
2021-10-13 20:49:05 +03:00
AndreyChurbanov 5e58b63b28 [OpenMP] libomp: fix warning on comparison of integer expressions of different signedness
Replaced macro with global variable of corresponding type.

Differential Revision: https://reviews.llvm.org/D111562
2021-10-13 20:11:47 +03:00
AndreyChurbanov f5c0c9179f [OpenMP] libomp: add OpenMP 5.1 memory allocation routines.
Aligned allocation routines added.
Fortran interfaces added for all allocation routines.

Differential Revision: https://reviews.llvm.org/D110923
2021-10-11 19:25:00 +03:00
Martin Storsjö dec2257f35 [openmp] Fix a typo in a test REQUIRES line
Differential Revision: https://reviews.llvm.org/D110963
2021-10-03 23:51:11 +03:00
Peyton, Jonathan L 343b9e8590 [OpenMP][host runtime] Introduce kmp_cpuinfo_flags_t to replace integer flags
Store CPUID support flags as bits instead of using entire integers.

Differential Revision: https://reviews.llvm.org/D110091
2021-10-01 11:08:39 -05:00
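
Storing support flags as bits instead of whole integers, as the commit above describes, can be sketched with C bit-fields. The struct below is hypothetical: the field names are chosen for illustration and the actual members of `kmp_cpuinfo_flags_t` may differ.

```c
/* Hypothetical flag set: each capability takes one bit instead of a full
   int. */
typedef struct demo_cpu_flags {
  unsigned sse2 : 1;
  unsigned rtm : 1;
  unsigned hybrid : 1;
  unsigned reserved : 29;
} demo_cpu_flags_t;

/* Builds the flag set from individual yes/no answers. */
demo_cpu_flags_t make_cpu_flags(int has_sse2, int has_rtm, int is_hybrid) {
  demo_cpu_flags_t f = {0};
  f.sse2 = has_sse2 != 0;
  f.rtm = has_rtm != 0;
  f.hybrid = is_hybrid != 0;
  return f;
}
```

On common ABIs such a struct fits in a single 32-bit word, whereas three separate integer flags would take three.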
Peyton, Jonathan L 957b4c5750 [OpenMP][testing] increase threshold for omp_get_wtime test 2021-10-01 11:07:41 -05:00
@vladaindjic 5357a98c82 [OpenMP] libomp: Usage of TASK_TIED constant inside kmp_gsupport.cpp
The minor code refactoring introduces the TASK_TIED constant inside
kmp_gsupport.cpp as a replacement for the literal value 1.
The mentioned constant is now used in both kmp_tasking.cpp and
kmp_gsupport.cpp files.

Differential Revision: https://reviews.llvm.org/D110441
2021-09-27 19:45:56 +03:00
Usman Nadeem 248342b7c7 [OpenMP][OMPD] Fix compile error when OMPD is not supported
Differential Revision: https://reviews.llvm.org/D110120

Change-Id: I9d39dacfab5b7fbab37ee4b4d960d51e0892b24d
2021-09-21 12:45:15 -07:00
Peyton, Jonathan L 1e45cd75df [OpenMP][host runtime] Fix indirect lock table race condition
The indirect lock table can exhibit a race condition during initializing
and setting/unsetting locks. This occurs if the lock table is
resized by one thread (during an omp_init_lock) and accessed (during an
omp_set|unset_lock) by another thread.

The runtime/test/lock/omp_init_lock.c test exposed this issue and
will fail if run enough times.

This patch restructures the lock table so pointer/iterator validity is
always kept. Instead of reallocating a single table to a larger size, the
lock table begins preallocated to accommodate 8K locks. Each row of the
table is allocated as needed with each row allowing 1K locks. If the 8K
limit is reached for the initial table, then another table, capable of
holding double the number of locks, is allocated and linked
as the next table. The indices stored in the user's locks take this
linked structure into account when finding the lock within the table.

Differential Revision: https://reviews.llvm.org/D109725
2021-09-20 13:01:58 -05:00
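
The table layout described in the commit above (rows of 1K locks allocated on demand, an initial table of 8K locks, and a linked next table with double the capacity) can be sketched as follows. All type and function names are hypothetical, and the sketch is single-threaded: the synchronization that the real runtime needs around allocation is deliberately left out.

```c
#include <stdlib.h>

enum { ROW_SIZE = 1024, INITIAL_ROWS = 8 }; /* 8 rows x 1K = 8K locks */

typedef struct demo_lock { int owner; } demo_lock_t;

typedef struct lock_table {
  demo_lock_t **rows;      /* nrows row pointers, rows allocated on demand */
  size_t nrows;
  struct lock_table *next; /* follow-up table with twice the capacity */
} lock_table_t;

static lock_table_t *table_create(size_t nrows) {
  lock_table_t *t = malloc(sizeof *t);
  if (t == NULL)
    return NULL;
  t->rows = calloc(nrows, sizeof *t->rows);
  if (t->rows == NULL) {
    free(t);
    return NULL;
  }
  t->nrows = nrows;
  t->next = NULL;
  return t;
}

lock_table_t *lock_table_new(void) { return table_create(INITIAL_ROWS); }

/* Returns the lock with global index `idx`, allocating tables and rows as
   needed, or NULL if an allocation fails. Nothing is ever reallocated, so
   pointers handed out earlier stay valid. */
demo_lock_t *lock_table_get(lock_table_t *t, size_t idx) {
  while (idx >= t->nrows * ROW_SIZE) {
    idx -= t->nrows * ROW_SIZE; /* skip past this table's index range */
    if (t->next == NULL) {
      t->next = table_create(t->nrows * 2);
      if (t->next == NULL)
        return NULL;
    }
    t = t->next;
  }
  size_t row = idx / ROW_SIZE;
  if (t->rows[row] == NULL) {
    t->rows[row] = calloc(ROW_SIZE, sizeof **t->rows);
    if (t->rows[row] == NULL)
      return NULL;
  }
  return &t->rows[row][idx % ROW_SIZE];
}
```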
AndreyChurbanov 59b877d001 [OpenMP] NFC: add type casts to silence gcc warnings 2021-09-17 19:49:40 +03:00
AndreyChurbanov 7f1a6d891e [OpenMP] libomp: Update third-party sources of ittnotify client code.
The third-party ittnotify sources updated from https://github.com/intel/ittapi.
Changes applied:
- llvm license added to all files; initial BSD license saved in LICENSE.txt;
- clang-formatted;
- renamed *.c to *.cpp, similar to what we did with all our sources;
- added #include "kmp_config.h" with definition of INTEL_ITTNOTIFY_PREFIX macro
  into ittnotify_static.cpp.

Differential Revision: https://reviews.llvm.org/D109333
2021-09-17 19:38:34 +03:00
Peyton, Jonathan L 258e27aae1 [OpenMP] Add support for GOMP depobj
GOMP depobjs are represented as a two-element intptr_t array. The first
element is the base address of the dependency and the second element
is the flag indicating the type the depobj represents.

Differential Revision: https://reviews.llvm.org/D108790
2021-09-15 12:47:08 -05:00
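
The two-element layout described in the commit above can be sketched like this. The helper names and the flag values in the enum are placeholders invented for the example; the actual flag encoding is defined by libgomp and is not reproduced here.

```c
#include <stdint.h>

/* Placeholder dependence kinds; not the values libgomp uses. */
enum { DEMO_DEP_IN = 1, DEMO_DEP_OUT = 2, DEMO_DEP_INOUT = 3 };

typedef intptr_t demo_depobj_t[2];

void depobj_init(demo_depobj_t d, void *addr, intptr_t kind) {
  d[0] = (intptr_t)addr; /* element 0: base address of the dependency */
  d[1] = kind;           /* element 1: flag giving the dependence type */
}

void *depobj_addr(const demo_depobj_t d) { return (void *)d[0]; }

intptr_t depobj_kind(const demo_depobj_t d) { return d[1]; }
```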
Hansang Bae 3976035d68 [OpenMP] Fix line truncation in omp_lib.h
Fixed code that exceeds the 72-column limit.

Differential Revision: https://reviews.llvm.org/D109469
2021-09-09 09:33:45 -05:00
AndreyChurbanov d40108e0af [OpenMP] libomp: runtime part of omp_all_memory task dependence implementation.
New omp_all_memory task dependence type is implemented.
Library recognizes the new type via either
(dependence_address == NULL && dependence_flag == 0x80)
or
(dependence_address == SIZE_MAX).
A task with new dependence type depends on each preceding task
with any dependence type (kind of a dependence barrier).

Differential Revision: https://reviews.llvm.org/D108574
2021-09-08 16:55:32 +03:00
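
The recognition rule quoted in the commit above translates directly into a predicate. The function name `is_all_memory_dep` and the macro name are hypothetical; the two conditions are the ones stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ALL_MEMORY_FLAG 0x80u

/* Returns nonzero if the (address, flag) pair denotes omp_all_memory:
   either a NULL address with flag 0x80, or the address SIZE_MAX. */
int is_all_memory_dep(uintptr_t address, unsigned flag) {
  return (address == 0 && flag == DEMO_ALL_MEMORY_FLAG) ||
         address == SIZE_MAX;
}
```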
Hansang Bae 224f51d879 [OpenMP] Add interface for 5.1 scope construct
The new interface only marks begin/end of a scope construct for
corresponding OMPT events, and we can use existing interfaces for
reduction operations.

Differential Revision: https://reviews.llvm.org/D108062
2021-09-07 11:22:21 -05:00
Nawrin Sultana c24da72fa4 [OpenMP] Change monotonicity of dynamic schedule
This patch changes the default monotonicity of dynamic schedule from
monotonic to non-monotonic when no modifier is specified.

Differential Revision: https://reviews.llvm.org/D109026
2021-09-07 08:18:46 -05:00
Shilei Tian 8442967fe3 [OpenMP] Fix task wait doesn't work as expected in serialized team
As discussed in D107121, task wait doesn't work when a regular task T depends on
a detached task or a hidden helper task T' in a serialized team. The root cause is,
since the team is serialized, the last task will not be tracked by
`td_incomplete_child_tasks`. When T' is finished, it first releases its
dependences, and then decrements its parent counter. So far so good. For the thread
that is running task wait, if at the moment it is still spinning and trying to
execute tasks, it is fine because it can detect the new task and execute it.
However, if it happens to finish the function `flag.execute_tasks(...)`, it will
be broken because `td_incomplete_child_tasks` is 0 now.

In this patch, we update the rule to track children tasks a little bit. If the
task team encounters a proxy task or a hidden helper task, all following tasks
will be tracked.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D107496
2021-08-31 12:15:46 -04:00
Peyton, Jonathan L d39d3a327b [OpenMP][test] fix omp_get_wtime.c test to be more accommodating
The omp_get_wtime.c test fails intermittently if the recorded times are
off by too much which can happen when many tests are run in parallel.

Instead of failing if one timing is a little off, take average of 100
timings minus the 10 worst.

Differential Revision: https://reviews.llvm.org/D108488
2021-08-23 08:13:42 -05:00
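
The averaging scheme mentioned in the commit above (100 timings minus the 10 worst) amounts to a trimmed mean. The sketch below assumes that "worst" means the largest values, which the commit message does not spell out; the function name `mean_without_worst` is hypothetical.

```c
#include <stdlib.h>

static int cmp_double(const void *a, const void *b) {
  double x = *(const double *)a, y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Sorts `values` in place (ascending) and returns the mean of all but the
   `drop` largest entries. Returns 0.0 if no entries would remain. */
double mean_without_worst(double *values, size_t n, size_t drop) {
  if (drop >= n)
    return 0.0;
  qsort(values, n, sizeof *values, cmp_double);
  double sum = 0.0;
  for (size_t i = 0; i < n - drop; ++i)
    sum += values[i];
  return sum / (double)(n - drop);
}
```

For the case in the commit, the call would be `mean_without_worst(timings, 100, 10)`.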
Vignesh Balasubramanian 589519b9ab [OpenMP][OMPD]Code movement required for OMPD
These changes don't come under OMPD guard as it is a movement of existing code to capture parallel behavior correctly.
"Runtime Entry Points for OMPD" like "ompd_bp_parallel_begin" and "ompd_bp_parallel_begin" should be placed at the correct execution point for the debugging tool to access proper handles/data.
Without the below changes, in certain cases, debugging tool will pick the wrong parallel and task handle.

Reviewed By: @hbae
Differential Revision: https://reviews.llvm.org/D100366
2021-08-20 14:36:22 +05:30
Shilei Tian 1d8d43ae61 [OpenMP] Use `__kmpc_give_task` in `__kmp_push_task` when encountering a hidden helper task
This patch replaces the current implementation, which overwrites `gtid` and
`thread`, with `__kmpc_give_task`.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D106977
2021-08-19 20:49:29 -04:00
Martin Storsjö f5616a981c [OpenMP] Fix the usage of sscanf on MinGW
KMP_SSCANF only evaluates to sscanf_s within
    #if KMP_OS_WINDOWS && KMP_MSVC_COMPAT
so we need to pass the sscanf_s specific parameters within a similar
condition.

Differential Revision: https://reviews.llvm.org/D108196
2021-08-17 21:36:09 +03:00
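
The point of the commit above is that the extra size argument must be passed only where the call really resolves to sscanf_s. A sketch of that pattern follows; it uses the compiler macros `_WIN32` and `_MSC_VER` as a stand-in for the `KMP_OS_WINDOWS && KMP_MSVC_COMPAT` condition, and the function name `parse_number_and_suffix` is hypothetical.

```c
#include <stdio.h>

/* Parses "<number><letter>", for example "12k". Returns the number of
   fields that were read. */
int parse_number_and_suffix(const char *text, int *number, char *suffix) {
#if defined(_WIN32) && defined(_MSC_VER)
  /* sscanf_s expects a buffer size after every %c and %s argument. */
  return sscanf_s(text, "%d%c", number, suffix, 1u);
#else
  /* Plain sscanf (MinGW and non-Windows targets) takes no size argument. */
  return sscanf(text, "%d%c", number, suffix);
#endif
}
```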
Peyton, Jonathan L b4a1f441d9 [OpenMP] Add a few small fixes
* Add comment to help ensure new construct data are added in two places
* Check for division by zero in the loop worksharing code
* Check for syntax errors in parrange parsing

Differential Revision: https://reviews.llvm.org/D105929
2021-08-16 10:02:49 -05:00
Peyton, Jonathan L 6eeb4c1f32 [OpenMP] Fix incorrect parameters to sscanf_s call
On Windows, the documentation states that when using sscanf_s,
each %c and %s specifier must also have an additional size parameter.
This patch adds the size parameter in the one place where %c is
used.

Differential Revision: https://reviews.llvm.org/D105931
2021-08-16 09:59:21 -05:00
AndreyChurbanov 52cac541d4 [OpenMP] libomp: cleanup: minor fixes to silence static analyzer.
Added a couple more checks to silence the KlocWork static code analyzer.

Differential Revision: https://reviews.llvm.org/D107348
2021-08-16 13:39:23 +03:00
AndreyChurbanov f94da67f49 [OpenMP][NFC] libomp: reduced timeouts in the test from 50 to 2 sec. 2021-08-11 17:58:52 +03:00