Check the Task Scheduling Constraint (TSC) when stealing an untied task.
This is needed because an untied task can produce tied children,
which can break the TSC if the untied task is not a descendant of the current task.
This can cause livelock on complex tasking tests
(e.g. kastors/strassen-task-dep).
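The descendant check described above could look roughly like the following. This is a minimal sketch, not the runtime's code: `task_t`, `is_descendant_of` and `may_steal` are hypothetical names, and the real task descriptor carries far more state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified task descriptor (not the runtime's own type). */
typedef struct task {
    struct task *parent; /* NULL for the root task */
    int depth;           /* nesting level, root is 0 */
    bool tied;
} task_t;

/* Walks up the parent chain of `candidate` until it reaches the depth of
 * `ancestor`, then reports whether it arrived at `ancestor`. */
bool is_descendant_of(const task_t *candidate, const task_t *ancestor) {
    const task_t *t = candidate;
    while (t != NULL && t->depth > ancestor->depth)
        t = t->parent;
    return t == ancestor;
}

/* Assumed policy: the check is applied to every stolen task, tied or untied,
 * because an untied task may later create tied children. */
bool may_steal(const task_t *candidate, const task_t *current_tied) {
    if (current_tied == NULL)
        return true; /* no tied task constrains this thread */
    return is_descendant_of(candidate, current_tied);
}
```

With a root task that has two children `a` and `b`, an untied grandchild under `b` may be stolen while `b` is the current task but not while `a` is.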
Differential Revision: https://reviews.llvm.org/D26182
llvm-svn: 285703
Summary:
If directives are used in a macro, clang complains with:
```
src/projects/openmp/runtime/src/kmp_runtime.c:7486:2: error: embedding a directive within macro arguments has undefined behavior [-Werror,-Wembedded-directive]
#if KMP_USE_MONITOR
```
This patch fixes two occurrences of the issue in `kmp_runtime.cpp`.
Reviewers: tlwilmar, jlpeyton, AndreyChurbanov, Hahnfeld
Subscribers: Hahnfeld, openmp-commits
Differential Revision: https://reviews.llvm.org/D25823
llvm-svn: 284728
Function strerror_r() has different signatures in different
implementations of libc: glibc's version returns a char*, while BSDs
and musl return an int. libomp unconditionally assumes glibc on Linux
and thus fails to compile against musl-libc. This patch addresses this
issue.
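One way to cope with both signatures is sketched below. It is not necessarily what this patch does: it swaps in C11 `_Generic` to dispatch on whatever type `strerror_r()` returns, and the helper names (`errstr_from_int`, `errstr_from_ptr`, `describe_errno`) are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* XSI flavour (BSDs, musl): returns 0 on success and fills buf. */
const char *errstr_from_int(int rc, const char *buf) {
    return rc == 0 ? buf : "unknown error";
}

/* GNU flavour: returns the message pointer, which may or may not be buf. */
const char *errstr_from_ptr(char *msg, const char *buf) {
    (void)buf;
    return msg;
}

/* _Generic does not evaluate its controlling expression, so the call is
 * executed only once, as the argument of the selected helper. */
#define ERRSTR_RESULT(call, buf) \
    _Generic((call), int: errstr_from_int, char *: errstr_from_ptr)((call), (buf))

const char *describe_errno(int err, char *buf, size_t len) {
    return ERRSTR_RESULT(strerror_r(err, buf, len), buf);
}
```

The advantage of dispatching on the return type is that the code does not need to guess which libc or feature-test macros are in effect; the cost is a C11 requirement.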
Differential Revision: https://reviews.llvm.org/D25071
llvm-svn: 284492
New mixed type atomic routines added for regular capture operations as well as
reverse update/capture operations. LHS - all integer and float types (no
complex so far), RHS - float16.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D25275
llvm-svn: 284489
This change removes/disables unnecessary code when monitor thread is not used.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D25102
llvm-svn: 283577
Support finding lit as plain 'lit', which is the name used by setup.py
in LLVM's utils/lit.
Differential Revision: https://reviews.llvm.org/D25072
llvm-svn: 282876
Add a check for the "45" version to use the "201511" string for OpenMP 4.5;
otherwise "200505" is used in the Fortran module. Also fix the kmp_openmp_version
variable (used, for example, by the debugger) and kmp_version_omp_api, which is used
in the KMP_VERSION=1 output.
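The selection described above amounts to a small mapping. The sketch below covers only the two strings named in this message; `omp_version_date` is a hypothetical helper, and the actual check lives in the build system rather than in C.

```c
#include <string.h>

/* Returns the date string for a version selector such as "45".
 * Only the two cases mentioned in the commit message are modelled. */
const char *omp_version_date(const char *version) {
    if (strcmp(version, "45") == 0)
        return "201511"; /* OpenMP 4.5 */
    return "200505";     /* fallback named in the message */
}
```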
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D24761
llvm-svn: 282868
New routines should be used for atomics like "<int>OP=<float>" when <int> is
unsigned. Using the __kmpc_atomic_fixed<bits>_<op>_fp functions produces incorrect
results.
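The following non-atomic sketch illustrates why a signed routine gives wrong answers for an unsigned left-hand side. The function names are hypothetical stand-ins, and `double` replaces the wider floating-point type of the real routines.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a signed routine: the 32 bits are interpreted as int32_t. */
void fixed4_div_fp(void *lhs, double rhs) {
    int32_t v;
    memcpy(&v, lhs, sizeof v);
    v = (int32_t)((double)v / rhs);
    memcpy(lhs, &v, sizeof v);
}

/* Stand-in for an unsigned routine: the same bits are read as uint32_t. */
void fixed4u_div_fp(void *lhs, double rhs) {
    uint32_t v;
    memcpy(&v, lhs, sizeof v);
    v = (uint32_t)((double)v / rhs);
    memcpy(lhs, &v, sizeof v);
}
```

For the value 4000000000 divided by 2.0, the unsigned variant yields 2000000000, while the signed variant first sees -294967296 and ends up storing the bit pattern of -147483648, which reads back as 4147483648.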
Differential Revision: https://reviews.llvm.org/D24756
llvm-svn: 282509
This change set disables creation of the monitor thread by default. The global
counter maintained by the monitor thread was replaced by logic that uses system
time directly, and cyclic yielding on Linux targets was also removed since there
was no clear benefit to using it. Setting the KMP_USE_MONITOR variable to 1
re-enables creation of the monitor thread if it is really necessary for some
reason.
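A minimal sketch of the idea of reading system time directly instead of a monitor-maintained counter follows. It assumes the counter was used to decide when a waiting thread has exceeded a time limit; `now_ns` and `wait_expired` are hypothetical names, and C11 `timespec_get` is used here only for portability.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Current time in nanoseconds, read from the system clock on each call. */
uint64_t now_ns(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* True once the time elapsed since start_ns reaches limit_ns. */
bool wait_expired(uint64_t start_ns, uint64_t current_ns, uint64_t limit_ns) {
    return current_ns - start_ns >= limit_ns;
}
```

A waiting thread would record `now_ns()` when it starts spinning and call `wait_expired(start, now_ns(), limit)` inside its loop, so no helper thread has to keep a counter up to date.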
Differential Revision: https://reviews.llvm.org/D24739
llvm-svn: 282507
Fix lit search to correctly respect LIBOMP_LLVM_LIT_EXECUTABLE as full
program path.
The variable passed to find_program() is created by CMake as a cache
variable, and therefore can be directly overridden by the user. Since
this was the design of LIBOMP_LLVM_LIT_EXECUTABLE (as can be deduced
from the error messages) and there is no other use of LIT_EXECUTABLE,
remove the redundant variable and pass LIBOMP_LLVM_LIT_EXECUTABLE
directly to find_program().
Furthermore, the previous code did not work since the HINTS argument
specifies additional search directories rather than the expected full path.
Quoting the CMake documentation:
> 3. Search the paths specified by the HINTS option. These should be
> paths computed by system introspection, such as a hint provided by
> the location of another item already found. Hard-coded guesses should
> be specified with the PATHS option.
Differential Revision: https://reviews.llvm.org/D24710
llvm-svn: 281887
Introduce a new LIBOMP_INSTALL_ALIASES cache variable that can be used
to disable creating libgomp and libiomp5 aliases on 'make install'.
Those aliases are undesired e.g. on Gentoo systems where libomp is used
purely by clang.
Differential Revision: https://reviews.llvm.org/D24563
llvm-svn: 281512
Previous differentials D23305-D23310 changed task frame information management only for the kmp interface, but not for the whole gomp interface. This broke some test cases when building with gcc.
This patch fixes the broken task frame information for the gomp interface.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D24502
llvm-svn: 281468
If the current team is a serialized team (lwt), the frame information should be written to this data structure.
Before, nested serialized teams would overwrite the same task information.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23310
llvm-svn: 281467
The comment already states that this function should work similarly to __ompt_get_taskinfo.
The function only looked for lwt entries of the current team, but not when unrolling the parents. This fix aligns the implementation with __ompt_get_taskinfo.
The new test case creates a single-threaded team (->lwt) and then a nested active team.
Before, the innermost print_id(1) would deliver a different team than the outer print_id(0).
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23309
llvm-svn: 281466
The exit address is set when execution of a task is started and should be reset as soon as the execution is finished.
Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painful, so reset just after the invocation.
The testcase shows the effect of this patch:
Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task.
This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task.
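The set-then-reset pattern described above can be sketched as follows. The types and names (`task_frame_t`, `invoke_task`, `record_exit_seen`) are hypothetical, and the address of a local variable stands in for a real frame address.

```c
#include <stddef.h>

/* Hypothetical frame record holding an exit and a reenter address. */
typedef struct {
    void *exit_frame;
    void *reenter_frame;
} task_frame_t;

typedef void (*microtask_fn)(task_frame_t *frame, void *arg);

/* Sets the exit address for the duration of the task body and clears it
 * immediately after the body returns. */
void invoke_task(task_frame_t *frame, microtask_fn body, void *arg) {
    char marker;                 /* stands in for a runtime stack address */
    frame->exit_frame = &marker; /* set when execution starts */
    body(frame, arg);
    frame->exit_frame = NULL;    /* reset right after the invocation */
}

/* Example body: records whether an exit address was visible while running. */
void record_exit_seen(task_frame_t *frame, void *arg) {
    *(int *)arg = (frame->exit_frame != NULL);
}
```

Resetting in the caller rather than inside the invoked routine keeps the invoked routine untouched, which matters when that routine is written in assembly.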
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23307
llvm-svn: 281465
The latest OMPT spec changed the semantics of a task's reenter frame to be the application frame that will be entered when the runtime frame drops.
Before, it was the last frame in the runtime. This doesn't work for some gcc execution paths or even clang-generated code for :
Since there is no runtime frame between the executed task and the encountering task.
The test case compares exit and reenter addresses against addresses captured in application code.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23305
llvm-svn: 281464
OMPT tests can check for right frame information of tasks:
* parent_task_frame was directly printed as a pointer, but actually points to a struct ompt_frame {void*, void*}
* NULL is printed at the beginning of execution and loaded into the FileCheck variable [[NULL]]
* implicit tasks now also print their frame information
* macro to print frame address from application
* print task info for barrier begin
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23304
llvm-svn: 281463
Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we
use the get_max_proc() function which can vary based on the operating system.
For example on Windows with multiple processor groups, it might be the case that
the highest bit possible in the bitmask is not equal to the number of hardware
threads on the machine but something higher than that.
Differential Revision: https://reviews.llvm.org/D24206
llvm-svn: 281245
There is a bug in CMakeLists which causes powerpc64le systems to be recognized as big-endian. This patch fixes the issue.
Differential Revision: https://reviews.llvm.org/D23626
llvm-svn: 281068
Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device.
Also, added support for the environment variable OMP_DEFAULT_DEVICE.
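How an OMP_DEFAULT_DEVICE-style value might initialize a default-device setting is sketched below. All names are hypothetical and deliberately differ from the real API, the fallback policy for malformed values is an assumption, and the only fact relied on is that the value must be a non-negative integer.

```c
#include <limits.h>
#include <stdlib.h>

int default_device = 0; /* hypothetical stand-in for the runtime's setting */

/* Parses a non-negative integer; returns `fallback` when the value is
 * unset, empty, malformed, negative or out of range. */
int parse_default_device(const char *value, int fallback) {
    if (value == NULL || *value == '\0')
        return fallback;
    char *end = NULL;
    long n = strtol(value, &end, 10);
    if (*end != '\0' || n < 0 || n > INT_MAX)
        return fallback;
    return (int)n;
}

void set_default_device(int device) { default_device = device; }
int get_default_device(void) { return default_device; }
```

At startup a runtime could call `set_default_device(parse_default_device(getenv("OMP_DEFAULT_DEVICE"), 0))`, after which the getter and setter behave like a plain read and write of that setting.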
Differential Revision: https://reviews.llvm.org/D23587
llvm-svn: 281065
When affinity isn't supported, __kmp_affinity_compact doesn't exist. The
problem is that in kmp_affinity.h there is a function which uses it without the
proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to
ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on
it, but I think it is cleaner to have it under the proper guard. Since the
function is only used in the kmp_affinity.cpp file and there aren't any plans to
use it elsewhere, I have moved it there.
llvm-svn: 280542
The __kmp_affinity_determine_capable() functions are highly operating system
specific. This change has the functions use the type they expect explicitly.
llvm-svn: 280538
If the atomic reduction method is not available (the compiler can't generate
it), an assertion failure occurred when KMP_FORCE_REDUCTION=atomic was specified.
This change replaces the assertion with a warning and sets the reduction method
to the default one, 'critical'.
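The warn-and-fall-back behaviour can be sketched as below. The enum and function names are hypothetical, and the warning text is illustrative rather than the runtime's actual message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical set of reduction methods. */
typedef enum { REDUCE_CRITICAL, REDUCE_ATOMIC, REDUCE_TREE } reduce_method_t;

/* Returns the method to use: a forced 'atomic' request degrades to
 * 'critical' with a warning instead of triggering an assertion. */
reduce_method_t resolve_forced_reduction(reduce_method_t forced,
                                         bool atomic_available) {
    if (forced == REDUCE_ATOMIC && !atomic_available) {
        fprintf(stderr,
                "warning: atomic reduction unavailable, using critical\n");
        return REDUCE_CRITICAL;
    }
    return forced;
}
```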
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D23990
llvm-svn: 280519
Summary:
On FreeBSD, linking the misc_bugs/omp_foreign_thread_team_reuse.c test
case fails with:
/usr/local/bin/ld: /tmp/omp_foreign_thread_team_reuse-c5e71b.o: undefined reference to symbol 'pthread_create@@FBSD_1.0'
This is because the program is linked without `-lpthread`. Since the
%libomp-compile-and-run macro does not allow that option to be added to
the compile command line, split it up and add the required `-lpthread`
between %libomp-compile and %libomp-run.
Reviewers: jlpeyton, hfinkel, Hahnfeld
Subscribers: Hahnfeld, emaste, openmp-commits
Differential Revision: https://reviews.llvm.org/D23084
llvm-svn: 278036
Consider the following code:
    int dep;
    #pragma omp target nowait depend(out: dep)
    {
        sleep(1);
    }
    #pragma omp task depend(in: dep)
    {
        printf("Task with dependency\n");
    }
    printf("Doing some work...\n");
In its current state the runtime will block on the second task and not
continue execution.
Differential Revision: https://reviews.llvm.org/D23116
llvm-svn: 277992
Consider the following code which may be executed by a serial team:
    int dep;
    #pragma omp target nowait depend(out: dep)
    {
        sleep(1);
    }
    #pragma omp task depend(in: dep)
    {
        #pragma omp target nowait
        {
            sleep(1);
        }
    }
Here the explicit task may not be freed until the nested proxy task has
finished. The current code hasn't considered this and called __kmp_free_task
anyway which triggered an assert because of remaining incomplete children:
KMP_DEBUG_ASSERT( TCR_4(taskdata->td_incomplete_child_tasks) == 0 );
Differential Revision: https://reviews.llvm.org/D23115
llvm-svn: 277991
node->dn.task is only filled after the dependencies are already processed.
This currently leads to unhelpful output from KA_TRACE or even a crash
if one enables KMP_SUPPORT_GRAPH_OUTPUT.
llvm-svn: 277717
Summary:
Android does not have pthread_cancel. Disable KMP_CANCEL_THREADS if
__ANDROID__ is defined.
Subscribers: tberghammer, srhines, openmp-commits, danalbert
Differential Revision: https://reviews.llvm.org/D23029
llvm-svn: 277618
This patch enables balanced affinity on machines that do not have
hardware threads and have cores clustered into packages. In fact, the
balancing algorithm could be generalized for any arrangement with
at least two levels of hierarchy (depth > 1).
Differential Revision: https://reviews.llvm.org/D22365
llvm-svn: 277212
Summary:
When compiling the runtime library with clang we get warnings like:
```
error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs]
va_start( args, id );
^
note: parameter of type 'kmp_i18n_id_t' (aka 'kmp_i18n_id') is declared here
kmp_i18n_id_t id,
```
My understanding is that the va_start macro only sees the promoted type, so it cannot know the exact type of the argument. This can fail on some targets because the calling convention may not be implemented correctly for that case.
This patch fixes that by using a built-in type in the function signature.
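The general shape of such a fix is sketched below with hypothetical names (`msg_id_t`, `sum_after_id`): the last named parameter is a plain `int`, which is unaffected by default argument promotion, and it is converted back to the enum type inside the function.

```c
#include <stdarg.h>

typedef enum { MSG_HELLO = 0, MSG_BYE = 1 } msg_id_t; /* hypothetical */

/* Adds the id and all following int arguments up to a negative sentinel.
 * The parameter handed to va_start is an int, not the enum type. */
int sum_after_id(int raw_id, ...) {
    msg_id_t id = (msg_id_t)raw_id; /* convert back once inside */
    int total = (int)id;
    va_list args;
    va_start(args, raw_id);
    for (;;) {
        int v = va_arg(args, int);
        if (v < 0)
            break; /* negative value ends the list */
        total += v;
    }
    va_end(args);
    return total;
}
```

Callers can still pass the enum constant, for example `sum_after_id(MSG_BYE, 2, 3, -1)`, because it converts to `int` at the call site.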
Reviewers: tlwilmar, jlpeyton, AndreyChurbanov
Subscribers: arpith-jacob, carlo.bertolli, caomhin, openmp-commits
Differential Revision: https://reviews.llvm.org/D22427
llvm-svn: 276428
When linking with libhwloc, the ORDERED EPCC test slows down on big
machines (> 48 cores). Performance analysis showed that a cache thrash
was occurring and this padding helps alleviate the problem.
Also, inside the main spin-wait loop in kmp_wait_release.h, we can eliminate
the references to the global shared variables by creating a local
variable, oversubscribed, and checking that instead.
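Both ideas can be illustrated with a short sketch. It assumes a 64-byte cache line and uses hypothetical names throughout (`padded_flag_t`, `g_threads`, `g_procs`, `spin_until_done`); it counts would-be yields instead of actually yielding so that it stays portable.

```c
#include <stdalign.h>
#include <stdbool.h>

#define CACHE_LINE 64 /* assumed cache-line size */

/* Aligning the member pads the struct to a full cache line, so two flags
 * never share a line. */
typedef struct {
    alignas(CACHE_LINE) volatile int done;
} padded_flag_t;

int g_threads = 8; /* hypothetical shared globals */
int g_procs = 4;

/* Spins until the flag is set or max_spins is reached; returns the number
 * of spins. The shared globals are read once, before the loop. */
int spin_until_done(padded_flag_t *flag, int max_spins, int *yields) {
    const bool oversubscribed = g_threads > g_procs;
    int spins = 0;
    while (!flag->done && spins < max_spins) {
        if (oversubscribed)
            ++*yields; /* a real loop would yield the processor here */
        ++spins;
    }
    return spins;
}
```

Reading the globals once keeps the loop body from touching cache lines that other threads may be writing, at the price of not noticing a change in oversubscription during the wait.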
Differential Revision: http://reviews.llvm.org/D22093
llvm-svn: 274894