Fix static initializers to use the proper unlocked value for the poll
field of the tas and futex locks.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D33794
llvm-svn: 304828
Some code was restructured to move it under KMP_DEBUG. The rest is
formatting changes to fix some things broken by clang-format.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D33744
llvm-svn: 304438
With these settings, the create_hwloc_map() method was being called, triggering
an assert(). After some consideration, it was determined that disabling affinity
explicitly should disable hwloc as well, i.e., KMP_AFFINITY overrides
KMP_TOPOLOGY_METHOD. This change also lets the user know that the Hwloc
mechanism is being ignored when KMP_AFFINITY=disabled.
Differential Revision: https://reviews.llvm.org/D33208
llvm-svn: 304344
This change checks if the initial affinity mask is equal to exactly one
Windows processor group's affinity mask. If it is, then the code does not
respect the initial affinity mask and uses the entire machine instead.
The reasoning behind this is that, by default, Windows assigns exactly one
processor group as the initial affinity mask even when there are multiple
Windows processor groups available. Users typically want to use the whole
machine, so we ignore this special case and use the entire machine.
If the initial affinity mask is a proper subset of one group, or spans multiple
groups, then the initial affinity mask is respected since we can assume that the
operating system did not assign this initial affinity mask. This change only
affects machines with multiple processor groups.
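
The decision described above can be sketched as follows. This is a minimal
model, not the runtime's code: it assumes one 64-bit mask per processor group,
and the names GroupMasks and respect_initial_mask are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: one 64-bit affinity mask per Windows processor group.
using GroupMasks = std::vector<std::uint64_t>;

// Returns true if the initial mask should be respected, false if the whole
// machine should be used instead. `full` holds every processor of each group.
bool respect_initial_mask(const GroupMasks &initial, const GroupMasks &full) {
  assert(initial.size() == full.size());
  if (full.size() <= 1)
    return true; // machines with a single group are unaffected

  std::size_t groups_used = 0;
  std::size_t last_used = 0;
  for (std::size_t g = 0; g < initial.size(); ++g) {
    if (initial[g] != 0) {
      ++groups_used;
      last_used = g;
    }
  }

  // Exactly one complete group looks like the Windows default assignment,
  // so it is ignored in favor of the entire machine.
  if (groups_used == 1 && initial[last_used] == full[last_used])
    return false;

  // A proper subset of one group, or a mask spanning several groups, was
  // presumably chosen by the user and is respected.
  return true;
}
```

For a machine with two four-processor groups, an initial mask covering all of
group 0 and nothing else is ignored, while a mask covering half of group 0 is
respected.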
Differential Revision: https://reviews.llvm.org/D33210
llvm-svn: 304343
An assert() was being tripped when KMP_AFFINITY=respect was used with multiple
processor groups. The __kmp_affinity_create_proc_group_map() function can now
create an address2os object that contains a single group, because the
restriction that the process affinity mask must span multiple groups was removed.
llvm-svn: 303101
This patch contains the clang-format and cleanup of the entire code base. Some
of clang-format's changes made the code look worse in places. A best effort was
made to resolve the bulk of these problems, but many remain. Most of the
problems were mangling line-breaks and tabbing of comments.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D32659
llvm-svn: 302929
This patch changes the plugin interface so that:
1) future plugins can take advantage of systems with shared CPU/device storage
2) instead of using base addresses, target regions are launched by providing target addresses and base offsets explicitly.
Differential revision: https://reviews.llvm.org/D33028
llvm-svn: 302663
Older Hwloc libraries (< 1.10.0) offer neither the HWLOC_OBJ_NUMANODE nor the
HWLOC_OBJ_PACKAGE type. There they are named HWLOC_OBJ_NODE and
HWLOC_OBJ_SOCKET. This patch defines the newer names based on
the older names when using an older Hwloc.
Differential Revision: https://reviews.llvm.org/D32496
llvm-svn: 301349
Without this fix, the cancellation status for parallel, sections, and for
persists across construct boundaries.
Differential Revision: https://reviews.llvm.org/D31419
llvm-svn: 299434
This change slightly improves the performance of the KMP_YIELD_NOW() macro by
using the _rdtsc() intrinsic function if possible.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D31008
llvm-svn: 298314
Affinity initialization code expects __kmp_affinity_type to have the value
affinity_default by default, but the cleanup code does not properly set the
value back to affinity_default. This may introduce some issues when multiple
roots are trying to initialize/uninitialize the runtime successively.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D31012
llvm-svn: 298313
This change fixes an assertion failure in the case where KMP_AFFINITY is set
with 'proclist' specified but without 'explicit',
e.g., KMP_AFFINITY=verbose,proclist=[0-31]
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D30404
llvm-svn: 297480
Summary:
Bionic didn't get a GNU style strerror_r until Android M. Until then
we unconditionally exposed the POSIX one. Expand the check to account
for this.
Reviewers: pirama, AndreyChurbanov, jlpeyton
Reviewed By: jlpeyton
Subscribers: openmp-commits, srhines
Differential Revision: https://reviews.llvm.org/D30056
llvm-svn: 297235
Add build option LIBOMP_OMP_VERSION=50, 5.0 headers, and add the year/month
associated with OpenMP 5.0 in relevant source locations. Also, remove the
deprecated LIBOMP_OMP_VERSION=41 option.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D30450
llvm-svn: 297083
This adds AArch64 support to the recently added part of the runtime responsible for offloading to target. This piece of code allows offloading-to-self on AArch64 machines.
Differential Revision: https://reviews.llvm.org/D30644
llvm-svn: 297070
This section of code (__kmp_test_then_* functions) is guarded by
(KMP_ARCH_X86 || KMP_ARCH_X86_64) so it does not make sense to have other
architecture guards inside this section. Non-x86 architectures always
use intrinsics (__sync_*).
llvm-svn: 296525
When using -rtlib=libgcc, the fallback implementation of __atomic_*
builtins is provided via libatomic (included in GCC). However, neither
GCC itself nor clang link libatomic implicitly, and it seems that GCC
upstream expects projects to link it explicitly as necessary.
Since compiler-rt provides __atomic_* builtins directly in the main
library, check if they are provided by the default libraries first.
If they are not, check if -latomic is available to provide them
and add explicit -latomic for tests in this case.
This fixes unresolved __atomic_load() references when running openmp
tests on i386 with libgcc backend.
Differential Revision: https://reviews.llvm.org/D30083
llvm-svn: 296183
Add counter to count number of static_steal for loops
Add counter for number of chunks executed per static_steal for loop
Add counter for number of chunks stolen per static_steal for loop
llvm-svn: 295461
Added test kmp_task_reduction_nest.cpp which has an example of
possible compiler codegen.
Differential Revision: https://reviews.llvm.org/D29600
llvm-svn: 295343
Fixed a bug in which a parent struct was deallocated when one of the struct's pointers was being unmapped.
Differential Revision: https://reviews.llvm.org/D29914
llvm-svn: 295231
This change allows the runtime to turn __kmp_yield() on/off repeatedly on Linux.
This feature was removed when the monitor thread was disabled, but there are
applications that perform better with this feature on.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D29227
llvm-svn: 295203
Added new ThreadSanitizer annotations to remove false positives with OpenMP reduction.
Removed unused annotations from the Tsan annotations header file.
Patch by Simone Atzeni!
Differential Revision: https://reviews.llvm.org/D29202
llvm-svn: 295158
This change allows setting LIBOMPTARGET_LLVM_LIT_EXECUTABLE and
LIBOMPTARGET_FILECHECK_EXECUTABLE as full path. It also honors
OPENMP_LLVM_TOOLS_DIR which is meant as a common configuration
for both libomp and libomptarget.
Maybe this should be done in a common CMake module, but I'm no expert here.
Differential Revision: https://reviews.llvm.org/D29172
llvm-svn: 294284
This is the patch upstreaming the plugins part of libomptarget (CUDA, generic-elf-64).
Differential Revision: https://reviews.llvm.org/D14253
llvm-svn: 293724
glibc < 2.18 is C99 compliant and only provides the format macros in C++ if
__STDC_FORMAT_MACROS is defined. This change fixes the debug build for
GCC 4.8, GCC 6.2 and Clang 3.9.1 that were previously broken on my machine.
It shows no regression for libc++ >= 4.0.0 which has a fix since September:
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20160926/171659.html
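
A minimal sketch of the workaround described above: define
__STDC_FORMAT_MACROS before the first inclusion of the header so that PRIu64
and the other format macros are visible in C++ on older glibc. The helper name
format_u64 is hypothetical and only serves the illustration.

```cpp
// Must come before the first inclusion of <inttypes.h> or <cinttypes>;
// glibc < 2.18 hides the PRI* macros from C++ otherwise.
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS
#endif
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit unsigned value with the portable PRIu64 format macro.
std::string format_u64(std::uint64_t value) {
  char buf[32]; // large enough for 20 digits plus the terminator
  int written = std::snprintf(buf, sizeof(buf), "%" PRIu64, value);
  assert(written > 0 && written < static_cast<int>(sizeof(buf)));
  (void)written; // keeps release builds free of unused-variable warnings
  return std::string(buf);
}
```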
llvm-svn: 293468
Put the duplicated i_maxmin into traits_t by adding new members max_value and
min_value. Put ___kmp_size_type into traits_t by adding member type_size.
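
As a rough illustration of the idea, a traits template can carry the limits
and the size of each type so that generic code no longer needs separate
helpers. The sketch below is an assumption about the shape only: the member
names max_value, min_value and type_size come from the text, while the
namespace, the specializations shown and clamp_to_range are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Primary template is left undefined; each supported type is specialized.
template <typename T> struct traits_t;

template <> struct traits_t<std::int32_t> {
  static constexpr std::int32_t max_value = 0x7fffffff;
  static constexpr std::int32_t min_value = -0x7fffffff - 1;
  static constexpr std::size_t type_size = sizeof(std::int32_t);
};

template <> struct traits_t<std::uint32_t> {
  static constexpr std::uint32_t max_value = 0xffffffffu;
  static constexpr std::uint32_t min_value = 0;
  static constexpr std::size_t type_size = sizeof(std::uint32_t);
};

// Generic code queries the limits through the traits.
template <typename T> T clamp_to_range(long long v) {
  if (v > static_cast<long long>(traits_t<T>::max_value))
    return traits_t<T>::max_value;
  if (v < static_cast<long long>(traits_t<T>::min_value))
    return traits_t<T>::min_value;
  return static_cast<T>(v);
}

} // namespace sketch
```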
Differential Revision: https://reviews.llvm.org/D28847
llvm-svn: 293316
When the monitor thread is used, most threads in the team directly go to
sleep if the copy of bt_intervals/bt_set is not available in the cache,
and this happens at least once per thread in the wait function, making the
overall performance slightly better.
This change tries to mimic this behavior by using the bt_intervals cache,
which simply keeps the blocktime interval in terms of the platform-dependent
ticks or nanoseconds.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D28906
llvm-svn: 293312
The lock tables were being reallocated if kmp_set_defaults() was called.
In the env_init code it says that the user should be able to switch between
different KMP_CONSISTENCY_CHECK values which is what this change enables.
llvm-svn: 292349
There is no corresponding free() for this expandable array. The logic is
added in __kmp_cleanup() next to the freeing of __kmp_nested_nth.
llvm-svn: 292348
Clang 4.0 trunk warns:
warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]
This points to a potential bug if the code really wants to check if the single
bit is not set: If for example (buf.edx >> 9) = 2 (has any bit set except the
least significant one), 'logical not' will return 0 which stays 0 after the
'bitwise and'.
To do this correctly we first need to evaluate the 'bitwise and'. In that case
it returns 2 & 1 = 0 which after the 'logical not' evaluates to 1.
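
The difference can be reproduced in isolation. The two helpers below are
hypothetical and take a plain integer in place of buf.edx; the first mirrors
the flagged pattern (with explicit parentheses so the precedence is visible),
the second evaluates the 'bitwise and' first.

```cpp
#include <cassert>
#include <cstdint>

// Flagged pattern: the logical not applies to (edx >> 9) as a whole, so any
// set bit at position 9 or above makes the result 0.
bool bit9_clear_buggy(std::uint32_t edx) {
  return ((!(edx >> 9)) & 1) != 0;
}

// Intended check: isolate bit 9 first, then negate.
bool bit9_clear_fixed(std::uint32_t edx) {
  return !((edx >> 9) & 1);
}
```

With edx = 1024, (edx >> 9) is 2 as in the example above: the first helper
reports that bit 9 is set, while the second correctly reports it as clear.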
Differential Revision: https://reviews.llvm.org/D28599
llvm-svn: 291764
Paul Osmialowski pointed out a double free bug in shutdown code. This patch
moves the freeing of the implicit task to above the freeing of all fast memory
to prevent the double-free issue.
Differential Revision: https://reviews.llvm.org/D26860
llvm-svn: 287551
Have developer timers use the partitioning scheme, which also required that some
redundant developer timers be removed in favor of the already existing normal
timers. Move per thread stats initialization to just after global thread id
assignment which is as early as possible. Also put all global stats
initialization code in __kmp_stats_init() and all global stats destruction code
in __kmp_stats_fini().
Differential Revision: https://reviews.llvm.org/D26361
llvm-svn: 286892
This set of changes enables the affinity interface (either the preexisting
native operating system or HWLOC) to be dynamically set at runtime
initialization. The point of this change is that we were seeing performance
degradations when using HWLOC. This allows the user to use the old affinity
mechanisms, which on large machines (>64 cores) makes a large difference in
initialization time.
These changes mostly move affinity code under a small class hierarchy:
KMPAffinity
    class Mask {}
KMPNativeAffinity : public KMPAffinity
    class Mask : public KMPAffinity::Mask
KMPHwlocAffinity
    class Mask : public KMPAffinity::Mask
Since all interface functions (for both affinity and the mask implementation)
are virtual, the implementation can be chosen at runtime initialization.
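
A minimal sketch of how such a hierarchy allows the choice to be made at
runtime. Only the class names come from the list above. The member functions,
the pick_affinity_api factory and the bit storage are hypothetical, the sketch
assumes KMPHwlocAffinity also derives from KMPAffinity, and the hwloc variant
is a stand-in that makes no real hwloc calls.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

class KMPAffinity {
public:
  class Mask {
  public:
    virtual ~Mask() = default;
    virtual void set(int cpu) = 0;
    virtual bool is_set(int cpu) const = 0;
  };
  virtual ~KMPAffinity() = default;
  virtual std::unique_ptr<Mask> allocate_mask() const = 0;
  virtual std::string name() const = 0;
};

class KMPNativeAffinity : public KMPAffinity {
public:
  class Mask : public KMPAffinity::Mask {
    unsigned long long bits = 0; // fixed 64-bit mask for the sketch
  public:
    void set(int cpu) override {
      assert(cpu >= 0 && cpu < 64);
      bits |= 1ULL << cpu;
    }
    bool is_set(int cpu) const override {
      return cpu >= 0 && cpu < 64 && ((bits >> cpu) & 1ULL) != 0;
    }
  };
  std::unique_ptr<KMPAffinity::Mask> allocate_mask() const override {
    return std::unique_ptr<KMPAffinity::Mask>(new Mask());
  }
  std::string name() const override { return "native"; }
};

class KMPHwlocAffinity : public KMPAffinity {
public:
  class Mask : public KMPAffinity::Mask {
    std::vector<bool> bits; // stand-in for a dynamically sized bitmap
  public:
    void set(int cpu) override {
      assert(cpu >= 0);
      if (static_cast<std::size_t>(cpu) >= bits.size())
        bits.resize(static_cast<std::size_t>(cpu) + 1, false);
      bits[static_cast<std::size_t>(cpu)] = true;
    }
    bool is_set(int cpu) const override {
      return cpu >= 0 && static_cast<std::size_t>(cpu) < bits.size() &&
             bits[static_cast<std::size_t>(cpu)];
    }
  };
  std::unique_ptr<KMPAffinity::Mask> allocate_mask() const override {
    return std::unique_ptr<KMPAffinity::Mask>(new Mask());
  }
  std::string name() const override { return "hwloc"; }
};

// Chosen once at initialization; later code only sees the base interface.
std::unique_ptr<KMPAffinity> pick_affinity_api(bool use_hwloc) {
  if (use_hwloc)
    return std::unique_ptr<KMPAffinity>(new KMPHwlocAffinity());
  return std::unique_ptr<KMPAffinity>(new KMPNativeAffinity());
}

} // namespace sketch
```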
Differential Revision: https://reviews.llvm.org/D26356
llvm-svn: 286890
This patch allows ThreadSanitizer (Tsan) to verify OpenMP programs.
This means that Tsan will report no false positives when
verifying an OpenMP program.
This patch introduces annotations within the OpenMP runtime module to
provide information about thread synchronization to the Tsan runtime.
To enable Tsan support when building the runtime, you must
turn on the TSAN_SUPPORT option by passing the following CMake option:
-DLIBOMP_TSAN_SUPPORT=TRUE
The annotations will be enabled in the main shared library
(same mechanism of OMPT).
Patch by Simone Atzeni and Joachim Protze!
Differential Revision: https://reviews.llvm.org/D13072
llvm-svn: 286115
Check the Task Scheduling Constraint (TSC) when stealing an untied task.
This is needed because an untied task can produce tied children,
which can break the TSC if the untied task is not a descendant of the current
task. This can cause livelock in complex tasking tests
(e.g. kastors/strassen-task-dep).
Differential Revision: https://reviews.llvm.org/D26182
llvm-svn: 285703
Summary:
If directives are used in a macro, clang complains with:
```
src/projects/openmp/runtime/src/kmp_runtime.c:7486:2: error: embedding a directive within macro arguments has undefined behavior [-Werror,-Wembedded-directive]
#if KMP_USE_MONITOR
```
This patch fixes two occurrences of the issue in `kmp_runtime.cpp`.
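One common way to avoid the warning is to resolve the conditional before the
macro is invoked. The sketch below uses hypothetical names (SKETCH_USE_MONITOR,
SKETCH_DESCRIBE, SKETCH_MODE, describe_build) and is not the code from this
patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; none of these names come from the runtime.
#ifndef SKETCH_USE_MONITOR
#define SKETCH_USE_MONITOR 0
#endif
#define SKETCH_DESCRIBE(a, b) (std::string(a) + "," + (b))

// Undefined behavior, and what -Wembedded-directive flags:
//   SKETCH_DESCRIBE("threads",
//   #if SKETCH_USE_MONITOR
//                   "monitor"
//   #else
//                   "no-monitor"
//   #endif
//   );

// Fix: evaluate the directive outside the macro arguments.
#if SKETCH_USE_MONITOR
#define SKETCH_MODE "monitor"
#else
#define SKETCH_MODE "no-monitor"
#endif

std::string describe_build() {
  return SKETCH_DESCRIBE("threads", SKETCH_MODE);
}
```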
Reviewers: tlwilmar, jlpeyton, AndreyChurbanov, Hahnfeld
Subscribers: Hahnfeld, openmp-commits
Differential Revision: https://reviews.llvm.org/D25823
llvm-svn: 284728
Function strerror_r() has different signatures in different
implementations of libc: glibc's version returns a char*, while BSDs
and musl return an int. libomp unconditionally assumes glibc on Linux
and thus fails to compile against musl-libc. This patch addresses this
issue.
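
For illustration only, one way to cope with both signatures in C++ is to let
overload resolution pick a helper based on the return type. This is a
different technique from a preprocessor check and is not claimed to be what
the patch does. It assumes a POSIX system, and the names message_from and
error_message are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <string.h>
#include <string>

namespace sketch {

// Selected when strerror_r returns int (XSI/POSIX flavor, e.g. BSDs, musl):
// 0 means success and the message is in buf.
inline const char *message_from(int rc, const char *buf) {
  return rc == 0 ? buf : "unknown error";
}

// Selected when strerror_r returns char* (GNU flavor): the returned pointer
// is the message and may or may not point into buf.
inline const char *message_from(const char *msg, const char * /*buf*/) {
  return msg;
}

// Returns the message for errnum regardless of which flavor libc exposes.
std::string error_message(int errnum) {
  char buf[256];
  buf[0] = '\0';
  return message_from(strerror_r(errnum, buf, sizeof(buf)), buf);
}

} // namespace sketch
```

Calling sketch::error_message(ENOENT) yields the platform's text for that
error; the exact wording depends on the libc in use.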
Differential Revision: https://reviews.llvm.org/D25071
llvm-svn: 284492
New mixed type atomic routines added for regular capture operations as well as
reverse update/capture operations. LHS - all integer and float types (no
complex so far), RHS - float16.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D25275
llvm-svn: 284489
This change removes/disables unnecessary code when the monitor thread is not used.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D25102
llvm-svn: 283577
Support finding lit as plain 'lit', which is the name used by setup.py
in LLVM's utils/lit.
Differential Revision: https://reviews.llvm.org/D25072
llvm-svn: 282876