This patch enables OMPT by default if version 50 or later is built and the config says, that OMPT will be supported.
Differential Revision: https://reviews.llvm.org/D41508
llvm-svn: 321675
We now have several options that apply for both libraries and they
shouldn't be documented in multiple files. When already merging
the two Build_With_CMake.txt documents, convert them to
reStructuredText which is used for all of LLVM's documentation.
Differential Revision: https://reviews.llvm.org/D40920
llvm-svn: 321481
As for normal task creation, the task frame addresses need to be stored
for the encountering task.
Differential Revision: https://reviews.llvm.org/D41165
llvm-svn: 321421
The compiler warns that _BSD_SOURCE is deprecated and _DEFAULT_SOURCE should
be used instead. We keep _BSD_SOURCE for older compilers, that don't know
about _DEFAULT_SOURCE.
The linker drops the tool when linking, since there is no visible need for
the library. So we need to tell the linker, that the tool should be linked
anyway.
Differential Revision: https://reviews.llvm.org/D41499
llvm-svn: 321362
The format string for hints only prints the second argument (string) and drops
the first argument (hint id). Depending on how you read the POSIX text for
printf, this could be valid. But for practical reason, i.e., unpacking the
va_list passed to printf based on the formating information, it makes sense
to fix the implementation and not pass the id for hint.
Failing testcases were:
misc_bugs/teams-reduction.c
ompt/parallel/not_enough_threads.c
Differential Revision: https://reviews.llvm.org/D41504
llvm-svn: 321361
This function is defined in OpenMP-TR6 section 4.1.5.1.6
The functions was not implemented yet.
Since ompt-functions can only be called after the runtime was initialized and
has loaded a tool, it can assume the runtime to be initialized. In contrast
to omp_get_num_procs which needs to check whether the runtime is initialized.
Differential Revision: https://reviews.llvm.org/D40949
llvm-svn: 321269
This revision fixes failing testcases with parallel for loops and the gomp
interface. The return address needs to be stored at entry to runtime.
The storage is cleared on usage, so we need to update the storage before
calling again internal functions, that will trigger event callbacks.
Differential Revision: https://reviews.llvm.org/D41181
llvm-svn: 321265
We use the bitmap ompt_enabled thoughout the runtime, to avoid loading the
vector of callback functions when testing if specific code should be executed.
Before invoking an event callback function, the pointer is tested for NULL.
This revision resets the corresponding bit in ompt_enabled to 0 if
NULL is passed in set_callback.
Differential Revision: https://reviews.llvm.org/D41171
llvm-svn: 321264
Clang 5 or higher adds an intermediate function call in certain cases when
compiling with debug flag. This revision updates the testcases to work
correctly.
Differential Revision: https://reviews.llvm.org/D40595
llvm-svn: 321263
Reasons for expected failures are mainly bugs when using lables in OpenMP regions
or missing support of some OpenMP features.
For some worksharing clauses, support to distinguish the kind of workshare was
added just recently.
If an issue was fixed in a minor release version of a compiler, we flag the
test as unsupported for this compiler version to avoid false positives.
Same for fixes that where backported to older compiler versions.
Differential Revision: https://reviews.llvm.org/D40384
llvm-svn: 321262
There are two /proc/cpuinfo layots in use for AArch64: old and new.
The old one has all 'processor : n' lines in one section, hence
checking for duplications does not make sense.
Differential Revision: https://reviews.llvm.org/D41000
llvm-svn: 320593
All architectures except x86_64 used the linear barrier implementation
by default which doesn't give good performance for a larger number
of threads.
Improvements for PARALLEL overhead (EPCC) with this patch on a Power8
system (2 sockets x 10 cores x 8 threads, OMP_PLACES=cores)
20 threads: 4.55us -> 3.49us
40 threads: 8.84us -> 4.06us
80 threads: 19.18us -> 4.74us
160 threads: 54.22us -> 6.73us
Differential Revision: https://reviews.llvm.org/D40358
llvm-svn: 320152
To make thread affinity work according to the OpenMP spec, the
runtime needs information about the hardware topology. On Linux
the default way is to parse /proc/cpuinfo which contains this
information for x86 machines but (at least) not for AArch64 and
Power architectures.
Fortunately, there is a different code path which is able to get
that data from sysfs. The needed patch has landed in 2006 for
Linux 2.6.16 which is safe to assume nowadays (even RHEL 5 had
a kernel version derived from 2.6.18, and we are now at RHEL 7!).
Differential Revision: https://reviews.llvm.org/D40357
llvm-svn: 320151
Otherwise I see hangs in the omp_single_copyprivate test when
compiling in release mode. With the debug assertions, I get a
failure `head > 0 && tail > 0`.
Differential Revision: https://reviews.llvm.org/D40722
llvm-svn: 320150
This last of four patches adds a new file for the interface
functions that Clang uses during code generation. The only
change except simply moving the current code is renaming the
function CheckDeviceAndCtors() and using the correct type for
64bit device ids.
Differential Revision: https://reviews.llvm.org/D40801
llvm-svn: 319972
This third patch moves the implementation of the user-facing
OpenMP API functions into its own file. For now, the code is
only moved, no cleanups applied yet.
Differential Revision: https://reviews.llvm.org/D40800
llvm-svn: 319971
This is the second patch to split the current monolithic
implementation into separate files. Note that this change
doesn't cleanup the code yet.
Differential Revision: https://reviews.llvm.org/D40799
llvm-svn: 319970
This is the first of four patches to split the target agnostic
library into multiple (smaller) files. It only moves the code
to separate implementation files and does no cleanup (yet) except
removing unneeded headers.
Differential Revision: https://reviews.llvm.org/D40798
llvm-svn: 319969
Future patches will add (private) header files in src/ that should
not be visible to plugins, so move the "public" ones to a new
include/ directory. This is still internal in a sense that the
contained files won't be installed for the user.
Similarly, the target agnostic offloading library should be built
directly in src/. The parent directory is responsible for finding
dependencies and including all subdirectories.
Differential Revision: https://reviews.llvm.org/D40797
llvm-svn: 319968
Redundant extra verbose output of binding to full mask in case
affinity=balanced or OMP_PLACES=<any> or OMP_PROC_BIND=<any>
Differential Revision: https://reviews.llvm.org/D40624
llvm-svn: 319960
This change is a trivial fix for enums that removes specification of "last" or
"upper" values, or other boundary values. This simplifies the code in places,
and results in never needing to update the "upper" values again.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D40804
llvm-svn: 319957
The runtime will use the global kmp_critical_name as a lock and
tries to atomically store a pointer in there. This will fail
if the global is only aligned by 4 bytes, the size of one int32_t
element. Use a union to ensure the global is aligned to the size
of a pointer on the current platform.
llvm-svn: 319811
__kmpc_reduce_nowait() correctly swapped the teams for reductions
in a teams construct. Apply the same logic to __kmpc_reduce() and
__kmpc_reduce_end().
Differential Revision: https://reviews.llvm.org/D40753
llvm-svn: 319788
Perform a nested CMake invocation to avoid writing our own parser
for compiler versions when we are not testing the in-tree compiler.
Use the extracted information to mark a test as unsupported that
hangs with Clang prior to version 4.0.1 and restrict tests for
libomptarget to Clang version 6.0.0 and later.
Differential Revision: https://reviews.llvm.org/D40083
llvm-svn: 319448
This change makes kmp_r_sched_t type into a union for simpler
comparisons and assignments
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D40374
llvm-svn: 319379
kmp_aligned_malloc() always returned NULL on Windows (stub library only)
that may cause Fortran application crash. With this change all memory
allocation functions were fixed to use aligned{m,re,rec}alloc() to
allocate/reallocate memory. To deallocate that memory _aligned_free() is
used in kmp_free().
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D40296
llvm-svn: 319375
Added two warnings:
1) Before building the topology map check if tiles are requested but the
topo method is not hwloc;
2) After building the topology map check if tiles are requested but not
detected by the library.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D40340
llvm-svn: 319374
Fortran array elements made default integer in OMP_GET_PLACE_PROC_IDS and
OMP_GET_PARTITION_PLACE_NUMS subroutines, otherwise call to them produces
incorrect result.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D40356
llvm-svn: 319372
The code for the two OpenMP runtime libraries was very similar.
Move to common CMake file that is included and provides a simple
interface for adding testsuites. Also add a common check-openmp
target that runs all testsuites that have been registered.
Note that this renames all test options to the common OPENMP
namespace, for example OPENMP_TEST_C_COMPILER instead of
LIBOMP_TEST_COMPILER and so on.
Differential Revision: https://reviews.llvm.org/D40082
llvm-svn: 319343
These are needed by both libraries, so we can do that in a
common namespace and unify configuration parameters.
Also make sure that the user isn't requesting libomptarget
if the library cannot be built on the system. Issue an error
in that case.
Differential Revision: https://reviews.llvm.org/D40081
llvm-svn: 319342
As a first step, this allows us to generalize the detection of
standalone builds and make it fully compatible when building in
llvm/runtimes/ which automatically sets OPENMP_STANDLONE_BUILD.
Differential Revision: https://reviews.llvm.org/D40080
llvm-svn: 319341
Summary:
We want to automatically copy the appropriate mailing list
for review requests to the openmp repository.
For context, see the proposal and discussion here:
http://lists.llvm.org/pipermail/cfe-dev/2017-November/056032.html
Similar to D40179, I set up a new Diffusion repository with callsign
"OMP" for OpenMP:
https://reviews.llvm.org/source/openmp/
This explicitly updates openmp's .arcconfig to point to the new
OMP repository in Diffusion, which will let us use Herald rule H272
to automatically subscribe openmp-commits to review requests.
Reviewers: hans, grokos, Hahnfeld
Reviewed By: grokos
Subscribers: sammccall, klimek, openmp-commits
Differential Revision: https://reviews.llvm.org/D40499
llvm-svn: 319254
Power has a weak consistency model so we need memory barriers to
make writes (both from runtime and from user code) available for
all threads.
Differential Revision: https://reviews.llvm.org/D40175
llvm-svn: 318848
We have just fixed the codegen of omp_is_initial_device() to reliably work
when offloading to the same device, see commit r316001. This fixes the
failing tests that were the reason why we disabled the library for 5.0.
Differential Revision: https://reviews.llvm.org/D39052
llvm-svn: 318847
This is the libomptarget-side patch which changes the __tgt_* API function signatures in preparation for the new map interface.
Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits
Differential revision: https://reviews.llvm.org/D40313
llvm-svn: 318790
These tests were failing rarely on my MacBook when there was some
activity in the background. Read: one of a thousand executions?
* sections.c missed the sorting based on thread ids. This worked
as long as the master thread finished its section before the
worker thread started the second one but failed if the master
thread was put to sleep by the OS.
* The checks in single.c assumed that the master thread executes
the single region which works most of the time because it is
usually faster than the newly spawned worker thread.
Differential Revision: https://reviews.llvm.org/D39853
llvm-svn: 318527
The testsuite directory is not used or updated and confuses new users to the
OpenMP project. These tests were rewritten using the lit format and put under
the runtime/test directory. This patch removes the entire testsuite/ directory.
Differential Revision: https://reviews.llvm.org/D39767
llvm-svn: 318056
Traditionally, the library had a weak symbol for ompt_start_tool()
that served as fallback and disabled OMPT if called. Tools could
provide their own version and replace the default implementation
to register callbacks and lookup functions. This mechanism has
worked reasonably well on Linux systems where this interface was
initially developed.
On Darwin / Mac OS X the situation is a bit more complicated and
the weak symbol doesn't work out-of-the-box. In my tests, the
library with the tool needed to link against the OpenMP runtime
to make the process work. This would effectively mean that a tool
needed to choose a runtime library whereas one design goal of the
interface was to allow tools that are agnostic of the runtime.
The solution is to use dlsym() with the argument RTLD_DEFAULT so
that static implementations of ompt_start_tool() are found in the
main executable. This works because the linker on Mac OS X includes
all symbols of an executable in the global symbol table by default.
To use the same code path on Linux, the application would need to
be built with -Wl,--export-dynamic. To avoid this restriction, we
continue to use weak symbols on Linux systems as before.
Finally this patch extends the existing test to cover all possible
ways of initializing the tool as described by the standard. It
also fixes ompt_finalize() to not call omp_get_thread_num() when
the library is shut down which resulted in hangs on Darwin.
The changes have been tested on Linux to make sure that it passes
the current tests as well as the newly extended one.
Differential Revision: https://reviews.llvm.org/D39801
llvm-svn: 317980