Commit Graph

2313 Commits

Author SHA1 Message Date
Shilei Tian b95d31a849 [OpenMP][Offloading] Enlarge the work size of `wtime.c` in case of any noise 2022-07-22 16:03:39 -04:00
Joel E. Denny cfa6e79df3 [Libomptarget] Don't report lack of CUDA devices
Sometimes libomptarget's CUDA plugin produces unhelpful diagnostics
about a lack of CUDA devices before an application runs:

```
$ clang -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa hello-world.c
$ ./a.out
CUDA error: Error returned from cuInit
CUDA error: no CUDA-capable device is detected
Hello World: 4
```

This can happen when the CUDA plugin was built but all CUDA devices
are currently disabled in some manner, perhaps because
`CUDA_VISIBLE_DEVICES` is set to the empty string.  As shown in the
above example, it can even happen when we haven't compiled the
application for offloading to CUDA.

The following code from `openmp/libomptarget/plugins/cuda/src/rtl.cpp`
appears to be intended to handle this case, and it chooses not to
write a diagnostic to stderr unless debugging is enabled:

```
if (NumberOfDevices == 0) {
  DP("There are no devices supporting CUDA.\n");
  return;
}
```

The problem is that the above code is never reached because the
earlier `cuInit` returns `CUDA_ERROR_NO_DEVICE`.  This patch handles
that `cuInit` case in the same manner as the above code handles the
`NumberOfDevices == 0` case.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D130371
2022-07-22 14:46:45 -04:00
Shilei Tian 0c86c4f50c [OpenMP] Fix test error introduced in D130179 2022-07-22 14:16:47 -04:00
Shilei Tian 602e0eb9f0 [OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake
Multiple calls to `omp_get_wtime` could be optimized out due to the function
is mistakenly marked as `readnone`. This patch fixes the issue, and also add the
support to run optimization on `libomptarget` tests.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D130179
2022-07-22 13:46:45 -04:00
Shilei Tian 77cb30e3a6 Revert "[OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake"
This reverts commit ad34f1dba8.
2022-07-22 11:45:13 -04:00
Shilei Tian ad34f1dba8 [OpenMP][DeviceRTL] Fix the issue that multiple calls to `omp_get_wtime` is optimized out by mistake
Multiple calls to `omp_get_wtime` could be optimized out due to the function
is mistakenly marked as `readnone`. This patch fixes the issue, and also add the
support to run optimization on `libomptarget` tests.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D130179
2022-07-22 11:43:30 -04:00
Joseph Huber a3804a3145 [Libomptarget] Make the plugins link as LLVM libraries
Previously we made `libomptarget` link as an LLVM library so we have
access to the LLVM core libraries. After the initial patch stuck we can
now apply the same changes to the plugins. This will allow us to use
LLVM in all of `libomptarget` when we have uses for them. In the future
this should allow us to remove the dependencies on `libelf`, `libffi`,
and `dl`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D130262
2022-07-22 09:34:12 -04:00
Joseph Huber 908054df4f [Libomptarget] Only export needed definitions in the BC library
This patch adds the use of the `-internalize-public-api-file` option in
the internalization pass to internalize any definition that isn't
explicitly needed for the interface. This will allow us to perform more
optimizations on the file that normally would not have been possible
with functions internal to the library not being internal.

Depends on D130293

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D130298
2022-07-22 08:24:35 -04:00
Joseph Huber e82e07d74a [Libomptarget] Build the DeviceRTL BC using clang directly
Currently the bitcode library is build using the clang front-end
manually. This was originally done because we did not support device
only compilation. Now we support device only compilation, at least for a
single offloading toolchain, so we can instead use clang directly rather
than using the front-end. This saves us needing to define things like
`aux_triple`.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D130293
2022-07-22 08:24:29 -04:00
Ron Lieberman 45a379ce2f Revert "[Libomptarget] Stop testing CPU offloading with LTO"
This reverts commit 3e8d46921f.
2022-07-22 12:10:06 +00:00
Ye Luo 4794bbffb2 Revert "[OpenMP][OMPD] GDB plugin code to leverage libompd to provide debugging"
This reverts commit 51d3f421f4.
2022-07-21 22:00:33 -05:00
Ye Luo ee95be3c46 Revert "Fixing build bot failure due to python-pip unavailability."
This reverts commit 9dc0d6aaa1.
2022-07-21 22:00:32 -05:00
Johannes Doerfert 1da6ae4b54 [OpenMP][FIX] Ensure thread and team state are defined properly
The namespaces were missing causing the symbols to have "C" mangling.
To avoid this in the future we qualify the names now fully.
2022-07-21 21:57:14 -05:00
Joseph Huber 3e8d46921f [Libomptarget] Stop testing CPU offloading with LTO
Summary:
Some of the buildbots don't find the libraries because they don't build
for the GPU. Although it should always be there it's unclear why these
buildbots are having problemsd. LTO is only interesting on the GPU and
these tests take extra time anyway so I'm just going to disable them for
now.
2022-07-21 16:47:41 -04:00
John Ericson 07b749800c [cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.

Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.

@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.

Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.

To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.

For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117977
2022-07-21 19:04:00 +00:00
Johannes Doerfert d150152615 [OpenMP] Introduce more fine-grained control over the thread state use
We can help optimizations by making sure we use the team state whenever
it is clear there is no thread state. To this end we introduce a new
state flag (`state::HasThreadState`) and explicit control for the
`state::ValueRAII` helpers, including a dedicated "assert equal".

Differential Revision: https://reviews.llvm.org/D130113
2022-07-21 12:30:38 -05:00
Johannes Doerfert 7472b42b78 [OpenMP] Use Undef instead of null as pointer for inactive lanes
Our conditional writes in the runtime look like this:
```
  if (active)
    *ptr = value;
```
In the RAII we need to assign `ptr` which comes from a lookup call.
If a thread that is not the main thread calls lookup with the intention
to write the pointer, we'll create a new thread state. As such, we need
to avoid calling lookup for inactive threads. We used to use `nullptr`
as their `ptr` value but that can cause pessimistic reasoning. We now
use `undef` instead.

Differential Revision: https://reviews.llvm.org/D130114
2022-07-21 12:28:45 -05:00
Johannes Doerfert a42361dc1c [OpenMP] Expose the state in the header to allow non-lto optimizations
We used to inline the `lookup` calls such that the runtime had "known"
access offsets when it was shipped. With the new static library build it
doesn't as the lookup is an indirection we cannot look through. This
should help us optimize the code better until we can do LTO for the
runtime again.

Differential Revision: https://reviews.llvm.org/D130111
2022-07-21 12:28:44 -05:00
Joseph Huber e01ce4e88a [Libomptarget] Add checks for CUDA subarchitecture using new info
This patch extends the `is_valid_binary` routine to also check if the
binary's architecture string matches the one parsed from the runtime.
This should allow us to only use the binary whose compute capability
matches, allowing us to support basic multi-architecture binaries for
CUDA.

Depends on D127432

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D127505
2022-07-21 13:20:06 -04:00
Joseph Huber fbcb1ee7f3 [Libomptarget] Add support for offloading binaries in libomptarget
The previous path changed the linker wrapper to embed the offloading
binary format inside the target image instead. This will allow us to
more generically bundle metadata with these images, such as requires
clauses or the target architecture it was compiled for.

I wasn't sure how to handle this best, so I introduced a new type that
replaces the old `__tgt_device_image` struct that we can expand inside
the runtime library. I made the new `__tgt_device_binary` struct pretty
much the same for now. In the future we could change this struct to
pretty much be the `OffloadBinary` class in the future.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D127432
2022-07-21 13:20:04 -04:00
Joseph Huber 5d8a76feb0 [Libomptarget] Build the device library even if the sm list is empty
We previously had some logic that stopped us from building the device runtime if
there were no NVPTX architectures provided. This is incorrect because we could
have AMDGPU libraries. Even if the lists are empty we should be able to attempt
to build these and get dummy output. THis wilil make it much easier for our
tooling which expects certain libraries. If the user wishes to disable the
library entirely they should use `-DLIBOMPTARGET_BUILD_DEVICERTL_BCLIB=OFF"

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D130266
2022-07-21 10:57:47 -04:00
Joseph Huber dc52712a06 [Libomptarget] Make libomptarget an LLVM library
This patch makes libomptarget depend on LLVM libraries to be built. The
reason for this is because we already have an implicit dependency on
LLVM headers for ELF identification and extraction as well as an
optional dependenly on the LLVMSupport library for time tracing
information. Furthermore, there are changes in the future that require
using more LLVM libraries, and will heavily simplify some future code as
well as open up the large amount of useful LLVM libraries to
libomptarget.

This will make "standalone" builds of `libomptarget' more difficult for
vendors wishing to ship their own. This will require a sufficiently new
version of LLVM to be installed on the system that should be picked up
by the existing handling for the implicit headers.

The things this patch changes are as follows:
  - `libomptarget.so` links against LLVMSupport and LLVMObject
  - `libomptarget.so` is a symbolic link to `libomptarget.so.15`
  - If using a shared library build, user applications will depend on LLVM
    libraries as well
  - We can now use LLVM resources in Libomptarget.

Note that this patch only changes this to apply to libomptarget itself,
not the plugins. Additional patches will be necessary for that.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D129875
2022-07-20 15:58:06 -04:00
Joseph Huber b5b20164d2 Revert "[Libomptarget] Make libomptarget an LLVM library"
This reverts commit 643dfd97d5.

This patch still makes the AMDGPU buildbots unhappy. Reverting for now
until the AMD folks figure it out.
2022-07-20 10:18:55 -04:00
Joseph Huber 6b0db92bbd [Libomptarget] Fix LTO command line in test
Summary:
The test passed -offload-lto instead of -foffload-lto.
2022-07-20 10:18:55 -04:00
Joseph Huber 643dfd97d5 [Libomptarget] Make libomptarget an LLVM library
This patch makes libomptarget depend on LLVM libraries to be built. The
reason for this is because we already have an implicit dependency on
LLVM headers for ELF identification and extraction as well as an
optional dependenly on the LLVMSupport library for time tracing
information. Furthermore, there are changes in the future that require
using more LLVM libraries, and will heavily simplify some future code as
well as open up the large amount of useful LLVM libraries to
libomptarget.

This will make "standalone" builds of `libomptarget' more difficult for
vendors wishing to ship their own. This will require a sufficiently new
version of LLVM to be installed on the system that should be picked up
by the existing handling for the implicit headers.

The things this patch changes are as follows:
  - `libomptarget.so` links against LLVMSupport and LLVMObject
  - `libomptarget.so` is a symbolic link to `libomptarget.so.15`
  - If using a shared library build, user applications will depend on LLVM
    libraries as well
  - We can now use LLVM resources in Libomptarget.

Note that this patch only changes this to apply to libomptarget itself,
not the plugins. Additional patches will be necessary for that.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D129875
2022-07-20 09:52:09 -04:00
Jonathan Peyton 40ce65b5b2 [OpenMP][libomp] Fix affinity warnings and unify under one macro
Warnings that occur during affinity initialization are supposed
to be guarded by KMP_AFFINITY=nowarnings,noverbose, but some had been
missed by this logic. Create one macro for affinity warnings that takes
these settings into account.

Differential Revision: https://reviews.llvm.org/D125991
2022-07-19 13:10:25 -05:00
AndreyChurbanov 17dcde5f1b [OpenMP][libomp] Allow reset affinity mask after parallel
Added control to reset affinity of primary thread after outermost parallel
region to initial affinity encountered before OpenMP runtime was initialized.
KMP_AFFINITY environment variable reset/noreset modifier introduced.
Default behavior is unchanged.

Differential Revision: https://reviews.llvm.org/D125993
2022-07-19 13:05:05 -05:00
Jonathan Peyton 28c8da2965 [OpenMP][libomp] Fix fallthrough attribute detection for Intel compilers
icc does not properly detect lack of fallthrough attribute since it
defines __GNU__ > 7 and also icc's __has_cpp_attribute/__has_attribute
feature detectors do not properly detect the lack of fallthrough attribute.

Differential Revision: https://reviews.llvm.org/D126001
2022-07-19 13:04:25 -05:00
AndreyChurbanov a01d274fbd [OpenMP][libomp] Fix /dev/shm pollution after forked child process terminates
Made library registration conditional and skip it in the __kmp_atfork_child
handler, postponed it till middle initialization in the child.
This fixes the problem of applications those use e.g. popen/pclose
which terminate the forked child process.

Differential Revision: https://reviews.llvm.org/D125996
2022-07-19 12:59:58 -05:00
Jon Chesterfield e46f727b38 Revert "[Libomptarget] Make libomptarget an LLVM library"
This reverts commit 70039be627.
2022-07-19 17:59:45 +01:00
Joseph Huber 70039be627 [Libomptarget] Make libomptarget an LLVM library
This patch makes libomptarget depend on LLVM libraries to be built. The
reason for this is because we already have an implicit dependency on
LLVM headers for ELF identification and extraction as well as an
optional dependenly on the LLVMSupport library for time tracing
information. Furthermore, there are changes in the future that require
using more LLVM libraries, and will heavily simplify some future code as
well as open up the large amount of useful LLVM libraries to
libomptarget.

This will make "standalone" builds of `libomptarget' more difficult for
vendors wishing to ship their own. This will require a sufficiently new
version of LLVM to be installed on the system that should be picked up
by the existing handling for the implicit headers.

The things this patch changes are as follows:
  - `libomptarget.so` links against LLVMSupport and LLVMObject
  - `libomptarget.so` is a symbolic link to `libomptarget.so.15`
  - If using a shared library build, user applications will depend on LLVM
    libraries as well
  - We can now use LLVM resources in Libomptarget.

Note that this patch only changes this to apply to libomptarget itself,
not the plugins. Additional patches will be necessary for that.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D129875
2022-07-19 12:33:31 -04:00
Joseph Huber cdea437057 [Libomptarget] Fix warnings on address space attributes
The device runtime uses the address space attribute to control the
placement of important constants on the GPU. The changes made in D126061
caused these to start emitting errors as they were not applied to the
type. This patch fixes the issues to make the warnings go away.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D129896
2022-07-15 17:21:30 -04:00
Joseph Huber 1f940b69c3 [Libomptarget][NFC] Fix signed comparison warnings
Summary:
Non-functional change, just fixing some sign comparison warnings by
making both match.
2022-07-15 13:22:55 -04:00
Shilei Tian 65ebcee197 [OpenMP] Ignore .eggs file in OpenMP
The OMPD patches introduces GDB plugin. When it is built, it will create a
coulple of temp files in `.eggs`. This patch add it into `.gitignore` in case it
messed up the git tracking.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D129711
2022-07-14 12:06:50 -04:00
Joseph Huber b1d574867d [Libomptarget] Allow static assert to work on 32-bit systems
Summary:
We use a static assert to make sure that someone doesn't change the size
of an argument struct without properly updating all the other logic.
This originally only checked the size on a 64-bit system with 8-byte
pointers, causing builds on 32-bit systems to fail. This patch allows
either pointer size to work.

Fixes #56486
2022-07-12 08:05:01 -04:00
Vignesh Balasubramanian 9dc0d6aaa1 Fixing build bot failure due to python-pip unavailability.
commit: 51d3f421f4
failed due to missing python-pip om machine.
Now the ompd gdb-plugin code will be skipped with a warning
if pip is not available in the machine.
2022-07-12 16:01:59 +05:30
Vignesh Balasubramanian 51d3f421f4 [OpenMP][OMPD] GDB plugin code to leverage libompd to provide debugging
support for OpenMP programs.

This is 5th of 6 patches started from https://reviews.llvm.org/D100181
This plugin code, when loaded in gdb, adds a few commands like
ompd icv, ompd bt, ompd parallel.
These commands create an interface for GDB to read the OpenMP
runtime through libompd.

Reviewed By: @dreachem
Differential Revision: https://reviews.llvm.org/D100185
2022-07-12 14:38:41 +05:30
Shilei Tian e7d998e51e [NFC][OpenMP][Offloading] Fix compilation warning caused by misuse of `static_cast` 2022-07-08 20:59:37 -04:00
Joseph Huber 269d5c16bc [Libomptarget][NFC] Move legacy functions to a separate file
This patch moves the old legacy interfaces into `libomptarget` to a
separate file. These do not need to be included anywhere and are simply
provided for backwards compatibility with the ABI. This cleans up the
interface greatly.

Depends on D128817

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D128818
2022-07-08 14:44:21 -04:00
Joseph Huber c9353eb4bc [Libomptarget] Use new tripcount argument in the runtime.
The previous patch added an argument to the `__tgt_target_kernel`
runtime function which includes the tripcount used for the loop clause.
This was originally passed in via the `__kmpc_push_target_tripcount`
function. Now we move this logic to the kernel launch itself and remove
the need for the push function.

Depends on D128816

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D128817
2022-07-08 14:44:19 -04:00
Joseph Huber ad23e4d85f [Libomptarget] Implement a unified kernel entry function
This patch implements a unified kernel entry function that will be
targeted from both teams and non-teams clauses. We introduce a new
interface and make the old functions call in using the new one. A
following patch will include the necessary changes to Clang to call
these new functions instead.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D128549
2022-07-08 14:44:06 -04:00
Ye Luo fca79b78c4 [libomptarget] compile DeviceRTL bc files with -O3
bc files of DeviceRTL are compiled with -O3, the same as the static library.

Differential Revision: https://reviews.llvm.org/D129344
2022-07-08 10:00:26 -05:00
Vadim Paretsky 43d5c4d539 [OpenMP] add 4 custom APIs supporting MSVC OMP codegen
This check-in adds 4 APIs to support MSVC, specifically:

* 3 APIs (__kmpc_sections_init, __kmpc_next_section,
   __kmpc_end_sections) to support the dynamic scheduling of OMP sections.
* 1 API (__kmpc_copyprivate_light, a light-weight version of
  __kmpc_copyrprivate) to support the OMP single copyprivate clause.

Differential Revision: https://reviews.llvm.org/D128403
2022-07-05 17:26:18 -05:00
Joseph Huber d27d0a673c [Libomptarget][NFC] Make Libomptarget use the LLVM naming convention
Libomptarget grew out of a project that was originally not in LLVM. As
we develop libomptarget this has led to an increasingly large clash
between the naming conventions used. This patch fixes most of the
variable names that did not confrom to the LLVM standard, that is
`VariableName` for variables and `functionName` for functions.

This patch was primarily done using my editor's linting messages, if
there are any issues I missed arising from the automation let me know.

Reviewed By: saiislam

Differential Revision: https://reviews.llvm.org/D128997
2022-07-05 14:53:38 -04:00
Shilei Tian 696bca9bb2 [NFC][OpenMP][CUDA] Remove unnecessary default label 2022-07-01 09:50:29 -04:00
Jose M Monsalve Diaz 616dd9ae14 [OpenMP] Implementing omp_get_device_num()
This patch implements omp_get_device_num() in the host and the device.

It uses the already existing getDeviceNum in the device config for the device.
And in the host it uses the omp_get_num_devices().

Two simple tests added

Differential Revision: https://reviews.llvm.org/D128347
2022-06-29 02:18:21 -05:00
Shilei Tian 2695e23ad9 [OpenMP][CUDA] Fix the issue that P2P memcpy doesn't work
This patch fixes the issue that P2P memcpy doesn't work. The root cause is we didn't set current context when calling the API function. In addition, a matrix to track the states of each pair of devices is also added such that we only need to query and configure the device once.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D122764
2022-06-28 15:32:03 -04:00
Daniel Douglas d4a7b8de52 [OpenMP][libomp] avoid spin wait and yield on arm64 macOS
This patch changes the default behavior to avoid spin waiting and
yielding. (See “Don’t Keep Threads Active And Idle” section here:
https://developer.apple.com/documentation/apple-silicon/tuning-your-code-s-performance-for-apple-silicon)

We verified using instruments traces that the changes improve scheduling
behavior on macOS.

We also collected results using EPCC schedbench
(https://github.com/LangdalP/EPCC-OpenMP-micro-benchmarks) that are
attached here that show a reduction in standard deviation and max test
run time across all scheduling types. Static scheduling sees dramatic
improvements with these changes, we see a 2-4x average runtime
improvement in the benchmark.

Differential Revision: https://reviews.llvm.org/D126510
2022-06-24 12:02:16 -05:00
Jonathan Peyton b7b4986576 [OpenMP][libomp] Hold old __kmp_threads arrays until library shutdown
When many nested teams are formed, __kmp_threads may be reallocated
to accommodate new threads. This reallocation causes a data
race when another existing team's thread simultaneously references
__kmp_threads. This patch keeps the old thread arrays around until library
shutdown so these lingering references can complete without issue and
access to __kmp_threads remains a simple array reference.

Fixes: https://github.com/llvm/llvm-project/issues/54708
Differential Revision: https://reviews.llvm.org/D125013
2022-06-22 10:30:35 -05:00
Joseph Huber 3351ae61d9 [Libomptarget] Remove duplicate data environment exit
Summary:
This patch removes a duplicated exit from the OpenMP data envrionment.
We already have an RAII method that guards this environment so it is
unnecessary.
2022-06-21 22:35:32 -04:00