Commit Graph

161 Commits

Author SHA1 Message Date
Shilei Tian 75812e7704 [OpenMP][Offloading] Change N back to 256 in bug49334.cpp 2022-02-23 16:10:35 -05:00
Shilei Tian 092a5bb72b [OpenMP][Offloading] Fix test case issues in bug49334.cpp
`bug49334.cpp` has one issue that causes flaky result reported in #53730.
The root cause is `BlockedC` is never initialized but in `BlockMatMul_TargetNowait`
it is directly read and written (via `+=`). Fixes #53730.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D119988
2022-02-17 10:22:48 -05:00
Joseph Huber 777039a51c [Libomptarget] Run CPU offloading tests using the new driver
This patch adds a new target to the OpenMP CPU offloading tests. This
tests the usage of the new driver for CPU offloading. If this all works
then we can move to transition to the new driver as the default.

Depends on D119613

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D119736
2022-02-15 15:05:32 -05:00
Shilei Tian c27f530d4c [OpenMP][Offloading] Fix infinite loop in applyToShadowMapEntries
This patch fixes the issue that the for loop in `applyToShadowMapEntries`
is infinite because `Itr` is not incremented in `CB`. Fixes #53727.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D119471
2022-02-12 22:02:53 -05:00
Shilei Tian 702a976c12 [OpenMP][Offloading] Change the way to compare floating point values in bug49334.cpp
`bug49334.cpp` directly uses `!=` to compare two floating point values,
which is almost wrong.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D119485
2022-02-10 18:20:36 -05:00
Joseph Huber 9582f09690 [Libomptarget] Increase stack size for bug49779 test
The 'bug49779.cpp' test has been failing recently. This is because the
runtime is sufficiently complex when using nested parallelism without
optimizations that the CUDA tools cannot statically determine the stack
size. Because of this the kernel can exceed the thread stack size and
crash. Work around this using the 'LIBOMPTARGET_STACK_SIZE' environment
variable and add an FAQ entry for this situation.

Fixes #53670

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D119357
2022-02-09 15:37:23 -05:00
Joseph Huber f8ffac5987 [OpenMP] Enable new driver tests for AMDGPU
This patch enables running the new driver tests for AMDGPU. Previously
this was disabled because some tests failed. This was only because the
new driver tests hadn't been listed as unsupported or expected to fail.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D119240
2022-02-08 09:55:29 -05:00
Joseph Huber 034adaf5be [OpenMP] Completely remove old device runtime
This patch completely removes the old OpenMP device runtime. Previously,
the old runtime had the prefix `libomptarget-new-` and the old runtime
was simply called `libomptarget-`. This patch makes the formerly new
runtime the only runtime available. The entire project has been deleted,
and all references to the `libomptarget-new` runtime has been replaced
with `libomptarget-`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D118934
2022-02-04 15:31:33 -05:00
Joseph Huber b4be18219e [Libomptarget] Remove AMDGPU XFAIL from test
Summary;
This test should pass now with AMDGPU. Previously the symbols were
hidden and would fail when read.
2022-02-04 13:40:03 -05:00
Jon Chesterfield 8b7e99c41d [openmp] Disable tests that presently hang on CI 2022-02-01 13:01:35 +00:00
Joseph Huber 0ac799b5c9 [Libomptarget] Run GPU offloading tests using the new drvier
This patch adds a new target to the tests to run using the new driver as
the method for generating offloading code.

Depends on D116541

Differential Revision: https://reviews.llvm.org/D118637
2022-01-31 23:11:43 -05:00
Jon Chesterfield 9b9d08111b Set rpath on openmp executables
Openmp executables need to find libomp and libomptarget at runtime.
This currently requires LD_LIBRARY_PATH or the user to specify rpath. Change
that to set the expected location of the openmp libraries in the install tree.

Whether rpath means rpath or runpath is system dependent. The attached test
shows that the Wl,--disable-new-dtags control interacts correctly with this feature.

The implicit rpath field is appended to any user specified ones which is ideal.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D118493
2022-01-31 16:35:00 +00:00
Jon Chesterfield a841a3a579 Revert "Set rpath on openmp executables"
Failed some buildbots, bad assumptions about structure of install path

This reverts commit a80d5c34e4.
2022-01-31 16:18:03 +00:00
Jon Chesterfield a80d5c34e4 Set rpath on openmp executables
Openmp executables need to find libomp and libomptarget at runtime.
This currently requires LD_LIBRARY_PATH or the user to specify rpath. Change
that to set the expected location of the openmp libraries in the install tree.

Whether rpath means rpath or runpath is system dependent. The attached test
shows that the Wl,--disable-new-dtags control interacts correctly with this feature.

The implicit rpath field is appended to any user specified ones which is ideal.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D118493
2022-01-31 16:01:08 +00:00
Sri Hari Krishna Narayanan f44e41af41 Runtime for Interop directive
This implements the runtime portion of the interop directive.
It expects the frontend and IRBuilder portions to be in place
for proper execution. It currently works only for GPUs
and has several TODOs that should be addressed going forward.

Reviewed By: RaviNarayanaswamy

Differential Revision: https://reviews.llvm.org/D106674
2022-01-27 15:16:24 -05:00
Jon Chesterfield ce8f365884 [openmp] Always pass valid triple to openmp-targets when using newRTL
Previously, we sometimes pass fopenmp-targets=nvptx64-nvidia-cuda-newRTL

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D117715
2022-01-19 22:07:22 +00:00
Jon Chesterfield 8baf4ba890 [openmp][amdgpu] Remove xfail from test using declare target variable 2022-01-19 15:55:37 +00:00
Shilei Tian 458db51c10 [OpenMP] Add missing `tt_hidden_helper_task_encountered` along with `tt_found_proxy_tasks`
In most cases, hidden helper task behave similar as detached tasks. That means,
for example, if we have to wait for detached tasks, we have to do the same thing
for hidden helper tasks as well. This patch adds the missing condition for hidden
helper task accordingly along with detached task.

Reviewed By: AndreyChurbanov

Differential Revision: https://reviews.llvm.org/D107316
2021-12-29 23:22:53 -05:00
Joel E. Denny 51168ce8d5 [OpenMP] Add test for custom state machine if have reduction
D113602 broke the custom state machine when a reduction is present, as
revealed by the reproducer this patch adds to the test suite.  In that
case, openmp-opts changes the return value to undef in
`__kmpc_get_warp_size` (which the custom state machine calls as of
D113602).  Later optimizations then optimize away the custom state
machine code as if all threads are outside the thread block, so the
target region does not execute.  D114802 fixed that but didn't add a
reproducer.

This patch also adds a `__OMP_RTL_ATTRS` entry for
`__kmpc_get_warp_size` to OMPKinds.def, which D113602 missed.  This
change does not seem to have any impact on the reduction problem.

Reviewed By: JonChesterfield, jdoerfert

Differential Revision: https://reviews.llvm.org/D113824
2021-12-10 12:53:54 -05:00
Jon Chesterfield a2b3b4dadc [openmp] Run tests on both runtimes, independent of the default
Minor fix to the lit.cfg. Currently, nvptx runs the tests twice on the new runtime.
Soon, amdgpu will run them on the new runtime as well as the old.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D115150
2021-12-06 16:41:23 +00:00
Jon Chesterfield 1a87a18955 [openmp][amdgpu] Disable tests requiring USM on amdgcn
These tests tend to hang or crash on hardware that doesn't
support USM. Disabling them helps diagnose other issues. To safely
enable we require a means of testing whether USM is expected to work.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D115144
2021-12-06 13:25:23 +00:00
Matt Arsenault 90f914c870 OpenMP: Un-xfail tests that pass now
729bf9b26b should have fixed these
2021-12-04 11:25:22 -05:00
Ron Lieberman 8f4013ad46 Restric xfail on openmp/libomptarget/test/mapping/reduction_implicit_map.cpp to amdgcn-amd-amdhsa 2021-12-02 20:58:26 +00:00
Ron Lieberman f87c2c637e xfail: libomptarget reduction_implicit_map.cpp after reapply of Start calling setTargetAttributes 2021-12-02 20:38:25 +00:00
Jon Chesterfield fb9fc3c951 [openmp][amdgpu] Disable three tests in preparation for new runtime 2021-12-02 07:57:14 +00:00
Joel E. Denny c9dfe322ee [OpenMP] Fix main thread barrier for Pascal and amdgpu
Fixes what's left of https://bugs.llvm.org/show_bug.cgi?id=51781.

Reviewed By: jdoerfert, JonChesterfield, tianshilei1992

Differential Revision: https://reviews.llvm.org/D113602
2021-11-12 11:18:45 -05:00
Jon Chesterfield 27177b82d4 [OpenMP] Lower printf to __llvm_omp_vprintf
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf`
which takes the same const char*, void* arguments as cuda vprintf and also
passes the size of the void* alloca which will be needed by a non-stub
implementation of `__llvm_omp_vprintf` for amdgpu.

This removes the amdgpu link error on any printf in a target region in favour
of silently compiling code that doesn't print anything to stdout.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112680
2021-11-10 15:30:56 +00:00
Jon Chesterfield 0fa45d6d80 Revert "[OpenMP] Lower printf to __llvm_omp_vprintf"
This reverts commit db81d8f6c4.
2021-11-08 20:28:57 +00:00
Jon Chesterfield dc9edc6a6d Revert "[openmp] Fix build, test passes on CI unexpectedly"
This reverts commit c499d690cd.
2021-11-08 20:28:52 +00:00
Jon Chesterfield c499d690cd [openmp] Fix build, test passes on CI unexpectedly 2021-11-08 18:45:27 +00:00
Jon Chesterfield db81d8f6c4 [OpenMP] Lower printf to __llvm_omp_vprintf
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf`
which takes the same const char*, void* arguments as cuda vprintf and also
passes the size of the void* alloca which will be needed by a non-stub
implementation of `__llvm_omp_vprintf` for amdgpu.

This removes the amdgpu link error on any printf in a target region in favour
of silently compiling code that doesn't print anything to stdout.

The exact set of changes to check-openmp probably needs revision before commit

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112680
2021-11-08 18:38:00 +00:00
Jon Chesterfield 4d50803ce4 [libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
CI is showing a different set of pass/fails to local, committing this
without the tests enabled by default while debugging that difference.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227
2021-10-28 12:34:01 +01:00
Jon Chesterfield 6c7b203d1d Revert "[libomptarget] Build DeviceRTL for amdgpu"
- more tests failing on CI than failed locally when writing this patch

This reverts commit 33427fdb7b.
2021-10-28 01:01:53 +01:00
Jon Chesterfield 33427fdb7b [libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227
2021-10-28 00:41:45 +01:00
Ron Lieberman be03ef3ed1 [openmp][lit] Add support to OpenMP lit.cfg for ROCR_VISIBLE_DEVICES env-var
add support for ROCR_VISIBLE_DEVICES similar to name and purpose
as CUDA_VISIBLE_DEVICES

Differential Revision: https://reviews.llvm.org/D112503
2021-10-26 13:46:42 +00:00
Jon Chesterfield bf6f955f39 [libomptarget] Run GPU offloading tests on both new and old runtime
Implemented by patching python config instead of modifying all
the tests so that -generic and XFAIL work as usual. Expectation is for
this to be reverted once the old runtime is deleted.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D112225
2021-10-22 23:28:44 +01:00
Jon Chesterfield 251b1e7c25 [libomptarget] Pass OMP_TARGET_OFFLOAD env variable through to tests
Useful for OMP_TARGET_OFFLOAD=MANDATORY when testing

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D111995
2021-10-18 16:03:03 +01:00
Joseph Huber 208f900527 [Libomptarget] Add an external interface to dynamic shared memory
This patch adds an external interface to access the dynamic shared
memory buffer in the device runtime. The function introduced is
``llvm_omp_get_dynamic_shared``. This includes a host-side
definition that only returns a null pointer so that it can be used when
host-fallback is enabled without crashing. Support for dynamic shared
memory was also ported to the old device runtime.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D110957
2021-10-08 15:36:57 -04:00
Jon Chesterfield 3247329107 [openmp] Add addrspacecast to getOrCreateIdent
Fixes 51982. Adds a missing CreatePointerCast and allocates a global in
the correct address space.

Test case derived from https://github.com/ROCm-Developer-Tools/aomp/\
blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting
parts while checking the assertion failure still occurred.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110556
2021-09-30 21:36:31 +01:00
Jon Chesterfield 80fa43fe9a Revert "[openmp] Add addrspacecast to getOrCreateIdent"
This reverts commit 1a761e5b7b.
Failed CI, albeit with a different failure mode to BZ51982
2021-09-27 19:27:35 +01:00
Jon Chesterfield 1a761e5b7b [openmp] Add addrspacecast to getOrCreateIdent
Fixes 51982. Minor refactor to remove `return x = y` construct.

Test case derived from https://github.com/ROCm-Developer-Tools/aomp/\
blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting
parts while checking the assertion failure still occurred.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110556
2021-09-27 19:23:12 +01:00
Joseph Huber f1c821fa85 [OpenMP] Add support for dynamic shared memory in new RTL
This patch adds support for using dynamic shared memory in the new
device runtime. The new function `__kmpc_get_dynamic_shared` will return a
pointer to the buffer of dynamic shared memory. Currently the amount of memory
allocated is set by an environment variable.

In the future this amount will be added to the amount used for the smart stack
which will be configured in a similar way.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D110006
2021-09-17 21:25:36 -04:00
Jon Chesterfield 2a581710c1 [openmp] No longer use LIBRARY_PATH to find devicertl
Given D109057, change test runner to use the libomptarget-x-bc-path
argument instead of the LIBRARY_PATH environment variable to find the device
library.

Also drop the use of LIBRARY_PATH environment variable as it is far
too easy to pull in the device library from an unrelated toolchain by accident
with the current setup. No loss in flexibility to developers as the clang
commandline used here is still available.

Reviewed By: jdoerfert, tianshilei1992

Differential Revision: https://reviews.llvm.org/D109061
2021-09-09 17:16:41 +01:00
Joel E. Denny 1f9e437065 [OpenMP][AMDGPU] Remove unneeded XFAILs 2021-09-01 18:00:25 -04:00
Joel E. Denny d11bab0b73 [OpenMP] Use IsHostPtr where needed for targetDataBegin
As discussed in D105990, without this patch, `targetDataBegin`
determines whether to transfer data (as opposed to assuming it's in
shared memory) using the condition `!UseUSM || HasCloseModifier`.
However, this condition is broken if use of discrete memory was forced
by `omp_target_associate_ptr`.  This patch extends
`unified_shared_memory/associate_ptr.c` to reveal this case, and it
fixes it using `!IsHostPtr` in `DeviceTy::getTargetPointer` to replace
this condition.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D107927
2021-09-01 17:31:42 -04:00
Joel E. Denny 8e4836b2a2 [OpenMP] Use IsHostPtr where needed for targetDataEnd
As discussed in D105990, without this patch, `targetDataEnd`
determines whether to transfer data or delete a device mapping (as
opposed to assuming it's in shared memory) using two different
conditions, each of which is broken for some cases:

1. `!(UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin)`: The
   broken case is rare: the device and host might happen to use the
   same address for their mapped allocations.  I don't know how to
   write a test that's likely to reveal this case, but this patch does
   fix it, as discussed below.
2. `!UNIFIED_SHARED_MEMORY || HasCloseModifier`: There are at least
   two broken cases:
    1. The `close` modifier might have been specified on an `omp
      target enter data` but not the corresponding `omp target exit
      data`, which thus might falsely assume a mapping is in shared
      memory.  The test `unified_shared_memory/close_enter_exit.c`
      already has a missing deletion as a result, and this patch adds
      a check for that.  This patch also adds the new test
      `close_member.c` to reveal a missing transfer and deletion.
    2. Use of discrete memory might have been forced by
      `omp_target_associate_ptr`, as in the test
      `unified_shared_memory/api.c`.  In the current `targetDataEnd`
      implementation, this condition turns out not be used for this
      case: because the reference count is infinite, a transfer is
      possible only with an `always` modifier, and this condition is
      never used in that case.  To ensure it's never used for that
      case in the future, this patch adds the test
      `unified_shared_memory/associate_ptr.c`.

Fortunately, `DeviceTy::getTgtPtrBegin` already has a solution: it
reports whether the allocation was found in shared memory via the
variable `IsHostPtr`.

After this patch, `HasCloseModifier` is no longer used in
`targetDataEnd`, and I wonder if the `close` modifier is ever useful
on an `omp target data end`.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D107925
2021-09-01 17:31:42 -04:00
Jon Chesterfield cef1199686 Revert "[openmp] No longer use LIBRARY_PATH to find devicertl"
This reverts commit 7a228f872f.
Failing test case under CI
2021-09-01 20:44:12 +01:00
Jon Chesterfield 7a228f872f [openmp] No longer use LIBRARY_PATH to find devicertl
Given D109057, change test runner to use the libomptarget-x-bc-path
argument instead of the LIBRARY_PATH environment variable to find the device
library.

Also drop the use of LIBRARY_PATH environment variable as it is far
too easy to pull in the device library from an unrelated toolchain by accident
with the current setup. No loss in flexibility to developers as the clang
commandline used here is still available.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D109061
2021-09-01 20:24:34 +01:00
Jon Chesterfield 718e5a9883 [libomptarget] Set runpath on libomptarget, use that to drop LD_LIBRARY_PATH from test runner
Using rpath instead of LD_LIBRARY_PATH to find libomp.so and
libomptarget.so lets one rerun the already built test executables without
setting environment variables and removes the risk of the test runner picking
up different libraries to the developer debugging the failure.

rpath usually means runpath, which is not transitive, so set runpath on
libomptarget itself so that it can find the plugins located next to it,
spelled $ORIGIN. This provides sufficient functionality to drop D102043

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D109071
2021-09-01 18:47:56 +01:00
Joel E. Denny 1688b4cf8e [OpenMP][AMDGPU] XFAIL test where kernels call printf 2021-08-31 22:11:28 -04:00