Commit Graph

52 Commits

Author SHA1 Message Date
Jon Chesterfield 26790ed248 [libomptarget] Require LLVM source tree to build libomptarget
[libomptarget] Require LLVM source tree to build libomptarget

This is to permit reliably #including files from the LLVM tree in libomptarget,
as an improvement on the copy and paste that is currently in use. See D87841
for the first example of removing duplication given this new requirement.

The weekly openmp dev call reached consensus on this approach. See also D87841
for some alternatives that were considered. In the future, we may want to
introduce a new top level repo for shared constants, or start using the ADT
library within openmp.

This will break sufficiently exotic build systems, trivial fixes as below.

Building libomptarget as part of the monorepo will continue to work.
If openmp is built separately, it now requires a cmake macro indicating
where to find the LLVM source tree.

If openmp is built separately, without the llvm source tree already on disk,
the build machine will need a copy of a subset of the llvm source tree and
the cmake macro indicating where it is.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D89426
2020-10-21 18:53:00 +01:00
JonChesterfield 55dc123555 [libomptarget][amdgcn] Refactor memcpy to eliminate maps
[libomptarget][amdgcn] Refactor memcpy to eliminate maps

Builds on D89776 to remove now dead code.

Reviewed By: pdhaliwal

Differential Revision: https://reviews.llvm.org/D89888
2020-10-21 16:59:33 +01:00
Pushpinder Singh aa616efbb3 [libomptarget][AMDGPU][NFC] Split atmi_memcpy for h2d and d2h
The calls to atmi_memcpy presently determine the direction of copy (host to
device or device to host) by storing pointers in a map during malloc and
looking up the pointers during memcpy. As each call site already knows the
direction, this stash+lookup can be eliminated.

This NFC will be followed by a functional one that deletes those map lookups.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D89776

Change-Id: I1d9089bc1e56b3a9a30e334735fa07dee1f84990
2020-10-20 06:29:32 -04:00
JonChesterfield 7d2ecef5ed [openmp][libomptarget] Include header from LLVM source tree
[openmp][libomptarget] Include header from LLVM source tree

The change is to the amdgpu plugin so is unlikely to break anything.

The point of contention is whether libomptarget can depend on LLVM.
A community discussion was cautiously not opposed yesterday.

This introduces a compile time dependency on the LLVM source tree, in this case
expressed as skipping the building of the plugin if LLVM_MAIN_INCLUDE_DIR is not
set. One the source files will #include llvm/Frontend/OpenMP/OMPGridValues.h,
instead of copy&pasting the numbers across.

For users that download the monorepo, the llvm tree is already on disk. This will
inconvenience users who download only the openmp source as a tar, as they would
now also have to download (at least a file or two) from the llvm source, if they want
to build the parts of the openmp project that (post this patch) depend on llvm.

There was interest expressed in going further - using llvm tools as part of
building libomp, or linking against llvm libraries. That seems less clear cut
an improvement and worthy of further discussion. This patch seeks only to change
policy to support openmp depending on the llvm source tree. Including in the
other direction, or using libraries / tools etc, are purposefully out of scope.

Reviewers are a best guess at interested parties, please feel free to add others

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D87841
2020-10-15 15:46:19 +01:00
Manoel Roemmer c816ee13ad [OpenMP][VE plugin] Fixing failure to build VE plugin with consolidated error handling in libomptarget
The libomptarget VE plugin [[
http://lab.llvm.org:8014/builders/clang-ve-ninja/builds/8937/steps/build-unified-tree/logs/stdio
| fails zu build ]] after ae95ceeb8f .

Differential Revision: https://reviews.llvm.org/D88476
2020-09-29 17:38:01 +02:00
Ye Luo 03111e5e7a [OpenMP] Protect unrecogonized CUDA error code
If an error code can not be recognized by cuGetErrorString, errStr remains null and causes crashing at DP() printing.
Protect this case.

Reviewed By: jhuber6, tianshilei1992

Differential Revision: https://reviews.llvm.org/D87980
2020-09-21 13:43:08 -04:00
JonChesterfield a9be2b5cb2 [libomptarget] Disable build of amdgpu plugin as it doesn't build with rocm. 2020-09-18 18:10:27 +01:00
Joseph Huber ae209397b1 [OpenMP] Begin Printing Information Dumps In Libomptarget and Plugins
Summary:
This patch starts adding support for adding information dumps to libomptarget
and rtl plugins. The information printing is controlled by the
LIBOMPTARGET_INFO environment variable introduced in D86483. The goal of this
patch is to provide the user with additional information about the device
during kernel execution and providing the user with information dumps in the
case of failure. This patch added the ability to dump the pointer mapping table
as well as printing the number of blocks and threads in the cuda RTL.

Reviewers: jdoerfort gkistanova	ye-luo

Subscribers: guansong openmp-commits sstefan1 yaxunl ye-luo

Tags: #OpenMP

Differential Revision: https://reviews.llvm.org/D87165
2020-09-09 12:03:56 -04:00
Joseph Huber ae95ceeb8f [OpenMP] Consolidate error handling and debug messages in Libomptarget
Summary:

This patch consolidates the error handling and messaging routines to a single
file omptargetmessage. The goal is to simplify the error handling interface
prior to adding more error handling support

Reviewers: jdoerfert grokos ABataev AndreyChurbanov ronlieb JonChesterfield ye-luo tianshilei1992

Subscribers: danielkiss guansong jvesely kerbowa nhaehnle openmp-commits sstefan1 yaxunl
2020-09-01 15:28:19 -04:00
JonChesterfield 5d989fb37d [libomptarget][amdgpu] Improve thread safety, remove dead code 2020-08-26 22:04:03 +01:00
Jon Chesterfield 28fbf422f2 [libomptarget][amdgpu] Update plugin CMake to work with latest rocr library 2020-08-26 20:01:42 +01:00
Jon Chesterfield 6e1b11087f [libomptarget][amdgpu] Support building with static rocm libraries 2020-08-19 15:44:30 +01:00
Johannes Doerfert 5272d29e2c [OpenMP][CUDA] Keep one kernel list per device, not globally.
Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D86039
2020-08-16 14:38:35 -05:00
Johannes Doerfert aa27cfc1e7 [OpenMP][CUDA] Cache the maximal number of threads per block (per kernel)
Instead of calling `cuFuncGetAttribute` with
`CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK` for every kernel invocation,
we can do it for the first one and cache the result as part of the
`KernelInfo` struct. The only functional change is that we now expect
`cuFuncGetAttribute` to succeed and otherwise propagate the error.
Ignoring any error seems like a slippery slope...

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D86038
2020-08-16 14:38:33 -05:00
Jon Chesterfield d0b312955f [libomptarget] Implement host plugin for amdgpu
[libomptarget] Implement host plugin for amdgpu

Replacement for D71384. Primary difference is inlining the dependency on atmi
followed by extensive simplification and bugfixes. This is the latest version
from https://github.com/ROCm-Developer-Tools/amd-llvm-project/tree/aomp12 with
minor patches and a rename from hsa to amdgpu, on the basis that this can't be
used by other implementations of hsa without additional work.

This will not build unless the ROCM_DIR variable is passed so won't break other
builds. That variable is used to locate two amdgpu specific libraries that ship
as part of rocm:
libhsakmt at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
libhsa-runtime64 at https://github.com/RadeonOpenCompute/ROCR-Runtime
These libraries build from source. The build scripts in those repos are for
shared libraries, but can be adapted to statically link both into this plugin.

There are caveats.
- This works well enough to run various tests and benchmarks, and will be used
  to support the current clang bring up
- It is adequately thread safe for the above but there will be races remaining
- It is not stylistically correct for llvm, though has had clang-format run
- It has suboptimal memory management and locking strategies
- The debug printing / error handling is inconsistent

I would like to contribute this pretty much as-is and then improve it in-tree.
This would be advantagous because the aomp12 branch that was in use for fixing
this codebase has just been joined with the amd internal rocm dev process.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85742
2020-08-15 23:58:28 +01:00
George Rokos 40470eb27a [libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t.
The standard way of printing `int64_t` data is via the PRId64 macro, `ld`
is for `long int` and int64_t is not guaranteed to be typedef'ed as `long int`
on all platforms. E.g. on Windows we get mismatch warnings.

Differential Revision: https://reviews.llvm.org/D85353
2020-08-05 13:28:35 -07:00
Ye Luo c5348aecd7 [OpenMP] Use primary context in CUDA plugin
Summary:
Retaining per device primary context is preferred to creating a context owned by the plugin.

From CUDA documentation
1. Note that the use of multiple CUcontext s per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used." from https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DRIVER.html
2. Right under cuCtxCreate. In most cases it is recommended to use cuDevicePrimaryCtxRetain. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf
3. The primary context is unique per device and shared with the CUDA runtime API. These functions allow integration with other libraries using CUDA.  https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PRIMARY__CTX.html#group__CUDA__PRIMARY__CTX

Two issues are addressed by this patch:
1. Not using the primary context caused interoperability issue with libraries like cublas, cusolver. CUBLAS_STATUS_EXECUTION_FAILED and cudaErrorInvalidResourceHandle
2. On OLCF summit, "Error returned from cuCtxCreate" and "CUDA error is: invalid device ordinal"

Regarding the flags of the primary context. If it is inactive, we set CU_CTX_SCHED_BLOCKING_SYNC. If it is already active, we respect the current flags.

Reviewers: grokos, ABataev, jdoerfert, protze.joachim, AndreyChurbanov, Hahnfeld

Reviewed By: jdoerfert

Subscribers: openmp-commits, yaxunl, guansong, sstefan1, tianshilei1992

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D82718
2020-07-07 10:14:51 -04:00
Ye Luo 45bb073da8 [OpenMP] fix clang warning about printf format in CUDA plugin
Summary: Warnings are printed by clang when building LIBOMPTARGET_ENABLE_DEBUG=ON due incorrect format string.

Reviewers: tianshilei1992, jdoerfert

Reviewed By: tianshilei1992

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D82789
2020-06-29 22:35:39 -04:00
Shilei Tian a014fbbc21 [OpenMP] Improve D2D memcpy to use more efficient driver API
Summary:
In current implementation, D2D memcpy is first to copy data back to host and then
copy from host to device. This is very efficient if the device supports D2D
memcpy, like CUDA.

In this patch, D2D memcpy will first try to use native supported driver API. If
it fails, fall back to original way. It is worth noting that D2D memcpy in this
scenerio contains two ideas:
- Same devices: this is the D2D memcpy in the CUDA context.
- Different devices: this is the PeerToPeer memcpy in the CUDA context.
My implementation merges this two parts. It chooses the best API according to
the source device and destination device.

Reviewers: jdoerfert, AndreyChurbanov, grokos

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D80649
2020-06-04 16:59:06 -04:00
Manoel Roemmer 6b9e43c67e [Openmp][VE] Libomptarget plugin for NEC SX-Aurora
This patch adds a libomptarget plugin for the NEC SX-Aurora TSUBASA Vector
Engine (VE target).  The code is largely based on the existing generic-elf
plugin and uses the NEC VEO and VEOSINFO libraries for offloading.

Differential Revision: https://reviews.llvm.org/D76843
2020-05-12 10:47:30 +02:00
Shilei Tian cb038927ef [OpenMP] Fix an issue of wrong return type of DeviceRTLTy::getNumOfDevices
Summary: There is a typo in DeviceRTLTy::getNumOfDevices that the type of its return value is bool. It will lead to a problem of wrong device number returned from omp_get_num_devices.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D79255
2020-05-03 15:59:06 -04:00
Shilei Tian 4031bb982b [OpenMP] Refined CUDA plugin to put all CUDA operations into class
Summary: Current implementation mixed everything up so that there is almost no encapsulation. In this patch, all CUDA related operations are put into a new class DeviceRTLTy and only necessary functions are exposed. In addition, all C++ code now conforms with LLVM code standard, keeping those API functions following C style.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jfb, yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77951
2020-04-13 13:32:46 -04:00
Shilei Tian feed674dec [OpenMP] Introduce stream pool to make sure the correctness of device synchr...
...onization

Summary: In previous patch, in order to optimize performance, we only synchronize once
for each target region. The syncrhonization is via stream synchronization.
However, in the extreme situation, the performce might be bad. Consider the
following case: There is a task that requires transferring huge amount of data
(call many times of data transferring function). It is scheduled to the first
stream. And then we have 255 very light tasks scheduled to the remaining 255
streams (by default we have 256 streams). They can be finished before we do
synchronization at the end of the first task. Next, we get another very huge
task. It will be scheduled again to the first stream. Now the first task
finishes its kernel launch and call stream synchronization. Right now, the
stream already contains two kernels, and the synchronization will wait until the
two kernels finish instead of just the first one for the first task.

In this patch, we introduce stream pool. After each synchronization, the stream
will be returned back to the pool to make sure that for each synchronization,
only expected operations are waited.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: gregrodgers, yaxunl, lildmh, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77412
2020-04-11 07:08:56 -04:00
Shilei Tian 03ff643d2e [OpenMP] Put old APIs back and added new _async series for backward compatibility
Summary: According to comments on bi-weekly meeting, this patch put back old APIs and added new `_async` series

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77822
2020-04-09 22:40:58 -04:00
Shilei Tian 32ed29271f [OpenMP] Optimized stream selection by scheduling data mapping for the same target region into a same stream
Summary:
This patch introduces two things for offloading:
1. Asynchronous data transferring: those functions are suffix with `_async`. They have one more argument compared with their synchronous counterparts: `__tgt_async_info*`, which is a new struct that only has one field, `void *Identifier`. This struct is for information exchange between different asynchronous operations. It can be used for stream selection, like in this case, or operation synchronization, which is also used. We may expect more usages in the future.
2. Optimization of stream selection for data mapping. Previous implementation was using asynchronous device memory transfer but synchronizing after each memory transfer. Actually, if we say kernel A needs four memory copy to device and two memory copy back to host, then we can schedule these seven operations (four H2D, two D2H, and one kernel launch) into a same stream and just need synchronization after memory copy from device to host. In this way, we can save a huge overhead compared with synchronization after each operation.

Reviewers: jdoerfert, ye-luo

Reviewed By: jdoerfert

Subscribers: yaxunl, lildmh, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D77005
2020-04-07 14:55:47 -04:00
Jon Chesterfield 856c995436 [libomptarget] Add missing elf_end call in elf_common.c
Summary:
[libomptarget] Add missing elf_end call in elf_common.c
Noticed when reviewing D76843.

Reviewers: simoll, jdoerfert, efocht, AndreyChurbanov, grokos, manorom

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D76874
2020-03-26 19:07:33 +00:00
George Rokos 0a42c9bfe4 Enable CUDA offloading on aarch64 host
Differential Revision: https://reviews.llvm.org/D76469
2020-03-20 15:38:47 -07:00
Johannes Doerfert a5153dbc36 [OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D74145
2020-02-11 22:07:14 -06:00
Kazuaki Ishizaki 4c6a098ad5 [OpenMP] NFC: Fix trivial typos in comments
Reviewers: jdoerfert, Jim

Reviewed By: Jim

Subscribers: Jim, mgorny, guansong, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D72285
2020-01-07 14:05:03 +08:00
Bryan Chan 4d3198e243 [OpenMP] build offload plugins before testing them
Summary:
"make check-all" or "make check-libomptarget" would attempt to run offloading
tests before the offload plugins are built. This patch corrects that by adding
dependencies to the libomptarget CMake rules.

Reviewers: jdoerfert

Subscribers: mgorny, guansong, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D70803
2019-11-28 17:43:56 -05:00
Ron Lieberman dc34b1c94d Test commit: adds a . to comment. NFC 2019-11-04 16:51:03 -06:00
Sergey Dmitriev 4b343fd84c [Clang][OpenMP Offload] Create start/end symbols for the offloading entry table with a help of a linker
Linker automatically provides __start_<section name> and __stop_<section name> symbols to satisfy unresolved references if <section name> is representable as a C identifier (see https://sourceware.org/binutils/docs/ld/Input-Section-Example.html for details). These symbols indicate the start address and end address of the output section respectively. Therefore, renaming OpenMP offload entries section name from ".omp.offloading_entries" to "omp_offloading_entries" to use this feature.

This is the first part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943).

Differential Revision: https://reviews.llvm.org/D68070

llvm-svn: 373118
2019-09-27 20:00:51 +00:00
Michael Kruse 78769ec403 [libomptarget] Harmonize emitting CUDA errors and general debug messages.
Ensures that CUDA fail reasons (such as "No CUDA-capable device detected")
are printed together with libomptarget's debug message
(e.g. "Error when setting CUDA context"). Previously, the former was
printed only in CMAKE_BUILD_TYPE=Debug builds while the latter was
enabled by LIBOMPTARGET_ENABLE_DEBUG.

With this change, also only call cuGetErrorString when the error will be
printed.

Suggested-by: Ye Luo <xw111luoye@gmail.com>

Differential Revision: https://reviews.llvm.org/D65687

llvm-svn: 367910
2019-08-05 19:12:10 +00:00
Gheorghe-Teodor Bercea aace6d285d [OpenMP][libomptarget] Add support for declare target to clause under unified memory
Summary:
This patch adds support for handling variables under the:

```
#pragma omp declare target to()
```

clause when the 

```
#pragma omp requires unified_shared_memory
```

is used.

The address of the host variable is copied into the device pointer just like for the declare target link case.

Reviewers: ABataev, caomhin, grokos, AlexEichenberger

Reviewed By: grokos

Subscribers: jcownie, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D63106

llvm-svn: 363825
2019-06-19 15:48:10 +00:00
Gheorghe-Teodor Bercea c5fe030c16 [OpenMP][libomptarget] Enable usage of unified memory for declare target link variables
Summary: This patch enables the usage of a host variable on the device for declare target link variables when unified memory is available.

Reviewers: ABataev, caomhin, grokos

Reviewed By: grokos

Subscribers: Hahnfeld, guansong, jdoerfert, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D60884

llvm-svn: 362505
2019-06-04 15:05:53 +00:00
Chandler Carruth 57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Jonas Hahnfeld bb51d39871 [libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI.
cuDeviceGetProperties has apparently been deprecated since CUDA 5.0.
Nvidia started using annotations only in CUDA 9.2, so nobody noticed
nor cared before.
The new function returns the same values, tested with a P100.

Differential Revision: https://reviews.llvm.org/D51624

llvm-svn: 341372
2018-09-04 15:13:28 +00:00
Joachim Protze bb869f42b7 [libomptarget] Also support several images for elf
In revision r336569 (D49036) libomptarget support for multiple nvidia images
has been fixed in case a target region resides inside one or multiple
libraries and in the compiled application. But the issues is still present
for elf images.
This fix will also support multiple images for elf.

Patch by Jannis Klinkenberg

Reviewers: protze.joachim, ABataev, grokos

Reviewed By: protze.joachim, ABataev, grokos

Subscribers: openmp-commits

Differential Revision: https://reviews.llvm.org/D49418

llvm-svn: 337355
2018-07-18 07:23:46 +00:00
Alexey Bataev 2622e9e5b3 [OPENMP, NVPTX] Support several images in the executable.
Summary:
Currently Cuda plugin supports loading of the single image, though we
may have the executable with the several images, if it has target
regions inside of the dynamically loaded library. Patch allows to load
multiple images.

Reviewers: grokos

Subscribers: guansong, openmp-commits, kkwli0

Differential Revision: https://reviews.llvm.org/D49036

llvm-svn: 336569
2018-07-09 17:46:55 +00:00
Jonas Hahnfeld 65e0b8784c [CMake] Unify install path for libraries
Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands.
This also fixes installation of libomptarget-nvptx that previously
didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX.

Differential Revision: https://reviews.llvm.org/D47130

llvm-svn: 333284
2018-05-25 15:56:41 +00:00
Guansong Zhang e1c7a46d5b [OpenMP] Use LIBOMPTARGET_DEVICE_RTL_DEBUG env var to control debug messages on the device side
Summary:
Enable the device side debug messages at compile time, use env var to control at runtime.

To achieve this, an environment data block is passed to the device lib when it is loaded.

By default, the message is off, to enable it, a user need to set LIBOMPDEVICE_DEBUG=1.

Reviewers: grokos

Reviewed By: grokos

Subscribers: openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D46210

llvm-svn: 331550
2018-05-04 19:29:28 +00:00
Jonas Hahnfeld a349d4820c [libomptarget] Check for library with CUDA Driver API
That's what we really need to link the CUDA plugin against,
not the CUDA runtime API in CUDA_LIBRARIES! While the latter
comes with the CUDA SDK, the Driver API is installed with
the kernel driver and there is at most one per system. As
fallback we can use the stubs library distributed with the
CUDA SDK for linking.

Differential Revision: https://reviews.llvm.org/D42643

llvm-svn: 323787
2018-01-30 16:49:13 +00:00
Jonas Hahnfeld c189523529 [libomptarget] Only use CUDA Driver API
Use equivalents for the last calls to the Runtime API. Remove
stray assert in case of an error found during review, we should
only return OFFLOAD_FAIL.

Differential Revision: https://reviews.llvm.org/D42686

llvm-svn: 323786
2018-01-30 16:49:06 +00:00
Jonas Hahnfeld 5af381acad [CMake] Refactor common settings and flags
These are needed by both libraries, so we can do that in a
common namespace and unify configuration parameters.
Also make sure that the user isn't requesting libomptarget
if the library cannot be built on the system. Issue an error
in that case.

Differential Revision: https://reviews.llvm.org/D40081

llvm-svn: 319342
2017-11-29 19:31:48 +00:00
Sergey Dmitriev b305d26b57 [OpenMP] libomptarget: move debugging dumps under control of env var LIBOMPTARGET_DEBUG
Disable default debugging dumps for libomptarget and plugins and move dumps
under control of environment variable LIBOMPTARGET_DEBUG=<integer>. Dumps
are enabled when LIBOMPTARGET_DEBUG is set to a positive integer value.

Debugging dumps are available only in debug build; release build does not
support it.

Differential Revision: https://reviews.llvm.org/D33227

llvm-svn: 310841
2017-08-14 15:09:59 +00:00
George Rokos 0e86bfb5bb [OpenMP] libomptarget: eliminate compiler warnings at build
Thanks to Sergey Dmitriev for submitting the patch.

Differential Revision: https://reviews.llvm.org/D33851

llvm-svn: 304601
2017-06-02 22:41:35 +00:00
George Rokos 1546d31924 [OpenMP] Changes in the plugin interface
This patch chagnes the plugin interface so that:
1) future plugins can take advantage of systems with shared CPU/device storage
2) instead of using base addresses, target regions are launched by providing target addresseds and base offsets explicitly.

Differential revision: https://reviews.llvm.org/D33028
 

llvm-svn: 302663
2017-05-10 14:12:36 +00:00
George Rokos c13df8e5e0 [OpenMP] Optimized default kernel launch parameters in CUDA plugin
Differential Revision: https://reviews.llvm.org/D32321

llvm-svn: 301321
2017-04-25 16:34:13 +00:00
George Rokos 01954092d0 [OpenMP] CUDA plugin: More descriptive error messages
Differential Revision: https://reviews.llvm.org/D31206

llvm-svn: 298527
2017-03-22 17:36:22 +00:00
George Rokos f3fe2dd235 [OpenMP] CUDA plugin: add include directory for libelf
Allow the user to manually specify where libelf is installed.

Differential Revision: https://reviews.llvm.org/D31207

llvm-svn: 298515
2017-03-22 16:41:46 +00:00