llvm-project

Commit Graph

Author	SHA1	Message	Date
Jon Chesterfield	26790ed248	[libomptarget] Require LLVM source tree to build libomptarget This is to permit reliably #including files from the LLVM tree in libomptarget, as an improvement on the copy and paste that is currently in use. See D87841 for the first example of removing duplication given this new requirement. The weekly openmp dev call reached consensus on this approach. See also D87841 for some alternatives that were considered. In the future, we may want to introduce a new top level repo for shared constants, or start using the ADT library within openmp. This will break sufficiently exotic build systems, trivial fixes as below. Building libomptarget as part of the monorepo will continue to work. If openmp is built separately, it now requires a cmake macro indicating where to find the LLVM source tree. If openmp is built separately, without the llvm source tree already on disk, the build machine will need a copy of a subset of the llvm source tree and the cmake macro indicating where it is. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D89426	2020-10-21 18:53:00 +01:00
JonChesterfield	55dc123555	[libomptarget][amdgcn] Refactor memcpy to eliminate maps Builds on D89776 to remove now dead code. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D89888	2020-10-21 16:59:33 +01:00
Pushpinder Singh	aa616efbb3	[libomptarget][AMDGPU][NFC] Split atmi_memcpy for h2d and d2h The calls to atmi_memcpy presently determine the direction of copy (host to device or device to host) by storing pointers in a map during malloc and looking up the pointers during memcpy. As each call site already knows the direction, this stash+lookup can be eliminated. This NFC will be followed by a functional one that deletes those map lookups. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D89776 Change-Id: I1d9089bc1e56b3a9a30e334735fa07dee1f84990	2020-10-20 06:29:32 -04:00
JonChesterfield	7d2ecef5ed	[openmp][libomptarget] Include header from LLVM source tree The change is to the amdgpu plugin so is unlikely to break anything. The point of contention is whether libomptarget can depend on LLVM. A community discussion was cautiously not opposed yesterday. This introduces a compile time dependency on the LLVM source tree, in this case expressed as skipping the building of the plugin if LLVM_MAIN_INCLUDE_DIR is not set. One of the source files will #include llvm/Frontend/OpenMP/OMPGridValues.h, instead of copy&pasting the numbers across. For users that download the monorepo, the llvm tree is already on disk. This will inconvenience users who download only the openmp source as a tar, as they would now also have to download (at least a file or two) from the llvm source, if they want to build the parts of the openmp project that (post this patch) depend on llvm. There was interest expressed in going further - using llvm tools as part of building libomp, or linking against llvm libraries. That seems less clear cut an improvement and worthy of further discussion. This patch seeks only to change policy to support openmp depending on the llvm source tree. Including in the other direction, or using libraries / tools etc, are purposefully out of scope. Reviewers are a best guess at interested parties, please feel free to add others Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87841	2020-10-15 15:46:19 +01:00
Manoel Roemmer	c816ee13ad	[OpenMP][VE plugin] Fixing failure to build VE plugin with consolidated error handling in libomptarget The libomptarget VE plugin [[ http://lab.llvm.org:8014/builders/clang-ve-ninja/builds/8937/steps/build-unified-tree/logs/stdio \| fails to build ]] after `ae95ceeb8f` . Differential Revision: https://reviews.llvm.org/D88476	2020-09-29 17:38:01 +02:00
Ye Luo	03111e5e7a	[OpenMP] Protect unrecogonized CUDA error code If an error code can not be recognized by cuGetErrorString, errStr remains null and causes crashing at DP() printing. Protect this case. Reviewed By: jhuber6, tianshilei1992 Differential Revision: https://reviews.llvm.org/D87980	2020-09-21 13:43:08 -04:00
JonChesterfield	a9be2b5cb2	[libomptarget] Disable build of amdgpu plugin as it doesn't build with rocm.	2020-09-18 18:10:27 +01:00
Joseph Huber	ae209397b1	[OpenMP] Begin Printing Information Dumps In Libomptarget and Plugins Summary: This patch starts adding support for adding information dumps to libomptarget and rtl plugins. The information printing is controlled by the LIBOMPTARGET_INFO environment variable introduced in D86483. The goal of this patch is to provide the user with additional information about the device during kernel execution and providing the user with information dumps in the case of failure. This patch added the ability to dump the pointer mapping table as well as printing the number of blocks and threads in the cuda RTL. Reviewers: jdoerfort gkistanova ye-luo Subscribers: guansong openmp-commits sstefan1 yaxunl ye-luo Tags: #OpenMP Differential Revision: https://reviews.llvm.org/D87165	2020-09-09 12:03:56 -04:00
Joseph Huber	ae95ceeb8f	[OpenMP] Consolidate error handling and debug messages in Libomptarget Summary: This patch consolidates the error handling and messaging routines to a single file omptargetmessage. The goal is to simplify the error handling interface prior to adding more error handling support Reviewers: jdoerfert grokos ABataev AndreyChurbanov ronlieb JonChesterfield ye-luo tianshilei1992 Subscribers: danielkiss guansong jvesely kerbowa nhaehnle openmp-commits sstefan1 yaxunl	2020-09-01 15:28:19 -04:00
JonChesterfield	5d989fb37d	[libomptarget][amdgpu] Improve thread safety, remove dead code	2020-08-26 22:04:03 +01:00
Jon Chesterfield	28fbf422f2	[libomptarget][amdgpu] Update plugin CMake to work with latest rocr library	2020-08-26 20:01:42 +01:00
Jon Chesterfield	6e1b11087f	[libomptarget][amdgpu] Support building with static rocm libraries	2020-08-19 15:44:30 +01:00
Johannes Doerfert	5272d29e2c	[OpenMP][CUDA] Keep one kernel list per device, not globally. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D86039	2020-08-16 14:38:35 -05:00
Johannes Doerfert	aa27cfc1e7	[OpenMP][CUDA] Cache the maximal number of threads per block (per kernel) Instead of calling `cuFuncGetAttribute` with `CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK` for every kernel invocation, we can do it for the first one and cache the result as part of the `KernelInfo` struct. The only functional change is that we now expect `cuFuncGetAttribute` to succeed and otherwise propagate the error. Ignoring any error seems like a slippery slope... Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D86038	2020-08-16 14:38:33 -05:00
Jon Chesterfield	d0b312955f	[libomptarget] Implement host plugin for amdgpu Replacement for D71384. Primary difference is inlining the dependency on atmi followed by extensive simplification and bugfixes. This is the latest version from https://github.com/ROCm-Developer-Tools/amd-llvm-project/tree/aomp12 with minor patches and a rename from hsa to amdgpu, on the basis that this can't be used by other implementations of hsa without additional work. This will not build unless the ROCM_DIR variable is passed so won't break other builds. That variable is used to locate two amdgpu specific libraries that ship as part of rocm: libhsakmt at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface libhsa-runtime64 at https://github.com/RadeonOpenCompute/ROCR-Runtime These libraries build from source. The build scripts in those repos are for shared libraries, but can be adapted to statically link both into this plugin. There are caveats. - This works well enough to run various tests and benchmarks, and will be used to support the current clang bring up - It is adequately thread safe for the above but there will be races remaining - It is not stylistically correct for llvm, though has had clang-format run - It has suboptimal memory management and locking strategies - The debug printing / error handling is inconsistent I would like to contribute this pretty much as-is and then improve it in-tree. This would be advantageous because the aomp12 branch that was in use for fixing this codebase has just been joined with the amd internal rocm dev process. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85742	2020-08-15 23:58:28 +01:00
George Rokos	40470eb27a	[libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t. The standard way of printing `int64_t` data is via the PRId64 macro, `ld` is for `long int` and int64_t is not guaranteed to be typedef'ed as `long int` on all platforms. E.g. on Windows we get mismatch warnings. Differential Revision: https://reviews.llvm.org/D85353	2020-08-05 13:28:35 -07:00
Ye Luo	c5348aecd7	[OpenMP] Use primary context in CUDA plugin Summary: Retaining per device primary context is preferred to creating a context owned by the plugin. From CUDA documentation 1. "Note that the use of multiple CUcontexts per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used." from https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DRIVER.html 2. Right under cuCtxCreate. In most cases it is recommended to use cuDevicePrimaryCtxRetain. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf 3. The primary context is unique per device and shared with the CUDA runtime API. These functions allow integration with other libraries using CUDA. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PRIMARY__CTX.html#group__CUDA__PRIMARY__CTX Two issues are addressed by this patch: 1. Not using the primary context caused interoperability issue with libraries like cublas, cusolver. CUBLAS_STATUS_EXECUTION_FAILED and cudaErrorInvalidResourceHandle 2. On OLCF summit, "Error returned from cuCtxCreate" and "CUDA error is: invalid device ordinal" Regarding the flags of the primary context. If it is inactive, we set CU_CTX_SCHED_BLOCKING_SYNC. If it is already active, we respect the current flags. Reviewers: grokos, ABataev, jdoerfert, protze.joachim, AndreyChurbanov, Hahnfeld Reviewed By: jdoerfert Subscribers: openmp-commits, yaxunl, guansong, sstefan1, tianshilei1992 Tags: #openmp Differential Revision: https://reviews.llvm.org/D82718	2020-07-07 10:14:51 -04:00
Ye Luo	45bb073da8	[OpenMP] fix clang warning about printf format in CUDA plugin Summary: Warnings are printed by clang when building with LIBOMPTARGET_ENABLE_DEBUG=ON due to an incorrect format string. Reviewers: tianshilei1992, jdoerfert Reviewed By: tianshilei1992 Subscribers: yaxunl, guansong, sstefan1, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D82789	2020-06-29 22:35:39 -04:00
Shilei Tian	a014fbbc21	[OpenMP] Improve D2D memcpy to use more efficient driver API Summary: In the current implementation, D2D memcpy first copies data back to the host and then copies it from the host to the device. This is very inefficient if the device supports D2D memcpy, like CUDA. In this patch, D2D memcpy will first try to use the natively supported driver API. If it fails, it falls back to the original way. It is worth noting that D2D memcpy in this scenario contains two ideas: - Same devices: this is the D2D memcpy in the CUDA context. - Different devices: this is the PeerToPeer memcpy in the CUDA context. My implementation merges these two parts. It chooses the best API according to the source device and destination device. Reviewers: jdoerfert, AndreyChurbanov, grokos Reviewed By: jdoerfert Subscribers: yaxunl, guansong, sstefan1, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D80649	2020-06-04 16:59:06 -04:00
Manoel Roemmer	6b9e43c67e	[Openmp][VE] Libomptarget plugin for NEC SX-Aurora This patch adds a libomptarget plugin for the NEC SX-Aurora TSUBASA Vector Engine (VE target). The code is largely based on the existing generic-elf plugin and uses the NEC VEO and VEOSINFO libraries for offloading. Differential Revision: https://reviews.llvm.org/D76843	2020-05-12 10:47:30 +02:00
Shilei Tian	cb038927ef	[OpenMP] Fix an issue of wrong return type of DeviceRTLTy::getNumOfDevices Summary: There is a typo in DeviceRTLTy::getNumOfDevices that the type of its return value is bool. It will lead to a problem of wrong device number returned from omp_get_num_devices. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: yaxunl, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D79255	2020-05-03 15:59:06 -04:00
Shilei Tian	4031bb982b	[OpenMP] Refined CUDA plugin to put all CUDA operations into class Summary: Current implementation mixed everything up so that there is almost no encapsulation. In this patch, all CUDA related operations are put into a new class DeviceRTLTy and only necessary functions are exposed. In addition, all C++ code now conforms with LLVM code standard, keeping those API functions following C style. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jfb, yaxunl, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77951	2020-04-13 13:32:46 -04:00
Shilei Tian	feed674dec	[OpenMP] Introduce stream pool to make sure the correctness of device synchronization Summary: In previous patch, in order to optimize performance, we only synchronize once for each target region. The synchronization is via stream synchronization. However, in the extreme situation, the performance might be bad. Consider the following case: There is a task that requires transferring a huge amount of data (calling the data transferring function many times). It is scheduled to the first stream. And then we have 255 very light tasks scheduled to the remaining 255 streams (by default we have 256 streams). They can be finished before we do synchronization at the end of the first task. Next, we get another very huge task. It will be scheduled again to the first stream. Now the first task finishes its kernel launch and calls stream synchronization. Right now, the stream already contains two kernels, and the synchronization will wait until the two kernels finish instead of just the first one for the first task. In this patch, we introduce stream pool. After each synchronization, the stream will be returned to the pool to make sure that for each synchronization, only expected operations are waited. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: gregrodgers, yaxunl, lildmh, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77412	2020-04-11 07:08:56 -04:00
Shilei Tian	03ff643d2e	[OpenMP] Put old APIs back and added new _async series for backward compatibility Summary: According to comments on bi-weekly meeting, this patch put back old APIs and added new `_async` series Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: yaxunl, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77822	2020-04-09 22:40:58 -04:00
Shilei Tian	32ed29271f	[OpenMP] Optimized stream selection by scheduling data mapping for the same target region into a same stream Summary: This patch introduces two things for offloading: 1. Asynchronous data transferring: those functions are suffixed with `_async`. They have one more argument compared with their synchronous counterparts: `__tgt_async_info`, which is a new struct that only has one field, `void *Identifier`. This struct is for information exchange between different asynchronous operations. It can be used for stream selection, like in this case, or operation synchronization, which is also used. We may expect more usages in the future. 2. Optimization of stream selection for data mapping. Previous implementation was using asynchronous device memory transfer but synchronizing after each memory transfer. Actually, if we say kernel A needs four memory copies to device and two memory copies back to host, then we can schedule these seven operations (four H2D, two D2H, and one kernel launch) into the same stream and just need synchronization after memory copy from device to host. In this way, we can save a huge overhead compared with synchronization after each operation. Reviewers: jdoerfert, ye-luo Reviewed By: jdoerfert Subscribers: yaxunl, lildmh, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77005	2020-04-07 14:55:47 -04:00
Jon Chesterfield	856c995436	[libomptarget] Add missing elf_end call in elf_common.c Summary: Noticed when reviewing D76843. Reviewers: simoll, jdoerfert, efocht, AndreyChurbanov, grokos, manorom Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D76874	2020-03-26 19:07:33 +00:00
George Rokos	0a42c9bfe4	Enable CUDA offloading on aarch64 host Differential Revision: https://reviews.llvm.org/D76469	2020-03-20 15:38:47 -07:00
Johannes Doerfert	a5153dbc36	[OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D74145	2020-02-11 22:07:14 -06:00
Kazuaki Ishizaki	4c6a098ad5	[OpenMP] NFC: Fix trivial typos in comments Reviewers: jdoerfert, Jim Reviewed By: Jim Subscribers: Jim, mgorny, guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72285	2020-01-07 14:05:03 +08:00
Bryan Chan	4d3198e243	[OpenMP] build offload plugins before testing them Summary: "make check-all" or "make check-libomptarget" would attempt to run offloading tests before the offload plugins are built. This patch corrects that by adding dependencies to the libomptarget CMake rules. Reviewers: jdoerfert Subscribers: mgorny, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70803	2019-11-28 17:43:56 -05:00
Ron Lieberman	dc34b1c94d	Test commit: adds a . to comment. NFC	2019-11-04 16:51:03 -06:00
Sergey Dmitriev	4b343fd84c	[Clang][OpenMP Offload] Create start/end symbols for the offloading entry table with a help of a linker Linker automatically provides __start_<section name> and __stop_<section name> symbols to satisfy unresolved references if <section name> is representable as a C identifier (see https://sourceware.org/binutils/docs/ld/Input-Section-Example.html for details). These symbols indicate the start address and end address of the output section respectively. Therefore, renaming OpenMP offload entries section name from ".omp.offloading_entries" to "omp_offloading_entries" to use this feature. This is the first part of the patch for eliminating OpenMP linker script (please see https://reviews.llvm.org/D64943). Differential Revision: https://reviews.llvm.org/D68070 llvm-svn: 373118	2019-09-27 20:00:51 +00:00
Michael Kruse	78769ec403	[libomptarget] Harmonize emitting CUDA errors and general debug messages. Ensures that CUDA fail reasons (such as "No CUDA-capable device detected") are printed together with libomptarget's debug message (e.g. "Error when setting CUDA context"). Previously, the former was printed only in CMAKE_BUILD_TYPE=Debug builds while the latter was enabled by LIBOMPTARGET_ENABLE_DEBUG. With this change, also only call cuGetErrorString when the error will be printed. Suggested-by: Ye Luo <xw111luoye@gmail.com> Differential Revision: https://reviews.llvm.org/D65687 llvm-svn: 367910	2019-08-05 19:12:10 +00:00
Gheorghe-Teodor Bercea	aace6d285d	[OpenMP][libomptarget] Add support for declare target to clause under unified memory Summary: This patch adds support for handling variables under the `#pragma omp declare target to()` clause when `#pragma omp requires unified_shared_memory` is used. The address of the host variable is copied into the device pointer just like for the declare target link case. Reviewers: ABataev, caomhin, grokos, AlexEichenberger Reviewed By: grokos Subscribers: jcownie, guansong, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D63106 llvm-svn: 363825	2019-06-19 15:48:10 +00:00
Gheorghe-Teodor Bercea	c5fe030c16	[OpenMP][libomptarget] Enable usage of unified memory for declare target link variables Summary: This patch enables the usage of a host variable on the device for declare target link variables when unified memory is available. Reviewers: ABataev, caomhin, grokos Reviewed By: grokos Subscribers: Hahnfeld, guansong, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D60884 llvm-svn: 362505	2019-06-04 15:05:53 +00:00
Chandler Carruth	57b08b0944	Update more file headers across all of the LLVM projects in the monorepo to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648	2019-01-19 10:56:40 +00:00
Jonas Hahnfeld	bb51d39871	[libomptarget][CUDA] Use cuDeviceGetAttribute, NFCI. cuDeviceGetProperties has apparently been deprecated since CUDA 5.0. Nvidia started using annotations only in CUDA 9.2, so nobody noticed nor cared before. The new function returns the same values, tested with a P100. Differential Revision: https://reviews.llvm.org/D51624 llvm-svn: 341372	2018-09-04 15:13:28 +00:00
Joachim Protze	bb869f42b7	[libomptarget] Also support several images for elf In revision r336569 (D49036) libomptarget support for multiple nvidia images has been fixed in case a target region resides inside one or multiple libraries and in the compiled application. But the issue is still present for elf images. This fix will also support multiple images for elf. Patch by Jannis Klinkenberg Reviewers: protze.joachim, ABataev, grokos Reviewed By: protze.joachim, ABataev, grokos Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D49418 llvm-svn: 337355	2018-07-18 07:23:46 +00:00
Alexey Bataev	2622e9e5b3	[OPENMP, NVPTX] Support several images in the executable. Summary: Currently Cuda plugin supports loading of the single image, though we may have the executable with the several images, if it has target regions inside of the dynamically loaded library. Patch allows to load multiple images. Reviewers: grokos Subscribers: guansong, openmp-commits, kkwli0 Differential Revision: https://reviews.llvm.org/D49036 llvm-svn: 336569	2018-07-09 17:46:55 +00:00
Jonas Hahnfeld	65e0b8784c	[CMake] Unify install path for libraries Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands. This also fixes installation of libomptarget-nvptx that previously didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX. Differential Revision: https://reviews.llvm.org/D47130 llvm-svn: 333284	2018-05-25 15:56:41 +00:00
Guansong Zhang	e1c7a46d5b	[OpenMP] Use LIBOMPTARGET_DEVICE_RTL_DEBUG env var to control debug messages on the device side Summary: Enable the device side debug messages at compile time, use env var to control at runtime. To achieve this, an environment data block is passed to the device lib when it is loaded. By default, the messages are off; to enable them, a user needs to set LIBOMPDEVICE_DEBUG=1. Reviewers: grokos Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D46210 llvm-svn: 331550	2018-05-04 19:29:28 +00:00
Jonas Hahnfeld	a349d4820c	[libomptarget] Check for library with CUDA Driver API That's what we really need to link the CUDA plugin against, not the CUDA runtime API in CUDA_LIBRARIES! While the latter comes with the CUDA SDK, the Driver API is installed with the kernel driver and there is at most one per system. As fallback we can use the stubs library distributed with the CUDA SDK for linking. Differential Revision: https://reviews.llvm.org/D42643 llvm-svn: 323787	2018-01-30 16:49:13 +00:00
Jonas Hahnfeld	c189523529	[libomptarget] Only use CUDA Driver API Use equivalents for the last calls to the Runtime API. Remove stray assert in case of an error found during review, we should only return OFFLOAD_FAIL. Differential Revision: https://reviews.llvm.org/D42686 llvm-svn: 323786	2018-01-30 16:49:06 +00:00
Jonas Hahnfeld	5af381acad	[CMake] Refactor common settings and flags These are needed by both libraries, so we can do that in a common namespace and unify configuration parameters. Also make sure that the user isn't requesting libomptarget if the library cannot be built on the system. Issue an error in that case. Differential Revision: https://reviews.llvm.org/D40081 llvm-svn: 319342	2017-11-29 19:31:48 +00:00
Sergey Dmitriev	b305d26b57	[OpenMP] libomptarget: move debugging dumps under control of env var LIBOMPTARGET_DEBUG Disable default debugging dumps for libomptarget and plugins and move dumps under control of environment variable LIBOMPTARGET_DEBUG=<integer>. Dumps are enabled when LIBOMPTARGET_DEBUG is set to a positive integer value. Debugging dumps are available only in debug build; release build does not support it. Differential Revision: https://reviews.llvm.org/D33227 llvm-svn: 310841	2017-08-14 15:09:59 +00:00
George Rokos	0e86bfb5bb	[OpenMP] libomptarget: eliminate compiler warnings at build Thanks to Sergey Dmitriev for submitting the patch. Differential Revision: https://reviews.llvm.org/D33851 llvm-svn: 304601	2017-06-02 22:41:35 +00:00
George Rokos	1546d31924	[OpenMP] Changes in the plugin interface This patch changes the plugin interface so that: 1) future plugins can take advantage of systems with shared CPU/device storage 2) instead of using base addresses, target regions are launched by providing target addresses and base offsets explicitly. Differential revision: https://reviews.llvm.org/D33028 llvm-svn: 302663	2017-05-10 14:12:36 +00:00
George Rokos	c13df8e5e0	[OpenMP] Optimized default kernel launch parameters in CUDA plugin Differential Revision: https://reviews.llvm.org/D32321 llvm-svn: 301321	2017-04-25 16:34:13 +00:00
George Rokos	01954092d0	[OpenMP] CUDA plugin: More descriptive error messages Differential Revision: https://reviews.llvm.org/D31206 llvm-svn: 298527	2017-03-22 17:36:22 +00:00
George Rokos	f3fe2dd235	[OpenMP] CUDA plugin: add include directory for libelf Allow the user to manually specify where libelf is installed. Differential Revision: https://reviews.llvm.org/D31207 llvm-svn: 298515	2017-03-22 16:41:46 +00:00
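
Commit 40470eb27a in the list above replaces `%ld` with the `PRId64` macro when printing `int64_t` values, because `int64_t` is not guaranteed to be `long int` on every platform. A minimal sketch of that pattern is given here; `formatSize` is a hypothetical helper written for illustration and is not taken from libomptarget.

```cpp
#include <cinttypes> // PRId64
#include <cstdint>   // int64_t
#include <cstdio>    // std::snprintf
#include <string>

// Hypothetical helper (not from libomptarget): formats a 64-bit value.
std::string formatSize(int64_t Size) {
  char Buf[64];
  // PRId64 expands to the conversion specifier that matches int64_t on the
  // current platform ("ld" on LP64 Linux, "lld" on 64-bit Windows), so the
  // format string and the argument type always agree.
  std::snprintf(Buf, sizeof(Buf), "size=%" PRId64, Size);
  return std::string(Buf);
}
```

Calling `formatSize(42)` yields the string `size=42` regardless of whether `int64_t` is `long` or `long long` on the host.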


52 Commits