llvm-project

Commit Graph

Author	SHA1	Message	Date
Shilei Tian	9b2832c089	[OpenMP] Wait for kernel prior to memory deallocation Summary: In the function `target`, memory deallocation and `target_data_end` are called immediately after returning from the kernel launch. This might cause a race condition in which the corresponding memory is still being used by the kernel, and a potential issue in which, when the kernel starts to execute, its required data have already been deallocated, especially when multiple kernels are running concurrently. Since we block the thread issuing the target offloading at the end of the target anyway, we just move the synchronization ahead a little to ensure correctness. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: yaxunl, guansong, sstefan1, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D84381	2020-07-22 22:55:34 -04:00
Joel E. Denny	708752b2f6	[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2) This implements OpenMP runtime support for the OpenMP TR8 `present` map type modifier. The previous patch in this series implements Clang front end support. See that patch summary for behaviors that are not yet supported. Reviewed By: grokos, jdoerfert Differential Revision: https://reviews.llvm.org/D83062	2020-07-22 14:04:58 -04:00
Joel E. Denny	fc247c8f3c	Revert "[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2)" This reverts commit `45b8f7ec35`. It attempts to use debug macros `DPxMOD` and `DPxPTR` in release builds. Will fix and reapply later.	2020-07-22 11:22:08 -04:00
Joel E. Denny	45b8f7ec35	[OpenMP] Implement TR8 `present` map type modifier in runtime (2/2) This implements OpenMP runtime support for the OpenMP TR8 `present` map type modifier. The previous patch in this series implements Clang front end support. See that patch summary for behaviors that are not yet supported. Reviewed By: grokos, jdoerfert Differential Revision: https://reviews.llvm.org/D83062	2020-07-22 10:15:32 -04:00
Joachim Protze	ae31d7838c	[OpenMP][NFC] pass on env variables to libomptarget tests	2020-07-22 12:14:45 +02:00
George Rokos	140ab574a1	[OpenMP][Offload] Declare mapper runtime implementation Libomptarget patch adding runtime support for "declare mapper". Patch co-developed by Lingda Li and George Rokos. Differential revision: https://reviews.llvm.org/D68100	2020-07-15 18:11:43 -07:00
Johannes Doerfert	5937434677	[OpenMP] Silence unused symbol warning with proper ifdefs	2020-07-11 11:57:42 -05:00
Johannes Doerfert	c98699582a	[OpenMP][NFC] Remove unused (always fixed) arguments There are various runtime calls in the device runtime with unused, or always fixed, arguments. This is bad for all sorts of reasons. Clean up two before as we match them in OpenMPOpt now. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D83268	2020-07-11 00:51:51 -05:00
Johannes Doerfert	cd0ea03e6f	[OpenMP][NFC] Remove unused and untested code from the device runtime Summary: We carried a lot of unused and untested code in the device runtime. Among other reasons, we are planning major rewrites for which reduced size is going to help a lot. The number of code lines reduced by 14%! Before: ------------------------------------------------------------------------------- Language files blank comment code ------------------------------------------------------------------------------- CUDA 13 489 841 2454 C/C++ Header 14 322 493 1377 C 12 117 124 559 CMake 4 64 64 262 C++ 1 6 6 39 ------------------------------------------------------------------------------- SUM: 44 998 1528 4691 ------------------------------------------------------------------------------- After: ------------------------------------------------------------------------------- Language files blank comment code ------------------------------------------------------------------------------- CUDA 13 366 733 1879 C/C++ Header 14 317 484 1293 C 12 117 124 559 CMake 4 64 64 262 C++ 1 6 6 39 ------------------------------------------------------------------------------- SUM: 44 870 1411 4032 ------------------------------------------------------------------------------- Reviewers: hfinkel, jhuber6, fghanim, JonChesterfield, grokos, AndreyChurbanov, ye-luo, tianshilei1992, ggeorgakoudis, Hahnfeld, ABataev, hbae, ronlieb, gregrodgers Subscribers: jvesely, yaxunl, bollu, guansong, jfb, sstefan1, aaron.ballman, openmp-commits, cfe-commits Tags: #clang, #openmp Differential Revision: https://reviews.llvm.org/D83349	2020-07-10 19:09:41 -05:00
Ye Luo	c5348aecd7	[OpenMP] Use primary context in CUDA plugin Summary: Retaining the per-device primary context is preferred to creating a context owned by the plugin. From the CUDA documentation: 1. "Note that the use of multiple CUcontexts per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used." from https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DRIVER.html 2. Right under cuCtxCreate. In most cases it is recommended to use cuDevicePrimaryCtxRetain. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf 3. The primary context is unique per device and shared with the CUDA runtime API. These functions allow integration with other libraries using CUDA. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PRIMARY__CTX.html#group__CUDA__PRIMARY__CTX Two issues are addressed by this patch: 1. Not using the primary context caused interoperability issues with libraries like cublas, cusolver. CUBLAS_STATUS_EXECUTION_FAILED and cudaErrorInvalidResourceHandle 2. On OLCF summit, "Error returned from cuCtxCreate" and "CUDA error is: invalid device ordinal" Regarding the flags of the primary context. If it is inactive, we set CU_CTX_SCHED_BLOCKING_SYNC. If it is already active, we respect the current flags. Reviewers: grokos, ABataev, jdoerfert, protze.joachim, AndreyChurbanov, Hahnfeld Reviewed By: jdoerfert Subscribers: openmp-commits, yaxunl, guansong, sstefan1, tianshilei1992 Tags: #openmp Differential Revision: https://reviews.llvm.org/D82718	2020-07-07 10:14:51 -04:00
Saiyedul Islam	38d6640ba5	[libomptarget] Implement atomic inc and fence functions for AMDGCN using clang builtins This function uses __builtin_amdgcn_atomic_inc32(): uint32_t atomicInc(uint32_t *address, uint32_t max); These functions use __builtin_amdgcn_fence(): __kmpc_impl_threadfence() __kmpc_impl_threadfence_block() __kmpc_impl_threadfence_system() They will take place of current mechanism of directly calling IR functions. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D83132	2020-07-07 06:36:25 +00:00
Fangrui Song	6ba4380ed6	[libomptarget][test] Fix text relocations by adding -fPIC	2020-07-05 12:51:28 -07:00
Ye Luo	45bb073da8	[OpenMP] fix clang warning about printf format in CUDA plugin Summary: Warnings are printed by clang when building with LIBOMPTARGET_ENABLE_DEBUG=ON due to an incorrect format string. Reviewers: tianshilei1992, jdoerfert Reviewed By: tianshilei1992 Subscribers: yaxunl, guansong, sstefan1, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D82789	2020-06-29 22:35:39 -04:00
Ye Luo	6e5f64c44f	[OpenMP] Adopt std::set in HostDataToTargetMap Summary: lookupMapping took significant time due to linear complexity searching. This is bad for offloading from multiple host threads because lookupMapping is protected by mutex. Use std::set for logarithmic complexity searching. Before my change. libomptarget inclusive time 16.7 sec, exclusive time 8.6 sec. After the change libomptarget inclusive time 7.3 sec, exclusive time 0.4 sec. Most of the overhead of libomptarget (exclusive time) is gone. Reviewers: jdoerfert, grokos Reviewed By: grokos Subscribers: tianshilei1992, yaxunl, guansong, sstefan1 Tags: #openmp Differential Revision: https://reviews.llvm.org/D82264	2020-06-24 12:22:45 -04:00
Shilei Tian	aaf50adb53	Revert "[OpenMP][NFC] Added DeviceID and Event pointer to __tgt_async_info" This reverts commit `ee1bf45e1d`.	2020-06-17 15:01:16 -04:00
Shilei Tian	ee1bf45e1d	[OpenMP][NFC] Added DeviceID and Event pointer to __tgt_async_info DeviceID is added for some cases that we only have the __tgt_async_info but do not know its corresponding device id. However, to communicate with target plugins, we need that information. Event is added for another way to synchronize.	2020-06-17 14:29:09 -04:00
Shilei Tian	a014fbbc21	[OpenMP] Improve D2D memcpy to use more efficient driver API Summary: In the current implementation, D2D memcpy first copies data back to the host and then copies it from the host to the device. This is very inefficient if the device supports D2D memcpy, like CUDA. In this patch, D2D memcpy will first try to use the natively supported driver API. If that fails, it falls back to the original way. It is worth noting that D2D memcpy in this scenario covers two cases: - Same device: this is the D2D memcpy in the CUDA context. - Different devices: this is the PeerToPeer memcpy in the CUDA context. My implementation merges these two parts. It chooses the best API according to the source device and destination device. Reviewers: jdoerfert, AndreyChurbanov, grokos Reviewed By: jdoerfert Subscribers: yaxunl, guansong, sstefan1, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D80649	2020-06-04 16:59:06 -04:00
Manoel Roemmer	6b9e43c67e	[Openmp][VE] Libomptarget plugin for NEC SX-Aurora This patch adds a libomptarget plugin for the NEC SX-Aurora TSUBASA Vector Engine (VE target). The code is largely based on the existing generic-elf plugin and uses the NEC VEO and VEOSINFO libraries for offloading. Differential Revision: https://reviews.llvm.org/D76843	2020-05-12 10:47:30 +02:00
Joel E. Denny	dd5ba4b585	[OpenMP][NFC] Fix `not` substitution in tests D78566 introduced a `\bnot\b` lit substitution in OpenMP test suites. However, that would corrupt a command like `FileCheck -implicit-check-not` or any file name like `%t.not`. We could use lookbehind/lookahead assertions to avoid such cases, but this patch switches to `%not` (suggested during the D78566 review) as a safer option. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D79529	2020-05-11 14:53:48 -04:00
Shilei Tian	cb038927ef	[OpenMP] Fix an issue of wrong return type of DeviceRTLTy::getNumOfDevices Summary: There is a typo in DeviceRTLTy::getNumOfDevices that the type of its return value is bool. It will lead to a problem of wrong device number returned from omp_get_num_devices. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: yaxunl, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D79255	2020-05-03 15:59:06 -04:00
Ron Lieberman	ee9c53d271	[libomptarget] Initialize reference parameter IsNew within Device::getOrAllocTgtPtr The two locals IsNew and Pointer_IsNew were uninitialized at declaration, and then passed by reference to Device.getOrAllocTgtPtr which in turn did not assign on all paths within the function. This resulted in occasional runtime failures in one application. Device::getOrAllocTgtPtr will now initialize IsNew to false on entry to function. Differential Revision: https://reviews.llvm.org/D78744	2020-04-24 15:33:37 -05:00
Joel E. Denny	5f6aa9680c	[OpenMP] target_data_begin: fail on device alloc fail Without this patch, target_data_begin continues after an illegal mapping or an out-of-memory error on the device. With this patch, it terminates the runtime with an error instead. The new test exercises only illegal mappings. I didn't think of a good way to exercise out-of-memory errors from the test suite. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D78170	2020-04-21 17:10:50 -04:00
Joel E. Denny	ba942610f6	[OpenMP] Add scaffolding for negative runtime tests Without this patch, the openmp project's test suites do not appear to have support for negative tests. However, D78170 needs to add a test that an expected runtime failure occurs. This patch makes `not` visible in all of the openmp project's test suites. In all but `libomptarget/test`, it should be possible for a test author to insert `not` before a use of the lit substitution for running a test program. In `libomptarget/test`, that substitution is target-specific, and its value is `echo` when the target is not available. In that case, inserting `not` before a lit substitution would expect an `echo` fail, so this patch instead defines a separate lit substitution for expected runtime fails. Reviewed By: jdoerfert, Hahnfeld Differential Revision: https://reviews.llvm.org/D78566	2020-04-21 17:10:50 -04:00
Shilei Tian	4031bb982b	[OpenMP] Refined CUDA plugin to put all CUDA operations into class Summary: Current implementation mixed everything up so that there is almost no encapsulation. In this patch, all CUDA related operations are put into a new class DeviceRTLTy and only necessary functions are exposed. In addition, all C++ code now conforms with LLVM code standard, keeping those API functions following C style. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jfb, yaxunl, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77951	2020-04-13 13:32:46 -04:00
Shilei Tian	feed674dec	[OpenMP] Introduce stream pool to make sure the correctness of device synchronization Summary: In the previous patch, in order to optimize performance, we only synchronize once for each target region. The synchronization is via stream synchronization. However, in an extreme situation, the performance might be bad. Consider the following case: There is a task that requires transferring a huge amount of data (it calls the data transfer function many times). It is scheduled to the first stream. Then we have 255 very light tasks scheduled to the remaining 255 streams (by default we have 256 streams). They can finish before we synchronize at the end of the first task. Next, we get another very large task. It will be scheduled to the first stream again. Now the first task finishes its kernel launch and calls stream synchronization. At this point, the stream already contains two kernels, and the synchronization will wait until both kernels finish instead of just the one belonging to the first task. In this patch, we introduce a stream pool. After each synchronization, the stream is returned to the pool to make sure that each synchronization waits only for the expected operations. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: gregrodgers, yaxunl, lildmh, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77412	2020-04-11 07:08:56 -04:00
Shilei Tian	03ff643d2e	[OpenMP] Put old APIs back and added new _async series for backward compatibility Summary: According to comments on bi-weekly meeting, this patch put back old APIs and added new `_async` series Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: yaxunl, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77822	2020-04-09 22:40:58 -04:00
Shilei Tian	32ed29271f	[OpenMP] Optimized stream selection by scheduling data mapping for the same target region into a same stream Summary: This patch introduces two things for offloading: 1. Asynchronous data transfer: those functions are suffixed with `_async`. They have one more argument compared with their synchronous counterparts: `__tgt_async_info`, which is a new struct that only has one field, `void *Identifier`. This struct is for information exchange between different asynchronous operations. It can be used for stream selection, as in this case, or for operation synchronization, which is also used here. We may expect more uses in the future. 2. Optimization of stream selection for data mapping. The previous implementation used asynchronous device memory transfers but synchronized after each one. If, say, kernel A needs four memory copies to the device and two memory copies back to the host, then we can schedule these seven operations (four H2D, two D2H, and one kernel launch) into the same stream and only need to synchronize after the memory copies from device to host. In this way, we save a huge overhead compared with synchronizing after each operation. Reviewers: jdoerfert, ye-luo Reviewed By: jdoerfert Subscribers: yaxunl, lildmh, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77005	2020-04-07 14:55:47 -04:00
Kazuaki Ishizaki	4201679110	[OpenMP] NFC: Fix trivial typo Differential Revision: https://reviews.llvm.org/D77430	2020-04-04 12:06:54 +09:00
JonChesterfield	09834f9761	[libomptarget][nfc] Move non-freestanding headers out of common Summary: [libomptarget][nfc] Move non-freestanding headers out of common Lowers the bar for building deviceRTL. Drops math.h entirely as it wasn't used and libm is a big dependency. Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D77071	2020-03-31 23:43:18 +01:00
Jon Chesterfield	856c995436	[libomptarget] Add missing elf_end call in elf_common.c Summary: [libomptarget] Add missing elf_end call in elf_common.c Noticed when reviewing D76843. Reviewers: simoll, jdoerfert, efocht, AndreyChurbanov, grokos, manorom Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D76874	2020-03-26 19:07:33 +00:00
JonChesterfield	0813f41005	[libomptarget][nfc] Explicitly static function scope shared variables Summary: [libomptarget][nfc] Explicitly static function scope shared variables `__shared__` in CUDA implies static in function scope. See e.g. D.2.1.1 in CUDA_C_Programming_Guide.pdf, http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/ This is surprising for non-cuda developers, see e.g. D73239 where I thought local variables would be thread local. Tested by IR diff of libomptarget.bc (no change), running in tree tests, and binary diff of the nvcc static archives (no significant change). Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D76713	2020-03-24 18:51:50 +00:00
JonChesterfield	298527587c	[libomptarget][nfc] Disable amdgcn rtl build. The cmake logic for finding llvm is misbehaving.	2020-03-21 00:01:03 +00:00
George Rokos	0a42c9bfe4	Enable CUDA offloading on aarch64 host Differential Revision: https://reviews.llvm.org/D76469	2020-03-20 15:38:47 -07:00
Tom Scogland	a23d7282ca	openmp: fix memcpy memory leak Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D72637	2020-03-12 23:24:16 -05:00
Alexey Bataev	c422d69b1a	[LIBOMPTARGET]Fix PR45139: Bug in mixing Python and OpenMP target offload. Summary: Explicitly initialize data members of RTLsTy class upon construction. Reviewers: grokos Subscribers: guansong, openmp-commits, caomhin, kkwli0 Tags: #openmp Differential Revision: https://reviews.llvm.org/D75946	2020-03-11 09:12:02 -04:00
Jon Chesterfield	221ada654b	[libomptarget] Implement locks for amdgcn Summary: [libomptarget] Implement locks for amdgcn The nvptx implementation deadlocks on amdgcn. atomic_cas with multiple active lanes can deadlock - if one lane succeeds, all the others are locked out. The set_lock implementation therefore runs on a single lane. Also uses a sleep intrinsic instead of the system clock for a probably minor performance improvement. The unset/test implementations may be revised later, based on code size / performance or similar concerns. This implements the lock at a per-wavefront scope. That's not strictly as specified, since openmp describes locks in terms of threads. I think the nvptx implementation provides true per-thread locking on volta and the same per-warp locking on other architectures. Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D75546	2020-03-05 20:25:31 +00:00
Jon Chesterfield	918a1065be	[libomptarget][nfc] Move GetWarp/LaneId functions into per arch code Summary: [libomptarget][nfc] Move GetWarp/LaneId functions into per arch code No code change for nvptx. Amdgcn currently has two implementations of GetLaneId, this patch keeps the one a colleague considered to be superior for our ISA. GetWarpId is currently the same function for amdgcn and nvptx, but I think it's cleaner to keep it grouped with all the others than to keep it in support.cu. Reviewers: jdoerfert, grokos, ABataev Reviewed By: jdoerfert Subscribers: jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D75587	2020-03-05 17:05:58 +00:00
Jon Chesterfield	84ac0dffd4	[libomptarget][nfc][amdgcn] Replace magic number with named intrinsic	2020-03-05 11:50:30 +00:00
Jon Chesterfield	133db44996	[libomptarget] Implement most hip atomic functions in terms of intrinsics Summary: [libomptarget] Implement hip atomic functions in terms of intrinsics All but atomicInc can be implemented using type generic clang intrinsics. There is not yet a corresponding intrinsic for atomicInc in clang, only one in LLVM. This patch leaves atomicInc as an unresolved symbol. Reviewers: jdoerfert, ABataev, hfinkel, grokos, arsenm Reviewed By: arsenm Subscribers: sri, saiislam, wdng, jvesely, mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D73076	2020-03-04 17:56:40 +00:00
Jon Chesterfield	ad3d021b9e	[libomptarget][nfc][amdgcn] Simplify assert_fail implementation	2020-03-03 18:24:51 +00:00
Alexey Bataev	c4a9d976c1	[LIBOMPTARGET]Lower priority of global constructor/destructor to silence the warning from gcc. Summary: fixed the warning from gcc since prios 0-100 are reserved for the internal use. Reviewers: grokos Subscribers: kkwli0, caomhin, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D75458	2020-03-02 15:15:11 -05:00
Alexey Bataev	63cef621f9	[LIBOMPTARGET]Fix PR44933: fix crash because of the too early deinitialization of libomptarget. Summary: Instead of using global variables with unpredicted time of deinitialization, use dynamically allocated variables with functions explicitly marked as global constructor/destructor and priority. This allows to prevent the crash because of the incorrect order of dynamic libraries deinitialization. Reviewers: grokos, hfinkel Subscribers: caomhin, kkwli0, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D74837	2020-02-25 15:54:37 -05:00
Alexey Bataev	578c13d13c	[OPENMP]Fix the test, NFC.	2020-02-13 10:40:06 -05:00
Ethan Stewart	190a11148b	Changed omp_get_max_threads() implementation to more closely match spec description. Summary: The 5.0 spec states, "The omp_get_max_threads routine returns an upper bound on the number of threads that could be used to form a new team if a parallel construct without a num_threads clause were encountered after execution returns from this routine." The attached test shows Max Threads: 96, Num Threads: 128 without the proposed change. The number of threads should not exceed the (max) nthreads ICV, hence we should return the higher SPMD thread number even when omp_get_max_threads() is called in a generic kernel. This change does fail the api test, max_threads.c, because now it would return 64 instead of 32. Reviewers: jdoerfert, ABataev, grokos, JonChesterfield Reviewed By: jdoerfert Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D74092	2020-02-12 23:29:34 +00:00
JonChesterfield	c2ce9ea4e3	[libomptarget][nfc] Change enum values to match those in cuda/rtl Summary: [libomptarget][nfc] Change enum values to match those in cuda/rtl support.h and cuda/rtl.cpp (and downstream hsa/rtl.cpp) have enums for execution mode. These are actually independent - the numbers that are used within support, or within the plugin, are never passed across the boundary. Nevertheless, trying to work out why the values are different between the two has generated a reasonable amount of confusion. This patch changes support to match the values in plugin, on the basis that the plugin also has some comments which I'd have to update if I changed that one instead. Credit to Ron for working through this in our own fork. See rocm-developer-tools/aomp/issues/7 for that earlier diagnostic write up. Also happy with generic = 0, spmd = 1 - provided it's the same in both places. Reviewers: jdoerfert, grokos, ABataev, ronlieb Reviewed By: grokos Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D74503	2020-02-12 23:27:08 +00:00
Johannes Doerfert	a5153dbc36	[OpenMP][Offloading] Added support for multiple streams so that multiple kernels can be executed concurrently Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D74145	2020-02-11 22:07:14 -06:00
Johannes Doerfert	3ff4e2eee8	[OpenMP] Switch default C++ standard to C++ 14 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D74258	2020-02-11 17:11:54 -06:00
Jonas Devlieghere	4fe839ef3a	[CMake] Rename EXCLUDE_FROM_ALL and make it an argument to add_lit_testsuite EXCLUDE_FROM_ALL means something else for add_lit_testsuite as it does for something like add_executable. Distinguish between the two by renaming the variable and making it an argument to add_lit_testsuite. Differential revision: https://reviews.llvm.org/D74168	2020-02-06 15:33:18 -08:00
Jon Chesterfield	6a82f0f0b9	[libomptarget] Implement wavefront functions for amdgcn Summary: [libomptarget] Implement wavefront functions for amdgcn Reviewers: jdoerfert, ABataev, grokos, arsenm Reviewed By: arsenm Subscribers: saiislam, wdng, arsenm, jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D73077	2020-02-04 21:55:29 +00:00
Jon Chesterfield	ab9762a9f5	Revert "[nfc][libomptarget] Remove SHARED annotation from local variables" This reverts commit `0e9374e374`. Revert D73239. It fails some local testing, cause presently unknown	2020-01-27 20:05:17 +00:00
Jon Chesterfield	0e9374e374	[nfc][libomptarget] Remove SHARED annotation from local variables Summary: [nfc][libomptarget] Remove SHARED annotation from local variables A few local variables in reduction.cu were marked SHARED. This patch leaves all per-kernel global state localised in omp_data.cu. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D73239	2020-01-23 00:00:23 +00:00
Alexey Bataev	9148b8b734	[OpenMP][Offloading] Fix the issue that omp_get_num_devices returns wrong number of devices, by Shiley Tian. Summary: This patch fixes an issue in the following simple case: #include <omp.h> #include <stdio.h> int main(int argc, char *argv[]) { int num = omp_get_num_devices(); printf("%d\n", num); return 0; } Currently it returns 0 even if devices exist. Since this file doesn't contain any target region, the host entry is empty, so further actions like initialization do not proceed, leading to the wrong device number being returned by the runtime function call. Reviewers: jdoerfert, ABataev, protze.joachim Reviewed By: ABataev Subscribers: protze.joachim Tags: #openmp Differential Revision: https://reviews.llvm.org/D72576	2020-01-21 13:25:18 -05:00
Jon Chesterfield	03c2a59cd6	[libomptarget] Implement smid for amdgcn Summary: [libomptarget] Implement smid for amdgcn Implementation is in a new file as it uses an intrinsic with complicated encoding that warranted substantial comments. Reviewers: jdoerfert, grokos, ABataev, ronlieb Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72956	2020-01-20 14:52:17 +00:00
George Rokos	e244145ab0	[LIBOMPTARGET] Do not increment/decrement the refcount for "declare target" objects The reference counter for global objects marked with declare target is INF. This patch prevents the runtime from incrementing /decrementing INF refcounts. Without it, the map(delete: global_object) directive actually deallocates the global on the device. With this patch, such a directive becomes a no-op. Differential Revision: https://reviews.llvm.org/D72525	2020-01-14 16:30:38 -08:00
Jon Chesterfield	2a43688a0a	[nfc][libomptarget] Refactor nvptx/target_impl.cu Summary: [nfc][libomptarget] Refactor nvptx/target_impl.cu Use __kmpc_impl_atomic_add instead of atomicAdd to match the rest of the file. Alternatively, target_impl.cu could use the cuda functions directly. Using a mixture in this file was an oversight, happy to resolve in either direction. Removed some comments that look outdated. Call __kmpc_impl_unset_lock directly to avoid a redundant diagnostic and remove an implicit dependency on interface.h. Reviewers: ABataev, grokos, jdoerfert Reviewed By: jdoerfert Subscribers: jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72719	2020-01-14 19:27:45 +00:00
Jon Chesterfield	2d287bec3c	[nfc][libomptarget] Refactor amdgcn target_impl Summary: [nfc][libomptarget] Refactor amdgcn target_impl Removes references to internal libraries from the header Standardises on C++ mangling for all the target_impl functions Update comment block clang-format Move some functions into a new target_impl.hip source file This lays the groundwork for implementing the remaining unresolved symbols in the target_impl.hip source. Reviewers: jdoerfert, grokos, ABataev, ronlieb Reviewed By: jdoerfert Subscribers: jvesely, mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72712	2020-01-14 19:27:07 +00:00
Alexey Bataev	b19c0810e5	[LIBOMPTARGET]Ignore empty target descriptors. Summary: If the dynamically loaded module has been compiled with -fopenmp-targets and has no target regions, it has empty target descriptor. It leads to a crash at the runtime if another module has at least one target region and at least one entry in its descriptor. The runtime library is unable to load the empty binary descriptor and terminates the execution. Caused by a clang-offload-wrapper. Reviewers: grokos, jdoerfert Subscribers: caomhin, kkwli0, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72472	2020-01-10 09:45:27 -05:00
Kazuaki Ishizaki	4c6a098ad5	[OpenMP] NFC: Fix trivial typos in comments Reviewers: jdoerfert, Jim Reviewed By: Jim Subscribers: Jim, mgorny, guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D72285	2020-01-07 14:05:03 +08:00
Jon Chesterfield	bc48af8c57	[libomptarget][nfc] Change unintentional target_impl prefix to kmpc_impl	2019-12-30 20:50:23 +00:00
Jon Chesterfield	63e2aa5658	[libomptarget][nfc] Provide target_impl malloc/free Summary: [libomptarget][nfc] Provide target_impl malloc/free Sufficient to build support.cu for amdgcn Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71685	2019-12-19 16:54:28 +00:00
JonChesterfield	b40822fc14	[libomptarget][nvptx] Fix build, second symbol reordering	2019-12-19 02:02:44 +00:00
Jon Chesterfield	89a2bef27a	[libomptarget][nvptx] Fix build, symbol ordering in target_impl.h	2019-12-19 01:50:06 +00:00
JonChesterfield	9aefe5f65e	[libomptarget][amdgcn] Correct return type of extern __clock64 to unsigned	2019-12-19 00:11:21 +00:00
Jon Chesterfield	2caeaf2f45	[libomptarget][nfc] Introduce atomic wrapper function Summary: [libomptarget][nfc] Introduce atomic wrapper function Wraps atomic functions in a template prefixed __kmpc_atomic that dispatches to cuda or hip atomic functions. Intended to be easily extended to dispatch to OpenCL or C++ atomics for a third target. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: Anastasia, jvesely, mgrang, dexonsmith, llvm-commits, mgorny, jfb, openmp-commits Tags: #openmp, #llvm Differential Revision: https://reviews.llvm.org/D71404	2019-12-18 20:06:17 +00:00
JonChesterfield	8adae6027c	[libomptarget][nfc] Extract function from data_sharing, move to common Summary: [libomptarget][nfc] Extract function from data_sharing, move to common Finding the first active thread in the warp is different on nvptx and amdgcn, mostly due to warp size and the desire for efficiency. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71643	2019-12-18 19:39:35 +00:00
Alexey Bataev	15d47deedd	[LIBOPENMP][NVPTX]Fix the build error in the runtime.	2019-12-17 14:46:04 -05:00
JonChesterfield	0c83f8ccc7	[libomptarget][nfc] Move three files under common, build them for amdgcn Summary: [libomptarget][nfc] Move three files under common, build them for amdgcn Change to reduction.cu to remove two dead includes, otherwise no code change. Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71601	2019-12-17 18:02:49 +00:00
JonChesterfield	3d3e4076cd	[libomptarget][nfc] Move omp locks under target_impl Summary: [libomptarget][nfc] Move omp locks under target_impl These are likely to be target specific, even down to the lock_t which is correspondingly moved out of interface.h. The alternative is to include interface.h in target_impl which substantially increases the scope of those symbols. The current nvptx implementation deadlocks on amdgcn. The preferred implementation for that arch is still under discussion - this change leaves declarations in target_impl. The functions could be inline for nvptx. I'd prefer to keep the internals hidden in the target_impl translation unit, but will add the (possibly renamed) macros to target_impl.h if preferred. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71574	2019-12-17 12:18:57 +00:00
Jon Chesterfield	ce12a523b0	[libomptarget][nfc] Move timer functions behind target_impl Summary: [libomptarget][nfc] Move timer functions behind target_impl Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71584	2019-12-17 02:22:29 +00:00
Jon Chesterfield	53bcd1e141	[libomptarget][nfc] Wrap cuda min() in target_impl Summary: [libomptarget][nfc] Wrap cuda min() in target_impl nvptx forwards to cuda min, amdgcn implements directly. Sufficient to build parallel.cu for amdgcn, added to CMakeLists. All call sites are homogenous except one that passes a uint32_t and an int32_t. This could be smoothed over by taking two type parameters and some care over the return type, but overall I think the inline <uint32_t> calling attention to what was an implicit sign conversion is cleaner. Reviewers: ABataev, jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71580	2019-12-17 01:30:04 +00:00
JonChesterfield	69fcc6ecc1	Revert "Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn"" Summary: This reverts commit `dd8a7fcdd7`. Alexey reports undefined symbols for the new inline functions defined in target_impl.h This does not reproduce for me for nvptx, or amdgcn, under release or debug builds. I believe the patch is fine, based on: - the semantics of an inline function in C++ (the cuda INLINE functions end up as linkonce_odr in IR), which are only legal to drop if they have no uses - the code generated from a debug build of clang 9 does not show these undef symbols - the tests pass - the code is trivial To progress from here I either need: - A tie break - someone to play the role of CI in determining whether the patch works - Alexey to provide sufficient information about his build for me to reproduce the failure - Alexey to debug why the symbols are disappearing for him and report back Reviewers: ABataev, jdoerfert, grokos Subscribers: jvesely, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71502	2019-12-16 16:16:14 +00:00
Alexey Bataev	dd8a7fcdd7	Revert "[libomptarget] Move resource id functions into target specific code, implement for amdgcn" This reverts commit `dbb3fec8ad` since it breaks the NVPTX tests.	2019-12-13 16:36:06 -05:00
Jon Chesterfield	40d72134fd	[libomptarget] Build most of common/src for amdgcn Summary: [libomptarget] Build most of common/src for amdgcn Excluding parallel.cu, which uses an integer min() from cuda, Excluding support.cu, which calls malloc that is not yet available for amdgcn Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: gregrodgers, ronlieb, jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71446	2019-12-13 17:48:19 +00:00
Jon Chesterfield	56adcebfda	[libomptarget][nfc] Add nop syncwarp function for amdgcn	2019-12-13 14:27:52 +00:00
Jon Chesterfield	479868646a	[libomptarget][nfc] Add declarations of atomic functions for amdgcn Summary: [libomptarget][nfc] Add declarations of atomic functions for amdgcn This enables building more source for amdgcn. The functions are usually available in a hip runtime header, but are duplicated here to decouple the implementation Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71412	2019-12-12 22:56:14 +00:00
Jon Chesterfield	dbb3fec8ad	[libomptarget] Move resource id functions into target specific code, implement for amdgcn Summary: [libomptarget] Move resource id functions into target specific code, implement for amdgcn Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71382	2019-12-12 22:49:02 +00:00
Jon Chesterfield	b399252028	[libomptarget][nfc] Add missing header for amdgcn/target_impl	2019-12-12 09:36:57 +00:00
JonChesterfield	0dd62c5c2e	[libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl Summary: [libomptarget][nfc] Move cuda threadfence functions behind kmpc_impl Part of building code under common/ without requiring a cuda compiler Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: jvesely, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71102	2019-12-06 15:41:18 +00:00
Jon Chesterfield	cd90f49d70	[libomptarget][nfc] Move three more files to common Summary: [libomptarget][nfc] Move three more files to common Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71103	2019-12-06 15:29:50 +00:00
Jon Chesterfield	4af84d2686	[libomptarget][nfc] Introduce SHARED, ALIGN macros Summary: [libomptarget][nfc] Introduce SHARED, ALIGN macros Move remaining cuda attributes behind such macros Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits, jvesely Tags: #openmp Differential Revision: https://reviews.llvm.org/D71076	2019-12-05 21:57:58 +00:00
Jon Chesterfield	d0b9ed5c49	[libomptarget][nfc] Move omptarget-nvptx under common Summary: [libomptarget][nfc] Move omptarget-nvptx under common Almost all files depend on omptarget-nvptx, which no longer contains any obviously architecture dependent code. Moving it under common unblocks task/loop for amdgcn, and allows moving other code. At some point there should probably be a widespread symbol renaming to replace the nvptx string. I'd prefer to get things working first. Building this (and task.cu, loop.cu) without a cuda library requires some more refactoring, e.g. wrap threadfence(), use DEVICE macro more consistently. Patches for that are orthogonal and will be posted shortly. Reviewers: jdoerfert, ABataev, grokos Reviewed By: ABataev Subscribers: mgorny, fedor.sergeev, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D71073	2019-12-05 20:34:15 +00:00
JonChesterfield	3ada8d2a87	[libomptarget] Build a minimal deviceRTL for amdgcn Summary: [libomptarget] Build a minimal deviceRTL for amdgcn Repeat of D70414, with an include path fixed. Diff for sanity checking. The CMakeLists.txt file is functionally identical to the one used in the aomp fork. Whitespace changes were made based on nvptx/CMakeLists.txt, plus the copyright notice updated to match (Greg was the original author so would like his sign off on that here). This change will build a small subset of the deviceRTL if an appropriate toolchain is available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency of debug.h. Reviewers: ABataev, jdoerfert Reviewed By: ABataev Subscribers: jvesely, mgorny, jfb, openmp-commits, jdoerfert Tags: #openmp Differential Revision: https://reviews.llvm.org/D70971	2019-12-04 16:43:37 +00:00
Alexey Bataev	02b9c5d963	Revert "[libomptarget] Build a minimal deviceRTL for amdgcn" This reverts commit `877ffa716f` because it breaks the build.	2019-12-03 12:35:08 -05:00
Jon Chesterfield	877ffa716f	[libomptarget] Build a minimal deviceRTL for amdgcn Summary: [libomptarget] Build a minimal deviceRTL for amdgcn The CMakeLists.txt file is functionally identical to the one used in the aomp fork. Whitespace changes were made based on nvptx/CMakeLists.txt, plus the copyright notice updated to match (Greg was the original author so would like his sign off on that here). This change will build a small subset of the deviceRTL if an appropriate toolchain is available, e.g. a local install of rocm. Support.h is moved from nvptx as a dependency of debug.h. Reviewers: jdoerfert, ABataev, grokos, ronlieb, gregrodgers Reviewed By: jdoerfert Subscribers: jfb, Hahnfeld, jvesely, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70414	2019-12-03 15:18:41 +00:00
Bryan Chan	4d3198e243	[OpenMP] build offload plugins before testing them Summary: "make check-all" or "make check-libomptarget" would attempt to run offloading tests before the offload plugins are built. This patch corrects that by adding dependencies to the libomptarget CMake rules. Reviewers: jdoerfert Subscribers: mgorny, guansong, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70803	2019-11-28 17:43:56 -05:00
JonChesterfield	a84b48d01e	[nfc][libomptarget] Remove casts of string literals to char*	2019-11-19 19:41:59 +00:00
JonChesterfield	4681e2e434	[nfc][libomptarget] Write amdgcn macros in terms of compiler intrinsics	2019-11-19 17:23:46 +00:00
Jon Chesterfield	5a4a05d776	[libomptarget][nfc] Move some source into common from nvptx Summary: [libomptarget][nfc] Move some source into common from nvptx Moves some source that compiles cleanly under amdgcn into a common subdirectory Includes some non-trivial files and some headers. Keeps the cuda file extension. The build systems for different architectures seem unlikely to have much in common. The idea is therefore to set include paths such that files under common/src compile as if they were under arch/src as the mechanism for sharing. In particular, files under common/src need to be able to include target_impl.h. The corresponding -Icommon is left out in favour of explicit includes on the basis that it makes it clearer which files under common are used by a given architecture. Reviewers: jdoerfert, ABataev, grokos Reviewed By: ABataev Subscribers: jfb, mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70328	2019-11-18 18:17:36 +00:00
JonChesterfield	32dfbd131d	[libomptarget][nfc] Use cuda variable wrappers from support.h Summary: [libomptarget][nfc] Use cuda variable wrappers from support.h Reimplementation of D69693, after the revert of D69885 Use the wrappers in support.h for cuda builtin variables at all call sites. Localises use of cuda and removes WARPSIZE==32 assumption in debug.h. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70186	2019-11-14 12:45:09 +00:00
JonChesterfield	fd9fa9995c	[libomptarget] Move supporti.h to support.cu Summary: [libomptarget] Move supporti.h to support.cu Reimplementation of D69652, without the unity build and refactors. Will need a clean build of libomptarget as the cmakelists changed. Reviewers: ABataev, jdoerfert Reviewed By: jdoerfert Subscribers: mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D70131	2019-11-13 11:36:46 +00:00
Jon Chesterfield	7cea0cea77	[libomptarget] Revert all improvements to support Summary: [libomptarget] Revert all improvements to support The change to unity build for nvcc has broken the build for some developers. This patch reverts to a known-working state. There has been some confusion over exactly how the build broke. I think we have reached a common understanding that the disappearing symbols are from the bitcode library built by clang. The static archive built by nvcc may show the same problem. Some of the confusion arose from building the deviceRTL twice and using one or the other library based on various environmental factors. I'm pretty sure the problem is clang expanding `__forceinline__` into both `__inline__` and `attribute(("always_inline"))`. The `__inline__` attribute resolves to linkonce_odr which is not safe for exporting symbols from translation units. "always_inline" is the desired semantic for small functions defined in one translation unit that are intended to be inlined at link time. "inline" is not. This therefore reintroduces the dependency hazard of supporti.h and some code duplication, and blocks progress separating deviceRTL into reusable components. See also D69857, D69859 for attempts at a fix instead of a revert. Reviewers: ABataev, jdoerfert, grokos, ikitayama, tianshilei1992 Reviewed By: ABataev Subscribers: mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69885	2019-11-06 15:44:10 +00:00
Ron Lieberman	dc34b1c94d	Test commit: adds a . to comment. NFC	2019-11-04 16:51:03 -06:00
JonChesterfield	94c59ea8dd	[libomptarget] Implement target_impl for amdgcn Summary: [libomptarget] Implement target_impl for amdgcn Smallest atomic addition for a new target. Implements enough of the amdgcn specific code that some of the source files under nvptx/src could be compiled, without modification, to run on amdgcn. This foreshadows a work in progress patch to move said source out of nvptx/src. Patch based on fork at https://github.com/ROCm-Developer-Tools/llvm-project Reviewers: ABataev, jdoerfert, grokos, ronlieb Subscribers: jvesely, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69718	2019-11-01 15:46:35 +00:00
Alexey Bataev	e57f8ad914	[LIBOMPTARGET]Call GetLaneId function, do not use its address in debug log functions.	2019-11-01 09:43:47 -04:00
JonChesterfield	9b06ac98d0	[nfc][omptarget] Use builtin var abstraction. Second pass at D69476 Summary: [nfc][omptarget] Use builtin var abstraction. Second pass at D69476 Use the wrappers in support.h for cuda builtin variables at all call sites. Localises use of cuda and removes WARPSIZE==32 assumption in debug.h. Reviewers: ABataev, jdoerfert, grokos Reviewed By: jdoerfert Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69693	2019-11-01 02:21:44 +00:00
JonChesterfield	764c8420e4	[nfc][libomptarget] Reorganise support header Summary: [nfc][libomptarget] Reorganise support header All functions defined in support implementation are now declared in support.h Reordered functions in support implementation to match the sequence in support.h Added include guards to support.h Added #include interface to support.h to provide kmp_Ident declaration Move supporti.h to support.cu and s/INLINE/EXTERN/g Add remaining includes to support.cu A minor side effect is to change the name mangling of the support functions to extern "C". If this matters another macro along the lines of INLINE/EXTERN can be added - perhaps DEVICE as that's the obvious implementation. Reviewers: jdoerfert, ABataev, grokos Reviewed By: jdoerfert Subscribers: mgorny, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69652	2019-10-31 17:15:02 +00:00
Jon Chesterfield	e9f9dfab82	[libomptarget] Change nvcc compilation to use a unity build Summary: [libomptarget] Change nvcc compilation to use a unity build This allows nvcc to inline functions between what would otherwise be distinct translation units, which in turn removes any runtime cost from implementing functions in source files (as opposed to inline in headers). This will then allow the circular dependencies in deviceRTL to be readily broken and individual components more easily shared between architectures. Reviewers: ABataev, jdoerfert, grokos, RaviNarayanaswamy, hfinkel, ronlieb, gregrodgers Reviewed By: jdoerfert Subscribers: mgorny, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69489	2019-10-31 01:58:51 +00:00
Jon Chesterfield	8548e2f543	[nfc][libomptarget] Move named_sync() into target_impl Summary: [nfc][libomptarget] Move named_sync() into target_impl Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69487	2019-10-30 16:25:05 +00:00
Jon Chesterfield	74bb5ee674	[nfc][libomptarget] Move smid() into target_impl Summary: [nfc][libomptarget] Move smid() into target_impl Reviewers: ABataev, jdoerfert, grokos Reviewed By: ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69485	2019-10-30 13:39:15 +00:00
Jon Chesterfield	62a161cc00	[libomptarget] Always call malloc, free via SafeMalloc, SafeFree wrapper Summary: [libomptarget] Always call malloc, free via SafeMalloc, SafeFree wrapper NFC for release, adds some verbosity to debug printing. Motivation is to provide one place where local modifications can be made to the behaviour of all heap allocation or deallocation while debugging. Reviewers: jdoerfert, ABataev, grokos Reviewed By: ABataev Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D69492	2019-10-30 13:35:34 +00:00

339 Commits