llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonas Hahnfeld	aca476b296	[libomptarget] Fix typos and grammar in error messages, NFC. llvm-svn: 365890	2019-07-12 10:21:55 +00:00
Jonas Hahnfeld	2dfc5179f6	[libomptarget-nvptx] Remove dead functions These entry points are never called by Clang trunk nor clang-ykt. If XL doesn't use them either, they can finally go away. Differential Revision: https://reviews.llvm.org/D52700 llvm-svn: 365817	2019-07-11 20:12:51 +00:00
Alexey Bataev	4ad9286a57	[OPENMP]Rename loopTripCnt member data to LoopTripCnt, NFC. Rename variable to follow LLVM coding standard. llvm-svn: 365368	2019-07-08 18:45:48 +00:00
Alexey Bataev	060921dee7	[OPENMP]Make __kmpc_push_tripcount thread safe. Summary: __kmpc_push_tripcount function is not thread safe and may lead to data race when the target regions are executed in parallel threads. The patch makes loopTripCnt counter thread aware and stores the tripcount value per thread in the map. Access to map is guarded by mutex to prevent data race in the map itself. Test is for NVPTX target because it does not work correctly on the host. Seems to me, there is a problem in libomp with target regions in the parallel threads. Reviewers: grokos Subscribers: guansong, jfb, jdoerfert, openmp-commits, kkwli0, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D64080 llvm-svn: 365332	2019-07-08 15:30:23 +00:00
Alexey Bataev	bb55ece269	[OPENMP][NVPTX]Relax flush directive. Summary: According to the OpenMP standard, flush makes a thread’s temporary view of memory consistent with memory and enforces an order on the memory operations of the variables explicitly specified or implied. According to the Cuda toolkit documentation (https://docs.nvidia.com/cuda/archive/8.0/cuda-c-programming-guide/index.html#memory-fence-functions), __threadfence() functions provides required functionality. __threadfence_system() also provides required functionality, but it also includes some extra functionality, like synchronization of page-locked host memory, synchronization for the host, etc. It is not required per the standard and we can use more relaxed version of memory fence operation. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D62397 llvm-svn: 364572	2019-06-27 18:33:09 +00:00
Gheorghe-Teodor Bercea	aace6d285d	[OpenMP][libomptarget] Add support for declare target to clause under unified memory Summary: This patch adds support for handling variables under the: ``` #pragma omp declare target to() ``` clause when the ``` #pragma omp requires unified_shared_memory ``` is used. The address of the host variable is copied into the device pointer just like for the declare target link case. Reviewers: ABataev, caomhin, grokos, AlexEichenberger Reviewed By: grokos Subscribers: jcownie, guansong, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D63106 llvm-svn: 363825	2019-06-19 15:48:10 +00:00
Alexey Bataev	8a2bd361eb	[OPENMP][CUDA]Use __syncthreads when compiled by nvcc and clang >= 9.0. Summary: The problems with __syncthreads() were fixed in clang >= 9.0 and the original __syncthreads() can be used instead of the ptx instruction. Reviewers: grokos Subscribers: guansong, jdoerfert, openmp-commits, kkwli0, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D63515 llvm-svn: 363807	2019-06-19 14:20:34 +00:00
Gheorghe-Teodor Bercea	c5fe030c16	[OpenMP][libomptarget] Enable usage of unified memory for declare target link variables Summary: This patch enables the usage of a host variable on the device for declare target link variables when unified memory is available. Reviewers: ABataev, caomhin, grokos Reviewed By: grokos Subscribers: Hahnfeld, guansong, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D60884 llvm-svn: 362505	2019-06-04 15:05:53 +00:00
Alexey Bataev	e1947b84c1	Revert "[OPENMP][NVPTX]Fix barriers and parallel level counters, NFC." This reverts commit r361421 to split the patch into 3 parts. llvm-svn: 361638	2019-05-24 14:06:47 +00:00
Alexey Bataev	9d9e406684	[OPENMP][NVPTX]Fix barriers and parallel level counters, NFC. Summary: Parallel level counter should be volatile to prevent some dangerous optimizations by the ptxas. Otherwise, ptxas optimizations lead to undefined behaviour in some cases. Also, use __threadfence() for #pragma omp flush and if the barrier should not be used (we have only one thread in the team), still perform flush operation since the standard requires implicit flush when executing barriers. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D62199 llvm-svn: 361421	2019-05-22 19:50:32 +00:00
Gheorghe-Teodor Bercea	9e9c918259	[OpenMP][libomptarget] Enable requires flags for target libraries. Summary: Target link variables are currently implemented by creating a copy of the variables on the device side and unified memory never gets exploited. When the program uses the: ``` #pragma omp requires unified_shared_memory ``` directive in conjunction with a declare target link, the linked variable is no longer allocated on the device and the host version is used instead. This behavior is overridden by performing an explicit mapping. A Clang side patch is required. Reviewers: ABataev, AlexEichenberger, grokos, Hahnfeld Reviewed By: AlexEichenberger, grokos, Hahnfeld Subscribers: Hahnfeld, jfb, guansong, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D60223 llvm-svn: 361294	2019-05-21 19:35:02 +00:00
Alexey Bataev	f9e00db818	[OPENMP][NVPTX]Simplify handling of thread limit, NFC. Summary: Patch improves performance of the full runtime mode by moving threads limit counter to the shared memory. It also allows to save global memory. Reviewers: grokos, kkwli0, gtbercea Subscribers: guansong, jdoerfert, openmp-commits, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D61801 llvm-svn: 360584	2019-05-13 14:21:46 +00:00
Alexey Bataev	f62c266de7	[OPENMP][NVPTX]Improve number of threads counter, NFC. Summary: Patch improves performance of the full runtime mode by moving number-of-threads counter to the shared memory. It also allows to save global memory. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D61785 llvm-svn: 360457	2019-05-10 18:56:05 +00:00
Alexey Bataev	a857e31011	[OPENMP][NVPTX]Improve thread limit counter, NFC. Summary: Patch improves performance of the full runtime mode by moving thread-limit counter to the shared memory. It also allows to save global memory. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jdoerfert, caomhin, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D61526 llvm-svn: 359922	2019-05-03 20:00:38 +00:00
Alexey Bataev	e031e17919	[OPENMP][NVPTX]Improved several standard OpenMP functions, NFC. Summary: Used parallelLevel[] counter to simplify and improve implementation of the existing standard OpenMP functions. Functions are tested already in several tests, the patch is NFC. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jdoerfert, caomhin, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D61459 llvm-svn: 359892	2019-05-03 14:47:20 +00:00
Alexey Bataev	8ccb8f8647	[OPENMP][NVPTX]Improve code by using parallel level counter. Summary: Previously for the different purposes we need to get the active/common parallel level and with full runtime we iterated over all the records to calculate this level. Instead, we can use the warp-based parallel level counters used in no-runtime mode. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, jdoerfert, caomhin, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D61395 llvm-svn: 359822	2019-05-02 20:05:01 +00:00
Alexey Bataev	4ad6dbc5fd	[OPENMP][NVPTX]Improve omp_get_max_threads() function. Summary: Function omp_get_max_threads() can always return 1 if current execution mode is SPMD. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jdoerfert, caomhin, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D61379 llvm-svn: 359792	2019-05-02 14:52:52 +00:00
Alexey Bataev	8e6bf88cf7	[OPENMP][NVPTX]Improved omp_get_thread_limit() function. Summary: Function omp_get_thread_limit() in SPMD mode can return the maximum available number of threads as a result. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jdoerfert, openmp-commits, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D61378 llvm-svn: 359790	2019-05-02 14:46:32 +00:00
Alexey Bataev	c03fe73176	[OPENMP][NVPTX]Correctly handle L2 parallelism in SPMD mode. Summary: The parallelLevel counter must be on per-thread basis to fully support L2+ parallelism, otherwise we may end up with undefined behavior. Introduce the parallelLevel on per-warp basis using shared memory. It allows to avoid the problems with the synchronization and allows fully support L2+ parallelism in SPMD mode with no runtime. Reviewers: gtbercea, grokos Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D60918 llvm-svn: 359341	2019-04-26 19:30:34 +00:00
Alexey Bataev	5de5d74c8d	[OPENMP][NVPTX] Fix the test, NFC. Fix the test to run it really in SPMD mode without runtime. Previously it was run in SPMD + full runtime mode and does not allow to check the functionality correctly. llvm-svn: 358902	2019-04-22 17:25:31 +00:00
Alexey Bataev	13532ea623	[OPENMP][NVPTX]Fix dynamic scheduling in L2+ SPMD parallel regions. Summary: If the kernel is executed in SPMD mode and the L2+ parallel for region with the dynamic scheduling is executed, dynamic scheduling functions are called. They expect full runtime support, but SPMD kernels may be executed without the full runtime. It leads to the runtime crash of the compiled program. Patch fixes this problem + fixes handling of the parallelism level in SPMD mode, which is required as part of this patch. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, jdoerfert, openmp-commits, caomhin Tags: #openmp Differential Revision: https://reviews.llvm.org/D60578 llvm-svn: 358442	2019-04-15 20:15:20 +00:00
Michael Kruse	d97d5ebcfa	[libomptarget] Introduce LIBOMPTARGET_ENABLE_DEBUG cmake option. At the moment, support for runtime debug output using the OMPTARGET_DEBUG=1 environment variable is only available with CMAKE_BUILD_TYPE=Debug builds. The patch allows setting it independently using the LIBOMPTARGET_ENABLE_DEBUG option, which is enabled by default depending on CMAKE_BUILD_TYPE. That is, unless this option is set explicitly, nothing changes. This is the same mechanism used by LLVM for LLVM_ENABLE_ASSERTIONS. This patch also removes adding -g -O0 in debug builds, it should be handled by cmake's CMAKE_{C|CXX}_FLAGS_DEBUG configuration option. Idea by Hal Finkel Differential Revision: https://reviews.llvm.org/D55952 llvm-svn: 356998	2019-03-26 15:19:15 +00:00
Gheorghe-Teodor Bercea	06e08f0b0a	[OpenMP][libomptarget] New reduction scheme for team reductions Summary: This patch adds a more sophisticated team reduction scheme to the OpenMP libomptarget-nvptx runtime. The scheme uses a fixed size global memory buffer whose length can be adjusted via compiler flag: ``` -fopenmp-cuda-teams-reduction-recs-num=1024 ``` The global buffer is a structure of arrays (with default size of 1024 each and controlled by the above flag), one array for each reduction variable. Values in the buffer are processed by the last team to finish executing the body of the target region. In addition to adding support for the new flag, the compiler also emits special functions used for the reduction of the intermediate reduction values. These changes will be added in a separate compiler patch following this one. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, jdoerfert, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D58409 llvm-svn: 354471	2019-02-20 14:55:55 +00:00
Chandler Carruth	57b08b0944	Update more file headers across all of the LLVM projects in the monorepo to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648	2019-01-19 10:56:40 +00:00
Gheorghe-Teodor Bercea	1653633a1c	[OpenMP][libomptarget] Use shared memory variable for tracking parallel level Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D55773 llvm-svn: 350747	2019-01-09 18:30:14 +00:00
Alexey Bataev	26e6c86b79	[OPENMP][NVPTX]Fix dynamic scheduling. Summary: Previous implementation may cause the runtime crash when the number of teams is > 1024. Patch fixes this problem + reduces number of the atomic operations by 32 times. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56332 llvm-svn: 350524	2019-01-07 14:25:25 +00:00
Alexey Bataev	6b3153ada0	[OPENMP][NVPTX]General formatting/code improvement, NFC. Summary: Formatting. Reviewers: gtbercea, grokos, kkwli0 Subscribers: guansong, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56290 llvm-svn: 350431	2019-01-04 20:16:54 +00:00
Alexey Bataev	dcf2edcdf5	[OPENMP][NVPTX]Improve performance + reduce number of used registers. Summary: Reduced number of the used register + improved performance propagating the information about current execution/data sharing mode directly from the compiler, where it is possible. In some cases, it requires new/reworked interfaces of the runtime external functions. Old functions are marked as deprecated. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56278 llvm-svn: 350405	2019-01-04 17:09:12 +00:00
Joel E. Denny	f17f7a5d4d	[OpenMP] Fix nvidia-cuda-toolkit detection on Debian/Ubuntu The OpenMP runtime's cmake scripts do not correctly locate the libdevice that the Debian/Ubuntu package nvidia-cuda-toolkit currently includes, at least on my Ubuntu 18.04.1 installation. This patch fixes that for me. This problem was discussed at length in D55269. D40453 added a similar adjustment in clang, but reviewers of D55269 concluded that, for the OpenMP runtime, the right place to address this problem is in cmake's CUDA support. However, it was also suggested we could add a workaround to OpenMP's cmake scripts now. This patch contains such a workaround, which I've tried to design so that it will have no harmful effect if cmake improves in the future. nvidia-cuda-toolkit also needs improvements because its intended monolithic CUDA tree shim, /usr/lib/cuda, has many empty directories, such as bin. I reported that at: <https://bugs.launchpad.net/ubuntu/+source/nvidia-cuda-toolkit/+bug/1808999> Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D55588 llvm-svn: 350377	2019-01-04 02:07:13 +00:00
Jonathan Peyton	76f3980a20	[OpenMP] Add omp_get_device_num() and update several other device API functions Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Currently, we are leaving it to the compiler to handle this properly if it is called inside target. Also, did some cleanup and updating of duplicate device API functions (in both libomp and libomptarget) to make them into weak functions that check for the symbol from libomptarget, and will call the version in libomptarget if it is present. If any additional device API functions are implemented also in libomptarget in the future, we should add the dlsym calls to the host functions. Also, if the omp_target_* functions are to be implemented for the host (this has been requested), they should attempt to call the libomptarget versions as well. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55578 llvm-svn: 350352	2019-01-03 21:14:19 +00:00
Alexey Bataev	3c74be8049	[OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC. Summary: One of the LLVM optimizations, split critical edges, also clones tail instructions. This is a dangerous operation for __syncthreads() functions and this transformation leads to undefined behavior or incorrect results. Patch fixes this problem by replacing __syncthreads() function with the assembler instruction, which cost is too high and which cannot be copied. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56274 llvm-svn: 350333	2019-01-03 17:43:46 +00:00
Vyacheslav Zakharin	e889ac7e6b	[libomptarget] Added install component for libomptarget Differential Revision: https://reviews.llvm.org/D56108 llvm-svn: 350254	2019-01-02 19:39:49 +00:00
Alexey Bataev	d1cd005ec5	[OPENMP][NVPTX]Added/fixed debugging messages, NFC. Summary: Added or fixed new/old debugging messages for the better diagnostics. Reviewers: gtbercea, kkwli0, grokos Reviewed By: grokos Subscribers: caomhin, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D56102 llvm-svn: 350137	2018-12-28 21:36:09 +00:00
Alexey Bataev	28eccf5ba0	[OPENMP][NVPTX]Fixed initialization of the data-sharing interface. Summary: Avoid using of the atomic loop to wait for the completion of the data-sharing interface initialization, use __shfl_sync instead for the communication within the warp to signal other threads in the warp about completion of the initialization. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D56100 llvm-svn: 350129	2018-12-28 17:31:06 +00:00
Alexey Bataev	1708858dbd	[OPENMP][NVPTX]Outline assert into noinline function, NFC. Summary: At high optimization level asserts lead to some unexpected results because of auto-inserted unreachable instructions. This outlining prevents some of such dangerous optimizations and leads to better stability. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D56101 llvm-svn: 350128	2018-12-28 17:29:47 +00:00
Alexey Bataev	9056f1116d	[OPENMP][NVPTX]Revert __kmpc_shuffle_int64 to its original form. Summary: Use the original shuffle implementation for __kmpc_shuffle_int64 since default implementation uses the same implementation. Reviewers: gtbercea Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55514 llvm-svn: 348772	2018-12-10 16:50:36 +00:00
Alexey Bataev	cc6cf64c38	[OPENMP][NVPTX]Enable fast shuffles on 64bit values only if CUDA >= 9. Summary: Shuffle on 64bit data is allowed only for CUDA >= 9.0. Also, fixed the constant for the mask, need one extra L in the end. Reviewers: gtbercea, kkwli0 Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55440 llvm-svn: 348758	2018-12-10 14:29:05 +00:00
Alexey Bataev	8acafff404	[OPENMP][NVPTX]Save registers for optimized builds with enabled logging. Summary: Introduced special noinline function log that allows to save some registers for optimized builds but with enabled logging. Also, it increases the stability of the optimized builds with inlined runtime. Reviewers: gtbercea, kkwli0 Reviewed By: gtbercea Subscribers: caomhin, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D55436 llvm-svn: 348606	2018-12-07 16:08:29 +00:00
Alexey Bataev	653e8ba79a	[OPENMP][NVPTX]Correct type casting for printf args + simplified shfl64 function. Summary: Explicitly casted printf's args to the required types + simplified shfl64 function. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55379 llvm-svn: 348521	2018-12-06 19:45:48 +00:00
Alexey Bataev	5442f3e549	[OPENMP][NVPTX]Fix __kmpc_flush to flush the memory per system, not per block. Summary: According to the standard, after memory flushing the changes in the memory must be visible to all the threads in all teams. Patch fixes this. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55370 llvm-svn: 348491	2018-12-06 15:27:58 +00:00
Gheorghe-Teodor Bercea	10b2e60b7e	[OpenMP][libomptarget] Flush intermediate values during team reduction Summary: Ensure intermediate values of a team reduction are flushed to memory. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D55219 llvm-svn: 348148	2018-12-03 15:21:49 +00:00
Alexey Bataev	0f221f53d8	[OPENMP][NVPTX]Make runtime compatible with the original runtime. Summary: Reworked runtime to make it compatible with the requirements of the original runtime library. Also, simplified some code to reduce number of function calls. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55130 llvm-svn: 348003	2018-11-30 16:52:38 +00:00
Gheorghe-Teodor Bercea	31c1589ab0	[OpenMP][libomptarget] Add new version of SPMD deinit kernel function with argument Summary: To enable the compiler to optimize parts of the function that are not needed when runtime can be omitted, a new version of the SPMD deinit kernel function is needed. This function takes the runtime required flag as an argument. Reviewers: ABataev, kkwli0, caomhin Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D54969 llvm-svn: 347714	2018-11-27 21:23:40 +00:00
Alexey Bataev	d4de439cf4	[OPENMP][NVPTX]Basic support for reductions across the teams. Summary: Added functions __kmpc_nvptx_teams_reduce_nowait_simple and __kmpc_nvptx_teams_end_reduce_nowait_simple to implement basic support for reductions across the teams. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D54967 llvm-svn: 347710	2018-11-27 21:06:09 +00:00
Gheorghe-Teodor Bercea	ad8632a9ba	[OpenMP][libomptarget] Refactor SPMD and runtime requirement checking Summary: Refactor the checking for SPMD mode and whether the runtime is initialized or not. This uses constant flags which enables the runtime to optimize out unused sections of code that depend on these flags. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D54960 llvm-svn: 347698	2018-11-27 19:45:10 +00:00
Alexey Bataev	8ab0924ab4	[OPENMP][NVPTX]Improved lock/critical constructs. Summary: Improved support for critical constructs + omp_..._lock... constructs. Reviewers: gtbercea, kkwli0, caomhin Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D54766 llvm-svn: 347342	2018-11-20 20:19:36 +00:00
Alexey Bataev	15ab891e68	[OPENMP]Make lambda mapping follow reqs for PTR_AND_OBJ mapping. Summary: The base pointer for the lambda mapping must point to the lambda capture placement and pointer must point to the captured variable itself. Patch fixes this problem. Reviewers: gtbercea Subscribers: guansong, openmp-commits, kkwli0, caomhin Differential Revision: https://reviews.llvm.org/D54260 llvm-svn: 346407	2018-11-08 15:47:30 +00:00
Alexey Bataev	9476ca7db9	[OPENMP][OFFLOADING]Change the lambda capturing flags. Summary: The previously used combination `PTR_AND_OBJ | PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ | LITERAL`. Reviewers: gtbercea Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D54035 llvm-svn: 345981	2018-11-02 15:24:47 +00:00
Alexey Bataev	463e9f3224	[OPENMP][NVPTX]Fixed/improved support for globalization in team contexts. Summary: Current globalization scheme works correctly only for SPMD+lightweight runtime mode and does not work for full runtime. Patch improves support for the globalization scheme + reduces global memory consumption in lightweight runtime mode. Patch adds runtime functions to work with the statically allocated global memory. It allows to improve performance and memory consumption. This global memory must be allocated by the compiler. Reviewers: grokos, kkwli0, gtbercea, caomhin Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D53943 llvm-svn: 345976	2018-11-02 14:43:23 +00:00
Gheorghe-Teodor Bercea	b10bacf122	[OpenMP][libomptarget] Add runtime function for pushing coalesced global records Summary: In the case of coalesced global records, we need to push the exact data size passed in. This patch fixes this by outlining the common functionality of the previous push function and by adding a separate entry point for coalesced pushes. The pop function remains unchanged. Reviewers: ABataev, grokos, caomhin Reviewed By: ABataev, grokos Subscribers: jholewinski, cfe-commits, Hahnfeld, guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D53141 llvm-svn: 345867	2018-11-01 18:08:12 +00:00
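
Commit 060921dee7 in the table above describes storing the loop trip count per thread in a map whose accesses are guarded by a mutex. The following is a minimal sketch of that idea only, not the code from the patch: `TripCountTable`, `push`, `take`, and `eachThreadSeesOwnCount` are hypothetical names chosen for illustration, and the real runtime keeps this state inside its own device structures.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the runtime's per-device state. The names here
// are illustrative assumptions and are not taken from libomptarget.
class TripCountTable {
public:
  // Record the trip count for the calling host thread.
  void push(std::uint64_t tripCount) {
    std::lock_guard<std::mutex> lock(mtx_); // guards the map itself
    counts_[std::this_thread::get_id()] = tripCount;
  }

  // Fetch and clear the calling thread's trip count; returns 0 if the
  // calling thread has not pushed a value.
  std::uint64_t take() {
    std::lock_guard<std::mutex> lock(mtx_);
    auto it = counts_.find(std::this_thread::get_id());
    if (it == counts_.end())
      return 0;
    std::uint64_t value = it->second;
    counts_.erase(it);
    return value;
  }

private:
  std::mutex mtx_;
  std::map<std::thread::id, std::uint64_t> counts_;
};

// Each worker pushes its own value and reads it back. Returns true only if
// every thread observed the value it pushed, never another thread's value.
bool eachThreadSeesOwnCount(unsigned numThreads) {
  TripCountTable table;
  std::vector<int> ok(numThreads, 0);
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < numThreads; ++i)
    workers.emplace_back([&table, &ok, i] {
      table.push(100 + i);
      ok[i] = (table.take() == 100 + i);
    });
  for (auto &w : workers)
    w.join();
  for (int v : ok)
    if (!v)
      return false;
  return true;
}
```

Keying the map by thread id means concurrent target regions launched from different host threads cannot overwrite each other's value, while the mutex only protects the container's internal structure during insertion, lookup, and removal.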

156 Commits