llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Bataev	7b3eabdcd2	[OPENMP][NVPTX]Fix PR40893: Size doesn't match for '_openmp_teams_reductions_buffer_$_'. nvlink does not handle weak linkage correctly, same symbols with the different sizes are reported as erroneous though the largest size must be chosen instead. Patch fixes this problem by using Internal linkage instead of the Common. llvm-svn: 356072	2019-03-13 18:21:10 +00:00
Alexey Bataev	8061acd501	[OPENMP][NVPTX]Use faster teams reduction algorithm. A faster way to reduce the values in teams reductions was found, the codegen is updated to use this faster algorithm and new runtime functions. llvm-svn: 354479	2019-02-20 16:36:22 +00:00
Alexey Bataev	25d3de8a0a	[OPENMP][NVPTX]Reduce number of barriers in reductions. After the fix for the syncthreads we don't need to generate extra barriers for the parallel reductions. llvm-svn: 350530	2019-01-07 15:45:09 +00:00
Alexey Bataev	8e009036c9	[OPENMP][NVPTX]Use new functions from the runtime library. Updated codegen to use the new functions from the runtime library. llvm-svn: 350415	2019-01-04 17:25:09 +00:00
Alexey Bataev	6a1b06bcd4	[OPENMP][NVPTX]Emit shared memory buffer for reduction as 128 bytes buffer. Seems to me, nvlink has a bug with the proper support of the weakly linked symbols. It does not allow to define several shared memory buffer with the different sizes even with the weak linkage. Instead we always use 128 bytes buffer to prevent nvlink from the error message emission. llvm-svn: 349540	2018-12-18 21:01:42 +00:00
Alexey Bataev	ae51b96f99	[OPENMP][NVPTX]Improved interwarp copy function. Inlined runtime with the current implementation of the interwarp copy function leads to the undefined behavior because of the not quite correct implementation of the barriers. Start using generic __kmpc_barrier function instead of the custom made barriers. llvm-svn: 349192	2018-12-14 21:00:58 +00:00
Gheorghe-Teodor Bercea	2b40470c61	[OpenMP] Add a new version of the SPMD deinit kernel function Summary: This patch adds a new runtime for the SPMD deinit kernel function which replaces the previous function. The new function takes as argument the flag which signals whether the runtime is required or not. This enables the compiler to optimize out the part of the deinit function which are not needed. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D54970 llvm-svn: 347915	2018-11-29 20:53:49 +00:00
Alexey Bataev	a116602475	[OPENMP][NVPTX]Basic support for reductions across the teams. Added basic codegen support for the reductions across the teams. llvm-svn: 347715	2018-11-27 21:24:54 +00:00
Alexey Bataev	f2f39be9ed	[OPENMP][NVPTX]Emit correct reduction code for teams/parallel reductions. Fixed previously committed code for the reduction support in teams/parallel constructs taking into account new design of the NVPTX support in the compiler. Teams reduction are not fully functional yet, it is going to be fixed in the following patches. llvm-svn: 347081	2018-11-16 19:38:21 +00:00
Alexey Bataev	9ea3c38597	[OPENMP][NVPTX] Support memory coalescing for globalized variables. Added support for memory coalescing for better performance for globalized variables. From now on all the globalized variables are represented as arrays of 32 elements and each thread accesses these elements using `tid & 31` as index. llvm-svn: 344049	2018-10-09 14:49:00 +00:00
Alexey Bataev	12c62908b5	[OPENMP, NVPTX] Fix reduction of the big data types/structures. If the shuffle is required for the reduced structures/big data type, current code may cause compiler crash because of the loading of the aggregate values. Patch fixes this problem. llvm-svn: 335377	2018-06-22 19:10:38 +00:00
Alexey Bataev	9ff8083d98	[OPENMP] General code improvements. llvm-svn: 330154	2018-04-16 20:16:21 +00:00
Alexey Bataev	e290ec02c7	[OPENMP, NVPTX] Fix codegen for the teams reduction. Added NUW flags for all the add|mul|sub operations + replaced sdiv by udiv as we operate on unsigned values only (addresses, converted to integers) llvm-svn: 329411	2018-04-06 16:03:36 +00:00
Alexey Bataev	b2575930b3	[OPENMP] Fix casting in NVPTX support library. If the reduction required shuffle in the NVPTX codegen, we may need to cast the reduced value to the integer type. This casting was implemented incorrectly and may cause compiler crash. Patch fixes this problem. llvm-svn: 321818	2018-01-04 20:18:55 +00:00
Jonas Hahnfeld	891c7fb19d	[OpenMP] Adjust arguments of nvptx runtime functions In the future the compiler will analyze whether the OpenMP runtime needs to be (fully) initialized and avoid that overhead if possible. The functions already take an argument to transfer that information to the runtime, so pass in the default value 1. (This is needed for binary compatibility with libomptarget-nvptx currently being upstreamed.) Differential Revision: https://reviews.llvm.org/D40354 llvm-svn: 318836	2017-11-22 14:46:49 +00:00
Arpith Chacko Jacob	fc711b1f47	[OpenMP] Teams reduction on the NVPTX device. This patch implements codegen for the reduction clause on any teams construct for elementary data types. It builds on parallel reductions on the GPU. Subsequently, the team master writes to a unique location in a global memory scratchpad. The last team to do so loads and reduces this array to calculate the final result. This patch emits two helper functions that are used by the OpenMP runtime on the GPU to perform reductions across teams. Patch by Tian Jin in collaboration with Arpith Jacob Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D29879 llvm-svn: 295335	2017-02-16 16:48:49 +00:00

16 Commits
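
Commit 9ea3c38597 says each globalized variable is represented as an array of 32 elements, and that each thread picks its element with `tid & 31`. The sketch below only illustrates that index arithmetic on the host. It is not the code clang emits, and the names `WARP_SIZE`, `lane_slot`, `make_globalized`, `store` and `load` are hypothetical, introduced for illustration.

```python
# Illustrative sketch only: models the "array of 32 elements indexed by
# tid & 31" layout described in commit 9ea3c38597. All names are hypothetical.

WARP_SIZE = 32  # assumed to match the 32-element arrays in the commit message


def lane_slot(tid):
    """Return the array slot a thread uses: tid & 31."""
    return tid & (WARP_SIZE - 1)


def make_globalized(num_vars):
    """Allocate one 32-element array per globalized variable."""
    return [[0] * WARP_SIZE for _ in range(num_vars)]


def store(storage, var, tid, value):
    """Write a thread's copy of variable `var` into its slot."""
    storage[var][lane_slot(tid)] = value


def load(storage, var, tid):
    """Read a thread's copy of variable `var` from its slot."""
    return storage[var][lane_slot(tid)]
```

With this layout, 32 consecutive thread ids starting at a multiple of 32 map to slots 0 through 31 of the same array, so neighbouring threads touch neighbouring elements. Thread ids that differ by a multiple of 32 share a slot in this simplified model.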