llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Bataev	8c5555c39a	[OPENMP][NVPTX]Mark more functions as always_inline for better performance. Internally generated functions must be marked as always_inline in most cases. Patch marks some extra reduction function + outlined parallel functions as always_inline for better performance, but only if the optimization is requested. llvm-svn: 361269	2019-05-21 15:11:58 +00:00
Alexey Bataev	dc9e7dcbb0	[OPENMP][NVPTX]Run combined constructs with if clause in SPMD mode. All target-parallel-based constructs can be run in SPMD mode from now on. Even if num_threads clauses or if clauses are used, such constructs can be executed in SPMD mode. llvm-svn: 358595	2019-04-17 16:53:08 +00:00
Alexey Bataev	8e009036c9	[OPENMP][NVPTX]Use new functions from the runtime library. Updated codegen to use the new functions from the runtime library. llvm-svn: 350415	2019-01-04 17:25:09 +00:00
Alexey Bataev	6a1b06bcd4	[OPENMP][NVPTX]Emit shared memory buffer for reduction as 128 bytes buffer. Seems to me, nvlink has a bug with the proper support of the weakly linked symbols. It does not allow to define several shared memory buffer with the different sizes even with the weak linkage. Instead we always use 128 bytes buffer to prevent nvlink from the error message emission. llvm-svn: 349540	2018-12-18 21:01:42 +00:00
Alexey Bataev	f2f39be9ed	[OPENMP][NVPTX]Emit correct reduction code for teams/parallel reductions. Fixed previously committed code for the reduction support in teams/parallel constructs taking into account new design of the NVPTX support in the compiler. Teams reduction are not fully functional yet, it is going to be fixed in the following patches. llvm-svn: 347081	2018-11-16 19:38:21 +00:00
Alexey Bataev	09c9eea78f	[OPENMP][NVPTX]Allow to use shared memory for the target|teams|distribute variables. If the total size of the variables, declared in target|teams|distribute regions, is less than the maximal size of shared memory available, the buffer is allocated in the shared memory. llvm-svn: 346507	2018-11-09 16:18:04 +00:00
Alexey Bataev	e40901806f	[OPENMP][NVPTX]Improve emission of the globalized variables for target/teams/distribute regions. Target/teams/distribute regions exist for all the time the kernel is executed. Thus, if the variable is declared in their context and then escapes it, we can allocate global memory statically instead of allocating it dynamically. Patch captures all the globalized variables in target/teams/distribute contexts, merges them into the records, one per each target region. Those records are then joined into the union, one per compilation unit (to save the global memory). Those units are organized into 2-dimensional arrays, where the first dimension is the number of blocks per SM and the second one is the number of SMs. Runtime functions manage this global memory space between the executing teams. llvm-svn: 345978	2018-11-02 14:54:07 +00:00
Alexey Bataev	ff23bb6622	[OPENMP][NVPTX]Reduce memory use for globalized vars in target/teams/distribute regions. Previously introduced globalization scheme that uses memory coalescing scheme may increase memory usage for the variables that are declared in target/teams/distribute contexts. We don't need 32 copies of such variables, just 1. Patch reduces memory use in this case. llvm-svn: 344273	2018-10-11 18:30:31 +00:00
Alexey Bataev	9ea3c38597	[OPENMP][NVPTX] Support memory coalescing for globalized variables. Added support for memory coalescing for better performance for globalized variables. From now on all the globalized variables are represented as arrays of 32 elements and each thread accesses these elements using `tid & 31` as index. llvm-svn: 344049	2018-10-09 14:49:00 +00:00
Alexey Bataev	b99dcb5f31	[OPENMP, NVPTX] Do not globalize local variables in parallel regions. In generic data-sharing mode we are allowed to not globalize local variables that escape their declaration context iff they are declared inside of the parallel region. We can do this because L2 parallel regions are executed sequentially and, thus, we do not need to put shared local variables in the global memory. llvm-svn: 336567	2018-07-09 17:43:58 +00:00
Alexey Bataev	91433f6877	[OPENMP, NVPTX] Reduce the number of the globalized variables. Patch tries to make better analysis of the variables that should be globalized. From now, instead of all parallel directives it will check only distribute parallel .. directives and check only for firstprivate/lastprivate variables if they must be globalized. llvm-svn: 335632	2018-06-26 17:24:03 +00:00

11 Commits
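
The message for 9ea3c38597 says each globalized variable becomes an array of 32 elements indexed by `tid & 31`. The indexing rule can be sketched as below. This is a host-side Python model for illustration only, not the code Clang emits: `WARP_SIZE`, `lane_index`, and `GlobalizedVar` are hypothetical names, and the model covers a single 32-element array.

```python
# Illustrative model of the `tid & 31` indexing described in commit 9ea3c38597.
# All names here are hypothetical; this is not the generated device code.

WARP_SIZE = 32  # assumption: matches the 32-element arrays in the commit message


def lane_index(tid: int) -> int:
    """Return the array slot for thread `tid`: the low five bits of the id."""
    return tid & (WARP_SIZE - 1)


class GlobalizedVar:
    """Stand-in for one globalized variable stored as a 32-element array."""

    def __init__(self, initial=0):
        self.slots = [initial] * WARP_SIZE

    def store(self, tid: int, value) -> None:
        # Each thread writes only to its own lane's slot.
        self.slots[lane_index(tid)] = value

    def load(self, tid: int):
        # Each thread reads back from the same slot it writes to.
        return self.slots[lane_index(tid)]


# Threads 0..31 each store a distinct value into their own slot.
var = GlobalizedVar()
for tid in range(WARP_SIZE):
    var.store(tid, tid * 10)
```

In this model, consecutive thread ids map to consecutive slots, which is the access pattern that the commit message associates with coalescing. Thread ids that differ by a multiple of 32 map to the same slot, so the sketch should be read as modeling one group of 32 threads at a time.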