Tobias Grosser 62acb344d0 [GPGPU] Synchronize after each kernel, not each copy out
Summary:
This change reduces the overall number of synchronize calls for kernels with
a lot of output data, at the cost of additional synchronize calls for kernels
launched in sequence without any device-to-host transfers in between. As the
latter pattern is much less frequent, this seems to be the better tradeoff.

Although this motivation would be sufficient on its own, the change is also
a step towards enabling ppcg to skip computing to-device and from-device copy
calls entirely, which would be incorrect if we still relied on these calls to
place our synchronization statements.
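
The tradeoff described in the summary can be illustrated with a small counting
model. This is only a sketch, not Polly or ppcg code: the operation names
"kernel" and "copy_out" and both helper functions are hypothetical, and the
model assumes exactly one synchronize call per copy out under the old placement
and exactly one per kernel launch under the new one.

```python
# Hypothetical model, not Polly code: a GPU program is a list of operations,
# where "kernel" is a kernel launch and "copy_out" is a device-to-host copy.

def syncs_per_copy_out(ops):
    """Old placement (assumed): one synchronize for each copy out."""
    return sum(1 for op in ops if op == "copy_out")


def syncs_per_kernel(ops):
    """New placement (assumed): one synchronize for each kernel launch."""
    return sum(1 for op in ops if op == "kernel")


# One kernel that produces a lot of output data (four arrays copied back).
many_outputs = ["kernel", "copy_out", "copy_out", "copy_out", "copy_out"]

# Three kernels launched in sequence with no transfers in between.
back_to_back = ["kernel", "kernel", "kernel", "copy_out"]

for name, ops in [("many_outputs", many_outputs), ("back_to_back", back_to_back)]:
    print(name, syncs_per_copy_out(ops), syncs_per_kernel(ops))
```

In this model the first sequence drops from four synchronize calls to one,
while the second rises from one to three, which matches the cost and benefit
stated in the summary.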

Reviewers: Meinersbur, bollu, singam-sanjay

Reviewed By: bollu

Subscribers: nemanjai, kbarton, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D36867

llvm-svn: 311155
2017-08-18 12:55:58 +00:00
clang AMDGPU: add missing amdgcn processors and tests 2017-08-18 01:13:39 +00:00
clang-tools-extra [clang-tidy] Add modernize-use-equals-default.IgnoreMacros option 2017-08-17 23:07:59 +00:00
compiler-rt [XRay][compiler-rt][NFC] Expand the PIC test case for XRay 2017-08-18 05:24:32 +00:00
debuginfo-tests Add a test for PR33166. 2017-05-25 19:33:16 +00:00
libclc amdgcn: rewrite barrier() using fence and clang __builtin_amdgcn_s_barrier 2017-08-16 17:09:00 +00:00
libcxx [libcxx] [test] Rename _Up to U, etc. NFCI. 2017-08-11 20:54:09 +00:00
libcxxabi Revert "[libcxxabi] When built with ASan, __cxa_throw calls __asan_handle_no_return" 2017-08-16 22:05:54 +00:00
libunwind [CMake] Allow overriding lib dir suffix independently from LLVM 2017-08-08 00:37:59 +00:00
lld [ELF] - Don't segfault when accessing location counter inside MEMORY command. 2017-08-17 08:47:21 +00:00
lldb [cmake] Add explicit linkage from Core to curses 2017-08-17 20:33:21 +00:00
llgo irgen: Create functions instead of global variables for builtin hash and equal algorithms. 2017-06-04 22:11:28 +00:00
llvm [AArch64] Do not promote f16 when subtarget HasFullFP16 2017-08-18 10:51:14 +00:00
openmp Remove BUILD_TV 2017-08-17 19:09:28 +00:00
parallel-libs [Axccel] Remove -Wno-missing-braces in build 2016-12-19 21:34:07 +00:00
polly [GPGPU] Synchronize after each kernel, not each copy out 2017-08-18 12:55:58 +00:00