llvm-project/parallel-libs/streamexecutor/lib
Jason Henline 57ea481945 [SE] RegisteredHostMemory for async device copies
Summary:
Improve the error-prone interface that allows users to pass host
pointers that haven't been registered to asynchronous copy methods. In
CUDA, this is an extremely easy error to make, and instead of failing at
runtime, it succeeds and gives the right answers by turning the async
copy into a sync copy. So, you silently get a huge performance
degradation if you misuse the old interface. This new interface should
prevent that.
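
The idea in the summary above can be sketched as a type-level guard. This is an illustration only, not the StreamExecutor interface: `RegisteredHostSpan`, `FakeDevice`, `registerHostMemory`, `copyH2DAsync`, and `roundTripExample` are hypothetical names, and the "device" is an ordinary `std::vector`. The assumption is that the async copy accepts only a handle that can be obtained through registration, so passing an unregistered raw host pointer becomes a compile error instead of a silent fallback to a synchronous copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch; none of these names come from StreamExecutor.
class FakeDevice;

// Handle proving that a host range went through registration.
// The constructor is private, so only FakeDevice can create one.
template <typename T> class RegisteredHostSpan {
public:
  T *data() const { return Ptr; }
  std::size_t size() const { return Count; }

private:
  friend class FakeDevice;
  RegisteredHostSpan(T *P, std::size_t N) : Ptr(P), Count(N) {}
  T *Ptr;
  std::size_t Count;
};

class FakeDevice {
public:
  // A real platform would pin (page-lock) the host range here.
  template <typename T>
  RegisteredHostSpan<T> registerHostMemory(T *P, std::size_t N) {
    ++RegisteredRanges;
    return RegisteredHostSpan<T>(P, N);
  }

  // Stand-in for an asynchronous host-to-device copy. It takes only a
  // RegisteredHostSpan, so a raw T* argument does not compile.
  template <typename T>
  bool copyH2DAsync(const RegisteredHostSpan<T> &Src, std::vector<T> &Dst) {
    assert(Src.data() != nullptr);
    if (Dst.size() < Src.size())
      return false; // destination too small
    std::copy(Src.data(), Src.data() + Src.size(), Dst.begin());
    return true;
  }

  std::size_t RegisteredRanges = 0;
};

// Usage: register first, then copy through the handle.
bool roundTripExample() {
  FakeDevice Dev;
  std::vector<int> Host = {1, 2, 3, 4};
  std::vector<int> DeviceBuf(Host.size());
  auto Registered = Dev.registerHostMemory(Host.data(), Host.size());
  // Dev.copyH2DAsync(Host.data(), DeviceBuf); // would be rejected at compile time
  bool Ok = Dev.copyH2DAsync(Registered, DeviceBuf);
  return Ok && DeviceBuf == Host && Dev.RegisteredRanges == 1;
}
```

The design choice illustrated here is to move the check from runtime behavior to the type system: the misuse described in the summary cannot be written at all, so there is no slow path to fall into unnoticed.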

Reviewers: jlebar

Subscribers: jprice, beanz, parallel_libs-commits

Differential Revision: https://reviews.llvm.org/D24353

llvm-svn: 281225
2016-09-12 16:09:41 +00:00
CMakeLists.txt [SE] RegisteredHostMemory for async device copies 2016-09-12 16:09:41 +00:00
Device.cpp [SE] Rename PlatformInterfaces to PlatformDevice 2016-09-06 19:27:00 +00:00
DeviceMemory.cpp [SE] GlobalDeviceMemory owns its handle 2016-09-02 17:22:42 +00:00
Error.cpp [SE] Remove Utils directory 2016-09-09 23:33:58 +00:00
HostMemory.cpp [SE] RegisteredHostMemory for async device copies 2016-09-12 16:09:41 +00:00
Kernel.cpp [SE] Rename PlatformInterfaces to PlatformDevice 2016-09-06 19:27:00 +00:00
KernelSpec.cpp
PackedKernelArgumentArray.cpp [StreamExecutor] Add basic Stream operations 2016-08-16 17:58:31 +00:00
Platform.cpp [StreamExecutor] Add Platform and PlatformManager 2016-08-25 21:33:07 +00:00
PlatformDevice.cpp [SE] Rename PlatformInterfaces to PlatformDevice 2016-09-06 19:27:00 +00:00
PlatformManager.cpp [StreamExecutor] Add Platform and PlatformManager 2016-08-25 21:33:07 +00:00
Stream.cpp [SE] Remove Platform*Handle classes 2016-09-06 17:07:22 +00:00