llvm-project/openmp/libomptarget/test
Shilei Tian a014fbbc21 [OpenMP] Improve D2D memcpy to use more efficient driver API
Summary:
In the current implementation, a D2D memcpy first copies the data back to the
host and then copies it from the host to the device. This is very inefficient if
the device supports D2D memcpy natively, as CUDA does.

In this patch, D2D memcpy first tries to use the natively supported driver API.
If that fails, it falls back to the original approach. It is worth noting that
D2D memcpy in this scenario covers two cases:
- Same devices: this is the D2D memcpy in the CUDA context.
- Different devices: this is the PeerToPeer memcpy in the CUDA context.
My implementation merges these two cases. It chooses the best API according to
the source device and destination device.
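
The try-native-then-fall-back strategy described above can be sketched roughly
as follows. This is a minimal illustration, not the code from the patch:
`MockDevice`, `CopyPath`, `try_native_copy` and `device_to_device_copy` are
hypothetical names, devices are modelled as plain host buffers, and bounds
checking is left out. In a real plugin the native path would be a driver call
rather than a `memmove`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device: a byte buffer plus a capability flag.
struct MockDevice {
  int id;
  bool supports_native_d2d; // assumption: the plugin can report this
  std::vector<unsigned char> memory;
};

// Which route the copy ended up taking.
enum class CopyPath { NativeSameDevice, NativePeerToPeer, HostStaged };

// Hypothetical native copy. Returns false when either side lacks support,
// which is the signal for the caller to fall back.
bool try_native_copy(MockDevice &src, std::size_t src_off, MockDevice &dst,
                     std::size_t dst_off, std::size_t size) {
  if (!src.supports_native_d2d || !dst.supports_native_d2d)
    return false;
  // memmove tolerates overlapping ranges, which can occur on the same device.
  std::memmove(dst.memory.data() + dst_off, src.memory.data() + src_off, size);
  return true;
}

// Try the native path first; if it fails, stage the data through the host.
CopyPath device_to_device_copy(MockDevice &src, std::size_t src_off,
                               MockDevice &dst, std::size_t dst_off,
                               std::size_t size) {
  if (try_native_copy(src, src_off, dst, dst_off, size))
    return src.id == dst.id ? CopyPath::NativeSameDevice
                            : CopyPath::NativePeerToPeer;

  // Fallback: device -> host staging buffer -> device.
  std::vector<unsigned char> staging(src.memory.begin() + src_off,
                                     src.memory.begin() + src_off + size);
  std::copy(staging.begin(), staging.end(), dst.memory.begin() + dst_off);
  return CopyPath::HostStaged;
}
```

The caller never chooses between the same-device and peer-to-peer cases; the
helper decides from the two device IDs, mirroring the merged handling the
summary describes.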

Reviewers: jdoerfert, AndreyChurbanov, grokos

Reviewed By: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D80649
2020-06-04 16:59:06 -04:00
..
api [OpenMP][Offloading] Fix the issue that omp_get_num_devices returns wrong number of devices, by Shiley Tian. 2020-01-21 13:25:18 -05:00
env
mapping [OpenMP] target_data_begin: fail on device alloc fail 2020-04-21 17:10:50 -04:00
offloading [OpenMP] Improve D2D memcpy to use more efficient driver API 2020-06-04 16:59:06 -04:00
unified_shared_memory [OpenMP][libomptarget] Add support for close map modifier 2019-08-09 21:32:57 +00:00
CMakeLists.txt [OpenMP] build offload plugins before testing them 2019-11-28 17:43:56 -05:00
lit.cfg [OpenMP][NFC] Fix `not` substitution in tests 2020-05-11 14:53:48 -04:00
lit.site.cfg.in [OpenMP] Add scaffolding for negative runtime tests 2020-04-21 17:10:50 -04:00