llvm-project

Joseph Huber 244e98ff48 [Libomptarget] Improve device runtime implementation for globalized variables. Currently the runtime implementation of `__kmpc_alloc_shared` is extremely slow because it allocates memory for each thread individually. This patch adds a small buffer for the threads to share data and will greatly improve performance for builds where all globalization could not be optimized out. If the shared buffer is full, then memory will be allocated per-warp rather than per-thread. Depends on D97680 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D104666		2021-06-22 11:52:49 -04:00
..
include/target	[OpenMP][FIX] Repair accidental replacement of _shfl_sync with _shfl	2021-03-15 22:46:00 -05:00
src	[Libomptarget] Improve device runtime implementation for globalized variables.	2021-06-22 11:52:49 -04:00
allocator.h	[OpenMP][deviceRTLs] Build the deviceRTLs with OpenMP instead of target dependent language	2021-01-26 12:28:47 -05:00
debug.h	[OpenMP][deviceRTLs] Drop `assert` in common parts of `deviceRTLs`	2021-02-04 12:39:43 -05:00
device_environment.h	[libomptarget][nfc] Drop unused DEVICE macro	2021-03-15 20:12:50 +00:00
generated_microtask_cases.gen	[OpenMP] Simplify offloading parallel call codegen	2021-04-21 18:46:07 -07:00
omptarget.h	[Libomptarget] Improve device runtime implementation for globalized variables.	2021-06-22 11:52:49 -04:00
omptargeti.h	[OpenMP][NFC] clang-format the whole openmp project	2021-02-20 12:46:32 -05:00
state-queue.h	…
state-queuei.h	[OpenMP][NFC] clang-format the whole openmp project	2021-02-20 12:46:32 -05:00
support.h	[OpenMP] Simplify offloading parallel call codegen	2021-04-21 18:46:07 -07:00