llvm-project/openmp/libomptarget/deviceRTLs
Jon Chesterfield 221ada654b [libomptarget] Implement locks for amdgcn
Summary:

The nvptx implementation deadlocks on amdgcn. Calling atomic_cas from multiple
active lanes can deadlock: if one lane succeeds, all the others are locked
out. The set_lock implementation therefore runs on a single lane.
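
The single-lane idea can be sketched on the host as below. This is an illustration, not the deviceRTL code: the names `lowest_active_lane`, `simulate_wavefront_try_acquire`, `kLockUnset` and `kLockSet` are hypothetical, the 64-bit active-lane mask is passed in by hand instead of being read from hardware, and electing the lowest active lane is an assumed policy. The sketch makes one compare-and-swap attempt per call rather than spinning, so it cannot hang in a single-threaded simulation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical lock states, for illustration only.
constexpr uint32_t kLockUnset = 0;
constexpr uint32_t kLockSet = 1;

// Index of the lowest set bit in a non-zero active-lane mask.
inline int lowest_active_lane(uint64_t mask) {
  assert(mask != 0 && "at least one lane must be active");
  int lane = 0;
  while ((mask & 1u) == 0) {
    mask >>= 1;
    ++lane;
  }
  return lane;
}

struct AcquireResult {
  int cas_attempts; // how many lanes issued a compare-and-swap
  bool acquired;    // whether the wavefront now holds the lock
};

// Simulates one wavefront making a single attempt to take the lock.
// Every active lane reaches this code, but only the elected lane
// touches the lock word, so lanes never compete with each other.
inline AcquireResult simulate_wavefront_try_acquire(std::atomic<uint32_t> &lock,
                                                    uint64_t active_mask) {
  AcquireResult result{0, false};
  if (active_mask == 0)
    return result;
  const int elected = lowest_active_lane(active_mask);
  for (int lane = 0; lane < 64; ++lane) {
    const bool active = ((active_mask >> lane) & 1u) != 0;
    if (!active || lane != elected)
      continue; // non-elected lanes skip the atomic entirely
    uint32_t expected = kLockUnset;
    ++result.cas_attempts;
    result.acquired = lock.compare_exchange_strong(expected, kLockSet);
  }
  return result;
}
```

With any non-zero mask, exactly one compare-and-swap is issued per attempt, regardless of how many lanes are active.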

Also uses a sleep intrinsic instead of the system clock, which is probably a
minor performance improvement. The unset/test implementations may be revised
later, based on code size, performance or similar concerns.
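
A host-side analogue of the spin-and-back-off loop might look like the following. It is a minimal sketch under stated assumptions: `SketchLock`, `sketch_set_lock`, `sketch_unset_lock` and `contended_increments` are hypothetical names, `std::this_thread::yield()` stands in for the device sleep intrinsic, and each `std::thread` plays the role of the one lane per wavefront that takes the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical lock states, for illustration only.
constexpr uint32_t kUnset = 0;
constexpr uint32_t kSet = 1;

struct SketchLock {
  std::atomic<uint32_t> state{kUnset};
};

// Spin on compare-and-swap, backing off between failed attempts.
inline void sketch_set_lock(SketchLock &lock) {
  uint32_t expected = kUnset;
  while (!lock.state.compare_exchange_weak(expected, kSet,
                                           std::memory_order_acquire,
                                           std::memory_order_relaxed)) {
    expected = kUnset;         // a failed CAS overwrites 'expected'
    std::this_thread::yield(); // stand-in for the device sleep intrinsic
  }
}

inline void sketch_unset_lock(SketchLock &lock) {
  lock.state.store(kUnset, std::memory_order_release);
}

// Each thread models the single lane that acquires the lock on behalf
// of its wavefront. Returns the final value of the protected counter.
inline long contended_increments(int waves, int iters) {
  assert(waves > 0 && iters >= 0);
  SketchLock lock;
  long counter = 0; // protected by 'lock'
  std::vector<std::thread> workers;
  for (int w = 0; w < waves; ++w) {
    workers.emplace_back([&lock, &counter, iters] {
      for (int i = 0; i < iters; ++i) {
        sketch_set_lock(lock);
        ++counter;
        sketch_unset_lock(lock);
      }
    });
  }
  for (std::thread &t : workers)
    t.join();
  return counter;
}
```

If the lock provides mutual exclusion, `contended_increments(waves, iters)` returns `waves * iters`.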

This implements the lock at a per-wavefront scope. That's not strictly as
specified, since OpenMP describes locks in terms of threads. I think the
nvptx implementation provides true per-thread locking on Volta and the same
per-warp locking on other architectures.

Reviewers: jdoerfert, ABataev, grokos

Reviewed By: jdoerfert

Subscribers: jvesely, mgorny, jfb, openmp-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D75546
2020-03-05 20:25:31 +00:00
..
amdgcn [libomptarget] Implement locks for amdgcn 2020-03-05 20:25:31 +00:00
common [libomptarget][nfc] Move GetWarp/LaneId functions into per arch code 2020-03-05 17:05:58 +00:00
nvptx [libomptarget][nfc] Move GetWarp/LaneId functions into per arch code 2020-03-05 17:05:58 +00:00
CMakeLists.txt [libomptarget] Build a minimal deviceRTL for amdgcn 2019-12-04 16:43:37 +00:00
interface.h [libomptarget][nfc] Move omp locks under target_impl 2019-12-17 12:18:57 +00:00