llvm-project

Commit Graph

Author	SHA1	Message	Date
Austin Kerbow	2db700215a	[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If there is a need to have highly optimized codegen and kernel developers have knowledge of inter-wave runtime behavior which is unknown to the compiler this builtin can be used to tune scheduling. This intrinsic creates a barrier between scheduling regions. The immediate parameter is a mask to determine the types of instructions that should be prevented from crossing the sched_barrier. In this initial patch, there are only two variations. A mask of 0 means that no instructions may be scheduled across the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing instructions may cross the sched_barrier. Note that this intrinsic is only meant to work with the scheduling passes. Any other transformations that may move code will not be impacted in the ways described above. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D124700	2022-05-11 13:22:51 -07:00
Austin Kerbow	62bcfcb5a5	[AMDGPU] Add llvm.amdgcn.s.setprio intrinsic Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D120976	2022-03-12 22:15:42 -08:00
Yaxun (Sam) Liu	25942d7c49	[AMDGPU] Allow relaxed/consume memory order for atomic inc/dec Reviewed by: Jon Chesterfield Differential Revision: https://reviews.llvm.org/D100144	2021-04-09 09:23:41 -04:00
Saiyedul Islam	0882c9d4fc	[AMDGPU] Change Clang AMDGCN atomic inc/dec builtins to take unsigned values __builtin_amdgcn_atomic_inc32(uint *Ptr, uint Val, unsigned MemoryOrdering, const char *SyncScope) __builtin_amdgcn_atomic_inc64(uint64_t *Ptr, uint64_t Val, unsigned MemoryOrdering, const char *SyncScope) __builtin_amdgcn_atomic_dec32(uint *Ptr, uint Val, unsigned MemoryOrdering, const char *SyncScope) __builtin_amdgcn_atomic_dec64(uint64_t *Ptr, uint64_t Val, unsigned MemoryOrdering, const char *SyncScope) As the AMDGCN IR intrinsic for atomic inc/dec does an unsigned comparison, these Clang builtins should also take unsigned types instead of signed int types. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D83121	2020-07-07 06:36:25 +00:00
Saiyedul Islam	675cefbf60	[AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics Summary: __builtin_amdgcn_atomic_inc32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope) __builtin_amdgcn_atomic_inc64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope) __builtin_amdgcn_atomic_dec32(int *Ptr, int Val, unsigned MemoryOrdering, const char *SyncScope) __builtin_amdgcn_atomic_dec64(int64_t *Ptr, int64_t Val, unsigned MemoryOrdering, const char *SyncScope) First and second arguments get transparently passed to the amdgcn atomic inc/dec intrinsic. Fifth argument of the intrinsic is set as true if the first argument of the builtin is a volatile pointer. The third argument of this builtin is one of the memory-ordering specifiers __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST following C++11 memory model semantics. This is mapped to corresponding LLVM atomic memory ordering for the atomic inc/dec instruction using CLANG atomic C ABI. The fourth argument is an AMDGPU-specific synchronization scope defined as string. Reviewers: arsenm, sameerds, JonChesterfield, jdoerfert Reviewed By: arsenm, sameerds Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80804	2020-06-09 17:02:58 +00:00
Matt Arsenault	97f3f0bab0	AMDGPU: Add intrinsic for s_setreg This will be more useful with fenv access implemented.	2020-05-28 14:26:38 -04:00
Saiyedul Islam	06bdffb2bb	[AMDGPU] Expose llvm fence instruction as clang intrinsic Expose llvm fence instruction as clang builtin for AMDGPU target __builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope) The first argument of this builtin is one of the memory-ordering specifiers __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST following C++11 memory model semantics. This is mapped to corresponding LLVM atomic memory ordering for the fence instruction using LLVM atomic C ABI. The second argument is an AMDGPU-specific synchronization scope defined as string. Reviewed By: sameerds Differential Revision: https://reviews.llvm.org/D75917	2020-04-27 09:39:03 +05:30
Yaxun Liu	aae1e87f4b	AMDGPU: add __builtin_amdgcn_update_dpp Emit llvm.amdgcn.update.dpp for both __builtin_amdgcn_mov_dpp and __builtin_amdgcn_update_dpp. The first argument to llvm.amdgcn.update.dpp will be undef for __builtin_amdgcn_mov_dpp. Differential Revision: https://reviews.llvm.org/D52320 llvm-svn: 344665	2018-10-17 02:32:26 +00:00
Daniil Fukalov	1b14a3ad3d	[AMDGPU] fixes for lds f32 builtins 1. added restrictions to memory scope, order and volatile parameters 2. added custom processing for these builtins - this code is currently unused, but is needed to switch off the GCCBuiltin link to the builtins (ongoing change to llvm tree) 3. builtins renamed as requested Differential Revision: https://reviews.llvm.org/D43281 llvm-svn: 332848	2018-05-21 16:18:07 +00:00
Yaxun Liu	4d86799219	[AMDGPU] Add builtin functions readlane ds_permute mov_dpp Differential Revision: https://reviews.llvm.org/D30551 llvm-svn: 297436	2017-03-10 01:30:46 +00:00
Jan Vesely	9488560bb8	AMDGPU: export s_sendmsg{halt} intrinsics Differential Revision: https://reviews.llvm.org/D30366 llvm-svn: 296241	2017-02-25 04:20:24 +00:00
Jan Vesely	d26dbb389f	AMDGPU: export s_waitcnt builtin Differential Revision: https://reviews.llvm.org/D30359 llvm-svn: 296239	2017-02-25 04:20:20 +00:00
Matt Arsenault	24b5ae4497	AMDGPU: Add builtin for getreg intrinsic llvm-svn: 292636	2017-01-20 19:24:22 +00:00
Konstantin Zhuravlyov	81a78bb864	[AMDGPU] Add f16 builtin functions (VI+) Differential Revision: https://reviews.llvm.org/D26476 llvm-svn: 286741	2016-11-13 02:37:05 +00:00

14 Commits