llvm-project

Commit Graph

Author	SHA1	Message	Date
Artem Belevich	42960b4188	[NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins. Differential Revision: https://reviews.llvm.org/D38148 llvm-svn: 313898	2017-09-21 18:44:49 +00:00
Artem Belevich	4654dc89be	[NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins. Differential Revision: https://reviews.llvm.org/D38090 llvm-svn: 313820	2017-09-20 21:23:07 +00:00
Artem Belevich	fda9905062	[CUDA] added __nvvm_atom_{sys|cta}_* builtins. These builtins are available on sm_60+ GPU only. Differential Revision: https://reviews.llvm.org/D24944 llvm-svn: 282609	2016-09-28 17:47:35 +00:00
Justin Lebar	495f1a22af	[CUDA] Rename the __nvvm_bar0 builtin back to __syncthreads. The builtin was renamed in r274770. But __syncthreads is part of our user-facing API, so we need to keep the name as-is. Patch by Justin Bogner. llvm-svn: 274780	2016-07-07 18:15:03 +00:00
Justin Bogner	2d5de7e568	NVPTX: Use the nvvm builtins to read SRegs rather than the legacy ptx ones The ptx spellings were removed from LLVM in r274769. llvm-svn: 274770	2016-07-07 16:41:08 +00:00
Justin Lebar	2e4ecfdebe	[CUDA] Implement __ldg using intrinsics. Summary: Previously it was implemented as inline asm in the CUDA headers. This change allows us to use the [addr+imm] addressing mode when executing ld.global.nc instructions. This translates into a 1.3x speedup on some benchmarks that call this instruction from within an unrolled loop. Reviewers: tra, rsmith Subscribers: jhen, cfe-commits, jholewinski Differential Revision: http://reviews.llvm.org/D19990 llvm-svn: 270150	2016-05-19 22:49:13 +00:00
Justin Lebar	717d2b0a0d	[CUDA] Implement atomicInc and atomicDec builtins These functions cannot be implemented as atomicrmw or cmpxchg instructions, so they are implemented as a call to the NVVM intrinsics @llvm.nvvm.atomic.load.inc.32.p0i32 and @llvm.nvvm.atomic.load.dec.32.p0i32. Patch by Jason Henline. Reviewers: jlebar Differential Revision: http://reviews.llvm.org/D18322 llvm-svn: 264009	2016-03-22 00:09:28 +00:00
Jingyue Wu	f1eca25b16	[CUDA] fix codegen for __nvvm_atom_cas_* Summary: __nvvm_atom_cas_* returns the old value instead of whether the swap succeeds. Reviewers: eliben, tra Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D13306 llvm-svn: 248951	2015-09-30 21:49:32 +00:00
Artem Belevich	236cfdc4be	[CUDA] 32-bit NVPTX should have 32-bit long type. Currently it's 64-bit which will lead to mismatch between host and device code if we compile for i386. Differential Revision: http://reviews.llvm.org/D13181 llvm-svn: 248753	2015-09-28 22:54:08 +00:00
Jingyue Wu	2d69f9608e	[CUDA] fix codegen for __nvvm_atom_min/max_gen_u* Summary: Clang should emit "atomicrmw umin/umax" instead of "atomicrmw min/max". Reviewers: eliben, tra Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12487 llvm-svn: 246455	2015-08-31 17:25:51 +00:00
Artem Belevich	d21e5c6684	[CUDA] Implemented __nvvm_atom_*_gen_* builtins. Integer variants are implemented as atomicrmw or cmpxchg instructions. Atomic add for floating point (__nvvm_atom_add_gen_f()) is implemented as a call to an overloaded @llvm.nvvm.atomic.load.add.f32.* LLVM intrinsic. Differential Revision: http://reviews.llvm.org/D10666 llvm-svn: 240669	2015-06-25 18:29:42 +00:00
Justin Holewinski	6e9bfa344c	[NVPTX] Fix type error for some builtins in BuiltinsNVPTX.def llvm-svn: 223116	2014-12-02 12:58:24 +00:00
NAKAMURA Takumi	a1d1388a2b	clang/test/CodeGen/builtins-nvptx.c: Prune "REQUIRES: nvptx64-registered-target". "nvptx" should imply it. llvm-svn: 196348	2013-12-04 03:41:02 +00:00
Justin Holewinski	9f3bfeb3b6	[NVPTX] Add entire list of supported builtins llvm-svn: 182468	2013-05-22 12:58:29 +00:00
Justin Holewinski	c0cad046b6	[NVPTX] Add __nvvm_* intrinsics as Clang builtins Fixes bug 13354. llvm-svn: 167647	2012-11-09 23:50:51 +00:00
NAKAMURA Takumi	f1f6e99c53	Revert r166541, "clang/test: Add appropriate requirements as REQUIRES, corresponding to r166532." According to r166543, it is not needed for now. llvm-svn: 166544	2012-10-24 03:59:09 +00:00
NAKAMURA Takumi	a22fe582d2	clang/test: Add appropriate requirements as REQUIRES, corresponding to r166532. llvm-svn: 166541	2012-10-24 02:57:57 +00:00
Justin Holewinski	c05323dd5c	Un-XFAIL CodeGen/builtins-nvptx.c now that the proper changes have landed in LLVM core llvm-svn: 157418	2012-05-24 21:39:33 +00:00
John McCall	d65dbd8e6a	XFAIL this test, which does not pass on trunk since the grand renaming in r157403. llvm-svn: 157413	2012-05-24 20:58:21 +00:00
Justin Holewinski	83e9668133	Replace PTX back-end with NVPTX back-end in all places where Clang cares NV_CONTRIB llvm-svn: 157403	2012-05-24 17:43:12 +00:00

20 Commits