llvm-project

Commit Graph

Author	SHA1	Message	Date
Justin Lebar	27ee130e38	[CUDA] Give templated device functions internal linkage, templated kernels external linkage. Summary: This lets LLVM perform IPO over these functions. In particular, it allows LLVM to emit ld.global.nc for loads to __restrict pointers in kernels that are never written to. Reviewers: rsmith Subscribers: cfe-commits, tra Differential Revision: http://reviews.llvm.org/D21337 llvm-svn: 274261	2016-06-30 18:41:33 +00:00
Artem Belevich	ca2b951cbc	[CUDA] Make sure device-side __global__ functions are always visible. __global__ functions are a special case in CUDA. Even when the symbol would normally not be externally visible according to C++ rules, they still must be visible in the CUDA GPU object so the host-side stub can launch them. Differential Revision: http://reviews.llvm.org/D19748 llvm-svn: 268299	2016-05-02 20:30:03 +00:00
Artem Belevich	7b41f70e6c	[CUDA] __global__ functions should always be visible externally. Adjust __global__ functions with DiscardableODR linkage to use StrongODR linkage instead, so they are visible externally. Differential Revision: http://reviews.llvm.org/D13067 llvm-svn: 248400	2015-09-23 17:44:53 +00:00
Artem Belevich	c3fa25def7	[CUDA] Add implicit __attribute__((used)) to all __global__ functions. This makes sure that we emit kernels that were instantiated from the host code and which would never be explicitly referenced by anything else on device side. Differential Revision: http://reviews.llvm.org/D11666 llvm-svn: 248293	2015-09-22 17:22:51 +00:00
Daniel Jasper	3b0f87d289	Revert "[CUDA] Add implicit __attribute__((used)) to all __global__ functions." This is breaking an internal test. I'll provide a reproduction. llvm-svn: 244583	2015-08-11 11:02:09 +00:00
Artem Belevich	b7e4aab40c	[CUDA] Add implicit __attribute__((used)) to all __global__ functions. This allows emitting kernels that were instantiated from the host code and which would never be explicitly referenced otherwise. Differential Revision: http://reviews.llvm.org/D11666 llvm-svn: 244501	2015-08-10 20:57:02 +00:00
Duncan P. N. Exon Smith	b3a66691f8	IR: Make metadata typeless in assembly, clang side. Match LLVM changes from r224257. llvm-svn: 224259	2014-12-15 19:10:08 +00:00
Eli Bendersky	3468d9d929	Move all CUDA testing inputs to Inputs/ subdirectory inside the tests. llvm-svn: 207453	2014-04-28 22:21:28 +00:00
Stephen Lin	4362261b00	CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail. llvm-svn: 188447	2013-08-15 06:47:53 +00:00
Justin Holewinski	368374308d	Use kernel metadata to differentiate between kernel and device functions for the NVPTX target. llvm-svn: 178418	2013-03-30 14:38:24 +00:00
Justin Holewinski	83e9668133	Replace PTX back-end with NVPTX back-end in all places where Clang cares NV_CONTRIB llvm-svn: 157403	2012-05-24 17:43:12 +00:00
Peter Collingbourne	a9455ec9f8	CUDA: add -fcuda-is-device flag. This frontend-only flag is used by the IR generator to determine whether to filter CUDA declarations for the host or for the device. llvm-svn: 141301	2011-10-06 18:29:46 +00:00
Peter Collingbourne	5bad4afa2f	CUDA: set proper calling conventions for PTX llvm-svn: 141296	2011-10-06 16:49:54 +00:00

13 Commits