Commit Graph

13 Commits

Author SHA1 Message Date
Justin Lebar 27ee130e38 [CUDA] Give templated device functions internal linkage, templated kernels external linkage.
Summary:
This lets LLVM perform IPO over these functions.  In particular, it
allows LLVM to emit ld.global.nc for loads to __restrict pointers in
kernels that are never written to.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: http://reviews.llvm.org/D21337

llvm-svn: 274261
2016-06-30 18:41:33 +00:00
Artem Belevich ca2b951cbc [CUDA] Make sure device-side __global__ functions are always visible.
__global__ functions are a special case in CUDA.

Even when the symbol would normally not be externally
visible according to C++ rules, they still must be visible
in CUDA GPU object so host-side stub can launch them.

Differential Revision: http://reviews.llvm.org/D19748

llvm-svn: 268299
2016-05-02 20:30:03 +00:00
Artem Belevich 7b41f70e6c [CUDA] __global__ functions should always be visible externally.
Adjust __global__ functions with DiscardableODR linkage to use
StrongODR linkage instead, so they are visible externally.

Differential Revision: http://reviews.llvm.org/D13067

llvm-svn: 248400
2015-09-23 17:44:53 +00:00
Artem Belevich c3fa25def7 [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
This makes sure that we emit kernels that were instantiated from the
host code and which would never be explicitly referenced by anything
else on device side.

Differential Revision: http://reviews.llvm.org/D11666

llvm-svn: 248293
2015-09-22 17:22:51 +00:00
Daniel Jasper 3b0f87d289 Revert "[CUDA] Add implicit __attribute__((used)) to all __global__ functions."
This is breaking internal test. I'll provide a reproduction.

llvm-svn: 244583
2015-08-11 11:02:09 +00:00
Artem Belevich b7e4aab40c [CUDA] Add implicit __attribute__((used)) to all __global__ functions.
This allows emitting kernels that were instantiated from the host code
and which would never be explicitly referenced otherwise.

Differential Revision: http://reviews.llvm.org/D11666

llvm-svn: 244501
2015-08-10 20:57:02 +00:00
Duncan P. N. Exon Smith b3a66691f8 IR: Make metadata typeless in assembly, clang side
Match LLVM changes from r224257.

llvm-svn: 224259
2014-12-15 19:10:08 +00:00
Eli Bendersky 3468d9d929 Move all CUDA testing inputs to Inputs/ subdirectory inside the tests.
llvm-svn: 207453
2014-04-28 22:21:28 +00:00
Stephen Lin 4362261b00 CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail.
llvm-svn: 188447
2013-08-15 06:47:53 +00:00
Justin Holewinski 368374308d Use kernel metadata to differentiate between kernel and device
functions for the NVPTX target.

llvm-svn: 178418
2013-03-30 14:38:24 +00:00
Justin Holewinski 83e9668133 Replace PTX back-end with NVPTX back-end in all places where Clang cares
NV_CONTRIB

llvm-svn: 157403
2012-05-24 17:43:12 +00:00
Peter Collingbourne a9455ec9f8 CUDA: add -fcuda-is-device flag
This frontend-only flag is used by the IR generator to determine
whether to filter CUDA declarations for the host or for the device.

llvm-svn: 141301
2011-10-06 18:29:46 +00:00
Peter Collingbourne 5bad4afa2f CUDA: set proper calling conventions for PTX
llvm-svn: 141296
2011-10-06 16:49:54 +00:00