Commit Graph

5 Commits

Author SHA1 Message Date
Yaxun Liu 9767089d00 [HIP] Support early finalization of device code for -fno-gpu-rdc
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.

The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.

This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.

Differential Revision: https://reviews.llvm.org/D52377

llvm-svn: 343611
2018-10-02 17:48:54 +00:00
Justin Lebar 5489f85fda [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.
Summary:
Previously this triggered a -Wundefined-internal warning.  But it's not
an undefined variable -- any variable of this form is a pointer to the
base of GPU core's shared memory.

Reviewers: tra

Subscribers: sanjoy, rsmith

Differential Revision: https://reviews.llvm.org/D46782

llvm-svn: 332621
2018-05-17 16:15:07 +00:00
Jonas Hahnfeld ee47d8cb96 [CUDA] Allow external variables in separate compilation
According to the CUDA Programming Guide this is prohibited in
whole program compilation mode. This makes sense because external
references cannot be satisfied in that mode anyway. However,
such variables are allowed in separate compilation mode which
is a valid use case.

Differential Revision: https://reviews.llvm.org/D42923

llvm-svn: 325136
2018-02-14 16:04:03 +00:00
Justin Lebar 281ce2af17 [CUDA] Allow extern __shared__ on empty-length arrays.
"extern __shared__ int x[]" is OK.

llvm-svn: 283068
2016-10-02 15:24:50 +00:00
Justin Lebar 1041101953 [CUDA] Disallow 'extern __shared__' variables.
Also add a test that we disallow

  __constant__ __shared__ int x;

because it's possible to break this without breaking

  __shared__ __constant__ int x;

Reviewers: rnk

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D25125

llvm-svn: 282985
2016-09-30 23:57:30 +00:00