Summary:
According to CUDA documentation, global variables declared with __device__,
__constant__ can be initialized from host code, so mark them as
externally initialized. Because __shared__ variables cannot have an
initialization as part of their declaration and since the value maybe kept
across different kernel invocation, the value of __shared__ is effectively
undefined instead of zero initialized.
Wrongly using zero initializer may cause illegitimate optimization, e.g.
removing unused __constant__ variable because it's not updated in the device
code and the value is initialized with zero.
Test Plan: test/CodeGenCUDA/address-spaces.cu
Patch by Xuetian Weng
Reviewers: jholewinski, eliben, tra, jingyue
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12241
llvm-svn: 245786
Summary:
Due to CUDA's implicit address space casting, the type of a static local
variable may be more specific (i.e. with address space qualifiers) than
the type expected by the constructor. Emit an addrspacecast in that
case.
Test Plan: Clang used to crash on the added test.
Reviewers: nlewycky, pcc, eliben, rsmith
Reviewed By: eliben, rsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8575
llvm-svn: 233208
variables without a storage class within a function, to implement
CUDA B.2.5: "__shared__ and __constant__ variables have implied static
storage [duration]."
llvm-svn: 162788
Because in CUDA types do not have associated address spaces,
globals are declared in their "native" address space, and accessed
by bitcasting the pointer to address space 0. This relies on address
space 0 being a unified address space.
llvm-svn: 157167