forked from OSchip/llvm-project
45f2a56856
nvcc supports accessing file-scope static device variables in host code through host APIs such as cudaMemcpyToSymbol. CUDA/HIP let users access device variables in host code through shadow variables. In host compilation, clang emits a shadow variable for each device variable and calls __*RegisterVariable in the init function to register it. The address of the shadow variable and the device-side mangled name of the device variable are passed to __*RegisterVariable. The runtime looks up the symbol by name in the device binary to find the address of the device variable. The problem with static device variables is that they have internal linkage, so their names may be changed by the linker if there are multiple symbols with the same name. They also end up as local symbols in the ELF file, whereas the runtime only looks up global symbols. Another reason for giving static device variables external linkage is that they may be initialized externally by host code, and their final value may be read by host code after kernel execution, so they effectively behave as externally visible variables. Giving them internal linkage would cause incorrect optimizations on them. To support accessing static device variables in host code in -fno-gpu-rdc mode, change the internal linkage to external linkage. The name does not need to change, since there is only one TU in -fno-gpu-rdc mode. The externalization is done only if the static device variable is referenced by host code. Differential Revision: https://reviews.llvm.org/D80858 |
||
---|---|---|
Inputs | ||
address-spaces.cu | ||
alias.cu | ||
amdgpu-hip-implicit-kernarg.cu | ||
amdgpu-kernel-arg-pointer-type.cu | ||
amdgpu-kernel-attrs.cu | ||
amdgpu-visibility.cu | ||
amdgpu-workgroup-size.cu | ||
builtins-amdgcn.cu | ||
constexpr-variables.cu | ||
convergent.cu | ||
cuda-builtin-vars.cu | ||
debug-info-address-class.cu | ||
debug-info-template.cu | ||
deferred-diag.cu | ||
dependent-libs.cu | ||
device-init-fun.cu | ||
device-stub.cu | ||
device-var-init.cu | ||
device-vtable.cu | ||
filter-decl.cu | ||
flush-denormals.cu | ||
fp-contract.cu | ||
function-overload.cu | ||
kernel-amdgcn.cu | ||
kernel-args-alignment.cu | ||
kernel-args.cu | ||
kernel-call.cu | ||
kernel-dbg-info.cu | ||
kernel-stub-name.cu | ||
lambda.cu | ||
launch-bounds.cu | ||
library-builtin.cu | ||
link-device-bitcode.cu | ||
llvm-used.cu | ||
ms-linker-options.cu | ||
norecurse.cu | ||
nothrow.cu | ||
openmp-target.cu | ||
printf-aggregate.cu | ||
printf.cu | ||
propagate-metadata.cu | ||
ptx-kernels.cu | ||
static-device-var-no-rdc.cu | ||
surface.cu | ||
texture.cu | ||
types.cu | ||
unnamed-types.cu | ||
usual-deallocators.cu |
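
The commit message at the top of this listing describes host code accessing a file-scope static device variable. A minimal sketch of that usage pattern is given here. The variable `counter`, the kernel `add_one`, and the helper `check` are hypothetical names chosen for illustration; they are not taken from static-device-var-no-rdc.cu. The sketch assumes a single translation unit built without relocatable device code (the -fno-gpu-rdc mode the commit targets).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// File-scope static device variable. In the source language it has
// internal linkage; per the commit message, the compiler externalizes it
// on the device side only because host code references it below.
static __device__ int counter;

// Hypothetical kernel: a single thread increments the variable.
__global__ void add_one() { counter += 1; }

// Hypothetical helper: abort with a message if a runtime call failed.
static void check(cudaError_t err, const char *what) {
  if (err != cudaSuccess) {
    std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    std::exit(EXIT_FAILURE);
  }
}

int main() {
  int host_value = 41;

  // Host code names `counter` directly. What it actually passes is the
  // host-side shadow variable, which the runtime maps to the device
  // symbol registered under the same name.
  check(cudaMemcpyToSymbol(counter, &host_value, sizeof(host_value)),
        "cudaMemcpyToSymbol");

  add_one<<<1, 1>>>();
  check(cudaGetLastError(), "kernel launch");
  check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

  // Read the final value back after kernel execution.
  check(cudaMemcpyFromSymbol(&host_value, counter, sizeof(host_value)),
        "cudaMemcpyFromSymbol");

  std::printf("counter = %d\n", host_value);
  return 0;
}
```

If the device-side symbol stayed local, the name-based lookup described in the commit message would not find it, and the symbol copies would be expected to report an error through `check` instead of transferring data.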