Commit Graph

1 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu abd8cd9199 [CUDA][HIP] Fix linkage for -fgpu-rdc
Currently for explicit template function instantiation in CUDA/HIP device
compilation clang emits instantiated kernel with external linkage
and instantiated device function with internal linkage.

This is fine for -fno-gpu-rdc since there is only one TU.

However this causes duplicate symbols for kernels for -fgpu-rdc if
the same instantiation happen in multiple TU. Or missing symbols
if a device function calls an explicitly instantiated template function
in a different TU.

To make explicit template function instantiation work for
-fgpu-rdc we need to follow the C++ linkage paradigm, i.e.
use weak_odr linkage.

Differential Revision: https://reviews.llvm.org/D90311
2020-11-03 08:07:19 -05:00