llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	98575708da	[CUDA][HIP] Fix device template variables Currently clang does not emit device template variables instantiated only in host functions, however, nvcc is able to do that: https://godbolt.org/z/fneEfferY This patch fixes this issue by refactoring and extending the existing mechanism for emitting static device var ODR-used by host only. Basically clang records device variables ODR-used by host code and forces them to be emitted in device compilation. The existing mechanism makes sure these device variables ODR-used by host code are added to llvm.compiler-used, therefore they are guaranteed not to be deleted. It also fixes non-ODR-use of static device variable by host code causing static device variable to be emitted and registered, which should not happen. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D102237	2021-05-12 11:13:29 -04:00
Fangrui Song	fd739804e0	[test] Add {{.}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences For a default visibility external linkage definition, dso_local is set for ELF -fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences. To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition, which sets dso_local for default visibility external linkage definitions. To make this flip smooth and enable future (dso_local as definition default), this patch replaces (function) `define ` with `define{{.}} `, (variable/constant/alias) `= ` with `={{.}} `, or inserts appropriate `{{.}} `.	2020-12-31 00:27:11 -08:00
Artem Belevich	be86b6773b	[CUDA] Allow local static variables with target attributes. While CUDA documentation claims that such variables are not allowed[1], NVCC has been accepting them since CUDA-10.0[2] and some headers in CUDA-11 rely on this working. 1. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#static-variables-function 2. https://godbolt.org/z/zsodzc Differential Revision: https://reviews.llvm.org/D88345	2020-11-03 10:30:38 -08:00
Artem Belevich	0a3ebb4d8d	Revert "[CUDA] Allow local static variables with target attributes." This reverts commit `f38a9e5117`, which triggered assertions.	2020-11-02 15:09:07 -08:00
Artem Belevich	f38a9e5117	[CUDA] Allow local static variables with target attributes. While CUDA documentation claims that such variables are not allowed[1], NVCC has been accepting them since CUDA-10.0[2] and some headers in CUDA-11 rely on this working. 1. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#static-variables-function 2. https://godbolt.org/z/zsodzc Differential Revision: https://reviews.llvm.org/D88345	2020-11-02 14:37:13 -08:00
Yaxun (Sam) Liu	301e23305d	[CUDA][HIP] Fix static device var used by host code only A static device variable may be accessed in host code through cudaMemcpyFromSymbol etc. Currently clang does not emit the static device variable if it is only referenced by host code, which causes host code to fail at run time. This patch fixes that. Differential Revision: https://reviews.llvm.org/D88115	2020-09-23 08:18:19 -04:00
Yaxun (Sam) Liu	fb04d7b4a6	[CUDA][HIP] Do not externalize implicit constant static variable Differential Revision: https://reviews.llvm.org/D85686	2020-08-10 19:02:49 -04:00
Yaxun (Sam) Liu	45f2a56856	[CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc nvcc supports accessing file-scope static device variables in host code by host APIs like cudaMemcpyToSymbol etc. CUDA/HIP let users access device variables in host code by shadow variables. In host compilation, clang emits a shadow variable for each device variable, and calls __RegisterVariable to register it in init function. The address of the shadow variable and the device side mangled name of the device variable are passed to __RegisterVariable. Runtime looks up the symbol by name in the device binary to find the address of the device variable. The problem with static device variables is that they have internal linkage, therefore their name may be changed by the linker if there are multiple symbols with the same name. Also they end up as local symbols in the ELF file, whereas the runtime only looks up the global symbols. Another reason for making the static device variables external linkage is that they may be initialized externally by host code and their final value may be accessed by host code after kernel execution, therefore they actually have external linkage. Giving them internal linkage will cause incorrect optimizations on them. To support accessing static device var in host code for -fno-gpu-rdc mode, change the internal linkage to external linkage. The name does not need to change since there is only one TU for -fno-gpu-rdc mode. Also the externalization is done only if the device static var is referenced by host code. Differential Revision: https://reviews.llvm.org/D80858	2020-08-05 07:57:38 -04:00

8 Commits