llvm-project

Commit Graph

Author	SHA1	Message	Date
Jonas Hahnfeld	3b9cbba9a8	[CUDA] Fix emission of constant strings in sections CGM.GetAddrOfConstantCString() sets the address of the created GlobalValue to unnamed. When emitting the object file LLVM will mark the surrounding section as SHF_MERGE iff the string is nul-terminated and contains no other nuls (see IsNullTerminatedString). This results in problems when saving temporaries because LLVM doesn't set an EntrySize, so reading in the serialized assembly file fails. This never happened for the GPU binaries because they usually contain a nul-character somewhere. Instead this only affected the module ID when compiling relocatable device code. However, this points to a potentially larger problem: If we put a constant string into a named section, we really want the data to end up in that section in the object file. To avoid LLVM merging sections this patch unmarks the GlobalVariable's address as unnamed which also fixes the problem of invalid serialized assembly files when saving temporaries. Differential Revision: https://reviews.llvm.org/D47902 llvm-svn: 334281	2018-06-08 11:17:08 +00:00
Yaxun Liu	29155b01c1	[HIP] Support offloading by linker script To support linking device code in different source files, it is necessary to embed the fat binary at the host linking stage. This patch emits an external symbol for the fat binary in host codegen, then embeds the fat binary with lld through a linker script. Differential Revision: https://reviews.llvm.org/D46472 llvm-svn: 332724	2018-05-18 15:07:56 +00:00
Yaxun Liu	887c569bcb	[HIP] Add hip input kind and codegen for kernel launching HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md). The language syntax is very similar, which allows a hip program to be compiled as a CUDA program by Clang. The main difference is the host API. HIP has a vendor-neutral host API that can be implemented on different platforms. Currently there is an open source implementation of the HIP runtime for the amdgpu target (https://github.com/ROCm-Developer-Tools/HIP). This patch adds support for the hip input kind and language standard. When a hip file is compiled, both LangOpts.CUDA and LangOpts.HIP are turned on. This allows compilation of a hip program as CUDA in most cases, and LangOpts.HIP is checked only where special handling of hip programs is needed. This patch also adds support for kernel launching in HIP programs using the HIP host API. When -x hip is not specified, there is no behaviour change for CUDA. Patch by Greg Rodgers. Revised and lit test added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D44984 llvm-svn: 330790	2018-04-25 01:10:37 +00:00
Jonas Hahnfeld	f5527c2381	[CUDA] Register relocatable GPU binaries nvcc generates a unique registration function for each object file that contains relocatable device code. Unique names are achieved with a module id that is also reflected in the function's name. Differential Revision: https://reviews.llvm.org/D42922 llvm-svn: 330425	2018-04-20 13:04:45 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Artem Belevich	4c09318be2	[CUDA] Place GPU binary into .nv_fatbin section and align it by 8. This matches the way nvcc encapsulates GPU binaries into host object file. Now cuobjdump can deal with clang-compiled object files. Differential Revision: https://reviews.llvm.org/D23429 llvm-svn: 278549	2016-08-12 18:44:01 +00:00
Artem Belevich	3609085dc4	Fixed test failure on platforms with name mangling different from Linux. * Run cc with -triple x86_64-linux-gnu to make symbol mangling predictable. * Use temporary file as a fake GPU input so its content does not interfere with pattern matching. llvm-svn: 262516	2016-03-02 21:03:20 +00:00
Artem Belevich	8c1ec1ef38	[CUDA] Do not generate unnecessary runtime init code. Differential Revision: http://reviews.llvm.org/D17780 llvm-svn: 262499	2016-03-02 18:28:53 +00:00
Artem Belevich	42e1949b46	[CUDA] Emit host-side 'shadows' for device-side global variables ... and register them with CUDA runtime. This is needed for commonly used cudaMemcpy*() APIs that use address of host-side shadow to access their counterparts on device side. Fixes PR26340 Differential Revision: http://reviews.llvm.org/D17779 llvm-svn: 262498	2016-03-02 18:28:50 +00:00
Artem Belevich	e958275250	[cuda] Fixed test case failure on s390x llvm-svn: 237007	2015-05-11 18:35:58 +00:00
Artem Belevich	8d062ad560	Fixed test failure on machines with 32-bit size_t. llvm-svn: 236773	2015-05-07 21:06:03 +00:00
Artem Belevich	52cc487ba8	[cuda] Include GPU binary into host object file and generate init/deinit code. - added -fcuda-include-gpubinary option to incorporate results of device-side compilation into host-side one. - generate code to register GPU binaries and associated kernels with CUDA runtime and clean-up on exit. - added test case for init/deinit code generation. Differential Revision: http://reviews.llvm.org/D9507 llvm-svn: 236765	2015-05-07 19:34:16 +00:00
Eli Bendersky	3468d9d929	Move all CUDA testing inputs to Inputs/ subdirectory inside the tests. llvm-svn: 207453	2014-04-28 22:21:28 +00:00
Peter Collingbourne	fa4d6033a3	CUDA: IR generation support for device stubs llvm-svn: 141304	2011-10-06 18:51:56 +00:00

14 Commits
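
Note on commit 3b9cbba9a8: its message says a section is marked SHF_MERGE iff the string is nul-terminated and contains no other nuls. The condition can be sketched as a small predicate. This is only an illustration of the rule as the commit message words it, not LLVM's IsNullTerminatedString; the name isMergeableCString and the use of std::string_view are assumptions made for this example.

```cpp
#include <string_view>

// Hypothetical helper (not LLVM code): returns true when `bytes` matches the
// condition quoted in the commit message, i.e. the data ends in a nul byte
// and contains no other nul byte before that terminator.
constexpr bool isMergeableCString(std::string_view bytes) {
    // Empty data or data that does not end in '\0' is not nul-terminated.
    if (bytes.empty() || bytes.back() != '\0')
        return false;
    // The first nul found must be the final byte, so no embedded nuls exist.
    return bytes.find('\0') == bytes.size() - 1;
}
```

Under this sketch, a textual module ID followed by its terminator satisfies the predicate, while a GPU binary with a nul byte somewhere in the middle does not, which is consistent with the commit's remark that only the module ID was affected.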