llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun Liu	9767089d00	[HIP] Support early finalization of device code for -fno-gpu-rdc This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original options as aliases. When -fgpu-rdc is off, clang will assume the device code in each translation unit does not call external functions except those in the device library, therefore it is possible to compile the device code in each translation unit to self-contained kernels and embed them in the host object, so that the host object behaves like usual host object which can be linked by lld. The benefits of this feature is: 1. allow users to create static libraries which can be linked by host linker; 2. amortized device code linking time. This patch modifies HIP action builder to insert actions for linking device code and generating HIP fatbin, and pass HIP fatbin to host backend action. It extracts code for constructing command for generating HIP fatbin as a function so that it can be reused by early finalization. It also modifies codegen of HIP host constructor functions to embed the device fatbin when it is available. Differential Revision: https://reviews.llvm.org/D52377 llvm-svn: 343611	2018-10-02 17:48:54 +00:00
Alexey Bataev	e36c67b35c	[NVPTX] Emit debug info in DWARF-2 by default for Cuda devices. Summary: NVPTX target supports debug info in DWARF-2 format. Patch adds emission of debug info in DWARF-2 by default. Reviewers: tra, jlebar Subscribers: aprantl, JDevlieghere, cfe-commits Differential Revision: https://reviews.llvm.org/D42581 llvm-svn: 330272	2018-04-18 16:31:09 +00:00
Jonas Hahnfeld	5379c6d6fd	[CUDA] Add option to generate relocatable device code As a first step, pass '-c/--compile-only' to ptxas so that it doesn't complain about references to external function. This will successfully generate object files, but they won't work at runtime because the registration routines need to adapted. Differential Revision: https://reviews.llvm.org/D42921 llvm-svn: 324878	2018-02-12 10:46:45 +00:00
Jonas Hahnfeld	15dd8c6c32	[CUDA] Fix test cuda-external-tools.cu This didn't verify the CHECK prefix before! Differential Revision: https://reviews.llvm.org/D42920 llvm-svn: 324877	2018-02-12 10:46:34 +00:00
Gheorghe-Teodor Bercea	53431bc046	[OpenMP] Pass -v to PTXAS if it was passed to the driver. Summary: When compiling code being offloaded by OpenMP to an NVIDIA GPU, pass the -v to PTXAS if it was passed to the CLANG driver. Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar Reviewed By: jlebar Subscribers: Hahnfeld, rengolin, cfe-commits Differential Revision: https://reviews.llvm.org/D29644 llvm-svn: 310295	2017-08-07 20:19:23 +00:00
Serge Pavlov	b43573b9a4	Driver must return non-zero code on errors in command line This is recommit of r302775, reverted in r302777 due to a fail in clang-tidy. Original mesage is below. Now if clang driver is given wrong arguments, in some cases it continues execution and returns zero code. This change fixes this behavior. The fix revealed some errors in clang test set. File test/Driver/gfortran.f90 added in r118203 checks forwarding gfortran flags to GCC. Now driver reports error on this file, because the option -working-directory implemented in clang differs from the option with the same name implemented in gfortran, in clang the option requires argument, in gfortran does not. In the file test/Driver/arm-darwin-builtin.c clang is called with options -fbuiltin-strcat and -fbuiltin-strcpy. These option were removed in r191435 and now clang reports error on this test. File arm-default-build-attributes.s uses option -verify, which is not supported by driver, it is cc1 option. Similarly, the file split-debug.h uses options -fmodules-embed-all-files and -fmodule-format=obj, which are not supported by driver. Other revealed errors are mainly mistypes. Differential Revision: https://reviews.llvm.org/D33013 llvm-svn: 303756	2017-05-24 14:57:17 +00:00
Serge Pavlov	738d3b97af	Reverted r302775 llvm-svn: 302777	2017-05-11 08:25:22 +00:00
Serge Pavlov	c5cc230587	Driver must return non-zero code on errors in command line Now if clang driver is given wrong arguments, in some cases it continues execution and returns zero code. This change fixes this behavior. The fix revealed some errors in clang test set. File test/Driver/gfortran.f90 added in r118203 checks forwarding gfortran flags to GCC. Now driver reports error on this file, because the option -working-directory implemented in clang differs from the option with the same name implemented in gfortran, in clang the option requires argument, in gfortran does not. In the file test/Driver/arm-darwin-builtin.c clang is called with options -fbuiltin-strcat and -fbuiltin-strcpy. These option were removed in r191435 and now clang reports error on this test. File arm-default-build-attributes.s uses option -verify, which is not supported by driver, it is cc1 option. Similarly, the file split-debug.h uses options -fmodules-embed-all-files and -fmodule-format=obj, which are not supported by driver. Other revealed errors are mainly mistypes. Differential Revision: https://reviews.llvm.org/D33013 llvm-svn: 302775	2017-05-11 08:00:33 +00:00
Justin Lebar	66c4fd7987	[CUDA] Driver changes to support CUDA compilation on MacOS. Summary: Compiling CUDA device code requires us to know the host toolchain, because CUDA device-side compiles pull in e.g. host headers. When we only supported Linux compilation, this worked because CudaToolChain, which is responsible for device-side CUDA compilation, inherited from the Linux toolchain. But in order to support MacOS, CudaToolChain needs to take a HostToolChain pointer. Because a CUDA toolchain now requires a host TC, we no longer will create a CUDA toolchain from Driver::getToolChain -- you have to go through CreateOffloadingDeviceToolChains. I am pretty sure this is correct, and that previously any attempt to create a CUDA toolchain through getToolChain() would eventually have resulted in us throwing "error: unsupported use of NVPTX for host compilation". In any case hacking getToolChain to create a CUDA+host toolchain would be wrong, because a Driver can be reused for multiple compilations, potentially with different host TCs, and getToolChain will cache the result, causing us to potentially use a stale host TC. So that's the main change in this patch. In addition, we have to pull CudaInstallationDetector out of Generic_GCC and into a top-level class. It's now used by the Generic_GCC and MachO toolchains. Reviewers: tra Subscribers: rryan, hfinkel, sfantao Differential Revision: https://reviews.llvm.org/D26774 llvm-svn: 287285	2016-11-18 00:41:22 +00:00
Justin Lebar	b41f33cf02	[CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug. Reviewers: tra Subscribers: cfe-commits, jhen Differential Revision: http://reviews.llvm.org/D19251 llvm-svn: 266708	2016-04-19 02:27:11 +00:00
Artem Belevich	0a0e54c194	[CUDA] pass debug options to ptxas. ptxas optimizations are disabled if we need to generate debug info as ptxas does not accept '-g' otherwise. Differential Revision: http://reviews.llvm.org/D17111 llvm-svn: 261018	2016-02-16 22:03:20 +00:00
Justin Lebar	2836dcdb75	[CUDA] Handle -O options (more) correctly. Summary: Previously we'd crash the driver if you passed -O0. Now we try to handle all of clang's various optimization flags in a sane way. Reviewers: tra Subscribers: cfe-commits, echristo, jhen Differential Revision: http://reviews.llvm.org/D16307 llvm-svn: 258174	2016-01-19 19:52:21 +00:00
Justin Lebar	21e5d4fcfa	[CUDA] Invoke ptxas and fatbinary during compilation. Summary: Previously we compiled CUDA device code to PTX assembly and embedded that asm as text in our host binary. Now we compile to PTX assembly and then invoke ptxas to assemble the PTX into a cubin file. We gather the ptx and cubin files for each of our --cuda-gpu-archs and combine them using fatbinary, and then embed that into the host binary. Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary, which pass args down to the external tools. Reviewers: tra, echristo Subscribers: cfe-commits, jhen Differential Revision: http://reviews.llvm.org/D16082 llvm-svn: 257809	2016-01-14 21:41:27 +00:00

13 Commits