Commit Graph

13 Commits

Author SHA1 Message Date
Yaxun Liu 9767089d00 [HIP] Support early finalization of device code for -fno-gpu-rdc
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.

The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.

This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.

Differential Revision: https://reviews.llvm.org/D52377

llvm-svn: 343611
2018-10-02 17:48:54 +00:00
Alexey Bataev e36c67b35c [NVPTX] Emit debug info in DWARF-2 by default for Cuda devices.
Summary:
NVPTX target supports debug info in DWARF-2 format. Patch adds emission
of debug info in DWARF-2 by default.

Reviewers: tra, jlebar

Subscribers: aprantl, JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D42581

llvm-svn: 330272
2018-04-18 16:31:09 +00:00
Jonas Hahnfeld 5379c6d6fd [CUDA] Add option to generate relocatable device code
As a first step, pass '-c/--compile-only' to ptxas so that it
doesn't complain about references to external function. This
will successfully generate object files, but they won't work
at runtime because the registration routines need to adapted.

Differential Revision: https://reviews.llvm.org/D42921

llvm-svn: 324878
2018-02-12 10:46:45 +00:00
Jonas Hahnfeld 15dd8c6c32 [CUDA] Fix test cuda-external-tools.cu
This didn't verify the CHECK prefix before!

Differential Revision: https://reviews.llvm.org/D42920

llvm-svn: 324877
2018-02-12 10:46:34 +00:00
Gheorghe-Teodor Bercea 53431bc046 [OpenMP] Pass -v to PTXAS if it was passed to the driver.
Summary: When compiling code being offloaded by OpenMP to an NVIDIA GPU, pass the -v to PTXAS if it was passed to the CLANG driver.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, tstellar

Reviewed By: jlebar

Subscribers: Hahnfeld, rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29644

llvm-svn: 310295
2017-08-07 20:19:23 +00:00
Serge Pavlov b43573b9a4 Driver must return non-zero code on errors in command line
This is recommit of r302775, reverted in r302777 due to a fail in
clang-tidy. Original mesage is below.

Now if clang driver is given wrong arguments, in some cases it
continues execution and returns zero code. This change fixes this
behavior.

The fix revealed some errors in clang test set.

File test/Driver/gfortran.f90 added in r118203 checks forwarding
gfortran flags to GCC. Now driver reports error on this file, because
the option -working-directory implemented in clang differs from the
option with the same name implemented in gfortran, in clang the option
requires argument, in gfortran does not.

In the file test/Driver/arm-darwin-builtin.c clang is called with
options -fbuiltin-strcat and -fbuiltin-strcpy. These option were removed
in r191435 and now clang reports error on this test.

File arm-default-build-attributes.s uses option -verify, which is not
supported by driver, it is cc1 option.

Similarly, the file split-debug.h uses options -fmodules-embed-all-files
and -fmodule-format=obj, which are not supported by driver.

Other revealed errors are mainly mistypes.

Differential Revision: https://reviews.llvm.org/D33013

llvm-svn: 303756
2017-05-24 14:57:17 +00:00
Serge Pavlov 738d3b97af Reverted r302775
llvm-svn: 302777
2017-05-11 08:25:22 +00:00
Serge Pavlov c5cc230587 Driver must return non-zero code on errors in command line
Now if clang driver is given wrong arguments, in some cases it
continues execution and returns zero code. This change fixes this
behavior.

The fix revealed some errors in clang test set.

File test/Driver/gfortran.f90 added in r118203 checks forwarding
gfortran flags to GCC. Now driver reports error on this file, because
the option -working-directory implemented in clang differs from the
option with the same name implemented in gfortran, in clang the option
requires argument, in gfortran does not.

In the file test/Driver/arm-darwin-builtin.c clang is called with
options -fbuiltin-strcat and -fbuiltin-strcpy. These option were removed
in r191435 and now clang reports error on this test.

File arm-default-build-attributes.s uses option -verify, which is not
supported by driver, it is cc1 option.

Similarly, the file split-debug.h uses options -fmodules-embed-all-files
and -fmodule-format=obj, which are not supported by driver.

Other revealed errors are mainly mistypes.

Differential Revision: https://reviews.llvm.org/D33013

llvm-svn: 302775
2017-05-11 08:00:33 +00:00
Justin Lebar 66c4fd7987 [CUDA] Driver changes to support CUDA compilation on MacOS.
Summary:
Compiling CUDA device code requires us to know the host toolchain,
because CUDA device-side compiles pull in e.g. host headers.

When we only supported Linux compilation, this worked because
CudaToolChain, which is responsible for device-side CUDA compilation,
inherited from the Linux toolchain.  But in order to support MacOS,
CudaToolChain needs to take a HostToolChain pointer.

Because a CUDA toolchain now requires a host TC, we no longer will
create a CUDA toolchain from Driver::getToolChain -- you have to go
through CreateOffloadingDeviceToolChains.  I am *pretty* sure this is
correct, and that previously any attempt to create a CUDA toolchain
through getToolChain() would eventually have resulted in us throwing
"error: unsupported use of NVPTX for host compilation".

In any case hacking getToolChain to create a CUDA+host toolchain would
be wrong, because a Driver can be reused for multiple compilations,
potentially with different host TCs, and getToolChain will cache the
result, causing us to potentially use a stale host TC.

So that's the main change in this patch.

In addition, we have to pull CudaInstallationDetector out of Generic_GCC
and into a top-level class.  It's now used by the Generic_GCC and MachO
toolchains.

Reviewers: tra

Subscribers: rryan, hfinkel, sfantao

Differential Revision: https://reviews.llvm.org/D26774

llvm-svn: 287285
2016-11-18 00:41:22 +00:00
Justin Lebar b41f33cf02 [CUDA] Add --no-cuda-noopt-debug, which disables --cuda-noopt-debug.
Reviewers: tra

Subscribers: cfe-commits, jhen

Differential Revision: http://reviews.llvm.org/D19251

llvm-svn: 266708
2016-04-19 02:27:11 +00:00
Artem Belevich 0a0e54c194 [CUDA] pass debug options to ptxas.
ptxas optimizations are disabled if we need to generate debug info
as ptxas does not accept '-g' otherwise.

Differential Revision: http://reviews.llvm.org/D17111

llvm-svn: 261018
2016-02-16 22:03:20 +00:00
Justin Lebar 2836dcdb75 [CUDA] Handle -O options (more) correctly.
Summary:
Previously we'd crash the driver if you passed -O0.  Now we try to
handle all of clang's various optimization flags in a sane way.

Reviewers: tra

Subscribers: cfe-commits, echristo, jhen

Differential Revision: http://reviews.llvm.org/D16307

llvm-svn: 258174
2016-01-19 19:52:21 +00:00
Justin Lebar 21e5d4fcfa [CUDA] Invoke ptxas and fatbinary during compilation.
Summary:
Previously we compiled CUDA device code to PTX assembly and embedded
that asm as text in our host binary.  Now we compile to PTX assembly and
then invoke ptxas to assemble the PTX into a cubin file.  We gather the
ptx and cubin files for each of our --cuda-gpu-archs and combine them
using fatbinary, and then embed that into the host binary.

Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary,
which pass args down to the external tools.

Reviewers: tra, echristo

Subscribers: cfe-commits, jhen

Differential Revision: http://reviews.llvm.org/D16082

llvm-svn: 257809
2016-01-14 21:41:27 +00:00