Use -mlink-builtin-bitcode instead of llvm-link to link
device library so that device library bitcode and user
device code can be compiled in a consistent way.
This is the same approach used by CUDA and OpenMP.
Differential Revision: https://reviews.llvm.org/D60513
llvm-svn: 358290
The CHECK lines as structured were requiring them to appear only in a certain
position while all that is really needed is to check that they are present.
llvm-svn: 354001
Since we removed changed the way HIP Toolchain will propagate -m options into LLC, we need to remove from these older tests.
This is related to rC353880.
Differential Revision: https://reviews.llvm.org/D57977
llvm-svn: 353885
Introduce an option to request global visibility settings be applied to
declarations without a definition or an explicit visibility, rather than
the existing behavior of giving these default visibility. When the
visibility of all or most extern definitions are known this allows for
the same optimisations -fvisibility permits without updating source code
to annotate all declarations.
Differential Revision: https://reviews.llvm.org/D56868
llvm-svn: 352391
AMDGPU backend will switch to code object version 3 by default.
Since HIP runtime is not ready, disable it until the runtime is ready.
Differential Revision: https://reviews.llvm.org/D53325
llvm-svn: 344630
This patch renames -f{no-}cuda-rdc to -f{no-}gpu-rdc and keeps the original
options as aliases. When -fgpu-rdc is off,
clang will assume the device code in each translation unit does not call
external functions except those in the device library, therefore it is possible
to compile the device code in each translation unit to self-contained kernels
and embed them in the host object, so that the host object behaves like
usual host object which can be linked by lld.
The benefits of this feature is: 1. allow users to create static libraries which
can be linked by host linker; 2. amortized device code linking time.
This patch modifies HIP action builder to insert actions for linking device
code and generating HIP fatbin, and pass HIP fatbin to host backend action.
It extracts code for constructing command for generating HIP fatbin as
a function so that it can be reused by early finalization. It also modifies
codegen of HIP host constructor functions to embed the device fatbin
when it is available.
Differential Revision: https://reviews.llvm.org/D52377
llvm-svn: 343611