Add GNU Static Lib Tool, which supports the --emit-static-lib
flag. For HIP, a static library archive is created that consists
of the HIP Fat Binary host object with the device images embedded.
llvm-ar is used to create the static archive. Also, delete any existing
output file to ensure a new archive is created each time.
Reviewers: yaxunl, tra, rjmccall, echristo
Subscribers: echristo, JonChesterfield, scchan, msearles
Differential Revision: https://reviews.llvm.org/D78759
This patch is a follow up on https://reviews.llvm.org/D78759.
Extract the HIP linker script from the generic GNU linker
and move it into the HIP ToolChain. Update the OffloadActionBuilder
link actions to apply device linking and host linking
actions separately. Use MC directives to embed the device images
and define symbols.
Reviewers: JonChesterfield, yaxunl
Subscribers: tra, echristo, jdoerfert, msearles, scchan
Differential Revision: https://reviews.llvm.org/D81963
Summary:
As seen in:
https://bugs.llvm.org/show_bug.cgi?id=45693
When clang looks for a tool it has a set of
possible names for it, in priority order.
Previously it would look for all of these names in
the program path, and only then look for all of the
names in the PATH.
This means that aarch64-none-elf-gcc on the PATH
would lose to gcc in the program path
(which was /usr/bin in the bug's case).
This changes that logic to search for each name in both
possible locations before moving to the next name.
This is closer to what you would expect to happen when
using a non-default triple.
(-B prefixes maybe should follow this logic too,
but are not changed in this patch)
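The difference between the two search orders can be sketched as
follows. This is only an illustration, not the driver's code: the
function names, the `exists` callback, and the example directories
are assumptions made for the sketch.

```python
from typing import Callable, Optional, Sequence


def find_tool_old(names: Sequence[str], program_paths: Sequence[str],
                  path_dirs: Sequence[str],
                  exists: Callable[[str], bool]) -> Optional[str]:
    """Old order: try every name in the program paths, then every name in PATH."""
    for dirs in (program_paths, path_dirs):
        for name in names:
            for d in dirs:
                candidate = f"{d}/{name}"
                if exists(candidate):
                    return candidate
    return None


def find_tool_new(names: Sequence[str], program_paths: Sequence[str],
                  path_dirs: Sequence[str],
                  exists: Callable[[str], bool]) -> Optional[str]:
    """New order: try each name in both locations before the next name."""
    for name in names:
        for d in list(program_paths) + list(path_dirs):
            candidate = f"{d}/{name}"
            if exists(candidate):
                return candidate
    return None


# Hypothetical file system mirroring the situation in the bug report.
files = {"/usr/bin/gcc", "/opt/cross/bin/aarch64-none-elf-gcc"}
names = ["aarch64-none-elf-gcc", "gcc"]  # priority order
old = find_tool_old(names, ["/usr/bin"], ["/opt/cross/bin"], files.__contains__)
new = find_tool_new(names, ["/usr/bin"], ["/opt/cross/bin"], files.__contains__)
```

With this made-up layout the old order picks /usr/bin/gcc, while the
new order picks the triple-prefixed tool found on the PATH.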
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79988
Add -fpch-instantiate-templates, which performs template instantiations
once while building the PCH instead of in every single file that uses
the PCH (each file still performs its own instantiations as well).
I have seen build time savings of 20-30% in the few tests I've tried.
The change may reorder compiler output and also generated code, but
should be generally safe and produce functionally identical code.
There are some rare cases that do not compile with it,
such as test/PCH/pch-instantiate-templates-forward-decl.cpp. If
template instantiation bailed out instead of reporting the error,
these instantiations could even be postponed, which would make them
work.
Enable this by default for clang-cl. MSVC creates PCHs by compiling
them using an empty .cpp file, which means templates are instantiated
while building the PCH and so the .h needs to be self-contained.
test/PCH/pch-instantiate-templates-forward-decl.cpp therefore fails
with MSVC anyway, so enabling the option for clang-cl matches this.
Differential Revision: https://reviews.llvm.org/D69585
Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now.
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D82244
On AIX, we use __atexit to register dtor functions rather than __cxa_atexit.
So a driver change is needed to default AIX to using -fno-use-cxa-atexit.
The Windows platform does not use __cxa_atexit either. Following its precedent,
we remove the assertion for when -fuse-cxa-atexit is specified by the user,
produce no message, and silently default to -fno-use-cxa-atexit behavior.
Differential Revision: https://reviews.llvm.org/D82136
The accepted options to -mharden-sls= are:
* all: enable all mitigations against Straight Line Speculation that are
implemented.
* none: disable all mitigations against Straight Line Speculation.
* retbr: enable the mitigation against Straight Line Speculation for RET
and BR instructions.
* blr: enable the mitigation against Straight Line Speculation for BLR
instructions.
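The mapping from option value to enabled mitigations can be sketched as
below. This is a hypothetical illustration rather than the driver's
parsing code: the function name and the set representation are
assumptions, and "all" is assumed to cover only the two mitigations
listed above.

```python
# Assumption for this sketch: the only implemented mitigations are the
# two listed in the message, "retbr" and "blr".
ACCEPTED_VALUES = {
    "all": frozenset({"retbr", "blr"}),
    "none": frozenset(),
    "retbr": frozenset({"retbr"}),
    "blr": frozenset({"blr"}),
}


def parse_harden_sls(value: str) -> frozenset:
    """Return the set of mitigations selected by one -mharden-sls= value."""
    try:
        return ACCEPTED_VALUES[value]
    except KeyError:
        raise ValueError(f"unsupported -mharden-sls= value: {value!r}") from None
```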
Differential Revision: https://reviews.llvm.org/D81404
Reviewers: dylanmckay
Reviewed By: dylanmckay
Subscribers: Jim, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77334
This was originally committed in
03b0831144 but I missed the commit
attribution.
Patch by Dennis van der Schagt.
Enable -amdgpu-internalize-symbols to eliminate unused functions and global variables
for the whole program, which speeds up compilation and improves performance.
For -fno-gpu-rdc, -amdgpu-internalize-symbols is passed to clang -cc1.
For -fgpu-rdc, -amdgpu-internalize-symbols is passed to lld.
Differential Revision: https://reviews.llvm.org/D81959
Currently the ROCm detector expects device library bitcode files to be
named *.bc instead of *.amdgcn.bc. However, in ROCm 3.5 the device library
bitcode files are named *.amdgcn.bc, which causes ROCm 3.5 not to be detected.
This patch fixes that.
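A minimal sketch of accepting both naming schemes, assuming the
detector only needs the library's base name. The function name and the
example file names are made up for illustration and are not taken from
the patch.

```python
from typing import Optional


def device_lib_base_name(file_name: str) -> Optional[str]:
    """Strip a bitcode suffix, trying the longer *.amdgcn.bc form first."""
    for suffix in (".amdgcn.bc", ".bc"):
        if file_name.endswith(suffix):
            return file_name[: -len(suffix)]
    return None  # not a device library bitcode file
```

Checking the longer suffix first matters: otherwise a file named
foo.amdgcn.bc would be reported as foo.amdgcn.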
Differential Revision: https://reviews.llvm.org/D81713
Summary:
The Android NDK's clang driver is used with an Android -target setting,
and the driver automatically finds the Android sysroot at a path
relative to the driver. The sysroot has the libc++ headers in it.
Remove Hurd::computeSysRoot as it is equivalent to the new
ToolChain::computeSysRoot method.
Fixes PR46213.
Reviewers: srhines, danalbert, #libc, kristina
Reviewed By: srhines, danalbert
Subscribers: ldionne, sthibaul, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81622
Summary:
Add a flag to omit the xray_fn_idx to cut size overhead and relocations
roughly in half at the cost of reduced performance for single function
patching. Minor additions to compiler-rt support per-function patching
without the index.
Reviewers: dberris, MaskRay, johnislarry
Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D81995
The msvcrt library isn't a pure import library; it does contain
regular object files with wrappers/fallbacks, and these can require
linking against kernel32.
This only makes a difference when linking with ld.bfd, as lld
always searches all static libraries.
This matches a similar change made recently in gcc in
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=850533ab160ef40eccfd039e1e3b138cf26e76b8,
although clang adds --start-group --end-group around these libraries
if -static is specified, which gcc doesn't. But try to match gcc's
linking order in any case, for consistency.
Differential Revision: https://reviews.llvm.org/D80880
Summary:
We're trying to use the --config options to pass distro specific
options for Fedora via the CFLAGS variable. However, some projects
end up using the CFLAGS variable multiple times in their command line,
which leads to an error when --config is used.
This patch resolves this issue by allowing more than one --config option
on the command line as long as the file names are the same.
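The rule can be sketched as follows. This is an illustrative check
only, not the driver's implementation; the function name and the error
text are assumptions.

```python
from typing import List, Optional


def select_config_file(config_args: List[str]) -> Optional[str]:
    """Return the one config file name, allowing repeats of the same name."""
    unique: List[str] = []
    for name in config_args:
        if name not in unique:
            unique.append(name)
    if len(unique) > 1:
        # Different file names are still rejected.
        raise ValueError("more than one distinct --config file: " + ", ".join(unique))
    return unique[0] if unique else None
```

For example, CFLAGS expanded twice yields the same file name twice,
which is accepted, while two different file names still raise an error.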
Reviewers: sepavloff, hfinkel
Reviewed By: sepavloff
Subscribers: cfe-commits, llvm-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81424
This patch is a follow up on https://reviews.llvm.org/D81627.
In addition to the default -fno-gpu-rdc case, this patch lets the
HIP toolchain stop using llvm-link/opt/llc to link device code
for the -fgpu-rdc case. Instead, it uses standard LTO.
This eliminates some redundant optimizations and speeds
up compilation and linking.
Differential Revision: https://reviews.llvm.org/D81861
Currently the HIP toolchain calls clang to emit bitcode and then calls opt/llc for device compilation in the default
-fno-gpu-rdc case, which is unnecessary since clang is able to compile a single source file to ISA.
This patch fixes the HIP action builder and toolchain so that the default -fno-gpu-rdc case is handled like a canonical
toolchain, i.e. one clang -cc1 invocation compiles source code to ISA.
This avoids unnecessary processes, which speeds up compilation, and avoids redundant LLVM passes that are
performed in both clang -cc1 and opt.
Differential Revision: https://reviews.llvm.org/D81627
This is useful when using clang from tools that may need to provide SDK files
from non-standard locations.
The clang CLI only provides a way to specify a VFS for include files, so there's no
good way to test this yet.
Differential Revision: https://reviews.llvm.org/D81771
Summary:
- In HIP, just as with regular device-only compilation, device-only
relocatable code compilation should not involve the offload bundle.
- In addition, device-only relocatable code compilation should have
the same 3 steps, namely preprocessor, compile, and backend, as
regular code generation with `-emit-llvm`.
Reviewers: yaxunl, tra
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81427
Summary:
ROCm.h had been getting the declarations for various data structures
by being #included next to them, rather than #includeing them itself.
This change fixes that by explicitly including the appropriate headers.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81432
Summary:
Add -ftrivial-auto-var-init-stop-after= to limit the number of times
stack variables are initialized when -ftrivial-auto-var-init= is used to
initialize stack variables to zero or a pattern. This flag can be used
to bisect uninitialized uses of a stack variable exposed by automatic
variable initialization, such as http://crrev.com/c/2020401.
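The bisection workflow could be driven by a helper like the one below.
This is a sketch under stated assumptions: `fails(n)` is a hypothetical
callback that rebuilds with the flag set to n and reports whether the
problem reproduces, the problem is assumed to reproduce for every count
at or above some threshold, and the lower bound 0 stands for a build in
which the problem does not reproduce.

```python
from typing import Callable


def first_failing_count(fails: Callable[[int], bool], upper: int) -> int:
    """Binary-search the smallest count n in (0, upper] for which fails(n) is true.

    Requires fails(upper) to be true; the search never calls fails(0).
    """
    lo, hi = 0, upper  # invariant: count lo is good, count hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

The returned count identifies the initialization whose presence first
makes the problem appear, which narrows the search to one variable.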
Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich
Reviewed By: jfb
Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77168
Summary: This patch removes the special handling for Darwin on PowerPC in the default target cpu handling, because Darwin is no longer supported on the PowerPC platform.
Reviewers: hubert.reinterpretcast, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: wuzish, nemanjai, shchenz, steven.zhang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81115
To support std::complex and some other standard C/C++ functions in HIP device code,
they need to be forced to be __host__ __device__ functions by pragmas. This is done
by some clang standard C++ wrapper headers which are shared between cuda-clang and hip-Clang.
For these standard C++ wrapper headers to work properly, a specific include path order
has to be enforced:
clang C++ wrapper include path
standard C++ include path
clang include path
Also, these C++ wrapper headers require that device versions of some standard C/C++ functions
be declared before the headers are included. This is done by including a default
header which declares or defines these device functions. The default header is always
included before any other headers are included by users.
This patch adds the default header and include path for HIP.
Differential Revision: https://reviews.llvm.org/D81176
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.
Patch By: mcgrathr
Differential Revision: https://reviews.llvm.org/D79835
Summary: This patch changes the AIX default target CPU to power4, since this is the lowest arch for the lowest OS level supported.
Reviewers: hubert.reinterpretcast, cebowleratibm, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80835
Summary:
An upgrade of LLVM for CrOS [0] containing [1] triggered a bunch of
errors related to writing to reserved registers for a Linux kernel's
arm64 compat vdso (which is an aarch32 image).
After a discussion on LKML [2], it was determined that
-f{no-}omit-frame-pointer was not being specified. Comparing GCC and
Clang [3], it becomes apparent that GCC defaults to omitting the frame
pointer implicitly when optimizations are enabled, and Clang does not,
i.e. with GCC, setting -O1 (or above) implies -fomit-frame-pointer.
Clang was defaulting to -fno-omit-frame-pointer implicitly unless
-fomit-frame-pointer was set explicitly.
This becomes a problem because the Linux kernel's arm64 compat vdso
contains code that uses r7, and r7 is sometimes used as the frame pointer
(for example, when targeting thumb (-mthumb)). See useR7AsFramePointer()
in llvm/llvm-project/llvm/lib/Target/ARM/ARMSubtarget.h. This is mostly
for legacy/compatibility reasons, and the 2019 Q4 revision of the ARM
AAPCS looks to standardize r11 as the frame pointer for aarch32, though
this is not yet implemented in LLVM.
Users who rely on the implicit default when the option is unspecified
and optimizations are enabled should explicitly choose -fomit-frame-pointer
(new behavior) or -fno-omit-frame-pointer (old behavior).
[0] https://bugs.chromium.org/p/chromium/issues/detail?id=1084372
[1] https://reviews.llvm.org/D76848
[2] https://lore.kernel.org/lkml/20200526173117.155339-1-ndesaulniers@google.com/
[3] https://godbolt.org/z/0oY39t
Reviewers: kristof.beyls, psmith, danalbert, srhines, MaskRay, ostannard, efriedma
Reviewed By: psmith, danalbert, srhines, MaskRay, efriedma
Subscribers: efriedma, olista01, MaskRay, vhscampos, cfe-commits, llvm-commits, manojgupta, llozano, glider, hctim, eugenis, pcc, peter.smith, srhines
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D80828