This patch adds a configuration option to simply use the default pass
pipeline in favor of the LTO-specific one. We observed some severe
performance penalties when uding device-side LTO for OpenMP offloading
applications caused by the LTO-pass pipeline. This is primarily because
OpenMP uses an LLVM bitcode library to implement a GPU runtime library.
In a standard compilation we link this bitcode library into each source
file and optimize it with the default pipeline. When performing LTO we
link it late with all the files, but the bitcode library never has the
regular optimization pipeline applied to it so we miss a few
optimizations just using the LTO pipeline to optimize it.
I'm not committed to this solution, but it's the easiest method to solve
this performance regression when using LTO without changing the
optimizatin pipeline for other users.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D122133
Current ASTContext.getAttributedType() takes attribute kind,
ModifiedType and EquivType as the hash to decide whether an AST node
has been generated or note. But this is not enough for btf_type_tag
as the attribute might have the same ModifiedType and EquivType, but
still have different string associated with attribute.
For example, for a data structure like below,
struct map_value {
int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a;
int __attribute__((btf_type_tag("tag2"))) __attribute__((btf_type_tag("tag4"))) *b;
};
The current ASTContext.getAttributedType() will produce
an AST similar to below:
struct map_value {
int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a;
int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *b;
};
and this is incorrect.
It is very difficult to use the current AttributedType as it is hard to
get the tag information. To fix the problem, this patch introduced
BTFTagAttributedType which is similar to AttributedType
in many ways but with an additional BTFTypeTagAttr. The tag itself can
be retrieved with BTFTypeTagAttr.
With the new BTFTagAttributed type, the debuginfo code can be greatly
simplified compared to previous TypeLoc based approach.
Differential Revision: https://reviews.llvm.org/D120296
This reverts commit 049f4e4eab.
The problem was a stray dependency in CLANG_TEST_DEPS which caused cmake
to fail if clang-pseudo wasn't built. This is now removed.
This should make clearer that:
- it's not part of clang proper
- there's no expectation to update it along with clang (beyond green tests)
- clang should not depend on it
This is intended to be expose a library, so unlike other tools has a split
between include/ and lib/.
The main renames are:
clang/lib/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/lib/*
clang/include/clang/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/include/clang-pseudo/*
clang/tools/clang/pseudo/* => clang-tools-extra/pseudo/tool/*
clang/test/Syntax/* => clang-tools-extra/pseudo/test/*
clang/unittests/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/unittests/*
#include "clang/Tooling/Syntax/Pseudo/*" => #include "clang-pseudo/*"
namespace clang::syntax::pseudo => namespace clang::pseudo
check-clang => check-clang-pseudo
clangToolingSyntaxPseudo => clangPseudo
The clang-pseudo and ClangPseudoTests binaries are not renamed.
See discussion around:
https://discourse.llvm.org/t/rfc-a-c-pseudo-parser-for-tooling/59217/50
Differential Revision: https://reviews.llvm.org/D121233
This patch adds the offload kind to the embedded section name in
preparation for offloading to different kinda like CUDA or HIP.
Depends on D120288
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120271
This patch implements a DenseMap info struct for the device file type.
This is used to help grouping device files that have the same triple and
architecture. Because of this the filename, which will always be unique
for each file, is not used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120288
With explicit modules build, the '-fmodules-cache-path=' argument is unused.
This patch removes the argument to avoid warnings or errors (with '-Werror') stemming from that.
Depends on D118915.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D120474
The `clang-scan-deps` tool currently generates `-fmodule-file=` command-line arguments for the whole transitive closure of modular dependencies. This is not necessary, we only need to provide the direct dependencies on the command line. Information about transitive dependencies is stored within the `.pcm` files of direct dependencies. This makes the command lines shorter, but should be a NFC otherwise (unless there are bugs in the loading mechanism for explicit modules).
Depends on D120465.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D118915
Summary:
This patch correctly handles the `--sysroot=` option when passed to the
linker wrapper. This allows users to correctly find libraries that may
contain offloading code if using this option.
Currently, manpage for scan-build is installed as a program, with
permission of 755. This patch makes installation of scan-build.1 as
file, with 644 permission.
Patch by Manas.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D120646
`hip-openmp-compatible` flag treats hip and hipv4 offload kinds
as compatible with openmp offload kind while extracting code objects
from a heterogenous archive library. Vice versa is also considered
compatible if hip code was compiled with -fgpu-rdc.
This flag only relaxes compatibility criteria on `OffloadKind`,
rest of the components like `Triple` and `GPUArhc` still needs to
be compatible.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D120697
Summary;
This path adds printing support for the linker wrapper. When the user
passes `-v` it will not print the commands used by the linker wrapper to
indicate to the user what is happening during the linking.
The TokenStream class is the representation of the source code that will
be fed into the GLR parser.
This patch allows a "raw" TokenStream to be built by reading source code.
It also supports scanning a TokenStream to find the directive structure.
Next steps (with placeholders in the code): heuristically choosing a
path through #ifs, preprocessing the code by stripping directives and comments.
These will produce a suitable stream to feed into the parser proper.
Differential Revision: https://reviews.llvm.org/D119162
In an indirect buffer, buffer-file-name is nil, so check the base buffer
instead. This works fine in direct buffers where buffer-base-buffer returns
nil.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D120408
The dependency scanner already generates canonical -cc1 command lines that can be used to compile discovered modular dependencies.
For translation unit command lines, the scanner only generates additional driver arguments the build system is expected to append to the original command line.
While this works most of the time, there are situations where that's not the case. For example with `-Wunused-command-line-argument`, Clang will complain about the `-fmodules-cache-path=` argument that's not being used in explicit modular builds. Combine that with `-Werror` and the build outright fails.
To prevent such failures, this patch changes the dependency scanner to return the full driver command line to compile the original translation unit. This gives us more opportunities to massage the arguments into something reasonable.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D118986
This patch introduces a dense implementation of the LR parsing table, which is
used by LR parsers.
We build a SLR(1) parsing table from the LR(0) graph.
Statistics of the LR parsing table on the C++ spec grammar:
- number of states: 1449
- number of actions: 83069
- size of the table (bytes): 334928
Differential Revision: https://reviews.llvm.org/D118196
Summary:
This patch removes the error we recieve when attempting to extract
offloading sections. We shouldn't consider this a failure because
extracting bitcode isn't necessarily required.
Summary:
We were not previously saving strings when saving symbol names during
LTO symbol resolution. This caused a crash inside the dense set when
some of the strings would rarely be moved internally by the object file
class.
This patch adds support for linking CPU offloading applications in the
linker wrapper. We generate the necessary linking job using the host
linker's path and library arguments. This may not be true for more
complex offloading schemes, but this is sufficient for now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119613
Ensure CLANG_PLUGIN_SUPPORT is compatible with llvm_add_library.
Fixes an issue noted in D111100.
Differential Revision: https://reviews.llvm.org/D119199
This will allow moving the IncludeCleaner library essentials to Clang
and decoupling them from the majority of clangd.
The patch itself just moves the code, it doesn't change existing
functionality.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D119130
Summary:
This patch changes the ClangLinkerWrapper to use the executable path
when searching for the lld binary. Previously we relied on the program
name. Also not finding 'llvm-strip' is not considered an error anymore
because it is an optional optimization.
Add the build directory to the search path for llvm-strip instead
of solely relying on the PATH environment variable setting.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D118965
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.
More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.
This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].
As a consequence of that patch, downstream user may need to manually some extra
files:
llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h
In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 10978519
before: 11245451
Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions
Differential Revision: https://reviews.llvm.org/D118781
The linker wrapper tool uses the 'nvlink' and 'ptxas' binaries to link
and assemble device files. Previously we searched for this using the
binaries in the user's path. This didn't work in cases where the user
passed in a specific Cuda path to Clang. This patch changes the linker
wrapper to accept an argument for the Cuda path we can get from Clang.
This should fix#53573.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D118944
We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`
```
(gdb) bt
#0 0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f2bb3d152e4 in ?? ()
#3 0x00007ffcc5f0cc80 in ?? ()
#4 0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
#5 0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
#7 0x00007f2bb3d144ee in ?? ()
#8 0x746e692f706d742f in ?? ()
#9 0x692d747065637265 in ?? ()
#10 0x2f653631326b3034 in ?? ()
#11 0x646d632e35353532 in ?? ()
#12 0x0000000000000000 in ?? ()
```
I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.
All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.
So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.
PS: I would use lock_guards, but it's C.
Reviewed-by: NoQ
Differential Revision: https://reviews.llvm.org/D118439
Summary:
This patch removes the system call to the `clang-offload-wrapper` tool
by replicating its functionality in a new file. This improves
performance and makes the future wrapping functionality easier to
change.
Differential Revision: https://reviews.llvm.org/D118198
Summary:
This patch replaces the system call to the `llc` binary with a library
call to the target machine interface. This should be faster than
relying on an external system call to compile the final wrapper binary.
Differential Revision: https://reviews.llvm.org/D118197
Summary:
This parses the executable name out of the linker arguments so we can
use it to give more informative temporary file names and so we don't
accidentally use it for device linking.
Summary:
This patch implements the `-save-temps` flag for the linker wrapper.
This allows the user to inspect the intermeditary outpout that the
linker wrapper creates.
Summary:
Various changes to the linker wrapper, and the bitcode embedding is not
done after the optimizations have run rather than after linking is done.
This saves time when doing JIT.
This patch improves the symbol resolution done for LTO with offloading
applications. The symbol resolution done here allows the LTO backend to
internalize more functions. The symbol resoltion done is a simplified
view that does not take into account various options like `--wrap` or
`--dyanimic-list` and always assumes we are creating a shared object.
The actual target may be an executable, but semantically it is used as a
shared object because certain objects need to be visible outside of the
executable when they are read by the OpenMP plugin.
Depends on D117246
Differential Revision: https://reviews.llvm.org/D118155
This patch adds support for linking AMDGPU images using the LLD binary.
AMDGPU files are always bitcode images and will always use the LTO
backend. Additionally we now pass the default architecture found with
the `amdgpu-arch` tool to the argument list.
Depends on D117156
Differential Revision: https://reviews.llvm.org/D117246
This patch adds support for a few extra flags in the linker wrapper,
such as debugging flags, verbose output, and passing arguments to ptxas. We also
now forward pass remarks to the LLVM backend so they will show up in the LTO
passes.
Depends on D117049
Differential Revision: https://reviews.llvm.org/D117156
Summary;
This patch adds support for embedding device images in the linker
wrapper tool. This will be used for performing JIT functionality in the
future.
Depends on D117048
Differential Revision: https://reviews.llvm.org/D117049