Summary:
A previous patch merged the command execution and printing into a helper
function. The old printing code wasn't removed causing each to be
printed twice.
After basic support for embedding and handling CUDA files was added to
the new driver, we should be able to call CUDA functions from OpenMP
code. This patch makes the necessary changes to successfuly link in CUDA
programs that were compiled using the new driver. With this patch it
should be possible to compile device-only CUDA code (no kernels) and
call it from OpenMP as follows:
```
$ clang++ cuda.cu -fopenmp-new-driver -offload-arch=sm_70 -c
$ clang++ openmp.cpp cuda.o -fopenmp-new-driver -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_70
```
Currently this requires using a host variant to suppress the generation
of a CPU-side fallback call.
Depends on D120272
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120273
Delete the output streams coming from
CompilerInstance::createOutputFile() and friends once writes are
finished. Concretely, replacing `OS->flush()` with `OS.reset()` in:
- `ExtractAPIAction::EndSourceFileAction()`
- `PrecompiledPreambleAction::setEmittedPreamblePCH()`
- `cc1_main()'s support for `-ftime-trace`
This fixes theoretical bugs related to proxy streams, which may have
cleanups to run in their destructor. For example, a proxy that
CompilerInstance sometimes uses is `buffer_ostream`, which wraps a
`raw_ostream` lacking pwrite support and adds it. `flush()` does not
promise that output is complete; `buffer_ostream` needs to wait until
the destructor to forward anything so that it can service later calls to
`pwrite()`. If the destructor isn't called then the proxied stream
hasn't received any content.
This also protects against some logic bugs, triggering a null
dereference on a later attempt to write to the stream.
No tests, since in practice these particular code paths never use
use `buffer_ostream`; you need to be writing a binary file to a
pipe (such as stdout) to hit it, but `-extract-api` writes a text file
and the other two use computed filenames that will never (in practice)
be a pipe. This is effectively NFC, for now.
But I have some other patches in the works that add guard rails,
crashing if the stream hasn't been destructed by the time the
CompilerInstance is told to keep the output file, since in most cases
this is a problem.
Differential Revision: https://reviews.llvm.org/D124635
This is to improve maintenance a bit and remove need to maintain the additional option and related code-paths.
Differential Revision: https://reviews.llvm.org/D124558
Summary:
A previous patch updated the path searching in the linker wrapper. I
made an error and caused `lld`, which is necessary to link AMDGPU
images, to not be found on some systems. This patch fixes this by
correctly searching that linker-wrapper's binary path first again.
Add support for concepts and requires expression in the clang index.
Genarate USRs for concepts.
Also change how `RecursiveASTVisitor` handles return type requirement in
requires expressions. The new code unpacks the synthetic template parameter
list used for storing the actual expression. This simplifies
implementation of the indexing. No code seems to depend on the original
traversal anyway and the synthesized template parameter list is easily
accessible from inside the requires expression if needed.
Add tests in the clangd codebase.
Fixes https://github.com/clangd/clangd/issues/1103.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D124441
When we do LTO we consider ourselves to have whole program visibility if
every single input file we have contains LLVM bitcode. If we have whole
program visibliity then we can create a single image and utilize CUDA's
non-RDC mode by not passing `-c` to `ptxas` and ignoring the `nvlink`
job. This should be faster for some situations and also saves us the
time executing `nvlink`.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D124292
This patch extends cc1as to export the build version load command with
LC_VERSION_MIN_MACOSX.
This is especially important for Mac Catalyst as Mac Catalyst uses
the MacOS's compiler rt built-ins.
Differential Revision: https://reviews.llvm.org/D121868
The linker wrapper is used to perform linking and wrapping of embedded
device object files. Currently its internals are not able to be tested
easily. This patch adds the `--dry-run` and `--print-wrapped-module`
options to investigate the link jobs that will be run along with the
wrapped code that will be created to register the binaries.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D124039
The previous patch introduced the offloading binary format so we can
store some metada along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.
Differential Revision: https://reviews.llvm.org/D122683
Summary:
The changes in D122987 ensures that the offloading sections always have
the SHF_EXCLUDE flag. This means that we do not need to manually strip
these sections for ELF or COFF targets.
This is the template version of https://reviews.llvm.org/D114251.
This patch introduces a new template name kind (UsingTemplateName). The
UsingTemplateName stores the found using-shadow decl (and underlying
template can be retrieved from the using-shadow decl). With the new
template name, we can be able to find the using decl that a template
typeloc (e.g. TemplateSpecializationTypeLoc) found its underlying template,
which is useful for tooling use cases (include cleaner etc).
This patch merely focuses on adding the node to the AST.
Next steps:
- support using-decl in qualified template name;
- update the clangd and other tools to use this new node;
- add ast matchers for matching different kinds of template names;
Differential Revision: https://reviews.llvm.org/D123127
This patch changes type of the `File` parameter in `PPCallbacks::InclusionDirective()` from `const FileEntry *` to `Optional<FileEntryRef>`.
With the API change in place, this patch then removes some uses of the deprecated `FileEntry::getName()` (e.g. in `DependencyGraph.cpp` and `ModuleDependencyCollector.cpp`).
Reviewed By: dexonsmith, bnbarham
Differential Revision: https://reviews.llvm.org/D123574
It breaks arm build, there is no free bit for the extra
UsingShadowDecl in TemplateName::StorageType.
Reverting it to build the buildbot back until we comeup with a fix.
This reverts commit 5a5be4044f.
This is the template version of https://reviews.llvm.org/D114251.
This patch introduces a new template name kind (UsingTemplateName). The
UsingTemplateName stores the found using-shadow decl (and underlying
template can be retrieved from the using-shadow decl). With the new
template name, we can be able to find the using decl that a template
typeloc (e.g. TemplateSpecializationTypeLoc) found its underlying template,
which is useful for tooling use cases (include cleaner etc).
This patch merely focuses on adding the node to the AST.
Next steps:
- support using-decl in qualified template name;
- update the clangd and other tools to use this new node;
- add ast matchers for matching different kinds of template names;
Differential Revision: https://reviews.llvm.org/D123127
Summary:
A previous patch added the option to use the default pipeline when
perfomring LTO rather than the regular LTO pipeline. This greatly
improved performance regressions we were observing with the LTO
pipeline. However, this should not be used if the user explicitly
disables optimizations as the default pipeline expects some
optimizatoins to be perfomed.
Currently, clang-offload-bundler has -inputs and -outputs options that accept
values with comma as the delimiter. This causes issues with file paths
containing commas, which are valid file paths on Linux.
This add two new options -input and -output, which accept one single file,
and allow multiple instances. This allows arbitrary file paths. The old
-inputs and -outputs options will be kept for backward compatibility, but
are not allowed to be used with -input and -output options for simplicity.
In the future, -inputs and -outputs options will be phasing out.
RFC: https://discourse.llvm.org/t/rfc-adding-input-and-output-options-to-clang-offload-bundler/60049
Patch by: Siu Chi Chan
Reviewed by: Yaxun Liu
Differential Revision: https://reviews.llvm.org/D120662
Summary:
Currently there is no option to configure the number of thin-backend
threads to use when performing thin-lto on the device, but we should
default to use all the threads rather than just one. In the future we
should use the same arguments that gold / lld use and parse it here.
Do import the definition of objects from a foreign translation unit if that's type is const and trivial.
Differential Revision: https://reviews.llvm.org/D122805
This builtin returns the address of a global instance of the
`std::source_location::__impl` type, which must be defined (with an
appropriate shape) before calling the builtin.
It will be used to implement std::source_location in libc++ in a
future change. The builtin is compatible with GCC's implementation,
and libstdc++'s usage. An intentional divergence is that GCC declares
the builtin's return type to be `const void*` (for
ease-of-implementation reasons), while Clang uses the actual type,
`const std::source_location::__impl*`.
In order to support this new functionality, I've also added a new
'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after
MSGuidDecl, and is used to represent a generic concept of an lvalue
constant with global scope, deduplicated by its value. It's possible
that MSGuidDecl itself, or some of the other similar sorts of things
in Clang might be able to be refactored onto this more-generic
concept, but there's enough special-case weirdness in MSGuidDecl that
I gave up attempting to share code there, at least for now.
Finally, for compatibility with libstdc++'s <source_location> header,
I've added a second exception to the "cannot cast from void* to T* in
constant evaluation" rule. This seems a bit distasteful, but feels
like the best available option.
Reviewers: aaron.ballman, erichkeane
Differential Revision: https://reviews.llvm.org/D120159
First of all, this is the convention: all other tools have their
dependencies private. While it does not have an effect on linking
(there is no linking against executables), it does have an effect
on exporting: having the targets private allows installing the tools
without the libraries in a statically linked build, or a build against
libclang-cpp.so.
Reviewed By: v.g.vassilev
Differential Revision: https://reviews.llvm.org/D122546
Adds basic parsing/sema/serialization support for the
#pragma omp target parallel loop directive.
Differential Revision: https://reviews.llvm.org/D122359
This patch adds a configuration option to simply use the default pass
pipeline in favor of the LTO-specific one. We observed some severe
performance penalties when uding device-side LTO for OpenMP offloading
applications caused by the LTO-pass pipeline. This is primarily because
OpenMP uses an LLVM bitcode library to implement a GPU runtime library.
In a standard compilation we link this bitcode library into each source
file and optimize it with the default pipeline. When performing LTO we
link it late with all the files, but the bitcode library never has the
regular optimization pipeline applied to it so we miss a few
optimizations just using the LTO pipeline to optimize it.
I'm not committed to this solution, but it's the easiest method to solve
this performance regression when using LTO without changing the
optimizatin pipeline for other users.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D122133
Current ASTContext.getAttributedType() takes attribute kind,
ModifiedType and EquivType as the hash to decide whether an AST node
has been generated or note. But this is not enough for btf_type_tag
as the attribute might have the same ModifiedType and EquivType, but
still have different string associated with attribute.
For example, for a data structure like below,
struct map_value {
int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a;
int __attribute__((btf_type_tag("tag2"))) __attribute__((btf_type_tag("tag4"))) *b;
};
The current ASTContext.getAttributedType() will produce
an AST similar to below:
struct map_value {
int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a;
int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *b;
};
and this is incorrect.
It is very difficult to use the current AttributedType as it is hard to
get the tag information. To fix the problem, this patch introduced
BTFTagAttributedType which is similar to AttributedType
in many ways but with an additional BTFTypeTagAttr. The tag itself can
be retrieved with BTFTypeTagAttr.
With the new BTFTagAttributed type, the debuginfo code can be greatly
simplified compared to previous TypeLoc based approach.
Differential Revision: https://reviews.llvm.org/D120296
This reverts commit 049f4e4eab.
The problem was a stray dependency in CLANG_TEST_DEPS which caused cmake
to fail if clang-pseudo wasn't built. This is now removed.
This should make clearer that:
- it's not part of clang proper
- there's no expectation to update it along with clang (beyond green tests)
- clang should not depend on it
This is intended to be expose a library, so unlike other tools has a split
between include/ and lib/.
The main renames are:
clang/lib/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/lib/*
clang/include/clang/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/include/clang-pseudo/*
clang/tools/clang/pseudo/* => clang-tools-extra/pseudo/tool/*
clang/test/Syntax/* => clang-tools-extra/pseudo/test/*
clang/unittests/Tooling/Syntax/Pseudo/* => clang-tools-extra/pseudo/unittests/*
#include "clang/Tooling/Syntax/Pseudo/*" => #include "clang-pseudo/*"
namespace clang::syntax::pseudo => namespace clang::pseudo
check-clang => check-clang-pseudo
clangToolingSyntaxPseudo => clangPseudo
The clang-pseudo and ClangPseudoTests binaries are not renamed.
See discussion around:
https://discourse.llvm.org/t/rfc-a-c-pseudo-parser-for-tooling/59217/50
Differential Revision: https://reviews.llvm.org/D121233
This patch adds the offload kind to the embedded section name in
preparation for offloading to different kinda like CUDA or HIP.
Depends on D120288
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120271
This patch implements a DenseMap info struct for the device file type.
This is used to help grouping device files that have the same triple and
architecture. Because of this the filename, which will always be unique
for each file, is not used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120288
With explicit modules build, the '-fmodules-cache-path=' argument is unused.
This patch removes the argument to avoid warnings or errors (with '-Werror') stemming from that.
Depends on D118915.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D120474
The `clang-scan-deps` tool currently generates `-fmodule-file=` command-line arguments for the whole transitive closure of modular dependencies. This is not necessary, we only need to provide the direct dependencies on the command line. Information about transitive dependencies is stored within the `.pcm` files of direct dependencies. This makes the command lines shorter, but should be a NFC otherwise (unless there are bugs in the loading mechanism for explicit modules).
Depends on D120465.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D118915
Summary:
This patch correctly handles the `--sysroot=` option when passed to the
linker wrapper. This allows users to correctly find libraries that may
contain offloading code if using this option.
Currently, manpage for scan-build is installed as a program, with
permission of 755. This patch makes installation of scan-build.1 as
file, with 644 permission.
Patch by Manas.
Reviewed By: steakhal
Differential Revision: https://reviews.llvm.org/D120646