The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. Their purpose is not obvious from the names of the variables/members.
This patch gives a descriptive name to the generated `CompilerInvocation` that can be used to derive the command-line to build a modular dependency.
Depends on D111725.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111728
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. Their purpose is not obvious from the names of the variables/members.
This patch gives a distinct name to the `CompilerInstance` that's used to run the implicit build during dependency scan.
Depends on D111724.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111725
The `ModuleDepCollectorPP` class holds a reference to `ModuleDepCollector` as well as `ModuleDepCollector`'s `CompilerInstance`. The fact that these refer to the same object is non-obvious.
This patch removes the `CompilerInstance` reference from `ModuleDepCollectorPP` and accesses it through `ModuleDepCollector` instead.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111724
One of the main goals of the dependency scanner is to be strict about module compatibility. This is achieved through a strict context hash. This patch ensures that the strict context hash is enabled not only during the scan itself (and its minimized implicit build), but also when actually reporting the dependency.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111720
Adds `selectBound`, a `Stencil` combinator that allows the user to supply multiple alternative cases, discriminated by bound node IDs.
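The selection rule can be illustrated with a small standalone model. This is only a sketch and not the Transformer library API: `BoundNodes`, `Case` and `selectBoundSketch` are hypothetical names, plain strings stand in for stencils, and returning `nullptr` when no ID is bound is an assumption of the sketch.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a match result: bound node ID -> source text.
using BoundNodes = std::map<std::string, std::string>;

// One alternative: the node ID that selects it and the text it produces.
using Case = std::pair<std::string, std::string>;

// Returns the text of the first case whose ID is bound in Nodes, or nullptr
// when none of the case IDs is bound (an assumption of this sketch).
const std::string *selectBoundSketch(const BoundNodes &Nodes,
                                     const std::vector<Case> &Cases) {
  for (const Case &C : Cases)
    if (Nodes.count(C.first) != 0)
      return &C.second;
  return nullptr;
}
```

Cases are tried in the order given, so earlier cases take priority when several IDs are bound.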
Differential Revision: https://reviews.llvm.org/D111708
Previously it only used Windows command lines for MSVC triples, but this
was causing issues for windows-gnu. In fact, everything 'native' Windows
(i.e., not Cygwin) should use Windows command line parsing.
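The intended rule can be sketched as follows. The enums below are hypothetical simplifications of a target triple and are not the `llvm::Triple` API. Falling back to GNU-style parsing for everything else is an assumption of this sketch.

```cpp
// Hypothetical, simplified pieces of a target triple.
enum class OSKind { Linux, Darwin, Windows };
enum class EnvKind { None, MSVC, GNU, Cygnus };

// The two command-line syntaxes the sketch distinguishes.
enum class Syntax { Gnu, Windows };

// Native Windows (any environment except Cygwin) uses Windows command-line
// parsing; everything else is assumed to use GNU-style parsing.
Syntax commandLineSyntax(OSKind OS, EnvKind Env) {
  if (OS == OSKind::Windows && Env != EnvKind::Cygnus)
    return Syntax::Windows;
  return Syntax::Gnu;
}
```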
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D111195
During an explicit modular build, PCM files are typically specified via the `-fmodule-file=<path>` command-line option. Early during the compilation, Clang uses the `ASTReader` to read their contents and caches the result so that the module isn't loaded implicitly later on. A listener is attached to the `ASTReader` to collect names of the modules read from the PCM files. However, if the PCM has already been loaded previously via PCH:
1. the `ASTReader` doesn't do anything the second time,
2. the listener is not invoked at all,
3. the module load result is not cached,
4. the compilation fails when attempting to load the module implicitly later on.
This patch solves this problem by attaching the listener to the `ASTReader` for PCH reading as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111560
To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.
In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
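The pruning step can be sketched in isolation. The function below is a hypothetical model, not the scanner's code: `Paths` stands in for the header search entries and `Used` for the usage information recorded during the implicit build. Treating entries without usage information as unused is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Keeps only the search paths whose "used" bit is set. Entries without a
// corresponding bit are treated as unused (an assumption of this sketch).
std::vector<std::string> pruneSearchPaths(const std::vector<std::string> &Paths,
                                          const std::vector<bool> &Used) {
  std::vector<std::string> Pruned;
  for (std::size_t I = 0; I < Paths.size(); ++I)
    if (I < Used.size() && Used[I])
      Pruned.push_back(Paths[I]);
  return Pruned;
}
```

Two command-lines that differ only in unused search paths prune to the same list, which is what lets them share one context hash.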
We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we could do it only once instead of on every load of the PCM file by the dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which is probably highly surprising.
There is an alternative approach to storing extra information into the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. The second run of the dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, which would produce different results between runs. We can revisit this approach when we stop building implicitly during dependency scanning.
Depends on D102923.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102488
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
Rename methods to clearly signal when they only deal with ASCII,
simplify the parsing of identifier, and use start/continue instead of
head/body for consistency with Unicode terminology.
During dependency scanning, we generally want to suppress -Werror. Apply the same logic to the DiagnosticOptions instance used for command-line parsing.
This fixes a test failure on the PS4 bot, where the system header directory could not be found, which was reported due to -Werror being on the command line and not being sanitized.
In this patch, the dependency scanner starts using proper `DiagnosticOptions` parsed from the actual TU command-line in order to mimic what the actual compiler would do. The actual functionality will be enabled and tested in follow-up patches. (This split is necessary to avoid a temporary regression.)
Depends on D108976.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D108982
This patch allows the clients of `ToolInvocation` to provide custom diagnostic options to be used during driver -> cc1 command-line transformation and parsing.
Tests covering this functionality are in a follow-up commit. To make this testable, the `DiagnosticsEngine` needs to be properly initialized via `CompilerInstance::createDiagnostics`.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D108976
The dependency scanner currently uses `ClangTool` to invoke the dependency scanning action.
However, `ClangTool` seems to be the wrong level of abstraction. It's intended to be run over a collection of compile commands, which we actively avoid via `SingleCommandCompilationDatabase`. It automatically injects `-fsyntax-only` and other flags, which we avoid by calling `clearArgumentsAdjusters()`. It deduces the resource directory based on the current executable path, which we'd like to change to deducing from `argv[0]`.
Internally, `ClangTool` uses `ToolInvocation` which seems to be more in line with what the dependency scanner tries to achieve. This patch switches to directly using `ToolInvocation` instead. NFC.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108979
This patch changes how the dependency scanner creates the fake input file when scanning dependencies of a single module (introduced in D109485). The scanner now has its own `InMemoryFileSystem` which sits under the minimizing FS (when that's requested). This makes it possible to drop the duplicate work in `DependencyScanningActions::runInvocation` that sets up the main file ID. Besides that, this patch makes it possible to land D108979, where we drop `ClangTool` entirely.
Depends on D109485.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D109498
module lookup by name alone
This removes the need to create a fake source file that imports a
module.
rdar://64538073
Differential Revision: https://reviews.llvm.org/D109485
The way we parse `DiagnosticOptions` is a bit involved.
`DiagnosticOptions` are parsed as part of the cc1-parsing function `CompilerInvocation::CreateFromArgs` which takes `DiagnosticsEngine` as an argument to be able to report errors in command-line arguments. But to create `DiagnosticsEngine`, `DiagnosticOptions` are needed. This is solved by exposing the `ParseDiagnosticArgs` to clients and making its `DiagnosticsEngine` argument optional, essentially breaking the dependency cycle.
The `ParseDiagnosticArgs` function takes `llvm::opt::ArgList &`, which each client needs to create from the command-line (typically represented as `std::vector<const char *>`). Creating this data structure in this context is somewhat particular. This code pattern is copy-pasted in some places across the upstream code base and also in downstream repos. To make things a bit more uniform, this patch extracts the code into a new reusable function: `CreateAndPopulateDiagOpts`.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108918
There are a number of language and preprocessor options that are reset in the `CompilerInvocation` that describes the build of an implicit module. This patch applies the same logic to explicit modules as well.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108710
Translation units with multiple direct modular dependencies trigger a non-deterministic ordering in `clang-scan-deps`. This boils down to usage of `std::unordered_map`, which gets replaced by `std::map` in this patch.
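The effect of the container change can be seen in a small standalone sketch. The key and value types here are hypothetical stand-ins for the scanner's module table. The point is only that `std::map` iterates in sorted key order regardless of insertion order, whereas the iteration order of `std::unordered_map` is unspecified.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the module names in the order the map iterates over them. With
// std::map this is always ascending key order.
std::vector<std::string>
moduleNamesInOrder(const std::map<std::string, int> &Modules) {
  std::vector<std::string> Names;
  for (const auto &Entry : Modules)
    Names.push_back(Entry.first);
  return Names;
}
```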
Depends on D103526.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103807
The `ASTReader` populates `Module::PresumedModuleMapFile` only for top-level modules, not submodules. To avoid generating empty `-fmodule-map-file=` arguments, make discovered modules depend on top-level precompiled modules. The granularity of submodules is not important here.
The documentation of `Module::PresumedModuleMapFile` says this field is non-empty only when building from preprocessed source. This means there can still be cases where the dependency scanner generates empty `-fmodule-map-file=` arguments. That's being addressed in a separate patch: D108544.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108647
In this patch, the dependency scanner starts collecting precompiled dependencies from all encountered submodules, not only from top-level modules.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D108540
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.
Differential Revision: https://reviews.llvm.org/D107528
When `-fno-integrated-as` is passed to the Clang driver (or set by default by a specific toolchain), it will construct an assembler job in addition to the cc1 job. Similarly, the `-fembed-bitcode` driver flag will create an additional cc1 job that reads an LLVM IR file.
The Clang tooling library only cares about the job that reads a source file. Instead of relying on the fact that the client injected `-fsyntax-only` to the driver invocation to get a single `-cc1` invocation that reads the source file, this patch automatically picks such jobs out of the `Compilation` and ignores the rest.
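A minimal sketch of that selection, using a hypothetical `JobSketch` type rather than the driver's real job representation. Returning the first match, and `nullptr` when there is none, are assumptions of the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a driver job.
struct JobSketch {
  bool IsCC1;           // true for a `-cc1` invocation
  bool ReadsSourceFile; // false for e.g. LLVM IR or assembly inputs
  std::string Description;
};

// Returns the first cc1 job that reads a source file, or nullptr if the
// compilation contains none. All other jobs are ignored.
const JobSketch *findSourceCC1Job(const std::vector<JobSketch> &Jobs) {
  for (const JobSketch &J : Jobs)
    if (J.IsCC1 && J.ReadsSourceFile)
      return &J;
  return nullptr;
}
```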
This fixes a test failure in `ClangScanDeps/headerwithname.cpp` and `ClangScanDeps/headerwithnamefollowedbyinclude.cpp` on AIX reported here: https://reviews.llvm.org/D103461#2841918 and `clang-scan-deps` failures with `-fembed-bitcode`.
Depends on D106788.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105695
Whenever -fmodule-name=top_level_module name is parsed, and clang actually tries to
import top_level_module, the headers are imported textually and the module isn't actually
built. However, the dependency scanner could still record it as a potential dependency
if the module was reimported and thus recorded by the preprocessor callbacks.
This change avoids collecting this kind of module as a dependency by verifying that we don't
collect top level modules without actual PCM files.
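The check can be modelled with a short standalone sketch. `ImportedModuleSketch` and `collectModularDeps` are hypothetical names, and an empty `PCMPath` is used here to represent a module that was imported textually and never built.

```cpp
#include <string>
#include <vector>

// Hypothetical record of a top-level module seen by the preprocessor.
struct ImportedModuleSketch {
  std::string Name;
  std::string PCMPath; // empty when no PCM file exists for the module
};

// Collects only the modules that are backed by an actual PCM file.
std::vector<std::string>
collectModularDeps(const std::vector<ImportedModuleSketch> &Imports) {
  std::vector<std::string> Deps;
  for (const ImportedModuleSketch &M : Imports)
    if (!M.PCMPath.empty())
      Deps.push_back(M.Name);
  return Deps;
}
```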
Differential Revision: https://reviews.llvm.org/D106100
This patch avoids minimizing input files that contributed to a PCH or its modules. This prevents the implicit modular build from failing on an unexpected file size. Depends on D106146.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104536
This patch separates the local and global caches of `DependencyScanningFilesystem` into two buckets: minimized files and original files. This is necessary to deal with precompiled modules/headers.
Consider a single worker with its instance of filesystem:
1. Build system uses the worker to scan dependencies of module A => filesystem cache gets populated with minimized input files.
2. Build system uses the results to explicitly build module A => explicitly built module captures the state of the real filesystem (containing non-minimized input files).
3. Build system uses the prebuilt module A as an explicit precompiled dependency for another compile job B.
4. Build system uses the same worker to scan dependencies for job B => worker uses implicit modular build to discover dependencies, which validates the filesystem state embedded in the prebuilt module (non-minimized files) to the current view of the filesystem (minimized files), resulting in validation failures.
This problem can be avoided in step 4 by collecting input files from the precompiled module and marking them as "ignored" in the minimizing filesystem. This way, the validation should succeed, since we should always be dealing with the original (non-minimized) input files. However, the filesystem already minimized the input files in step 1 and put them in the cache, which gets used in step 4 as well even though they are marked ignored (do not minimize). This patch essentially fixes this oversight by making the `"file is minimized"` part of the cache key (from high level).
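The idea of making the minimization state part of the cache key can be sketched with a toy cache. `FileCacheSketch` is a hypothetical class and not the scanner's filesystem. It only shows that the minimized and original contents of the same path occupy separate entries.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy cache keyed on (path, "file is minimized").
class FileCacheSketch {
public:
  void store(const std::string &Path, bool Minimized, std::string Contents) {
    Entries[std::make_pair(Path, Minimized)] = std::move(Contents);
  }

  // Returns the cached contents, or nullptr if this (path, minimized)
  // combination has not been stored.
  const std::string *lookup(const std::string &Path, bool Minimized) const {
    auto It = Entries.find(std::make_pair(Path, Minimized));
    return It == Entries.end() ? nullptr : &It->second;
  }

private:
  std::map<std::pair<std::string, bool>, std::string> Entries;
};
```

With this key, a minimized entry cached in step 1 can no longer satisfy a request for the original file in step 4.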
Depends on D106064.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D106146
This patch normalizes filenames in `DependencyScanningWorkerFilesystem` so that lookup of ignored files works correctly on Windows (where `/` and `\` are equivalent).
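A rough standalone illustration of why normalization matters for the lookup. The names below are hypothetical, and rewriting every `\` to `/` unconditionally is a simplification chosen for the sketch, not necessarily how the patch normalizes paths.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Rewrites backslashes to forward slashes (a simplification that only makes
// sense for Windows-style paths).
std::string normalizeSeparators(std::string Path) {
  std::replace(Path.begin(), Path.end(), '\\', '/');
  return Path;
}

// Toy set of ignored files that normalizes on both insertion and lookup.
class IgnoredFilesSketch {
public:
  void ignore(const std::string &Path) {
    Files.insert(normalizeSeparators(Path));
  }
  bool isIgnored(const std::string &Path) const {
    return Files.count(normalizeSeparators(Path)) != 0;
  }

private:
  std::set<std::string> Files;
};
```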
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D106064
A compilation database might have an empty string as a command-line argument.
But ExpandResponseFilesDatabase::expand doesn't expect this and assumes
that string.front() can be used for any argument. This is undefined behaviour if
the string is empty. In debug builds, it causes a crash in clangd.
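The guard amounts to checking for emptiness before looking at the first character. The helper below is a hypothetical sketch and not the code in `ExpandResponseFilesDatabase`. It assumes the usual convention that a response-file argument starts with `@`.

```cpp
#include <string>

// Returns true when Arg looks like a response-file argument ("@file").
// Checking empty() first avoids calling front() on an empty string, which
// is undefined behaviour.
bool isResponseFileArg(const std::string &Arg) {
  return !Arg.empty() && Arg.front() == '@';
}
```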
Test Plan: check-clang
Differential Revision: https://reviews.llvm.org/D105120
This is mostly a mechanical change, but a testcase that contains
parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp)
isn't touched.
Currently, `access` doesn't recognize a dereferenced smart pointer. So,
`access(e, "field")` where `e = *x`, yields:
* `x->field`, for normal-pointer x,
* `(*x).field`, for smart-pointer x.
This patch normalizes the handling of smart pointers to match normal pointers, when
the smart pointer type supports `->`.
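The normalized behaviour can be modelled on plain strings. `accessSketch` is a hypothetical helper, not the Transformer `access` stencil. It takes the name of the pointer being dereferenced and a flag saying whether `->` is available (always true for raw pointers, and true for smart pointers whose type provides it).

```cpp
#include <string>

// Builds the text for accessing Field on `*Pointer`.
std::string accessSketch(const std::string &Pointer, bool SupportsArrow,
                         const std::string &Field) {
  if (SupportsArrow)
    return Pointer + "->" + Field;
  // Fall back to an explicit dereference when `->` is unavailable.
  return "(*" + Pointer + ")." + Field;
}
```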
Differential Revision: https://reviews.llvm.org/D104390
Currently, the implementation combines OOP and overloads, using a template to
tie the two together. In practice, this has proven confusing with no
benefits. This patch simplifies the code to use standard OOP design (a
collection of classes deriving from an interface).
Differential Revision: https://reviews.llvm.org/D104317
The dependency scanning worker uses `std::move` to "reset" `DependencyOutputOptions` in the `CompilerInstance` that performs the implicit build. It's probably preferable to replace the object with a value-initialized instance, rather than depending on the behavior of a moved-from object.
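The difference can be shown with a trimmed-down, hypothetical stand-in for the options struct. After a move, standard library members are left in a valid but unspecified state and scalar members such as `bool` keep their old value, so an explicit reset to a value-initialized object is the only way to get a known state.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down stand-in for DependencyOutputOptions.
struct OutputOptsSketch {
  std::string OutputFile;
  std::vector<std::string> Targets;
  bool IncludeSystemHeaders = false;
};

// Hands the current options to the caller and leaves Opts in a well-defined
// default state instead of relying on its moved-from contents.
OutputOptsSketch takeAndReset(OutputOptsSketch &Opts) {
  OutputOptsSketch Taken = std::move(Opts);
  Opts = OutputOptsSketch(); // value-initialized replacement
  return Taken;
}
```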
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104106
There's no need to pass `DependencyOutputOptions` to each call of `handleFileDependency`, since the options don't ever change.
This patch adds a new `handleDependencyOutputOpts` method to the `DependencyConsumer` interface and the dependency scanner uses it to report the options only once.
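The shape of the change can be sketched with simplified, hypothetical types. The method names follow the description above, but the parameter types, `OptsSketch`, `RecordingConsumer` and `reportDependencies` are assumptions made for the sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for DependencyOutputOptions.
struct OptsSketch {
  std::string OutputFile;
};

// Simplified consumer interface: options are reported separately from files.
class DependencyConsumerSketch {
public:
  virtual ~DependencyConsumerSketch() = default;
  virtual void handleDependencyOutputOpts(const OptsSketch &Opts) = 0;
  virtual void handleFileDependency(const std::string &File) = 0;
};

// Consumer that records what it was told, for illustration.
class RecordingConsumer : public DependencyConsumerSketch {
public:
  void handleDependencyOutputOpts(const OptsSketch &Opts) override {
    ++OptsReports;
    OutputFile = Opts.OutputFile;
  }
  void handleFileDependency(const std::string &File) override {
    Files.push_back(File);
  }

  int OptsReports = 0;
  std::string OutputFile;
  std::vector<std::string> Files;
};

// Reports the options once, then each file dependency on its own.
void reportDependencies(DependencyConsumerSketch &Consumer,
                        const OptsSketch &Opts,
                        const std::vector<std::string> &Files) {
  Consumer.handleDependencyOutputOpts(Opts);
  for (const std::string &File : Files)
    Consumer.handleFileDependency(File);
}
```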
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104104
One of the goals of the dependency scanner is to generate command-lines that can be used to explicitly build modular dependencies of a translation unit. The only modifications to these command-lines should be for the purposes of explicit modular build.
However, the current version of the dependency scanner leaks its implementation details into the command-lines.
The first problem is that the `clang-scan-deps` tool adjusts the original textual command-line (adding `-Eonly -M -MT <target> -sys-header-deps -Wno-error -o /dev/null`, removing `--serialize-diagnostics`) in order to set up the `DependencyScanning` library. This has been addressed in D103461, D104012, D104030, D104031, D104033. With these patches, the `DependencyScanning` library receives the unmodified `CompilerInvocation`, sets it up and uses it for the implicit modular build.
Finally, to prevent leaking the implementation details to the resulting command-lines, this patch generates them from the **original** unmodified `CompilerInvocation` rather than from the one that drives the implicit build.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D104036
This patch moves enabling system header deps from `clang-scan-deps` into the `DependencyScanning` library. This will make it easier to preserve semantics of the original TU command-line for modular dependencies (see D104036).
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D104033
This moves another piece of logic specific to `clang-scan-deps` into the `DependencyScanning` library. This makes it easier to check what the original command-line looked like in the library and will enable the library to stop inventing `-Wno-error` for modular dependencies (see D104036).
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D104031
The `clang-scan-deps` tool has some logic that parses and modifies the original Clang command-line. The goal is to set up `DependencyOutputOptions` by injecting `-M -MT <target>` and to prevent the creation of output files.
This patch moves the logic into the `DependencyScanning` library, and uses the parsed `CompilerInvocation` instead of the raw command-line. The code is simpler and can be used from the C++ API as well.
The `-o /dev/null` arguments are not necessary, since the `DependencyScanning` library only runs a preprocessing action, so there's no way it'll produce an actual object file.
Related: The `-M` argument implies `-w`, which would appear on the command-line of modular dependencies even though it was not on the original TU command line (see D104036).
Some related tests were updated.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D104030
To prevent the creation of a diagnostics file, `clang-scan-deps` strips the corresponding command-line argument. This behavior is useful even when using the C++ `DependencyScanning` library.
This patch transforms stripping of command-line in `clang-scan-deps` into stripping of `CompilerInvocation` in `DependencyScanning`.
AFAIK, the `clang-cl` driver doesn't even accept `--serialize-diagnostics`, so I've removed the test. (It would fail with an unknown command-line argument otherwise.)
Note: Since we're generating command-lines for modular dependencies from `CompilerInvocation`, the `--serialize-diagnostics` argument will be dropped. This was already happening in `clang-scan-deps` before this patch, but it will now also happen when using the `DependencyScanning` library directly. This is resolved in D104036.
Reviewed By: dexonsmith, arphaman
Differential Revision: https://reviews.llvm.org/D104012
When a translation unit uses a PCH and imports the same modules as the PCH, we'd prefer to resolve to those modules instead of inventing new modules and reporting them as modular dependencies. Since the PCH modules have already been built, nudge the compiler to reuse them when deciding whether to build a new module and don't report them as regular modular dependencies.
Depends on D103524 & D103802.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D103526
The `PreprocessOnlyAction` doesn't support loading the AST file of a precompiled header. This is problematic for dependency scanning, since the `#include` manufactured for the PCH is treated as textual. This means the PCH contents get scanned with each TU, which is redundant. Moreover, dependencies of the PCH end up being considered dependencies of the TU.
To handle the AST file of a PCH properly, this patch creates a new `FrontendAction` that behaves the same way `PreprocessOnlyAction` does, but treats the manufactured PCH `#include` as a normal compilation would (by not claiming it only uses a preprocessor and creating the default AST consumer).
The AST file is now reported as a file dependency of the TU.
Depends on D103519.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D103524
When a project uses PCH with explicit modules, the build will look like this:
1. scan PCH dependencies
2. explicitly build PCH
3. scan TU dependencies
4. explicitly build TU
Step 2 produces an object file for the PCH, which the dependency scanner needs to read in step 3. This patch adds support for this.
The `clang-scan-deps` invocation in the attached test would fail without this change.
Depends on D103516.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D103519