Since the order of the IDs in the AST file (e.g. DeclIDs, SelectorIDs)
is not stable, it is not safe to load an AST file that depends on
another AST file that has been rebuilt since the importer was built,
even if "nothing changed". We previously used size and modtime to check
this, but I've seen cases where a module rebuilt quickly enough to foil
this check and caused very hard to debug build errors.
To save cycles when we're loading the AST, we just generate a random
nonce value and check that it hasn't changed when we load an imported
module, rather than actually hash the whole file.
This is slightly complicated by the fact that we need to verify the
signature inside addModule, since we might otherwise consider that a
mdoule is "OutOfDate" when really it is the importer that is out of
date. I didn't see any regressions in module load time after this
change.
llvm-svn: 220493
Implicit module builds are not well-suited to a lot of build systems. In
particular, they fare badly in distributed build systems, and they lead to
build artifacts that are not tracked as part of the usual dependency management
process. This change allows explicitly-built module files (which are already
supported through the -emit-module flag) to be explicitly loaded into a build,
allowing build systems to opt to manage module builds and dependencies
themselves.
This is only the first step in supporting such configurations, and it should
be considered experimental and subject to change or removal for now.
llvm-svn: 220359
Successfully loaded module files may be referenced in other
ModuleManagers, so don't invalidate them. Two related things are fixed:
1) I thought the last module in the manager was always the one that
failed, but it isn't. So check explicitly against the list of
vetted modules from ReadASTCore.
2) We now keep the file descriptor of pcm file open, which avoids the
possibility of having two different pcms for the same module loaded when
building in parallel with headers being modified during a build.
<rdar://problem/16835846>
llvm-svn: 211330
This reapplies r209910 with a fix for the assertion failures hit on the
buildbots.
original commit message:
I thought we could get away without this, but it means that the
FileEntry objects actually refer to the wrong files, since pcms are not
updated inplace, they are atomically renamed into place after compiling
a module.
So we are close to the original behaviour of invalidating the cache for
all modules being removed, but now we should only invalidate the ones
that depend on whichever module failed to load.
Unfortunately I haven't come up with a new test that didn't require
a race between parallel invocations of clang.
<rdar://problem/17038180>
llvm-svn: 209922
I thought we could get away without this, but it means that the
FileEntry objects actually refer to the wrong files, since pcms are not
updated inplace, they are atomically renamed into place after compiling
a module.
So we are close to the original behaviour of invalidating the cache for
all modules being removed, but now we should only invalidate the ones
that depend on whichever module failed to load.
Unfortunately I haven't come up with a new test that didn't require
a race between parallel invocations of clang.
<rdar://problem/17038180>
llvm-svn: 209910
It appears that Windows doesn't like renaming over open files, which we
do in clearOutputFiles. The file being compiled should be safe to
removed, but this isn't very satisfying - we don't want to manually
manage the lifetime of files we cannot prove have no references.
llvm-svn: 209195
Follow-up fix for 209138. Actually, since we already have this file
open, we don't want to refresh the stat() info, since that might be
newer than what we have open (bad!).
llvm-svn: 209143
FileManager::invalidateCache is not safe to call when there may be
existing references to the file. What module load failure needs is
to refresh so stale stat() info isn't stored.
This may be the last user of invalidateCache; I'll take a look and
remove it if possible in a future commit.
This caused a use-after-free error as well as a spurious error message
that a module was "found in both 'X.pcm' and 'X.pcm'" in some cases.
llvm-svn: 209138
We need to open an ASTFile while checking its expected size and
modification time, or another clang instance can modify the file between
the stat() and the open().
llvm-svn: 207735
To differentiate between two modules with the same name, we will
consider the path the module map file that they are defined by* part of
the ‘key’ for looking up the precompiled module (pcm file).
Specifically, this patch renames the precompiled module (pcm) files from
cache-path/<module hash>/Foo.pcm
to
cache-path/<module hash>/Foo-<hash of module map path>.pcm
In addition, I’ve taught the ASTReader to re-resolve the names of
imported modules during module loading so that if the header search
context changes between when a module was originally built and when it
is loaded we can rebuild it if necessary. For example, if module A
imports module B
first time:
clang -I /path/to/A -I /path/to/B ...
second time:
clang -I /path/to/A -I /different/path/to/B ...
will now rebuild A as expected.
* in the case of inferred modules, we use the module map file that
allowed the inference, not the __inferred_module.map file, since the
inferred file path is the same for every inferred module.
llvm-svn: 206201
predicate. The wrapper used by SetVector was erroneously requiring an
adaptable predicate. It has been fixed and we really don't want to
require an indirect call for every predicate evaluation.
llvm-svn: 202744
Previously reverted in r201755 due to causing an assertion failure.
I've removed the offending assertion, and taught the CompilerInstance to
create a default virtual file system inside createFileManager. In the
future, we should be able to reach into the CompilerInvocation to
customize this behaviour without breaking clients that don't care.
llvm-svn: 201818
the build
When Clang loads the module, it verifies the user source files that the module
was built from. If any file was changed, the module is rebuilt. There are two
problems with this:
1. correctness: we don't verify system files (there are too many of them, and
stat'ing all of them would take a lot of time);
2. performance: the same module file is verified again and again during a
single build.
This change allows the build system to optimize source file verification. The
idea is based on the fact that while the project is being built, the source
files don't change. This allows us to verify the module only once during a
single build session. The build system passes a flag,
-fbuild-session-timestamp=, to inform Clang of the time when the build started.
The build system also requests to enable this feature by passing
-fmodules-validate-once-per-build-session. If these flags are not passed, the
behavior is not changed. When Clang verifies the module the first time, it
writes out a timestamp file. Then, when Clang loads the module the second
time, it finds a timestamp file, so it can compare the verification timestamp
of the module with the time when the build started. If the verification
timestamp is too old, the module is verified again, and the timestamp file is
updated.
llvm-svn: 201224
This option can be useful for end users who want to know why they
ended up with a ton of different variants of the "std" module in their
module cache. This problem should go away over time, as we reduce the
need for module variants, but it will never go away entirely.
llvm-svn: 178148
The refactoring in r177367 introduced a serious performance bug where
the "lazy" resolution of module file names in the global module index
to actual module file entries in the module manager would perform
repeated negative stats(). The new interaction requires the module
manager to inform the global module index when a module file has been
loaded, eliminating the extraneous stat()s and a bunch of bookkeeping
on both sides.
llvm-svn: 177750
The global module index was querying the file manager for each of the
module files it knows about at load time, to prune out any out-of-date
information. The file manager would then cache the results of the
stat() falls used to find that module file.
Later, the same translation unit could end up trying to import one of the
module files that had previously been ignored by the module cache, but
after some other Clang instance rebuilt the module file to bring it
up-to-date. The stale stat() results in the file manager would
trigger a second rebuild of the already-up-to-date module, causing
failures down the line.
The global module index now lazily resolves its module file references
to actual AST reader module files only after the module file has been
loaded, eliminating the stat-caching race. Moreover, the AST reader
can communicate to its caller that a module file is missing (rather
than simply being out-of-date), allowing us to simplify the
module-loading logic and allowing the compiler to recover if a
dependent module file ends up getting deleted.
llvm-svn: 177367
ModuleManager::visit() by keeping a free list of the two data
structures used to store state (a preallocated stack and a visitation
number vector). Improves -fsyntax-only performance for my modules test
case by 2.8%. Modules has pulled ahead by almost 10% with the global
module index.
llvm-svn: 173692
index, optimizing the operation that skips lookup in modules where we
know the identifier will not be found. This makes the global module
index optimization actually useful, providing an 8.5% speedup over
modules without the global module index for -fsyntax-only.
llvm-svn: 173529
and limiting ourselves to two memory allocations. 10% speedup in
-fsyntax-only time for modules.
With this change, we can actually see some performance different from
the global module index, but it's still about 1%.
llvm-svn: 173512
AST reader.
The global module index tracks all of the identifiers known to a set
of module files. Lookup of those identifiers looks first in the global
module index, which returns the set of module files in which that
identifier can be found. The AST reader only needs to look into those
module files and any module files not known to the global index (e.g.,
because they were (re)built after the global index), reducing the
number of on-disk hash tables to visit. For an example source I'm
looking at, we go from 237844 total identifier lookups into on-disk
hash tables down to 126817.
Unfortunately, this does not translate into a performance advantage.
At best, it's a wash once the global module index has been built, but
that's ignore the cost of building the global module index (which
is itself fairly large). Profiles show that the global module index
code is far less efficient than it should be; optimizing it might give
enough of an advantage to justify its continued inclusion.
llvm-svn: 173405
generational scheme for identifiers that avoids searching the hash
tables of a given module more than once for a given
identifier. Previously, loading any new module invalidated all of the
previous lookup results for all identifiers, causing us to perform the
lookups repeatedly.
llvm-svn: 148412
library, since modules cut across all of the libraries. Rename
serialization::Module to serialization::ModuleFile to side-step the
annoying naming conflict. Prune a bunch of ModuleMap.h includes that
are no longer needed (most files only needed the Module type).
llvm-svn: 145538