Currently the analyzer lazily models some functions using 'BodyFarm',
which constructs a fake function implementation that the analyzer
can simulate that approximates the semantics of the function when
it is called. BodyFarm does this by constructing the AST for
such definitions on-the-fly. One strength of BodyFarm
is that all symbols and types referenced by synthesized function
bodies are contextual adapted to the containing translation unit.
The downside is that these ASTs are hardcoded in Clang's own
source code.
A more scalable model is to allow these models to be defined as source
code in separate "model" files and have the analyzer use those
definitions lazily when a function body is needed. Among other things,
it will allow more customization of the analyzer for specific APIs
and platforms.
This patch provides the initial infrastructure for this feature.
It extends BodyFarm to use an abstract API 'CodeInjector' that can be
used to synthesize function bodies. That 'CodeInjector' is
implemented using a new 'ModelInjector' in libFrontend, which lazily
parses a model file and injects the ASTs into the current translation
unit.
Models are currently found by specifying a 'model-path' as an
analyzer option; if no path is specified the CodeInjector is not
used, thus defaulting to the current behavior in the analyzer.
Models currently contain a single function definition, and can
be found by finding the file <function name>.model. This is an
initial starting point for something more rich, but it bootstraps
this feature for future evolution.
This patch was contributed by Gábor Horváth as part of his
Google Summer of Code project.
Some notes:
- This introduces the notion of a "model file" into
FrontendAction and the Preprocessor. This nomenclature
is specific to the static analyzer, but possibly could be
generalized. Essentially these are sources pulled in
exogenously from the principal translation.
Preprocessor gets a 'InitializeForModelFile' and
'FinalizeForModelFile' which could possibly be hoisted out
of Preprocessor if Preprocessor exposed a new API to
change the PragmaHandlers and some other internal pieces. This
can be revisited.
FrontendAction gets a 'isModelParsingAction()' predicate function
used to allow a new FrontendAction to recycle the Preprocessor
and ASTContext. This name could probably be made something
more general (i.e., not tied to 'model files') at the expense
of losing the intent of why it exists. This can be revisited.
- This is a moderate sized patch; it has gone through some amount of
offline code review. Most of the changes to the non-analyzer
parts are fairly small, and would make little sense without
the analyzer changes.
- Most of the analyzer changes are plumbing, with the interesting
behavior being introduced by ModelInjector.cpp and
ModelConsumer.cpp.
- The new functionality introduced by this change is off-by-default.
It requires an analyzer config option to enable.
llvm-svn: 216550
After post-commit review and community discussion, this seems like a
reasonable direction to continue, making ownership semantics explicit in
the source using the type system.
llvm-svn: 215323
This allows using EndOfMainFile from a PPCallback to access data from the
action. The pattern of PPCallback referencing an action is common in clang-tidy.
Differential Revision: http://reviews.llvm.org/D4773
llvm-svn: 215145
This reverts commit r213307.
Reverting to have some on-list discussion/confirmation about the ongoing
direction of smart pointer usage in the LLVM project.
llvm-svn: 213325
(after fixing a bug in MultiplexConsumer I noticed the ownership of the
nested consumers was implemented with raw pointers - so this fixes
that... and follows the source back to its origin pushing unique_ptr
ownership up through there too)
llvm-svn: 213307
- Plugins don't need to export _ZN4llvm8Registry*.
- Win32.DLL cannot merge common symbols among DLLs. Static members in llvm::Registry should be instantiated in a parent.
llvm-svn: 212821
ensure that querying the first declaration for its most recent declaration
checks for redeclarations from the imported module.
This works as follows:
* The 'most recent' pointer on a canonical declaration grows a pointer to the
external AST source and a generation number (space- and time-optimized for
the case where there is no external source).
* Each time the 'most recent' pointer is queried, if it has an external source,
we check whether it's up to date, and update it if not.
* The ancillary data stored on the canonical declaration is allocated lazily
to avoid filling it in for declarations that end up being non-canonical.
We'll still perform a redundant (ASTContext) allocation if someone asks for
the most recent declaration from a decl before setPreviousDecl is called,
but such cases are probably all bugs, and are now easy to find.
Some finessing is still in order here -- in particular, we use a very general
mechanism for handling the DefinitionData pointer on CXXRecordData, and a more
targeted approach would be more compact.
Also, the MayHaveOutOfDateDef mechanism should now be expunged, since it was
addressing only a corner of the full problem space here. That's not covered
by this patch.
Early performance benchmarks show that this makes no measurable difference to
Clang performance without modules enabled (and fixes a major correctness issue
with modules enabled). I'll revert if a full performance comparison shows any
problems.
llvm-svn: 209046
Use this to fix the leak of DeserializedDeclsDumper and DeserializedDeclsChecker
in FrontendAction (found by LSan), PR19560.
The "delete this" bool is necessary because both PCHGenerator and ASTUnit
return the same object from both getDeserializationListener() and
getASTMutationListener(), so ASTReader can't just have a unique_ptr.
It's also not possible to just let FrontendAction (or CompilerInstance) own
these listeners due to lifetime issues (see comments on PR19560).
Finally, ASTDeserializationListener can't easily be refcounted, since several of
the current listeners are allocated on the stack.
Having this bool isn't ideal, but it's a pattern that's used in other places in
the codebase too, and it seems better than leaking.
llvm-svn: 208277
To differentiate between two modules with the same name, we will
consider the path the module map file that they are defined by* part of
the ‘key’ for looking up the precompiled module (pcm file).
Specifically, this patch renames the precompiled module (pcm) files from
cache-path/<module hash>/Foo.pcm
to
cache-path/<module hash>/Foo-<hash of module map path>.pcm
In addition, I’ve taught the ASTReader to re-resolve the names of
imported modules during module loading so that if the header search
context changes between when a module was originally built and when it
is loaded we can rebuild it if necessary. For example, if module A
imports module B
first time:
clang -I /path/to/A -I /path/to/B ...
second time:
clang -I /path/to/A -I /different/path/to/B ...
will now rebuild A as expected.
* in the case of inferred modules, we use the module map file that
allowed the inference, not the __inferred_module.map file, since the
inferred file path is the same for every inferred module.
llvm-svn: 206201
With r197755 we started reading the contents of buffer file entries, but the
buffers may point to ASTReader blobs that have been disposed.
Fix this by having the CompilerInstance object keep a reference to the ASTReader
as well as having the ASTContext keep reference to the ExternalASTSource.
This was very difficult to construct a test case for.
rdar://16149782
llvm-svn: 202346
Previously reverted in r201755 due to causing an assertion failure.
I've removed the offending assertion, and taught the CompilerInstance to
create a default virtual file system inside createFileManager. In the
future, we should be able to reach into the CompilerInvocation to
customize this behaviour without breaking clients that don't care.
llvm-svn: 201818
We don't stat the system headers to check for stalenes during regular
PCH loading for performance reasons. When explicitly saying
-verify-pch, we want to check all the dependencies - user or system.
llvm-svn: 200979
This option will:
- load the given pch file
- verify it is not out of date by stat'ing dependencies, and
- return 0 on success and non-zero on error
llvm-svn: 200884
encodes the canonical rules for LLVM's style. I noticed this had drifted
quite a bit when cleaning up LLVM, so wanted to clean up Clang as well.
llvm-svn: 198686
A while ago we allowed libclang to build a PCH that had compiler errors; this was to retain the performance
afforded by a PCH even if the user's code is in an intermediate state.
Extend this for the precompiled preamble as well.
rdar://14109828
llvm-svn: 183717
since only one of them is allowed in command-line, process them separately.
Otherwise, if more than one is specified in the command-line, one is processed normally
and the others are going to be treated and included as header files.
Related to radar://13140508
llvm-svn: 174385
The global module index is a "global" index for all of the module
files within a particular subdirectory in the module cache, which
keeps track of all of the "interesting" identifiers and selectors
known in each of the module files. One can perform a fast lookup in
the index to determine which module files will have more information
about entities with a particular name/selector. This information can
help eliminate redundant lookups into module files (a serious
performance problem) and help with creating auto-import/auto-include
Fix-Its.
The global module index is created or updated at the end of a
translation unit that has triggered a (re)build of a module by
scraping all of the .pcm files out of the module cache subdirectory,
so it catches everything. As with module rebuilds, we use the file
system's atomicity to synchronize.
llvm-svn: 173301
* Fix a typo, s/BeginSourceAction/BeginSourceFile/, so that the documentation
for FrontendAction::BeginSourceFileAction links correctly to BeginSourceFile;
* Add some basic \file documentation for FrontendAction.h;
* More use of "\brief" instead of repeating the name of the entity being
documented;
* Stop using Doxygen-style "///" comments in FrontendAction.cpp, as they were
polluting the documentation for BeginSourceFile;
* Drop incorrect "\see" markup that broke Doxygen's formatting;
* Other minor documentation fixes.
llvm-svn: 173213
uncovered.
This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.
I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.
llvm-svn: 169237
The stat cache became essentially useless ever since we started
validating all file entries in the PCH.
But the motivating reason for removing it now is that it also affected
correctness in this situation:
-You have a header without include guards (using "#pragma once" or #import)
-When creating the PCH:
-The same header is referenced in an #include with different filename cases.
-In the PCH, of course, we record only one file entry for the header file
-But we cache in the PCH file the stat info for both filename cases
-Then the source files are updated and the header file is updated in a way that
its size and modification time are the same but its inode changes
-When using the PCH:
-We validate the headers, we check that header file and we create a file entry with its current inode
-There's another #include with a filename with different case than the previously created file entry
-In order to get its stat info we go through the cached stat info of the PCH and we receive the old inode
-because of the different inodes, we think they are different files so we go ahead and include its contents.
Removing the stat cache will potentially break clients that are attempting to use the stat cache
as a way of avoiding having the actual input files available. If that use case is important, patches are welcome
to bring it back in a way that will actually work correctly (i.e., emit a PCH that is self-contained, coping with
literal strings, line/column computations, etc.).
This fixes rdar://5502805
llvm-svn: 167172
the macros that are #define'd or #undef'd on the command line. This
checking happens much earlier than the current macro-definition
checking and is far cleaner, because it does a direct comparison
rather than a diff of the predefines buffers. Moreover, it allows us
to use the result of this check to skip over PCH files within a
directory that have non-matching -D's or -U's on the command
line. Finally, it improves the diagnostics a bit for mismatches,
fixing <rdar://problem/8612222>.
The old predefines-buffer diff'ing will go away in a subsequent commit.
llvm-svn: 166641
check each of the files within that directory to determine if any of
them is an AST file that matches the language and target options. If
so, the first matching AST file is loaded. This fixes a longstanding
discrepency with GCC's precompiled header implementation.
llvm-svn: 166469
This reduces the spam make test leaves behind in /tmp. The assert isn't
particularly useful because it's not run with -disable-free (the default when
using the clang driver) but should cover all -cc1 tests.
llvm-svn: 165910
MacroInfo*. Instead of simply dumping an offset into the current file,
give each macro definition a proper ID with all of the standard
modules-remapping facilities. Additionally, when a macro is modified
in a subsequent AST file (e.g., #undef'ing a macro loaded from another
module or from a precompiled header), provide a macro update record
rather than rewriting the entire macro definition. This gives us
greater consistency with the way we handle declarations, and ties
together macro definitions much more cleanly.
Note that we're still not actually deserializing macro history (we
never were), but it's far easy to do properly now.
llvm-svn: 165560