The ASTUnit needs to initialize an ASTWriter at the beginning of
parsing to fully handle serialization of a translation unit that
imports modules. Do this by introducing an option to enable it, which
corresponds to CXTranslationUnit_ForSerialization on the C API side.
llvm-svn: 165717
macro history.
When deserializing macro history, we arrange history such that the
macros that have definitions (that haven't been #undef'd) and are
visible come at the beginning of the list, which is what the
preprocessor and other clients of Preprocessor::getMacroInfo()
expect. If additional macro definitions become visible later, they'll
be moved toward the front of the list. Note that it's possible to have
ambiguities, but we don't diagnose them yet.
There is a partially-implemented design decision here that, if a
particular identifier has been defined or #undef'd within the
translation unit, that definition (or #undef) hides any macro
definitions that come from imported modules. There's still a little
work to do to ensure that the right #undef'ing happens.
Additionally, we'll need to scope the update records for #undefs, so
they only kick in when the submodule containing that update record
becomes visible.
llvm-svn: 165682
MacroInfo*. Instead of simply dumping an offset into the current file,
give each macro definition a proper ID with all of the standard
modules-remapping facilities. Additionally, when a macro is modified
in a subsequent AST file (e.g., #undef'ing a macro loaded from another
module or from a precompiled header), provide a macro update record
rather than rewriting the entire macro definition. This gives us
greater consistency with the way we handle declarations, and ties
together macro definitions much more cleanly.
Note that we're still not actually deserializing macro history (we
never were), but it's far easy to do properly now.
llvm-svn: 165560
write out the macro history for that macro. Similarly, we need to cope
with reading a macro definition that has been #undef'd.
Take advantage of this new ability so that global code-completion
results can refer to #undef'd macros, rather than losing them
entirely. For multiply defined/#undef'd macros, we will still get the
wrong result, but it's better than getting no result.
llvm-svn: 165502
enough information so we can mangle them correctly in cases involving
dependent parameter types. (This specifically impacts cases involving
null pointers and cases involving parameters of reference type.)
Fix the mangler to use this information instead of trying to scavenge
it out of the parameter declaration.
<rdar://problem/12296776>.
llvm-svn: 164656
Summary: Passes all tests (+ the new one with code completion), but needs a thorough review in part related to modules.
Reviewers: doug.gregor
Reviewed By: alexfh
CC: cfe-commits, rsmith
Differential Revision: http://llvm-reviews.chandlerc.com/D41
llvm-svn: 164610
clang has recently started to warn about the enum compares:
lib/Serialization/ASTWriter.cpp:2760:31: warning: comparison of literal 256 with expression of type
'clang::DeclarationName::NameKind' is always true [-Wtautological-constant-out-of-range-compare]
llvm-svn: 164220
definition info; it needs to be there because the mangler needs to
access it before we're finished defining the lambda class.
PR12808.
llvm-svn: 164186
Summary:
Summary: Keep history of macro definitions and #undefs with corresponding source locations, so that we can later find out all macros active in a specified source location. We don't save the history in PCH (no need currently). Memory overhead is about sizeof(void*)*3*<number of macro definitions and #undefs>+<in-memory size of all #undef'd macros>
I've run a test on a file composed of 109 .h files from boost 1.49 on x86-64 linux.
Stats before this patch:
*** Preprocessor Stats:
73222 directives found:
19171 #define.
4345 #undef.
#include/#include_next/#import:
5233 source files entered.
27 max include stack depth
19210 #if/#ifndef/#ifdef.
2384 #else/#elif.
6891 #endif.
408 #pragma.
14466 #if/#ifndef#ifdef regions skipped
80023/451669/1270 obj/fn/builtin macros expanded, 85724 on the fast path.
127145 token paste (##) operations performed, 11008 on the fast path.
Preprocessor Memory: 5874615B total
BumpPtr: 4399104
Macro Expanded Tokens: 417768
Predefines Buffer: 8135
Macros: 1048576
#pragma push_macro Info: 0
Poison Reasons: 1024
Comment Handlers: 8
Stats with this patch:
...
Preprocessor Memory: 7541687B total
BumpPtr: 6066176
Macro Expanded Tokens: 417768
Predefines Buffer: 8135
Macros: 1048576
#pragma push_macro Info: 0
Poison Reasons: 1024
Comment Handlers: 8
In my test increase in memory usage is about 1.7Mb, which is ~28% of initial preprocessor's memory usage and about 0.8% of clang's total VMM allocation.
As for CPU overhead, it should only be noticeable when iterating over all macros, and should mostly consist of couple extra dereferences and one comparison per macro + skipping of #undef'd macros. It's less trivial to measure, though, as the preprocessor consumes a very small fraction of compilation time.
Reviewers: doug.gregor, klimek, rsmith, djasper
Reviewed By: doug.gregor
CC: cfe-commits, chandlerc
Differential Revision: http://llvm-reviews.chandlerc.com/D28
llvm-svn: 162810
a defaulted special member function until the exception specification is needed
(using the same criteria used for the delayed instantiation of exception
specifications for function temploids).
EST_Delayed is now EST_Unevaluated (using 1330's terminology), and, like
EST_Uninstantiated, carries a pointer to the FunctionDecl which will be used to
resolve the exception specification.
This is enabled for all C++ modes: it's a little faster in the case where the
exception specification isn't used, allows our C++11-in-C++98 extensions to
work, and is still correct for C++98, since in that mode the computation of the
exception specification can't fail.
The diagnostics here aren't great (in particular, we should include implicit
evaluation of exception specifications for defaulted special members in the
template instantiation backtraces), but they're not much worse than before.
Our approach to the problem of cycles between in-class initializers and the
exception specification for a defaulted default constructor is modified a
little by this change -- we now reject any odr-use of a defaulted default
constructor if that constructor uses an in-class initializer and the use is in
an in-class initialzer which is declared lexically earlier. This is a closer
approximation to the current draft solution in core issue 1351, but isn't an
exact match (but the current draft wording isn't reasonable, so that's to be
expected).
llvm-svn: 160847
currently we take address of std::vector's contents only after we finished
adding all comments (so no reallocation can happen), this will change in
future.
llvm-svn: 159845
coming from an AST file are registered for serialization.
A static data member instantiation of in a chained PCH could be missed
when serializing decls; the result was that when emitting the visible decls
map of its DeclContext, we would use a DeclID that was not actually emitted,
leading to crashes or hangs.
Fix this by making sure such decls are always registered for serialization.
Also introduce extra sanity checks to make sure we don't register new
declarations or types after we have serialized the types/decls block.
rdar://11728990
llvm-svn: 159550
target Objective-C runtime down to the frontend: break this
down into a single target runtime kind and version, and compute
all the relevant information from that. This makes it
relatively painless to add support for new runtimes to the
compiler. Make the new -cc1 flag, -fobjc-runtime=blah-x.y.z,
available at the driver level as a better and more general
alternative to -fgnu-runtime and -fnext-runtime. This new
concept of an Objective-C runtime also encompasses what we
were previously separating out as the "Objective-C ABI", so
fragile vs. non-fragile runtimes are now really modelled as
different kinds of runtime, paving the way for better overall
differentiation.
As a sort of special case, continue to accept the -cc1 flag
-fobjc-runtime-has-weak, as a sop to PLCompatibilityWeak.
I won't go so far as to say "no functionality change", even
ignoring the new driver flag, but subtle changes in driver
semantics are almost certainly not intended.
llvm-svn: 158793
* Retain comments in the AST
* Serialize/deserialize comments
* Find comments attached to a certain Decl
* Expose raw comment text and SourceRange via libclang
llvm-svn: 158771
We need an efficient mechanism to determine whether a defaulted default
constructor is constexpr, in order to determine whether a class is a literal
type, so keep the incrementally-built form on CXXRecordDecl. Remove the
on-demand computation of same, so that we only have one method for determining
whether a default constructor is constexpr. This doesn't affect correctness,
since default constructor lookup is much simpler than selecting a constructor
for copying or moving.
We don't need a corresponding mechanism for defaulted copy or move constructors,
since they can't affect whether a type is a literal type. Conversely, checking
whether such functions are constexpr can require non-trivial effort, so we defer
such checks until the copy or move constructor is required.
Thus we now only compute whether a copy or move constructor is constexpr on
demand, and only compute whether a default constructor is constexpr in advance.
This is unfortunate, but seems like the best solution.
llvm-svn: 158290
The integral APSInt value is now stored in a decomposed form and the backing
store for large values is allocated via the ASTContext. This way its not
leaked as TemplateArguments are never destructed when they are allocated in
the ASTContext. Since the integral data is immutable it is now shared between
instances, making copying TemplateArguments a trivial operation.
Currently getting the integral data out of a TemplateArgument requires creating
a new APSInt object. This is cheap when the value is small but can be expensive
if it's not. If this turns out to be an issue a more efficient accessor could
be added.
llvm-svn: 158150
in-class initializer for one of its fields. Value-initialization of such
a type should use the in-class initializer!
The former was just a bug, the latter is a (reported) standard defect.
llvm-svn: 156274
attached. Since we do not support any attributes which appertain to a statement
(yet), testing of this is necessarily quite minimal.
Patch by Alexander Kornienko!
llvm-svn: 154723
InjectedClassNameType; otherwise, it won't be properly wired to the
original (canonical) declaration when it is deserialized. Fixes
<rdar://problem/11112464>.
llvm-svn: 153442
Reintroduce lazy name lookup table building, ensuring that the lazy building step
produces the same lookup table that would be built by the eager step.
Avoid building a lookup table for the translation unit outside C++, even in cases
where we can't recover the contents of the table from the declaration chain on
the translation unit, since we're not going to perform qualified lookup into it
anyway. Continue to support lazily building such lookup tables for now, though,
since ASTMerge uses them.
In my tests, this performs very similarly to ToT with r152608 backed out, for C,
Obj-C and C++, and does not suffer from PR10447.
llvm-svn: 152905
track whether the referenced declaration comes from an enclosing
local context. I'm amenable to suggestions about the exact meaning
of this bit.
llvm-svn: 152491
analysis to make the AST representation testable. They are represented by a
new UserDefinedLiteral AST node, which is a sugared CallExpr. All semantic
properties, including full CodeGen support, are achieved for free by this
representation.
UserDefinedLiterals can never be dependent, so no custom instantiation
behavior is required. They are mangled as if they were direct calls to the
underlying literal operator. This matches g++'s apparent behavior (but not its
actual mangling, which is broken for literal-operator-ids).
User-defined *string* literals are now fully-operational, but the semantic
analysis is quite hacky and needs more work. No other forms of user-defined
literal are created yet, but the AST support for them is present.
This patch committed after midnight because we had already hit the quota for
new kinds of literal yesterday.
llvm-svn: 152211
compiler errors or not.
-Control whether ASTReader should reject such a PCH by a boolean flag at ASTReader's creation time.
By default, such a PCH file will be rejected with an error when trying to load it.
[libclang] Allow clang_saveTranslationUnit to create a PCH file even if compiler errors
occurred.
-Have libclang API calls accept a PCH that had compiler errors.
The general idea is that we want libclang to stay functional even if a PCH had a compiler error.
rdar://10976363.
llvm-svn: 152192
NSNumber, and boolean literals. This includes both Sema and Codegen support.
Included is also support for new Objective-C container subscripting.
My apologies for the large patch. It was very difficult to break apart.
The patch introduces changes to the driver as well to cause clang to link
in additional runtime support when needed to support the new language features.
Docs are forthcoming to document the implementation and behavior of these features.
llvm-svn: 152137
data members for deleted or user-provided destructors.
Now it's computed in advance, serialize it, and in passing fix all the other
record DefinitionData flags whose serialization was missing.
llvm-svn: 151441
arguments. There are two aspects to this:
- Make sure that when marking the declarations referenced in a
default argument, we don't try to mark local variables, both because
it's a waste of time and because the semantics are wrong: we're not
in a place where we could capture these variables again even if it
did make sense.
- When a lambda expression occurs in a default argument of a
function template, make sure that the corresponding closure type is
considered dependent, so that it will get properly instantiated. The
second bit is a bit of a hack; to fix it properly, we may have to
rearchitect our handling of default arguments, parsing them only
after creating the function definition. However, I'd like to
separate that work from the lambdas work.
llvm-svn: 151076
default arguments of function parameters. This simple-sounding task is
complicated greatly by two issues:
(1) Default arguments aren't actually a real context, so we need to
maintain extra state within lambda expressions to track when a
lambda was actually in a default argument.
(2) At the time that we parse a default argument, the FunctionDecl
doesn't exist yet, so lambda closure types end up in the enclosing
context. It's not clear that we ever want to change that, so instead
we introduce the notion of the "effective" context of a declaration
for the purposes of name mangling.
llvm-svn: 151011
name mangling in the Itanium C++ ABI for lambda expressions is so
dependent on context, we encode the number used to encode each lambda
as part of the lambda closure type, and maintain this value within
Sema.
Note that there are a several pieces still missing:
- We still get the linkage of lambda expressions wrong
- We aren't properly numbering or mangling lambda expressions that
occur in default function arguments or in data member initializers.
- We aren't (de-)serializing the lambda numbering tables
llvm-svn: 150982
id-expression 'x' will compute the type based on the assumption that
'x' will be captured, even if it isn't captured, per C++11
[expr.prim.lambda]p18. There are two related refactors that go into
implementing this:
1) Split out the check that determines whether we should capture a
particular variable reference, along with the computation of the
type of the field, from the actual act of capturing the
variable.
2) Always compute the result of decltype() within Sema, rather than
AST, because the decltype() computation is now context-sensitive.
llvm-svn: 150347
to pretty-print such function types better, and to fix a case where we were not
instantiating templates in lexical order. In passing, move the Variadic bit from
Type's bitfields to FunctionProtoType to get the Type bitfields down to 32 bits.
Also ensure that we always substitute the return type of a function when
substituting explicitly-specified arguments, since that can cause us to bail
out with a SFINAE error before we hit a hard error in parameter substitution.
llvm-svn: 150241
This seems to negatively affect compile time onsome ObjC tests
(which use a lot of partial diagnostics I assume). I have to come
up with a way to keep them inline without including Diagnostic.h
everywhere. Now adding a new diagnostic requires a full rebuild
of e.g. the static analyzer which doesn't even use those diagnostics.
This reverts commit 6496bd10dc3a6d5e3266348f08b6e35f8184bc99.
This reverts commit 7af19b817ba964ac560b50c1ed6183235f699789.
This reverts commit fdd15602a42bbe26185978ef1e17019f6d969aa7.
This reverts commit 00bd44d5677783527d7517c1ffe45e4d75a0f56f.
This reverts commit ef9b60ffed980864a8db26ad30344be429e58ff5.
llvm-svn: 150006
The new info is propagated to TSTLoc on template instantiation, getting rid of 3 FIXMEs in TreeTransform.h and another one Parser.cpp.
Simplified code in TypeSpecLocFiller visitor methods for DTSTLoc and DependentNameTypeLoc by removing what now seems to be dead code (adding corresponding assertions).
llvm-svn: 149923
Fix all the files that depended on transitive includes of Diagnostic.h.
With this patch in place changing a diagnostic no longer requires a full rebuild of the StaticAnalyzer.
llvm-svn: 149781
single attribute ("system") that allows us to mark a module as being a
"system" module. Each of the headers that makes up a system module is
considered to be a system header, so that we (for example) suppress
warnings there.
If a module is being inferred for a framework, and that framework
directory is within a system frameworks directory, infer it as a
system framework.
llvm-svn: 149143
the direct serialization of the linked-list structure. Instead, use a
scheme similar to how we handle redeclarations, with redeclaration
lists on the side. This addresses several issues:
- In cases involving mixing and matching of many categories across
many modules, the linked-list structure would not be consistent
across different modules, and categories would get lost.
- If a module is loaded after the class definition and its other
categories have already been loaded, we wouldn't see any categories
in the newly-loaded module.
llvm-svn: 149112
corresponding to TagType and ObjCInterfaceType. Previously, we would
serialize the definition (if available) or the canonical declaration
(if no definition was available). However, this can end up forcing the
deserialization of the definition even through we might not want to
yet.
Instead, always serialize the canonical declaration reference in the
TagType/ObjCInterfaceType entry, and as part of loading a pending
definition, update the "decl" pointer within the type node to point at
the definition. This is more robust in hard-to-isolate cases
where the *Type gets built and filled in before we see the definition.
llvm-svn: 148323
protocol, record the definition pointer in the canonical declaration
for that entity, and then propagate that definition pointer from the
canonical declaration to all other deserialized declarations. This
approach works well even when deserializing declarations that didn't
know about the original definition, which can occur with modules.
A nice bonus from this definition-deserialization approach is that we
no longer need update records when a definition is added, because the
redeclaration chains ensure that the if any declaration is loaded, the
definition will also get loaded.
llvm-svn: 148223
chains, again. The prior implementation was very linked-list oriented, and
the list-splicing logic was both fairly convoluted (when loading from
multiple modules) and failed to preserve a reasonable ordering for the
redeclaration chains.
This new implementation uses a simpler strategy, where we store the
ordered redeclaration chains in an array-like structure (indexed based
on the first declaration), and use that ordering to add individual
deserialized declarations to the end of the existing chain. That way,
the chain mimics the ordering from its modules, and a bug somewhere is
far less likely to result in a broken linked list.
llvm-svn: 148222
storage for the global declaration ID. Declarations that are parsed
(rather than deserialized) are unaffected, so the number of
declarations that pay this cost tends to be relatively small (since
relatively few declarations are ever deserialized).
This replaces a largish DenseMap within the AST reader. It's not
strictly a win in terms of memory use---not every declaration was
added to that DenseMap in the first place---but it's cleaner to have
this information available for every deserialized declaration, so that
future clients can rely on it.
llvm-svn: 147617
in the module map. This provides a bit more predictability for the
user, as well as eliminating the need to sort the submodules when
serializing them.
llvm-svn: 147564
for Objective-C protocols, including:
- Using the first declaration as the canonical declaration
- Using the definition as the primary DeclContext
- Making sure that all declarations have a pointer to the definition
data, and that we know which declaration is the definition
- Serialization support for redeclaration chains and for adding
definitions to already-serialized declarations.
However, note that we're not taking advantage of much of this code
yet, because we're still re-using ObjCProtocolDecls.
llvm-svn: 147410
features needed for a particular module to be available. This allows
mixed-language modules, where certain headers only work under some
language variants (e.g., in C++, std.tuple might only be available in
C++11 mode).
llvm-svn: 147387
set of (previously-canonical) declaration IDs to the module file, so
that future AST reader instances that load the module know which
declarations are merged. This is important in the fairly tricky case
where a declaration of an entity, e.g.,
@class X;
occurs before the import of a module that also declares that
entity. We merge the declarations, and record the fact that the
declaration of X loaded from the module was merged into the (now
canonical) declaration of X that we parsed.
llvm-svn: 147181
hitting a submodule that was never actually created, e.g., because
that header wasn't parsed. In such cases, complain (because the
module's umbrella headers don't cover everything) and fall back to
including the header.
Later, we'll add a warning at module-build time to catch all such
cases. However, this fallback is important to eliminate assertions in
the ASTWriter when this happens.
llvm-svn: 146933
chains. The previous implementation relied heavily on the declaration
chain being stored as a (circular) linked list on disk, as it is in
memory. However, when deserializing from multiple modules, the
different chains could get mixed up, leading to broken declaration chains.
The new solution keeps track of the first and last declarations in the
chain for each module file. When we load a declaration, we search all
of the module files for redeclarations of that declaration, then
splice together all of the lists into a coherent whole (along with any
redeclarations that were actually parsed).
As a drive-by fix, (de-)serialize the redeclaration chains of
TypedefNameDecls, which had somehow gotten missed previously. Add a
test of this serialization.
This new scheme creates a redeclaration table that is fairly large in
the PCH file (on the order of 400k for Cocoa.h's 12MB PCH file). The
table is mmap'd in and searched via a binary search, but it's still
quite large. A future tweak will eliminate entries for declarations
that have no redeclarations anywhere, and should
drastically reduce the size of this table.
llvm-svn: 146841
redeclaration chain for Objective-C classes, including:
- Using the first declaration as the canonical declaration.
- Using the definition as the primary DeclContext
- Making sure that all declarations have a pointer to the definition
data, and the definition knows that it is the definition.
- Serialization support for when a definition gets added to a
declaration that comes from an AST file.
However, note that we're not taking advantage of much of this code
yet, because we're still re-using ObjCInterfaceDecls.
llvm-svn: 146667
umbrella headers in the sense that all of the headers within that
directory (and eventually its subdirectories) are considered to be
part of the module with that umbrella directory. However, unlike
umbrella headers, which are expected to include all of the headers
within their subdirectories, Clang will automatically include all of
the headers it finds in the named subdirectory.
The intent here is to allow a module map to trivially turn a
subdirectory into a module, where the module's structure can mimic the
directory structure.
llvm-svn: 146165
header to also support umbrella directories. The umbrella directory
for an umbrella header is the directory in which the umbrella header
resides.
No functionality change yet, but it's coming.
llvm-svn: 146158
to re-export anything that it imports. This opt-in feature makes a
module behave more like a header, because it can be used to re-export
the transitive closure of a (sub)module's dependencies.
llvm-svn: 145811
"main" files that import modules. When loading any of these kinds of
AST files, we make the modules that were imported visible into the
translation unit that loaded the PCH file or preamble.
llvm-svn: 145737
only the macro definitions from visible (sub)modules will actually be
visible. This provides the same behavior for macros that r145640
provided for declarations.
llvm-svn: 145683
a standard global/local scheme, so that submodule definitions will
eventually be able to refer to submodules in other top-level
modules. We'll need this functionality soonish.
llvm-svn: 145549
library, since modules cut across all of the libraries. Rename
serialization::Module to serialization::ModuleFile to side-step the
annoying naming conflict. Prune a bunch of ModuleMap.h includes that
are no longer needed (most files only needed the Module type).
llvm-svn: 145538
submodules. This information will eventually be used for name hiding
when dealing with submodules. For now, we only use it to ensure that
the module "key" returned when loading a module will always be a
module (rather than occasionally being a FileEntry).
llvm-svn: 145497
file in the source manager. This allows us to properly create and use
modules described by module map files without umbrella headers (or
with incompletely umbrella headers). More generally, we can actually
build a PCH file that makes use of file -> buffer remappings, which
could be useful in libclang in the future.
llvm-svn: 144830
it is going to be rewritten (and the chain will be serialized again), otherwise we may form a cycle in its
categories list when deserializing.
Also introduce ASTMutationListener::CompletedObjCForwardRef to notify that a forward reference
was completed; using Decl's isChangedSinceDeserialization/setChangedSinceDeserialization
is bug inducing and kinda gross, we should phase it out.
Fixes infinite loop in rdar://10418538.
llvm-svn: 144465
that it retains source location information for the type. Aside from
general goodness (being able to walk the types described in that
information), we now have a proper representation for dependent
delegating constructors. Fixes PR10457 (for real).
llvm-svn: 143410
Introduce a FILE_SORTED_DECLS [de]serialization record that contains
a file sorted array of file-level DeclIDs in a PCH/Module.
The rationale is to allow "targeted" deserialization of decls inside
a range of a source file.
Cocoa PCH increased by 0.8%
Difference of creation time for Cocoa PCH is below the noise level.
llvm-svn: 143238
AST file more lazy, so that we don't eagerly load that information for
all known identifiers each time a new AST file is loaded. The eager
reloading made some sense in the context of precompiled headers, since
very few identifiers were defined before PCH load time. With modules,
however, a huge amount of code can get parsed before we see an
@import, so laziness becomes important here.
The approach taken to make this information lazy is fairly simple:
when we load a new AST file, we mark all of the existing identifiers
as being out-of-date. Whenever we want to access information that may
come from an AST (e.g., whether the identifier has a macro definition,
or what top-level declarations have that name), we check the
out-of-date bit and, if it's set, ask the AST reader to update the
IdentifierInfo from the AST files. The update is a merge, and we now
take care to merge declarations before/after imports with declarations
from multiple imports.
The results of this optimization are fairly dramatic. On a small
application that brings in 14 non-trivial modules, this takes modules
from being > 3x slower than a "perfect" PCH file down to 30% slower
for a full rebuild. A partial rebuild (where the PCH file or modules
can be re-used) is down to 7% slower. Making the PCH file just a
little imperfect (e.g., adding two smallish modules used by a bunch of
.m files that aren't in the PCH file) tips the scales in favor of the
modules approach, with 24% faster partial rebuilds.
This is just a first step; the lazy scheme could possibly be improved
by adding versioning, so we don't search into modules we already
searched. Moreover, we'll need similar lazy schemes for all of the
other lookup data structures, such as DeclContexts.
llvm-svn: 143100
PreprocessingRecord's getPreprocessedEntitiesInRange.
Also remove all the stuff that were added in ASTUnit that are unnecessary now
that we do a binary search for preprocessed entities and deserialize only
what is necessary.
llvm-svn: 140063
which will do a binary search and return a pair of iterators
for preprocessed entities in the given source range.
Source ranges of preprocessed entities are stored twice currently in
the PCH/Module file but this will be fixed in a subsequent commit.
llvm-svn: 140058
arbitrary amount of code. This forces us to stage the AST writer more
strictly, ensuring that we don't assign a declaration ID to a
declaration until after we're certain that no more modules will get
loaded.
llvm-svn: 139974
-Use an array of offsets for all preprocessed entities
-Get rid of the separate array of offsets for just macro definitions;
for references to macro definitions use an index inside the preprocessed
entities array.
-Deserialize each preprocessed entity lazily, at first request; not in bulk.
Paves the way for binary searching of preprocessed entities that will offer
efficiency and will simplify things on the libclang side a lot.
llvm-svn: 139809
language options. Use that .def file to declare the LangOptions class
and initialize all of its members, eliminating a source of annoying
initialization bugs.
AST serialization changes are next up.
llvm-svn: 139605
declaration was deserialized from an AST file. Use this instead of
Decl::getPCHLevel() wherever possible. This is a simple step toward
killing off Decl::getPCHLevel().
llvm-svn: 139427
'id' that can be used (only!) via a contextual keyword as the result
type of an Objective-C message send. 'instancetype' then gives the
method a related result type, which we have already been inferring for
a variety of methods (new, alloc, init, self, retain). Addresses
<rdar://problem/9267640>.
llvm-svn: 139275
builtin types (When requested). This is another step toward making
ASTUnit build the ASTContext as needed when loading an AST file,
rather than doing so after the fact. No actual functionality change (yet).
llvm-svn: 138985
include guards don't show up as macro definitions in every translation
unit that imports a module. Macro definitions can, however, be
exported with the intentionally-ugly #__export_macro__
directive. Implement this feature by not even bothering to serialize
non-exported macros to a module, because clients of that module need
not (should not) know that these macros even exist.
llvm-svn: 138943
The initial incentive was to fix a crash when PCH chaining categories
to an interface, but the fix was done in the "modules way" that I hear
is popular with the kids these days.
Each module stores the local chain of categories and we combine them
when the interface is loaded. We also warn if non-dependent modules
introduce duplicate named categories.
llvm-svn: 138926
sure that all of the CXXConversionDecls go into the same
bucket. Otherwise, name lookup might not find them all. Fixes
<rdar://problem/10041960>.
llvm-svn: 138824
Empty lookups can occur in the DeclContext map when we are chaining PCHs, where
the empty lookup indicates that we already looked in ExternalASTSource.
llvm-svn: 138816
table when serializing an AST file. This was a holdover from the days
before chained PCH, and is a complete waste of time and storage
now. It's a good thing it's useless, because I have no idea how I
would have implemented MaterializeVisibleDecls efficiently in the
presence of modules.
llvm-svn: 138496
Currently getMacroArgExpandedLocation is very inefficient and for the case
of a location pointing at the main file it will end up checking almost all of
the SLocEntries. Make it faster:
-Use a map of macro argument chunks to their expanded source location. The map
is for a single source file, it's stored in the file's ContentCache and lazily
computed, like the source lines cache.
-In SLocEntry's FileInfo add an 'unsigned NumCreatedFIDs' field that keeps track
of the number of FileIDs (files and macros) that were created during preprocessing
of that particular file SLocEntry. This is useful when computing the macro argument
map in skipping included files while scanning for macro arg FileIDs that lexed from
a specific source file. Due to padding, the new field does not increase the size
of SLocEntry.
llvm-svn: 138225
-import-module) vs. loaded because some other module depends on
them. As part of doing this, pass down the module that caused a module
to be loaded directly, rather than assuming that we're loading a
chain. Finally, write out all of the directly-loaded modules when
serializing an AST file (using the new IMPORTS record), so that an AST
file can depend on more than one other AST file, all of which will be
loaded when that AST file is loaded. This allows us to form and load a
tree of modules, but we can't yet load a DAG of modules.
llvm-svn: 137923
all AST files have a normal METADATA record that has the same form
regardless of whether we refer to a chained PCH or any other kind of
AST file.
Introduce the IMPORTS record, which describes all of the AST files
that are imported by this AST file, and how (as a module, a PCH file,
etc.). Currently, we emit at most one entry to this record, to support
chained PCH.
llvm-svn: 137869
type over into the AST context, then make that declaration a
predefined declaration in the AST format. This ensures that different
AST files will at least agree on the (global) declaration ID for 'id',
and eliminates one of the "special" types in the AST file format.
llvm-svn: 137429
eliminating a pile of redundant code (and probably some bugs in the
process). The variation between chained and non-chained PCH is fairly
small now anyway.
llvm-svn: 137410
declaration that never actually gets serialized. Instead, serialize
the various kinds of update records (lexical decls, visible decls, the
addition of an anonymous namespace) for the translation unit, even if
we're not chaining. This way, we won't have to deal with multiple
loaded translation unit declarations.
llvm-svn: 137395
enumerations from the ASTContext into CodeGen, so that we don't need
to serialize it to AST files. This appears to be the last of the
low-hanging fruit for SpecialTypes.
llvm-svn: 137124
layout of a constant NSString from the ASTContext over to CodeGen,
since this is solely CodeGen's responsibility. Eliminates one of the
unnecessary "special" types that we serialize.
llvm-svn: 137121
the last of the ID/offset/index mappings that I know
of. Unfortunately, the "gap" method of testing doesn't work here due
to the way the preprocessing record performs iteration. We'll do more
testing once multi-AST loading is possible.
llvm-svn: 136902
IDs will never cross module boundaries, since they're tied to the
CXXDefinitionData, so just use a local mapping throughout. Eliminate
the global -> local tables and supporting data.
llvm-svn: 136847
AST file, along with an enumeration naming those predefined
declarations. No functionality change, but this will make it easier to
introduce new predefined declarations, when/if we need them.
llvm-svn: 136781
reader, to allow AST files to be loaded with their declarations
remapped to different ID numbers. Fix a number of places where we were
either failing to map local declaration IDs into global declaration
IDs or where interpreting the local declaration IDs within the wrong
module.
I've tested this via the usual "random gaps" method. It works well
except for the preamble tests, because our handling of the precompiled
preamble requires declaration and preprocessed entity to be stable
when parsing code and then loading that back into memory. This
property will hold in general, but my randomized testing naturally
breaks this property to get more coverage. In the future, I expect
that the precompiled preamble logic won't need this property.
I am very unhappy with the current handling of the translation unit,
which is a rather egregious hack. We're going to have to do something
very different here for loading multiple AST files, because we don't
want to have to cope with merging two translation units. Likely, we'll
just handle translation units entirely via "update" records, and
predefine a single, fixed declaration ID for the translation
unit. That will come later.
llvm-svn: 136779
by eliminating the type ID from constructor, destructor, and
conversion function names. There are several reasons for this change:
- A given type (say, int*) isn't guaranteed to have a single, unique
type ID within a chain of PCH files. Hence, we could end up hashing
based on the wrong type ID, causing name lookup to fail.
- The mapping from types back to type IDs required one DenseMap
entry for every type that was ever deserialized, which was an
unacceptable cost to support just the name lookup of constructors,
destructors, and conversion functions. Plus, this mapping could
never actually work with chained or multiple PCH, based on the first
bullet.
Once we have eliminated the type from the hash function, these
problems go away, as does my horrible "reverse type remap" hack, which
was doomed from the start (see bullet #1 above) and far too
complicated.
However, note that removing the type from the hash function means that
all constructors, destructors, and conversion functions have the same
hash key, so I've updated the caller to double-check that the
declarations found have the appropriate name.
llvm-svn: 136708
reader. This scheme permits an AST file to be loaded with its type IDs
shifted anywhere in the type ID space.
At present, the type indices are still allocated in the same boring
way they always have been, just by adding up the number of types in
each PCH file within the chain. However, I've done testing with this
patch by randomly sliding the base indices at load time, to ensure
that remapping is occurring as expected. I may eventually formalize
this in some testing flag, but loading multiple (non-chained) AST
files at once will eventually exercise the same code.
There is one known problem with this patch, which involves name lookup
of operator names (e.g., "x.operator int*()") in cases where multiple
PCH files in the chain. The hash function itself depends on having a
stable type ID, which doesn't happen with chained PCH and *certainly*
doesn't happen when sliding type IDs around. We'll need another
approach. I'll tackle that next.
llvm-svn: 136693
completely broken deserialization mapping code we had for VTableUses,
which would have broken horribly as soon as our local-to-global ID
mapping became interesting.
llvm-svn: 136371
we could turn this into an on-disk hash table so we don't load the
whole thing the first time we need it. However, it tends to be very,
very small (i.e., empty) for most precompiled headers, so it isn't all
that interesting.
llvm-svn: 136352
- Added LazyVector::erase() to support this use case.
- Factored out the LazyDecl-of-Decls to RecordData translation in
the ASTWriter. There is still a pile of code duplication here to
eliminate.
llvm-svn: 136270
contents are lazily loaded on demand from an external source (e.g., an
ExternalASTSource or ExternalSemaSource). The "loaded" entities are
kept separate from the "local" entities, so that the two can grow
independently.
Switch Sema::TentativeDefinitions from a normal vector that is eagerly
populated by the ASTReader into one of these LazyVectors, making the
ASTReader a bit more like me (i.e., lazy).
llvm-svn: 136262
etc. With this I think essentially all of the SourceManager APIs are
converted. Comments and random other bits of cleanup should be all thats
left.
llvm-svn: 136057
and various other 'expansion' based terms. I've tried to reformat where
appropriate and catch as many references in comments but I'm going to do
several more passes. Also I've tried to expand parameter names to be
more clear where appropriate.
llvm-svn: 136056
so that we have one, simple way to map from global bit offsets to
local bit offsets. Eliminates a number of loops over the chain, and
generalizes for more interesting bit remappings.
Also, as an amusing oddity, we were computing global bit offsets
*backwards* for preprocessed entities (e.g., the directly included PCH
file in the chain would start at offset zero, rather than the original
PCH that occurs first in translation unit). Even more amusingly, it
made precompiled preambles work, because we were forgetting to adjust
the local bit offset to a global bit offset when storing preprocessed
entity offsets in the ASTUnit. Two wrongs made a right, and now
they're both right.
llvm-svn: 135750
entities generated directly by the preprocessor from those loaded from
the external source (e.g., the ASTReader). By separating these two
sets of entities into different vectors, we allow both to grow
independently, and eliminate the need for preallocating all of the
loaded preprocessing entities. This is similar to the way the recent
SourceManager refactoring treats FileIDs and the source location
address space.
As part of this, switch over to building a continuous range map to
track preprocessing entities.
llvm-svn: 135646
source locations from source locations loaded from an AST/PCH file.
Previously, loading an AST/PCH file involved carefully pre-allocating
space at the beginning of the source manager for the source locations
and FileIDs that correspond to the prefix, and then appending the
source locations/FileIDs used for parsing the remaining translation
unit. This design forced us into loading PCH files early, as a prefix,
whic has become a rather significant limitation.
This patch splits the SourceManager space into two parts: for source
location "addresses", the lower values (growing upward) are used to
describe parsed code, while upper values (growing downward) are used
for source locations loaded from AST/PCH files. Similarly, positive
FileIDs are used to describe parsed code while negative FileIDs are
used to file/macro locations loaded from AST/PCH files. As a result,
we can load PCH/AST files even during parsing, making various
improvemnts in the future possible, e.g., teaching #include <foo.h> to
look for and load <foo.h.gch> if it happens to be already available.
This patch was originally written by Sebastian Redl, then brought
forward to the modern age by Jonathan Turner, and finally
polished/finished by me to be committed.
llvm-svn: 135484