for Objective-C protocols, including:
- Using the first declaration as the canonical declaration
- Using the definition as the primary DeclContext
- Making sure that all declarations have a pointer to the definition
data, and that we know which declaration is the definition
- Serialization support for redeclaration chains and for adding
definitions to already-serialized declarations.
However, note that we're not taking advantage of much of this code
yet, because we're still re-using ObjCProtocolDecls.
llvm-svn: 147410
separately-allocated DefinitionData structure. Introduce various
functions that will help with the separation of declarations from
definitions (isThisDeclarationADefinition(), hasDefinition(),
getDefinition()).
llvm-svn: 147408
features needed for a particular module to be available. This allows
mixed-language modules, where certain headers only work under some
language variants (e.g., in C++, std.tuple might only be available in
C++11 mode).
llvm-svn: 147387
set of (previously-canonical) declaration IDs to the module file, so
that future AST reader instances that load the module know which
declarations are merged. This is important in the fairly tricky case
where a declaration of an entity, e.g.,
@class X;
occurs before the import of a module that also declares that
entity. We merge the declarations, and record the fact that the
declaration of X loaded from the module was merged into the (now
canonical) declaration of X that we parsed.
llvm-svn: 147181
declaration of that same class that either came from some other module
or occurred in the translation unit loading the module. In this case,
we need to merge the two redeclaration chains immediately so that all
such declarations have the same canonical declaration in the resulting
AST (even though they don't in the module files we've imported).
Focusing on Objective-C classes until I'm happy with the design, then
I'll both (1) extend this notion to other kinds of declarations, and
(2) optimize away this extra checking when we're not dealing with
modules. For now, doing this checking for PCH files/preambles gives us
better testing coverage.
llvm-svn: 147123
redeclaration chains: only ever have the reader search for
redeclarations of the first (canonical) declaration, since we only
ever record redeclaration ranges for the that declaration. Searching
for redeclarations of non-canonical declarations will never find
anything, so it's a complete waste of time.
llvm-svn: 147055
hitting a submodule that was never actually created, e.g., because
that header wasn't parsed. In such cases, complain (because the
module's umbrella headers don't cover everything) and fall back to
including the header.
Later, we'll add a warning at module-build time to catch all such
cases. However, this fallback is important to eliminate assertions in
the ASTWriter when this happens.
llvm-svn: 146933
with a definition pointer (e.g., C++ and Objective-C classes), zip
through the redeclaration chain to make sure that all of the
declarations point to the definition data.
As part of this, realized again why the first redeclaration of an
entity in a file is important, and brought back that idea.
llvm-svn: 146886
redeclaration templates (RedeclarableTemplateDecl), similarly to the
way (de-)serialization is implemented for Redeclarable<T>. In the
process, found a simpler formulation for handling redeclaration
chains and implemented that in both places.
The new test establishes that we're building the redeclaration chains
properly. However, the FIXME indicates where we're tickling a
different bug that has to do with us not setting the DefinitionData
pointer properly in redeclarations that we detected after the
definition itself was deserialized. The (separable) fix for that bug
is forthcoming.
llvm-svn: 146883
imported modules that don't introduce any new entities of a particular
kind. Allow these entries to be replaced with entries for another
loaded module.
In the included test case, selectors exhibit this behavior.
llvm-svn: 146870
which there are no redeclarations. This reduced by size of the PCH
file for Cocoa.h by ~650k: ~536k of that was in the new
LOCAL_REDECLARATIONS table, which went from a ridiculous 540k down to
an acceptable 3.5k, while the rest was due to the more compact
abbreviated representation of redeclarable declaration kinds (which no
longer need to store the 'first' declaration ID).
llvm-svn: 146869
variable is initialized by a non-constant expression, and pass in the variable
being declared so that earlier-initialized fields' values can be used.
Rearrange VarDecl init evaluation to make this possible, and in so doing fix a
long-standing issue in our C++ constant expression handling, where we would
mishandle cases like:
extern const int a;
const int n = a;
const int a = 5;
int arr[n];
Here, n is not initialized by a constant expression, so can't be used in an ICE,
even though the initialization expression would be an ICE if it appeared later
in the TU. This requires computing whether the initializer is an ICE eagerly,
and saving that information in PCH files.
llvm-svn: 146856
chains. The previous implementation relied heavily on the declaration
chain being stored as a (circular) linked list on disk, as it is in
memory. However, when deserializing from multiple modules, the
different chains could get mixed up, leading to broken declaration chains.
The new solution keeps track of the first and last declarations in the
chain for each module file. When we load a declaration, we search all
of the module files for redeclarations of that declaration, then
splice together all of the lists into a coherent whole (along with any
redeclarations that were actually parsed).
As a drive-by fix, (de-)serialize the redeclaration chains of
TypedefNameDecls, which had somehow gotten missed previously. Add a
test of this serialization.
This new scheme creates a redeclaration table that is fairly large in
the PCH file (on the order of 400k for Cocoa.h's 12MB PCH file). The
table is mmap'd in and searched via a binary search, but it's still
quite large. A future tweak will eliminate entries for declarations
that have no redeclarations anywhere, and should
drastically reduce the size of this table.
llvm-svn: 146841
including deserializing their bodies, so that any other declarations that
get referenced in the body will be fully deserialized by the time we pass them to the consumer.
Could not reduce to a test case unfortunately. rdar://10587158.
llvm-svn: 146817
applies to an actual definition. Plus, clarify the purpose of this
field and give the accessor a different name, since getLocEnd() is
supposed to be the same as getSourceRange().getEnd().
llvm-svn: 146694
declarations and definitions) as ObjCInterfaceDecls within the same
redeclaration chain. This new representation matches what we do for
C/C++ variables/functions/classes/templates/etc., and makes it
possible to answer the query "where are all of the declarations of
this class?"
llvm-svn: 146679
redeclaration chain for Objective-C classes, including:
- Using the first declaration as the canonical declaration.
- Using the definition as the primary DeclContext
- Making sure that all declarations have a pointer to the definition
data, and the definition knows that it is the definition.
- Serialization support for when a definition gets added to a
declaration that comes from an AST file.
However, note that we're not taking advantage of much of this code
yet, because we're still re-using ObjCInterfaceDecls.
llvm-svn: 146667
separately-allocated DefinitionData structure, which we manage the
same way as CXXRecordDecl::DefinitionData. This prepares the way for
making ObjCInterfaceDecls redeclarable, to more accurately model
forward declarations of Objective-C classes and eliminate the mutation
of ObjCInterfaceDecl that causes us serious trouble in the AST reader.
Note that ObjCInterfaceDecl's accessors are fairly robust against
being applied to forward declarations, because Clang (and Sema in
particular) doesn't perform RequireCompleteType/hasDefinition() checks
everywhere it has to. Each of these overly-robust cases is marked with
a FIXME, which we can tackle over time.
llvm-svn: 146644
belonged in the Serialization library, it's setting up a compilation,
not just deserializing.
This should fix PR11512, making Serialization actually be layered below
Frontend, a long standing layering violation in Clang.
llvm-svn: 146233
different from what the comments indicated. Also drop a no longer used
include that also violates the layering between Serialization and
Frontend.
llvm-svn: 146230
part of HeaderSearch. This function just normalizes filenames for use
inside of a synthetic include directive, but it is used in both the
Frontend and Serialization libraries so it needs a common home.
llvm-svn: 146227
diagnostics. Conflating them was highly confusing and makes it harder to
establish a firm layering separation between these two libraries.
llvm-svn: 146207
umbrella headers in the sense that all of the headers within that
directory (and eventually its subdirectories) are considered to be
part of the module with that umbrella directory. However, unlike
umbrella headers, which are expected to include all of the headers
within their subdirectories, Clang will automatically include all of
the headers it finds in the named subdirectory.
The intent here is to allow a module map to trivially turn a
subdirectory into a module, where the module's structure can mimic the
directory structure.
llvm-svn: 146165
header to also support umbrella directories. The umbrella directory
for an umbrella header is the directory in which the umbrella header
resides.
No functionality change yet, but it's coming.
llvm-svn: 146158
to re-export anything that it imports. This opt-in feature makes a
module behave more like a header, because it can be used to re-export
the transitive closure of a (sub)module's dependencies.
llvm-svn: 145811
when deserialized, fixing random crashes in libclang.
Also simplifies how OpaqueValueExprs are [de]serialized.
The reader/writer automatically retains pointer equality of sub-statements (when a
statement node is referenced in multiple nodes), so no need to manually handle it.
llvm-svn: 145752
"main" files that import modules. When loading any of these kinds of
AST files, we make the modules that were imported visible into the
translation unit that loaded the PCH file or preamble.
llvm-svn: 145737
precompiled header. Previously, we were trying to gather predefines
buffers from all kinds of AST files (which doesn't make sense) and
were performing some validation when AST files were loaded as main
files.
With these tweaks, using PCH files that import modules no longer fails
immediately (due to mismatched predefines buffers). However, module
visibility is lost, so this feature does not yet work.
llvm-svn: 145709
only the macro definitions from visible (sub)modules will actually be
visible. This provides the same behavior for macros that r145640
provided for declarations.
llvm-svn: 145683
(sub)module, all of the names may be hidden, just the macro names may
be exposed (for example, after the preprocessor has seen the import of
the module but the parser has not), or all of the names may be
exposed. Importing a module makes its names, and the names in any of
its non-explicit submodules, visible to name lookup (transitively).
This commit only introduces the notion of name visible and marks
modules and submodules as visible when they are imported. The actual
name-hiding logic in the AST reader will follow (along with test cases).
llvm-svn: 145586
a standard global/local scheme, so that submodule definitions will
eventually be able to refer to submodules in other top-level
modules. We'll need this functionality soonish.
llvm-svn: 145549
library, since modules cut across all of the libraries. Rename
serialization::Module to serialization::ModuleFile to side-step the
annoying naming conflict. Prune a bunch of ModuleMap.h includes that
are no longer needed (most files only needed the Module type).
llvm-svn: 145538
we may end up having added more pending stuff to do, so go in a loop until everything
is cleared out.
This fixes the error in rdar://10278815 which has a certain David Lynch-esque quality..
error: unknown type name 'BOOL'; did you mean 'BOOL'?
llvm-svn: 145536
submodules. This information will eventually be used for name hiding
when dealing with submodules. For now, we only use it to ensure that
the module "key" returned when loading a module will always be a
module (rather than occasionally being a FileEntry).
llvm-svn: 145497
inside an objc container that "contains" other file-level declarations.
When getting the array of file-level declarations that overlap with a file region,
we failed to report that the region overlaps with an objc container, if
the container had other file-level declarations declared lexically inside it.
Fix this by marking such declarations as "isTopLevelDeclInObjCContainer" in the AST
and handling them appropriately.
llvm-svn: 145109
file in the source manager. This allows us to properly create and use
modules described by module map files without umbrella headers (or
with incompletely umbrella headers). More generally, we can actually
build a PCH file that makes use of file -> buffer remappings, which
could be useful in libclang in the future.
llvm-svn: 144830
it is going to be rewritten (and the chain will be serialized again), otherwise we may form a cycle in its
categories list when deserializing.
Also introduce ASTMutationListener::CompletedObjCForwardRef to notify that a forward reference
was completed; using Decl's isChangedSinceDeserialization/setChangedSinceDeserialization
is bug inducing and kinda gross, we should phase it out.
Fixes infinite loop in rdar://10418538.
llvm-svn: 144465
In certain cases ASTReader would call the normal DiagnosticsEngine API to initialize
the state of diagnostic pragmas but DiagnosticsEngine would try to compare source locations
leading to crash because the main FileID was not yet initialized.
Yet another case of the ASTReader trying to use the normal APIs and inadvertently breaking
invariants. Fix this by having the ASTReader set up the internal state directly.
llvm-svn: 144153
property references to use a new PseudoObjectExpr
expression which pairs a syntactic form of the expression
with a set of semantic expressions implementing it.
This should significantly reduce the complexity required
elsewhere in the compiler to deal with these kinds of
expressions (e.g. IR generation's special l-value kind,
the static analyzer's Message abstraction), at the lower
cost of specifically dealing with the odd AST structure
of these expressions. It should also greatly simplify
efforts to implement similar language features in the
future, most notably Managed C++'s properties and indexed
properties.
Most of the effort here is in dealing with the various
clients of the AST. I've gone ahead and simplified the
ObjC rewriter's use of properties; other clients, like
IR-gen and the static analyzer, have all the old
complexity *and* all the new complexity, at least
temporarily. Many thanks to Ted for writing and advising
on the necessary changes to the static analyzer.
I've xfailed a small diagnostics regression in the static
analyzer at Ted's request.
llvm-svn: 143867
that it retains source location information for the type. Aside from
general goodness (being able to walk the types described in that
information), we now have a proper representation for dependent
delegating constructors. Fixes PR10457 (for real).
llvm-svn: 143410
Introduce a FILE_SORTED_DECLS [de]serialization record that contains
a file sorted array of file-level DeclIDs in a PCH/Module.
The rationale is to allow "targeted" deserialization of decls inside
a range of a source file.
Cocoa PCH increased by 0.8%
Difference of creation time for Cocoa PCH is below the noise level.
llvm-svn: 143238
of decl bit offsets.
This allows us to easily get at the location of a decl without deserializing it.
It increases size of Cocoa PCH by only 0.2%.
llvm-svn: 143123
AST file more lazy, so that we don't eagerly load that information for
all known identifiers each time a new AST file is loaded. The eager
reloading made some sense in the context of precompiled headers, since
very few identifiers were defined before PCH load time. With modules,
however, a huge amount of code can get parsed before we see an
@import, so laziness becomes important here.
The approach taken to make this information lazy is fairly simple:
when we load a new AST file, we mark all of the existing identifiers
as being out-of-date. Whenever we want to access information that may
come from an AST (e.g., whether the identifier has a macro definition,
or what top-level declarations have that name), we check the
out-of-date bit and, if it's set, ask the AST reader to update the
IdentifierInfo from the AST files. The update is a merge, and we now
take care to merge declarations before/after imports with declarations
from multiple imports.
The results of this optimization are fairly dramatic. On a small
application that brings in 14 non-trivial modules, this takes modules
from being > 3x slower than a "perfect" PCH file down to 30% slower
for a full rebuild. A partial rebuild (where the PCH file or modules
can be re-used) is down to 7% slower. Making the PCH file just a
little imperfect (e.g., adding two smallish modules used by a bunch of
.m files that aren't in the PCH file) tips the scales in favor of the
modules approach, with 24% faster partial rebuilds.
This is just a first step; the lazy scheme could possibly be improved
by adding versioning, so we don't search into modules we already
searched. Moreover, we'll need similar lazy schemes for all of the
other lookup data structures, such as DeclContexts.
llvm-svn: 143100
essence, the redeclaration chain for a class could end up in an
inconsistent state while deserializing multiple declarations in that
chain, where the circular linked list was not, in fact,
circular. Since only two redeclarations of the same entity will get
loaded when we're in this state, restore circularity when both have
been loaded. Fixes <rdar://problem/10324940> / PR11195.
llvm-svn: 143037
expressions: expressions which refer to a logical rather
than a physical l-value, where the logical object is
actually accessed via custom getter/setter code.
A subsequent patch will generalize the AST for these
so that arbitrary "implementing" sub-expressions can
be provided.
Right now the only client is ObjC properties, but
this should be generalizable to similar language
features, e.g. Managed C++'s __property methods.
llvm-svn: 142914
statements. As noted in the documentation for the AST node, the
semantics of __if_exists/__if_not_exists are somewhat different from
the way Visual C++ implements them, because our parsed-template
representation can't accommodate VC++ semantics without serious
contortions. Hopefully this implementation is "good enough".
llvm-svn: 142901
preprocessed entities that are #included in the range that we are interested.
This is useful when we are interested in preprocessed entities of a specific file, e.g
when we are annotating tokens. There is also an optimization where we cache the last
result of PreprocessingRecord::getPreprocessedEntitiesInRange and we re-use it if
there is a call with the same range as before.
rdar://10313365
llvm-svn: 142887
in such a case just write out a reference of a previously serialized Stmt, instead
of serializing it all over again.
This saves memory + space + [de]serializing time, and avoids blowing up memory
with pathological cases. rdar://10293911
llvm-svn: 142696
-Add the location of the class name to all objc container decls, not just ObjCInterfaceDecl.
-Make objc decls consistent with the rest of the NamedDecls and have getLocation() point to the
class name, not the location of '@'.
llvm-svn: 141061
Instead of always storing all source locations for the selector identifiers
we check whether all the identifiers are in a "standard" position; "standard" position is
-Immediately before the arguments: -(id)first:(int)x second:(int)y;
-With a space between the arguments: -(id)first: (int)x second: (int)y;
-For nullary selectors, immediately before ';': -(void)release;
In such cases we infer the locations instead of storing them.
llvm-svn: 140989
Instead of always storing all source locations for the selector identifiers
we check whether all the identifiers are in a "standard" position; "standard" position is
-Immediately before the arguments: [foo first:1 second:2]
-With a space between the arguments: [foo first: 1 second: 2]
-For nullary selectors, immediately before ']': [foo release]
In such cases we infer the locations instead of storing them.
llvm-svn: 140987
PreprocessingRecord's getPreprocessedEntitiesInRange.
Also remove all the stuff that were added in ASTUnit that are unnecessary now
that we do a binary search for preprocessed entities and deserialize only
what is necessary.
llvm-svn: 140063
check whether the requested location points inside the precompiled preamble,
in which case the returned source location will be a "loaded" one.
llvm-svn: 140060
which will do a binary search and return a pair of iterators
for preprocessed entities in the given source range.
Source ranges of preprocessed entities are stored twice currently in
the PCH/Module file but this will be fixed in a subsequent commit.
llvm-svn: 140058
don't call ReadSLocEntryRecord() directly because the entry may have
already been loaded in which case calling ReadSLocEntryRecord()
directly would trigger an assertion in SourceManager.
llvm-svn: 140052
arbitrary amount of code. This forces us to stage the AST writer more
strictly, ensuring that we don't assign a declaration ID to a
declaration until after we're certain that no more modules will get
loaded.
llvm-svn: 139974