identifiers to determine good typo-correction candidates. Once we've
identified those candidates, we perform name lookup on each of them
and the consider the results.
This optimization makes typo correction > 2x faster on a benchmark
example using a single typo (NSstring) in a tiny file that includes
Cocoa.h from a precompiled header, since we are deserializing far less
information now during typo correction.
There is a semantic change here, which is interesting. The presence of
a similarly-named entity that is not visible can now affect typo
correction. This is both good (you won't get weird corrections if the
thing you wanted isn't in scope) and bad (you won't get good
corrections if there is a similarly-named-but-completely-unrelated
thing). Time will tell whether it was a good choice or not.
llvm-svn: 116528
instead of deserializing the complete declaration context of the record.
Iterating over the fields of a record is very common (e.g to determine the layout), unfortunately we needlessly deserialize every declaration
that the declaration context of the record contains; this can be bad for large C++ classes that contain a lot of methods.
Fix this by allow deserialization of just the fields when we want to iterate over them.
Progress for rdar://7260160.
llvm-svn: 116507
following amusing sequence:
- AST writing schedules writing a type X* that it had never seen
before
- AST writing starts writing another declaration, ends up
deserializing X* from a prior AST file. Now we have two type IDs for
the same type!
- AST writer tries to write X*. It only has the lower-numbered ID
from the the prior AST file, so references to the higher-numbered ID
that was scheduled for writing go off into lalaland.
To fix this, keep the higher-numbered ID so we end up writing the type
twice. Since this issue occurs so rarely, and type records are
generally rather small, I deemed this better than the alternative: to
keep a separate mapping from the higher-numbered IDs to the
lower-numbered IDs, which we would end up having to check whenever we
want to deserialize any type.
Fixes <rdar://problem/8511624>, I think.
llvm-svn: 115647
waiting until we think we need it: we didn't catch all of the places
where we actually needed it, and we probably wouldn't ever. Fixes a
C++ PCH crasher.
llvm-svn: 115617
the interface as having changed since it was originally
serialized. This ensures that we see class extensions/categories in
chained PCH files.
llvm-svn: 115421
file is somehow changed in a chained PCH file, make sure that we write
out the macro definition. Fixes part of <rdar://problem/8499034>.
llvm-svn: 115259
The canonical FunctionTemplateDecl contains the specializations but we cannot use getCanonicalDecl on Template because it may still be initializing.
Write and read it from PCH.
Fixes http://llvm.org/PR8134
llvm-svn: 113744
When including a PCH and later re-emitting to another PCH, the name lookup tables of DeclContexts
may be incomplete, since we now lazily deserialize the visible decls of a particular name.
Fix the issue by iterating over the un-deserialized visible decls and completing the lookup tables
of DeclContexts before writing them out.
llvm-svn: 111698
-There are 2 instances that change the TokenID for GNU libstdc++ 4.2 compatibility.
To handler those cases introduce a RevertedTokenID bitfield, RevertTokenIDToIdentifier() and hasRevertedTokenIDToIdentifier() methods.
Store the bitfield in PCH.
llvm-svn: 110868
redeclaration. That way we are sure that the full redeclarations chain is loaded.
When using chained PCHs, first declarations point to the most recent redeclarations in the same PCH.
To address this use a REDECLS_UPDATE_LATEST record block to keep track of which first declarations need
to point to a most recent redeclaration in another PCH.
llvm-svn: 110125
DeclIsRequiredFunctionOrFileScopedVar.
This is essentially a CodeGen predicate that is also needed by the PCH mechanism to determine whether a decl
needs to be deserialized during PCH loading for codegen purposes.
Since this logic is shared by CodeGen and the PCH mechanism, move it to the ASTContext,
thus CodeGenModule's GetLinkageForFunction/GetLinkageForVariable and the GVALinkage enum is moved out of CodeGen.
This fixes current (and avoids future) codegen-from-PCH bugs.
llvm-svn: 109784
DeclIsRequiredFunctionOrFileScopedVar.
This function is part of the public CodeGen interface since it's essentially a CodeGen predicate that is also
needed by the PCH mechanism to determine whether a decl needs to be deserialized during PCH loading for codegen purposes.
This fixes current (and avoids future) codegen-from-PCH bugs.
llvm-svn: 109546
reparsing an ASTUnit. When saving a preamble, create a buffer larger
than the actual file we're working with but fill everything from the
end of the preamble to the end of the file with spaces (so the lexer
will quickly skip them). When we load the file, create a buffer of the
same size, filling it with the file and then spaces. Then, instruct
the lexer to start lexing after the preamble, therefore continuing the
parse from the spot where the preamble left off.
It's now possible to perform a simple preamble build + parse (+
reparse) with ASTUnit. However, one has to disable a bunch of checking
in the PCH reader to do so. That part isn't committed; it will likely
be handled with some other kind of flag (e.g., -fno-validate-pch).
As part of this, fix some issues with null termination of the memory
buffers created for the preamble; we were trying to explicitly
NULL-terminate them, even though they were also getting implicitly
NULL terminated, leading to excess warnings about NULL characters in
source files.
llvm-svn: 109445
leaks though) and add methods to its interface for adding/finding specializations.
Simplifies its users a bit and we no longer need to replace specializations in the folding set with
their redeclarations. We just return the most recent redeclarations.
As a bonus, it fixes http://llvm.org/PR7670.
llvm-svn: 108832
A ParmVarDecl instantiated from a FunctionProtoType may have Record as DeclContext,
in which case isStaticDataMember() will erroneously return true.
llvm-svn: 108692
Some of the invariant checks for creating Record/Enum types don't hold true during PCH reading.
Introduce more suitable ASTContext::getRecordType() and getEnumType().
llvm-svn: 107598
This commit 'introduces' a slightly different way to restore the state of the AST object.
It makes PCHDeclReader/PCHDeclWriter friends and gives them access to the private members of the object.
The rationale is to avoid using/modifying the AST interfaces for PCH read/write so that to:
-Avoid complications with objects that have side-effects during creation or when using some setters.
-Not 'pollute' the AST interface with methods only used by the PCH reader/writer
-Allow AST objects to be read-only.
llvm-svn: 107219
Before this commit, sub-stmts were stored as encountered and when they were placed in the Stmts stack we had to know what index
each stmt operand has. This complicated supporting variable sub-stmts and sub-stmts that were contained in TypeSourceInfos, e.g.
x = sizeof(int[1]);
would crash PCH.
Now, sub-stmts are stored in reverse order, from last to first, so that when reading them, in order to get the next sub-stmt we just
need to pop the last stmt from the stack. This greatly simplified the way stmts are written and read (just use PCHWriter::AddStmt and
PCHReader::ReadStmt accordingly) and allowed variable stmt operands and TypeSourceInfo exprs.
llvm-svn: 107087
Amadini.
This change introduces a new expression node type, OffsetOfExpr, that
describes __builtin_offsetof. Previously, __builtin_offsetof was
implemented using a unary operator whose subexpression involved
various synthesized array-subscript and member-reference expressions,
which was ugly and made it very hard to instantiate as a
template. OffsetOfExpr represents the AST more faithfully, with proper
type source information and a more compact representation.
OffsetOfExpr also has support for dependent __builtin_offsetof
expressions; it can be value-dependent, but will never be
type-dependent (like sizeof or alignof). This commit introduces
template instantiation for __builtin_offsetof as well.
There are two major caveats to this patch:
1) CodeGen cannot handle the case where __builtin_offsetof is not a
constant expression, so it produces an error. So, to avoid
regressing in C, we retain the old UnaryOperator-based
__builtin_offsetof implementation in C while using the shiny new
OffsetOfExpr implementation in C++. The old implementation can go
away once we have proper CodeGen support for this case, which we
expect won't cause much trouble in C++.
2) __builtin_offsetof doesn't work well with non-POD class types,
particularly when the designated field is found within a base
class. I will address this in a subsequent patch.
Fixes PR5880 and a bunch of assertions when building Boost.Python
tests.
llvm-svn: 102542
statements. Instead of the @try having a single @catch, where all of
the @catch's were chained (using an O(n^2) algorithm nonetheless),
@try just holds an array of its @catch blocks. The resulting AST is
slightly more compact (not important) and better represents the actual
language semantics (good).
llvm-svn: 102221
method parameter, provide a note pointing at the parameter itself so
the user does not have to manually look for the function/method being
called and match up parameters to arguments. For example, we now get:
t.c:4:5: warning: incompatible pointer types passing 'long *' to
parameter of
type 'int *' [-pedantic]
f(long_ptr);
^~~~~~~~
t.c:1:13: note: passing argument to parameter 'x' here
void f(int *x);
^
llvm-svn: 102038
destination type for initialization, assignment, parameter-passing,
etc. The main issue fixed here is that we used rather confusing
wording for diagnostics such as
t.c:2:9: warning: initializing 'char const [2]' discards qualifiers,
expected 'char *' [-pedantic]
char *name = __func__;
^ ~~~~~~~~
We're not initializing a 'char const [2]', we're initializing a 'char
*' with an expression of type 'char const [2]'. Similar problems
existed for other diagnostics in this area, so I've normalized them all
with more precise descriptive text to say what we're
initializing/converting/assigning/etc. from and to. The warning for
the code above is now:
t.c:2:9: warning: initializing 'char *' from an expression of type
'char const [2]' discards qualifiers [-pedantic]
char *name = __func__;
^ ~~~~~~~~
Fixes <rdar://problem/7447179>.
llvm-svn: 100832
buffer was invalid when it was created, and use that bit to always set
the "Invalid" flag according to whether the buffer is invalid. This
ensures that all accesses to an invalid buffer are marked invalid,
improving recovery.
llvm-svn: 98690
presence or absence of header map arguments when using the precompiled
header would cause Clang to get confused about which headers had
already been included/imported, along with their controlling
macros. The fundamental problem is that the serialization of the
header search information was relying on the UIDs of FileEntry objects
at PCH generation time and PCH load time to be equivalent, which
effectively means that we had to probe the same files in the same
order. Differing header map arguments caused an extra FileEntry
lookup, but it's easy to imagine other minor command-line arguments
triggering this problem.
Header-search information is now encoded along with the
source-location entry for a file, so that we register information
about a file's properties as a header at the same time we create the
FileEntry for that file.
Fixes <rdar://problem/7743243>.
llvm-svn: 98636
SourceManager's getBuffer() and, therefore, could fail, along with
Preprocessor::getSpelling(). Use the Invalid parameters in the literal
parsers (string, floating point, integral, character) to make them
robust against errors that stem from, e.g., PCH files that are not
consistent with the underlying file system.
I still need to audit every use caller to all of these routines, to
determine which ones need specific handling of error conditions.
llvm-svn: 98608
- This is designed to make it obvious that %clang_cc1 is a "test variable"
which is substituted. It is '%clang_cc1' instead of '%clang -cc1' because it
can be useful to redefine what gets run as 'clang -cc1' (for example, to set
a default target).
llvm-svn: 91446
files.
- The issue is that PCH uses a stat cache, which may reference files which have
been deleted or moved. In such cases ContentCache::getBuffer was returning 0
but most clients are incapable of dealing with this (i.e., they don't).
For the time being, resolve this issue by just making up some invalid file
contents and. Eventually we should detect that we are in an inconsistent
situation and error out with a nice message that the PCH is out of date.
llvm-svn: 90699
- Declaration context of ParmVarDecls (that we got from the Declarator) was not their containing function.
- C++ out-of-line method definitions didn't get an access specifier.
Both were exposed by a crash when emitting a C++ method to a PCH file (assert at Decl::CheckAccessDeclContext()).
llvm-svn: 75597
The idea is to segregate Objective-C "object" pointers from general C pointers (utilizing the recently added ObjCObjectPointerType). The fun starts in Sema::GetTypeForDeclarator(), where "SomeInterface *" is now represented by a single AST node (rather than a PointerType whose Pointee is an ObjCInterfaceType). Since a significant amount of code assumed ObjC object pointers where based on C pointers/structs, this patch is very tedious. It should also explain why it is hard to accomplish this in smaller, self-contained patches.
This patch does most of the "heavy lifting" related to moving from PointerType->ObjCObjectPointerType. It doesn't include all potential "cleanups". The good news is additional cleanups can be done later (some are noted in the code). This patch is so large that I didn't want to include any changes that are purely aesthetic.
By making the ObjC types truly built-in, they are much easier to work with (and require fewer "hacks"). For example, there is no need for ASTContext::isObjCIdStructType() or ASTContext::isObjCClassStructType()! We believe this change (and the follow-up cleanups) will pay dividends over time.
Given the amount of code change, I do expect some fallout from this change (though it does pass all of the clang tests). If you notice any problems, please let us know asap! Thanks.
llvm-svn: 75314
FILE type, rather than using name lookup to find FILE within the
translation unit. Within precompiled headers, FILE is treated as yet
another "special type" (like __builtin_va_list).
This change should provide a performance improvement (not verified),
since the lookup into the translation unit declaration
forces the (otherwise unneeded) construction of a large hash table.
More importantly, with precompiled headers, the construction
of that table requires deserializing most of the top-level
declarations from the precompiled header, which are then unused.
Fixes PR 4509.
llvm-svn: 74911
with a particular system root directory and can be used with a different
system root directory when the headers it depends on have been installed.
Relocatable precompiled headers rewrite the file names of the headers used
when generating the PCH file into the corresponding file names of the
headers available when using the PCH file.
Addresses <rdar://problem/7001604>.
llvm-svn: 74885
of a top-level declaration loads another top-level declaration of the
same name whose type depends on the first declaration having been
completed. This commit breaks the circular dependency by delaying
loads of top-level declarations triggered by loading a name until we
are no longer recursively loading types or declarations.
llvm-svn: 74847
line when using a PCH that were not provided when building the PCH
file. If those names were used as identifiers somewhere in the PCH
file, reject the PCH file.
llvm-svn: 70321
PCH file and the predefines buffer used when including the PCH
file. We (explicitly) detect conflicting macro definitions (rejecting
the PCH file) and about missing macro definitions (they'll be
automatically pulled from the PCH file anyway).
We're missing some checking to make sure that new macro definitions
won't have any impact on the PCH file itself (e.g., #define'ing an
identifier that the PCH file used).
llvm-svn: 70316
pools, combined). The methods in the global method pool are lazily
loaded from an on-disk hash table when Sema looks into its version of
the hash tables.
llvm-svn: 69989
SEL, Class, Protocol, CFConstantString, and
__objcFastEnumerationState. With this, we can now run the Objective-C
methods and properties PCH tests.
llvm-svn: 69932
This enables class recognition to work with PCH. I believe this means we can remove Sema::ObjCInterfaceDecls and it's usage within Sema::LookupName(). Will investigate.
llvm-svn: 69891
identifiers from a precompiled header.
This patch changes the primary name lookup method for entities within
a precompiled header. Previously, we would load all of the names of
declarations at translation unit scope into a large DenseMap (inside
the TranslationUnitDecl's DeclContext), and then perform a special
"last resort" lookup into this DeclContext when we knew there was a
PCH file (see Sema::LookupName). Now, when we see an identifier named
for the first time, we load all of the declarations with that name
that are visible from the translation unit into the IdentifierInfo's
chain of declarations. Thus, the explicit "look into the translation
unit's DeclContext" code is gone, and Sema effectively uses the same
IdentifierInfo-based name lookup mechanism whether we are using a PCH
file or not.
This approach should help PCH scale with the size of the input program
rather than the size of the PCH file. The "Hello, World!" application
with Carbon.h as a PCH file now loads 20% of the identifiers in the
PCH file rather than 85% of the identifiers.
90% of the 20% of identifiers loaded are actually loaded when we
deserialize the preprocessor state. The next step is to make the
preprocessor load macros lazily, which should drastically reduce the
number of types, declarations, and identifiers loaded for "Hello,
World".
llvm-svn: 69737
tentative definitions off to the ASTConsumer at the end of the
translation unit.
Eliminate CodeGen's internal tracking of tentative definitions, and
instead hook into ASTConsumer::CompleteTentativeDefinition. Also,
tweak the definition-deferal logic for C++, where there are no
tentative definitions.
Fixes <rdar://problem/6808352>, and will make it much easier for
precompiled headers to cope with tentative definitions in the future.
llvm-svn: 69681
AST context's __builtin_va_list type will be set when the PCH file is
loaded. This fixes the crash when CodeGen'ing a va_arg expression
pulled in from a PCH file.
llvm-svn: 69421
lazy PCH deserialization. Propagate that argument wherever it needs to
be. No functionality change, except that I've tightened up a few PCH
tests in preparation.
llvm-svn: 69406
1) Accidentally used delete [] on an array of statements that was allocated with ASTContext's allocator
2) Deserialization of names with multiple declarations (e.g., a struct and a function) used the wrong mangling constant, causing it to view declaration IDs as Decl*s.
403.gcc builds and links properly.
llvm-svn: 69390
- PR3980.
- <rdar://problem/6762287> [irgen] crash when generating tentative
definition of incomplete structure
- This also avoids creating common definitions for things which are
later overwritten.
- XFAIL'ed external-defs.c, it isn't completing types properly yet.
llvm-svn: 69231
kind PCH handles that has an expression as an operand, so most of this
work is in the infrastructure to rebuild expression trees from the
serialized representation. We now store expressions in post-order
(e.g., Reverse Polish Notation), so that we can easily rebuild the
appropriate expression tree.
llvm-svn: 69101
wrap-up (e.g., turning tentative definitions into definitions). Also,
very that, when we actually use the PCH file, we get the ride code
generation for tentative definitions and definitions that show up in
the PCH file.
llvm-svn: 69043
non-inline external definitions (and tentative definitions) that are
found at the top level. The corresponding declarations are stored in a
record in the PCH file, so that they can be provided to the
ASTConsumer (via HandleTopLevelDecl) when the PCH file is read.
llvm-svn: 69005
PCH. This works now, except for limitations not being able to do things
with identifiers. The basic example in the testcase works though.
llvm-svn: 68832
improvement, source locations read from the PCH file will properly
resolve to the source files that were used to build the PCH file
itself.
Once we have the preprocessor state stored in the PCH file, source
locations that refer to macro instantiations that occur in the PCH
file should have the appropriate instantiation information.
llvm-svn: 68758
de-serialization of abstract syntax trees.
PCH support serializes the contents of the abstract syntax tree (AST)
to a bitstream. When the PCH file is read, declarations are serialized
as-needed. For example, a declaration of a variable "x" will be
deserialized only when its VarDecl can be found by a client, e.g.,
based on name lookup for "x" or traversing the entire contents of the
owner of "x".
This commit provides the framework for serialization and (lazy)
deserialization, along with support for variable and typedef
declarations (along with several kinds of types). More
declarations/types, along with important auxiliary structures (source
manager, preprocessor, etc.), will follow.
llvm-svn: 68732