that are referenced in the ASTs. This assumes that we serialize out the
decls/stmts first, and use the pointer-tracking logic in the Serializer to
determine if an IdentifierInfo (or its string key) is ever referenced.
This is a significant space optimization for serialized ASTs.
Consider the following program:
void foo(int x,int y) {
return x > y ? x : y+1;
}
Here are the sizes of the files for the serialized ASTs:
Full IdentifierTable: 23676 bytes
Only-referenced Identifiers: 304 bytes.
For this simple program, this is a 77% reduction in the file size of the
serialized ASTs.
llvm-svn: 43975
and Materialize/Read instead of using specializations of SerializeTrait<>. The
resulting code is much cleaner. We are also setting the stage so that only the
parts of the IdentifierTable that are ever referenced within the ASTs are
serialized, and not the whole table.
llvm-svn: 43904
updated it to the recently updated Serialization API.
Changed clients of SourceLocation serialization to call the
appropriate new methods.
Updated Decl serialization code to put new skeleton serialization code
in place that is much better than the older trait-specialization
approach.
llvm-svn: 43625
Disabled assignments for ContentCache.
Copy-ctor for ContentCache now has an assertion preventing it to
be copied from an object that already has an allocated buffer.
llvm-svn: 43526
new split-header file configuration (Serialize.h and Deserialize.h)
now in place in the core LLVM repository.
Removed unneeded SerializeTrait specializations for enums in
TokenKinds.h
llvm-svn: 43306
bottleneck for -E computation, because every token that starts a line needs
to determine *which* line it is on (so -E mode can insert the appropriate
vertical whitespace). This optimization improves this common case where
it is striding through the line # table.
This speeds up -E on xalancbmk by 3.2%
llvm-svn: 40459
SourceManager::getInstantiationLoc. With this change, every token
expanded from a macro doesn't get its own MacroID. :)
This reduces # macro IDs in carbon.h from 16805 to 9197
llvm-svn: 40108
fileid/offset pair, it now contains a bit discriminating between
mapped locations and file locations. This separates the tables for
macros and files in SourceManager, and allows better separation of
concepts in the rest of the compiler. This allows us to have *many*
macro instantiations before running out of 'addressing space'.
This is also more efficient, because testing whether something is a
macro expansion is now a bit test instead of a table lookup (which
also used to require having a srcmgr around, now it doesn't).
This is fully functional, but there are several refinements and
optimizations left.
llvm-svn: 40103
accurate diagnostics. For test/Lexer/comments.c we now emit:
int x = 000000080; /* expected-error {{invalid digit}} */
^
constants.c:7:4: error: invalid digit '8' in octal constant
00080; /* expected-error {{invalid digit}} */
^
The last line is due to an escaped newline. The full line looks like:
int y = 0000\
00080; /* expected-error {{invalid digit}} */
Previously, we emitted:
constants.c:4:9: error: invalid digit '8' in octal constant
int x = 000000080; /* expected-error {{invalid digit}} */
^
constants.c:6:9: error: invalid digit '8' in octal constant
int y = 0000\
^
which isn't too bad, but the new way is better for the user,
regardless of whether there is an escaped newline or not.
All the other lexer-related diagnostics should switch over
to using AdvanceToTokenCharacter where appropriate. Help
wanted :).
This implements test/Lexer/constants.c.
llvm-svn: 39906
out of the llvm namespace. This makes the clang namespace be a sibling of
llvm instead of being a child.
The good thing about this is that it makes many things unambiguous. The
bad things is that many things in the llvm namespace (notably data structures
like smallvector) now require an llvm:: qualifier. IMO, libsystem and libsupport
should be split out of llvm into their own namespace in the future, which will fix
this issue.
llvm-svn: 39659
Reviewed by: Chris Lattner
- Added a method "IgnoreDiagnostic" so that the diagnostic client can
tell the diagnostic object that it doesn't want to handle a particular
diagnostic message. In which case, it won't be counted as either a
diagnostic or error.
llvm-svn: 39641
Reviewed by: Chris Lattner
- Make the counting of errors and diagnostic messages sane. Place them into the
Diagnostic class instead of in the DiagnosticClient class.
llvm-svn: 39615
Submitted by:
Reviewed by:
An important, but truly mind numbing change.
Added 6 flavors of Sema::Diag() that take 1 or two SourceRanges. Considered
adding 3 flavors (using default args), however this wasn't as clear.
Removed 2 flavors of Sema::Diag() that took LexerToken's (they weren't used).
Changed all the typechecking routines to pass the appropriate range(s).
Hacked the diagnostic machinery and driver to acccommodate the new data.
What's left? A FIXME in clang.c to use the ranges. Chris offered to do the
honors:-) Which includes taking us to the end of an identifier:-)
llvm-svn: 39456
of source code. For example:
$ clang INPUTS/carbon_h.c -arch i386 -arch ppc
prints:
...
/usr/lib/gcc/i686-apple-darwin8/4.0.1/include/mmintrin.h:51:3: note: use of a target-specific builtin function, source is not 'portable'
__builtin_ia32_emms ();
^
because carbon.h pulls in xmmintrin.h, and __builtin_ia32_emms isn't a builtin on ppc.
Though clang now supports target-specific builtins, the full table isn't implemented yet.
llvm-svn: 39328
class instead. SmallString allows to code to avoid hitting malloc in
the normal case (or will, when some other stuff is converted over).
llvm-svn: 39084
came from a macro expansion, this allows us to keep track of both where the
character data came from and where the logical position of the token is (at
the expansion site). This implements Preprocessor/indent_macro.c, and
reduces the number of cpp iostream -E diffs vs GCC from 2589 to 2297.
llvm-svn: 38557
Now, instead of keeping a pointer to the start of the token in memory, we keep the
start of the token as a SourceLocation node. This means that each LexerToken knows
the full include stack it was created with, and means that the LexerToken isn't
reliant on a "CurLexer" member to be around (lexer tokens would previously go out of
scope when their lexers were deallocated).
This simplifies several things, and forces good cleanup elsewhere. Now the
Preprocessor is the one that knows how to dump tokens/macros and is the one that
knows how to get the spelling of a token (it has all the context).
llvm-svn: 38551