bottleneck for -E computation, because every token that starts a line needs
to determine *which* line it is on (so -E mode can insert the appropriate
vertical whitespace). This optimization speeds up the common case where
lookups stride sequentially through the line number table.
This speeds up -E on xalancbmk by 3.2%.
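A rough sketch of this kind of fast path, assuming the optimization amounts to
remembering the line found by the previous query. The LineTable class and its
members are made up for illustration and are not the actual SourceManager code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical line table: maps a byte offset in a buffer to a 1-based line.
class LineTable {
  std::vector<std::size_t> LineStarts; // offset of the first char of each line
  std::size_t LastLine = 0;            // 0-based line found by the last query

public:
  explicit LineTable(const std::string &Buf) {
    LineStarts.push_back(0);
    for (std::size_t I = 0; I < Buf.size(); ++I)
      if (Buf[I] == '\n')
        LineStarts.push_back(I + 1);
  }

  unsigned getLineNumber(std::size_t Offset) {
    assert(!LineStarts.empty() && "table always holds at least line 1");
    // Fast path: sequential queries land on the cached line or the next one,
    // so probe those two entries before falling back to a search.
    if (Offset >= LineStarts[LastLine]) {
      std::size_t L = LastLine;
      for (int Step = 0; Step < 2; ++Step, ++L) {
        bool IsLast = L + 1 == LineStarts.size();
        if (IsLast || Offset < LineStarts[L + 1]) {
          LastLine = L;
          return static_cast<unsigned>(L + 1);
        }
      }
    }
    // Slow path: binary search for the last line start that is <= Offset.
    auto It = std::upper_bound(LineStarts.begin(), LineStarts.end(), Offset);
    LastLine = static_cast<std::size_t>(It - LineStarts.begin()) - 1;
    return static_cast<unsigned>(LastLine + 1);
  }
};
```

When queries arrive in increasing offset order, as they do while printing
tokens for -E, nearly every call is answered by the two-entry probe.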
llvm-svn: 40459
SourceManager::getInstantiationLoc. With this change, tokens expanded
from a macro no longer each get their own MacroID. :)
This reduces the number of macro IDs in carbon.h from 16805 to 9197.
llvm-svn: 40108
fileid/offset pair, it now contains a bit discriminating between
mapped locations and file locations. This separates the tables for
macros and files in SourceManager, and allows better separation of
concepts in the rest of the compiler. This allows us to have *many*
macro instantiations before running out of 'addressing space'.
This is also more efficient, because testing whether something is a
macro expansion is now a bit test instead of a table lookup (the lookup
also required having a srcmgr around; the bit test does not).
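As an illustration of the bit test, here is a minimal sketch of a 32-bit
location with one discriminator bit. The class name SourceLoc, the choice of
the top bit, and the accessor names are assumptions; the text does not give
the real layout:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding: top bit set means "macro (mapped) location",
// top bit clear means "file location". The low 31 bits index into the
// corresponding table, which is kept separately for macros and files.
class SourceLoc {
  std::uint32_t Raw = 0;
  static constexpr std::uint32_t MacroBit = 1u << 31;

  explicit SourceLoc(std::uint32_t R) : Raw(R) {}

public:
  SourceLoc() = default;

  static SourceLoc getFileLoc(std::uint32_t Payload) {
    assert((Payload & MacroBit) == 0 && "file payload out of range");
    return SourceLoc(Payload);
  }

  static SourceLoc getMacroLoc(std::uint32_t Payload) {
    assert((Payload & MacroBit) == 0 && "macro payload out of range");
    return SourceLoc(Payload | MacroBit);
  }

  // No table and no manager object is needed to answer these queries.
  bool isMacroID() const { return (Raw & MacroBit) != 0; }
  bool isFileID() const { return !isMacroID(); }

  std::uint32_t getPayload() const { return Raw & ~MacroBit; }
};
```

In this sketch the two kinds of location each get the full 31-bit payload
space, which is one way to read "many macro instantiations before running out
of addressing space".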
This is fully functional, but there are several refinements and
optimizations left.
llvm-svn: 40103
accurate diagnostics. For test/Lexer/constants.c we now emit:
int x = 000000080; /* expected-error {{invalid digit}} */
               ^
constants.c:7:4: error: invalid digit '8' in octal constant
00080; /* expected-error {{invalid digit}} */
   ^
The last line is due to an escaped newline. The full line looks like:
int y = 0000\
00080; /* expected-error {{invalid digit}} */
Previously, we emitted:
constants.c:4:9: error: invalid digit '8' in octal constant
int x = 000000080; /* expected-error {{invalid digit}} */
        ^
constants.c:6:9: error: invalid digit '8' in octal constant
int y = 0000\
        ^
which isn't too bad, but the new way is better for the user,
regardless of whether there is an escaped newline or not.
All the other lexer-related diagnostics should switch over
to using AdvanceToTokenCharacter where appropriate. Help
wanted :).
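For anyone picking this up, the core idea can be sketched as follows. This is
not the real Lexer::AdvanceToTokenCharacter; the free function, its signature,
and the simplification that an escaped newline is exactly a backslash followed
by '\n' (no trigraphs, no trailing whitespace) are all assumptions:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the offset in Buf of the CharNo'th logical character of the token
// that starts at TokStart, stepping over escaped newlines along the way.
std::size_t advanceToTokenCharacter(const std::string &Buf,
                                    std::size_t TokStart, std::size_t CharNo) {
  std::size_t Pos = TokStart;

  // Skip any run of backslash-newline pairs sitting at the current position.
  auto SkipEscapedNewlines = [&] {
    while (Pos + 1 < Buf.size() && Buf[Pos] == '\\' && Buf[Pos + 1] == '\n')
      Pos += 2;
  };

  SkipEscapedNewlines();
  for (; CharNo != 0; --CharNo) {
    assert(Pos < Buf.size() && "character index past end of buffer");
    ++Pos; // consume one logical character
    SkipEscapedNewlines();
  }
  return Pos;
}
```

With the buffer "int y = 0000\\\n00080;" and the token starting at offset 8,
asking for logical character 7 lands on the '8' on the second physical line,
which is where the caret in the new diagnostic points.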
This implements test/Lexer/constants.c.
llvm-svn: 39906
out of the llvm namespace. This makes the clang namespace be a sibling of
llvm instead of being a child.
The good thing about this is that it makes many things unambiguous. The
bad thing is that many things in the llvm namespace (notably data structures
like SmallVector) now require an llvm:: qualifier. IMO, libsystem and libsupport
should be split out of llvm into their own namespace in the future, which will fix
this issue.
llvm-svn: 39659
came from a macro expansion; this allows us to keep track of both where the
character data came from and where the logical position of the token is (at
the expansion site). This implements Preprocessor/indent_macro.c, and
reduces the number of cpp iostream -E diffs vs GCC from 2589 to 2297.
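A small sketch of the two-position idea, assuming a side table keyed by a
macro index. The Loc and LocTable types and the use of plain offsets are
hypothetical stand-ins, not the real data structures:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical location: either a plain file offset or an index into a
// table of macro expansion entries.
struct Loc {
  bool IsMacro;
  std::uint32_t Value;
};

class LocTable {
  struct MacroEntry {
    std::uint32_t SpellingOffset;  // where the character data came from
    std::uint32_t ExpansionOffset; // logical position at the expansion site
  };
  std::vector<MacroEntry> Entries;

public:
  Loc createMacroLoc(std::uint32_t SpellingOffset,
                     std::uint32_t ExpansionOffset) {
    Entries.push_back({SpellingOffset, ExpansionOffset});
    return Loc{true, static_cast<std::uint32_t>(Entries.size() - 1)};
  }

  // Used to fetch the token's text.
  std::uint32_t getSpellingOffset(Loc L) const {
    if (!L.IsMacro)
      return L.Value;
    assert(L.Value < Entries.size() && "unknown macro entry");
    return Entries[L.Value].SpellingOffset;
  }

  // Used for line/column decisions, such as indentation in -E output.
  std::uint32_t getLogicalOffset(Loc L) const {
    if (!L.IsMacro)
      return L.Value;
    assert(L.Value < Entries.size() && "unknown macro entry");
    return Entries[L.Value].ExpansionOffset;
  }
};
```

For an ordinary file location both queries return the same offset; they only
differ for tokens that came out of a macro.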
llvm-svn: 38557
Now, instead of keeping a pointer to the start of the token in memory, we keep the
start of the token as a SourceLocation node. This means that each LexerToken knows
the full include stack it was created with, and means that the LexerToken no longer
relies on a "CurLexer" member being around (previously, lexer tokens became invalid
when their lexers were deallocated).
This simplifies several things, and forces good cleanup elsewhere. Now the
Preprocessor is the one that knows how to dump tokens/macros and is the one that
knows how to get the spelling of a token (it has all the context).
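A simplified sketch of the ownership change, assuming a token that stores a
(buffer, offset) location plus a length. Location, Token, and BufferOwner are
made-up names, and the include stack mentioned above is not modeled:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical location: identifies a buffer and an offset within it,
// instead of holding a raw pointer into lexer-owned memory.
struct Location {
  unsigned BufferID;
  std::size_t Offset;
};

struct Token {
  Location Start;
  std::size_t Length;
};

// Stand-in for the object with "all the context": it owns the buffers, so a
// Token stays meaningful even after the lexer that produced it is gone.
class BufferOwner {
  std::vector<std::string> Buffers;

public:
  unsigned addBuffer(std::string Text) {
    Buffers.push_back(std::move(Text));
    return static_cast<unsigned>(Buffers.size() - 1);
  }

  std::string getSpelling(const Token &Tok) const {
    assert(Tok.Start.BufferID < Buffers.size() && "unknown buffer");
    return Buffers[Tok.Start.BufferID].substr(Tok.Start.Offset, Tok.Length);
  }
};
```

The token itself carries no pointers, so copying it around or keeping it
after lexing finishes is safe as long as the owner of the buffers is alive.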
llvm-svn: 38551