Summary:
By adding a hook to consume all tokens produced by the preprocessor.
The intention of this change is to make it possible to consume the
expanded tokens without re-runnig the preprocessor with minimal changes
to the preprocessor and minimal performance penalty when preprocessing
without recording the tokens.
The added hook is very low-level and reconstructing the expanded token
stream requires more work in the client code, the actual algorithm to
collect the tokens using this hook can be found in the follow-up change.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: eraman, nemanjai, kbarton, jsji, riccibruno, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59885
llvm-svn: 361007
internal lexing steps in the preprocessor.
It is not safe to use the preprocessor's token lookahead except when
operating on the final sequence of tokens that would be produced by
phase 4 of translation. Doing so corrupts the token lookahead cache used
by the parser. (See added testcase for an example.) Lookahead should
instead be viewed as a layer on top of the normal lexer.
Added assertions to catch any further incorrect uses of lookahead within
lexing actions.
llvm-svn: 358230
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This fixes PR32732 by updating CurLexerKind to reflect available lexers.
We were hitting null pointer in Preprocessor::Lex because CurLexerKind
was CLK_Lexer but CurLexer was null. And we set it to null in
Preprocessor::HandleEndOfFile when exiting a file with code completion
point.
To reproduce the crash it is important for a comment to be inside a
class specifier. In this case in Parser::ParseClassSpecifier we improve
error recovery by pushing a semicolon token back into the preprocessor
and later on try to lex a token because we haven't reached the end of
file.
Also clang crashes only on code completion in included file, i.e. when
IncludeMacroStack is not empty. Though we reset CurLexer even if include
stack is empty. The difference is that during pushing back a semicolon
token, preprocessor calls EnterCachingLexMode which decides it is
already in caching mode because various lexers are null and
IncludeMacroStack is not empty. As the result, CurLexerKind remains
CLK_Lexer instead of updating to CLK_CachingLexer.
rdar://problem/34787685
Reviewers: akyrtzi, doug.gregor, arphaman
Reviewed By: arphaman
Subscribers: cfe-commits, kfunk, arphaman, nemanjai, kbarton
Differential Revision: https://reviews.llvm.org/D41688
llvm-svn: 323008
in macro argument pre-expansion mode when skipping a function body
This commit fixes a token caching problem that currently occurs when clang is
skipping a function body (e.g. when looking for a code completion token) and at
the same time caching the tokens for _Pragma when lexing it in macro argument
pre-expansion mode.
When _Pragma is being lexed in macro argument pre-expansion mode, it caches the
tokens so that it can avoid interpreting the pragma immediately (as the macro
argument may not be used in the macro body), and then either backtracks over or
commits these tokens. The problem is that, when we're backtracking/committing in
such a scenario, there's already a previous backtracking position stored in
BacktrackPositions (as we're skipping the function body), and this leads to a
situation where the cached tokens from the pragma (like '(' 'string_literal'
and ')') will remain in the cached tokens array incorrectly even after they're
consumed (in the case of backtracking) or just ignored (in the case when they're
committed). Furthermore, what makes it even worse, is that because of a previous
backtracking position, the logic that deals with when should we call
ExitCachingLexMode in CachingLex no longer works for us in this situation, and
more tokens in the macro argument get cached, to the point where the EOF token
that corresponds to the macro argument EOF is cached. This problem leads to all
sorts of issues in code completion mode, where incorrect errors get presented
and code completion completely fails to produce completion results.
rdar://28523863
Differential Revision: https://reviews.llvm.org/D28772
llvm-svn: 296140
While in the area, also change some unsigned variables to size_t, and
introduce an LLVM_FALLTHROUGH instead of a comment stating that.
Differential Revision: http://reviews.llvm.org/D25982
llvm-svn: 285193
This assert is intended to defend against backtracking into the middle
of a sequence of tokens that is being replaced with an annotation, but
it's OK if we backtrack to the exact position of the start of the
annotation sequence. Use a <= comparison instead of <.
Fixes PR25946
llvm-svn: 284777
Consider the following ObjC++ snippet:
--
@protocol PA;
@protocol PB;
@class NSArray<ObjectType>;
typedef int some_t;
id<PA> FA(NSArray<id<PB>> *h, some_t group);
--
This would hit an assertion in the parser after generating an annotation token
while trying to update the token cache:
Assertion failed: (CachedTokens[CachedLexPos-1].getLastLoc() == Tok.getAnnotationEndLoc() && "The annotation should be until the most recent cached token")
...
7 clang::Preprocessor::AnnotatePreviousCachedTokens(clang::Token const&) + 494
8 clang::Parser::TryAnnotateTypeOrScopeTokenAfterScopeSpec(bool, bool, clang::CXXScopeSpec&, bool) + 1163
9 clang::Parser::TryAnnotateTypeOrScopeToken(bool, bool) + 361
10 clang::Parser::isCXXDeclarationSpecifier(clang::Parser::TPResult, bool*) + 598
...
The cached preprocessor token in this case is:
greatergreater '>>' Loc=<testcase.mm:7:24>
while the annotation ("NSArray<id<PB>>") ends at "testcase.mm:7:25", hence the
assertion.
Properly update the CachedTokens during template parsing to contain
two greater tokens instead of a greatergreater.
Differential Revision: http://reviews.llvm.org/D15173
rdar://problem/23494277
llvm-svn: 259311
cached during the non-cached lex, otherwise we are going to drop them.
Fixes a bogus "_Pragma takes a parenthesized string literal" error when
expanding consecutive _Pragmas in a macro argument.
Part of rdar://11168596
llvm-svn: 153994
keyword. We now handle this keyword in HandleIdentifier, making a note
for ourselves when we've seen the __import_module__ keyword so that
the next lexed token can trigger a module import (if needed). This
greatly simplifies Preprocessor::Lex(), and completely erases the 5.5%
-Eonly slowdown Argiris noted when I originally implemented
__import_module__. Big thanks to Argiris for noting that horrible
regression!
llvm-svn: 139265
In the case of backtracking, the cached token lexer will be the only
lexer on the stack, without this the token stack will be empty and EOF
won't be returned.
This fixes PR7072.
llvm-svn: 108124
annotation token, because some of the tokens we're annotating might
not be in the set of cached tokens (we could have consumed them
unconditionally).
Also, move the tentative parsing from ParseTemplateTemplateArgument
into the one caller that needs it, improving recovery.
llvm-svn: 86904
Token now has a class of kinds for "literals", which include
numeric constants, strings, etc. These tokens can optionally have
a pointer to the start of the token in the lexer buffer. This
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.
This change is performance neutral, but will make other changes
more feasible down the road.
llvm-svn: 63028
1) New public methods added:
-EnableBacktrackAtThisPos
-DisableBacktrack
-Backtrack
-isBacktrackEnabled
2) LookAhead() implementation is replaced with a more efficient one.
3) LookNext() is removed.
llvm-svn: 54611