Commit Graph

30 Commits

Author SHA1 Message Date
Ilya Biryukov 929af67361 [Lex] Allow to consume tokens while preprocessing
Summary:
By adding a hook to consume all tokens produced by the preprocessor.
The intention of this change is to make it possible to consume the
expanded tokens without re-runnig the preprocessor with minimal changes
to the preprocessor and minimal performance penalty when preprocessing
without recording the tokens.

The added hook is very low-level and reconstructing the expanded token
stream requires more work in the client code, the actual algorithm to
collect the tokens using this hook can be found in the follow-up change.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: eraman, nemanjai, kbarton, jsji, riccibruno, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59885

llvm-svn: 361007
2019-05-17 09:32:05 +00:00
Richard Smith 8af8b8611c [C++20] Implement context-sensitive header-name lexing and pp-import parsing in the preprocessor.
llvm-svn: 358231
2019-04-11 21:18:23 +00:00
Richard Smith 75f9681874 Remove use of lookahead from _Pragma handling and from all other
internal lexing steps in the preprocessor.

It is not safe to use the preprocessor's token lookahead except when
operating on the final sequence of tokens that would be produced by
phase 4 of translation. Doing so corrupts the token lookahead cache used
by the parser. (See added testcase for an example.) Lookahead should
instead be viewed as a layer on top of the normal lexer.

Added assertions to catch any further incorrect uses of lookahead within
lexing actions.

llvm-svn: 358230
2019-04-11 21:18:22 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Volodymyr Sapsai 9d540f1539 [Lex] Fix crash on code completion in comment in included file.
This fixes PR32732 by updating CurLexerKind to reflect available lexers.
We were hitting null pointer in Preprocessor::Lex because CurLexerKind
was CLK_Lexer but CurLexer was null. And we set it to null in
Preprocessor::HandleEndOfFile when exiting a file with code completion
point.

To reproduce the crash it is important for a comment to be inside a
class specifier. In this case in Parser::ParseClassSpecifier we improve
error recovery by pushing a semicolon token back into the preprocessor
and later on try to lex a token because we haven't reached the end of
file.

Also clang crashes only on code completion in included file, i.e. when
IncludeMacroStack is not empty. Though we reset CurLexer even if include
stack is empty. The difference is that during pushing back a semicolon
token, preprocessor calls EnterCachingLexMode which decides it is
already in caching mode because various lexers are null and
IncludeMacroStack is not empty. As the result, CurLexerKind remains
CLK_Lexer instead of updating to CLK_CachingLexer.

rdar://problem/34787685

Reviewers: akyrtzi, doug.gregor, arphaman

Reviewed By: arphaman

Subscribers: cfe-commits, kfunk, arphaman, nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D41688

llvm-svn: 323008
2018-01-19 23:41:47 +00:00
Alex Lorenz 24a1bedf76 [Preprocessor] Fix incorrect token caching that occurs when lexing _Pragma
in macro argument pre-expansion mode when skipping a function body

This commit fixes a token caching problem that currently occurs when clang is
skipping a function body (e.g. when looking for a code completion token) and at
the same time caching the tokens for _Pragma when lexing it in macro argument
pre-expansion mode.

When _Pragma is being lexed in macro argument pre-expansion mode, it caches the
tokens so that it can avoid interpreting the pragma immediately (as the macro
argument may not be used in the macro body), and then either backtracks over or
commits these tokens. The problem is that, when we're backtracking/committing in
such a scenario, there's already a previous backtracking position stored in
BacktrackPositions (as we're skipping the function body), and this leads to a
situation where the cached tokens from the pragma (like '(' 'string_literal'
and ')') will remain in the cached tokens array incorrectly even after they're
consumed (in the case of backtracking) or just ignored (in the case when they're
committed). Furthermore, what makes it even worse, is that because of a previous
backtracking position, the logic that deals with when should we call
ExitCachingLexMode in CachingLex no longer works for us in this situation, and
more tokens in the macro argument get cached, to the point where the EOF token
that corresponds to the macro argument EOF is cached. This problem leads to all
sorts of issues in code completion mode, where incorrect errors get presented
and code completion completely fails to produce completion results.

rdar://28523863

Differential Revision: https://reviews.llvm.org/D28772

llvm-svn: 296140
2017-02-24 17:45:16 +00:00
Erik Verbruggen e4fd6522c1 [PP] Replace some index based for loops with range based ones
While in the area, also change some unsigned variables to size_t, and
introduce an LLVM_FALLTHROUGH instead of a comment stating that.

Differential Revision: http://reviews.llvm.org/D25982

llvm-svn: 285193
2016-10-26 13:06:13 +00:00
Reid Kleckner ae818501b7 Fix off-by-one error in PPCaching.cpp token annotation assertion
This assert is intended to defend against backtracking into the middle
of a sequence of tokens that is being replaced with an annotation, but
it's OK if we backtrack to the exact position of the start of the
annotation sequence. Use a <= comparison instead of <.

Fixes PR25946

llvm-svn: 284777
2016-10-20 20:53:20 +00:00
Bruno Cardoso Lopes 428a5aa9a5 [Parser] Update CachedTokens while parsing ObjectiveC template argument list
Consider the following ObjC++ snippet:

--
@protocol PA;
@protocol PB;

@class NSArray<ObjectType>;
typedef int some_t;

id<PA> FA(NSArray<id<PB>> *h, some_t group);
--

This would hit an assertion in the parser after generating an annotation token
while trying to update the token cache:

Assertion failed: (CachedTokens[CachedLexPos-1].getLastLoc() == Tok.getAnnotationEndLoc() && "The annotation should be until the most recent cached token")
...
7 clang::Preprocessor::AnnotatePreviousCachedTokens(clang::Token const&) + 494
8 clang::Parser::TryAnnotateTypeOrScopeTokenAfterScopeSpec(bool, bool, clang::CXXScopeSpec&, bool) + 1163
9 clang::Parser::TryAnnotateTypeOrScopeToken(bool, bool) + 361
10 clang::Parser::isCXXDeclarationSpecifier(clang::Parser::TPResult, bool*) + 598
...

The cached preprocessor token in this case is:

greatergreater '>>' Loc=<testcase.mm:7:24>

while the annotation ("NSArray<id<PB>>") ends at "testcase.mm:7:25", hence the
assertion.

Properly update the CachedTokens during template parsing to contain
two greater tokens instead of a greatergreater.

Differential Revision: http://reviews.llvm.org/D15173

rdar://problem/23494277

llvm-svn: 259311
2016-01-31 00:47:51 +00:00
Chandler Carruth c4399b105b Fix the build break introduced by r195799 by restoring two close
curlies.

llvm-svn: 195802
2013-11-27 01:40:12 +00:00
James Dennett 4a4f72d8d9 Documentation cleanup: Doxygen-ification, typo fixes, and changing some of
the duplicated documentation from .cpp files so that it's not processed by
Doxygen and hence doesn't generate duplicate output.

llvm-svn: 195799
2013-11-27 01:27:40 +00:00
Argyrios Kyrtzidis 4a1049c8a2 [preprocessor] In Preprocessor::CachingLex() check whether there were more tokens
cached during the non-cached lex, otherwise we are going to drop them.

Fixes a bogus "_Pragma takes a parenthesized string literal" error when
expanding consecutive _Pragmas in a macro argument.

Part of rdar://11168596

llvm-svn: 153994
2012-04-04 02:57:01 +00:00
Douglas Gregor 8d76cca3a2 Don't treat 'import' as a contextual keyword when we're in a caching lexer, or when modules are disabled.
llvm-svn: 147524
2012-01-04 06:20:15 +00:00
Douglas Gregor af5c48490e Optimize the preprocessor's handling of the __import_module__
keyword. We now handle this keyword in HandleIdentifier, making a note
for ourselves when we've seen the __import_module__ keyword so that
the next lexed token can trigger a module import (if needed). This
greatly simplifies Preprocessor::Lex(), and completely erases the 5.5%
-Eonly slowdown Argiris noted when I originally implemented
__import_module__. Big thanks to Argiris for noting that horrible
regression!

llvm-svn: 139265
2011-09-07 23:11:54 +00:00
Argyrios Kyrtzidis d1d239f35c Remove the check for repeated tok::eofs, we are not supposed to go past eof so this code is
totally unnecessary.

llvm-svn: 108199
2010-07-12 21:41:41 +00:00
Argyrios Kyrtzidis c2924de667 If we are past tok::eof and in caching lex mode, avoid caching repeated tok::eofs.
llvm-svn: 108175
2010-07-12 18:49:30 +00:00
Chris Lattner 5a503e9f70 we do in fact have to cache the EOF token returned by the preprocessor.
In the case of backtracking, the cached token lexer will be the only 
lexer on the stack, without this the token stack will be empty and EOF
won't be returned.

This fixes PR7072.

llvm-svn: 108124
2010-07-12 04:25:32 +00:00
Sebastian Redl b0e3e1bf67 When placing an annotation token over an existing annotation token, make sure that the new token's range extends to the end of the old token. Assert that in AnnotateCachedTokens. Fixes PR6248.
llvm-svn: 95555
2010-02-08 19:35:18 +00:00
Douglas Gregor c998409cce Remove an overly-eager assertion when replacing tokens with an
annotation token, because some of the tokens we're annotating might
not be in the set of cached tokens (we could have consumed them
unconditionally).

Also, move the tentative parsing from ParseTemplateTemplateArgument
into the one caller that needs it, improving recovery.

llvm-svn: 86904
2009-11-12 00:03:40 +00:00
Mike Stump 11289f4280 Remove tabs, and whitespace cleanups.
llvm-svn: 81346
2009-09-09 15:08:12 +00:00
Nuno Lopes bd2cd92907 fix segfault (because of erasing after the vector boundaries) when the cached token position is at the end
llvm-svn: 77159
2009-07-26 16:36:45 +00:00
Chris Lattner 5a7971e0c3 This change refactors some of the low-level lexer interfaces a bit.
Token now has a class of kinds for "literals", which include 
numeric constants, strings, etc.  These tokens can optionally have
a pointer to the start of the token in the lexer buffer.  This 
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.

This change is performance neutral, but will make other changes
more feasible down the road.

llvm-svn: 63028
2009-01-26 19:29:26 +00:00
Argyrios Kyrtzidis f5e2812e69 Remove Preprocessor::CacheTokens boolean data member. The same functionality can be provided by using Preprocessor::isBacktrackEnabled().
llvm-svn: 59631
2008-11-19 14:23:14 +00:00
Ted Kremenek 50b4f48225 Use PushIncludeMacroStack() instead of manually manipulating the include stack.
llvm-svn: 59181
2008-11-12 22:21:57 +00:00
Argyrios Kyrtzidis c7e67a04c3 Introduce annotation tokens, a special kind of token, created and used only by the parser to replace a group of tokens with a single token encoding semantic information.
Will be fully utilized later for C++ nested-name-specifiers.

llvm-svn: 58911
2008-11-08 16:17:04 +00:00
Argyrios Kyrtzidis 91c3f526dc Line endings: CRLF -> LF
llvm-svn: 55829
2008-09-05 08:53:53 +00:00
Argyrios Kyrtzidis 75b34536d0 Rename Preprocessor::DisableBacktrack -> Preprocessor::CommitBacktrackedTokens.
llvm-svn: 55281
2008-08-24 12:29:43 +00:00
Argyrios Kyrtzidis bd024c7fdb Change line endings: CRLF -> LF
llvm-svn: 55235
2008-08-23 12:05:53 +00:00
Argyrios Kyrtzidis a65490c5df Allow nested backtracks.
llvm-svn: 55204
2008-08-22 21:27:50 +00:00
Argyrios Kyrtzidis b3dd1e0889 Allow the preprocessor to cache the lexed tokens, so that we can do efficient lookahead and backtracking.
1) New public methods added:
  -EnableBacktrackAtThisPos
  -DisableBacktrack
  -Backtrack
  -isBacktrackEnabled

2) LookAhead() implementation is replaced with a more efficient one.
3) LookNext() is removed.

llvm-svn: 54611
2008-08-10 13:15:22 +00:00