When a macro instantiation occurs, reserve a SLocEntry chunk whose length is
the full length of the macro definition source. The spelling location of this
chunk points to the start of the macro definition, so any token lexed directly
from the macro definition gets a location from this chunk at the appropriate
offset. Tokens that come from argument expansion, the '##' paste operator,
etc. have their instantiation location point at the appropriate place in the
instantiated macro definition (the argument identifier and the '##' token,
respectively).
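For illustration, here is a plausible reconstruction of the t.c behind the
diagnostics below; lines 3 and 5 come from the diagnostics themselves, while
the declarations on lines 1-2 are assumptions:

  struct S { int n; };        /* assumed: gives 'foo' the 'struct S' type   */
  struct S foo;               /* assumed: the operand named in the macro    */
  #define M(op) (foo op 3);   /* line 3 of t.c, per the "After" note        */

  int y = M(/);               /* line 5: expands to (foo / 3); - ill-formed */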
This improves macro instantiation diagnostics:
Before:
t.c:5:9: error: invalid operands to binary expression ('struct S' and 'int')
int y = M(/);
        ^~~~
t.c:5:11: note: instantiated from:
int y = M(/);
          ^
After:
t.c:5:9: error: invalid operands to binary expression ('struct S' and 'int')
int y = M(/);
        ^~~~
t.c:3:20: note: instantiated from:
#define M(op) (foo op 3);
               ~~~ ^ ~
t.c:5:11: note: instantiated from:
int y = M(/);
          ^
The memory savings for a candidate boost library that abuses the preprocessor are:
- 32% fewer SLocEntries (37M -> 25M)
- 30% reduction in PCH file size (900M -> 635M)
- 50% reduction in memory usage for the SLocEntry table (1.6G -> 800M)
llvm-svn: 134587
The previous name was inaccurate as this token in fact appears at
the end of every preprocessing directive, not just macro definitions.
No functionality change, except for a diagnostic tweak.
llvm-svn: 126631
The extra data attached to user-defined literal Tokens lives in separately
allocated memory, which is managed by the PreprocessorLexer because there is
no better place to put it that ensures it gets deallocated, but only after it
is no longer needed. My testing has shown no significant slowdown as a result,
but independent testing would be appreciated.
llvm-svn: 112458
reparsing an ASTUnit. When saving a preamble, create a buffer larger
than the actual file we're working with but fill everything from the
end of the preamble to the end of the file with spaces (so the lexer
will quickly skip them). When we load the file, create a buffer of the
same size, filling it with the file and then spaces. Then, instruct
the lexer to start lexing after the preamble, therefore continuing the
parse from the spot where the preamble left off.
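A minimal sketch of the space-padding involved, using a hypothetical helper
and std::string instead of the llvm::MemoryBuffer the real code deals with:
keep the bytes up to the end of the preamble and pad with spaces out to the
desired buffer size.

  #include <memory>
  #include <string>

  // Keep the first PreambleSize bytes of the file and pad the rest of the
  // buffer with spaces, which the lexer will skip over very quickly.
  std::unique_ptr<std::string> makePaddedBuffer(const std::string &FileContents,
                                                size_t PreambleSize,
                                                size_t TotalSize) {
    auto Buf = std::make_unique<std::string>(FileContents.substr(0, PreambleSize));
    Buf->resize(TotalSize, ' ');
    return Buf;
  }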
It's now possible to perform a simple preamble build + parse (+
reparse) with ASTUnit. However, one has to disable a bunch of checking
in the PCH reader to do so. That part isn't committed; it will likely
be handled with some other kind of flag (e.g., -fno-validate-pch).
As part of this, fix some issues with NULL-termination of the memory
buffers created for the preamble; we were explicitly NULL-terminating
them even though they were already being implicitly NULL-terminated,
leading to excess warnings about NULL characters in source files.
llvm-svn: 109445
is present.
Previously we used clang_getCursorExtent(), which requires us to lex the
token at the ending position to determine its length; we would then be
comparing [a, b) source ranges that cover the characters in the range,
rather than following the normal behavior of Clang's source ranges, which
cover the tokens in the range. Moreover, relexing causes us to read the
source file (which may come from a precompiled header), which is rather
unfortunate and affects performance.
In the new scheme, we only use Clang-style source ranges that cover
the tokens in the range. At the entry points where this matters
(clang_annotateTokens, clang_getCursor), we make sure to move source
locations to the start of the token.
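A small sketch of one way to move a source location to the start of its
token using Clang's Lexer API; the wrapper function is hypothetical, and the
actual entry points may well do this differently:

  #include "clang/Basic/LangOptions.h"
  #include "clang/Basic/SourceManager.h"
  #include "clang/Lex/Lexer.h"

  // Return a location pointing at the first character of the token that
  // contains Loc, so [a, b) comparisons line up with token boundaries.
  static clang::SourceLocation moveToTokenStart(clang::SourceLocation Loc,
                                                const clang::SourceManager &SM,
                                                const clang::LangOptions &LangOpts) {
    return clang::Lexer::GetBeginningOfToken(Loc, SM, LangOpts);
  }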
Addresses most of <rdar://problem/8049381>.
llvm-svn: 109134
which is the part of the file that contains all of the initial
comments, includes, and preprocessor directives that occur before any
of the actual code. Added a new -print-preamble cc1 action that is
only used for testing.
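An illustrative (made-up) example of what falls into the preamble under this
definition:

  /* t.c -- everything before 'int' below is the preamble: the initial
     comment, the #include, and the #define all precede any actual code. */
  #include <stdio.h>
  #define GREETING "hello"

  int greet(void) {           /* first real declaration: preamble ends here */
    return printf("%s\n", GREETING);
  }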
llvm-svn: 108913
1) Suppress diagnostics as soon as we form the code-completion
token, so we don't get any error/warning spew from the early
end-of-file.
2) If we consume a code-completion token when we weren't expecting
one, go into a code-completion recovery path that produces the best
results it can based on the context that the parser is in.
llvm-svn: 104585
make it miss (invalid) things like:
<<<<<<<
>>>>>>>
and crash if
<<<<<<<
was at the end of the line. When we find a >>>>>>> that is not at the
end of the line, make sure to reset Pos so we don't crash on something
like:
<<<<<<< >>>>>>>
This isn't worth making testcases for, since each would require a new file.
rdar://7987078 - signal 11 compiling "<<<<<<<<<<"
llvm-svn: 103968
in an input file like this:
# 42
int x;
we were emitting:
# <something>
 int x;
(with a space before the int) because we weren't clearing the leading
whitespace flag properly after the \n from the directive was handled.
llvm-svn: 101084
SourceManager's getBuffer() (and similar) operations. This abstraction
can be used to force callers to cope with errors in getBuffer(), such
as missing files and changed files. Fix a bunch of callers to use the
new interface.
Add some very basic checks for file consistency (file size,
modification time) into ContentCache::getBuffer(), although these
checks don't help much until we've updated the main callers (e.g.,
SourceManager::getSpelling()).
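A rough sketch of that kind of consistency check, assuming POSIX stat(); the
struct and function names here are illustrative, not the actual ContentCache
interface:

  #include <sys/stat.h>
  #include <cstdint>
  #include <ctime>

  struct CachedFileInfo {
    uint64_t Size;     // size recorded when the file was first read
    time_t ModTime;    // modification time recorded at the same point
  };

  // True if the on-disk file still matches the recorded metadata; false
  // means the file is missing or has changed and must be treated as an
  // error by the caller.
  bool fileIsConsistent(const char *Path, const CachedFileInfo &Info) {
    struct stat St;
    if (stat(Path, &St) != 0)
      return false;                              // missing file
    return uint64_t(St.st_size) == Info.Size &&  // size unchanged
           St.st_mtime == Info.ModTime;          // mtime unchanged
  }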
llvm-svn: 98585
doing so invalidates the file guard optimization and is not
in the spirit of "#if 0" because it is supposed to completely
skip everything, even if it isn't lexically valid. Patch by
Abramo Bagnara!
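An illustrative example (not from the patch) of why the skipping must be
complete: the skipped block below is not even lexically valid, yet it has to
be ignored without disturbing the guard optimization on the enclosing header:

  #ifndef GUARD_H
  #define GUARD_H

  #if 0
  none of this needs to lex: an unterminated "string and a stray ` backtick
  #endif

  #endif /* GUARD_H */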
llvm-svn: 95253