This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.
This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.
The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.
Differential revision: https://reviews.llvm.org/D72703
Summary:
As discussed in http://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html
the overflow of the souce locations (limited to 2^31 chars) can generate all sorts of
weird things (bogus warnings, hangs, crashes, miscompilation and correct compilation).
In debug mode this assert would fail. So it might be a good start, as in PR42301,
to detect the failure and exit with a proper error message.
Reviewers: rsmith, thakis, miyuki
Reviewed By: miyuki
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70183
Builtins are rarely if ever accessed via the Preprocessor. They are
typically found on the ASTContext, so there should be no performance
penalty to using a pointer indirection to store the builtin context.
This commit adds an optimization to clang-scan-deps and clang's preprocessor that skips excluded preprocessor
blocks by bumping the lexer pointer, and not lexing the tokens until reaching appropriate #else/#endif directive.
The skip positions and lexer offsets are computed when the file is minimized, directly from the minimized tokens.
On an 18-core iMacPro with macOS Catalina Beta I got 10-15% speed-up from this optimization when running clang-scan-deps on
the compilation database for a recent LLVM and Clang (3511 files).
Differential Revision: https://reviews.llvm.org/D67127
llvm-svn: 371656
when the FileManager is reused across invocations
This commit introduces a parallel API to FileManager's getFile: getFileEntryRef, which returns
a reference to the FileEntry, and the name that was used to access the file. In the case of
a VFS with 'use-external-names', the FileEntyRef contains the external name of the file,
not the filename that was used to access it.
The new API is adopted only in the HeaderSearch and Preprocessor for include file lookup, so that the
accessed path can be propagated to SourceManager's FileInfo. SourceManager's FileInfo now can report this accessed path, using
the new getName method. This API is then adopted in the dependency collector, which now correctly reports dependencies when a file
is included both using a symlink and a real path in the case when the FileManager is reused across multiple Preprocessor invocations.
Note that this patch does not fix all dependency collector issues, as the same problem is still present in other cases when dependencies
are obtained using FileSkipped, InclusionDirective, and HasInclude. This will be fixed in follow-up commits.
Differential Revision: https://reviews.llvm.org/D65907
llvm-svn: 369680
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
Differential revision: https://reviews.llvm.org/D66259
llvm-svn: 368942
Summary:
By adding a hook to consume all tokens produced by the preprocessor.
The intention of this change is to make it possible to consume the
expanded tokens without re-runnig the preprocessor with minimal changes
to the preprocessor and minimal performance penalty when preprocessing
without recording the tokens.
The added hook is very low-level and reconstructing the expanded token
stream requires more work in the client code, the actual algorithm to
collect the tokens using this hook can be found in the follow-up change.
Reviewers: rsmith
Reviewed By: rsmith
Subscribers: eraman, nemanjai, kbarton, jsji, riccibruno, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59885
llvm-svn: 361007
is not used since it consumes all preprocessor directives until it returns
a real token. Using the specific Lexer (i.e. CurLexer->Lex) makes it
possible to stop skipping after an #include or #pragma hdrstop. Previously
the skipping code was only handling CurLexer, now all will be handled
correctly.
Fixes: llvm.org/PR41585
Differential Revision: https://reviews.llvm.org/D61217
llvm-svn: 359506
and the global and private module fragment.
For now, the private module fragment introducer is ignored, but use of
the global module fragment introducer should be properly enforced.
llvm-svn: 358353
internal lexing steps in the preprocessor.
It is not safe to use the preprocessor's token lookahead except when
operating on the final sequence of tokens that would be produced by
phase 4 of translation. Doing so corrupts the token lookahead cache used
by the parser. (See added testcase for an example.) Lookahead should
instead be viewed as a layer on top of the normal lexer.
Added assertions to catch any further incorrect uses of lookahead within
lexing actions.
llvm-svn: 358230
Use the new kind for both angled header-name tokens and for
double-quoted header-name tokens.
This is in preparation for C++20's context-sensitive header-name token
formation rules.
llvm-svn: 356530
tokens.
We now actually form an angled_string_literal token for a header name by
concatenation rather than just working out what its contents would be.
This substantially simplifies downstream processing and is necessary for
C++20 header unit imports.
llvm-svn: 356433
Change MemoryBufferCache to InMemoryModuleCache, moving it from Basic to
Serialization. Another patch will start using it to manage module build
more explicitly, but this is split out because it's mostly mechanical.
Because of the move to Serialization we can no longer abuse the
Preprocessor to forward it to the ASTReader. Besides the rename and
file move, that means Preprocessor::Preprocessor has one fewer parameter
and ASTReader::ASTReader has one more.
llvm-svn: 355777
When a framework with the same name is available at multiple framework
search paths, we use the first matching location. If a framework at this
location doesn't have all the headers, it can be confusing for
developers because they see only an error `'Foo/Foo.h' file not found`,
can find the complete framework with required header, and don't know the
incomplete framework was used instead.
Add a note explaining a framework without required header was found.
Also mention framework directory path to make it easier to find the
incomplete framework.
rdar://problem/39246514
Reviewers: arphaman, erik.pilkington, jkorous
Reviewed By: jkorous
Subscribers: jkorous, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D56561
llvm-svn: 353231
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
When debugging a boost build with a modified
version of Clang, I discovered that the PTH implementation
stores TokenKind in 8 bits. However, we currently have 368
TokenKinds.
The result is that the value gets truncated and the wrong token
gets picked up when including PTH files. It seems that this will
go wrong every time someone uses a token that uses the 9th bit.
Upon asking on IRC, it was brought up that this was a highly
experimental features that was considered a failure. I discovered
via googling that BoostBuild (mostly Boost.Math) is the only user of
this
feature, using the CC1 flag directly. I believe that this can be
transferred over to normal PCH with minimal effort:
https://github.com/boostorg/build/issues/367
Based on advice on IRC and research showing that this is a nearly
completely unused feature, this patch removes it entirely.
Note: I considered leaving the build-flags in place and making them
emit an error/warning, however since I've basically identified and
warned the only user, it seemed better to just remove them.
Differential Revision: https://reviews.llvm.org/D54547
Change-Id: If32744275ef1f585357bd6c1c813d96973c4d8d9
llvm-svn: 348266
Summary:
The dir component ("somedir" in #include <somedir/fo...>) is considered fixed.
We append "foo" to each directory on the include path, and then list its files.
Completions are of the forms:
#include <somedir/fo^
foo.h>
fox/
The filter is set to the filename part ("fo"), so fuzzy matching can be
applied to the filename only.
No fancy scoring/priorities are set, and no information is added to
CodeCompleteResult to make smart scoring possible. Could be in future.
Reviewers: ilya-biryukov
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52076
llvm-svn: 342449
With clang-cl, when the user specifies /Yc or /Yu without a filename
the compiler uses a #pragma hdrstop in the main source file to
determine the end of the PCH. If a header is specified with /Yc or
/Yu #pragma hdrstop has no effect.
The optional #pragma hdrstop filename argument is not yet supported.
Differential Revision: https://reviews.llvm.org/D51391
llvm-svn: 341963
Summary:
Migrate callers to print().
dump() should be useful to downstreams and third parties as a debugging
aid. Everyone trips up on this and creates confusing output.
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D50661
llvm-svn: 339810
Summary:
This change is to support a new fature in clangd, tests will be send toclang-tools-extra with that change.
Unittests are included in: https://reviews.llvm.org/D50449
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50443
llvm-svn: 339540
Implement support for MS-style PCH through headers.
This enables support for /Yc and /Yu where the through header is either
on the command line or included in the source. It replaces the current
support the requires the header also be specified with /FI.
This change adds a -cc1 option -pch-through-header that is used to either
start or stop compilation during PCH create or use.
When creating a PCH, the compilation ends after compilation of the through
header.
When using a PCH, tokens are skipped until after the through header is seen.
Patch By: mikerice
Differential Revision: https://reviews.llvm.org/D46652
llvm-svn: 336379
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
When a '>>' token is split into two '>' tokens (in C++11 onwards), or (as an
extension) when we do the same for other tokens starting with a '>', we can't
just use a location pointing to the first '>' as the location of the split
token, because that would result in our miscomputing the length and spelling
for the token. As a consequence, for example, a refactoring replacing 'A<X>'
with something else would sometimes replace one character too many, and
similarly diagnostics highlighting a template-id source range would highlight
one character too many.
Fix this by creating an expansion range covering the first character of the
'>>' token, whose spelling is '>'. For this to work, we generalize the
expansion range of a macro FileID to be either a token range (the common case)
or a character range (used in this new case).
llvm-svn: 331155
This fixes issues with "class" being reported as an identifier in "enum class" because the construct is not present when using default language options.
Patch by Johann Klähn.
llvm-svn: 330159
Summary:
This patch removes IdentifierInfo from completion token after remembering
the identifier in the preprocessor.
Prior to this patch, completion token had the IdentifierInfo set to null when
completing at the start of identifier and to the II for completion prefix
when in the middle of identifier.
This patch unifies how code completion token is handled when it is insterted
before the identifier and in the middle of the identifier.
The actual IdentifierInfo can still be obtained from the Preprocessor.
Reviewers: bkramer, arphaman
Reviewed By: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42241
llvm-svn: 323133
Summary:
llvm has grown a WritableMemoryBuffer class, which is convertible
(inherits from) a MemoryBuffer. We can use it to avoid conts_casting the
buffer contents when we want to write to it.
Reviewers: dblaikie, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D41387
llvm-svn: 321167
When a preamble ends in a conditional preprocessor block that is being
skipped, the preprocessor needs to continue skipping that block when
the preamble is used.
This fixes PR34570.
llvm-svn: 317308
This patch implements an extension to the preprocessor:
__VA_OPT__(contents) --> which expands into its contents if variadic arguments are supplied to the parent macro, or behaves as an empty token if none.
- Currently this feature is only enabled for C++2a (this could be enabled, with some careful tweaks, for other dialects with the appropriate extension or compatibility warnings)
- The patch was reviewed here: https://reviews.llvm.org/D35782 and asides from the above (and moving some of the definition and expansion recognition logic into the corresponding state machines), I believe I incorporated all of Richard's suggestions.
A few technicalities (most of which were clarified through private correspondence between rsmith, hubert and thomas) are worth mentioning. Given:
#define F(a,...) a #__VA_OPT__(a ## a) a ## __VA_OPT__(__VA_ARGS__)
- The call F(,) Does not supply any tokens for the variadic arguments and hence VA_OPT behaves as a placeholder.
- When expanding VA_OPT (for e.g. F(,1) token pasting occurs eagerly within its contents if the contents need to be stringified.
- A hash or a hashhash prior to VA_OPT does not inhibit expansion of arguments if they are the first token within VA_OPT.
- When a variadic argument is supplied, argument substitution occurs within the contents as does stringification - and these resulting tokens are inserted back into the macro expansions token stream just prior to the entire stream being rescanned and concatenated.
See wg21.link/P0306 for further details on the feature.
Acknowledgment: This patch would have been poorer if not for Richard Smith's usual thoughtful analysis and feedback.
llvm-svn: 315840
This patch fixes broken preamble-skipping when the preamble region includes a byte order mark (BOM). Previously, parsing would fail if preamble PCH generation was enabled and a BOM was present.
This also fixes preamble invalidation when a BOM appears or disappears. This may seem to be an obscure edge case, but it happens regularly with IDEs that pass buffer overrides that never (or always) have a BOM, yet the underlying file from the initial parse that generated a PCH might (or might not) have a BOM.
I've included a test case for these scenarios.
Differential Revision: https://reviews.llvm.org/D37491
llvm-svn: 313796
Summary:
The crash occurs when the first token after a preamble is a macro
expansion.
Fixed by moving replayPreambleConditionalStack from Parser into
Preprocessor. It is now called right after the predefines file is
processed.
Reviewers: erikjv, bkramer, klimek, yvvan
Reviewed By: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36872
llvm-svn: 311330
The goal of this commit is to fix clang-format so it does not merge tokens when
using the alternative spelling keywords. (eg: "not foo" should not become "notfoo")
The problem is that Preprocessor::HandleIdentifier used to drop the identifier info
from the token for these keyword. This means the first condition of
TokenAnnotator::spaceRequiredBefore is not met. We could add explicit check for
the spelling in that condition, but I think it is better to keep the IdentifierInfo
and handle the operator keyword explicitly when needed. That actually leads to simpler
code, and probably slightly more efficient as well.
Another side effect of this change is that __identifier(and) will now work as
one would expect, removing a FIXME from the MicrosoftExtensions.cpp test
Differential Revision: https://reviews.llvm.org/D35172
llvm-svn: 308008
UBSan found an issue with a nullptr being assigned to a reference.
This was because a following function went back and checked the
identifier in the CPPOperatorName case. This patch corrects that
location with the original logic as well.
llvm-svn: 305128
to support operator keywords used in Windows SDK, alter token type when
seen in system headers
Hello, I submitted D33505 to address this problem, but the
proposal was rejected as too big a hammer.
This change will allow clang to parse the WindowsSDK header <query.h>
which uses the operator name "or" as a field name. Treat cpp operator
keywords as ordinary identifiers inside the Microsoft headers, but
treat them as usual in the user's program.
Original Submitter: Melanie Blower (mibintc)
Differential Revision: https://reviews.llvm.org/D33782
llvm-svn: 305087
Previously, a preamble only included #if blocks (and friends like
ifdef) if there was a corresponding #endif before any declaration or
definition. The problem is that any header file that uses include guards
will not have a preamble generated, which can make code-completion very
slow.
To prevent errors about unbalanced preprocessor conditionals in the
preamble, and unbalanced preprocessor conditionals after a preamble
containing unfinished conditionals, the conditional stack is stored
in the pch file.
This fixes PR26045.
Differential Revision: http://reviews.llvm.org/D15994
llvm-svn: 304207