Commit Graph

315 Commits

Author SHA1 Message Date
Richard Smith b5f8171a1b PR37189 Fix incorrect end source location and spelling for a split '>>' token.
When a '>>' token is split into two '>' tokens (in C++11 onwards), or (as an
extension) when we do the same for other tokens starting with a '>', we can't
just use a location pointing to the first '>' as the location of the split
token, because that would result in our miscomputing the length and spelling
for the token. As a consequence, for example, a refactoring replacing 'A<X>'
with something else would sometimes replace one character too many, and
similarly diagnostics highlighting a template-id source range would highlight
one character too many.

Fix this by creating an expansion range covering the first character of the
'>>' token, whose spelling is '>'. For this to work, we generalize the
expansion range of a macro FileID to be either a token range (the common case)
or a character range (used in this new case).

llvm-svn: 331155
2018-04-30 05:25:48 +00:00
Ilya Biryukov ef4ece75fd [CodeComplete] Fix completion in the middle of ident in ctor lists.
Summary:
The example that was broken before (^ designates completion points):

    class Foo {
      Foo() : fie^ld^() {} // no completions were provided here.
      int field;
    };

To fix it we don't cut off lexing after an identifier followed by code
completion token is lexed. Instead we skip the rest of identifier and
continue lexing.
This is consistent with behavior of completion when completion token is
right before the identifier.

Reviewers: sammccall, aaron.ballman, bkramer, sepavloff, arphaman, rsmith

Reviewed By: aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D44932

llvm-svn: 330833
2018-04-25 15:13:34 +00:00
Ilya Biryukov b3510c4254 [CodeComplete] Fix completion at the end of keywords
Summary:
Make completion behave consistently no matter if it is run at the
start, in the middle or at the end of an identifier that happens to
be a keyword or a macro name. Since completion is often ran on
incomplete identifiers, they may turn into keywords by accident.

For example, we should produce same results for all of these
completion points:

    // ^ is completion point.
    ^class
    cla^ss
    class^

Previously clang produced different results for the last case (as if
the completion point was after a space: `class ^`).

This change also updates some offsets in tests that (unintentionally?)
relied on the old behavior.

Reviewers: sammccall, bkramer, arphaman, aaron.ballman

Reviewed By: sammccall

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D45887

llvm-svn: 330717
2018-04-24 13:48:53 +00:00
Alexander Kornienko 2a8c18d991 Fix typos in clang
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:

  archtype
  cas
  classs
  checkk
  compres
  definit
  frome
  iff
  inteval
  ith
  lod
  methode
  nd
  optin
  ot
  pres
  statics
  te
  thru

Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)

Differential revision: https://reviews.llvm.org/D44188

llvm-svn: 329399
2018-04-06 15:14:32 +00:00
Volodymyr Sapsai abb8dfc114 [Lex] Avoid out-of-bounds dereference in LexAngledStringLiteral.
Fix makes the loop in LexAngledStringLiteral more like the loops in
LexStringLiteral, LexCharConstant. When we skip a character after
backslash, we need to check if we reached the end of the file instead of
reading the next character unconditionally.

Discovered by OSS-Fuzz:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3832

rdar://problem/35572754

Reviewers: arphaman, kcc, rsmith, dexonsmith

Reviewed By: rsmith, dexonsmith

Subscribers: cfe-commits, rsmith, dexonsmith

Differential Revision: https://reviews.llvm.org/D41423

llvm-svn: 322390
2018-01-12 18:54:35 +00:00
Richard Smith 77091b167f Warn if we find a Unicode homoglyph for a symbol in an identifier.
Specifically, warn if:
 * we find a character that the language standard says we must treat as an
   identifier, and
 * that character is not reasonably an identifier character (it's a punctuation
   character or similar), and 
 * it renders identically to a valid non-identifier character in common
   fixed-width fonts.

Some tools "helpfully" substitute the surprising characters for the expected
characters, and replacing semicolons with Greek question marks is a common
"prank".

llvm-svn: 320697
2017-12-14 13:15:08 +00:00
Eugene Zelenko cb96ac64b0 [Lex] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 320207
2017-12-08 22:39:26 +00:00
Taewook Oh cebac48bf7 Stringizing raw string literals containing newline
Summary: This patch implements 4.3 of http://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4220.pdf. If a raw string contains a newline character, replace each newline character with the \n escape code. Without this patch, included test case (macro_raw_string.cpp) results compilation failure.

Reviewers: rsmith, doug.gregor, jkorous-apple

Reviewed By: jkorous-apple

Subscribers: jkorous-apple, vsapsai, cfe-commits

Differential Revision: https://reviews.llvm.org/D39279

llvm-svn: 319904
2017-12-06 17:00:53 +00:00
Aaron Ballman c351fba69e Now that C++17 is official (https://www.iso.org/standard/68564.html), start changing the C++1z terminology over to C++17. NFC intended, these are all mechanical changes.
llvm-svn: 319688
2017-12-04 20:27:34 +00:00
Richard Smith edbf5972a4 [c++2a] P0515R3: lexer support for new <=> token.
llvm-svn: 319509
2017-12-01 01:07:10 +00:00
Alex Lorenz ebbbb81266 [refactor][extract] insert semicolons into extracted/inserted code
when needed

This commit implements the semicolon insertion logic into the extract
refactoring. The following rules are used:

- extracting expression: add terminating ';' to the extracted function.
- extracting statements that don't require terminating ';' (e.g. switch): add
  terminating ';' to the callee.
- extracting statements with ';':  move (if possible) the original ';' from the
  callee and add terminating ';'.
- otherwise, add ';' to both places.

Differential Revision: https://reviews.llvm.org/D39441

llvm-svn: 317343
2017-11-03 18:11:22 +00:00
Aaron Ballman 606093a53b Add -f[no-]double-square-bracket-attributes as new driver options to control use of [[]] attributes in all language modes. This is the initial implementation of WG14 N2165, which is a proposal to add [[]] attributes to C2x, but also allows you to enable these attributes in C++98, or disable them in C++11 or later.
llvm-svn: 315856
2017-10-15 15:01:42 +00:00
Alex Lorenz d5bf436d3a [Lex] Avoid out-of-bounds dereference in SkipLineComment
Credit to OSS-Fuzz for discovery:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3145

rdar://34526482

llvm-svn: 315785
2017-10-14 01:18:30 +00:00
Alex Lorenz c1e32fca96 A '<' with a trigraph '#' is not a valid editor placeholder
Credit to OSS-Fuzz for discovery:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3137#c5

rdar://34923985

llvm-svn: 315398
2017-10-11 00:41:20 +00:00
Cameron Desrochers 531aec2e90 Fixed unused variable warning introduced in r313796 causing build failure
llvm-svn: 313802
2017-09-20 19:37:37 +00:00
Cameron Desrochers 84fd064ef9 [PCH] Fixed preamble breaking with BOM presence (and particularly, fluctuating BOM presence)
This patch fixes broken preamble-skipping when the preamble region includes a byte order mark (BOM). Previously, parsing would fail if preamble PCH generation was enabled and a BOM was present.

This also fixes preamble invalidation when a BOM appears or disappears. This may seem to be an obscure edge case, but it happens regularly with IDEs that pass buffer overrides that never (or always) have a BOM, yet the underlying file from the initial parse that generated a PCH might (or might not) have a BOM.

I've included a test case for these scenarios.

Differential Revision: https://reviews.llvm.org/D37491

llvm-svn: 313796
2017-09-20 19:03:37 +00:00
Erich Keane e916d54614 [Preprocessor] Correct internal token parsing of newline characters in CRLF
Correct implementation:  Apparently I managed in r311683 to submit the wrong
version of the patch for this, so I'm correcting it now.

Differential Revision: https://reviews.llvm.org/D37079

llvm-svn: 312542
2017-09-05 17:32:36 +00:00
Erich Keane 5a2b322e0d [Preprocessor] Correct internal token parsing of newline characters in CRLF
Discovered due to a goofy git setup, the test system-headerline-directive.c 
(and a few others) failed because the token-consumption will consume only the 
'\r' in CRLF, making the preprocessor's printed value give the wrong line number 
when returning from an include. For example:

(line 1):#include <noline.h>\r\n

The "file exit" code causes the printer to try to print the 'returned to the 
main file' line. It looks up what the current line number is. However, since the 
current 'token' is the '\n' (since only the \r was consumed), it will give the 
line number as '1", not '2'. This results in a few failed tests, but more 
importantly, results in error messages being incorrect when compiling a 
previously preprocessed file.

Differential Revision: https://reviews.llvm.org/D37079

llvm-svn: 311683
2017-08-24 18:36:07 +00:00
Alexander Kornienko cf007a7614 [Lexer] Finding beginning of token with escaped new line
Summary:
Lexer::GetBeginningOfToken produced invalid location when
backtracking across escaped new lines.

This fixes PR26228

Reviewers: akyrtzi, alexfh, rsmith, doug.gregor

Reviewed By: alexfh

Subscribers: alexfh, cfe-commits

Patch by Paweł Żukowski!

Differential Revision: https://reviews.llvm.org/D30748

llvm-svn: 310576
2017-08-10 10:06:16 +00:00
Erik Verbruggen 795eee96b3 Fix invalid warnings for header guards in preambles
Fixes https://bugs.llvm.org/show_bug.cgi?id=33574

Differential Revision: https://reviews.llvm.org/D34882

llvm-svn: 307134
2017-07-05 09:44:07 +00:00
Alex Lorenz fb7654a8fc [PR33394] Avoid lexing editor placeholders when Clang is used only
for preprocessing

r300667 added support for editor placeholder to Clang. That commit didn’t take
into account that users who use Clang for preprocessing only (-E) will get the
"editor placeholder in source file" error when preprocessing their source
(PR33394). This commit ensures that Clang doesn't lex editor placeholders when
running a preprocessor only action.

rdar://32718000

Differential Revision: https://reviews.llvm.org/D34256

llvm-svn: 305576
2017-06-16 20:13:39 +00:00
Galina Kistanova 39edaaa65c Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.
llvm-svn: 304643
2017-06-03 06:25:47 +00:00
Erik Verbruggen b34c79ff27 Allow for unfinished #if blocks in preambles
Previously, a preamble only included #if blocks (and friends like
ifdef) if there was a corresponding #endif before any declaration or
definition. The problem is that any header file that uses include guards
will not have a preamble generated, which can make code-completion very
slow.

To prevent errors about unbalanced preprocessor conditionals in the
preamble, and unbalanced preprocessor conditionals after a preamble
containing unfinished conditionals, the conditional stack is stored
in the pch file.

This fixes PR26045.

Differential Revision: http://reviews.llvm.org/D15994

llvm-svn: 304207
2017-05-30 11:54:55 +00:00
Alex Lorenz d47546612d [Lexer] Ensure that the token is not an annotation token when
retrieving the identifer info for an Objective-C keyword

This commit fixes an assertion that's triggered in getIdentifier when the token
is an annotation token.

rdar://32225463

llvm-svn: 303246
2017-05-17 11:08:36 +00:00
Alex Lorenz 9c5c2bfe54 Add a fix-it for -Wunguarded-availability
This patch adds a fix-it for the -Wunguarded-availability warning. This fix-it
is similar to the Swift one: it suggests that you wrap the statement in an
`if (@available)` check. The produced fixits are indented (just like the Swift
ones) to make them look nice in Xcode's fix-it preview.

rdar://31680358

Differential Revision: https://reviews.llvm.org/D32424

llvm-svn: 302253
2017-05-05 16:42:44 +00:00
Alex Lorenz 1be800c511 Add support for editor placeholders to Clang
This commit teaches Clang to recognize editor placeholders that are produced
when an IDE like Xcode inserts a code-completion result that includes a
placeholder. Now when the lexer sees a placeholder token, it emits an
'editor placeholder in source file' error and creates an identifier token
that represents the placeholder. The parser/sema can now recognize the
placeholders and can suppress the diagnostics related to the placeholders. This
ensures that live issues in an IDE like Xcode won't get spurious diagnostics
related to placeholders.

This commit also adds a new compiler option named '-fallow-editor-placeholders'
that silences the 'editor placeholder in source file' error. This is useful
for an IDE like Xcode as we don't want to display those errors in live issues.

rdar://31581400

Differential Revision: https://reviews.llvm.org/D32081

llvm-svn: 300667
2017-04-19 08:58:56 +00:00
Richard Smith 4c132e576b Do not warn about whitespace between ??/ trigraph and newline in line comments if trigraphs are disabled in the current language.
llvm-svn: 300609
2017-04-18 21:45:04 +00:00
Richard Smith 1d2ae94b5d Fix mishandling of escaped newlines followed by newlines or nuls.
Previously, if an escaped newline was followed by a newline or a nul, we'd lex
the escaped newline as a bogus space character. This led to a bunch of
different broken corner cases:

For the pattern "\\\n\0#", we would then have a (horizontal) space whose
spelling ends in a newline, and would decide that the '#' is at the start of a
line, and incorrectly start preprocessing a directive in the middle of a
logical source line. If we were already in the middle of a directive, this
would result in our attempting to process multiple directives at the same time!
This resulted in crashes, asserts, and hangs on invalid input, as discovered by
fuzz-testing.

For the pattern "\\\n" at EOF (with an implicit following nul byte), we would
produce a bogus trailing space character with spelling "\\\n". This was mostly
harmless, but would lead to clang-format getting confused and misformatting in
rare cases. We now produce a trailing EOF token with spelling "\\\n",
consistent with our handling for other similar cases -- an escaped newline is
always part of the token containing the next character, if any.

For the pattern "\\\n\n", this was somewhat more benign, but would produce an
extraneous whitespace token to clients who care about preserving whitespace.
However, it turns out that our lexing for line comments was relying on this bug
due to an off-by-one error in its computation of the end of the comment, on the
slow path where the comment might contain escaped newlines.

llvm-svn: 300515
2017-04-17 23:44:51 +00:00
Sanne Wouda db1bdf472a Skip Unicode character expansion in assembly files
Summary: When using the C preprocessor with assembly files, either with a
capital `S` file extension, or with `-xassembler-with-cpp`, the Unicode escape
sequence `\u` is ignored. The `\u` pattern can be used for expanding a macro
argument that starts with `u`.

Author: Salman Arif <salman.arif@arm.com>

Reviewers: rengolin, olista01

Reviewed By: olista01

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D31765

llvm-svn: 299754
2017-04-07 10:13:00 +00:00
Eric Fiselier cb2f326a75 Allow lexer to handle string_view literals. Patch from Anton Bikineev.
This implements the compiler side of p0403r0. This patch was reviewed as
https://reviews.llvm.org/D26829.

llvm-svn: 290744
2016-12-30 04:51:10 +00:00
Justin Lebar 9091055efa Move UTF functions into namespace llvm.
Summary:
This lets people link against LLVM and their own version of the UTF
library.

I determined this only affects llvm, clang, lld, and lldb by running

$ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
  clang
  lld
  lldb
  llvm

Tested with

  ninja lldb
  ninja check-clang check-llvm check-lld

(ninja check-lldb doesn't complete for me with or without this patch.)

Reviewers: rnk

Subscribers: klimek, beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D24996

llvm-svn: 282822
2016-09-30 00:38:45 +00:00
Eugene Zelenko e95e7d5d64 Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes.
Differential revision: https://reviews.llvm.org/D24115

llvm-svn: 280870
2016-09-07 21:53:17 +00:00
Vassil Vassilev 644ea61d2d Implement filtering for code completion of identifiers.
Patch by Cristina Cristescu and Axel Naumann!

Agreed on post commit review (D17820).

llvm-svn: 276878
2016-07-27 14:56:59 +00:00
Benjamin Kramer 22f24f6815 [Lexer] Let the compiler infer string lengths. No functionality change intended.
llvm-svn: 265126
2016-04-01 10:04:07 +00:00
Benjamin Kramer e550bbdf9d [Lexer] Don't read out of bounds if a conflict marker is at the end of a file
This can happen as we look for '<<<<' while scanning tokens but then expect
'<<<<\n' to tell apart perforce from diff3 conflict markers. Just harden
the pointer arithmetic.

Found by libfuzzer + asan!

llvm-svn: 265125
2016-04-01 09:58:45 +00:00
Richard Smith 560a3579b2 Update diagnostics now that hexadecimal literals look likely to be part of C++17.
llvm-svn: 262753
2016-03-04 22:32:06 +00:00
Richard Trieu cc3949d99a Remove use of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261271
2016-02-18 22:34:54 +00:00
Anastasia Stulova 735c6cdebd [OpenCL] Adding reserved operator logical xor for OpenCL
This patch adds the reserved operator ^^ when compiling for OpenCL (spec v1.1 s6.3.g),
which results in a more meaningful error message.

Patch by Neil Hickey!

Review: http://reviews.llvm.org/D13280

M    test/SemaOpenCL/unsupported.cl
M    include/clang/Basic/TokenKinds.def
M    include/clang/Basic/DiagnosticParseKinds.td
M    lib/Basic/OperatorPrecedence.cpp
M    lib/Lex/Lexer.cpp
M    lib/Parse/ParseExpr.cpp

llvm-svn: 259651
2016-02-03 15:17:14 +00:00
Richard Trieu 3a5c958182 Fix -Wnull-conversion for long macros.
Move the function to get a macro name from DiagnosticRenderer.cpp to Lexer.cpp
so that other files can use it.  Lexer now has two functions to get the
immediate macro name, the newly added one is better for diagnostic purposes.
Make -Wnull-conversion use this function for better NULL macro detection.

llvm-svn: 258778
2016-01-26 02:51:55 +00:00
Nico Weber de2310bddf Emit a -Wmicrosoft warning when treating ^Z as EOF in MS mode.
llvm-svn: 256596
2015-12-29 23:17:27 +00:00
Vinicius Tinti 92e68c2766 [clang] Disable Unicode in asm files
Clang should not convert tokens to Unicode when preprocessing assembly
files.

Fixes PR25558.

llvm-svn: 253738
2015-11-20 23:42:39 +00:00
Craig Topper 7f5ff2175f Use %select to merge similar diagnostics. NFC
llvm-svn: 253119
2015-11-14 02:09:55 +00:00
Craig Topper a6324c9463 Disable trigraph and escaped newline expansion on all types of raw string literals not just ASCII type.
llvm-svn: 251025
2015-10-22 15:35:21 +00:00
Rafael Espindola c0f18a91ac Replace a few std::string& with StringRef. NFC.
Patch by Косов Евгений!

llvm-svn: 238774
2015-06-01 20:00:16 +00:00
Kostya Serebryany 6c2479bee4 Fix buffer overflow in Lexer
Summary:
Fix PR22407, where the Lexer overflows the buffer when parsing
 #include<\
(end of file after slash)

Test Plan:
Added a test that will trigger in asan build.
This case is also covered by the clang-fuzzer bot.

Reviewers: rnk

Reviewed By: rnk

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D9489

llvm-svn: 236466
2015-05-04 22:30:29 +00:00
Benjamin Kramer f04f98d543 Use delegating ctors to reduce code duplication. NFC.
llvm-svn: 231476
2015-03-06 14:15:57 +00:00
David Majnemer 5a54977ea8 Lex: Don't crash if both conflict markers are on the same line
We would check if the terminator marker is on a newline.  However, the
logic would end up out-of-bounds if the terminator marker immediately
follows the start marker.

This fixes PR21820.

llvm-svn: 224210
2014-12-14 04:53:11 +00:00
Richard Smith 3e3a705062 [c++1z] Support for u8 character literals.
llvm-svn: 221576
2014-11-08 06:08:42 +00:00
Jay Foad 6af95d3864 Fix warning in Altivec code when building with GCC 4.8.2 on Ubuntu 14.04.
llvm-svn: 220855
2014-10-29 14:42:12 +00:00
Aaron Ballman dd69ef38db C++1y is now C++14!
Changes diagnostic options, language standard options, diagnostic identifiers, diagnostic wording to use c++14 instead of c++1y. It also modifies related test cases to use the updated diagnostic wording.

llvm-svn: 215982
2014-08-19 15:55:55 +00:00