Commit Graph

151 Commits

Author SHA1 Message Date
Benjamin Kramer 8671028e95 [lex] Don't read past the end of the buffer
While dereferencing ThisTokEnd is fine and we know that it's not in
[a-zA-Z0-9_.], ThisTokEnd[1] is really past the end.

Found by asan and with a little help from clang-fuzz.

llvm-svn: 233491
2015-03-29 14:11:37 +00:00
Benjamin Kramer 7fd88386b0 [lex] Turn range checks into asserts.
We know that the last accessible char is not in [a-zA-Z0-9_.] so we can
happily scan on as long as it is. No functionality change.

llvm-svn: 233490
2015-03-29 14:11:22 +00:00
David Blaikie 96cedb52b3 Make Overflow tracking more legible (CR feedback from Richard Smith on r232999)
llvm-svn: 233006
2015-03-23 19:54:44 +00:00
David Blaikie 252f743858 Refactor: Simplify boolean expressions in lib/Lex
Simplify boolean expressions using `true` and `false` with `clang-tidy`

Patch by Richard Thomson.

Differential Revision: http://reviews.llvm.org/D8531

llvm-svn: 232999
2015-03-23 19:39:19 +00:00
Richard Smith 3e3a705062 [c++1z] Support for u8 character literals.
llvm-svn: 221576
2014-11-08 06:08:42 +00:00
Aaron Ballman dd69ef38db C++1y is now C++14!
Changes diagnostic options, language standard options, diagnostic identifiers, diagnostic wording to use c++14 instead of c++1y. It also modifies related test cases to use the updated diagnostic wording.

llvm-svn: 215982
2014-08-19 15:55:55 +00:00
Craig Topper 9d5583ef0a Convert StringLiteralParser constructor to use ArrayRef instead of a pointer and count.
llvm-svn: 211763
2014-06-26 04:58:39 +00:00
David Majnemer 65a407c2ce Lex: Use the correct types for MS integer suffixes
Something went wrong with r211426, it is an older version of this code
and should not have been committed.  It was reverted with r211434.

Original commit message:
We didn't properly implement support for the sized integer suffixes.
Suffixes like i16 were essentially ignored instead of mapping them to
the appropriately sized integer type.

This fixes PR20008.

Differential Revision: http://reviews.llvm.org/D4132

llvm-svn: 211441
2014-06-21 18:46:07 +00:00
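
The suffix handling described in this commit can be illustrated with a minimal sketch. It is not the Clang implementation: the helper name `microsoftSuffixWidth` is hypothetical, and the sketch assumes the sized suffixes are spelled `i8`, `i16`, `i32` and `i64` (with either case of `i`). It only shows the idea of mapping a suffix to a bit width instead of ignoring it.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not taken from Clang: returns the bit width requested
// by a Microsoft-style sized integer suffix, or std::nullopt if the literal
// does not end in one of the assumed spellings i8, i16, i32, i64.
std::optional<unsigned> microsoftSuffixWidth(std::string_view literal) {
  const std::size_t pos = literal.find_last_of("iI");
  // A suffix needs at least one character of literal text in front of it.
  if (pos == std::string_view::npos || pos == 0)
    return std::nullopt;

  const std::string_view width = literal.substr(pos + 1);
  if (width == "8")  return 8u;
  if (width == "16") return 16u;
  if (width == "32") return 32u;
  if (width == "64") return 64u;
  return std::nullopt;  // e.g. "42i7" or a plain "42"
}
```

With this sketch, `microsoftSuffixWidth("42i16")` yields 16, so a caller could pick a 16-bit integer type for the literal rather than falling back to `int`.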
Rafael Espindola d46e4a2303 Revert "Lex: Use the correct types for MS integer suffixes"
This reverts commit r211426.

This broke the arm bots. The crash can be reproduced on X86 by running:
./bin/clang -cc1  -fsyntax-only -verify -fms-extensions ~/llvm/clang/test/Lexer/ms-extensions.c -triple arm-linux

llvm-svn: 211434
2014-06-21 12:39:25 +00:00
David Majnemer 252cbe25cb Lex: Use the correct types for MS integer suffixes
We didn't properly implement support for the sized integer suffixes.
Suffixes like i16 were essentially ignored instead of mapping them to
the appropriately sized integer type.

This fixes PR20008.

Differential Revision: http://reviews.llvm.org/D4132

llvm-svn: 211426
2014-06-21 00:51:59 +00:00
Peter Collingbourne efe09b4c65 Permit the "if" literal suffix with Microsoft extensions enabled.
Differential Revision: http://reviews.llvm.org/D3963

llvm-svn: 209859
2014-05-29 23:10:15 +00:00
Alexander Kornienko d3b4e08960 Remove limits on the number of fix-it hints and ranges in the DiagnosticsEngine.
Summary:
The limits on the number of fix-it hints and ranges attached to a
diagnostic are arbitrary and don't apply universally to all users of the
DiagnosticsEngine. The way the limits are enforced may lead to diagnostics
generating invalid sets of fixes. I suggest removing the limits, which will also
simplify the implementation.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: klimek, cfe-commits

Differential Revision: http://reviews.llvm.org/D3879

llvm-svn: 209468
2014-05-22 19:56:11 +00:00
Craig Topper d2d442ca73 [C++11] Use 'nullptr'. Lex edition.
llvm-svn: 209083
2014-05-17 23:10:59 +00:00
Richard Smith 70ee92fa4d Add some missing checks for C++1y digit separators that don't in fact separate
digits. Turns out we have completely separate lexing codepaths for floating
point numbers depending on whether or not they start with a zero. Who knew...
=)

llvm-svn: 206932
2014-04-22 23:50:25 +00:00
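
The check for separators that "don't in fact separate digits" can be sketched as follows. This is an illustration under simplifying assumptions, not Clang's lexer code: it handles only decimal digit sequences, and the names `isDecimalDigit` and `stripDigitSeparators` are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not taken from Clang: ASCII decimal digit test.
constexpr bool isDecimalDigit(char c) { return c >= '0' && c <= '9'; }

// Removes ' separators from a decimal digit sequence. Returns std::nullopt
// if a separator is not directly between two digits, if any other character
// appears, or if the input contains no digits at all.
std::optional<std::string> stripDigitSeparators(std::string_view text) {
  std::string digits;
  for (std::size_t i = 0; i < text.size(); ++i) {
    const char c = text[i];
    if (isDecimalDigit(c)) {
      digits.push_back(c);
      continue;
    }
    // A separator is only valid with a digit on both sides.
    const bool separates = c == '\'' && i > 0 && i + 1 < text.size() &&
                           isDecimalDigit(text[i - 1]) &&
                           isDecimalDigit(text[i + 1]);
    if (!separates)
      return std::nullopt;
  }
  if (digits.empty())
    return std::nullopt;
  return digits;
}
```

Under these assumptions `1'000'000` is accepted, while a leading, trailing, or doubled separator is rejected.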
David Blaikie dcb72d72ff Remove uses of SmallString::equals in favor of SmallVectorImpl<char>'s operator==
llvm-svn: 203373
2014-03-09 05:18:27 +00:00
Richard Smith 8b7258bdb3 PR18855: Add support for UCNs and UTF-8 encoding within ud-suffixes.
llvm-svn: 201532
2014-02-17 21:52:30 +00:00
NAKAMURA Takumi f2bc8f35a2 NumericLiteralParser::ParseNumberStartingWithZero(): Try to appease MSC16's miscompilation.
Investigating yet. It seems msc16 miscompiles s[1] to be folded.

llvm-svn: 191485
2013-09-27 04:42:28 +00:00
Richard Smith 99dc071104 Fix buildbot breakage.
llvm-svn: 191424
2013-09-26 05:57:03 +00:00
Richard Smith 1e130489b3 Replace a bool with an enum for clarity, based on review comment from James Dennett.
llvm-svn: 191420
2013-09-26 04:19:11 +00:00
Richard Smith fde9485297 Implement C++1y digit separator proposal (' as a digit separator). This is not
yet approved by full committee, but was unanimously supported by EWG.

llvm-svn: 191417
2013-09-26 03:33:06 +00:00
Richard Smith 2a98862be2 Handle standard libraries that miss out the space when defining the standard
literal operators. Also, for now, allow the proposed C++1y "il", "i", and "if"
suffixes too. (Will revert the latter if LWG decides not to go ahead with that
change after all.)

llvm-svn: 191274
2013-09-24 04:06:10 +00:00
Eli Friedman f9edb00fa4 Fix CharByteWidth assertion in LiteralSupport.
Patch by Eelis van der Weegen.

llvm-svn: 190971
2013-09-18 23:23:13 +00:00
Nick Lewycky 8054f1de88 Revert r188863 which could propose wrong fixits for multibyte character literals.
llvm-svn: 188918
2013-08-21 18:57:51 +00:00
Nick Lewycky 3151d7c76a Issue fixits replacing invalid character literals with the equivalent \xNN
escape code.

llvm-svn: 188863
2013-08-21 04:10:58 +00:00
Nick Lewycky 63cc55b479 No functionality change. Adjust a bunch of formatting issues in this code and
fix a typo in a comment.

llvm-svn: 188857
2013-08-21 02:40:19 +00:00
Richard Smith f4198b7598 C++1y literal suffix support:
* Allow ns, us, ms, s, min, h as numeric ud-suffixes
 * Allow s as string ud-suffix

llvm-svn: 186933
2013-07-23 08:14:48 +00:00
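
For context, the suffixes listed in this commit are the ones the C++14 standard library defines in `std::chrono_literals` and `std::string_literals`. The short example below shows how they are used; the function names `totalDuration` and `withEmbeddedNul` are made up for illustration.

```cpp
#include <chrono>
#include <string>

using namespace std::chrono_literals;
using namespace std::string_literals;

// Mixed-unit arithmetic: the result is converted to the finest unit used.
std::chrono::milliseconds totalDuration() {
  return 1h + 30min + 15s + 250ms;
}

// The string suffix builds a std::string that keeps the embedded NUL,
// unlike construction from a plain const char*.
std::string withEmbeddedNul() {
  return "a\0b"s;
}
```

Here `totalDuration()` is 5,415,250 milliseconds, and `withEmbeddedNul()` has three characters.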
Eli Friedman 088d39afc6 Integers which are too large should be an error.
Switch some warnings over to errors which should never have been warnings
in the first place.  (Also, a minor fix to the preprocessor rules for
integer literals while I'm here.)

llvm-svn: 186903
2013-07-23 00:25:18 +00:00
Richard Smith c5c27f2a1f Note that we support (and in fact have supported since the dawn of time itself)
C++1y binary literals.

llvm-svn: 179883
2013-04-19 20:47:20 +00:00
Jordan Rose a7d03840e6 Excise <cctype> from Clang (except clang-tblgen) in favor of CharInfo.h.
Nearly all of these changes are one-to-one replacements; the few that
aren't have to do with custom identifier validation.

llvm-svn: 174768
2013-02-08 22:30:41 +00:00
Dmitri Gribenko 9feeef40f5 Move UTF conversion routines from clang/lib/Basic to llvm/lib/Support
This is required to use them in TableGen.

llvm-svn: 173924
2013-01-30 12:06:08 +00:00
Jordan Rose c0cba27230 PR15067: Don't assert when a UCN appears in a C90 file.
Unfortunately, we can't accept the UCN as an extension because we're
required to treat it as two tokens for preprocessing purposes.

llvm-svn: 173622
2013-01-27 20:12:04 +00:00
Jordan Rose aa89cf1a66 Unify diagnostics for \x, \u, and \U without any following hex digits.
llvm-svn: 173368
2013-01-24 20:50:13 +00:00
Jordan Rose 78ed86a7e5 Adopt llvm::hexDigitValue.
llvm-svn: 172861
2013-01-18 22:33:58 +00:00
Richard Smith 2bf7fdb723 s/CPlusPlus0x/CPlusPlus11/g
llvm-svn: 171367
2013-01-02 11:42:31 +00:00
Chandler Carruth 3a02247dc9 Sort all of Clang's files under 'lib', and fix up the broken headers
uncovered.

This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.

I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.

llvm-svn: 169237
2012-12-04 09:13:33 +00:00
Benjamin Kramer 7d574e269d LiteralSupport: Don't overflow the temporary buffer when decoding invalid string parts.
Instead just use a dummy buffer, we're not going to use the decoded string anyways.
Fixes PR14292.

llvm-svn: 167594
2012-11-08 19:22:31 +00:00
Benjamin Kramer f23a6e6f80 LiteralSupport: Clean up style violations. No functionality change.
llvm-svn: 167593
2012-11-08 19:22:26 +00:00
David Blaikie a0613170b4 Handle string encoding diagnostics when there are too many invalid ranges.
llvm-svn: 167059
2012-10-30 23:22:22 +00:00
Seth Cantrell 4cfc817a9a improve highlighting of invalid string encodings
limit highlight to exactly the bad encoding, and highlight every
bad encoding in a string.

llvm-svn: 166900
2012-10-28 18:24:46 +00:00
Jordan Rose de584de370 Rename CanFitInto64Bits to alwaysFitsInto64Bits per discussion on IRC.
This makes the behavior clearer concerning literals with the maximum
number of digits. For a 32-bit example, 4,000,000,000 is a valid uint32_t,
but 5,000,000,000 is not, so we'd have to count 10-digit decimal numbers
as "unsafe" (meaning we have to check for overflow when parsing them,
just as we would for numbers with 11 digits or higher). This is the same,
only with 64 bits to play with.

No functionality change.

llvm-svn: 164639
2012-09-25 22:32:51 +00:00
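
The digit-count reasoning in this commit message can be sketched as below. This is a reconstruction from the description, not a copy of the Clang source: the thresholds are derived from the 64-bit limit, and `parseDecimal` is a hypothetical caller added to show where the overflow check can be skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// True when every literal with `numDigits` digits in `radix` is guaranteed
// to fit into 64 bits, so no overflow check is needed while parsing.
constexpr bool alwaysFitsInto64Bits(unsigned radix, unsigned numDigits) {
  switch (radix) {
  case 2:  return numDigits <= 64;
  case 8:  return numDigits <= 64 / 3;  // 21 digits use at most 63 bits
  case 10: return numDigits <= 19;      // 10^19 - 1 < 2^64, but 20 digits may not fit
  case 16: return numDigits <= 64 / 4;
  default: return false;                // unknown radix: always check
  }
}

// Hypothetical caller: parses decimal digits, checking for overflow only
// when the digit count makes overflow possible.
std::optional<std::uint64_t> parseDecimal(std::string_view digits) {
  const bool safe =
      alwaysFitsInto64Bits(10, static_cast<unsigned>(digits.size()));
  std::uint64_t value = 0;
  for (char c : digits) {
    assert(c >= '0' && c <= '9' && "caller must pass only decimal digits");
    const std::uint64_t d = static_cast<std::uint64_t>(c - '0');
    // value * 10 + d would exceed UINT64_MAX exactly when this holds.
    if (!safe && value > (UINT64_MAX - d) / 10)
      return std::nullopt;
    value = value * 10 + d;
  }
  return value;
}
```

A 20-digit input such as 18446744073709551615 takes the checked path and still parses, while 18446744073709551616 is reported as overflow; inputs of 19 digits or fewer never pay for the check.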
Dmitri Gribenko 511288b2b5 Optimize NumericLiteralParser::GetIntegerValue().
It does a conservative estimate on the size of numbers that can fit into
uint64_t.  This bound is improved.

llvm-svn: 164624
2012-09-25 19:09:15 +00:00
Dmitri Gribenko 7ba91723e7 Small cleanup of literal semantic analysis: hiding 'char *' pointers behind
StringRef makes code cleaner.  Also, make the temporary buffer smaller:
512 characters is unreasonably large for integer literals.

llvm-svn: 164484
2012-09-24 09:53:54 +00:00
Richard Smith 639b8d05dd When a bad UTF-8 encoding or bogus escape sequence is encountered in a
string literal, produce a diagnostic pointing at the erroneous character
range, not at the start of the literal.

llvm-svn: 163459
2012-09-08 07:16:20 +00:00
Nico Weber 4b18c3ff40 Share ConvertUTF8toWide() between Lex and CodeGen.
llvm-svn: 159634
2012-07-03 02:24:52 +00:00
James Dennett 99c193b3c0 Documentation cleanup: add \verbatim markup for grammar productions
llvm-svn: 158740
2012-06-19 21:04:25 +00:00
James Dennett 1cc2203286 Documentation cleanup: added \verbatim...\verbatim markup to fix the
formatting of Doxygen's output for StringLiteralParser::StringLiteralParser.

llvm-svn: 158616
2012-06-17 03:34:42 +00:00
Richard Smith 0948d93b7f Fix off-by-one error in UTF-16 encoding: don't try to use a surrogate pair for U+FFFF.
llvm-svn: 158391
2012-06-13 05:41:29 +00:00
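
The boundary involved in this fix can be shown with a small UTF-16 encoder. This is a sketch, not the Clang code: `encodeUTF16` is a hypothetical function, and it assumes the caller passes a valid code point outside the surrogate range U+D800 to U+DFFF.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes one Unicode code point as UTF-16 code units.
std::vector<std::uint16_t> encodeUTF16(std::uint32_t codePoint) {
  assert(codePoint <= 0x10FFFF && "not a Unicode code point");
  // Everything up to and including U+FFFF is a single code unit.
  // Using `<` here instead of `<=` is the off-by-one for U+FFFF.
  if (codePoint <= 0xFFFF)
    return {static_cast<std::uint16_t>(codePoint)};

  // Supplementary planes: split the 20-bit offset into a surrogate pair.
  const std::uint32_t offset = codePoint - 0x10000;
  return {static_cast<std::uint16_t>(0xD800 + (offset >> 10)),
          static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF))};
}
```

U+FFFF therefore encodes as the single unit 0xFFFF, and U+10000 is the first code point that needs a pair (0xD800, 0xDC00).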
Richard Smith 4060f77462 PR13099: Teach -Wformat about raw string literals, UTF-8 strings and Unicode escape sequences.
llvm-svn: 158390
2012-06-13 05:37:23 +00:00
Argyrios Kyrtzidis 9933e3ac88 In StringLiteralParser::init, make sure we emit an error when
failing to lex the string, as suggested by Eli.

Part of rdar://11305263.

llvm-svn: 156081
2012-05-03 17:50:32 +00:00
Argyrios Kyrtzidis 4e5b5c36f4 In StringLiteralParser::init(), fail gracefully if the string is
not as we expect; it may be due to racing issue of a file coming from PCH
changing after the PCH is loaded.

rdar://11353109

llvm-svn: 156043
2012-05-03 01:01:56 +00:00