Dmitri Gribenko
7ba91723e7
Small cleanup of literal semantic analysis: hiding 'char *' pointers behind
StringRef makes code cleaner. Also, make the temporary buffer smaller:
512 characters is unreasonably large for integer literals.
llvm-svn: 164484
2012-09-24 09:53:54 +00:00
Richard Smith
639b8d05dd
When a bad UTF-8 encoding or bogus escape sequence is encountered in a
string literal, produce a diagnostic pointing at the erroneous character
range, not at the start of the literal.
llvm-svn: 163459
2012-09-08 07:16:20 +00:00
Nico Weber
4b18c3ff40
Share ConvertUTF8toWide() between Lex and CodeGen.
llvm-svn: 159634
2012-07-03 02:24:52 +00:00
James Dennett
99c193b3c0
Documentation cleanup: add \verbatim markup for grammar productions
llvm-svn: 158740
2012-06-19 21:04:25 +00:00
James Dennett
1cc2203286
Documentation cleanup: added \verbatim...\verbatim markup to fix the
formatting of Doxygen's output for StringLiteralParser::StringLiteralParser.
llvm-svn: 158616
2012-06-17 03:34:42 +00:00
Richard Smith
0948d93b7f
Fix off-by-one error in UTF-16 encoding: don't try to use a surrogate pair for U+FFFF.
llvm-svn: 158391
2012-06-13 05:41:29 +00:00
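The boundary that this off-by-one fix concerns can be sketched as follows. This is a minimal illustration, not the code from the commit: `encodeUTF16` is a hypothetical helper, and it assumes its input is already a valid Unicode scalar value. The point is that every code point up to and including U+FFFF fits in one UTF-16 code unit, and surrogate pairs start only at U+10000.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not Clang's implementation): encode one Unicode
// scalar value as UTF-16 code units. Assumes cp is a valid scalar value.
std::vector<std::uint16_t> encodeUTF16(std::uint32_t cp) {
  // The comparison must be inclusive: U+FFFF is still a single code unit.
  // Writing `cp < 0xFFFF` here would be the off-by-one.
  if (cp <= 0xFFFF) {
    return {static_cast<std::uint16_t>(cp)};
  }
  // Supplementary planes: split the 20-bit offset into two surrogates.
  cp -= 0x10000;
  return {static_cast<std::uint16_t>(0xD800 + (cp >> 10)),
          static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))};
}
```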
Richard Smith
4060f77462
PR13099: Teach -Wformat about raw string literals, UTF-8 strings and Unicode escape sequences.
llvm-svn: 158390
2012-06-13 05:37:23 +00:00
Argyrios Kyrtzidis
9933e3ac88
In StringLiteralParser::init, make sure we emit an error when
failing to lex the string, as suggested by Eli.
Part of rdar://11305263.
llvm-svn: 156081
2012-05-03 17:50:32 +00:00
Argyrios Kyrtzidis
4e5b5c36f4
In StringLiteralParser::init(), fail gracefully if the string is
not as we expect; this may be due to a race in which a file coming from PCH
changes after the PCH is loaded.
rdar://11353109
llvm-svn: 156043
2012-05-03 01:01:56 +00:00
David Blaikie
bbafb8a745
Unify naming of LangOptions variable/get function across the Clang stack (Lex to AST).
The member variable is always "LangOpts" and the member function is always "getLangOpts".
Reviewed by Chris Lattner
llvm-svn: 152536
2012-03-11 07:00:24 +00:00
Richard Smith
2a70e65436
Improve diagnostics for UCNs referring to control characters and members of the
basic source character set in C++98. Add -Wc++98-compat diagnostics for same in
literals in C++11. Extend such support to cover string literals as well as
character literals, and mark N2170 as done.
This seems too minor to warrant a release note to me. Let me know if you disagree.
llvm-svn: 152444
2012-03-09 22:27:51 +00:00
Richard Smith
812924502b
When checking the encoding of an 8-bit string literal, don't just check the
first codepoint! Also, don't reject empty raw string literals for spurious
"encoding" issues. Also, don't rely on undefined behavior in ConvertUTF.c.
llvm-svn: 152344
2012-03-08 21:59:28 +00:00
Richard Smith
39570d0020
Add support for cooked forms of user-defined-integer-literal and
user-defined-floating-literal. Support for raw forms of these literals
to follow.
llvm-svn: 152302
2012-03-08 08:45:32 +00:00
Richard Smith
75b67d6dc5
User-defined literal support for character literals.
llvm-svn: 152277
2012-03-08 01:34:56 +00:00
Richard Smith
e18f0faff2
Lexing support for user-defined literals. Currently these lex as the same token
kinds as the underlying string literals, and we silently drop the ud-suffix;
those issues will be fixed by subsequent patches.
llvm-svn: 152012
2012-03-05 04:02:15 +00:00
Eli Friedman
9436352a82
Implement warning for non-wide string literals with an unexpected encoding. Downgrade error for non-wide character literals with an unexpected encoding to a warning for compatibility with gcc and older versions of clang. <rdar://problem/10837678>.
llvm-svn: 150295
2012-02-11 05:08:10 +00:00
Aaron Ballman
e1224a5067
Fixing hex floating literal support so that it handles 0x.2p2 properly.
llvm-svn: 150072
2012-02-08 13:36:33 +00:00
Aaron Ballman
b97a5addd5
Hex literals without a significand no longer crash the lexer. Fixes bug 7910
Patch by Eitan Adler
llvm-svn: 149984
2012-02-07 13:46:03 +00:00
Dylan Noblesmith
2c1dd2716a
Basic: import SmallString<> into clang namespace
(I was going to fix the TODO about DenseMap too, but
that would break self-host right now. See PR11922.)
llvm-svn: 149799
2012-02-05 02:13:05 +00:00
Seth Cantrell
9c2d6f0279
stop claiming unicode escape sequences are too long in strings, because they never are
llvm-svn: 148391
2012-01-18 12:27:08 +00:00
Seth Cantrell
8b2b677f39
Improves support for Unicode in character literals
Updates ProcessUCNEscape() for C++. C++11 allows UCNs in character
and string literals that represent control characters and basic
source characters. Also C++03 allows UCNs that refer to surrogate
codepoints.
UTF-8 sequences in character literals are now handled as single
c-chars.
Added error for multiple characters in Unicode character literals.
Added errors for when the execution charset encoding of a c-char
cannot be represented as a single code unit in the associated
character type. Note that for the purposes of this error the
associated character type for a narrow character literal is char, not
int, even though in C narrow character literals have type int.
llvm-svn: 148389
2012-01-18 12:27:04 +00:00
Nico Weber
d60b72f696
Fix a regression in wide character codegen. See PR11369.
llvm-svn: 144521
2011-11-14 05:17:37 +00:00
Eli Friedman
20554708fb
Fix one last place where we weren't writing into a string literal consistently.
llvm-svn: 143769
2011-11-05 00:41:04 +00:00
Eli Friedman
d1370791c2
Use native endianness for writing out character escapes to the result buffer for string literal parsing. No functional change on little-endian architectures; should fix test failures on PPC.
llvm-svn: 143585
2011-11-02 23:06:23 +00:00
Eli Friedman
703e7153af
Perform proper conversion for strings encoded in the source file as UTF-8. (For now, we are assuming the source character set is always UTF-8; this can be easily extended if necessary.)
Tests will be coming up in a subsequent commit.
Patch by Seth Cantrell.
llvm-svn: 143416
2011-11-01 02:14:50 +00:00
Douglas Gregor
227c352bae
We do parse hexfloats in C++11; make it actually work.
llvm-svn: 141798
2011-10-12 18:51:02 +00:00
Douglas Gregor
4d68366b2f
When parsing a character literal, extract the characters from the
buffer as an 'unsigned char', so that integer promotion doesn't
sign-extend character values > 127 into oblivion. Fixes
<rdar://problem/10188919>.
llvm-svn: 140608
2011-09-27 17:00:18 +00:00
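The sign-extension hazard described in this entry can be illustrated with a small sketch. The function names below are hypothetical and this is not the code from the commit; it only shows why a byte should be read as `unsigned char` before it is widened. Whether plain `char` is signed is implementation-defined, so the second function misbehaves only on platforms where it is.

```cpp
#include <cstdint>

// Hypothetical illustration (not Clang's code).
// Safe: go through unsigned char first, so bytes above 127 keep their value.
std::uint32_t widenAsUnsigned(const char *p) {
  return static_cast<unsigned char>(*p);
}

// Hazardous: where plain char is signed, a byte such as 0xE9 is promoted
// to a negative int and then wraps to a huge unsigned value.
std::uint32_t widenViaPlainChar(const char *p) {
  return static_cast<std::uint32_t>(static_cast<int>(*p));
}
```

On a platform with signed `char`, `widenViaPlainChar` maps the byte 0xE9 to 0xFFFFFFE9, while `widenAsUnsigned` yields 0xE9 everywhere.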
David Blaikie
9c902b5502
Rename Diagnostic to DiagnosticsEngine as per issue 5397
llvm-svn: 140478
2011-09-25 23:23:43 +00:00
David Blaikie
76bd3c80d4
Fix missing includes for llvm_unreachable
llvm-svn: 140368
2011-09-23 05:35:21 +00:00
David Blaikie
83d382b1ca
Switch assert(0/false) to llvm_unreachable.
llvm-svn: 140367
2011-09-23 05:06:16 +00:00
Francois Pichet
0706d203cf
Rename LangOptions::Microsoft to LangOptions::MicrosoftExt to make it clear that this flag must be used only for Microsoft extensions and not emulation; to avoid confusion with the new LangOptions::MicrosoftMode flag.
Much of the code now under LangOptions::MicrosoftExt will eventually be moved under the LangOptions::MicrosoftMode flag.
llvm-svn: 139987
2011-09-17 17:15:52 +00:00
Douglas Gregor
86325ad2b5
Allow C99 hexfloats in C++0x mode. This change resolves the standards
collision between C99 hexfloats and C++0x user-defined literals by
giving C99 hexfloats precedence. Also, warn about user-defined
literals that conflict with hexfloats and those that have names that
are reserved by the implementation. Fixes <rdar://problem/9940194>.
llvm-svn: 138839
2011-08-30 22:40:35 +00:00
Craig Topper
6eb2058a6a
Warn about and truncate UCNs that are too big for their character literal type.
llvm-svn: 138031
2011-08-19 03:20:12 +00:00
NAKAMURA Takumi
9f8a02d34e
De-Unicode-ify.
llvm-svn: 137430
2011-08-12 05:49:51 +00:00
Craig Topper
5265bb211d
Raw string followup. Pass a couple StringRefs by value.
llvm-svn: 137301
2011-08-11 05:10:55 +00:00
Craig Topper
54edccafc5
Add support for C++0x raw string literals.
llvm-svn: 137298
2011-08-11 04:06:15 +00:00
Craig Topper
61147ed270
Fix comment (test commit)
llvm-svn: 137039
2011-08-08 06:10:39 +00:00
Douglas Gregor
fb65e592e0
Add support for C++0x unicode string and character literals, from Craig Topper!
llvm-svn: 136210
2011-07-27 05:40:30 +00:00
Chris Lattner
0e62c1cc0b
remove unneeded llvm:: namespace qualifiers on some core types now that LLVM.h imports
them into the clang namespace.
llvm-svn: 135852
2011-07-23 10:55:15 +00:00
Argyrios Kyrtzidis
8b7252a8b3
Fix a nasty bug where inside StringLiteralParser:
1. We would assume that the length of the string literal token was at least 2
2. We would allocate a buffer with size length-2
When the stars aligned (one of which would be an invalid source location due to stale PCH),
the length would be 0 and we would try to allocate a 4GB buffer.
Add checks for this corner case and a bunch of asserts.
(We really, really should have had an assert for 1.)
Note that there's no test case since I couldn't get one (it was a major PITA to reproduce);
maybe later.
llvm-svn: 131492
2011-05-17 22:09:56 +00:00
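The failure mode in this entry is an unsigned underflow: with a 32-bit unsigned length of 0, `length - 2` wraps to 0xFFFFFFFE, which is roughly 4GB. A minimal sketch of the kind of guard the message describes might look like this. `literalBodyLength` is a hypothetical name, not a function in Clang, and the sketch assumes C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical helper (not Clang's code): number of bytes between the two
// quote characters of a string literal token, or nothing if the token is
// too short to contain both quotes.
std::optional<std::size_t> literalBodyLength(std::size_t tokenLength) {
  // Check before subtracting: an unsigned 0 - 2 wraps to a huge value.
  if (tokenLength < 2) {
    return std::nullopt;
  }
  return tokenLength - 2;
}
```

A caller would treat the empty result as a malformed token and report an error instead of sizing a buffer from it.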
Chris Lattner
57540c5be0
fix a bunch of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!
llvm-svn: 129559
2011-04-15 05:22:18 +00:00
Francois Pichet
12df1dc8f2
Microsoft integer suffix changes:
i64 is like LL
i32 is like L
Also set isMicrosoftInteger to true only if the suffix is well formed.
llvm-svn: 123230
2011-01-11 11:57:53 +00:00
Ted Kremenek
8c4c74f4fb
Fix diagnostic for reporting bad escape sequence.
Patch by Paul Curtis!
llvm-svn: 120759
2010-12-03 00:09:56 +00:00
Chris Lattner
39720111e0
move getSpelling from Preprocessor to Lexer, which it is more conceptually related to.
llvm-svn: 119479
2010-11-17 07:26:20 +00:00
Chris Lattner
6bab435db6
propagate preprocessor out of StringLiteralParser. It is now
possible to create one without a preprocessor.
llvm-svn: 119476
2010-11-17 07:21:13 +00:00
Chris Lattner
2be8aa9611
push the preprocessor out of EncodeUCNEscape
llvm-svn: 119475
2010-11-17 07:12:42 +00:00
Chris Lattner
2a6ee91619
move AdvanceToTokenCharacter and getLocForEndOfToken from
Preprocessor to Lexer where they make more sense.
llvm-svn: 119474
2010-11-17 07:05:50 +00:00
Chris Lattner
b1ab2c2d3d
add a static version of PP::AdvanceToTokenCharacter.
llvm-svn: 119472
2010-11-17 06:55:10 +00:00
Chris Lattner
bde1b81eb8
push use of Preprocessor out farther.
llvm-svn: 119471
2010-11-17 06:46:14 +00:00
Chris Lattner
3a324d3232
push use of Preprocessor out of getOffsetOfStringByte
llvm-svn: 119470
2010-11-17 06:35:43 +00:00