Richard Smith
812924502b
When checking the encoding of an 8-bit string literal, don't just check the
first codepoint! Also, don't reject empty raw string literals for spurious
"encoding" issues. Also, don't rely on undefined behavior in ConvertUTF.c.
llvm-svn: 152344
2012-03-08 21:59:28 +00:00
Richard Smith
39570d0020
Add support for cooked forms of user-defined-integer-literal and
user-defined-floating-literal. Support for raw forms of these literals
to follow.
llvm-svn: 152302
2012-03-08 08:45:32 +00:00
Richard Smith
75b67d6dc5
User-defined literal support for character literals.
llvm-svn: 152277
2012-03-08 01:34:56 +00:00
Richard Smith
e18f0faff2
Lexing support for user-defined literals. Currently these lex as the same token
kinds as the underlying string literals, and we silently drop the ud-suffix;
those issues will be fixed by subsequent patches.
llvm-svn: 152012
2012-03-05 04:02:15 +00:00
Eli Friedman
9436352a82
Implement warning for non-wide string literals with an unexpected encoding. Downgrade error for non-wide character literals with an unexpected encoding to a warning for compatibility with gcc and older versions of clang. <rdar://problem/10837678>.
llvm-svn: 150295
2012-02-11 05:08:10 +00:00
Aaron Ballman
e1224a5067
Fixing hex floating literal support so that it handles 0x.2p2 properly.
llvm-svn: 150072
2012-02-08 13:36:33 +00:00
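As a quick illustration of the literal named in this commit (a sketch, not clang's lexer code): `0x.2p2` has no digits before the radix point, a hexadecimal fraction of 2/16, and a binary exponent of 2, so it denotes 0.125 * 4 = 0.5. The helper name `parseHexFloat` is hypothetical; it only wraps the standard `std::strtod`, which accepts hexadecimal floating-point input since C++11.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper (not part of clang): parse the spelling of a hex
// floating literal using the standard library rather than clang's own
// numeric literal parser.
double parseHexFloat(const char *spelling) {
  return std::strtod(spelling, nullptr);
}
```

Calling `parseHexFloat("0x.2p2")` yields 0.5, the same value as `parseHexFloat("0x1p-1")`.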
Aaron Ballman
b97a5addd5
Hex literals without a significand no longer crash the lexer. Fixes bug 7910
Patch by Eitan Adler
llvm-svn: 149984
2012-02-07 13:46:03 +00:00
Dylan Noblesmith
2c1dd2716a
Basic: import SmallString<> into clang namespace
(I was going to fix the TODO about DenseMap too, but
that would break self-host right now. See PR11922.)
llvm-svn: 149799
2012-02-05 02:13:05 +00:00
Seth Cantrell
9c2d6f0279
stop claiming unicode escape sequences are too long in strings, because they never are
llvm-svn: 148391
2012-01-18 12:27:08 +00:00
Seth Cantrell
8b2b677f39
Improves support for Unicode in character literals
Updates ProcessUCNEscape() for C++. C++11 allows UCNs in character
and string literals that represent control characters and basic
source characters. Also C++03 allows UCNs that refer to surrogate
codepoints.
UTF-8 sequences in character literals are now handled as single
c-chars.
Added error for multiple characters in Unicode character literals.
Added errors for when the execution charset encoding of a c-char
cannot be represented as a single code unit in the associated
character type. Note that for the purposes of this error the associated
character type for a narrow character literal is char, not
int, even though in C narrow character literals have type int.
llvm-svn: 148389
2012-01-18 12:27:04 +00:00
Nico Weber
d60b72f696
Fix a regression in wide character codegen. See PR11369.
llvm-svn: 144521
2011-11-14 05:17:37 +00:00
Eli Friedman
20554708fb
Fix one last place where we weren't writing into a string literal consistently.
llvm-svn: 143769
2011-11-05 00:41:04 +00:00
Eli Friedman
d1370791c2
Use native endianness for writing out character escapes to the result buffer for string literal parsing. No functional change on little-endian architectures; should fix test failures on PPC.
llvm-svn: 143585
2011-11-02 23:06:23 +00:00
Eli Friedman
703e7153af
Perform proper conversion for strings encoded in the source file as UTF-8. (For now, we are assuming the source character set is always UTF-8; this can be easily extended if necessary.)
Tests will be coming up in a subsequent commit.
Patch by Seth Cantrell.
llvm-svn: 143416
2011-11-01 02:14:50 +00:00
Douglas Gregor
227c352bae
We do parse hexfloats in C++11; make it actually work.
llvm-svn: 141798
2011-10-12 18:51:02 +00:00
Douglas Gregor
4d68366b2f
When parsing a character literal, extract the characters from the
buffer as an 'unsigned char', so that integer promotion doesn't
sign-extend character values > 127 into oblivion. Fixes
<rdar://problem/10188919>.
llvm-svn: 140608
2011-09-27 17:00:18 +00:00
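The sign-extension problem this commit describes can be sketched as follows. This is an illustration under assumptions, not the code from the patch: both function names are hypothetical, and the behavior of the naive version depends on whether plain `char` is signed on the target.

```cpp
#include <cassert>

// Hypothetical helper: read one byte of a literal's spelling as a value in
// [0, 255]. Converting through unsigned char first means the later widening
// to a larger integer type cannot sign-extend.
unsigned readByteUnsigned(const char *buf) {
  return static_cast<unsigned char>(*buf);
}

// The problematic pattern: on targets where plain char is signed, a byte
// above 127 is promoted to a negative int.
int readByteNaive(const char *buf) {
  return *buf;
}
```

For the byte `0xE9`, `readByteUnsigned` always returns 233, while `readByteNaive` returns either 233 or -23 depending on the signedness of `char`.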
David Blaikie
9c902b5502
Rename Diagnostic to DiagnosticsEngine as per issue 5397
llvm-svn: 140478
2011-09-25 23:23:43 +00:00
David Blaikie
76bd3c80d4
Fix missing includes for llvm_unreachable
llvm-svn: 140368
2011-09-23 05:35:21 +00:00
David Blaikie
83d382b1ca
Switch assert(0/false) to llvm_unreachable.
llvm-svn: 140367
2011-09-23 05:06:16 +00:00
Francois Pichet
0706d203cf
Rename LangOptions::Microsoft to LangOptions::MicrosoftExt to make it clear that this flag must be used only for Microsoft extensions and not emulation; to avoid confusion with the new LangOptions::MicrosoftMode flag.
Much of the code now under LangOptions::MicrosoftExt will eventually be moved under the LangOptions::MicrosoftMode flag.
llvm-svn: 139987
2011-09-17 17:15:52 +00:00
Douglas Gregor
86325ad2b5
Allow C99 hexfloats in C++0x mode. This change resolves the standards
collision between C99 hexfloats and C++0x user-defined literals by
giving C99 hexfloats precedence. Also, warn about user-defined
literals that conflict with hexfloats and those that have names that
are reserved by the implementation. Fixes <rdar://problem/9940194>.
llvm-svn: 138839
2011-08-30 22:40:35 +00:00
Craig Topper
6eb2058a6a
Warn about and truncate UCNs that are too big for their character literal type.
llvm-svn: 138031
2011-08-19 03:20:12 +00:00
NAKAMURA Takumi
9f8a02d34e
De-Unicode-ify.
llvm-svn: 137430
2011-08-12 05:49:51 +00:00
Craig Topper
5265bb211d
Raw string followup. Pass a couple StringRefs by value.
llvm-svn: 137301
2011-08-11 05:10:55 +00:00
Craig Topper
54edccafc5
Add support for C++0x raw string literals.
llvm-svn: 137298
2011-08-11 04:06:15 +00:00
Craig Topper
61147ed270
Fix comment (test commit)
llvm-svn: 137039
2011-08-08 06:10:39 +00:00
Douglas Gregor
fb65e592e0
Add support for C++0x unicode string and character literals, from Craig Topper!
llvm-svn: 136210
2011-07-27 05:40:30 +00:00
Chris Lattner
0e62c1cc0b
remove unneeded llvm:: namespace qualifiers on some core types now that LLVM.h imports
them into the clang namespace.
llvm-svn: 135852
2011-07-23 10:55:15 +00:00
Argyrios Kyrtzidis
8b7252a8b3
Fix a nasty bug where inside StringLiteralParser:
1. We would assume that the length of the string literal token was at least 2
2. We would allocate a buffer with size length-2
And when the stars aligned (one of which would be an invalid source location due to stale PCH)
The length would be 0 and we would try to allocate a 4GB buffer.
Add checks for this corner case and a bunch of asserts.
(We really really should have had an assert for 1.).
Note that there's no test case since I couldn't get one (it was major PITA to reproduce),
maybe later.
llvm-svn: 131492
2011-05-17 22:09:56 +00:00
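The failure mode described in this commit, subtracting 2 from an unsigned length of 0, can be sketched as below. This is a minimal illustration and not the StringLiteralParser code itself; the function names and the choice of a 32-bit length type are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Unchecked version: assumes the token spelling includes both quotes.
// With tokenLength == 0 the unsigned subtraction wraps modulo 2^32 and
// yields a size just under 4GB.
std::uint32_t bodyLengthUnchecked(std::uint32_t tokenLength) {
  return tokenLength - 2;
}

// Hypothetical checked version: treat a token too short to hold two quotes
// as having an empty body instead of letting the subtraction wrap.
std::uint32_t bodyLength(std::uint32_t tokenLength) {
  if (tokenLength < 2)
    return 0;
  return tokenLength - 2;
}
```

A caller sizing a buffer from `bodyLengthUnchecked(0)` would request 0xFFFFFFFE bytes, which matches the 4GB allocation mentioned in the message; `bodyLength(0)` returns 0.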
Chris Lattner
57540c5be0
fix a bunch of comment typos found by codespell. Patch by
Luis Felipe Strano Moraes!
llvm-svn: 129559
2011-04-15 05:22:18 +00:00
Francois Pichet
12df1dc8f2
Microsoft integer suffix changes:
i64 is like LL
i32 is like L
Also set isMicrosoftInteger to true only if the suffix is well formed.
llvm-svn: 123230
2011-01-11 11:57:53 +00:00
Ted Kremenek
8c4c74f4fb
Fix diagnostic for reporting bad escape sequence.
Patch by Paul Curtis!
llvm-svn: 120759
2010-12-03 00:09:56 +00:00
Chris Lattner
39720111e0
move getSpelling from Preprocessor to Lexer, which it is more conceptually related to.
llvm-svn: 119479
2010-11-17 07:26:20 +00:00
Chris Lattner
6bab435db6
propagate preprocessor out of StringLiteralParser. It is now
possible to create one without a preprocessor.
llvm-svn: 119476
2010-11-17 07:21:13 +00:00
Chris Lattner
2be8aa9611
push the preprocessor out of EncodeUCNEscape
llvm-svn: 119475
2010-11-17 07:12:42 +00:00
Chris Lattner
2a6ee91619
move AdvanceToTokenCharacter and getLocForEndOfToken from
Preprocessor to Lexer where they make more sense.
llvm-svn: 119474
2010-11-17 07:05:50 +00:00
Chris Lattner
b1ab2c2d3d
add a static version of PP::AdvanceToTokenCharacter.
llvm-svn: 119472
2010-11-17 06:55:10 +00:00
Chris Lattner
bde1b81eb8
push use of Preprocessor out farther.
llvm-svn: 119471
2010-11-17 06:46:14 +00:00
Chris Lattner
3a324d3232
push use of Preprocessor out of getOffsetOfStringByte
llvm-svn: 119470
2010-11-17 06:35:43 +00:00
Chris Lattner
30d4c928ac
add a static form of the efficient PP::getSpelling method.
llvm-svn: 119469
2010-11-17 06:31:48 +00:00
Chris Lattner
7a02bfdfce
refactor the interface to StringLiteralParser::getOffsetOfStringByte,
pushing the dependency on the preprocessor out a bit.
llvm-svn: 119468
2010-11-17 06:26:08 +00:00
Chris Lattner
26f6c227dc
allow I128 suffixes in msextensions mode just like i128 suffixes, patch
by Martin Vejnar!
llvm-svn: 116460
2010-10-14 00:24:10 +00:00
Nico Weber
a6bde81bc8
Add support for UCNs for character literals
llvm-svn: 116129
2010-10-09 00:27:47 +00:00
Nico Weber
9762e0a234
Add support for 4-byte UCNs like \U12345678. Warn about UCNs in c90 mode.
llvm-svn: 115743
2010-10-06 04:57:26 +00:00
Fariborz Jahanian
39de024e66
Prevent warning when built with assert off.
llvm-svn: 112680
2010-08-31 23:54:38 +00:00
Fariborz Jahanian
abaae2b692
Some support for unicode string constants
in wide strings. radar 8360841.
llvm-svn: 112672
2010-08-31 23:34:27 +00:00
Alexis Hunt
3b7918625c
Revert my user-defined literal commits - r1124{58,60,67} pending
some issues being sorted out.
llvm-svn: 112493
2010-08-30 17:47:05 +00:00
Alexis Hunt
79eb5469e0
Implement C++0x user-defined string literals.
The extra data stored on user-defined literal Tokens is stored in extra
allocated memory, which is managed by the PreprocessorLexer because there isn't
a better place to put it that makes sure it gets deallocated, but only after
it's used up. My testing has shown no significant slowdown as a result, but
independent testing would be appreciated.
llvm-svn: 112458
2010-08-29 21:26:48 +00:00
Benjamin Kramer
e8394df11b
Random temporary string cleanup.
llvm-svn: 110807
2010-08-11 14:47:12 +00:00
Douglas Gregor
b37b46e488
Complain when string literals are too long for the active language
standard's minimum requirements.
llvm-svn: 108837
2010-07-20 14:33:20 +00:00