Richard Smith
e18f0faff2
Lexing support for user-defined literals. Currently these lex as the same token
...
kinds as the underlying string literals, and we silently drop the ud-suffix;
those issues will be fixed by subsequent patches.
llvm-svn: 152012
2012-03-05 04:02:15 +00:00
Eli Friedman
9436352a82
Implement warning for non-wide string literals with an unexpected encoding. Downgrade error for non-wide character literals with an unexpected encoding to a warning for compatibility with gcc and older versions of clang. <rdar://problem/10837678>.
...
llvm-svn: 150295
2012-02-11 05:08:10 +00:00
Aaron Ballman
e1224a5067
Fixing hex floating literal support so that it handles 0x.2p2 properly.
...
llvm-svn: 150072
2012-02-08 13:36:33 +00:00
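For reference, a hex floating literal with an empty integer part such as 0x.2p2 denotes (2/16) * 2^2 = 0.5. The sketch below checks that arithmetic with the standard library's std::strtod, which accepts hexadecimal floating input; it is not clang's NumericLiteralParser, and the helper name parseWholeFloat is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper (not clang code): parse `text` with std::strtod and
// report success only if the entire string was consumed as a number.
inline bool parseWholeFloat(const char *text, double &value) {
  assert(text != nullptr);
  char *end = nullptr;
  value = std::strtod(text, &end);
  // end == text means nothing was parsed; *end != '\0' means trailing junk.
  return end != text && *end == '\0';
}
```

Calling parseWholeFloat("0x.2p2", v) is expected to succeed and leave 0.5 in v, since the hex digit 2 after the point contributes 2/16 and the exponent p2 scales by 4.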
Aaron Ballman
b97a5addd5
Hex literals without a significand no longer crash the lexer. Fixes bug 7910
...
Patch by Eitan Adler
llvm-svn: 149984
2012-02-07 13:46:03 +00:00
Dylan Noblesmith
2c1dd2716a
Basic: import SmallString<> into clang namespace
...
(I was going to fix the TODO about DenseMap too, but
that would break self-host right now. See PR11922.)
llvm-svn: 149799
2012-02-05 02:13:05 +00:00
Seth Cantrell
9c2d6f0279
stop claiming unicode escape sequences are too long in strings, because they never are
...
llvm-svn: 148391
2012-01-18 12:27:08 +00:00
Seth Cantrell
8b2b677f39
Improves support for Unicode in character literals
...
Updates ProcessUCNEscape() for C++. C++11 allows UCNs in character
and string literals that represent control characters and basic
source characters. Also C++03 allows UCNs that refer to surrogate
codepoints.
UTF-8 sequences in character literals are now handled as single
c-chars.
Added error for multiple characters in Unicode character literals.
Added errors for when the execution charset encoding of a c-char
cannot be represented as a single code unit in the associated
character type. Note that for the purposes of this error the
associated character type for a narrow character literal is char, not
int, even though in C narrow character literals have type int.
llvm-svn: 148389
2012-01-18 12:27:04 +00:00
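The "single code unit" check described in the commit above can be illustrated roughly as follows. This is a minimal sketch, assuming the execution encodings are UTF-8 for char, UTF-16 for 2-byte character types and UTF-32 for 4-byte ones; the function names are hypothetical and this is not the code clang uses. Surrogate code points are not treated specially here.

```cpp
#include <cassert>
#include <cstdint>

// Number of UTF-8 code units needed for a code point; 0 if out of range.
inline unsigned utf8CodeUnits(std::uint32_t cp) {
  if (cp < 0x80) return 1;
  if (cp < 0x800) return 2;
  if (cp < 0x10000) return 3;
  if (cp <= 0x10FFFF) return 4;
  return 0;
}

// Number of UTF-16 code units needed for a code point; 0 if out of range.
inline unsigned utf16CodeUnits(std::uint32_t cp) {
  if (cp < 0x10000) return 1;
  if (cp <= 0x10FFFF) return 2;
  return 0;
}

// Hypothetical check: does `cp` fit in one code unit of a character type
// that is `charWidthBytes` wide (1, 2 or 4)? Encodings are assumed as above.
inline bool fitsInOneCodeUnit(std::uint32_t cp, unsigned charWidthBytes) {
  assert(charWidthBytes == 1 || charWidthBytes == 2 || charWidthBytes == 4);
  switch (charWidthBytes) {
  case 1:  return utf8CodeUnits(cp) == 1;
  case 2:  return utf16CodeUnits(cp) == 1;
  default: return cp <= 0x10FFFF;
  }
}
```

Under these assumptions U+00E9 does not fit in a narrow character literal (two UTF-8 code units) but does fit in a 2-byte character type, while U+1F600 only fits in a 4-byte one.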
Nico Weber
d60b72f696
Fix a regression in wide character codegen. See PR11369.
...
llvm-svn: 144521
2011-11-14 05:17:37 +00:00
Eli Friedman
20554708fb
Fix one last place where we weren't writing into a string literal consistently.
...
llvm-svn: 143769
2011-11-05 00:41:04 +00:00
Eli Friedman
d1370791c2
Use native endianness for writing out character escapes to the result buffer for string literal parsing. No functional change on little-endian architectures; should fix test failures on PPC.
...
llvm-svn: 143585
2011-11-02 23:06:23 +00:00
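One common way to write a multi-byte code unit in the host's byte order is to copy the integer's object representation with std::memcpy, rather than storing bytes in a fixed order. The sketch below illustrates that idea only; writeCodeUnit is a hypothetical helper, not the function touched by this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not clang code): store one code unit of `width` bytes
// (1, 2 or 4) at `out` in native byte order and return the next write position.
inline char *writeCodeUnit(char *out, std::uint32_t value, unsigned width) {
  assert(width == 1 || width == 2 || width == 4);
  if (width == 1) {
    std::uint8_t v = static_cast<std::uint8_t>(value);
    std::memcpy(out, &v, sizeof v);
  } else if (width == 2) {
    std::uint16_t v = static_cast<std::uint16_t>(value);
    std::memcpy(out, &v, sizeof v);
  } else {
    std::memcpy(out, &value, sizeof value);
  }
  return out + width;
}
```

Because the bytes are copied from a native integer, reading the buffer back as the same integer type yields the original value on both little-endian and big-endian hosts.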
Eli Friedman
703e7153af
Perform proper conversion for strings encoded in the source file as UTF-8. (For now, we are assuming the source character set is always UTF-8; this can be easily extended if necessary.)
...
Tests will be coming up in a subsequent commit.
Patch by Seth Cantrell.
llvm-svn: 143416
2011-11-01 02:14:50 +00:00
Douglas Gregor
227c352bae
We do parse hexfloats in C++11; make it actually work.
...
llvm-svn: 141798
2011-10-12 18:51:02 +00:00
Douglas Gregor
4d68366b2f
When parsing a character literal, extract the characters from the
...
buffer as an 'unsigned char', so that integer promotion doesn't
sign-extend character values > 127 into oblivion. Fixes
<rdar://problem/10188919>.
llvm-svn: 140608
2011-09-27 17:00:18 +00:00
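The sign-extension problem described above can be reproduced in isolation. A minimal sketch, with hypothetical helper names that do not come from clang: on platforms where plain char is signed, widening a byte above 127 directly to int yields a negative value, whereas reading it as unsigned char first keeps it in the range 0 to 255.

```cpp
#include <cassert>

// Hypothetical helper: widen one byte of a literal's spelling via plain char.
// Where plain char is signed, bytes above 127 come out negative.
inline int widenAsPlainChar(const char *buf) {
  assert(buf != nullptr);
  return *buf;
}

// Hypothetical helper: widen the same byte via unsigned char.
// The result is always in [0, 255], regardless of the signedness of char.
inline int widenAsUnsignedChar(const char *buf) {
  assert(buf != nullptr);
  return static_cast<unsigned char>(*buf);
}
```

For a buffer holding the byte 0xE9, widenAsUnsignedChar returns 233 everywhere, while widenAsPlainChar returns -23 on targets with a signed plain char.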
David Blaikie
9c902b5502
Rename Diagnostic to DiagnosticsEngine as per issue 5397
...
llvm-svn: 140478
2011-09-25 23:23:43 +00:00
David Blaikie
76bd3c80d4
Fix missing includes for llvm_unreachable
...
llvm-svn: 140368
2011-09-23 05:35:21 +00:00
David Blaikie
83d382b1ca
Switch assert(0/false) to llvm_unreachable.
...
llvm-svn: 140367
2011-09-23 05:06:16 +00:00
Francois Pichet
0706d203cf
Rename LangOptions::Microsoft to LangOptions::MicrosoftExt to make it clear that this flag must be used only for Microsoft extensions and not emulation; to avoid confusion with the new LangOptions::MicrosoftMode flag.
...
Much of the code now under LangOptions::MicrosoftExt will eventually be moved under the LangOptions::MicrosoftMode flag.
llvm-svn: 139987
2011-09-17 17:15:52 +00:00
Douglas Gregor
86325ad2b5
Allow C99 hexfloats in C++0x mode. This change resolves the standards
...
collision between C99 hexfloats and C++0x user-defined literals by
giving C99 hexfloats precedence. Also, warn about user-defined
literals that conflict with hexfloats and those that have names that
are reserved by the implementation. Fixes <rdar://problem/9940194>.
llvm-svn: 138839
2011-08-30 22:40:35 +00:00
Craig Topper
6eb2058a6a
Warn about and truncate UCNs that are too big for their character literal type.
...
llvm-svn: 138031
2011-08-19 03:20:12 +00:00
NAKAMURA Takumi
9f8a02d34e
De-Unicode-ify.
...
llvm-svn: 137430
2011-08-12 05:49:51 +00:00
Craig Topper
5265bb211d
Raw string followup. Pass a couple StringRefs by value.
...
llvm-svn: 137301
2011-08-11 05:10:55 +00:00
Craig Topper
54edccafc5
Add support for C++0x raw string literals.
...
llvm-svn: 137298
2011-08-11 04:06:15 +00:00
Craig Topper
61147ed270
Fix comment (test commit)
...
llvm-svn: 137039
2011-08-08 06:10:39 +00:00
Douglas Gregor
fb65e592e0
Add support for C++0x unicode string and character literals, from Craig Topper!
...
llvm-svn: 136210
2011-07-27 05:40:30 +00:00
Chris Lattner
0e62c1cc0b
remove unneeded llvm:: namespace qualifiers on some core types now that LLVM.h imports
...
them into the clang namespace.
llvm-svn: 135852
2011-07-23 10:55:15 +00:00
Argyrios Kyrtzidis
8b7252a8b3
Fix a nasty bug where inside StringLiteralParser:
...
1. We would assume that the length of the string literal token was at least 2
2. We would allocate a buffer with size length-2
And when the stars aligned (one of which would be an invalid source location due to stale PCH)
The length would be 0 and we would try to allocate a 4GB buffer.
Add checks for this corner case and a bunch of asserts.
(We really really should have had an assert for 1.).
Note that there's no test case since I couldn't get one (it was a major PITA to reproduce),
maybe later.
llvm-svn: 131492
2011-05-17 22:09:56 +00:00
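The failure mode in the commit above is an unsigned underflow: subtracting 2 (for the two quote characters) from a 32-bit length of 0 wraps around to 4294967294, which is where the roughly 4GB allocation comes from. A minimal sketch of the arithmetic and of a guarded variant follows; the function names are hypothetical and this is not the actual StringLiteralParser code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration of the bug: assumes tokenLength >= 2 without
// checking, so a length of 0 wraps around to 4294967294.
inline std::uint32_t uncheckedBodyLength(std::uint32_t tokenLength) {
  return static_cast<std::uint32_t>(tokenLength - 2u);
}

// Hypothetical guarded variant: rejects tokens too short to hold both quotes.
// Returns false and leaves `bodyLength` untouched on failure.
inline bool checkedBodyLength(std::uint32_t tokenLength,
                              std::uint32_t &bodyLength) {
  if (tokenLength < 2)
    return false;
  bodyLength = tokenLength - 2u;
  return true;
}
```

A caller would test the return value of checkedBodyLength before sizing any buffer, and could additionally assert the precondition in debug builds.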
Chris Lattner
57540c5be0
fix a bunch of comment typos found by codespell. Patch by
...
Luis Felipe Strano Moraes!
llvm-svn: 129559
2011-04-15 05:22:18 +00:00
Francois Pichet
12df1dc8f2
Microsoft integer suffix changes:
...
i64 is like LL
i32 is like L
Also set isMicrosoftInteger to true only if the suffix is well formed.
llvm-svn: 123230
2011-01-11 11:57:53 +00:00
Ted Kremenek
8c4c74f4fb
Fix diagnostic for reporting bad escape sequence.
...
Patch by Paul Curtis!
llvm-svn: 120759
2010-12-03 00:09:56 +00:00
Chris Lattner
39720111e0
move getSpelling from Preprocessor to Lexer, which it is more conceptually related to.
...
llvm-svn: 119479
2010-11-17 07:26:20 +00:00
Chris Lattner
6bab435db6
propagate preprocessor out of StringLiteralParser. It is now
...
possible to create one without a preprocessor.
llvm-svn: 119476
2010-11-17 07:21:13 +00:00
Chris Lattner
2be8aa9611
push the preprocessor out of EncodeUCNEscape
...
llvm-svn: 119475
2010-11-17 07:12:42 +00:00
Chris Lattner
2a6ee91619
move AdvanceToTokenCharacter and getLocForEndOfToken from
...
Preprocessor to Lexer where they make more sense.
llvm-svn: 119474
2010-11-17 07:05:50 +00:00
Chris Lattner
b1ab2c2d3d
add a static version of PP::AdvanceToTokenCharacter.
...
llvm-svn: 119472
2010-11-17 06:55:10 +00:00
Chris Lattner
bde1b81eb8
push use of Preprocessor out farther.
...
llvm-svn: 119471
2010-11-17 06:46:14 +00:00
Chris Lattner
3a324d3232
push use of Preprocessor out of getOffsetOfStringByte
...
llvm-svn: 119470
2010-11-17 06:35:43 +00:00
Chris Lattner
30d4c928ac
add a static form of the efficient PP::getSpelling method.
...
llvm-svn: 119469
2010-11-17 06:31:48 +00:00
Chris Lattner
7a02bfdfce
refactor the interface to StringLiteralParser::getOffsetOfStringByte,
...
pushing the dependency on the preprocessor out a bit.
llvm-svn: 119468
2010-11-17 06:26:08 +00:00
Chris Lattner
26f6c227dc
allow I128 suffixes in msextensions mode just like i128 suffixes, patch
...
by Martin Vejnar!
llvm-svn: 116460
2010-10-14 00:24:10 +00:00
Nico Weber
a6bde81bc8
Add support for UCNs for character literals
...
llvm-svn: 116129
2010-10-09 00:27:47 +00:00
Nico Weber
9762e0a234
Add support for 4-byte UCNs like \U12345678. Warn about UCNs in c90 mode.
...
llvm-svn: 115743
2010-10-06 04:57:26 +00:00
Fariborz Jahanian
39de024e66
Prevent warning when built with assert off.
...
llvm-svn: 112680
2010-08-31 23:54:38 +00:00
Fariborz Jahanian
abaae2b692
Some support for unicode string constants
...
in wide strings. radar 8360841.
llvm-svn: 112672
2010-08-31 23:34:27 +00:00
Alexis Hunt
3b7918625c
Revert my user-defined literal commits - r1124{58,60,67} pending
...
some issues being sorted out.
llvm-svn: 112493
2010-08-30 17:47:05 +00:00
Alexis Hunt
79eb5469e0
Implement C++0x user-defined string literals.
...
The extra data stored on user-defined literal Tokens is stored in extra
allocated memory, which is managed by the PreprocessorLexer because there isn't
a better place to put it that makes sure it gets deallocated, but only after
it's used up. My testing has shown no significant slowdown as a result, but
independent testing would be appreciated.
llvm-svn: 112458
2010-08-29 21:26:48 +00:00
Benjamin Kramer
e8394df11b
Random temporary string cleanup.
...
llvm-svn: 110807
2010-08-11 14:47:12 +00:00
Douglas Gregor
b37b46e488
Complain when string literals are too long for the active language
...
standard's minimum requirements.
llvm-svn: 108837
2010-07-20 14:33:20 +00:00
Chris Lattner
c548be9ab3
Remove a dead argument to ProcessUCNEscape.
...
Fix string concatenation so that escapes in string chunks that become
wide because of other chunks in the concatenation are themselves processed
as wide. Before we would warn about and miscompile the attached testcase.
This fixes rdar://8040728 - miscompile + warning: hex escape sequence out of range
llvm-svn: 106012
2010-06-15 18:06:43 +00:00
Fariborz Jahanian
93bef10131
Fix a miscompile of wchar pascal strings.
...
(radar 8020384)
llvm-svn: 104996
2010-05-28 19:40:48 +00:00
Douglas Gregor
9af03022ff
Tell the string literal parser when it's not permitted to emit
...
diagnostics. That would be while we're parsing string literals for the
sole purpose of producing a diagnostic about them. Fixes
<rdar://problem/8026030>.
llvm-svn: 104684
2010-05-26 05:35:51 +00:00