llvm-project

Commit Graph

Author	SHA1	Message	Date
Jordan Rose	c0cba27230	PR15067: Don't assert when a UCN appears in a C90 file. Unfortunately, we can't accept the UCN as an extension because we're required to treat it as two tokens for preprocessing purposes. llvm-svn: 173622	2013-01-27 20:12:04 +00:00
Jordan Rose	aa89cf1a66	Unify diagnostics for \x, \u, and \U without any following hex digits. llvm-svn: 173368	2013-01-24 20:50:13 +00:00
Jordan Rose	78ed86a7e5	Adopt llvm::hexDigitValue. llvm-svn: 172861	2013-01-18 22:33:58 +00:00
Richard Smith	2bf7fdb723	s/CPlusPlus0x/CPlusPlus11/g llvm-svn: 171367	2013-01-02 11:42:31 +00:00
Chandler Carruth	3a02247dc9	Sort all of Clang's files under 'lib', and fix up the broken headers uncovered. This required manually correcting all of the incorrect main-module headers I could find, and running the new llvm/utils/sort_includes.py script over the files. I also manually added quite a few missing headers that were uncovered by shuffling the order or moving headers up to be main-module-headers. llvm-svn: 169237	2012-12-04 09:13:33 +00:00
Benjamin Kramer	7d574e269d	LiteralSupport: Don't overflow the temporary buffer when decoding invalid string parts. Instead just use a dummy buffer, we're not going to use the decoded string anyways. Fixes PR14292. llvm-svn: 167594	2012-11-08 19:22:31 +00:00
Benjamin Kramer	f23a6e6f80	LiteralSupport: Clean up style violations. No functionality change. llvm-svn: 167593	2012-11-08 19:22:26 +00:00
David Blaikie	a0613170b4	Handle string encoding diagnostics when there are too many invalid ranges. llvm-svn: 167059	2012-10-30 23:22:22 +00:00
Seth Cantrell	4cfc817a9a	Improve highlighting of invalid string encodings: limit the highlight to exactly the bad encoding, and highlight every bad encoding in a string. llvm-svn: 166900	2012-10-28 18:24:46 +00:00
Jordan Rose	de584de370	Rename CanFitInto64Bits to alwaysFitsInto64Bits per discussion on IRC. This makes the behavior clearer concerning literals with the maximum number of digits. For a 32-bit example, 4,000,000,000 is a valid uint32_t, but 5,000,000,000 is not, so we'd have to count 10-digit decimal numbers as "unsafe" (meaning we have to check for overflow when parsing them, just as we would for numbers with 11 digits or higher). This is the same, only with 64 bits to play with. No functionality change. llvm-svn: 164639	2012-09-25 22:32:51 +00:00
Dmitri Gribenko	511288b2b5	Optimize NumericLiteralParser::GetIntegerValue(). It does a conservative estimate on the size of numbers that can fit into uint64_t. This bound is improved. llvm-svn: 164624	2012-09-25 19:09:15 +00:00
Dmitri Gribenko	7ba91723e7	Small cleanup of literal semantic analysis: hiding 'char *' pointers behind StringRef makes code cleaner. Also, make the temporary buffer smaller: 512 characters is unreasonably large for integer literals. llvm-svn: 164484	2012-09-24 09:53:54 +00:00
Richard Smith	639b8d05dd	When a bad UTF-8 encoding or bogus escape sequence is encountered in a string literal, produce a diagnostic pointing at the erroneous character range, not at the start of the literal. llvm-svn: 163459	2012-09-08 07:16:20 +00:00
Nico Weber	4b18c3ff40	Share ConvertUTF8toWide() between Lex and CodeGen. llvm-svn: 159634	2012-07-03 02:24:52 +00:00
James Dennett	99c193b3c0	Documentation cleanup: add \verbatim markup for grammar productions llvm-svn: 158740	2012-06-19 21:04:25 +00:00
James Dennett	1cc2203286	Documentation cleanup: added \verbatim...\verbatim markup to fix the formatting of Doxygen's output for StringLiteralParser::StringLiteralParser. llvm-svn: 158616	2012-06-17 03:34:42 +00:00
Richard Smith	0948d93b7f	Fix off-by-one error in UTF-16 encoding: don't try to use a surrogate pair for U+FFFF. llvm-svn: 158391	2012-06-13 05:41:29 +00:00
Richard Smith	4060f77462	PR13099: Teach -Wformat about raw string literals, UTF-8 strings and Unicode escape sequences. llvm-svn: 158390	2012-06-13 05:37:23 +00:00
Argyrios Kyrtzidis	9933e3ac88	In StringLiteralParser::init, make sure we emit an error when failing to lex the string, as suggested by Eli. Part of rdar://11305263. llvm-svn: 156081	2012-05-03 17:50:32 +00:00
Argyrios Kyrtzidis	4e5b5c36f4	In StringLiteralParser::init(), fail gracefully if the string is not as we expect; it may be due to racing issue of a file coming from PCH changing after the PCH is loaded. rdar://11353109 llvm-svn: 156043	2012-05-03 01:01:56 +00:00
David Blaikie	bbafb8a745	Unify naming of LangOptions variable/get function across the Clang stack (Lex to AST). The member variable is always "LangOpts" and the member function is always "getLangOpts". Reviewed by Chris Lattner llvm-svn: 152536	2012-03-11 07:00:24 +00:00
Richard Smith	2a70e65436	Improve diagnostics for UCNs referring to control characters and members of the basic source character set in C++98. Add -Wc++98-compat diagnostics for same in literals in C++11. Extend such support to cover string literals as well as character literals, and mark N2170 as done. This seems too minor to warrant a release note to me. Let me know if you disagree. llvm-svn: 152444	2012-03-09 22:27:51 +00:00
Richard Smith	812924502b	When checking the encoding of an 8-bit string literal, don't just check the first codepoint! Also, don't reject empty raw string literals for spurious "encoding" issues. Also, don't rely on undefined behavior in ConvertUTF.c. llvm-svn: 152344	2012-03-08 21:59:28 +00:00
Richard Smith	39570d0020	Add support for cooked forms of user-defined-integer-literal and user-defined-floating-literal. Support for raw forms of these literals to follow. llvm-svn: 152302	2012-03-08 08:45:32 +00:00
Richard Smith	75b67d6dc5	User-defined literal support for character literals. llvm-svn: 152277	2012-03-08 01:34:56 +00:00
Richard Smith	e18f0faff2	Lexing support for user-defined literals. Currently these lex as the same token kinds as the underlying string literals, and we silently drop the ud-suffix; those issues will be fixed by subsequent patches. llvm-svn: 152012	2012-03-05 04:02:15 +00:00
Eli Friedman	9436352a82	Implement warning for non-wide string literals with an unexpected encoding. Downgrade error for non-wide character literals with an unexpected encoding to a warning for compatibility with gcc and older versions of clang. <rdar://problem/10837678>. llvm-svn: 150295	2012-02-11 05:08:10 +00:00
Aaron Ballman	e1224a5067	Fixing hex floating literal support so that it handles 0x.2p2 properly. llvm-svn: 150072	2012-02-08 13:36:33 +00:00
Aaron Ballman	b97a5addd5	Hex literals without a significand no longer crash the lexer. Fixes bug 7910. Patch by Eitan Adler. llvm-svn: 149984	2012-02-07 13:46:03 +00:00
Dylan Noblesmith	2c1dd2716a	Basic: import SmallString<> into clang namespace (I was going to fix the TODO about DenseMap too, but that would break self-host right now. See PR11922.) llvm-svn: 149799	2012-02-05 02:13:05 +00:00
Seth Cantrell	9c2d6f0279	stop claiming unicode escape sequences are too long in strings, because they never are llvm-svn: 148391	2012-01-18 12:27:08 +00:00
Seth Cantrell	8b2b677f39	Improves support for Unicode in character literals. Updates ProcessUCNEscape() for C++. C++11 allows UCNs in character and string literals that represent control characters and basic source characters. Also, C++03 allows UCNs that refer to surrogate codepoints. UTF-8 sequences in character literals are now handled as single c-chars. Added an error for multiple characters in Unicode character literals. Added errors for when the execution charset encoding of a c-char cannot be represented as a single code unit in the associated character type. Note that for the purposes of this error the associated character type for a narrow character literal is char, not int, even though in C narrow character literals have type int. llvm-svn: 148389	2012-01-18 12:27:04 +00:00
Nico Weber	d60b72f696	Fix a regression in wide character codegen. See PR11369. llvm-svn: 144521	2011-11-14 05:17:37 +00:00
Eli Friedman	20554708fb	Fix one last place where we weren't writing into a string literal consistently. llvm-svn: 143769	2011-11-05 00:41:04 +00:00
Eli Friedman	d1370791c2	Use native endianness for writing out character escapes to the result buffer for string literal parsing. No functional change on little-endian architectures; should fix test failures on PPC. llvm-svn: 143585	2011-11-02 23:06:23 +00:00
Eli Friedman	703e7153af	Perform proper conversion for strings encoded in the source file as UTF-8. (For now, we are assuming the source character set is always UTF-8; this can be easily extended if necessary.) Tests will be coming up in a subsequent commit. Patch by Seth Cantrell. llvm-svn: 143416	2011-11-01 02:14:50 +00:00
Douglas Gregor	227c352bae	We do parse hexfloats in C++11; make it actually work. llvm-svn: 141798	2011-10-12 18:51:02 +00:00
Douglas Gregor	4d68366b2f	When parsing a character literal, extract the characters from the buffer as an 'unsigned char', so that integer promotion doesn't sign-extend character values > 127 into oblivion. Fixes <rdar://problem/10188919>. llvm-svn: 140608	2011-09-27 17:00:18 +00:00
David Blaikie	9c902b5502	Rename Diagnostic to DiagnosticsEngine as per issue 5397 llvm-svn: 140478	2011-09-25 23:23:43 +00:00
David Blaikie	76bd3c80d4	Fix missing includes for llvm_unreachable llvm-svn: 140368	2011-09-23 05:35:21 +00:00
David Blaikie	83d382b1ca	Switch assert(0/false) to llvm_unreachable. llvm-svn: 140367	2011-09-23 05:06:16 +00:00
Francois Pichet	0706d203cf	Rename LangOptions::Microsoft to LangOptions::MicrosoftExt to make it clear that this flag must be used only for Microsoft extensions and not emulation, and to avoid confusion with the new LangOptions::MicrosoftMode flag. Much of the code now under LangOptions::MicrosoftExt will eventually be moved under the LangOptions::MicrosoftMode flag. llvm-svn: 139987	2011-09-17 17:15:52 +00:00
Douglas Gregor	86325ad2b5	Allow C99 hexfloats in C++0x mode. This change resolves the standards collision between C99 hexfloats and C++0x user-defined literals by giving C99 hexfloats precedence. Also, warning about user-defined literals that conflict with hexfloats and those that have names that are reserved by the implementation. Fixes <rdar://problem/9940194>. llvm-svn: 138839	2011-08-30 22:40:35 +00:00
Craig Topper	6eb2058a6a	Warn about and truncate UCNs that are too big for their character literal type. llvm-svn: 138031	2011-08-19 03:20:12 +00:00
NAKAMURA Takumi	9f8a02d34e	De-Unicode-ify. llvm-svn: 137430	2011-08-12 05:49:51 +00:00
Craig Topper	5265bb211d	Raw string followup. Pass a couple StringRefs by value. llvm-svn: 137301	2011-08-11 05:10:55 +00:00
Craig Topper	54edccafc5	Add support for C++0x raw string literals. llvm-svn: 137298	2011-08-11 04:06:15 +00:00
Craig Topper	61147ed270	Fix comment (test commit) llvm-svn: 137039	2011-08-08 06:10:39 +00:00
Douglas Gregor	fb65e592e0	Add support for C++0x unicode string and character literals, from Craig Topper! llvm-svn: 136210	2011-07-27 05:40:30 +00:00
Chris Lattner	0e62c1cc0b	remove unneeded llvm:: namespace qualifiers on some core types now that LLVM.h imports them into the clang namespace. llvm-svn: 135852	2011-07-23 10:55:15 +00:00
Argyrios Kyrtzidis	8b7252a8b3	Fix a nasty bug inside StringLiteralParser, where: 1. we would assume that the length of the string literal token was at least 2; 2. we would allocate a buffer with size length-2. When the stars aligned (one of which would be an invalid source location due to stale PCH), the length would be 0 and we would try to allocate a 4GB buffer. Add checks for this corner case and a bunch of asserts. (We really, really should have had an assert for 1.) Note that there's no test case since I couldn't get one (it was a major PITA to reproduce); maybe later. llvm-svn: 131492	2011-05-17 22:09:56 +00:00
Chris Lattner	57540c5be0	fix a bunch of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129559	2011-04-15 05:22:18 +00:00
Francois Pichet	12df1dc8f2	Microsoft integer suffix changes: i64 is like LL; i32 is like L. Also set isMicrosoftInteger to true only if the suffix is well formed. llvm-svn: 123230	2011-01-11 11:57:53 +00:00
Ted Kremenek	8c4c74f4fb	Fix diagnostic for reporting bad escape sequence. Patch by Paul Curtis! llvm-svn: 120759	2010-12-03 00:09:56 +00:00
Chris Lattner	39720111e0	move getSpelling from Preprocessor to Lexer, which it is more conceptually related to. llvm-svn: 119479	2010-11-17 07:26:20 +00:00
Chris Lattner	6bab435db6	propagate preprocessor out of StringLiteralParser. It is now possible to create one without a preprocessor. llvm-svn: 119476	2010-11-17 07:21:13 +00:00
Chris Lattner	2be8aa9611	push the preprocessor out of EncodeUCNEscape llvm-svn: 119475	2010-11-17 07:12:42 +00:00
Chris Lattner	2a6ee91619	move AdvanceToTokenCharacter and getLocForEndOfToken from Preprocessor to Lexer where they make more sense. llvm-svn: 119474	2010-11-17 07:05:50 +00:00
Chris Lattner	b1ab2c2d3d	add a static version of PP::AdvanceToTokenCharacter. llvm-svn: 119472	2010-11-17 06:55:10 +00:00
Chris Lattner	bde1b81eb8	push use of Preprocessor out farther. llvm-svn: 119471	2010-11-17 06:46:14 +00:00
Chris Lattner	3a324d3232	push use of Preprocessor out of getOffsetOfStringByte llvm-svn: 119470	2010-11-17 06:35:43 +00:00
Chris Lattner	30d4c928ac	add a static form of the efficient PP::getSpelling method. llvm-svn: 119469	2010-11-17 06:31:48 +00:00
Chris Lattner	7a02bfdfce	refactor the interface to StringLiteralParser::getOffsetOfStringByte, pushing the dependency on the preprocessor out a bit. llvm-svn: 119468	2010-11-17 06:26:08 +00:00
Chris Lattner	26f6c227dc	allow I128 suffixes in msextensions mode just like i128 suffixes, patch by Martin Vejnar! llvm-svn: 116460	2010-10-14 00:24:10 +00:00
Nico Weber	a6bde81bc8	Add support for UCNs for character literals llvm-svn: 116129	2010-10-09 00:27:47 +00:00
Nico Weber	9762e0a234	Add support for 4-byte UCNs like \U12345678. Warn about UCNs in c90 mode. llvm-svn: 115743	2010-10-06 04:57:26 +00:00
Fariborz Jahanian	39de024e66	Prevent warning when built with assert off. llvm-svn: 112680	2010-08-31 23:54:38 +00:00
Fariborz Jahanian	abaae2b692	Some support for unicode string constants in wide strings. radar 8360841. llvm-svn: 112672	2010-08-31 23:34:27 +00:00
Alexis Hunt	3b7918625c	Revert my user-defined literal commits - r1124{58,60,67} pending some issues being sorted out. llvm-svn: 112493	2010-08-30 17:47:05 +00:00
Alexis Hunt	79eb5469e0	Implement C++0x user-defined string literals. The extra data stored on user-defined literal Tokens is stored in extra allocated memory, which is managed by the PreprocessorLexer because there isn't a better place to put it that makes sure it gets deallocated, but only after it's used up. My testing has shown no significant slowdown as a result, but independent testing would be appreciated. llvm-svn: 112458	2010-08-29 21:26:48 +00:00
Benjamin Kramer	e8394df11b	Random temporary string cleanup. llvm-svn: 110807	2010-08-11 14:47:12 +00:00
Douglas Gregor	b37b46e488	Complain when string literals are too long for the active language standard's minimum requirements. llvm-svn: 108837	2010-07-20 14:33:20 +00:00
Chris Lattner	c548be9ab3	Remove a dead argument to ProcessUCNEscape. Fix string concatenation to treat escapes in concatenated strings that are wide because of other string chunks to process the escapes as wide themselves. Before we would warn about and miscompile the attached testcase. This fixes rdar://8040728 - miscompile + warning: hex escape sequence out of range llvm-svn: 106012	2010-06-15 18:06:43 +00:00
Fariborz Jahanian	93bef10131	Fix a miscompile of wchar pascal strings. (radar 8020384) llvm-svn: 104996	2010-05-28 19:40:48 +00:00
Douglas Gregor	9af03022ff	Tell the string literal parser when it's not permitted to emit diagnostics. That would be while we're parsing string literals for the sole purpose of producing a diagnostic about them. Fixes <rdar://problem/8026030>. llvm-svn: 104684	2010-05-26 05:35:51 +00:00
Chris Lattner	1cf5bdd03d	emit warn_char_constant_too_large at most once per literal, fixing PR6852 llvm-svn: 101580	2010-04-16 23:44:05 +00:00
Douglas Gregor	7bda4b8310	Introduce optional "Invalid" parameters to routines that invoke the SourceManager's getBuffer() and, therefore, could fail, along with Preprocessor::getSpelling(). Use the Invalid parameters in the literal parsers (string, floating point, integral, character) to make them robust against errors that stem from, e.g., PCH files that are not consistent with the underlying file system. I still need to audit every use caller to all of these routines, to determine which ones need specific handling of error conditions. llvm-svn: 98608	2010-03-16 05:20:39 +00:00
Fariborz Jahanian	8c6c0b6a1f	ui64, etc. are valid VS suffixes. Fixes radar 7562363. llvm-svn: 94224	2010-01-22 21:36:53 +00:00
Alexis Hunt	91b78382b5	Do not parse hexadecimal floating point literals in C++0x mode because they are incompatible with user-defined literals, specifically with the following form: 0x1p+1 The preprocessing-number token extends only as far as the 'p'; the '+' is not included. Previously we could get away with this extension as p was an invalid suffix, but now with user-defined literals, 'p' might well be a valid suffix and we are forced to consider it as such. This patch also adds a warning in non-0x C++ modes telling the user that this extension is incompatible with C++0x that is enabled by default (previously and with other languages, we warn only with a compliance option such as -pedantic). llvm-svn: 93135	2010-01-10 23:37:56 +00:00
John McCall	53b93a091e	Diagnose out-of-bounds floating-point constants. Fixes rdar://problem/6974641 llvm-svn: 92127	2009-12-24 09:08:04 +00:00
John McCall	230a5d527e	Eliminate a completely unnecessary buffer copy when parsing float literals. llvm-svn: 91974	2009-12-23 01:37:10 +00:00
Nuno Lopes	baa1bc44af	Clean up parsing of MS integer suffixes a little. This fixes PR5616. Incidentally, I believe that isMicrosoftInteger can go away; it's not read anywhere. llvm-svn: 90036	2009-11-28 13:37:52 +00:00
Mike Stump	c99c022841	This fixes support for complex literals, reworked to avoid a goto, and to add a flag noting the presence of a Microsoft extension suffix (i8, i16, i32, i64). Patch by John Thompson. llvm-svn: 83591	2009-10-08 22:55:36 +00:00
Mike Stump	11289f4280	Remove tabs, and whitespace cleanups. llvm-svn: 81346	2009-09-09 15:08:12 +00:00
Erick Tryzelaar	b90731117c	Update lexer to work with the new APFloat string parsing. llvm-svn: 79211	2009-08-16 23:36:28 +00:00
Daniel Dunbar	a444cc2fa8	CharLiteralParser::IsMultiChar was sometimes uninitialized. llvm-svn: 77420	2009-07-29 01:46:05 +00:00
Alisdair Meredith	ed28f6e433	Fix the build llvm-svn: 75627	2009-07-14 08:10:06 +00:00
Eli Friedman	28a00aa646	PR4353: Add support for \E as a character escape. llvm-svn: 73153	2009-06-10 01:32:39 +00:00
Eli Friedman	9ffd4a9b96	Move CharIsSigned from TargetInfo to LangOptions. llvm-svn: 72928	2009-06-05 07:05:05 +00:00
Eli Friedman	d8cec57b9d	PR4283: Don't truncate multibyte character constants in the preprocessor. llvm-svn: 72686	2009-06-01 05:25:02 +00:00
Chris Lattner	8577f62622	Implement -Wfour-char-constants, which is an extension, not an extwarn, and apparently not part of -Wall llvm-svn: 70329	2009-04-28 21:51:46 +00:00
Chris Lattner	74c95e20af	implement -Wmultichar llvm-svn: 70315	2009-04-28 18:52:02 +00:00
Eli Friedman	5d72d41189	Get rid of some useless uses of NoExtensions. The philosophy here is that if we're going to print an extension warning anyway, there's no point to changing behavior based on NoExtensions: it will only make error recovery worse. Note that this doesn't cause any behavior change because NoExtensions isn't used by the current front-end. I'm still considering what to do about the remaining use of NoExtensions in IdentifierTable.cpp. llvm-svn: 70273	2009-04-28 00:51:18 +00:00
Sanjiv Gupta	f09cb95236	Use an APInt of target int size to detect overflow while parsing multichars. So 'abc' on i16 platforms will warn but not on i32 platforms. llvm-svn: 69653	2009-04-21 02:21:29 +00:00
Chris Lattner	66037791b1	temporarily revert r69046 llvm-svn: 69054	2009-04-14 18:05:08 +00:00
Sanjiv Gupta	69650b099a	Literal value calculation isn't likely to overflow on targets having int as 32 bits or less. Fixing the assert, as it otherwise triggers for PIC16, which has i16 as int. llvm-svn: 69046	2009-04-14 16:46:37 +00:00
Steve Naroff	c94adda157	ProcessUCNEscape(): Incorporate some feedback from Chris. llvm-svn: 68198	2009-04-01 11:09:15 +00:00
Eli Friedman	1c3fb22cad	Fix pascal string support; testcase from mailing list message. llvm-svn: 68181	2009-04-01 03:17:08 +00:00
Steve Naroff	f2a880ca22	Incorporate feedback from Eli. llvm-svn: 68107	2009-03-31 10:29:45 +00:00
Steve Naroff	7b753d21b5	Implement UCN support for C string literals (C99 6.4.3) and add some very basic tests. Chris Goller has graciously offered to write some tests to help validate UCN support. From a front-end perspective, I believe this code should work for ObjC @-strings. At the moment, I believe we need to tweak the code generation for @-strings (which doesn't appear to handle them). Will be investigating. llvm-svn: 68076	2009-03-30 23:46:03 +00:00
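
The digit-count reasoning in commit de584de370 (a literal with the maximum number of digits may or may not overflow, so only shorter literals are unconditionally safe) can be illustrated with a minimal sketch. The names alwaysFitsInto64BitsSketch and parseDecimal are hypothetical helpers written for this illustration; they are not taken from Clang's sources, and the actual implementation may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not Clang's code): returns true when ANY literal with
// `numDigits` digits in the given radix is guaranteed to fit in uint64_t,
// so the caller may skip overflow checks entirely.
constexpr bool alwaysFitsInto64BitsSketch(unsigned radix, std::size_t numDigits) {
  switch (radix) {
  case 2:  return numDigits <= 64;     // 1 bit per digit
  case 8:  return numDigits <= 64 / 3; // 3 bits per digit: at most 21 digits
  case 10: return numDigits <= 19;     // 10^19 - 1 < 2^64, but some 20-digit values overflow
  case 16: return numDigits <= 64 / 4; // 4 bits per digit: at most 16 digits
  default: return false;               // unknown radix: be conservative
  }
}

// Hypothetical parser: assumes `digits` contains only '0'..'9'.
// Returns false if the value does not fit in uint64_t.
constexpr bool parseDecimal(std::string_view digits, std::uint64_t &value) {
  value = 0;
  const bool safe = alwaysFitsInto64BitsSketch(10, digits.size());
  for (char c : digits) {
    const std::uint64_t d = static_cast<std::uint64_t>(c - '0');
    // value * 10 + d overflows exactly when value > (MAX - d) / 10.
    if (!safe && value > (UINT64_MAX - d) / 10)
      return false;
    value = value * 10 + d;
  }
  return true;
}
```

As in the 32-bit example from the commit message, a 20-digit decimal literal is treated as "unsafe" even though some 20-digit values are valid; the per-digit overflow check then decides each case.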

171 Commits