llvm-project

Author	SHA1	Message	Date
Chris Lattner	39720111e0	move getSpelling from Preprocessor to Lexer, which it is more conceptually related to. llvm-svn: 119479	2010-11-17 07:26:20 +00:00
Chris Lattner	2a6ee91619	move AdvanceToTokenCharacter and getLocForEndOfToken from Preprocessor to Lexer where they make more sense. llvm-svn: 119474	2010-11-17 07:05:50 +00:00
Chandler Carruth	c3ce5840af	Update remaining attribute macros to new style. llvm-svn: 117204	2010-10-23 08:44:57 +00:00
Sebastian Redl	517523014d	In MeasureTokenLength, the FileLoc supplied to the lexer must point to the start of the buffer, or we risk overflow. llvm-svn: 115117	2010-09-30 01:03:03 +00:00
Chris Lattner	0f0492e69c	improve isHexaLiteral to work with escaped newlines and trigraphs, patch by Francois Pichet! llvm-svn: 112602	2010-08-31 16:42:00 +00:00
Chris Lattner	dec7334218	silence a warning llvm-svn: 112549	2010-08-30 23:11:03 +00:00
Alexis Hunt	3b7918625c	Revert my user-defined literal commits - r1124{58,60,67} pending some issues being sorted out. llvm-svn: 112493	2010-08-30 17:47:05 +00:00
Chris Lattner	5f183aa592	add a fixme. llvm-svn: 112491	2010-08-30 17:11:14 +00:00
Chris Lattner	7a9e9e7d76	use 'features' instead of 'PP->getLangOptions'. llvm-svn: 112490	2010-08-30 17:09:08 +00:00
Douglas Gregor	759ef23bb8	In Microsoft compatibility mode, don't parse the exponent as part of the pp-number in a hexadecimal floating point literal, from Francois Pichet! Fixes PR7968. llvm-svn: 112481	2010-08-30 14:50:47 +00:00
Alexis Hunt	79eb5469e0	Implement C++0x user-defined string literals. The extra data stored on user-defined literal Tokens is stored in extra allocated memory, which is managed by the PreprocessorLexer because there isn't a better place to put it that makes sure it gets deallocated, but only after it's used up. My testing has shown no significant slowdown as a result, but independent testing would be appreciated. llvm-svn: 112458	2010-08-29 21:26:48 +00:00
Douglas Gregor	115837041e	Introduce a preprocessor code-completion hook for contexts where we expect "natural" language and should not provide any completions, e.g., comments, string literals, #error. llvm-svn: 112054	2010-08-25 17:04:25 +00:00
Douglas Gregor	3a7ad25eb6	Introduce basic code-completion support for preprocessor directives, e.g., after a "#" we'll suggest #if, #ifdef, etc. llvm-svn: 111943	2010-08-24 19:08:16 +00:00
Douglas Gregor	02690ba643	Don't emit end-of-file diagnostics like "unterminated conditional" or "unterminated string" when we're performing code completion. llvm-svn: 110933	2010-08-12 17:04:55 +00:00
Benjamin Kramer	e8394df11b	Random temporary string cleanup. llvm-svn: 110807	2010-08-11 14:47:12 +00:00
Douglas Gregor	028d3e4d0f	Use precompiled preambles for in-process code completion. llvm-svn: 110596	2010-08-09 20:45:32 +00:00
Douglas Gregor	3f4bea0646	Introduce basic support for loading a precompiled preamble while reparsing an ASTUnit. When saving a preamble, create a buffer larger than the actual file we're working with but fill everything from the end of the preamble to the end of the file with spaces (so the lexer will quickly skip them). When we load the file, create a buffer of the same size, filling it with the file and then spaces. Then, instruct the lexer to start lexing after the preamble, therefore continuing the parse from the spot where the preamble left off. It's now possible to perform a simple preamble build + parse (+ reparse) with ASTUnit. However, one has to disable a bunch of checking in the PCH reader to do so. That part isn't committed; it will likely be handled with some other kind of flag (e.g., -fno-validate-pch). As part of this, fix some issues with null termination of the memory buffers created for the preamble; we were trying to explicitly NULL-terminate them, even though they were also getting implicitly NULL terminated, leading to excess warnings about NULL characters in source files. llvm-svn: 109445	2010-07-26 21:36:20 +00:00
Douglas Gregor	cd8bdd025f	Improve performance during cursor traversal when a region of interest is present. Rather than using clang_getCursorExtent(), which requires us to lex the token at the ending position to determine its length. Then, we'd be comparing [a, b) source ranges that cover the characters in the range rather than the normal behavior for Clang's source ranges, which covers the tokens in the range. However, relexing causes us to read the source file (which may come from a precompiled header), which is rather unfortunate and affects performance. In the new scheme, we only use Clang-style source ranges that cover the tokens in the range. At the entry points where this matters (clang_annotateTokens, clang_getCursor), we make sure to move source locations to the start of the token. Addresses most of <rdar://problem/8049381>. llvm-svn: 109134	2010-07-22 20:22:31 +00:00
Douglas Gregor	af82e3510b	Introduce a new lexer function to compute the "preamble" of a file, which is the part of the file that contains all of the initial comments, includes, and preprocessor directives that occur before any of the actual code. Added a new -print-preamble cc1 action that is only used for testing. llvm-svn: 108913	2010-07-20 20:18:03 +00:00
Chris Lattner	86851b8a7a	fix PR4499, patch by Kyle Dean! llvm-svn: 107836	2010-07-07 23:24:27 +00:00
Chris Lattner	52d96ac930	simpler fix for rdar://8044135 - escaped newlines have already been processed, so they don't have to be tip-toed around. llvm-svn: 105182	2010-05-30 23:27:38 +00:00
Douglas Gregor	fe4a4107d8	Improve our handling of NULL after an escaping '\' in a string literal. Fixes <rdar://problem/8044135>. llvm-svn: 105181	2010-05-30 22:59:50 +00:00
Douglas Gregor	6da3db4af3	Improve code completion in failure cases in two ways: 1) Suppress diagnostics as soon as we form the code-completion token, so we don't get any error/warning spew from the early end-of-file. 2) If we consume a code-completion token when we weren't expecting one, go into a code-completion recovery path that produces the best results it can based on the context that the parser is in. llvm-svn: 104585	2010-05-25 05:58:43 +00:00
Chris Lattner	467f6bcfe5	robustify the conflict marker stuff. Don't add 7 twice, which would make it miss (invalid) things like: <<<<<<< >>>>>>> and crash if <<<<<<< was at the end of the line. When we find a >>>>>>> that is not at the end of the line, make sure to reset Pos so we don't crash on something like: <<<<<<< >>>>>>> This isn't worth making testcases for, since each would require a new file. rdar://7987078 - signal 11 compiling "<<<<<<<<<<" llvm-svn: 103968	2010-05-17 20:27:25 +00:00
Chris Lattner	561aabd943	when code completing inside a C-style block comment, don't emit errors about a missing */ since we truncated the file. This fixes rdar://7948776 llvm-svn: 103913	2010-05-16 19:54:05 +00:00
Chris Lattner	1a9e873bf9	fix a minor bug I noticed while working with Jordy's patch for PR6101, in an input file like this: # 42 int x; we were emitting: # <something> int x; (with a space before the int) because we weren't clearing the leading whitespace flag properly after the \n from the directive was handled. llvm-svn: 101084	2010-04-12 23:04:41 +00:00
Douglas Gregor	a771f46c82	Reinstate my CodeModificationHint -> FixItHint renaming patch, without the C-only "optimization". llvm-svn: 100022	2010-03-31 17:46:05 +00:00
Douglas Gregor	30e631862f	Revert r100008, which inexplicably breaks the clang-i686-darwin10 builder llvm-svn: 100018	2010-03-31 17:25:35 +00:00
Douglas Gregor	3baad0d4f7	Rename CodeModificationHint to FixItHint, since we've been using the term "fix-it" everywhere and even I get tired of long names sometimes. No functionality change. llvm-svn: 100008	2010-03-31 15:31:50 +00:00
Douglas Gregor	1668355e06	Remove unused variable llvm-svn: 98691	2010-03-16 22:54:32 +00:00
Douglas Gregor	dc970f0866	Audit all Preprocessor::getSpelling() callers, improving failure recovery for those that need it. llvm-svn: 98689	2010-03-16 22:30:13 +00:00
Douglas Gregor	42fe858cd6	Audit all callers of SourceManager::getCharacterData(); update some of them to recover more gracefully on failure. llvm-svn: 98672	2010-03-16 20:46:42 +00:00
Benjamin Kramer	eb92dc0b09	Let SourceManager::getBufferData return StringRef instead of a pair of two const char*. llvm-svn: 98630	2010-03-16 14:14:31 +00:00
Douglas Gregor	e0fbb83b8b	Give SourceManager a Diagnostic object with which to report errors, and start simplifying the interfaces in SourceManager that can fail. llvm-svn: 98594	2010-03-16 00:06:06 +00:00
Douglas Gregor	802b77601e	Introduce a new BufferResult class to act as the return type of SourceManager's getBuffer() (and similar) operations. This abstraction can be used to force callers to cope with errors in getBuffer(), such as missing files and changed files. Fix a bunch of callers to use the new interface. Add some very basic checks for file consistency (file size, modification time) into ContentCache::getBuffer(), although these checks don't help much until we've updated the main callers (e.g., SourceManager::getSpelling()). llvm-svn: 98585	2010-03-15 22:54:52 +00:00
Chris Lattner	93ddf80eb7	don't inform comment handlers about comments in #if 0 blocks, doing so invalidates the file guard optimization and is not in the spirit of "#if 0" because it is supposed to completely skip everything, even if it isn't lexically valid. Patch by Abramo Bagnara! llvm-svn: 95253	2010-02-03 21:06:21 +00:00
Douglas Gregor	562c1f9365	Teach CIndex's cursor visitor to restrict its traversal to a specific region of interest (if provided). Implement clang_getCursor() in terms of this traversal rather than using the Index library; the unified cursor visitor is more complete, and will be The Way Forward. Minor other tweaks needed to make this work: - Extend Preprocessor::getLocForEndOfToken() to accept an offset from the end, making it easy to move to the last character in the token (rather than just past the end of the token). - In Lexer::MeasureTokenLength(), the length of whitespace is zero. llvm-svn: 94200	2010-01-22 19:49:59 +00:00
Chris Lattner	87d0208c41	allow the HandlerComment callback to push tokens into the preprocessor. This could be used by an OpenMP implementation or something. Patch by Abramo Bagnara! llvm-svn: 93795	2010-01-18 22:35:47 +00:00
Chris Lattner	21d9b9a948	add a TODO for a perf improvement in LexIdentifier. llvm-svn: 93141	2010-01-11 02:38:50 +00:00
Alexis Hunt	91b78382b5	Do not parse hexadecimal floating point literals in C++0x mode because they are incompatible with user-defined literals, specifically with the following form: 0x1p+1 The preprocessing-number token extends only as far as the 'p'; the '+' is not included. Previously we could get away with this extension as p was an invalid suffix, but now with user-defined literals, 'p' might well be a valid suffix and we are forced to consider it as such. This patch also adds a warning in non-0x C++ modes telling the user that this extension is incompatible with C++0x that is enabled by default (previously and with other languages, we warn only with a compliance option such as -pedantic). llvm-svn: 93135	2010-01-10 23:37:56 +00:00
Chris Lattner	3dfff974ec	reimplement r90860, fixing a couple of problems: 1. Don't make a copy of LangOptions every time a lexer is created. 2. Don't make CharInfo global mutable state. 3. Fix the implementation to properly treat ^Z as EOF instead of as horizontal whitespace, which matches the semantic implemented by VC++. llvm-svn: 91586	2009-12-17 05:29:40 +00:00
Chris Lattner	7c027ee4c2	teach clang to recover gracefully from conflict markers left in source files: PR5238. llvm-svn: 91270	2009-12-14 06:16:57 +00:00
Steve Naroff	04bc01833e	Integrate the following from the 'objective-rewrite' branch: http://llvm.org/viewvc/llvm-project?view=rev&revision=80043 llvm-svn: 90860	2009-12-08 16:38:12 +00:00
Douglas Gregor	53ad6b94b0	Extend the source manager with the ability to override the contents of files with the contents of an arbitrary memory buffer. Use this new functionality to drastically clean up the way in which we handle file truncation for code-completion: all of the truncation/completion logic is now encapsulated in the preprocessor where it belongs (<rdar://problem/7434737>). llvm-svn: 90300	2009-12-02 06:49:09 +00:00
Chris Lattner	710bb87147	Fix PR5633 by making the preprocessor handle the case where we can stat a file but where mmaping it fails. In this case, we emit an error like: t.c:1:10: fatal error: error opening file '../../foo.h' instead of "cannot find file". llvm-svn: 90110	2009-11-30 04:18:44 +00:00
Benjamin Kramer	5e738284d7	Move DISABLE_INLINE to the front of the decl so MSVC can parse it. Patch by Amine Khaldi! llvm-svn: 88797	2009-11-14 16:36:57 +00:00
Chris Lattner	a3d4f16b12	Teach Lexer::MeasureTokenLength to be able to measure the length of comment tokens. Patch by Abramo Bagnara! llvm-svn: 84100	2009-10-14 15:04:18 +00:00
Douglas Gregor	ea9b03e6e2	Replace the -code-completion-dump option with -code-completion-at=filename:line:column which performs code completion at the specified location by truncating the file at that position and enabling code completion. This approach makes it possible to run multiple tests from a single test file, and gives a more natural command-line interface. llvm-svn: 82571	2009-09-22 21:11:38 +00:00
Douglas Gregor	3545ff43f4	Refactor and simplify the CodeCompleteConsumer, so that all of the real work is performed within Sema. Addresses Chris's comments, but still retains the heavyweight list-of-multimaps data structure. llvm-svn: 82459	2009-09-21 16:56:56 +00:00
Douglas Gregor	2436e7116b	Initial implementation of a code-completion interface in Clang. In essence, code completion is triggered by a magic "code completion" token produced by the lexer [*], which the parser recognizes at certain points in the grammar. The parser then calls into the Action object with the appropriate CodeCompletionXXX action. Sema implements the CodeCompletionXXX callbacks by performing minimal translation, then forwarding them to a CodeCompletionConsumer subclass, which uses the results of semantic analysis to provide code-completion results. At present, only a single, "printing" code completion consumer is available, for regression testing and debugging. However, the design is meant to permit other code-completion consumers. This initial commit contains two code-completion actions: one for member access, e.g., "x." or "p->", and one for nested-name-specifiers, e.g., "std::". More code-completion actions will follow, along with improved gathering of code-completion results for the various contexts. [*] In the current -code-completion-dump testing/debugging mode, the file is truncated at the completion point and EOF is translated into "code completion". llvm-svn: 82166	2009-09-17 21:32:03 +00:00
Mike Stump	11289f4280	Remove tabs, and whitespace cleanups. llvm-svn: 81346	2009-09-09 15:08:12 +00:00
Chris Lattner	de50a0c251	Convert the CharInfo table to be statically initialized, instead of dynamically initialized. Patch by Ryan Flynn! llvm-svn: 74919	2009-07-07 17:09:54 +00:00
Chris Lattner	5c34938aa4	fix an out-of-date comment. llvm-svn: 74894	2009-07-07 05:05:42 +00:00
Douglas Gregor	c6d5edd2ed	Add support for retrieving the Doxygen comment associated with a given declaration in the AST. The new ASTContext::getCommentForDecl function searches for a comment that is attached to the given declaration, and returns that comment, which may be composed of several comment blocks. Comments are always available in an AST. However, to avoid harming performance, we don't actually parse the comments. Rather, we keep the source ranges of all of the comments within a large, sorted vector, then lazily extract comments via a binary search in that vector only when needed (which never occurs in a "normal" compile). Comments are written to a precompiled header/AST file as a blob of source ranges. That blob is only lazily loaded when one requests a comment for a declaration (this never occurs in a "normal" compile). The indexer testbed now supports comment extraction. When the -point-at location points to a declaration with a Doxygen-style comment, the indexer testbed prints the associated comment block(s). See test/Index/comments.c for an example. Some notes: - We don't actually attempt to parse the comment blocks themselves, beyond identifying them as Doxygen comment blocks to associate them with a declaration. - We won't find comment blocks that aren't adjacent to the declaration, because we start our search based on the location of the declaration. - We don't go through the necessary hops to find, for example, whether some redeclaration of a declaration has comments when our current declaration does not. Similarly, we don't attempt to associate a \param Foo marker in a function body comment with the parameter named Foo (although that is certainly possible). - Verification of my "no performance impact" claims is still "to be done". llvm-svn: 74704	2009-07-02 17:08:52 +00:00
Chris Lattner	c183595534	Fix our check for "random whitespace between a \ and newline" to work with dos style newlines. I have a trivial test for this: // RUN: clang-cc %s -verify #define test(x, y) \ x ## y but I don't know how to get svn to not change newlines and testrunner doesn't work with dos style newlines either, so "not worth it". :) rdar://6994000 llvm-svn: 73945	2009-06-23 05:15:06 +00:00
Chris Lattner	ff96dd0301	Fix rdar://6880630 - # in _Pragma does not start a preprocessor directive. llvm-svn: 71643	2009-05-13 06:10:29 +00:00
Eli Friedman	5d72d41189	Get rid of some useless uses of NoExtensions. The philosophy here is that if we're going to print an extension warning anyway, there's no point to changing behavior based on NoExtensions: it will only make error recovery worse. Note that this doesn't cause any behavior change because NoExtensions isn't used by the current front-end. I'm still considering what to do about the remaining use of NoExtensions in IdentifierTable.cpp. llvm-svn: 70273	2009-04-28 00:51:18 +00:00
Chris Lattner	40493eb6eb	fix rdar://6816766 - Crash with function-like macro test at end of directive. llvm-svn: 69964	2009-04-24 07:15:46 +00:00
Chris Lattner	38b2cde4c4	add a new Lexer::SkipEscapedNewLines method. llvm-svn: 69483	2009-04-18 22:27:02 +00:00
Chris Lattner	fbce7aa1f4	factor escape newline measuring out into its own helper function. llvm-svn: 69482	2009-04-18 22:05:41 +00:00
Chris Lattner	dfbfc44df7	remove unneeded scopes. llvm-svn: 69481	2009-04-18 21:57:20 +00:00
Chris Lattner	b40289b2b8	Fix two problems from PR3916, and one problem I noticed while hacking on the code. llvm-svn: 69404	2009-04-17 23:56:52 +00:00
Chris Lattner	184e65d363	Change Lexer::MeasureTokenLength to take a LangOptions reference. This allows it to accurately measure tokens, so that we get: t.cpp:8:13: error: unknown type name 'X' static foo::X P; ~~~~~^ instead of the woefully inferior: t.cpp:8:13: error: unknown type name 'X' static foo::X P; ~~~~ ^ Most of this is just plumbing to push the reference around. llvm-svn: 69099	2009-04-14 23:22:57 +00:00
Chris Lattner	ecdaf40c9e	fix rdar://6757323, where an escaped newline in a // comment was causing the char after the newline to get eaten. llvm-svn: 68430	2009-04-05 00:26:41 +00:00
Mike Stump	0be8875ea4	A code modification hint for files that don't end in a newline. Eventually, would be nice to be able to run these modifications even when we don't want the warning or errors for the actual diagnostic. llvm-svn: 68272	2009-04-02 02:29:42 +00:00
Chris Lattner	d14705b9b4	silence some errors that should not apply to .S files on code like: '' ' ' llvm-svn: 67237	2009-03-18 21:10:12 +00:00
Chris Lattner	2534324a4e	properly form a full token for # before calling HandleDirective. llvm-svn: 67235	2009-03-18 20:58:27 +00:00
Chris Lattner	fa217bda40	simplify some logic by making ScratchBuffer handle the application of trailing \0's to created tokens instead of making all clients do it. No functionality change. llvm-svn: 66373	2009-03-08 08:08:45 +00:00
Chris Lattner	91668def8b	fix PR3609, emit: t.c:1:10: error: missing terminating '>' character #include <stdio.h ^ instead of: t.c:1:10: error: missing terminating " character #include <stdio.h ^ llvm-svn: 65052	2009-02-19 18:29:56 +00:00
Chris Lattner	9dc9c206d3	track "just a little more" location information for macro instantiations. Now instead of just tracking the expansion history, also track the full range of the macro that got replaced. For object-like macros, this doesn't change anything. For _Pragma and function-like macros, this means we track the locations of the ')'. This is required for PR3579 because apparently GCC uses the line of the ')' of a function-like macro as the location to expand __LINE__ to. llvm-svn: 64601	2009-02-15 20:52:18 +00:00
Chris Lattner	60f36223a9	move library-specific diagnostic headers into library private dirs. Reduce redundant #includes. Patch by Anders Johnsen! llvm-svn: 63271	2009-01-29 05:15:15 +00:00
Chris Lattner	7368d581c1	Split the single monolithic DiagnosticKinds.def file into one .def file for each library. This means that adding a diagnostic to sema doesn't require all the other libraries to be rebuilt. Patch by Anders Johnsen! llvm-svn: 63111	2009-01-27 18:30:58 +00:00
Chris Lattner	d381721810	Fix a bug I introduced in my changes, which caused MeasureTokenLength to crash when given an instantiation location. Thanks to Fariborz for the testcase. llvm-svn: 63057	2009-01-26 22:24:27 +00:00
Chris Lattner	7e20927756	allow _Pragmas formed from #defines to keep their full instantiation history llvm-svn: 63035	2009-01-26 20:15:46 +00:00
Chris Lattner	5a7971e0c3	This change refactors some of the low-level lexer interfaces a bit. Token now has a class of kinds for "literals", which include numeric constants, strings, etc. These tokens can optionally have a pointer to the start of the token in the lexer buffer. This makes it faster to get spelling and do other gymnastics, because we don't have to go through source locations. This change is performance neutral, but will make other changes more feasible down the road. llvm-svn: 63028	2009-01-26 19:29:26 +00:00
Chris Lattner	4fa23625ab	Check in the long promised SourceLocation rewrite. This lays the ground work for implementing #line, and fixes the "out of macro ID's" problem. There is nothing particularly tricky about the code, other than the very performance sensitive SourceManager::getFileID() method. llvm-svn: 62978	2009-01-26 00:43:02 +00:00
Chris Lattner	1f6c7fe6a8	This is a follow-up to r62675: Refactor how the preprocessor changes a token from being a tok::identifier to a keyword (e.g. tok::kw_for). Instead of doing this in HandleIdentifier, hoist this common case out into the caller, so that every keyword doesn't have to go through HandleIdentifier. This drops time in HandleIdentifier from 1.25ms to .62ms, and speeds up clang -Eonly with PTH by about 1%. llvm-svn: 62855	2009-01-23 18:35:48 +00:00
Chris Lattner	8256b970a3	a trivial micro optimization to save a load. llvm-svn: 62676	2009-01-21 07:45:14 +00:00
Chris Lattner	ad89ec013f	Add a bit to IdentifierInfo that acts as a simple predicate which tells us whether Preprocessor::HandleIdentifier needs to be called. Because this method is only rarely needed, this saves a call and a bunch of random checks. This drops the time in HandleIdentifier from 3.52ms to .98ms on cocoa.h on my machine. llvm-svn: 62675	2009-01-21 07:43:11 +00:00
Chris Lattner	cbc35ecb04	Rename SourceManager::getCanonicalFileID -> getFileID. There is no longer such thing as a non-canonical FileID. llvm-svn: 62499	2009-01-19 07:46:45 +00:00
Chris Lattner	29a2a191f2	Make SourceLocation::getFileLoc private to reduce the API exposure of SourceLocation. This requires making some cleanups to token pasting and _Pragma expansion. llvm-svn: 62490	2009-01-19 06:46:35 +00:00
Chris Lattner	71dc14b9f0	Rename SourceLocation::getFileID to getChunkID, because it returns the chunk ID not the file ID. This exposes problems in TextDiagnosticPrinter where it should have been using the canonical file ID but wasn't. Fix these along the way. llvm-svn: 62427	2009-01-17 08:45:21 +00:00
Chris Lattner	5509d533f6	simplify some lookups. llvm-svn: 62426	2009-01-17 08:30:10 +00:00
Chris Lattner	757169b60f	Change the Lexer ctor used to lex _Pragma directives into a static factory method. This lets us clean up the interface and make it more obvious that this method is really really _Pragma specific. Note that _Pragma handling uglifies the Lexer in the critical path. It would be very interesting to consider making _Pragma remapping be a new special lexer class of its own. llvm-svn: 62425	2009-01-17 08:27:52 +00:00
Chris Lattner	c809089b26	Change the Lexer ctor used in the non _Pragma case to take a FileID instead of a SourceLocation. This should speed it up and definitely simplifies it. llvm-svn: 62422	2009-01-17 08:03:42 +00:00
Chris Lattner	5965a28a4b	More simplifications to the lexer ctors. llvm-svn: 62419	2009-01-17 07:56:59 +00:00
Chris Lattner	fcf6452eb4	make the verbose raw-lexer ctor fully explicit instead of having embedded magic. llvm-svn: 62417	2009-01-17 07:42:27 +00:00
Chris Lattner	08354fef13	add a simplified lexer ctor that sets up the lexer to raw-lex an entire file. llvm-svn: 62414	2009-01-17 07:35:14 +00:00
Chris Lattner	f76b92092e	refactor some common initialization code out of the two lexer ctors. llvm-svn: 62411	2009-01-17 06:55:17 +00:00
Chris Lattner	d32480d3db	this massive patch introduces a simple new abstraction: it makes "FileID" a concept that is now enforced by the compiler's type checker instead of yet-another-random-unsigned floating around. This is an important distinction from the "FileID" currently tracked by SourceLocation. That FileID may refer to the start of a file or to a chunk within it. The new FileID only refers to the file (and its #include stack and eventually #line data), it cannot refer to a chunk. FileID is a completely opaque datatype to all clients, only SourceManager is allowed to poke and prod it. llvm-svn: 62407	2009-01-17 06:22:33 +00:00
Chris Lattner	1abd20901b	Instead of iterating over FileID's, have PTH generation iterate over the content cache directly. Content cache has a 1-1 mapping with fileentries, whereas multiple FileIDs can be the same FileEntry. llvm-svn: 62401	2009-01-17 03:48:08 +00:00
Chris Lattner	5882771102	Fix PR2477 - clang misparses "//*" in C89 mode llvm-svn: 62368	2009-01-16 22:39:25 +00:00
Chris Lattner	8a42586c54	more SourceLocation lexicon change: instead of referring to the "logical" location, refer to the "instantiation" location. llvm-svn: 62316	2009-01-16 07:36:28 +00:00
Chris Lattner	53e384f633	Change some terminology in SourceLocation: instead of referring to the "physical" location of tokens, refer to the "spelling" location. This is more concrete and useful, tokens aren't really physical objects! llvm-svn: 62309	2009-01-16 07:00:02 +00:00
Chris Lattner	e141a9e225	rdar://6060752 - don't warn about trigraphs in bcpl-style comments llvm-svn: 60942	2008-12-12 07:34:39 +00:00
Chris Lattner	89770575cd	fix thought-o llvm-svn: 60937	2008-12-12 07:14:34 +00:00
Douglas Gregor	90abb6dead	Objective-C keywords are not always identifiers. Some are also C++ keywords llvm-svn: 60373	2008-12-01 21:46:47 +00:00
Daniel Dunbar	5c4cc09498	Comment fix. llvm-svn: 59997	2008-11-25 00:20:22 +00:00
Chris Lattner	f3cb394f41	Fix a weird inconsistency with hex floats. Previously the lexer would not eat the "-1" in "0x0p-1", but LiteralSupport would accept it when extensions are on. This caused strangeness and failures when hexfloats were properly treated as an extension (not error) in LiteralSupport. llvm-svn: 59865	2008-11-22 07:39:03 +00:00
Chris Lattner	014156e108	actually, this version isn't really needed. llvm-svn: 59859	2008-11-22 06:22:39 +00:00
Chris Lattner	57dab26be1	remove a sneaky version of Diag hiding in PreprocessorLexer. llvm-svn: 59858	2008-11-22 06:20:42 +00:00
Chris Lattner	6d27a16b95	Change the Lexer::Diag method to not magically silence warnings, force the caller to check instead. This eliminates the need (and the risk!) of weird null DiagnosticBuilder's floating around. llvm-svn: 59856	2008-11-22 02:02:22 +00:00
Chris Lattner	427c9c1763	Split the DiagnosticInfo class into two disjoint classes: one for building up the diagnostic that is in flight (DiagnosticBuilder) and one for pulling structured information out of the diagnostic when formatting and presenting it. There is no functionality change with this patch. llvm-svn: 59849	2008-11-22 00:59:29 +00:00
Ted Kremenek	45245217bc	- Move static function IsNonPragmaNonMacroLexer into Preprocessor.h. - Add variants of IsNonPragmaNonMacroLexer to accept an IncludeMacroStack entry (simplifies some uses). - Use IsNonPragmaNonMacroLexer in Preprocessor::LookupFile. - Add 'FileID' to PreprocessorLexer, and have Preprocessor query this fileid when looking up the FileEntry for a file Performance testing of -Eonly on Cocoa.h shows no performance regression because of this patch. llvm-svn: 59666	2008-11-19 21:57:25 +00:00
Chris Lattner	907dfe94e1	Convert the lexer and start converting the PP over to using canonical Diag methods. llvm-svn: 59511	2008-11-18 07:59:24 +00:00
Ted Kremenek	66312a3ff4	Move some diagnostic handling to PreprocessorLexer. llvm-svn: 59191	2008-11-12 23:13:54 +00:00
Ted Kremenek	2f4f2dea82	Remove Lexer::LexIncludeFilename. llvm-svn: 59186	2008-11-12 22:44:15 +00:00
Chris Lattner	b11c3233d8	Change FormTokenWithChars to take the token kind to form, since all clients were setting a kind and then forming it. This is just a minor API cleanup, no functionality change. llvm-svn: 57404	2008-10-12 04:51:35 +00:00
Chris Lattner	99e7d23455	When in keep whitespace mode, make sure to return block comments that are unterminated. llvm-svn: 57403	2008-10-12 04:19:49 +00:00
Chris Lattner	e01e758e11	Change SkipBlockComment and SkipBCPLComment to return true when in keep comment mode, instead of returning false. This matches SkipWhitespace. llvm-svn: 57402	2008-10-12 04:15:42 +00:00
Chris Lattner	4d96344c19	Add a new mode to the lexer which enables it to return all characters, even whitespace, as tokens from the file. This is enabled with L->SetKeepWhitespaceMode(true) on a raw lexer. In this mode, you too can use clang as a really complex version of 'cat' with code like this: Lexer RawLex(SourceLocation::getFileLoc(SM.getMainFileID(), 0), PP.getLangOptions(), File.first, File.second); RawLex.SetKeepWhitespaceMode(true); Token RawTok; RawLex.LexFromRawLexer(RawTok); while (RawTok.isNot(tok::eof)) { std::cout << PP.getSpelling(RawTok); RawLex.LexFromRawLexer(RawTok); } This will emit exactly the input file, with no canonicalization or other translation. Realistic clients actually do something with the tokens of course :) llvm-svn: 57401	2008-10-12 04:05:48 +00:00
Chris Lattner	097a8b8777	Fix a couple more places that poke KeepCommentMode unnecessarily. llvm-svn: 57398	2008-10-12 03:27:19 +00:00
Chris Lattner	8637abd333	add a new inKeepCommentMode() accessor to abstract the KeepCommentMode ivar. llvm-svn: 57397	2008-10-12 03:22:02 +00:00
Chris Lattner	e3f863a388	fix misleading comment. llvm-svn: 57396	2008-10-12 01:34:51 +00:00
Chris Lattner	7c2e9809b1	Simplify raw mode lexing by treating an unterminated /**/ comment the same way we do an unterminated string or character literal. This makes it so we can guarantee that the lexer never calls into the preprocessor (which would be suicide for a raw lexer). llvm-svn: 57395	2008-10-12 01:31:51 +00:00
Chris Lattner	6b0c5ad096	add a comment. llvm-svn: 57394	2008-10-12 01:23:27 +00:00
Chris Lattner	50c9050037	Change how raw lexers are handled: instead of creating them and then using LexRawToken, create one and use LexFromRawLexer. This avoids twiddling the RawLexer flag around and simplifies some code (even speeding raw lexing up a tiny bit). This change also improves the token paster to use a Lexer on the stack instead of new/deleting it. llvm-svn: 57393	2008-10-12 01:15:46 +00:00
Chris Lattner	79ef843533	silence some release-assert warnings. llvm-svn: 57391	2008-10-12 00:28:42 +00:00
Chris Lattner	87e97ea7b8	improve a comment. llvm-svn: 57389	2008-10-12 00:23:07 +00:00
Daniel Dunbar	12c9ddced1	Change Parser & Sema to use interned "super" for comparisons. - Added as private members for each because it is not clear where to put the common definition. Perhaps the IdentifierInfos for all of these "pseudo-keywords" should be collected into one place (this would include KnownFunctionIDs and Objective-C property IDs, for example). Remove Token::isNamedIdentifier. - There isn't a good reason to use strcmp when we have interned strings, and there isn't a good reason to encourage clients to do so. llvm-svn: 54794	2008-08-14 22:04:54 +00:00
Nate Begeman	5eee93328e	Fix typo llvm-svn: 49632	2008-04-14 02:26:39 +00:00
Chris Lattner	8f96d04ceb	don't diagnose empty source files, thanks Neil! llvm-svn: 49575	2008-04-12 05:54:25 +00:00
Chris Lattner	9b7206eb4f	don't read off the front of the buffer. Thanks to Sam for pointing this out. llvm-svn: 49535	2008-04-11 16:20:41 +00:00
Chris Lattner	7a51313d8a	Make a major restructuring of the clang tree: introduce a top-level lib dir and move all the libraries into it. This follows the main llvm tree, and allows the libraries to be built in parallel. The top level now enforces that all the libs are built before Driver, but we don't care what order the libs are built in. This speeds up parallel builds, particularly incremental ones. llvm-svn: 48402	2008-03-15 23:59:48 +00:00

324 Commits