Commit Graph

262 Commits

Author SHA1 Message Date
Chris Lattner 184e65d363 Change Lexer::MeasureTokenLength to take a LangOptions reference.
This allows it to accurately measure tokens, so that we get:

t.cpp:8:13: error: unknown type name 'X'
static foo::X  P;
       ~~~~~^

instead of the woefully inferior:

t.cpp:8:13: error: unknown type name 'X'
static foo::X  P;
       ~~~~ ^

Most of this is just plumbing to push the reference around.

llvm-svn: 69099
2009-04-14 23:22:57 +00:00
Chris Lattner ecdaf40c9e fix rdar://6757323, where an escaped newline in a // comment
was causing the char after the newline to get eaten.

llvm-svn: 68430
2009-04-05 00:26:41 +00:00
Mike Stump 0be8875ea4 A code modification hint for files that don't end in a newline.
Eventually, would be nice to be able to run these modifications even
when we don't want the warning or errors for the actual diagnostic.

llvm-svn: 68272
2009-04-02 02:29:42 +00:00
Chris Lattner d14705b9b4 silence some errors that should not apply to .S files on code like:
''
   '
 '

llvm-svn: 67237
2009-03-18 21:10:12 +00:00
Chris Lattner 2534324a4e properly form a full token for # before calling HandleDirective.
llvm-svn: 67235
2009-03-18 20:58:27 +00:00
Chris Lattner fa217bda40 simplify some logic by making ScratchBuffer handle the application of trailing
\0's to created tokens instead of making all clients do it.  No functionality
change.

llvm-svn: 66373
2009-03-08 08:08:45 +00:00
Chris Lattner 91668def8b fix PR3609, emit:
t.c:1:10: error: missing terminating '>' character
#include <stdio.h
         ^

instead of:

t.c:1:10: error: missing terminating " character
#include <stdio.h
         ^

llvm-svn: 65052
2009-02-19 18:29:56 +00:00
Chris Lattner 9dc9c206d3 track "just a little more" location information for macro instantiations.
Now instead of just tracking the expansion history, also track the full
range of the macro that got replaced.  For object-like macros, this doesn't
change anything.  For _Pragma and function-like macros, this means we track
the locations of the ')'.

This is required for PR3579 because apparently GCC uses the line of the ')'
of a function-like macro as the location to expand __LINE__ to.

llvm-svn: 64601
2009-02-15 20:52:18 +00:00
Chris Lattner 60f36223a9 move library-specific diagnostic headers into library private dirs. Reduce
redundant #includes.  Patch by Anders Johnsen!

llvm-svn: 63271
2009-01-29 05:15:15 +00:00
Chris Lattner 7368d581c1 Split the single monolithic DiagnosticKinds.def file into one
.def file for each library.  This means that adding a diagnostic
to sema doesn't require all the other libraries to be rebuilt.

Patch by Anders Johnsen!

llvm-svn: 63111
2009-01-27 18:30:58 +00:00
Chris Lattner d381721810 Fix a bug I introduced in my changes, which caused MeasureTokenLength
to crash when given an instantiation location.  Thanks to Fariborz for
the testcase.

llvm-svn: 63057
2009-01-26 22:24:27 +00:00
Chris Lattner 7e20927756 allow _Pragmas formed from #defines to keep their full instantiation
history

llvm-svn: 63035
2009-01-26 20:15:46 +00:00
Chris Lattner 5a7971e0c3 This change refactors some of the low-level lexer interfaces a bit.
Token now has a class of kinds for "literals", which include 
numeric constants, strings, etc.  These tokens can optionally have
a pointer to the start of the token in the lexer buffer.  This 
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.

This change is performance neutral, but will make other changes
more feasible down the road.

llvm-svn: 63028
2009-01-26 19:29:26 +00:00
Chris Lattner 4fa23625ab Check in the long promised SourceLocation rewrite. This lays the
ground work for implementing #line, and fixes the "out of macro ID's" 
problem.

There is nothing particularly tricky about the code, other than the
very performance sensitive SourceManager::getFileID() method.

llvm-svn: 62978
2009-01-26 00:43:02 +00:00
Chris Lattner 1f6c7fe6a8 This is a follow-up to r62675:
Refactor how the preprocessor changes a token from being an tok::identifier to a 
keyword (e.g. tok::kw_for).  Instead of doing this in HandleIdentifier, hoist this
common case out into the caller, so that every keyword doesn't have to go through
HandleIdentifier.  This drops time in HandleIdentifier from 1.25ms to .62ms, and
speeds up clang -Eonly with PTH by about 1%.

llvm-svn: 62855
2009-01-23 18:35:48 +00:00
Chris Lattner 8256b970a3 a trivial micro optimization to save a load.
llvm-svn: 62676
2009-01-21 07:45:14 +00:00
Chris Lattner ad89ec013f Add a bit to IdentifierInfo that acts as a simple predicate which
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks.  This drops the time in HandleIdentifier 
from 3.52ms to .98ms on cocoa.h on my machine.

llvm-svn: 62675
2009-01-21 07:43:11 +00:00
Chris Lattner cbc35ecb04 Rename SourceManager::getCanonicalFileID -> getFileID. There is
no longer such thing as a non-canonical FileID.

llvm-svn: 62499
2009-01-19 07:46:45 +00:00
Chris Lattner 29a2a191f2 Make SourceLocation::getFileLoc private to reduce the API exposure of
SourceLocation.  This requires making some cleanups to token pasting
and _Pragma expansion.

llvm-svn: 62490
2009-01-19 06:46:35 +00:00
Chris Lattner 71dc14b9f0 Rename SourceLocation::getFileID to getChunkID, because it returns
the chunk ID not the file ID.  This exposes problems in 
TextDiagnosticPrinter where it should have been using the canonical
file ID but wasn't.  Fix these along the way.

llvm-svn: 62427
2009-01-17 08:45:21 +00:00
Chris Lattner 5509d533f6 simplify some lookups.
llvm-svn: 62426
2009-01-17 08:30:10 +00:00
Chris Lattner 757169b60f Change the Lexer ctor used to lex _Pragma directives into a static factory
method.  This lets us clean up the interface and make it more obvious that
this method is *really really* _Pragma specific.

Note that _Pragma handling uglifies the Lexer in the critical path.  It would
be very interesting to consider making _Pragma remapping be a new special
lexer class of its own.

llvm-svn: 62425
2009-01-17 08:27:52 +00:00
Chris Lattner c809089b26 Change the Lexer ctor used in the non _Pragma case to take a FileID instead
of a SourceLocation.  This should speed it up and definitely simplifies it.

llvm-svn: 62422
2009-01-17 08:03:42 +00:00
Chris Lattner 5965a28a4b More simplifications to the lexer ctors.
llvm-svn: 62419
2009-01-17 07:56:59 +00:00
Chris Lattner fcf6452eb4 make the verbose raw-lexer ctor fully explicit instead of having
embedded magic.

llvm-svn: 62417
2009-01-17 07:42:27 +00:00
Chris Lattner 08354fef13 add a simplified lexer ctor that sets up the lexer to raw-lex an
entire file.

llvm-svn: 62414
2009-01-17 07:35:14 +00:00
Chris Lattner f76b92092e refactor some common initialization code out of the two lexer ctors.
llvm-svn: 62411
2009-01-17 06:55:17 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
Chris Lattner 1abd20901b Instead of iterating over FileID's, have PTH generation iterate over the
content cache directly.  Content cache has a 1-1 mapping with fileentries,
whereas multiple FileIDs can be the same FileEntry.

llvm-svn: 62401
2009-01-17 03:48:08 +00:00
Chris Lattner 5882771102 Fix PR2477 - clang misparses "//*" in C89 mode
llvm-svn: 62368
2009-01-16 22:39:25 +00:00
Chris Lattner 8a42586c54 more SourceLocation lexicon change: instead of referring to the
"logical" location, refer to the "instantiation" location.

llvm-svn: 62316
2009-01-16 07:36:28 +00:00
Chris Lattner 53e384f633 Change some terminology in SourceLocation: instead of referring to
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!

llvm-svn: 62309
2009-01-16 07:00:02 +00:00
Chris Lattner e141a9e225 rdar://6060752 - don't warn about trigraphs in bcpl-style comments
llvm-svn: 60942
2008-12-12 07:34:39 +00:00
Chris Lattner 89770575cd fix thought-o
llvm-svn: 60937
2008-12-12 07:14:34 +00:00
Douglas Gregor 90abb6dead Objective-C keywords are not always identifiers. Some are also C++ keywords
llvm-svn: 60373
2008-12-01 21:46:47 +00:00
Daniel Dunbar 5c4cc09498 Comment fix.
llvm-svn: 59997
2008-11-25 00:20:22 +00:00
Chris Lattner f3cb394f41 Fix a weird inconsistency with hex floats. Previously the lexer
would not eat the "-1" in "0x0p-1", but LiteralSupport would accept
it when extensions are on.  This caused strangeness and failures 
when hexfloats were properly treated as an extension (not error)
in LiteralSupport.

llvm-svn: 59865
2008-11-22 07:39:03 +00:00
Chris Lattner 014156e108 actually, this version isn't really needed.
llvm-svn: 59859
2008-11-22 06:22:39 +00:00
Chris Lattner 57dab26be1 remove a sneaky version of Diag hiding in PreprocessorLexer.
llvm-svn: 59858
2008-11-22 06:20:42 +00:00
Chris Lattner 6d27a16b95 Change the Lexer::Diag method to not magically silence warnings,
force the caller to check instead.  This eliminates the need (and the
risk!) of weird null DiagnosticBuilder's floating around.

llvm-svn: 59856
2008-11-22 02:02:22 +00:00
Chris Lattner 427c9c1763 Split the DiagnosticInfo class into two disjoint classes:
one for building up the diagnostic that is in flight (DiagnosticBuilder)
and one for pulling structured information out of the diagnostic when
formatting and presenting it.

There is no functionality change with this patch.

llvm-svn: 59849
2008-11-22 00:59:29 +00:00
Ted Kremenek 45245217bc - Move static function IsNonPragmaNonMacroLexer into Preprocessor.h.
- Add variants of IsNonPragmaNonMacroLexer to accept an IncludeMacroStack entry
  (simplifies some uses).
- Use IsNonPragmaNonMacroLexer in Preprocessor::LookupFile.
- Add 'FileID' to PreprocessorLexer, and have Preprocessor query this fileid
  when looking up the FileEntry for a file

Performance testing of -Eonly on Cocoa.h shows no performance regression because
of this patch.

llvm-svn: 59666
2008-11-19 21:57:25 +00:00
Chris Lattner 907dfe94e1 Convert the lexer and start converting the PP over to using canonical Diag methods.
llvm-svn: 59511
2008-11-18 07:59:24 +00:00
Ted Kremenek 66312a3ff4 Move some diagnostic handling to PreprocessorLexer.
llvm-svn: 59191
2008-11-12 23:13:54 +00:00
Ted Kremenek 2f4f2dea82 Remove Lexer::LexIncludeFilename.
llvm-svn: 59186
2008-11-12 22:44:15 +00:00
Chris Lattner b11c3233d8 Change FormTokenWithChars to take the token kind to form, since all clients
were setting a kind and then forming it.  This is just a minor API cleanup, 
no functionality change.

llvm-svn: 57404
2008-10-12 04:51:35 +00:00
Chris Lattner 99e7d23455 When in keep whitespace mode, make sure to return block comments that are
unterminated.

llvm-svn: 57403
2008-10-12 04:19:49 +00:00
Chris Lattner e01e758e11 Change SkipBlockComment and SkipBCPLComment to return true when in
keep comment mode, instead of returning false.  This matches SkipWhitespace.

llvm-svn: 57402
2008-10-12 04:15:42 +00:00
Chris Lattner 4d96344c19 Add a new mode to the lexer which enables it to return all characters,
even whitespace, as tokens from the file.  This is enabled with
L->SetKeepWhitespaceMode(true) on a raw lexer.  In this mode, you too
can use clang as a really complex version of 'cat' with code like this:

  Lexer RawLex(SourceLocation::getFileLoc(SM.getMainFileID(), 0),
               PP.getLangOptions(), File.first, File.second);
  
  RawLex.SetKeepWhitespaceMode(true);
  
  Token RawTok;
  RawLex.LexFromRawLexer(RawTok);
  while (RawTok.isNot(tok::eof)) {
    std::cout << PP.getSpelling(RawTok);
    RawLex.LexFromRawLexer(RawTok);
  }

This will emit exactly the input file, with no canonicalization or other
translation.  Realistic clients actually do something with the tokens of
course :)

llvm-svn: 57401
2008-10-12 04:05:48 +00:00
Chris Lattner 097a8b8777 Fix a couple more places that poke KeepCommentMode unnecesarily.
llvm-svn: 57398
2008-10-12 03:27:19 +00:00
Chris Lattner 8637abd333 add a new inKeepCommentMode() accessor to abstract the KeepCommentMode
ivar.

llvm-svn: 57397
2008-10-12 03:22:02 +00:00
Chris Lattner e3f863a388 fix misleading comment.
llvm-svn: 57396
2008-10-12 01:34:51 +00:00
Chris Lattner 7c2e9809b1 Simplify raw mode lexing by treating an unterminate /**/ comment the
same we we do an unterminated string or character literal.  This makes
it so we can guarantee that the lexer never calls into the 
preprocessor (which would be suicide for a raw lexer).

llvm-svn: 57395
2008-10-12 01:31:51 +00:00
Chris Lattner 6b0c5ad096 add a comment.
llvm-svn: 57394
2008-10-12 01:23:27 +00:00
Chris Lattner 50c9050037 Change how raw lexers are handled: instead of creating them and then
using LexRawToken, create one and use LexFromRawLexer.  This avoids
twiddling the RawLexer flag around and simplifies some code (even 
speeding raw lexing up a tiny bit).

This change also improves the token paster to use a Lexer on the stack
instead of new/deleting it. 

llvm-svn: 57393
2008-10-12 01:15:46 +00:00
Chris Lattner 79ef843533 silence some release-assert warnings.
llvm-svn: 57391
2008-10-12 00:28:42 +00:00
Chris Lattner 87e97ea7b8 improve a comment.
llvm-svn: 57389
2008-10-12 00:23:07 +00:00
Daniel Dunbar 12c9ddced1 Change Parser & Sema to use interned "super" for comparions.
- Added as private members for each because it is not clear where to
   put the common definition. Perhaps the IdentifierInfos all of these
   "pseudo-keywords" should be collected into one place (this would
   KnownFunctionIDs and Objective-C property IDs, for example).

Remove Token::isNamedIdentifier.
 - There isn't a good reason to use strcmp when we have interned
   strings, and there isn't a good reason to encourage clients to do
   so.

llvm-svn: 54794
2008-08-14 22:04:54 +00:00
Nate Begeman 5eee93328e Fix typo
llvm-svn: 49632
2008-04-14 02:26:39 +00:00
Chris Lattner 8f96d04ceb don't diagnose empty source files, thanks Neil!
llvm-svn: 49575
2008-04-12 05:54:25 +00:00
Chris Lattner 9b7206eb4f don't read off the front of the buffer. Thanks to Sam for pointing this out.
llvm-svn: 49535
2008-04-11 16:20:41 +00:00
Chris Lattner 7a51313d8a Make a major restructuring of the clang tree: introduce a top-level
lib dir and move all the libraries into it.  This follows the main
llvm tree, and allows the libraries to be built in parallel.  The
top level now enforces that all the libs are built before Driver,
but we don't care what order the libs are built in.  This speeds
up parallel builds, particularly incremental ones.

llvm-svn: 48402
2008-03-15 23:59:48 +00:00