Commit Graph

301 Commits

Author SHA1 Message Date
Zhongxing Xu 2d17ff466e Properly initialize Preprocessor::CurLexerKind to avoid use of uninitialized variable.
llvm-svn: 140514
2011-09-26 03:37:43 +00:00
David Blaikie 9c902b5502 Rename Diagnostic to DiagnosticsEngine as per issue 5397
llvm-svn: 140478
2011-09-25 23:23:43 +00:00
Argyrios Kyrtzidis 64f6381097 Introduce PreprocessingRecord::getPreprocessedEntitiesInRange()
which will do a binary search and return a pair of iterators
for preprocessed entities in the given source range.

Source ranges of preprocessed entities are stored twice currently in
the PCH/Module file but this will be fixed in a subsequent commit.

llvm-svn: 140058
2011-09-19 20:40:25 +00:00
Douglas Gregor 97eec24b0b Add an experimental flag -fauto-module-import that automatically turns
#include or #import direcctives of framework headers into module
imports of the corresponding framework module.

llvm-svn: 139860
2011-09-15 22:00:41 +00:00
Douglas Gregor af5c48490e Optimize the preprocessor's handling of the __import_module__
keyword. We now handle this keyword in HandleIdentifier, making a note
for ourselves when we've seen the __import_module__ keyword so that
the next lexed token can trigger a module import (if needed). This
greatly simplifies Preprocessor::Lex(), and completely erases the 5.5%
-Eonly slowdown Argiris noted when I originally implemented
__import_module__. Big thanks to Argiris for noting that horrible
regression!

llvm-svn: 139265
2011-09-07 23:11:54 +00:00
Benjamin Kramer 60053cf547 Use const_cast to avoid warnings.
llvm-svn: 139104
2011-09-04 20:26:28 +00:00
Argyrios Kyrtzidis 5cec2aea3f Support code-completion for C++ inline methods and ObjC buffering methods.
Previously we would cut off the source file buffer at the code-completion
point; this impeded code-completion inside C++ inline methods and,
recently, with buffering ObjC methods.

Have the code-completion inserted into the source buffer so that it can
be buffered along with a method body. When we actually hit the code-completion
point the cut-off lexing or parsing.

Fixes rdar://10056932&8319466

llvm-svn: 139086
2011-09-04 03:32:15 +00:00
Douglas Gregor 83297dfc7e Allow the preprocessor to be constructed without performing target-
and language-specific initialization. Use this to allow ASTUnit to
create a preprocessor object *before* loading the AST file. No actual
functionality change.

llvm-svn: 138983
2011-09-01 23:39:15 +00:00
Douglas Gregor 7018d5bcfb Teach ASTContext and Preprocessor to hold on to references to the same
LangOptions, rather than making distinct copies of
LangOptions. Granted, LangOptions doesn't actually get modified, but
this will eventually make it easier to construct ASTContext and
Preprocessor before we know all of the LangOptions.

llvm-svn: 138959
2011-09-01 20:23:19 +00:00
Eli Friedman 874844123f Make sure to initialize field. Hopefully this will fix some test failures on Windows.
llvm-svn: 138880
2011-08-31 18:45:31 +00:00
Douglas Gregor ca97589f7d Switch __import__ over to __import_module__, so we don't conflict with
existing practice with Python extension modules. Not that Python
extension modules should be using a double-underscored identifier
anyway, but...

llvm-svn: 138870
2011-08-31 18:19:09 +00:00
Douglas Gregor d90c3c92d5 Take an entirely different approach to handling the "parsing" of
__import__ within the preprocessor, since the prior one foolishly
assumed that Preprocessor::Lex() was re-entrant. We now handle
__import__ at the top level (only), after macro expansion. This should
fix the buildbot failures.

llvm-svn: 138704
2011-08-27 06:37:51 +00:00
Douglas Gregor 081425343b Introduce support for a simple module import declaration, which
loads the named module. The syntax itself is intentionally hideous and
will be replaced at some later point with something more
palatable. For now, we're focusing on the semantics:
  - Module imports are handled first by the preprocessor (to get macro
  definitions) and then the same tokens are also handled by the parser
  (to get declarations). If both happen (as in normal compilation),
  the second one is redundant, because we currently have no way to
  hide macros or declarations when loading a module. Chris gets credit
  for this mad-but-workable scheme.
  - The Preprocessor now holds on to a reference to a module loader,
  which is responsible for loading named modules. CompilerInstance is
  the only important module loader: it now knows how to create and
  wire up an AST reader on demand to actually perform the module load.
  - We search for modules in the include path, using the module name
  with the suffix ".pcm" (precompiled module) for the file name. This
  is a temporary hack; we hope to improve the situation in the
  future.

llvm-svn: 138679
2011-08-26 23:56:07 +00:00
Ted Kremenek 8b77fe75af Change Preprocessor::getTotalMemory() to use llvm::capacity_in_bytes().
llvm-svn: 136239
2011-07-27 18:41:23 +00:00
Ted Kremenek 182543aba2 Report more memory using in Preprocessor::getTotalMemory() and PreprocessingRecord::getTotalMemory().
Most of the memory was already reported; but now we report more memory from side data structures.

Fixes <rdar://problem/9379717>.

llvm-svn: 136150
2011-07-26 21:17:24 +00:00
Chandler Carruth 115b077f30 Rename create(MacroArg)InstantiationLoc to create(MacroArg)ExpansionLoc.
llvm-svn: 136054
2011-07-26 03:03:05 +00:00
Chris Lattner 0e62c1cc0b remove unneeded llvm:: namespace qualifiers on some core types now that LLVM.h imports
them into the clang namespace.

llvm-svn: 135852
2011-07-23 10:55:15 +00:00
Chandler Carruth a88a221855 Move the rest of the preprocessor terminology from 'instantiate' and
variants to 'expand'. This changed a couple of public APIs, including
one public type "MacroInstantiation" which is now "MacroExpansion". The
rest of the codebase was updated to reflect this, especially the
libclang code. Two of the C++ (and thus easily changed) libclang APIs
were updated as well because they pertained directly to the old
MacroInstantiation class.

No functionality changed.

llvm-svn: 135139
2011-07-14 08:20:46 +00:00
Argyrios Kyrtzidis 8cc0459907 Introduce a caching mechanism for macro expanded tokens.
Previously macro expanded tokens were added to Preprocessor's bump allocator and never released,
even after the TokenLexer that were lexing them was finished, thus they were wasting memory.
A very "useful" boost library was causing clang to eat 1 GB just for the expanded macro tokens.

Introduce a special cache that works like a stack; a TokenLexer can add the macro expanded tokens
in the cache, and when it finishes, the tokens are removed from the end of the cache.

Now consumed memory by expanded tokens for that library is ~ 1.5 MB.

Part of rdar://9327049.

llvm-svn: 134105
2011-06-29 22:20:11 +00:00
Argyrios Kyrtzidis e379ee31c0 Introduce Preprocessor::getTotalMemory() and use it in CIndex.cpp, no functionality change.
llvm-svn: 134103
2011-06-29 22:20:04 +00:00
Douglas Gregor 998caead70 Introduce a new libclang parsing flag,
CXTranslationUnit_NestedMacroInstantiations, which indicates whether
we want to see "nested" macro instantiations (e.g., those that occur
inside other macro instantiations) within the detailed preprocessing
record. Many clients (e.g., those that only care about visible tokens)
don't care about this information, and in code that uses preprocessor
metaprogramming, this information can have a very high cost.

Addresses <rdar://problem/9389320>.

llvm-svn: 130990
2011-05-06 16:33:08 +00:00
John Wiegley 1c0675e155 Parsing/AST support for Structured Exception Handling
Patch authored by Sohail Somani.

Provide parsing and AST support for Windows structured exception handling.

llvm-svn: 130366
2011-04-28 01:08:34 +00:00
John McCall 462c055d85 Fix my earlier commit to work with escaped newlines and leave breadcrumbs
in case we want to make a world where we can check intermediate instantiations
for this kind of breadcrumb.

llvm-svn: 127221
2011-03-08 07:59:04 +00:00
John McCall cff9bcfbd3 Add an API call to retrieve the spelling data of a token from its SourceLocation.
llvm-svn: 127216
2011-03-08 04:06:57 +00:00
Abramo Bagnara ea4f7c7761 Introduced raw_identifier token kind.
llvm-svn: 122394
2010-12-22 08:23:18 +00:00
Chris Lattner 5159f6162e now the FileManager has a FileSystemOpts ivar, stop threading
FileSystemOpts through a ton of apis, simplifying a lot of code.
This also fixes a latent bug in ASTUnit where it would invoke
methods on FileManager without creating one in some code paths
in cindextext.

llvm-svn: 120010
2010-11-23 08:35:12 +00:00
Chris Lattner 39720111e0 move getSpelling from Preprocessor to Lexer, which it is more conceptually related to.
llvm-svn: 119479
2010-11-17 07:26:20 +00:00
Chris Lattner 2a6ee91619 move AdvanceToTokenCharacter and getLocForEndOfToken from
Preprocessor to Lexer where they make more sense.

llvm-svn: 119474
2010-11-17 07:05:50 +00:00
Chris Lattner b1ab2c2d3d add a static version of PP::AdvanceToTokenCharacter.
llvm-svn: 119472
2010-11-17 06:55:10 +00:00
Chris Lattner 30d4c928ac add a static form of the efficient PP::getSpelling method.
llvm-svn: 119469
2010-11-17 06:31:48 +00:00
Argyrios Kyrtzidis 71731d6b05 Implement -working-directory.
When -working-directory is passed in command line, file paths are resolved relative to the specified directory.
This helps both when using libclang (where we can't require the user to actually change the working directory)
and to help reproduce test cases when the reproduction work comes along.

--FileSystemOptions is introduced which controls how file system operations are performed (currently it just contains
 the working directory value if set).
--FileSystemOptions are passed around to various interfaces that perform file operations.
--Opening & reading the content of files should be done only through FileManager. This is useful in general since
 file operations will be abstracted in the future for the reproduction mechanism.

FileSystemOptions is independent of FileManager so that we can have multiple translation units sharing the same
FileManager but with different FileSystemOptions.

Addresses rdar://8583824.

llvm-svn: 118203
2010-11-03 22:45:23 +00:00
Ted Kremenek c8456f8c59 Really^2 fix <rdar://problem/8361834>, this time without crashing.
Now MICache is a linked list (per the FIXME), where we tradeoff between MacroInfo objects being in MICache
and MIChainHead.  MacroInfo objects in the MICache chain are already "Destroy()'ed", so they can be reused.  When
inserting into MICache, we need to remove them from the regular linked list so that they aren't destroyed more than
once.

llvm-svn: 116869
2010-10-19 22:15:20 +00:00
Ted Kremenek b865f7e025 Simplify loop. No functionality change.
llvm-svn: 116861
2010-10-19 21:30:11 +00:00
Ted Kremenek 1f1e4bdbf7 Simplify lifetime management of MacroInfo objects in Preprocessor by having the Preprocessor maintain them in a linked
list of allocated MacroInfos.  This requires only 1 extra pointer per MacroInfo object, and allows us to blow them
away in one place.  This fixes an elusive memory leak with MacroInfos (whose exact location I couldn't still figure
out despite substantial digging).

Fixes <rdar://problem/8361834>.

llvm-svn: 116842
2010-10-19 18:16:54 +00:00
Ted Kremenek 2c8028bcf4 In ~Preprocessor(), also cleanup the MacroInfo objects left-over from stray "#pragma push_macro" uses. This
fixes a potential memory leak.

llvm-svn: 116826
2010-10-19 17:40:53 +00:00
Fariborz Jahanian 9e42a952d7 Use getSpelling to get original text of the
c++ operator token. (radar 8328250).

llvm-svn: 112977
2010-09-03 17:33:04 +00:00
Fariborz Jahanian 0389df4a45 Patch to allow alternative representation of c++
operators (and, or, etc.) to be used as selectors
to match g++'s behavior.

llvm-svn: 112935
2010-09-03 01:26:16 +00:00
Alexis Hunt 3b7918625c Revert my user-defined literal commits - r1124{58,60,67} pending
some issues being sorted out.

llvm-svn: 112493
2010-08-30 17:47:05 +00:00
Alexis Hunt 79eb5469e0 Implement C++0x user-defined string literals.
The extra data stored on user-defined literal Tokens is stored in extra
allocated memory, which is managed by the PreprocessorLexer because there isn't
a better place to put it that makes sure it gets deallocated, but only after
it's used up. My testing has shown no significant slowdown as a result, but
independent testing would be appreciated.

llvm-svn: 112458
2010-08-29 21:26:48 +00:00
Douglas Gregor 33551892fa Tweak wording in an assertion, from dawn@burble.org.
llvm-svn: 112182
2010-08-26 14:07:34 +00:00
Douglas Gregor 115837041e Introduce a preprocessor code-completion hook for contexts where we
expect "natural" language and should not provide any completions,
e.g., comments, string literals, #error.

llvm-svn: 112054
2010-08-25 17:04:25 +00:00
Douglas Gregor 3a7ad25eb6 Introduce basic code-completion support for preprocessor directives,
e.g., after a "#" we'll suggest #if, #ifdef, etc.

llvm-svn: 111943
2010-08-24 19:08:16 +00:00
Chris Lattner 66b67d209e no need to pass bumppointer allocator into macroinfo::destroy
llvm-svn: 111364
2010-08-18 16:08:51 +00:00
Benjamin Kramer e8394df11b Random temporary string cleanup.
llvm-svn: 110807
2010-08-11 14:47:12 +00:00
Douglas Gregor 3f4bea0646 Introduce basic support for loading a precompiled preamble while
reparsing an ASTUnit. When saving a preamble, create a buffer larger
than the actual file we're working with but fill everything from the
end of the preamble to the end of the file with spaces (so the lexer
will quickly skip them). When we load the file, create a buffer of the
same size, filling it with the file and then spaces. Then, instruct
the lexer to start lexing after the preamble, therefore continuing the
parse from the spot where the preamble left off.

It's now possible to perform a simple preamble build + parse (+
reparse) with ASTUnit. However, one has to disable a bunch of checking
in the PCH reader to do so. That part isn't committed; it will likely
be handled with some other kind of flag (e.g., -fno-validate-pch).

As part of this, fix some issues with null termination of the memory
buffers created for the preamble; we were trying to explicitly
NULL-terminate them, even though they were also getting implicitly
NULL terminated, leading to excess warnings about NULL characters in
source files.

llvm-svn: 109445
2010-07-26 21:36:20 +00:00
Argyrios Kyrtzidis 36745fda34 Modify the pragma handlers to accept and use StringRefs instead of IdentifierInfos.
When loading the PCH, IdentifierInfos that are associated with pragmas cause declarations that use these identifiers to be deserialized (e.g. the "clang" pragma causes the "clang" namespace to be loaded).
We can avoid this if we just use StringRefs for the pragmas.

As a bonus, since we don't have to create and pass IdentifierInfos, the pragma interfaces get a bit more simplified.

llvm-svn: 108237
2010-07-13 09:07:17 +00:00
Ted Kremenek dea66e3e4c Fix memory leak in Preprocessor where MacroInfo objects in the MICache wouldn't have their
associated SmallVectors get deallocated.

llvm-svn: 105658
2010-06-08 23:00:53 +00:00
Chris Lattner fb24a3a4ec push some source location information down through the compiler,
into ContentCache::getBuffer.  This allows it to produce 
diagnostics on the broken #include line instead of without a 
location.

llvm-svn: 101939
2010-04-20 20:35:58 +00:00
Chris Lattner 58c79341ab Match MemoryBuffer API changes.
llvm-svn: 100484
2010-04-05 22:42:27 +00:00
Daniel Dunbar cb9eaf59fb PPCallbacks: Add hook for reaching the end of the main file, and fix DependencyFile to not do work in its destructor.
llvm-svn: 99257
2010-03-23 05:09:10 +00:00
Douglas Gregor 7dc8722bd3 Make the preprocessing record a PPCallbacks subclass itself,
eliminating the extra PopulatePreprocessingRecord object. This will
become useful once we start writing the preprocessing record to
precompiled headers.

llvm-svn: 98966
2010-03-19 17:12:43 +00:00
Douglas Gregor 7f6d60dcc2 Optionally store a PreprocessingRecord in the preprocessor itself, and
tie its creation to a CC1 flag -detailed-preprocessing-record.

llvm-svn: 98963
2010-03-19 16:15:56 +00:00
Douglas Gregor 4ad3da2843 Entering the main source file in the preprocessor can fail if the
source file has been changed. Handle that failure more gracefully.

llvm-svn: 98727
2010-03-17 15:44:30 +00:00
Douglas Gregor 42fe858cd6 Audit all callers of SourceManager::getCharacterData(); update some of
them to recover more gracefully on failure.

llvm-svn: 98672
2010-03-16 20:46:42 +00:00
Douglas Gregor 26266da3c3 Teach the one caller of SourceManager::getMemoryBufferForFile() to cope with errors
llvm-svn: 98664
2010-03-16 19:49:24 +00:00
Douglas Gregor 7bda4b8310 Introduce optional "Invalid" parameters to routines that invoke the
SourceManager's getBuffer() and, therefore, could fail, along with
Preprocessor::getSpelling(). Use the Invalid parameters in the literal
parsers (string, floating point, integral, character) to make them
robust against errors that stem from, e.g., PCH files that are not
consistent with the underlying file system.

I still need to audit every use caller to all of these routines, to
determine which ones need specific handling of error conditions.

llvm-svn: 98608
2010-03-16 05:20:39 +00:00
Kovarththanan Rajaratnam ba2c65277a Use SmallString instead of SmallVector
llvm-svn: 98436
2010-03-13 10:17:05 +00:00
Benjamin Kramer a197fb6731 Move method out-of-line. I thought this would be a candidate for inlining but I was wrong.
llvm-svn: 97330
2010-02-27 17:05:45 +00:00
Benjamin Kramer 0a1abd4088 Add an overload of Preprocessor::getSpelling which takes a SmallVector and
returns a StringRef. Use it to simplify some repetitive code.

llvm-svn: 97322
2010-02-27 13:44:12 +00:00
Ted Kremenek db4b7710f7 Fix subtle bug in Preprocessor::AdvanceToTokenCharacter(): use '+=' instead of '='.
llvm-svn: 94830
2010-01-29 19:38:24 +00:00
Douglas Gregor 562c1f9365 Teach CIndex's cursor visitor to restrict its traversal to a specific
region of interest (if provided). Implement clang_getCursor() in terms
of this traversal rather than using the Index library; the unified
cursor visitor is more complete, and will be The Way Forward.

Minor other tweaks needed to make this work:
  - Extend Preprocessor::getLocForEndOfToken() to accept an offset
  from the end, making it easy to move to the last character in the
  token (rather than just past the end of the token).
  - In Lexer::MeasureTokenLength(), the length of whitespace is zero.

llvm-svn: 94200
2010-01-22 19:49:59 +00:00
Chris Lattner 87d0208c41 allow the HandlerComment callback to push tokens into the
preprocessor.  This could be used by an OpenMP implementation
or something.  Patch by Abramo Bagnara!

llvm-svn: 93795
2010-01-18 22:35:47 +00:00
Douglas Gregor 9882a5aac6 Teach Preprocessor::macro_begin/macro_end to lazily load all macro
definitions from a precompiled header. This ensures that
code-completion with macro names behaves the same with or without
precompiled headers.

llvm-svn: 92497
2010-01-04 19:18:44 +00:00
Benjamin Kramer d77adb5b1c Avoid an unnecessary copy of Predefines. getMemBufferCopy does the null termination for us.
llvm-svn: 92358
2009-12-31 15:33:09 +00:00
Chris Lattner d19564b109 set up the machinery for a MacroArgs cache hanging off Preprocessor.
We creating and free thousands of MacroArgs objects (and the related
std::vectors hanging off them) for the testcase in PR5610 even though
there are only ~20 live at a time.  This doesn't actually use the 
cache yet.

llvm-svn: 91391
2009-12-15 01:51:03 +00:00
Chris Lattner d2fa78d0bd fix typo
llvm-svn: 91343
2009-12-14 22:02:43 +00:00
Douglas Gregor 9291ab6d80 Don't expand tabs when computing the offset from the code-completion column
llvm-svn: 90881
2009-12-08 21:45:46 +00:00
Daniel Dunbar 1776679e71 Change Preprocessor::EnterSourceFile to make ErrorStr non-optional, clients should be forced to deal with error conditions.
llvm-svn: 90700
2009-12-06 09:19:12 +00:00
Douglas Gregor 5f49883488 Minor cleanup to the code-completion-point logic suggested by Chris.
llvm-svn: 90459
2009-12-03 17:05:59 +00:00
Douglas Gregor 53ad6b94b0 Extend the source manager with the ability to override the contents of
files with the contents of an arbitrary memory buffer. Use this new
functionality to drastically clean up the way in which we handle file
truncation for code-completion: all of the truncation/completion logic
is now encapsulated in the preprocessor where it belongs
(<rdar://problem/7434737>).

llvm-svn: 90300
2009-12-02 06:49:09 +00:00
Daniel Dunbar bf410c6fc2 Add static version of Preprocessor::getSpelling.
llvm-svn: 88732
2009-11-14 01:20:48 +00:00
Daniel Dunbar 1b4441915a Wherein the TargetInfo argument to Preprocessor is made 'const' and propogated.
llvm-svn: 87087
2009-11-13 05:51:54 +00:00
Daniel Dunbar 0c6c930f05 Allow Preprocessor to take ownership of the HeaderSearch object. I think it should probably always own the header search object, but I'm not sure...
llvm-svn: 86882
2009-11-11 21:44:21 +00:00
Daniel Dunbar 07dcd8b9d8 Make LookUpIdentifierInfo const. This makes the Identifiers table mutable and is
a little fuzzy, but conceptually it's just uniquing the identifier.

Chris, please review. I debated splitting into const/non-const versions where
the const one propogated constness to the resulting IdentifierInfo*.

llvm-svn: 86106
2009-11-05 01:53:52 +00:00
Daniel Dunbar f539bfeb4d StringRefize Preprocessor::getIdentifierInfo.
llvm-svn: 86105
2009-11-05 01:53:39 +00:00
Daniel Dunbar d0ba0e6108 Kill PreprocessorFactory, which was both morally repugnant and totally unused.
llvm-svn: 86076
2009-11-04 23:56:25 +00:00
Daniel Dunbar 2c422dc9ca Move clients to use IdentifierInfo::getNameStart() instead of getName()
llvm-svn: 84436
2009-10-18 20:26:12 +00:00
Douglas Gregor d2eb58abac Add support for a chain of stat caches in the FileManager, rather than
only supporting a single stat cache. The immediate benefit of this
change is that we can now generate a PCH/AST file when including
another PCH file; in the future, the chain of stat caches will likely
be useful with multiple levels of PCH files.

llvm-svn: 84263
2009-10-16 18:18:30 +00:00
Douglas Gregor ea9b03e6e2 Replace the -code-completion-dump option with
-code-completion-at=filename:line:column

which performs code completion at the specified location by truncating
the file at that position and enabling code completion. This approach
makes it possible to run multiple tests from a single test file, and
gives a more natural command-line interface.

llvm-svn: 82571
2009-09-22 21:11:38 +00:00
Douglas Gregor 2436e7116b Initial implementation of a code-completion interface in Clang. In
essence, code completion is triggered by a magic "code completion"
token produced by the lexer [*], which the parser recognizes at
certain points in the grammar. The parser then calls into the Action
object with the appropriate CodeCompletionXXX action.

Sema implements the CodeCompletionXXX callbacks by performing minimal
translation, then forwarding them to a CodeCompletionConsumer
subclass, which uses the results of semantic analysis to provide
code-completion results. At present, only a single, "printing" code
completion consumer is available, for regression testing and
debugging. However, the design is meant to permit other
code-completion consumers.

This initial commit contains two code-completion actions: one for
member access, e.g., "x." or "p->", and one for
nested-name-specifiers, e.g., "std::". More code-completion actions
will follow, along with improved gathering of code-completion results
for the various contexts.

[*] In the current -code-completion-dump testing/debugging mode, the
file is truncated at the completion point and EOF is translated into
"code completion".

llvm-svn: 82166
2009-09-17 21:32:03 +00:00
Mike Stump 11289f4280 Remove tabs, and whitespace cleanups.
llvm-svn: 81346
2009-09-09 15:08:12 +00:00
Benjamin Kramer 89b422c118 Replace cerr with errs().
llvm-svn: 79854
2009-08-23 12:08:50 +00:00
Douglas Gregor c6d5edd2ed Add support for retrieving the Doxygen comment associated with a given
declaration in the AST. 

The new ASTContext::getCommentForDecl function searches for a comment
that is attached to the given declaration, and returns that comment, 
which may be composed of several comment blocks.

Comments are always available in an AST. However, to avoid harming
performance, we don't actually parse the comments. Rather, we keep the
source ranges of all of the comments within a large, sorted vector,
then lazily extract comments via a binary search in that vector only
when needed (which never occurs in a "normal" compile).

Comments are written to a precompiled header/AST file as a blob of
source ranges. That blob is only lazily loaded when one requests a
comment for a declaration (this never occurs in a "normal" compile). 

The indexer testbed now supports comment extraction. When the
-point-at location points to a declaration with a Doxygen-style
comment, the indexer testbed prints the associated comment
block(s). See test/Index/comments.c for an example.

Some notes:
  - We don't actually attempt to parse the comment blocks themselves,
  beyond identifying them as Doxygen comment blocks to associate them
  with a declaration.
  - We won't find comment blocks that aren't adjacent to the
  declaration, because we start our search based on the location of
  the declaration.
  - We don't go through the necessary hops to find, for example,
  whether some redeclaration of a declaration has comments when our
  current declaration does not. Similarly, we don't attempt to
  associate a \param Foo marker in a function body comment with the
  parameter named Foo (although that is certainly possible).
  - Verification of my "no performance impact" claims is still "to be
  done".

llvm-svn: 74704
2009-07-02 17:08:52 +00:00
Chris Lattner 4ef49c1d6e my refactoring of builtins changed target-specific builtins to only be
registered when PCH wasn't being used.  We should always install (in BuiltinInfo)
information about target-specific builtins, but we shouldn't register any builtin
identifier infos.  This fixes the build of apps that use PCH and target specific
builtins together.

llvm-svn: 73492
2009-06-16 16:18:48 +00:00
Eli Friedman 6bba2adc95 Emit keyword extension warning in all modes, not just C99 mode.
llvm-svn: 70283
2009-04-28 03:59:15 +00:00
Chris Lattner 93017cc12a Change Preprocessor::AdvanceToTokenCharacter to stop at
the first real character of a token.  For example, advancing
to byte 3 of foo\
bar

should stop at the b, not the \.

llvm-svn: 69484
2009-04-18 22:28:58 +00:00
Chris Lattner 397ca4a9ef fix typo
llvm-svn: 69479
2009-04-18 21:55:02 +00:00
Chris Lattner 184e65d363 Change Lexer::MeasureTokenLength to take a LangOptions reference.
This allows it to accurately measure tokens, so that we get:

t.cpp:8:13: error: unknown type name 'X'
static foo::X  P;
       ~~~~~^

instead of the woefully inferior:

t.cpp:8:13: error: unknown type name 'X'
static foo::X  P;
       ~~~~ ^

Most of this is just plumbing to push the reference around.

llvm-svn: 69099
2009-04-14 23:22:57 +00:00
Chris Lattner 0af3ba1748 implement the microsoft/gnu "__COUNTER__" macro: rdar://4329310
llvm-svn: 68933
2009-04-13 01:29:17 +00:00
Douglas Gregor 92863e475e Compare the predefines buffer in the PCH file with the predefines
buffer generated for the current translation unit. If they are
different, complain and then ignore the PCH file. This effectively
checks for all compilation options that somehow would affect
preprocessor state (-D, -U, -include, the dreaded -imacros, etc.).

When we do accept the PCH file, throw away the contents of the
predefines buffer rather than parsing them, since all of the results
of that parsing are already stored in the PCH file. This eliminates
the ugliness with the redefinition of __builtin_va_list, among other
things.

llvm-svn: 68838
2009-04-10 23:10:45 +00:00
Chris Lattner d959d753bc do a dance with predefines, and finally enable reading of macros from
PCH.  This works now, except for limitations not being able to do things
with identifiers.  The basic example in the testcase works though.

llvm-svn: 68832
2009-04-10 22:13:17 +00:00
Chris Lattner 3c68407868 move a bunch of code for initializing the predefines buffer out of Preprocessor.cpp
into clang-cc.cpp.  This makes it so clang-cc constructs the *entire* predefines 
buffer, not just half of it.  A bonus of this is that we get to kill a copy
of DefineBuiltinMacro.

llvm-svn: 68830
2009-04-10 21:58:23 +00:00
Douglas Gregor a7f71a91c5 PCH serialization/deserialization of the source manager. With this
improvement, source locations read from the PCH file will properly
resolve to the source files that were used to build the PCH file
itself.

Once we have the preprocessor state stored in the PCH file, source
locations that refer to macro instantiations that occur in the PCH
file should have the appropriate instantiation information.

llvm-svn: 68758
2009-04-10 03:52:48 +00:00
Daniel Dunbar 17ddaa677e More fixes to builtin preprocessor defines.
- Add -static-define option driver can use when __STATIC__ should be
   defined (instead of __DYNAMIC__).

 - Don't set __OPTIMIZE_SIZE__ on Os, __OPTIMIZE_SIZE__ is tied to Oz.

 - Set __NO_INLINE__ following GCC 4.2.

 - Set __GNU_GNU_INLINE__ or __GNU_STDC_INLINE__ following GCC 4.2.

 - Set __EXCEPTIONS for Objective-C NonFragile ABI.

 - Set __STRICT_ANSI__ for standard conforming modes.

 - I added a clang style test case in utils for this, but its not
   particularly portable and I don't think it belongs in the test
   suite.

llvm-svn: 68621
2009-04-08 18:03:55 +00:00
Daniel Dunbar ab7b2f5623 Set __PIC__ (more) correctly.
- Add -pic-level clang-cc option to specify the value for the define,
   updated driver to pass this.

 - Added __pic__

 - Added OBJC_ZEROCOST_EXCEPTIONS define while I was here (to match gcc).

llvm-svn: 68584
2009-04-08 03:03:23 +00:00
Chris Lattner c2d140156c The __weak and __strong defines are common to all darwin targets
and are even set in C mode.  As such, move them to Targets.cpp.

__OBJC_GC__ is also darwin specific, but seems reasonable to always
define it when in objc-gc mode.

This fixes rdar://6761450

llvm-svn: 68494
2009-04-07 04:48:21 +00:00
Anders Carlsson 65cb90efc1 Define __OPTIMIZE__ and __OPTIMIZE_SIZE__ if the -O[12] and -Os flags are passed to the compiler.
llvm-svn: 68450
2009-04-06 17:37:10 +00:00
Fariborz Jahanian c35c9d87a9 Put back __OBJC2__ definition.
llvm-svn: 67802
2009-03-26 23:57:56 +00:00
Fariborz Jahanian dac14a7159 - Minor change to dump of ivar layout map.
- Temporarily undef'ed __OBJC2__ in nonfragile objc abi mode
  as it was forcing ivar synthesis in a certain project which clang
  does not yet support.

llvm-svn: 67766
2009-03-26 19:10:36 +00:00
Chris Lattner 73a7cab9e1 change the __VERSION__ string to be more sensible. It would be useful to include the clang version # too.
llvm-svn: 67619
2009-03-24 16:09:18 +00:00
Chris Lattner 1d1d80e5f9 rename the <predefines> buffer to <built-in> to solve PR3849.
Add a #include directive around the command line buffer so that
diagnostics generated from -include directives get diagnostics
like:

In file included from <built-in>:98:
In file included from <command line>:3:
./t.h:2:1: warning: type specifier missing, defaults to 'int'
b;
^

llvm-svn: 67396
2009-03-20 20:16:10 +00:00
Chris Lattner 4ba73aa0c2 pass LangOptions into TargetInfo::getTargetDefines, so that targets
can have language-specific defines.

llvm-svn: 67375
2009-03-20 15:52:06 +00:00
Anders Carlsson 5bd30395b9 (Hopefully) instantiate dependent array types correctly.
llvm-svn: 67032
2009-03-15 20:12:13 +00:00
Chris Lattner 83aba00ee8 make Preprocessor::Diags be a pointer instead of a reference.
llvm-svn: 66955
2009-03-13 21:17:43 +00:00
Chris Lattner da248f4f30 fix PR3768, Clang does -D__STDC_HOSTED__=1, even if -ffreestanding is passed.
llvm-svn: 66474
2009-03-09 21:50:12 +00:00
Mike Stump 82d8d559bb Fix warnings in build on clang-x86_64-freebsd buildbot.
llvm-svn: 66344
2009-03-07 18:35:41 +00:00
Chris Lattner c25d8a7e30 improve compatibility with GCC 4.4, patch by Michel Salim (PR3697)
llvm-svn: 65884
2009-03-02 22:20:04 +00:00
Douglas Gregor 96977da72c Clean up and document code modification hints.
llvm-svn: 65641
2009-02-27 17:53:17 +00:00
Chris Lattner 70946da73a switch the macroinfo argument lists from being allocated off the heap
to being allocated from the same bumpptr that the MacroInfo objects 
themselves are.

This speeds up -Eonly cocoa.h pth by ~4%, fsyntax-only is barely measurable.

llvm-svn: 65195
2009-02-20 22:46:43 +00:00
Chris Lattner f87c510cc9 detemplatify setArgumentList and some other cleanups.
llvm-svn: 65187
2009-02-20 22:31:31 +00:00
Chris Lattner 666f7a42d6 require the MAcroInfo objects are explcitly destroyed.
llvm-svn: 65179
2009-02-20 22:19:20 +00:00
Chris Lattner 57a09cfcbc update comment.
llvm-svn: 64939
2009-02-18 18:56:29 +00:00
Chris Lattner ec396b5114 Fix some issues handling sub-token locations that come from macro expansions.
We now emit:

t.m:6:15: warning: field width should have type 'int', but argument has type 'unsigned int'
  printf(STR, (unsigned) 1, 1);
         ^    ~~~~~~~~~~~~
t.m:3:18: note: instantiated from:
#define STR "abc%*ddef"
                 ^

which has the correct location in the string literal in the note line.

llvm-svn: 64936
2009-02-18 18:52:52 +00:00
Fariborz Jahanian eb209e7dbd define __OBJC2__ for objc's nonfragile abi.
llvm-svn: 64642
2009-02-16 18:28:48 +00:00
Chris Lattner ee4b5235e3 Add support for deprecated members of RecordDecls (e.g. struct fields).
llvm-svn: 64634
2009-02-16 17:07:21 +00:00
Chris Lattner 9dc9c206d3 track "just a little more" location information for macro instantiations.
Now instead of just tracking the expansion history, also track the full
range of the macro that got replaced.  For object-like macros, this doesn't
change anything.  For _Pragma and function-like macros, this means we track
the locations of the ')'.

This is required for PR3579 because apparently GCC uses the line of the ')'
of a function-like macro as the location to expand __LINE__ to.

llvm-svn: 64601
2009-02-15 20:52:18 +00:00
Chris Lattner 7e4c81c8c6 Give TargetInfo a new IntPtrType to hold the intptr_t type for
a target.

Make Preprocessor.cpp define a new __INTPTR_TYPE__ macro based on this.

On linux/32, set intptr_t to int, instead of long.  This fixes PR3563.

llvm-svn: 64495
2009-02-13 22:28:55 +00:00
Chris Lattner 9ef847be12 Fix rdar://6562329, a static analyzer crash Ted noticed on
wine sources.  This was happening because HighlightMacros was 
calling EnterMainFile multiple times on the same preprocessor
object and getting an assert due to the new #line stuff (the
file in question was bison output with #line directives).

The fix for this is to not reenter the file.  Instead, 
relex the tokens in raw mode, swizzle them a bit and repreprocess
the token stream.  An added bonus of this is that rewrite macros
will now hilight the macro definition as well as its uses.  Woo.

llvm-svn: 64480
2009-02-13 19:33:24 +00:00
Ted Kremenek a5c2c27ebd PTH: Cache stat information for files in the PTH file. Hook up FileManager
to use this stat information in the PTH file using a 'StatSysCallCache' object.

Performance impact (Cocoa.h, PTH):
- number of stat calls reduces from 1230 to 425
- fsyntax-only: time improves by 4.2% 

We can reduce the number of stat calls to almost zero by caching negative stat
calls and directory stat calls in the PTH file as well.

llvm-svn: 64353
2009-02-12 03:26:59 +00:00
Chris Lattner 0763cc6bfd Export __INT8_TYPE__ / __INT16_TYPE__ / __INT32_TYPE__ / __INT64_TYPE__
in a gcc 4.5 compatible way so that stdint.h can follow the compiler's 
notion of types.

llvm-svn: 63976
2009-02-06 22:59:26 +00:00
Chris Lattner a31829b5bc -funsigned-char sets __CHAR_UNSIGNED__
llvm-svn: 63942
2009-02-06 18:20:57 +00:00
Chris Lattner 1630c3c4f0 Add an implementation of -dM that follows GCC closely enough to permit
diffing the output of:
  clang -dM -o - -E -x c foo.c | sort

llvm-svn: 63926
2009-02-06 06:45:26 +00:00
Chris Lattner bba531ce99 get __WCHAR_TYPE__ from the targetinfo hook
llvm-svn: 63920
2009-02-06 05:06:07 +00:00
Chris Lattner a91c30fdb0 simplify and refactor a bunch of type definition code in Preprocessor
predefines buffer initialization.

llvm-svn: 63919
2009-02-06 05:04:11 +00:00
Chris Lattner 61898606dc remove some ad-hocery and use DefineTypeSize for more things.
Now you too can have a 47 bit long long!

llvm-svn: 63918
2009-02-06 04:55:18 +00:00
Chris Lattner a3dc5d8423 refactor some code into a DefineTypeSize function.
llvm-svn: 63917
2009-02-06 04:50:25 +00:00
Chris Lattner fafd8d1be9 correct and generalize computation of __INTMAX_MAX__.
llvm-svn: 63848
2009-02-05 07:27:41 +00:00
Chris Lattner 8181312251 fix some differences between apple gcc and clang on darwin/x86-32.
llvm-svn: 63846
2009-02-05 07:19:24 +00:00
Chris Lattner 60f36223a9 move library-specific diagnostic headers into library private dirs. Reduce
redundant #includes.  Patch by Anders Johnsen!

llvm-svn: 63271
2009-01-29 05:15:15 +00:00
Chris Lattner 7368d581c1 Split the single monolithic DiagnosticKinds.def file into one
.def file for each library.  This means that adding a diagnostic
to sema doesn't require all the other libraries to be rebuilt.

Patch by Anders Johnsen!

llvm-svn: 63111
2009-01-27 18:30:58 +00:00
Chris Lattner f1ca7d3e02 Introduce a new PresumedLoc class to represent the concept of a location
as reported to the user and as manipulated by #line.  This is what __FILE__,
__INCLUDE_LEVEL__, diagnostics and other things should follow (but not 
dependency generation!).  

This patch also includes several cleanups along the way: 

- SourceLocation now has a dump method, and several other places 
  that did similar things now use it.
- I cleaned up some code in AnalysisConsumer, but it should probably be
  simplified further now that NamedDecl is better.
- TextDiagnosticPrinter is now simplified and cleaned up a bit.

This patch is a prerequisite for #line, but does not actually provide 
any #line functionality.

llvm-svn: 63098
2009-01-27 07:57:44 +00:00
Ted Kremenek 8d178f4357 PTH: Use Token::setLiteralData() to directly store a pointer to cached spelling data in the PTH file. This removes a ton of code for looking up spellings using sourcelocations in the PTH file. This simplifies both PTH-generation and reading.
Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
- PTH generation time improves by 5%
- PTH reading improves by 0.3%.

llvm-svn: 63072
2009-01-27 00:01:05 +00:00
Chris Lattner 5a7971e0c3 This change refactors some of the low-level lexer interfaces a bit.
Token now has a class of kinds for "literals", which include 
numeric constants, strings, etc.  These tokens can optionally have
a pointer to the start of the token in the lexer buffer.  This 
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.

This change is performance neutral, but will make other changes
more feasible down the road.

llvm-svn: 63028
2009-01-26 19:29:26 +00:00
Chris Lattner 1f6c7fe6a8 This is a follow-up to r62675:
Refactor how the preprocessor changes a token from being an tok::identifier to a 
keyword (e.g. tok::kw_for).  Instead of doing this in HandleIdentifier, hoist this
common case out into the caller, so that every keyword doesn't have to go through
HandleIdentifier.  This drops time in HandleIdentifier from 1.25ms to .62ms, and
speeds up clang -Eonly with PTH by about 1%.

llvm-svn: 62855
2009-01-23 18:35:48 +00:00
Chris Lattner ad89ec013f Add a bit to IdentifierInfo that acts as a simple predicate which
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks.  This drops the time in HandleIdentifier 
from 3.52ms to .98ms on cocoa.h on my machine.

llvm-svn: 62675
2009-01-21 07:43:11 +00:00
Ted Kremenek 8c3b812148 Run destructors of MacroInfo objects to free memory they allocate. This addresses <rdar://problem/6506035>.
llvm-svn: 62498
2009-01-19 07:45:44 +00:00
Chris Lattner 8ddb5cf0cf in Preprocessor::AdvanceToTokenCharacter, don't actually bother
creating a whole lexer when we just want one static method.

llvm-svn: 62420
2009-01-17 07:57:25 +00:00
Chris Lattner 3793bba26f suck the call to "getSpellingLoc" that all clients do into
the implementation of PTHManager::getSpelling.

llvm-svn: 62408
2009-01-17 06:29:33 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
Chris Lattner 8a42586c54 more SourceLocation lexicon change: instead of referring to the
"logical" location, refer to the "instantiation" location.

llvm-svn: 62316
2009-01-16 07:36:28 +00:00
Chris Lattner 7c8556e7bc remove obsolete comment which happened to go over 80 cols.
llvm-svn: 62313
2009-01-16 07:04:11 +00:00
Chris Lattner 15af77f679 remove an unneeded const_cast.
llvm-svn: 62311
2009-01-16 07:02:14 +00:00
Chris Lattner 53e384f633 Change some terminology in SourceLocation: instead of referring to
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!

llvm-svn: 62309
2009-01-16 07:00:02 +00:00
Ted Kremenek a705b04d7f IdentifierInfo:
- IdentifierInfo can now (optionally) have its string data not be
  co-located with itself.  This is for use with PTH.  This aspect is a
  little gross, as getName() and getLength() now make assumptions
  about a possible alternate representation of IdentifierInfo.
  Perhaps we should make IdentifierInfo have virtual methods?

IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
  IdentifierTable to perform "string -> IdentifierInfo" lookups using
  an auxilliary data structure.  This is used by PTH.
- Perform tests show that IdentifierTable::get() does not slow down
  because of the extra check for the IdentiferInfoLookup object (the
  regular StringMap lookup does enough work to mitigate the impact of
  an extra null pointer check).
- The upshot is that now that some IdentifierInfo objects might be
  owned by the IdentiferInfoLookup object.  This should be reviewed.

PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
  IdentifierTable's string map, and instead create IdentifierInfo
  objects on the fly when mapping from persistent IDs to
  IdentifierInfos.  This saves a ton of work with string copies,
  hashing, and StringMap lookup and resizing.  This change was
  motivated because when processing source files in the PTH cache we
  don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
  IdentifierTable to transparently use IdentifierInfo objects managed
  by the PTH file.  PTHManager resolves "string -> IdentifierInfo"
  queries by doing a binary search over a sorted table of identifier
  strings in the PTH file (the exact algorithm we use can be changed
  as needed).

These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement

llvm-svn: 62273
2009-01-15 18:47:46 +00:00
Ted Kremenek e9814186ac PTH:
- Use canonical FileID when using getSpelling() caching.  This
  addresses some cache misses we were seeing with -fsyntax-only on
  Cocoa.h
- Added Preprocessor::getPhysicalCharacterAt() utility method for
  clients to grab the first character at a specified sourcelocation.
  This uses the PTH spelling cache.
- Modified Sema::ActOnNumericConstant() to use
  Preprocessor::getPhysicalCharacterAt() instead of
  SourceManager::getCharacterData() (to get PTH hits).

These changes cause -fsyntax-only to not page in any sources from
Cocoa.h.  We see a speedup of 27%.

llvm-svn: 62193
2009-01-13 23:19:12 +00:00
Ted Kremenek b0b4f74b6b PTH: Fix remaining cases where the spelling cache in the PTH file was being missed when it shouldn't. This shaves another 7% off PTH time for -Eonly on Cocoa.h
llvm-svn: 62186
2009-01-13 22:05:50 +00:00
Ted Kremenek 884a558441 PTH:
- Added stub PTHLexer::getSpelling() that will be used for fetching cached
  spellings from the PTH file.  This doesn't do anything yet.
- Added a hook in Preprocessor::getSpelling() to call PTHLexer::getSpelling()
  when using a PTHLexer.
- Updated PTHLexer to read the offsets of spelling tables in the PTH file.

llvm-svn: 61911
2009-01-08 02:47:16 +00:00
Chris Lattner 07ebf302e5 simplify Preprocessor::getSpelling now that identifiers carry around
their length.

llvm-svn: 61734
2009-01-05 19:44:41 +00:00
Steve Naroff f9c29d4200 Add parser support for __forceinline, __w64, __ptr64.
llvm-svn: 61431
2008-12-25 14:41:26 +00:00
Steve Naroff 44ac777741 Add parser support for __cdecl, __stdcall, and __fastcall.
Change preprocessor implementation of _cdecl to reference __cdecl.

llvm-svn: 61430
2008-12-25 14:16:32 +00:00